v0.3.1 - Classifier-Free Guidance (CFG) Support
🔄 Version Note
This is v0.3.1 - a coordinated release across the FluxFlow ecosystem.
Note: v0.3.0 does not exist on PyPI due to release coordination. This release contains all features originally planned for v0.3.0.
🚀 Major Features
Classifier-Free Guidance (CFG) Support
Enhanced inference control for better prompt following and quality:
New Parameters:
- use_cfg: Toggle CFG on/off (boolean)
- guidance_scale: Control conditioning strength (1.0-15.0)
  - 1.0 = no guidance (standard sampling)
  - 7.0-10.0 = recommended for most cases
  - 15.0+ = very strong conditioning
- negative_prompt: Specify unwanted features (string)
Implementation:
- Dual-pass sampling for CFG inference
- Efficient batching for conditional and unconditional passes
- Compatible with models trained with cfg_dropout_prob > 0
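The dual-pass scheme described above can be sketched roughly as follows. This is a framework-free illustration, not FluxFlow's actual implementation: cfg_combine, cfg_step, and toy_model are hypothetical names, and plain Python lists stand in for tensors. It assumes the standard CFG combination rule, uncond + scale * (cond - uncond), which agrees with the note that a scale of 1.0 means no guidance.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def cfg_combine(cond: Sequence[float], uncond: Sequence[float], guidance_scale: float) -> Vector:
    """Standard CFG rule: start at the unconditional prediction, move toward the conditional one."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


def cfg_step(
    model: Callable[[List[Vector], List[str]], List[Vector]],
    latent: Vector,
    prompt: str,
    negative_prompt: str = "",
    guidance_scale: float = 7.5,  # assumed default, inside the recommended 7.0-10.0 range
    use_cfg: bool = True,
) -> Vector:
    """One guided prediction; both passes are stacked into a single batched model call."""
    if not use_cfg:
        return model([latent], [prompt])[0]
    # Batch of two: index 0 is the conditional pass, index 1 the unconditional/negative pass.
    cond, uncond = model([latent, latent], [prompt, negative_prompt])
    return cfg_combine(cond, uncond, guidance_scale)


def toy_model(latents: List[Vector], prompts: List[str]) -> List[Vector]:
    """Hypothetical stand-in for a denoiser: shifts each latent by the prompt length."""
    return [[x + len(p) for x in lat] for lat, p in zip(latents, prompts)]
```

Batching the two passes trades memory for one fewer model invocation per step; with use_cfg disabled the function falls back to a single ordinary pass.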
Infrastructure:
- CFG-aware model loading and validation
- Enhanced configuration validation for CFG parameters
- Backward compatible - works with non-CFG models
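As an illustration of what validating the CFG parameters might involve, here is a minimal sketch. The function name validate_cfg_params and the specific rules are assumptions based only on the parameter descriptions in these notes; the checks in the library itself may differ.

```python
from typing import Optional


def validate_cfg_params(
    use_cfg: bool,
    guidance_scale: float = 1.0,
    negative_prompt: Optional[str] = None,
) -> None:
    """Raise on obviously invalid CFG settings (illustrative rules, not the library's)."""
    if not isinstance(use_cfg, bool):
        raise TypeError("use_cfg must be a boolean")
    if guidance_scale < 1.0:
        # 1.0 is documented as "no guidance", so it is treated as the lower bound here.
        raise ValueError("guidance_scale must be >= 1.0 (1.0 means no guidance)")
    if negative_prompt is not None and not isinstance(negative_prompt, str):
        raise TypeError("negative_prompt must be a string")
```

Running such checks before sampling starts surfaces configuration mistakes immediately instead of partway through inference.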
🎨 Bézier Activations (v0.1.1+)
TrainableBezier for per-channel learnable transformations:
- Optimized implementation with torch.addcmul (1.41× faster)
- Used in VAE latent bottleneck (mu/logvar) and RGB output
- Total 1,036 learnable parameters: 1,024 (latent) + 12 (RGB)
Performance:
- Benchmark: 90sec/step → 8sec/step after SlidingBezier removal
- Fused multiply-add operations for efficiency
- Cached intermediate values (t², t³, t_inv², t_inv³)
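For readers unfamiliar with the curve itself, the following sketch evaluates a cubic Bézier in Bernstein form, reusing the same cached powers listed above. It assumes four control points per channel, which is consistent with the cached cubes and with 12 parameters for 3 RGB channels; the figure of 256 per-channel curves in the latent bottleneck is inferred from 1,024 / 4 and is not stated in these notes. The names cubic_bezier and bezier_param_count are hypothetical, and scalars stand in for tensors.

```python
from typing import Sequence


def cubic_bezier(t: float, p: Sequence[float]) -> float:
    """Evaluate a cubic Bezier curve with control points p[0..3] at t in [0, 1]."""
    t_inv = 1.0 - t
    # Powers are computed once and reused, mirroring the cached t², t³, t_inv², t_inv³.
    t2, t_inv2 = t * t, t_inv * t_inv
    t3, t_inv3 = t2 * t, t_inv2 * t_inv
    # Bernstein form, accumulated as a chain of multiply-adds.
    out = t_inv3 * p[0]
    out += 3.0 * t_inv2 * t * p[1]
    out += 3.0 * t_inv * t2 * p[2]
    out += t3 * p[3]
    return out


def bezier_param_count(curves: int, control_points: int = 4) -> int:
    """Learnable parameters when every per-channel curve owns its own control points."""
    return curves * control_points
```

With evenly spaced control points (0, 1/3, 2/3, 1) the curve reduces to the identity, which makes a natural starting point for a learnable activation.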
✨ VAE Architecture Improvements
Enhanced Attention:
- Increased attention layers from 2 to 4
- Better global context modeling
- Richer latent representations
Improved RGB Output:
- Wider channels: 128→96→48→3
- GroupNorm + SiLU activation
- No squashing for full color range
- Fixes color tinting issues
Input Validation:
- Channel dimension validation in FluxCompressor.forward()
- Early detection of shape mismatches
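A fail-fast channel check of this kind might look like the sketch below. The helper check_channels is hypothetical and is not taken from FluxCompressor; it assumes a channels-first layout (channel axis at index 1) and operates on a plain shape tuple so it runs without any tensor library.

```python
from typing import Sequence


def check_channels(shape: Sequence[int], expected_channels: int, channel_dim: int = 1) -> None:
    """Fail fast with a readable message when the channel dimension is wrong."""
    if len(shape) <= channel_dim:
        raise ValueError(
            f"expected at least {channel_dim + 1} dimensions, got shape {tuple(shape)}"
        )
    actual = shape[channel_dim]
    if actual != expected_channels:
        raise ValueError(
            f"expected {expected_channels} channels at dim {channel_dim}, "
            f"got {actual} (shape {tuple(shape)})"
        )
```

Checking at the top of a forward pass turns an obscure convolution error deep in the network into a message that names the offending dimension.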
🔧 Technical Improvements
SPADE Context Handling (v0.2.1):
- Use None instead of torch.zeros_like() when disabled
- More efficient - avoids unnecessary tensor allocation
- Semantically correct - explicitly means "no context"
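The pattern can be illustrated with a toy conditioning layer. This is a sketch of the None-as-sentinel idea only, not the SPADE code in FluxFlow: modulate is a hypothetical function, lists stand in for tensors, and the scale-and-shift formula is purely illustrative.

```python
from typing import List, Optional


def modulate(x: List[float], context: Optional[List[float]] = None) -> List[float]:
    """Toy stand-in for a context-conditioned layer."""
    if context is None:
        # No context: skip the modulation branch entirely, so nothing is allocated.
        return x
    # Context present: apply an illustrative scale-and-shift driven by the context.
    return [xi * (1.0 + ci) + ci for xi, ci in zip(x, context)]
```

Passing None lets the callee distinguish "no context" from "a context that happens to be all zeros", and it avoids building a tensor whose only purpose is to be ignored.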
Removed SlidingBezier:
- 25× slower than SiLU
- 5× memory overhead from unfold()
- Performance: 90sec/step → 8sec/step
📦 Installation
pip install fluxflow==0.3.1
🔗 Links
- PyPI: https://pypi.org/project/fluxflow/0.3.1/
- Documentation: https://github.com/danny-mio/fluxflow-core/blob/v0.3.1/CHANGELOG.md
- Training Tools: https://github.com/danny-mio/fluxflow-training/releases/tag/v0.3.1
📋 What's Changed
Full Changelog: v0.2.1...v0.3.1
🧪 CI Status
- ✅ All tests passing (Python 3.10, 3.11, 3.12)
- ✅ Code quality: flake8 clean, black formatted
- ✅ Type checking: mypy clean