v1.2.1 USP, 2D/3D Parallel

@DefTruth DefTruth released this 02 Feb 02:57
· 244 commits to main since this release
7496c6a

🎉 The v1.2.1 release is ready. Major updates include Ring Attention w/ batched P2P, USP (hybrid Ring and Ulysses), hybrid 2D and 3D parallelism (💥USP + TP), and reduced VAE-P communication overhead.

# Hybrid 2D/3D Parallelism in Cache-DiT is fully compatible w/ torch.compile, 
# Cache Acceleration, Text Encoder Parallelism, VAE Parallelism and more.
torchrun --nproc_per_node=8 -m cache_dit.generate flux2 --config parallel_2d.yaml --compile
torchrun --nproc_per_node=8 -m cache_dit.generate flux2 --config parallel_3d.yaml --compile
torchrun --nproc_per_node=8 -m cache_dit.generate --parallel ulysses_tp --cache --compile
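
The commands above launch 8 processes, and hybrid 2D/3D parallelism splits those ranks across several parallel axes. As a rough illustration of the arithmetic only, the sketch below enumerates the degree combinations whose product equals the world size and maps a flat rank to mesh coordinates. The axis names and their ordering (`ulysses`, `ring`, `tp`), the row-major layout, and the helper names `mesh_layouts` and `rank_to_coords` are assumptions made for this example. They are not taken from Cache-DiT's API or from the `parallel_2d.yaml` / `parallel_3d.yaml` configs, whose contents are not shown here.

```python
from itertools import product
from math import prod


def mesh_layouts(world_size, dims=3):
    """List every tuple of `dims` positive degrees whose product is world_size.

    With dims=3 a tuple is read here as (ulysses, ring, tp); that ordering
    is an assumption for illustration, not Cache-DiT's documented layout.
    """
    divisors = [d for d in range(1, world_size + 1) if world_size % d == 0]
    return [c for c in product(divisors, repeat=dims) if prod(c) == world_size]


def rank_to_coords(rank, shape):
    """Map a flat rank to coordinates in a row-major mesh of the given shape."""
    coords = []
    for size in reversed(shape):
        coords.append(rank % size)  # position along the fastest-varying axis left
        rank //= size
    return tuple(reversed(coords))


if __name__ == "__main__":
    # All ways to spread 8 ranks over three axes, e.g. (2, 2, 2) or (4, 1, 2).
    for layout in mesh_layouts(8):
        print(layout)
    # Where rank 5 sits in a hypothetical 2 x 2 x 2 mesh.
    print(rank_to_coords(5, (2, 2, 2)))
```

A layout with one degree equal to 1, such as `(4, 1, 2)`, corresponds to a 2D split in this reading, while `(2, 2, 2)` uses all three axes.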

What's Changed

New Contributors

Full Changelog: v1.2.0...v1.2.1