v1.3.0: USP, 2D/3D Parallel, FP8 Blockwise, ...
Cache-DiT v1.3.0 is a major release following v1.2.0. The major changes include:
- cache-dit-generate command line tool
- Optimized VAE parallel communication with batched isend/irecv
- 2D/3D Parallelism: Hybrid CP (USP) + TP, e.g., SP2 + TP2
- USP support (hybrid Ulysses and ring attention)
- New model support: GLM-Image, FLUX.2-Klein, Helios, FireRed-Image-Edit, and more.
- Support passing a `quantize_config` to the `enable_cache` API
- Support loading cache, parallelism and quantization config from YAML (see docs)
- FP8 Blockwise dynamic quantization support
- AMD GPU support
- ...
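
To illustrate what a hybrid layout such as SP2 + TP2 means in terms of ranks, here is a minimal sketch in plain Python. It is not Cache-DiT's implementation: the helper `build_2d_groups` is hypothetical, and the row-major layout (TP ranks contiguous, SP ranks strided) is an assumption that follows a common device-mesh convention.

```python
from typing import List, Tuple


def build_2d_groups(
    world_size: int, sp_size: int, tp_size: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Hypothetical helper: split ranks into SP and TP groups on a 2D grid.

    Assumes a row-major (sp, tp) grid, i.e. rank = sp_idx * tp_size + tp_idx.
    """
    if sp_size * tp_size != world_size:
        raise ValueError("sp_size * tp_size must equal world_size")

    # Ranks in the same row share an sp_idx and form one TP group.
    tp_groups = [
        [s * tp_size + t for t in range(tp_size)] for s in range(sp_size)
    ]
    # Ranks in the same column share a tp_idx and form one SP group.
    sp_groups = [
        [s * tp_size + t for s in range(sp_size)] for t in range(tp_size)
    ]
    return sp_groups, tp_groups


# SP2 + TP2 on 4 GPUs
sp_groups, tp_groups = build_2d_groups(world_size=4, sp_size=2, tp_size=2)
print(sp_groups)  # [[0, 2], [1, 3]]
print(tp_groups)  # [[0, 1], [2, 3]]
```

Under this assumed layout, each rank belongs to exactly one SP group and one TP group, so sequence-parallel and tensor-parallel collectives run over disjoint sets of peers.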
Full Changelog: v1.2.0...v1.3.0