v1.3.1
What's Changed
- chore: update load configs docs by @DefTruth in #867
- fix: skip fp8 quantize linear w/ bias in tp by @DefTruth in #869
- chore: add quick start flags for quantize by @DefTruth in #871
- chore: update pypi download badge by @DefTruth in #872
- bugfix: remove unsupported quantize type by @DefTruth in #873
- feat: expand quantize config by @DefTruth in #874
- feat: support async ulysses for flux2 series by @DefTruth in #877
- chore: clean up patch functors code by @DefTruth in #878
- chore: fix docs typo by @DefTruth in #879
- chore: safe import metrics funcs by @DefTruth in #880
- chore: update quantization docs by @DefTruth in #881
- chore: use rel imports for calibrators by @DefTruth in #882
- chore: suppress torchao warnings by @DefTruth in #883
- chore: add tune alias for max-autotune by @DefTruth in #884
- remove manual graph break in cache blocks by @DefTruth in #885
- docs: format docs by @DefTruth in #886
- docs: fix typos by @DefTruth in #887
- [1/N] feat: support flux2-klein kv - tp + compile by @DefTruth in #888
- chore: clean up tp utils code by @DefTruth in #890
- chore: fix api docs typo by @DefTruth in #891
- chore: add mcc usage docs by @DefTruth in #892
- chore: update mcc usage docs by @DefTruth in #893
- chore: add mcc to cache-dit arch by @DefTruth in #894
- chore: update mcc docs by @DefTruth in #895
- [2/N] feat: support fp8 per-row + tp for flux2-klein kv by @DefTruth in #896
- quant: add float8 linear check by @DefTruth in #898
- docs: format docs by @DefTruth in #899
- deps: bump up torch to 2.11.0 by @DefTruth in #900
- quant: refactor torchao backend impl by @DefTruth in #901
- feat: support regional quantization by @DefTruth in #902
- chore: change docs highlight color by @DefTruth in #903
- chore: optimize quant stats summary by @DefTruth in #904
- kernel: register comm kernels as torch ops by @DefTruth in #905
- kernel: refactor custom triton kernels by @DefTruth in #907
- [2/N] kernel: refactor custom triton kernels by @DefTruth in #908
- [3/N] kernel: refactor custom triton kernels by @DefTruth in #909
- quant: refactor quantize api, deprecated kwargs by @DefTruth in #910
- [2/N] quant: refactor quantize api, deprecated kwargs by @DefTruth in #911
- chore: suppress diffusers torchao warnings by @DefTruth in #912
- chore: fix load configs docs typo by @DefTruth in #913
- chore: optimize quant ctx summary by @DefTruth in #914
Full Changelog: v1.3.0...v1.3.1