What's Changed
- Add MTP quantization support to the GGUF format by @n1ck-guo in #1866
- Add bf16 + NHD layout support, refactor sage_dynamic_quant by @luoyu-intel in #1882
- Suppress misleading warning when detecting model type for GGUF export directories by @lvliang-intel in #1887
- Fix performance regression by @wenhuach21 in #1886
- Fix random rotation and update rotation doc by @lkk12014402 in #1884
- Update auto-round-lib release package build by @chensuyue in #1895
- Fix CI coverage & bug grep issue by @chensuyue in #1893
- Fix gguf opt-rtn regression by @wenhuach21 in #1905
- Fix: guard zero-division in GGUF quant kernels to avoid NaN block scales by @Entrpi in #1909
- Fall back to a different compute type on b70 if needed by @yiliu30 in #1904
Full Changelog: v0.13.0...v0.13.1