What's Changed
- [CI] fix release action cannot find uv & clean actions by @CSY-ModelCloud in #2837
- [CI] install timm for internvl chat by @CSY-ModelCloud in #2839
- Add Laguna model support by @Qubitium in #2836
- [MODEL] support `ernie4_5_vl_moe` by @ZX-ModelCloud in #2838
- fix AttributeError: 'NoneType' object has no attribute 'from_pretrained' by @CSY-ModelCloud in #2840
- Add GSM8K Platinum to Laguna regression by @Qubitium in #2841
- docs: update hardware support table by @Qubitium in #2842
- [FIX] ci test by @ZX-ModelCloud in #2843
- Add NPU quant method coverage by @Qubitium in #2845
- [FIX] AWQ device placement to follow planner target devices by @ZX-ModelCloud in #2847
- add nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 support by @CSY-ModelCloud in #2846
- fix test_subset by @CSY-ModelCloud in #2849
- new template for Nemotron_3 test by @CSY-ModelCloud in #2851
- update internvl_chat pkgs by @CSY-ModelCloud in #2850
- add inclusionAI/Ling-2.6-flash support by @CSY-ModelCloud in #2844
- Refactor LazyTurtle checkpoint tensor resolution by @ZX-ModelCloud in #2852
- fix AttributeError: '_DummyConfig' object has no attribute 'model_type' by @CSY-ModelCloud in #2853
- fix apply_moe_config was not found by @CSY-ModelCloud in #2854
- added "qwen3_5_moe_text" definition and "qwen3_5_text" definition by @ZX-ModelCloud in #2857
- fix first layer was asserted, but only last 2 layers are quanted by @CSY-ModelCloud in #2861
- [MODEL] support "glm4v_moe" by @ZX-ModelCloud in #2862
- add LogBar progress to sync_all_meta() writes by @ZX-ModelCloud in #2860
- fix: handle float8 tensor serialization in streaming safetensors saves by @ZX-ModelCloud in #2863
- move causal_conv1d to gptqmodel/hf_kernels, not in root by @CSY-ModelCloud in #2864
- store logs file in ./logs, not in root dir by @CSY-ModelCloud in #2865
- [CI] fix dust or dir may not exist by @CSY-ModelCloud in #2867
- fix tests/models/test_qwen3_5_text_only by @ZX-ModelCloud in #2866
- fix TestVoxtral::test_voxtral - IndexError: index out of range in self by @CSY-ModelCloud in #2870
- [CI] fix envs conflict on one host runner by @CSY-ModelCloud in #2871
- [FIX] test_hymba by @ZX-ModelCloud in #2872
- fix KeyError: 'type' & AutoTokenizer was not found by @CSY-ModelCloud in #2873
- [MODEL] support zamba and zamba2 by @ZX-ModelCloud in #2868
- no need to import AutoTokenizer which is unused by @CSY-ModelCloud in #2874
- [FIX] FP8 dequantization for cross-shard scales and partial edge blocks by @ZX-ModelCloud in #2875
- add retry to fix remote files missing in cache dir by @CSY-ModelCloud in #2876
- fix: preserve mtp.* tensors for dense Qwen3.5/Qwen3.6 models by @erm14254 in #2869
- fix 2 deps on CI by @CSY-ModelCloud in #2881
- fix assert was checking last layers by @CSY-ModelCloud in #2880
- fix rope_parameters is not inited by @CSY-ModelCloud in #2882
- fix AttributeError: 'FakeGPTQModel' object has no attribute '_sanitiz… by @CSY-ModelCloud in #2885
- [MODEL] support `minicpmv_4_6` by @ZX-ModelCloud in #2884
- improve JIT extension failure diagnostics for CI flakiness by @CSY-ModelCloud in #2886
- log stack trace for marlin jit error by @CSY-ModelCloud in #2887
- fix Qwen3OmniMoe throws get_input_embeddings NotImplementedError by @CSY-ModelCloud in #2888
- Fix dequantization for ignored layers, padded FP8 scales, and non-4D tensors by @ZX-ModelCloud in #2889
- fix command got wrong args by @CSY-ModelCloud in #2891
- print expected in error by @CSY-ModelCloud in #2890
- Fix InternVL tokenizer compat on transformers 5 by @CSY-ModelCloud in #2892
- [MODEL] support deepseek_v4 by @ZX-ModelCloud in #2877
- add kimi 2.5 support by @CSY-ModelCloud in #2858
- [CI] use torch 2.12.0 & python 3.14t as default on CI by @CSY-ModelCloud in #2894
- [MODEL] support mimo_v2 by @ZX-ModelCloud in #2893
- [CI] auto clean cache & retry by @CSY-ModelCloud in #2895
- [FIX] ovis incompatibility with transformers v5 by @ZX-ModelCloud in #2896
- [MODEL] support `ovis2_5` by @ZX-ModelCloud in #2897
- fix paroquant was not included in release by @CSY-ModelCloud in #2900
- [CI] install release pkg instead of source by @CSY-ModelCloud in #2901
- [MODEL] support ovis2 6 moe by @ZX-ModelCloud in #2899
- [MODEL] support `interns1` by @ZX-ModelCloud in #2902
- [MODEL] support ovis2_6_next by @ZX-ModelCloud in #2904
- [MODEL] support `hrm_text` by @ZX-ModelCloud in #2905
- ascend kernel compat update: cann 9.1beta1 by @Qubitium in #2906
- fix test_subset.py by @ZX-ModelCloud in #2907
- Ascend tests by @Qubitium in #2908
- [MODEL] support `nemotron_labs_diffusion` by @ZX-ModelCloud in #2909
- [MODEL] support `hunyuan_v1_dense` and `hunyuan_v1_moe` by @ZX-ModelCloud in #2910
- [FIX] Avoid invoking `tensor.transpose(0, 1).contiguous()` when the shapes already match by @ZX-ModelCloud in #2913
- [FIX] `DeepSeek-V4-Pro` experts module can now be correctly dequantized to `BF16` by @ZX-ModelCloud in #2914
- [FIX] `weight_only_looper` did not support multi-GPU quantization by @ZX-ModelCloud in #2915
- fix(hub): import create_repo from huggingface_hub (transformers dropped the passthrough) by @Anai-Guo in #2917
- [FIX] non-existent import in `transformers.utils.hub` with the latest `transformers` by @ZX-ModelCloud in #2918
- prep for v7.1.0 release by @Qubitium in #2919
New Contributors
Full Changelog: v7.0.0...v7.1.0