Breaking Changes
- Intel Neural Compressor (INC) and Intel Extension for PyTorch (IPEX) integrations have been fully removed (#1687). Both were deprecated in v1.27.0. Users relying on them should stay on v1.27.
- ONNX dependency was removed from package requirements (#1753)
- OpenVINO and NNCF are now installed by default; the [openvino] and [nncf] extras are deprecated (#1602)
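
Since ONNX is no longer pulled in as a package requirement, code that imports onnx may want to guard against its absence. The following is a minimal sketch using only the Python standard library; the helper name `has_module` is hypothetical and not part of this package.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


# Hypothetical guard: warn early instead of failing later on `import onnx`.
if not has_module("onnx"):
    print("onnx is not installed; install it separately if your workflow needs it")
```

The check only looks the module up, so it stays cheap even when onnx is present.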
New Model Support
- Arcee Trinity (AFMoE) (#1569)
- Qwen3VL (#1551)
- Qwen3-next (hybrid SSM/attention) (#1523)
- Qwen3.5, Qwen3.5-MoE, Qwen3.6 (#1689)
- Gemma 4 (#1688)
- Eagle3: Speculative decoding draft model support (#1588)
- LFM2-MoE (#1691)
- Kokoro TTS (#1653)
- Qwen3-ASR (#1677)
- CohereLabs/tiny-aya-base (Command-R family) (#1623)
- HY-MT1.5-1.8B (#1621)
- VideoChat (#1637)
Quantization & Compression
- Extended dataset options for calibration: Datasets can now be specified with parameters, e.g. wikitext2:seq_len=128 (#1564)
- Default 8-bit quantization configs with configurable dynamic quantization group size (#1570)
- NNCF CB4 mode renamed to cb4_f8e4m3 for newer NNCF versions (#1597)
- Data-Aware AWQ for Qwen3-30B added to configuration (#1620)
- Fix quantized model save path: Immediate save after quantization now writes to the correct path (#1576)
- Fix per_layer_inputs value error during quantization (#1714)
- Fix calibration data collection (#1778)
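
To illustrate the parameterized dataset syntax mentioned above (wikitext2:seq_len=128), here is a rough sketch of how such a string could be split into a dataset name and options. It is not the library's parser: the function name `parse_dataset_spec` is hypothetical, and the comma separator for multiple parameters is an assumption, since the release notes only show a single parameter.

```python
def parse_dataset_spec(spec: str) -> tuple[str, dict[str, str]]:
    """Split 'name:key=value[,key=value...]' into (name, options).

    Assumption: multiple parameters are comma-separated. Values are kept
    as strings; converting them to int/float is left to the caller.
    """
    name, _, params = spec.partition(":")
    options: dict[str, str] = {}
    if params:
        for item in params.split(","):
            key, sep, value = item.partition("=")
            if not sep or not key.strip():
                raise ValueError(f"Malformed dataset parameter: {item!r}")
            options[key.strip()] = value.strip()
    return name.strip(), options


name, options = parse_dataset_spec("wikitext2:seq_len=128")
print(name, options)
```

A spec without a colon, such as wikitext2 alone, yields the name with an empty options dictionary.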
Improvements
- Transformers v5 compatibility (#1589)
- Hybrid attention models: past_key_values in attention_mask now supported for stateful inference (#1641)
- beam_idx connected to Linear Attention Layers (CausalConv1D, SSM, GDN) for correct beam search with recurrent models (#1619)
- Fix long-context inference for Phi-3.5 and Phi-4 (#1744)
- Fix SpeechT5 dynamic batch inference (#1664)
- Fix MoE patching to enable ConvertTiledMoeBlockToGatherMatmuls transformation (#1741)
- Improved numpy input handling for model inputs with mixed types (#1646)
- Fix task inference for Phi-4-multimodal-instruct (#1610)
New Contributors
- @andrew-k-park made their first contribution in #1590
- @andrey-churkin made their first contribution in #1597
- @Mohamed-Ashraf273 made their first contribution in #1623
- @anzr299 made their first contribution in #1620
- @MissLostCodes made their first contribution in #1646
- @hf-security-analysis[bot] made their first contribution in #1663
- @paulinebm made their first contribution in #1671
- @rtrompier made their first contribution in #1685
- @xufang-lisa made their first contribution in #1637
- @openvino-agent made their first contribution in #1653
- @Lyamin-Roman made their first contribution in #1740
What's Changed
Full Changelog: v1.27.0...v2.0.0
Compatible with transformers>=v4.45,<v5.1
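
As a quick sanity check of the range above, the sketch below tests whether a version string falls within >=4.45,<5.1 using only the standard library. The helper names are hypothetical, and the comparison is simplified: it looks only at the leading numeric components, so pre-release and dev suffixes are ignored rather than handled according to PEP 440.

```python
import re


def version_tuple(version: str) -> tuple[int, ...]:
    """Extract leading numeric components, e.g. 'v4.45.2' -> (4, 45, 2)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def in_supported_range(version: str) -> bool:
    """Return True if version is >= 4.45 and < 5.1 (numeric parts only)."""
    return (4, 45) <= version_tuple(version) < (5, 1)


print(in_supported_range("4.46.1"))
```

In practice the installed version could be read with `importlib.metadata.version("transformers")` and passed to `in_supported_range`.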