Intel Neural Compressor Release 3.8

thuang6 released this 02 Jun 01:20
v3.8
62ffebf
  • Highlights
  • Features
  • Improvements
  • Validated Hardware
  • Validated Configurations

Highlights

  • Introduced experimental support for the JAX framework

Features

  • Support FP8 quantization for Keras/JAX (experimental)
  • Support FP8 KV cache static quantization (experimental)
  • Support FP8 Attention static quantization (experimental)
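
For readers new to the term, the idea behind FP8 static quantization can be sketched in plain Python. This is only an illustration and does not use the Intel Neural Compressor API. The helper names (`round_to_e4m3`, `calibrate_scale`, `quantize`, `dequantize`) are hypothetical. It also assumes the OCP E4M3 format with a maximum finite value of 448, which may differ from the range a given accelerator uses. A per-tensor scale is computed once from calibration data, then reused for every later input.

```python
import math

# Assumption: OCP FP8 E4M3 (4 exponent bits, 3 mantissa bits), max finite value 448.
E4M3_MAX = 448.0


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest E4M3-representable value, saturating at +/-448."""
    if x == 0.0 or math.isnan(x):
        return x
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    # frexp returns mag = m * 2**e with 0.5 <= m < 1, so the binary exponent is e - 1.
    # -6 is the smallest normal exponent; below it the spacing stays fixed (subnormals).
    exp = max(math.frexp(mag)[1] - 1, -6)
    step = 2.0 ** (exp - 3)  # 3 mantissa bits give 8 steps per power of two
    return sign * min(round(mag / step) * step, E4M3_MAX)


def calibrate_scale(calibration_batches):
    """Static per-tensor scale: map the largest calibration magnitude to E4M3_MAX."""
    amax = max(abs(v) for batch in calibration_batches for v in batch)
    return amax / E4M3_MAX if amax > 0 else 1.0


def quantize(values, scale):
    """Scale into the FP8 range and round; values beyond the calibrated range saturate."""
    return [round_to_e4m3(v / scale) for v in values]


def dequantize(quantized, scale):
    """Map FP8 values back to the original range."""
    return [q * scale for q in quantized]


# Hypothetical calibration data; the scale is fixed after this step.
scale = calibrate_scale([[0.5, -2.0, 1.25], [3.5, -0.75, 0.1]])
restored = dequantize(quantize([3.5, -2.0, 0.1, 10.0], scale), scale)
print(scale, restored)
```

Because the scale is fixed ahead of time, an input larger than anything seen during calibration (10.0 here) saturates at the calibrated maximum. That is the usual trade-off of static scaling compared with computing the scale at run time.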

Improvements

  • New Gemma3 FP8 PTQ example for Keras/JAX
  • New ViT FP8 PTQ example for Keras/JAX
  • Llama 3 series MXFP4 / MXFP8 PTQ example with FP8 KV & Attention
  • Llama 4 Scout MXFP4 / MXFP8 PTQ example with FP8 KV & Attention
  • Qwen3 MXFP4 / MXFP8 PTQ example with FP8 KV & Attention
  • DeepSeek R1 MXFP4 / MXFP8 PTQ example with FP8 KV & Attention
  • Transformers v5 support
  • Removal of deprecated 2.x API

Validated Hardware

  • Intel Gaudi AI Accelerators (Gaudi 2 and 3)
  • Intel Xeon Scalable processors (4th, 5th and 6th Gen)
  • Intel® Arc™ B-Series Graphics GPU (B580 and B60)

Validated Configurations

  • Ubuntu 24.04 & Windows 11
  • Python 3.11, 3.12, 3.13
  • PyTorch 2.9, 2.10
  • JAX 0.9

Notes

  • We recommend using v3.8 or later, which mitigates CVEs in the code.