Release Version 2.11.0 · qualcomm/aimet

New Feature
- PyTorch
  - SpinQuant (experimental) - implement SpinQuant PTQ technique (https://arxiv.org/pdf/2308.13137) for Llama, Qwen2, and Mistral families (R1 rotation w/o optimization) (7364b37)
  - Enable Adascale and Omniquant for Mistral (d33e98c)
- ONNX
  - Enable llm_configurator for Llama (Experimental) (08c17b8)
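Note on SpinQuant R1: the rotation can be folded into weights ahead of time because an orthogonal matrix cancels against its own transpose, so the floating-point output is unchanged while the rotated hidden activations may be easier to quantize. The NumPy sketch below illustrates that invariance only. It is not AIMET's implementation; the helper names `random_orthogonal` and `rotate_linear_pair` are hypothetical, and drawing the rotation via QR decomposition is an assumption made for illustration.

```python
import numpy as np


def random_orthogonal(dim, seed=0):
    """Draw a random orthogonal matrix via QR (illustrative choice of rotation)."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Flip column signs so the result does not depend on QR sign conventions.
    return q * np.sign(np.diag(r))


def rotate_linear_pair(w_in, w_out, rot):
    """Fold a rotation into two consecutive linear maps.

    Original:  y = w_out @ (w_in @ x)
    Rotated:   y = (w_out @ rot) @ ((rot.T @ w_in) @ x)
    The two agree because rot @ rot.T is the identity.
    """
    return rot.T @ w_in, w_out @ rot


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w_in = rng.standard_normal((8, 4))   # writes into the hidden dimension
    w_out = rng.standard_normal((3, 8))  # reads from the hidden dimension
    x = rng.standard_normal(4)

    rot = random_orthogonal(8)
    w_in_r, w_out_r = rotate_linear_pair(w_in, w_out, rot)

    # The rotated pair reproduces the original output up to float error.
    print(np.allclose(w_out @ (w_in @ x), w_out_r @ (w_in_r @ x)))
```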
Bug fixes and Improvements
- Common
  - Represent LPBQ as DequantizeLinear in onnx QDQ (a967b8f)
  - Add additional sanity checks in LPBQ export logic (45c2a65)
  - Allow negative block axis in LPBQ QDQ export (6f670a4)
  - Add support for enabling param bw=2 in QuantSim (2d4e0eb)
  - Fix tanh output encoding range to [-1, 1] (3c92bb7)
- ONNX
  - Apply matmul exception rule only for integer quantization (bb93c76)
  - Optimize blockwise min-max encoding analyzer (4febdd4)
  - Remove explicit FP32 model creation inside AdaRound and optimize building sessions during the optimization process (b1415bd)
  - Make Concat output quantizer inherit fixed input range (50f35dd)
  - Enable output quantizers to inherit input encoding when tying encodings (3750526)
  - Fix bug in CLE with bn_conv groups (654f4b1)
- PyTorch
  - Guarantee positive scale during aimet-torch QAT (2ed8305)
  - Add secondary progress bars to Adascale and Omniquant (6c92a97)
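Note on the fixed tanh range, param bw=2, and positive-scale items above: all three concern how a quantization grid is derived from a value range and a bitwidth. The sketch below is a generic affine quantizer written in NumPy under stated assumptions. It does not use or reproduce AIMET's API; `affine_encoding`, `fake_quantize`, and the `eps` floor are hypothetical names and choices made for illustration.

```python
import numpy as np


def affine_encoding(x_min, x_max, bitwidth, eps=1e-8):
    """Return (scale, offset) for an unsigned affine grid covering [x_min, x_max].

    The scale is floored at eps so it stays strictly positive even when the
    range collapses to a single point (an assumed safeguard for this sketch).
    """
    num_steps = 2 ** bitwidth - 1
    scale = max((x_max - x_min) / num_steps, eps)
    offset = round(x_min / scale)
    return scale, offset


def fake_quantize(x, scale, offset, bitwidth):
    """Quantize to the integer grid, clamp, and map back to floating point."""
    q = np.clip(np.round(x / scale) - offset, 0, 2 ** bitwidth - 1)
    return (q + offset) * scale


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 101)

    # Fixed range matching the output of tanh, 8-bit grid.
    scale8, offset8 = affine_encoding(-1.0, 1.0, bitwidth=8)
    print("8-bit scale:", scale8)

    # A 2-bit grid has at most four representable values.
    scale2, offset2 = affine_encoding(-1.0, 1.0, bitwidth=2)
    print("2-bit levels:", np.unique(fake_quantize(x, scale2, offset2, 2)))
```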
Documentation Updates
- Update Quick Start example and PTQ section (6c9f584)
- Add missing workflow images (f961ed4)
Known Issues
- Keras
  - Accuracy drop observed with AIMET Keras for certain models. Fix is planned for the next release.
  - Skipping 2.11 aimet-keras release due to regression
