
Releases: quic/aimet

version 1.34.0

16 Sep 16:10

What's New

  • PyTorch
    • Added support for WSL2
    • Upgraded the CUDA version for the PyTorch 2.1 variant
    • Extended QuantAnalyzer functionality for LLM range analysis (see the sketch after this list)
  • Keras
    • Added support for certain TFOpLambda layers created by TF functional calls.
  • ONNX
    • Upgraded AIMET to support ONNX version 1.16.1 and ONNX Runtime version 1.18.1.
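
The QuantAnalyzer range-analysis extension builds on the existing aimet_torch.quant_analyzer workflow. Below is a minimal sketch of that workflow, assuming the documented QuantAnalyzer/CallbackFunc API; the toy model and callbacks are placeholders, and exact signatures should be verified against the AIMET 1.34.0 API docs.

    import torch
    from aimet_common.defs import QuantScheme
    from aimet_torch.quant_analyzer import QuantAnalyzer, CallbackFunc

    # Placeholder model and input; substitute your own model (e.g. an LLM block).
    model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()).eval()
    dummy_input = torch.randn(1, 16)

    def calibrate(model, _):
        # Forward passes used to estimate quantization ranges.
        model(dummy_input)

    def evaluate(model, _):
        # Returns a score used for per-layer sensitivity analysis.
        return float(model(dummy_input).mean())

    analyzer = QuantAnalyzer(model, dummy_input,
                             forward_pass_callback=CallbackFunc(calibrate),
                             eval_callback=CallbackFunc(evaluate))
    analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
                     default_param_bw=8, default_output_bw=16,
                     config_file=None, results_dir='./quant_analyzer_results')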

Documentation

Packages

  • aimet_torch-1.34.0.cu121-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 GPU package with Python 3.10 and CUDA 12.x
  • aimet_torch-1.34.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
  • aimet_torch-1.34.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_onnx-1.34.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.16 GPU package with Python 3.10 - Recommended for use with ONNX models
  • aimet_onnx-1.34.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.16 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_tensorflow-1.34.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
  • aimet_tensorflow-1.34.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
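
For example, after downloading the wheel that matches your framework, Python, and CUDA versions from this release, it can be installed with pip (the PyTorch 2.1 / CUDA 12.x variant is shown below; the command is illustrative):

    python3 -m pip install aimet_torch-1.34.0.cu121-cp310-cp310-manylinux_2_34_x86_64.whl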

version 1.33.5

13 Sep 00:32

What's New

  • PyTorch
    • Various bug fixes and quality-of-life updates for LoRA
    • Updated the minimum scale value and registered additional custom quantized ops with QuantSim 2.0

Documentation

Packages

  • aimet_torch-1.33.5.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 GPU package with Python 3.10 and CUDA 11
  • aimet_torch-1.33.5.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
  • aimet_torch-1.33.5.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_onnx-1.33.5.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.14 GPU package with Python 3.10 - Recommended for use with ONNX models
  • aimet_onnx-1.33.5.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.14 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_tensorflow-1.33.5.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
  • aimet_tensorflow-1.33.5.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA

version 1.33.0

11 Jul 21:15

What's New

  • PyTorch
    • Enhanced the export pipeline to optimize GPU memory usage with LLMs.
    • [Experimental] Added support for handling LoRA (via the PEFT API) in AIMET, and enabled export of the required artifacts for QNN.
    • Added examples of a training pipeline for distributed KD-QAT.
    • [Experimental] Added support for blockwise quantization (BQ) for the w4fp16 format, and low-power blockwise quantization (LPBQ) for the w4a8 and w4a16 formats. This feature requires QuantSim v2.

Documentation

Packages

  • aimet_torch-1.33.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 GPU package with Python 3.10 and CUDA 11
  • aimet_torch-1.33.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
  • aimet_torch-1.33.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_onnx-1.33.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.14 GPU package with Python 3.10 - Recommended for use with ONNX models
  • aimet_onnx-1.33.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.14 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_tensorflow-1.33.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
  • aimet_tensorflow-1.33.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA

version 1.32.0

04 Jun 18:35

What's New

  • PyTorch
    • Added multi-GPU support for AdaRound.
    • Upgraded AIMET to support PyTorch version 2.1 as a new variant. AIMET with PyTorch version 1.13 remains the default.
  • Keras
    • For models with SeparableConv2D layers, run model_preparer before applying any quantization API (see the sketch after this list).
  • Common
    • Upgraded AIMET to support Ubuntu 22.04 and Python 3.10 for all AIMET variants.
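
A minimal sketch of the SeparableConv2D flow mentioned above, assuming the aimet_tensorflow.keras prepare_model and QuantizationSimModel APIs (Xception is used only because it contains SeparableConv2D layers; verify exact signatures against the AIMET Keras docs):

    import tensorflow as tf
    from aimet_tensorflow.keras.model_preparer import prepare_model
    from aimet_tensorflow.keras.quantsim import QuantizationSimModel

    model = tf.keras.applications.Xception(weights=None)  # contains SeparableConv2D layers
    prepared_model = prepare_model(model)                  # run the preparer first
    # Apply quantization APIs (QuantSim, AdaRound, etc.) to the prepared model, not the original.
    sim = QuantizationSimModel(prepared_model, quant_scheme='tf_enhanced',
                               default_param_bw=8, default_output_bw=8)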

Documentation

Packages

  • aimet_torch_gpu_pt21-1.32.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 GPU package with Python 3.10 and CUDA 11
  • aimet_torch_gpu-1.32.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
  • aimet_torch_cpu_pt21-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_torch_cpu-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • PyTorch 1.13 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_onnx_gpu-1.32.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.14 GPU package with Python 3.10 - Recommended for use with ONNX models
  • aimet_onnx_cpu-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.14 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_tf_gpu-1.32.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
  • aimet_tf_cpu-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA

version 1.31.2

28 May 19:18

What's New

  • TODO

Documentation

Packages

  • aimet_torch-gpu_1.31.2_cu117-cp310-cp310-manylinux_2_32_x86_64.whl
    • PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x - Recommended for use with PyTorch models
  • aimet_torch-cpu_1.31.2_cpu-cp310-cp310-manylinux_2_32_x86_64.whl
    • PyTorch 1.13 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_tensorflow-gpu_1.31.2_cu118-cp310-cp310-manylinux_2_32_x86_64.whl
    • TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
  • aimet_tensorflow-cpu_1.31.2_cpu-cp310-cp310-manylinux_2_32_x86_64.whl
    • TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
  • aimet_onnx-gpu_1.31.2_cu117-cp310-cp310-manylinux_2_32_x86_64.whl
    • ONNX 1.11.0 GPU package with Python 3.10 - Recommended for use with ONNX models
  • aimet_onnx-cpu_1.31.2_cpu-cp310-cp310-manylinux_2_32_x86_64.whl
    • ONNX 1.11.0 CPU package with Python 3.10 - If installing on a machine without CUDA

version 1.31.0

25 Mar 17:57

What's New

  • ONNX
    • Added support for custom ops in QuantSim, CLE, AdaRound and AMP.
    • Added support for QuantAnalyzer.
  • Keras
    • Added support for unrolled quantized LSTM with QuantSim only, in PTQ mode.
    • Fixed ReLU encoding min going past 0 during QAT.
    • Fixed input quantizers for TFOpLambda layers (kwargs).
    • Fixed the logic for placing input quantizers.

Documentation

Packages

  • aimet_torch-torch_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
    • PyTorch 1.13 GPU package with Python 3.8 and CUDA 11.x - Recommended for use with PyTorch models
  • aimet_torch-torch_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
    • PyTorch 1.13 CPU package with Python 3.8 - If installing on a machine without CUDA
  • aimet_torch-torch_cpu_pt19_1.31.0-cp38-cp38-linux_x86_64.whl
    • PyTorch 1.9 CPU package with Python 3.8 - If installing on a machine without CUDA
  • aimet_tensorflow-tf_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
    • TensorFlow 2.10 GPU package with Python 3.8 - Recommended for use with TensorFlow models
  • aimet_tensorflow-tf_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
    • TensorFlow 2.10 CPU package with Python 3.8 - If installing on a machine without CUDA
  • aimet_onnx-onnx_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
    • ONNX 1.11.0 GPU package with Python 3.8 - Recommended for use with ONNX models
  • aimet_onnx-onnx_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
    • ONNX 1.11.0 CPU package with Python 3.8 - If installing on a machine without CUDA

version 1.30.0

17 Jan 10:39

What's New

ONNX

  • Upgraded AIMET to support ONNX version 1.14 and ONNX Runtime version 1.15.
  • Added support for AutoQuant.

Documentation

version 1.29.0

29 Nov 22:00

What's New

Keras

  • Fixed issues with TFOpLambda layers in the QcQuantizeWrapper call.

PyTorch

  • [Experimental] Added support for embedding AIMET encodings within the exported graph using ONNX quantize/dequantize operators. Currently this option is supported only with 8-bit per-tensor quantization.
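
A minimal sketch of how this option might be used; the use_embedded_encodings keyword is an assumption based on this note, so confirm the exact flag name in the 1.29.0 aimet_torch export API docs before relying on it.

    import torch
    from aimet_torch.quantsim import QuantizationSimModel

    # Toy model and input as placeholders; substitute your own.
    model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()).eval()
    dummy_input = torch.randn(1, 16)

    sim = QuantizationSimModel(model, dummy_input, default_param_bw=8, default_output_bw=8)
    sim.compute_encodings(lambda m, _: m(dummy_input), forward_pass_callback_args=None)
    sim.export(path='./output', filename_prefix='model_qdq', dummy_input=dummy_input,
               use_embedded_encodings=True)  # assumed flag: embed encodings as ONNX
                                             # QuantizeLinear/DequantizeLinear nodes instead of
                                             # a separate .encodings file (8-bit per-tensor only)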

ONNX

  • Added support for AdaRound.

TensorFlow

  • No significant updates

Documentation

version 1.28.1

20 Oct 23:54
Release of the AI Model Efficiency Toolkit (AIMET) package
User guide: https://quic.github.io/aimet-pages/releases/1.28.1/user_guide/index.html
API documentation: https://quic.github.io/aimet-pages/releases/1.28.1/api_docs/index.html
Documentation main page: https://quic.github.io/aimet-pages/index.html

version 1.28.0

06 Sep 10:04

What's New

Keras

  • Added support for the Spatial SVD compression feature.
  • [experimental] Debugging APIs have been added for dumping intermediate tensor outputs. This data can be used with current QNN/SNPE tools for debugging accuracy problems.

PyTorch

  • Upgraded the default AIMET PyTorch version to 1.13. AIMET remains compatible with PyTorch version 1.9.

ONNX

  • [experimental] Debugging APIs have been added for dumping intermediate tensor outputs. This data can be used with current QNN/SNPE tools for debugging accuracy problems.

TensorFlow

  • No significant updates

Documentation