Version 2.32.0

@aimet-bot aimet-bot released this 03 Jun 16:25
  • Bug fixes and improvements
    • ONNX

      • Add C++ support for bfloat16 quantization (ca7d3e0)
      • Fix large model support with protobuf 7.x (9ef2251)
      • Skip QDQ pair scale/zp in duplicate_shared_initializers (05e8332)
      • Handle Identity passthrough in duplicate_shared_initializers (1b27d98)
      • Fix SpinQuant embed_tokens filter to exclude non-embedding Gathers (81b8041)
      • Inline fused supergroups after encoding propagation (68fdcb6)
    • Torch

      • Disable output quantizers of reused modules before output encoding propagation (b66b9a1)
      • Inline Q/DQ nodes statically without re-invoking torch.export (de79ae4)
      • Stop incorrect encoding propagation through non-grid-preserving ops (66b2834)
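
For readers unfamiliar with the bfloat16 entry in the ONNX list, the sketch below illustrates what quantizing a float32 value to bfloat16 does numerically: the top 16 bits of the float32 pattern are kept, with round-to-nearest-even applied to the discarded bits. This is a generic illustration only, not the AIMET C++ implementation from ca7d3e0; the function names are hypothetical and do not correspond to any AIMET API.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (hypothetical helper, not an AIMET API)."""
    # Reinterpret the value as its 32-bit IEEE-754 float32 pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: keep the upper bits and force a mantissa bit so it stays NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even: add 0x7FFF plus the lowest kept bit.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def quantize_dequantize_bfloat16(x: float) -> float:
    """Simulate bfloat16 quantization by rounding to bfloat16 and expanding back."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


if __name__ == "__main__":
    for value in (1.0, 3.14159, 1.00390625):
        print(value, "->", quantize_dequantize_bfloat16(value))
```

bfloat16 keeps the 8-bit exponent of float32 but only 7 explicit mantissa bits, so the representable range is unchanged while precision drops to roughly two to three decimal digits.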