Release v0.2.1 · google-ai-edge/ai-edge-quantizer

This release expands the available quantization methods, extends blockwise quantization to more ops and algorithms, and improves constant tensor handling and overall stability.

Key Highlights:

Expanded Quantization Methods:
- Introduced OCTAV as an alternative uniform quantization method.
- Implemented Hadamard rotation quantization, including support for 3D tensors and performance optimizations.
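
To illustrate the idea behind Hadamard rotation quantization, here is a minimal pure-Python sketch. It is not the library's implementation: the helper names (`hadamard`, `matvec`, `quantize_symmetric`), the Sylvester construction, and the 4-bit symmetric scheme are all assumptions chosen for illustration. The point is that an orthonormal rotation spreads a single outlier across all elements, which shrinks the range the quantizer has to cover.

```python
import math

def hadamard(n):
    """Return an orthonormal n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    # Sylvester construction: H_2k = [[H_k, H_k], [H_k, -H_k]].
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]

def matvec(m, x):
    """Multiply matrix m by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def quantize_symmetric(x, num_bits):
    """Symmetric uniform quantization with a single scale for the whole vector."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in x)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in x]
    return q, scale

# Hypothetical weight vector with one large outlier.
weights = [8.0, 0.1, -0.2, 0.3]
h = hadamard(len(weights))

rotated = matvec(h, weights)              # outlier energy is spread out
q, scale = quantize_symmetric(rotated, num_bits=4)
dequantized = [v * scale for v in q]
# The Sylvester matrix is symmetric and orthonormal, so it is its own inverse.
restored = matvec(h, dequantized)
```

In this sketch the rotated vector has a smaller maximum magnitude than the original, so the same number of bits covers a narrower range.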
Enhanced Blockwise Quantization:
- Added support for FullyConnected and EmbeddingLookup ops.
- Enabled blockwise quantization across various uniform algorithms, including OCTAV.
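
As a rough illustration of what blockwise quantization means, the sketch below gives each fixed-size block of a weight row its own scale instead of sharing one scale across the whole tensor. The function names, the block size, and the symmetric 4-bit scheme are assumptions made for this example and do not reflect the library's API.

```python
def quantize_blockwise(row, block_size, num_bits=4):
    """Quantize a weight row symmetrically, with one scale per block."""
    if len(row) % block_size:
        raise ValueError("row length must be a multiple of block_size")
    qmax = 2 ** (num_bits - 1) - 1
    quantized, scales = [], []
    for start in range(0, len(row), block_size):
        block = row[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        quantized.extend(max(-qmax, min(qmax, round(v / scale))) for v in block)
    return quantized, scales

def dequantize_blockwise(quantized, scales, block_size):
    """Map integer values back to floats using each block's own scale."""
    return [v * scales[i // block_size] for i, v in enumerate(quantized)]

# Hypothetical row: a block of small values followed by a block of large ones.
row = [0.01, -0.02, 0.03, 0.02, 5.0, -4.0, 3.0, 2.0]
block_size = 4

quantized, scales = quantize_blockwise(row, block_size)
restored = dequantize_blockwise(quantized, scales, block_size)
```

With a single per-tensor scale, the small values in the first block would all round to zero; a per-block scale keeps them distinguishable at the cost of storing one extra scale per block.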
Improved Constant Tensor Handling:
- Added support for constant tensors that share a buffer but use different quantization parameters.
- Optimized constant tensor duplication to avoid unnecessary copies while keeping transformations correct.
Core Infrastructure & Stability:
- Added support for calibrating composite decompositions.
- Addressed memory allocation issues and zero-size array crashes.
- Improved handling of graph input/output indices, custom ops, and dynamic shapes.
- Adjusted the buffer size to support quantizing larger models.
- General code refactoring and test improvements for better maintainability and reliability.
New Recipes & Features:
- Added a new dynamic_wi4_afp32 recipe.
- Added support for integer inputs in dataset creation.
