v0.4.0
Release Notes: v0.4.0
New Features & Operator Support
- New Operator Support (AEQ): Added int8/int16 support for the following operations via Adaptive Edge Quantization (AEQ):
- MSE Quantization Support: Added Mean Squared Error (MSE) quantization support for:
- New Validation Metrics: Added new metrics for model validation:
- Improved Quantization Precision: The minimum bound for the quantization scale was reduced from $10^{-4}$ to $10^{-9}$ to support finer precision levels (e.g., $10^{-6}$ for int8, $10^{-8}$ for int16). (Pull #344)
- Blockwise Quantization: Added support for blockwise dequantization in AEQ. (Pull #314)
- Weight Only Quantization: Added `Embedding Lookup` to supported subchannel operations. (Pull #310)
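
To illustrate why the lower scale bound matters, here is a minimal pure-Python sketch. It assumes a symmetric per-tensor scheme where the scale is `max_abs / qmax` clamped from below by a floor. The helper names `symmetric_scale` and `quantize` are hypothetical and are not part of the library's API.

```python
def symmetric_scale(values, num_bits, min_scale):
    """Symmetric per-tensor scale, clamped from below by min_scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8, 32767 for int16
    max_abs = max(abs(v) for v in values)
    return max(max_abs / qmax, min_scale)


def quantize(values, scale, num_bits):
    """Round to the nearest integer level and clip to the symmetric range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


# A tensor whose ideal int8 scale is about 1e-6.
tensor = [1.27e-4, -6.0e-5, 2.0e-5]

old_scale = symmetric_scale(tensor, num_bits=8, min_scale=1e-4)  # previous floor
new_scale = symmetric_scale(tensor, num_bits=8, min_scale=1e-9)  # floor as of v0.4.0

old_q = quantize(tensor, old_scale, 8)
new_q = quantize(tensor, new_scale, 8)
print(old_scale, old_q)
print(new_scale, new_q)
```

With the old floor, the scale is forced up to $10^{-4}$ and the tensor collapses onto the levels -1, 0, and 1. With the new floor, the ideal scale near $10^{-6}$ is kept and the full int8 range is used.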
Improvements & Refinements
- Bias Quantization Enhancements:
- Hadamard Quantization:
- Quantization Scope & Granularity:
- Validator & SRQ Fixes:
  - Added validator support for pre-quantized models. (Pull #331)
  - Corrected `signature defs` when dequantization precedes graph output during Static Range Quantization (SRQ). (Pull #327)
  - Fixed a bug to allow the `float_casting` algorithm in `add_weight_only_config`. (Pull #322)
  - Handle zero-length tensors in `cosine_similarity` metric. (Pull #330)
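
As a rough illustration of the zero-length case in a cosine similarity metric, the sketch below guards against empty inputs and zero norms before dividing. These notes do not state what value the library returns for empty tensors, so the `empty_result` parameter and its default are assumptions made for this example only.

```python
import math


def cosine_similarity(a, b, empty_result=0.0):
    """Cosine similarity of two flat sequences.

    empty_result is a hypothetical parameter: the value returned when both
    inputs have length zero (the library's actual convention is not shown here).
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if len(a) == 0:
        # Without this guard, the division below would hit a zero norm.
        return empty_result
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # An all-zero vector has no direction; treated as dissimilar (assumption).
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity([], []))
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```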
Documentation, Refactoring & Stability
- Documentation Update: The `README` now includes detailed explanations of dynamic, weight-only, and static quantization, including their characteristics, pros, and cons. (Pull #295)
- Refactoring: Extracted the constrained op list generation to a utility function. (Pull #301)
- Fixes & Stability:
  - Fixed failing `getting_started.ipynb` in nightly Colab. (Pull #300)
  - Fixed AEQ notebooks to run correctly in Google Colab. (Pull #315)
  - Updated dependencies to use `tf-nightly`. (Pull #298, Pull #305, Pull #210)
  - Added a note to the Colab notebook warning that `quantizer.validate` may cause an "out of memory" error. (Pull #321)
  - Added an error check: raise an error if the quantized dimension is not divisible by the block size. (Pull #307)
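
The blockwise dequantization support and the block-size divisibility check mentioned in these notes can be pictured with the following minimal sketch. It assumes a one-dimensional quantized tensor with one scale per contiguous block and no zero point. The function name `blockwise_dequantize` is hypothetical and does not reflect the library's actual API.

```python
def blockwise_dequantize(q_values, scales, block_size):
    """Dequantize a flat list of integers using one scale per block (assumed layout)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if len(q_values) % block_size != 0:
        # Mirrors the idea of the new error check: the quantized dimension
        # must split evenly into blocks.
        raise ValueError(
            f"dimension {len(q_values)} is not divisible by block size {block_size}"
        )
    num_blocks = len(q_values) // block_size
    if len(scales) != num_blocks:
        raise ValueError(f"expected {num_blocks} scales, got {len(scales)}")
    # Element i belongs to block i // block_size and uses that block's scale.
    return [q * scales[i // block_size] for i, q in enumerate(q_values)]


print(blockwise_dequantize([1, 2, 3, 4], scales=[0.5, 2.0], block_size=2))
```

Failing early on a non-divisible dimension avoids silently applying the wrong scale to a trailing partial block.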
Full Changelog: v0.3.0...v0.4.0