TensorFlow Model Optimization 0.7.0
TFMOT 0.7.0 adds updates to the Quantization Aware Training (QAT) and Pruning APIs, including support for structured (MxN) pruning.
QAT now also supports layers with swish activations and adds the ability to disable per-axis quantization in the default 8-bit scheme.
This release also adds support for combining pruning, QAT, and weight clustering.
Keras Quantization API:
Tested against TensorFlow 2.6.0, 2.5.1 and nightly with Python 3.
- Added the QuantizeWrapperV2 class, which preserves the order of weights and is now the default for quantize_apply.
- Added a flag to disable per-axis quantizers in the default 8-bit (`default_8bit`) scheme.
- Added swish as a supported activation.
Keras pruning API:
Tested against TensorFlow 2.6.0, 2.5.1 and nightly with Python 3.
- Added structural pruning with MxN sparsity.
Keras clustering API:
- Added support for SimpleRNN, LSTM, GRU, StackedRNNCells, PeepholeLSTMCell, and Bidirectional layers.
- Updated and fixed sparsity-preserving clustering.
- Added an experimental quantization scheme for Quantization Aware Training as part of collaborative model optimization:
  - Pruning-Clustering-preserving QAT: a pruned and clustered model can be trained with QAT while preserving both sparsity and the number of clusters.
- Updated the default clustering initialization to KMEANS_PLUS_PLUS.