Description
System information
- TensorFlow version (you are using): TF 2.3.0
- Are you willing to contribute it (Yes/No): Yes
Motivation
Sparse (pruned), clustered, and quantized models reduce compressed model size for deployment to devices constrained by memory and processing power. These optimizations are already supported by the Model Optimization Toolkit, and their benefits are documented elsewhere. However, combining them in a single model can currently only be done at the post-training stage, at a cost in accuracy. Fine-tuning the resulting model to recover that accuracy modifies the weights, so the sparsity and clustering benefits are lost.
The use case is to perform Quantization-Aware Training (QAT) without losing the sparsity or clustering properties of models that have already been optimized by those techniques, while still benefiting from the improved accuracy that fine-tuning provides.
Describe the feature
This feature proposes the addition of new Quantizer classes (as used in Quantization-Aware Training) along with their associated QuantizeConfig classes. Specifically, the following Quantizer-derived classes should be added (a sketch of one follows the list):
- Prune Quantizer - To preserve zero weights during the training process.
- Cluster Quantizer - To preserve the number of unique weights (centroids) during training.
- Prune and Cluster Quantizer - To preserve the zero weights and the number of unique weights during training.
- QuantizeConfig-derived classes for each of the above.
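As a rough illustration only, a prune-preserving quantizer could subclass the existing tfmot.quantization.keras.quantizers.LastValueQuantizer, record a sparsity mask when it is built, and re-apply that mask on every call so that zero weights stay zero through fine-tuning. The class name PrunePreserveQuantizer and the masking details below are assumptions for this sketch, not a final design; a cluster-preserving variant would similarly look up and re-apply centroid assignments.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot


class PrunePreserveQuantizer(
    tfmot.quantization.keras.quantizers.LastValueQuantizer):
  """Sketch of a prune-preserving quantizer (hypothetical name).

  Captures the sparsity pattern of the already-pruned weights once,
  then re-applies it on every call so pruned (zero) weights remain
  zero throughout quantization-aware training.
  """

  def build(self, tensor_shape, name, layer):
    quantize_vars = super().build(tensor_shape, name, layer)
    # Assumes `layer` is the quantize wrapper and `name` resolves to
    # the weight attribute of the wrapped layer (e.g. 'kernel').
    weights = getattr(layer.layer, name)
    # divide_no_nan(w, w) yields a 0/1 mask: 0 where the pruned
    # weight is zero, 1 everywhere else.
    quantize_vars['sparsity_mask'] = tf.math.divide_no_nan(weights, weights)
    return quantize_vars

  def __call__(self, inputs, training, weights, **kwargs):
    # Zero out pruned positions before the fake-quant op runs.
    masked_inputs = tf.multiply(inputs, weights['sparsity_mask'])
    return super().__call__(masked_inputs, training, weights, **kwargs)
```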
Describe how the feature helps achieve the use case
Quantizer classes that preserve the sparsity and clustering of a model already optimized by these techniques allow the model to be fine-tuned during the quantization process, helping to maintain its accuracy. A usage sketch follows.
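To make the intended workflow concrete, here is a hedged sketch using the existing annotate/apply API. PrunePreserveQuantizeConfig is a hypothetical name for one of the proposed QuantizeConfig classes, and `pruned_model`, `train_images`, and `train_labels` are placeholders for an already-pruned model and its training data.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize = tfmot.quantization.keras

# `pruned_model` is assumed to be an already-pruned Keras model
# (stripped with tfmot.sparsity.keras.strip_pruning).
def annotate(layer):
  # Attach the proposed config to the layers of interest; Dense is
  # used here purely as an example.
  if isinstance(layer, tf.keras.layers.Dense):
    return quantize.quantize_annotate_layer(
        layer, quantize_config=PrunePreserveQuantizeConfig())
  return layer

annotated_model = tf.keras.models.clone_model(
    pruned_model, clone_function=annotate)

# Custom QuantizeConfig classes must be visible inside quantize_scope.
with quantize.quantize_scope(
    {'PrunePreserveQuantizeConfig': PrunePreserveQuantizeConfig}):
  qat_model = quantize.quantize_apply(annotated_model)

# Fine-tune: the proposed quantizers keep pruned weights at zero.
qat_model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
qat_model.fit(train_images, train_labels, epochs=1)
```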
Describe how existing APIs don't satisfy your use case (optional if obvious)
It is not currently possible to preserve the sparsity or clustering of a model during the quantization-aware training process: the existing Quantizers place no constraints on how weights change during fine-tuning, so these properties are destroyed.