
Quantization-Aware-Training with Sparsity and Clustering Preservation #513

@LesBellArm

Description

System information

  • TensorFlow version (you are using): TF 2.3.0
  • Are you willing to contribute it (Yes/No): Yes

Motivation

Sparse (pruned), clustered and quantized models can help to reduce compressed model sizes for deployment to devices that are constrained by memory and processing power. These optimizations are already supported by the Model Optimization Toolkit, and their benefits are documented elsewhere. Combining these optimizations in a single model can currently only be done at the post-training stage, with a consequent loss of accuracy. Fine-tuning the resulting model to recover that accuracy modifies the weights, so the sparsity and clustering benefits are lost.

The use case is to be able to perform Quantization-Aware-Training without losing the sparsity or clustering properties of models that have already been optimized by these other techniques, while benefiting from the improved accuracy of the fine-tuning process.

Describe the feature

This feature proposes the addition of new Quantizer classes (as used in Quantization-Aware-Training), along with their associated QuantizeConfig classes. Specifically, the following new Quantizer-derived classes should be added (a sketch of one follows the list below):

  • Prune Quantizer - To preserve zero weights during the training process.
  • Cluster Quantizer - To preserve the number of unique weights (centroids) during training.
  • Prune and Cluster Quantizer - To preserve the zero weights and the number of unique weights during training.
  • QuantizeConfig derived classes for each of the above.
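
For illustration, here is a minimal sketch of what the Prune Quantizer could look like, built on the existing tfmot.quantization.keras.quantizers.Quantizer interface. The class name PruneQuantizer and the masking approach are assumptions made for this proposal, not current TFMOT APIs; only the Quantizer base class and LastValueQuantizer exist today.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantizers = tfmot.quantization.keras.quantizers


class PruneQuantizer(quantizers.Quantizer):
    """Weight quantizer that keeps pruned (zero) weights at zero."""

    def __init__(self, num_bits=8):
        self.num_bits = num_bits
        # Reuse an existing TFMOT quantizer for the actual fake-quant math.
        self._inner = quantizers.LastValueQuantizer(
            num_bits=num_bits, per_axis=False, symmetric=True,
            narrow_range=True)

    def build(self, tensor_shape, name, layer):
        # Delegate creation of the min/max range variables to the inner
        # quantizer.
        return self._inner.build(tensor_shape, name, layer)

    def __call__(self, inputs, training, weights, **kwargs):
        # Mask marking the weights that pruning set to zero. Re-applying it
        # after fake-quantization zeroes the quantized values and also zeroes
        # the straight-through gradients, so pruned weights stay at zero.
        mask = tf.cast(tf.math.not_equal(inputs, 0.0), inputs.dtype)
        return self._inner(inputs, training, weights, **kwargs) * mask

    def get_config(self):
        return {'num_bits': self.num_bits}
```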

Describe how the feature helps achieve the use case

Quantizer classes that preserve the sparsity and clustering properties of an already-optimized model allow that model to be fine-tuned during the quantization process, which helps to maintain its accuracy without undoing the earlier optimizations. A sketch of how such a class could be wired into the existing QAT flow is shown below.
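
As a usage sketch, the matching QuantizeConfig could plug into the existing custom-quantization flow as follows. PruneQuantizeConfig is hypothetical and reuses the PruneQuantizer sketched above; QuantizeConfig, MovingAverageQuantizer, quantize_annotate_layer, quantize_annotate_model, quantize_apply and quantize_scope are existing TFMOT APIs.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize = tfmot.quantization.keras


class PruneQuantizeConfig(quantize.QuantizeConfig):
    """Applies the sparsity-preserving quantizer to a Dense layer's kernel."""

    def get_weights_and_quantizers(self, layer):
        # Quantize the kernel with the hypothetical PruneQuantizer sketched
        # above, so the pruned zeros survive fine-tuning.
        return [(layer.kernel, PruneQuantizer(num_bits=8))]

    def get_activations_and_quantizers(self, layer):
        # Activations need no sparsity handling; use a standard quantizer.
        return [(layer.activation, quantize.quantizers.MovingAverageQuantizer(
            num_bits=8, per_axis=False, symmetric=False, narrow_range=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        layer.activation = quantize_activations[0]

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


# Annotate an already-pruned layer with the custom config, then apply QAT.
base_model = tf.keras.Sequential([
    quantize.quantize_annotate_layer(
        tf.keras.layers.Dense(10, input_shape=(20,)),
        quantize_config=PruneQuantizeConfig()),
])

with quantize.quantize_scope({'PruneQuantizeConfig': PruneQuantizeConfig,
                              'PruneQuantizer': PruneQuantizer}):
    qat_model = quantize.quantize_apply(
        quantize.quantize_annotate_model(base_model))
```

The same pattern would extend to the Cluster and Prune-and-Cluster variants by swapping in the corresponding quantizer.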

Describe how existing APIs don't satisfy your use case (optional if obvious)

The existing quantization-aware-training APIs provide no way to preserve the sparsity or clustering of a model during the training process.
