TinyDL

This is a side project to explore different techniques and tools dedicated to accelerating deep learning neural networks.

Quantization Aware Training

Quantization is the process of transforming deep learning models to use parameters and computations at lower precision. Traditionally, DNN training and inference have relied on the IEEE single-precision floating-point format, using 32 bits to represent the model weights and activation tensors. This compute budget may be acceptable during training, as most DNNs are trained in data centers on GPUs. At deployment, however, these models most often have to run on edge devices with far smaller compute resources and lower power budgets. Running DNN inference in the full 32-bit representation is not practical for real-time analysis given the compute, memory, and power constraints of such devices.
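
As a toy illustration of what this lower-precision representation looks like, the sketch below maps a float32 tensor onto the unsigned 8-bit grid with an affine scheme, q = round(x / scale) + zero_point. The function names and the NumPy implementation are illustrative assumptions, not code from this repository.

```python
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Affine quantization of a float32 tensor onto the uint8 grid [0, 255]."""
    qmin, qmax = 0, 255
    scale = max((x.max() - x.min()) / (qmax - qmin), 1e-8)  # guard against a zero range
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map the integer codes back to (approximate) float32 values."""
    return scale * (q.astype(np.float32) - zero_point)

weights = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_uint8(weights)
print(np.abs(weights - dequantize(q, scale, zp)).max())  # worst-case quantization error
```

The dequantized values differ slightly from the originals; this difference is the quantization error that the training scheme below tries to account for.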
In quantization-aware training (QAT), as opposed to computing scale factors for the activation tensors after the DNN has been trained (also called hard quantization), the quantization error is taken into account while training the model. The training graph is modified to simulate the lower-precision behavior in the forward pass, which introduces the quantization error as part of the training loss that the optimizer tries to minimize. Thus, QAT models the quantization error during training and mitigates its effect on the accuracy of the model at deployment.
For a simple example, see Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Jacob et al., 2018); a minimal sketch of the simulated-quantization forward pass it describes is shown below.
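
This sketch uses PyTorch purely for illustration. The class name, the per-tensor min/max range tracking, and the placement of the module are simplifying assumptions; production QAT tooling additionally tracks running activation ranges and per-channel weight scales.

```python
import torch
import torch.nn as nn

class FakeQuantize(nn.Module):
    """Simulates INT8 quantization in the forward pass (quantize -> dequantize)
    while letting gradients flow through unchanged (straight-through estimator)."""

    def __init__(self, num_bits: int = 8):
        super().__init__()
        self.qmin = 0
        self.qmax = 2 ** num_bits - 1

    def forward(self, x):
        # Affine quantization parameters derived from the observed tensor range.
        x_min, x_max = x.min(), x.max()
        scale = (x_max - x_min).clamp(min=1e-8) / (self.qmax - self.qmin)
        zero_point = torch.round(self.qmin - x_min / scale)

        # Quantize, clamp to the integer grid, then dequantize back to float.
        q = torch.round(x / scale + zero_point).clamp(self.qmin, self.qmax)
        x_dq = (q - zero_point) * scale

        # Straight-through estimator: the forward pass uses the quantized values,
        # the backward pass treats the rounding as an identity function.
        return x + (x_dq - x).detach()

# Example: insert fake quantization after a layer whose output will be
# quantized at deployment, so the rounding error shows up in the training loss.
layer = nn.Sequential(nn.Linear(16, 8), FakeQuantize())
```

Because the rounding error is present in every forward pass, the optimizer adjusts the weights to values that remain accurate after the model is converted to true integer arithmetic for deployment.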

