Commit

Merge branch '148-callout-tip-for-exercises-on-frameworksqmd-is-not-working'
profvjreddi committed Jan 11, 2024
2 parents af9d666 + b180728 commit 9870762
Showing 1 changed file with 0 additions and 2 deletions.
2 changes: 0 additions & 2 deletions contents/ondevice_learning/ondevice_learning.qmd
@@ -202,8 +202,6 @@ The QAS process involves two main steps:

* **Quantization-aware training:** In this step, the neural network is trained with quantization in mind, using simulated quantization to mimic the effects of quantization during the forward and backward passes. This allows the model to learn to compensate for quantization errors and improve its performance on low-precision hardware. Refer to the QAT section in model optimizations for details; a sketch of simulated quantization follows this list.

![Visualization of quantization effects in forward and backward pass ([Credit](https://raw.githubusercontent.com/matlab-deep-learning/quantization-aware-training/main/images/png/ste.png))](https://raw.githubusercontent.com/matlab-deep-learning/quantization-aware-training/main/images/png/ste.png)

* **Quantization and scaling:** After training, the model is quantized to a low-precision format, and the scale factors are adjusted to minimize the quantization errors. The scale factors are chosen based on the distribution of the weights and activations in the model and are adjusted to ensure that the quantized values fall within the range of the low-precision format.

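The sketch below illustrates the first step: simulated ("fake") quantization with a straight-through estimator, where the forward pass sees quantized values but gradients flow through as if the rounding were the identity. The 8-bit setting, the per-tensor scale derived from the value range (echoing the second step above), and the `FakeQuantSTE` name are illustrative assumptions, not the chapter's reference implementation.

```python
import torch


class FakeQuantSTE(torch.autograd.Function):
    """Simulated (fake) quantization with a straight-through estimator."""

    @staticmethod
    def forward(ctx, x, num_bits=8):
        qmax = 2 ** (num_bits - 1) - 1                        # e.g. 127 for int8
        scale = x.detach().abs().max().clamp(min=1e-8) / qmax  # per-tensor scale from the value range
        q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
        return q * scale                                       # dequantize back to float for the rest of the graph

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat round/clamp as identity so the
        # gradient still reaches the full-precision weights during training.
        return grad_output, None


w = torch.randn(16, 16, requires_grad=True)
loss = FakeQuantSTE.apply(w, 8).sum()
loss.backward()  # w.grad is populated thanks to the STE
```
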
QAS is used to overcome the difficulty of optimizing models on tiny devices without hyperparameter tuning: it automatically scales tensor gradients across bit-precisions, which stabilizes the training process and matches the accuracy of floating-point precision.
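A minimal sketch of the gradient-rescaling idea behind QAS follows, assuming each quantized tensor carries a per-tensor scale factor and that its gradient is rescaled by the scale squared before the optimizer step; the helper name and the squared-scale rule are illustrative assumptions rather than the chapter's exact algorithm.

```python
import torch


def qas_rescale_gradients(named_params, scales):
    """Multiply each quantized parameter's gradient by its scale factor squared."""
    for name, param in named_params:
        if param.grad is not None and name in scales:
            param.grad.mul_(scales[name] ** 2)


# Hypothetical usage after loss.backward() on a quantized model:
#   qas_rescale_gradients(model.named_parameters(), per_tensor_scales)
#   optimizer.step()
```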
