Commit

Merge branch '148-callout-tip-for-exercises-on-frameworksqmd-is-not-working'
profvjreddi committed Jan 11, 2024
2 parents af9d666 + b180728 commit 9870762
Showing 1 changed file with 0 additions and 2 deletions.
2 changes: 0 additions & 2 deletions contents/ondevice_learning/ondevice_learning.qmd
@@ -202,8 +202,6 @@ The QAS process involves two main steps:

* **Quantization-aware training:** In this step, the neural network is trained with quantization in mind, using simulated quantization to mimic the effects of quantization during the forward and backward passes. This allows the model to learn to compensate for quantization errors and improve its performance on low-precision hardware. Refer to the QAT section in model optimizations for details; a sketch of simulated quantization follows this list.

![Visualization of quantization effects in forward and backward pass ([Credit](https://raw.githubusercontent.com/matlab-deep-learning/quantization-aware-training/main/images/png/ste.png))](https://raw.githubusercontent.com/matlab-deep-learning/quantization-aware-training/main/images/png/ste.png)

* **Quantization and scaling:** After training, the model is quantized to a low-precision format, and the scale factors are adjusted to minimize the quantization errors. The scale factors are chosen based on the distribution of the weights and activations in the model and are adjusted to ensure that the quantized values fall within the range of the low-precision format.

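The sketch below illustrates the first step: simulated ("fake") quantization with a straight-through estimator, where the forward pass sees quantized values but gradients flow through as if the rounding were the identity. The 8-bit setting, the per-tensor scale derived from the value range (echoing the second step above), and the `FakeQuantSTE` name are illustrative assumptions, not the chapter's reference implementation.

```python
import torch


class FakeQuantSTE(torch.autograd.Function):
    """Simulated (fake) quantization with a straight-through estimator."""

    @staticmethod
    def forward(ctx, x, num_bits=8):
        qmax = 2 ** (num_bits - 1) - 1                        # e.g. 127 for int8
        scale = x.detach().abs().max().clamp(min=1e-8) / qmax  # per-tensor scale from the value range
        q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
        return q * scale                                       # dequantize back to float for the rest of the graph

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat round/clamp as identity so the
        # gradient still reaches the full-precision weights during training.
        return grad_output, None


w = torch.randn(16, 16, requires_grad=True)
loss = FakeQuantSTE.apply(w, 8).sum()
loss.backward()  # w.grad is populated thanks to the STE
```
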
QAS is used to overcome the difficulty of optimizing models on tiny devices without hyperparameter tuning: it automatically scales tensor gradients across bit-precisions, which stabilizes the training process and matches the accuracy of floating-point precision.
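A minimal sketch of the gradient-rescaling idea behind QAS follows, assuming each quantized tensor carries a per-tensor scale factor and that its gradient is rescaled by the scale squared before the optimizer step; the helper name and the squared-scale rule are illustrative assumptions rather than the chapter's exact algorithm.

```python
import torch


def qas_rescale_gradients(named_params, scales):
    """Multiply each quantized parameter's gradient by its scale factor squared."""
    for name, param in named_params:
        if param.grad is not None and name in scales:
            param.grad.mul_(scales[name] ** 2)


# Hypothetical usage after loss.backward() on a quantized model:
#   qas_rescale_gradients(model.named_parameters(), per_tensor_scales)
#   optimizer.step()
```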
