# Quantizing LeNet-5 model
This notebook shows how to quantize a pre-trained Fireball model using Codebook Quantization. It assumes 
that a trained LeNet-5 model already exists in the ```Models``` directory. You can use the notebook
[Handwritten Digit recognition (LeNet-5/MNIST)](LeNet5-MNIST.ipynb) to create and train a LeNet-5 model.

If you want to quantize a Low-Rank model, you can use [this](LeNet5-MNIST-Reduce.ipynb) notebook
to reduce the number of parameters in LeNet-5.

Model quantization reduces the size of the model by using fewer bits for each floating-point
parameter. Fireball uses a codebook quantization method based on the K-Means clustering algorithm.
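
To make the idea concrete, here is a minimal NumPy sketch of codebook quantization with K-Means on a single tensor. It is not Fireball's implementation: the helper `kmeans1d`, the random stand-in tensor, the quantile-based initialization, and the size estimate (one index per parameter plus a 32-bit float per codebook entry, ignoring any file-format overhead) are all assumptions made for illustration.

```python
import numpy as np

def kmeans1d(values, numClusters, numIters=20):
    """Plain Lloyd's K-Means on a 1-D array; returns (codebook, indexes)."""
    # Start the cluster centers at evenly spaced quantiles of the data
    codebook = np.quantile(values, np.linspace(0.0, 1.0, numClusters))
    for _ in range(numIters):
        # Assign every value to its nearest center
        indexes = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
        # Move each non-empty center to the mean of its members
        for k in range(numClusters):
            members = values[indexes == k]
            if members.size > 0:
                codebook[k] = members.mean()
    # Final assignment against the updated centers
    indexes = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook, indexes

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=(400, 8)).astype(np.float32)  # stand-in for a weight tensor

numBits = 4                                     # 4 bits -> codebook of 16 entries
codebook, indexes = kmeans1d(weights.reshape(-1).astype(np.float64), 2 ** numBits)
quantized = codebook[indexes].reshape(weights.shape)

mse = float(np.mean((weights - quantized) ** 2))
originalBits = weights.size * 32                            # one float32 per parameter
quantizedBits = weights.size * numBits + codebook.size * 32 # indexes + codebook
print("MSE: %.6f, size ratio: %.2f" % (mse, originalBits / quantizedBits))
```

After quantization the tensor holds at most `2 ** numBits` distinct values, so only the small codebook and the per-parameter indexes need to be stored.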

[quantizeModel](https://interdigitalinc.github.io/Fireball/html/source/model.html#fireball.model.Model.quantizeModel) is a class method that takes the names of the input and output files for the
quantization process. It also takes quantization parameters such as ```minBits```, ```maxBits```, 
and ```mseUb```.

Fireball can create models with 2-bit to 12-bit quantization (codebook sizes 4 to 4096). For the quantized
model to be compatible with [CoreML](https://developer.apple.com/documentation/coreml), the codebook size must be a power of 2 no greater than 256, and only "weight" parameters may be quantized (not biases).
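
The relationship between bit widths and codebook sizes can be tabulated with a few lines of Python. The helper names `codebookSize` and `fitsCoreMlLimit` are hypothetical (they are not part of Fireball); the 256-entry limit is the one stated above.

```python
def codebookSize(numBits):
    """Number of codebook entries addressable with numBits-bit indexes."""
    return 2 ** numBits

def fitsCoreMlLimit(numBits):
    """True if the codebook size stays within the 256-entry limit stated above."""
    return codebookSize(numBits) <= 256

# Bit widths supported by Fireball according to the text: 2 to 12
for numBits in range(2, 13):
    print(numBits, codebookSize(numBits), fitsCoreMlLimit(numBits))
```

Since `2 ** numBits` is always a power of 2, the size limit reduces to using at most 8 bits. The ```maxBits=8``` and ```weightsOnly=True``` arguments used in the quantization cell of this notebook appear to correspond to these constraints.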


## Quantizing a pretrained model
The code in the following cell quantizes the model specified by ```orgFileName``` and creates a
new quantized model.

For each parameter tensor of the model, we try bit widths from 2 to 8 and find the best quantization
that satisfies the specified upper bound on the MSE (```mseUb```).

To get better quantization (smaller model), increase ```mseUb```; to get better performance (larger model),
use a smaller ```mseUb```.
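
The per-tensor search described above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not Fireball's code: it assumes that "best" means the smallest bit width whose plain mean squared error stays within the bound, that the search falls back to the largest bit width if the bound is never met, and the helpers `quantizeTensor` and `findNumBits` are hypothetical.

```python
import numpy as np

def quantizeTensor(values, numBits, numIters=15):
    """K-Means codebook quantization of a 1-D float array (illustrative only)."""
    numClusters = 2 ** numBits
    codebook = np.quantile(values, np.linspace(0.0, 1.0, numClusters))
    for _ in range(numIters):
        # Nearest-center assignment followed by a mean update of non-empty clusters
        indexes = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
        counts = np.bincount(indexes, minlength=numClusters)
        sums = np.bincount(indexes, weights=values, minlength=numClusters)
        nonEmpty = counts > 0
        codebook[nonEmpty] = sums[nonEmpty] / counts[nonEmpty]
    indexes = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
    quantized = codebook[indexes]
    return quantized, float(np.mean((values - quantized) ** 2))

def findNumBits(tensor, mseUb, minBits=2, maxBits=8):
    """Return (numBits, mse) for the smallest bit width whose MSE is within mseUb."""
    values = np.asarray(tensor, dtype=np.float64).reshape(-1)
    for numBits in range(minBits, maxBits + 1):
        _, mse = quantizeTensor(values, numBits)
        if mse <= mseUb:
            return numBits, mse
    return maxBits, mse     # Assumption: fall back to maxBits if the bound is never met

rng = np.random.default_rng(0)
tensor = rng.normal(0.0, 0.1, size=(120, 8))    # stand-in for a weight tensor

for mseUb in (0.01, 0.001, 0.0001):
    numBits, mse = findNumBits(tensor, mseUb)
    print("mseUb=%g -> %d bits (%d clusters), MSE=%.6f" % (mseUb, numBits, 2 ** numBits, mse))
```

Running the loop with several bounds shows the trade-off: a looser bound never needs more bits than a tighter one.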

In [1]:
from fireball import Model

orgFileName = "Models/LeNet5RRPR.fbm"    # Reduced - Retrained - Pruned - Retrained

quantizedFileName = orgFileName.replace('.fbm', 'Q.fbm')  # Append 'Q' to the filename for "Quantized"

# quantizing the model
qResults = Model.quantizeModel(orgFileName, quantizedFileName, 
                               mseUb=.001, minBits=2, maxBits=8, reuseEmptyClusters=True, weightsOnly=True,
                               quiet=False, verbose=True, numWorkers=0)


Reading model parameters from "Models/LeNet5RRPR.fbm" ... Done.
Quantizing 13 tensors ... 
   Quantization Parameters:
        mseUb .............. 0.001
        pdfFactor .......... 0.1
        reuseEmptyClusters . True
        weightsOnly ........ True
        minBits ............ 2
        maxBits ............ 8
    Tensor 1 of 13 Shape: 5x5x1x6 ........... Quantized. (16 clusters - MSE: 0.0006)
    Tensor 2 of 13 Shape: 6 ................. Ignored. (1-D Tensor)
    Tensor 3 of 13 Shape: 5x5x6x8 ........... Quantized. (16 clusters - MSE: 0.0007)
    Tensor 4 of 13 Shape: 1x1x8x16 .......... Quantized. (32 clusters - MSE: 0.0002)
    Tensor 5 of 13 Shape: 16 ................ Ignored. (1-D Tensor)
    Tensor 6 of 13 Shape: 400x8 ............. Quantized. (16 clusters - MSE: 0.0010)
    Tensor 7 of 13 Shape: 8x120 ............. Quantized. (32 clusters - MSE: 0.0002)
    Tensor 8 of 13 Shape: 120 ............... Ignored. (1-D Tensor)
    Tensor 9 of 13 Shape: 120x8 ............. Quantiz

Compare the data and file sizes before and after quantization.
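
If you want to check the file sizes yourself, a small helper based on the standard library is enough. This sketch covers only the sizes on disk; the function name `compareFileSizes` is hypothetical, and the paths are the ones used in the quantization cell of this notebook.

```python
import os

def compareFileSizes(orgPath, newPath):
    """Return (original size, new size, original/new ratio); sizes are in bytes."""
    orgSize = os.path.getsize(orgPath)
    newSize = os.path.getsize(newPath)
    return orgSize, newSize, orgSize / newSize

orgPath = "Models/LeNet5RRPR.fbm"       # original model file
newPath = "Models/LeNet5RRPRQ.fbm"      # quantized model file
if os.path.exists(orgPath) and os.path.exists(newPath):
    orgSize, newSize, ratio = compareFileSizes(orgPath, newPath)
    print("Original: %d bytes, Quantized: %d bytes, Ratio: %.2f" % (orgSize, newSize, ratio))
```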
## Evaluate the quantized model

In [2]:
from fireball.datasets.mnist import MnistDSet

testDs = MnistDSet.makeDatasets('test', batchSize=128)

model = Model.makeFromFile(quantizedFileName, testDs=testDs, gpus='0')   
model.initSession()

results = model.evaluate()


Reading from "Models/LeNet5RRPRQ.fbm" ... Done.
Creating the fireball model "LeNet-5" ... Done.
  Processed 10000 Sample. (Time: 1.67 Sec.)                              

Observed Accuracy:  0.9894
Expected Accuracy: 0.100355
Kappa: 0.988218 (Excellent)
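
The three numbers above are consistent with the usual definition of Cohen's kappa, which measures how far the observed accuracy exceeds the accuracy expected by chance. The sketch below assumes that this is the definition Fireball reports; the function name `cohenKappa` is hypothetical.

```python
def cohenKappa(observedAccuracy, expectedAccuracy):
    """Cohen's kappa: agreement beyond chance, scaled to the range up to 1."""
    return (observedAccuracy - expectedAccuracy) / (1.0 - expectedAccuracy)

# Values copied from the evaluation output above
kappa = cohenKappa(0.9894, 0.100355)
print("Kappa: %.6f" % kappa)
```

An expected accuracy close to 0.1 is what random guessing would give on 10 roughly balanced classes, which is why the kappa value stays close to the observed accuracy here.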


## Re-training after quantization
Fireball can retrain the quantized models by modifying (learning) the quantization codebooks. The following cell creates a "tune" dataset by sampling from the training dataset and uses it to "fine-tune" the quantized model for 5 epochs. The re-trained model is then evaluated and saved to the ```Models``` directory.

In [3]:
tuneDs,validDs = MnistDSet.makeDatasets('tune,valid', batchSize=128)
print(tuneDs)

model = Model.makeFromFile(quantizedFileName, 
                           trainDs=tuneDs, validationDs=validDs, # Use the "tuneDs" for training
                           numEpochs=5,
                           learningRate=(1e-3,1e-5),
                           optimizer="Momentum",
#                            gpus=[-1])
                           gpus="0")
model.initSession()
model.train()

model.evaluateDSet(testDs)

retrainedFileName = quantizedFileName.replace('.fbm', 'R.fbm')  # Append 'R' to the filename for "Re-trained"
model.save(retrainedFileName)   # Save the re-trained model to the "Models" directory

MnistDSet Dataset Info:
    Dataset Name ................................... tune
    Dataset Location ............................... /data/mnist/
    Number of Classes .............................. 10
    Number of Samples .............................. 10800
    Sample Shape ................................... (28, 28, 1)


Reading from "Models/LeNet5RRPRQ.fbm" ... Done.
Creating the fireball model "LeNet-5" ... Done.
+--------+---------+---------------+-----------+-------------------+
| Epoch  | Batch   | Learning Rate | Loss      | Valid/Test Error  |
+--------+---------+---------------+-----------+-------------------+
| 1      | 84      | 0.00034056156 | 0.0161217 |    1.50% N/A      |
| 2      | 169     | 0.00011598216 | 0.0135505 |    1.58% N/A      |
| 3      | 254     | 0.00003949906 | 0.0128184 |    1.58% N/A      |
| 4      | 339     | 0.00001345186 | 0.0126654 |    1.58% N/A      |
| 5      | 424     | 0.00000435213 | 0.012674  |    1.58% N/A      |
+--------+---------+--

## Also look at

[Exporting LeNet-5 Model to ONNX](LeNet5-MNIST-ONNX.ipynb)

[Exporting LeNet-5 Model to TensorFlow](LeNet5-MNIST-TF.ipynb)

[Exporting LeNet-5 Model to CoreML](LeNet5-MNIST-CoreML.ipynb)

[Hand-written Digit Recognition as a Regression problem](Regression.ipynb)

---

[Fireball Playgrounds](../Contents.ipynb)

[Handwritten Digit Recognition (LeNet-5/MNIST)](LeNet5-MNIST.ipynb)

[Reducing number of parameters of LeNet-5 Model](LeNet5-MNIST-Reduce.ipynb)

[Pruning LeNet-5 Model](LeNet5-MNIST-Prune.ipynb)
