# Apple's ML Quant

Large neural networks are difficult to use in production environments because they are memory-intensive and slow at inference. Many successful deep learning models, such as Transformers, are followed by lite versions that speed up inference considerably at some cost in accuracy. In this article, let’s explore Least Squares Quantization, an algorithm that speeds up large neural networks by quantizing them while narrowing the accuracy gap to the non-quantized model.

To read more about it, please refer to [this](https://analyticsindiamag.com/what-is-apples-quant-for-neural-networks-quantization/) article.
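To make the idea concrete, here is a minimal NumPy sketch of a least-squares binary approximation. It assumes the 1-bit case approximates a weight vector `w` by `v * s`, with `s` in {-1, +1} and a single scale `v` chosen to minimise the squared error. The function names are hypothetical, and the greedy 2-bit variant is only a simple illustration of quantizing the residual; it is not taken from the ml-quant code base.

```python
import numpy as np

def ls1_quantize(w):
    """1-bit least-squares approximation: w ~ v * s with s in {-1, +1}."""
    s = np.where(w >= 0, 1.0, -1.0)   # sign pattern (zero is mapped to +1)
    v = np.abs(w).mean()              # scale that minimises ||w - v * s||^2
    return v * s

def greedy_2bit_quantize(w):
    """Greedy 2-bit approximation: quantize w, then quantize the residual."""
    q1 = ls1_quantize(w)
    q2 = ls1_quantize(w - q1)
    return q1 + q2

rng = np.random.default_rng(0)
w = rng.normal(size=1000)             # stand-in for a layer's weights
err1 = np.mean((w - ls1_quantize(w)) ** 2)
err2 = np.mean((w - greedy_2bit_quantize(w)) ** 2)
print(f"1-bit MSE: {err1:.4f}, greedy 2-bit MSE: {err2:.4f}")
```

For a fixed sign pattern, the best scale in the least-squares sense is the mean absolute value of the weights, which is why `v` is computed that way. Adding a second binary term on the residual can only lower the reconstruction error.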

# Quantization Example

Let’s see how it affects a model on the CIFAR-100 dataset.

In [1]:
!python -m pip install pip --upgrade --user -q
!python -m pip install numpy pandas seaborn matplotlib scipy scikit-learn statsmodels tensorflow keras --user -q

In [2]:
!git clone https://github.com/apple/ml-quant.git

fatal: destination path 'ml-quant' already exists and is not an empty directory.


CIFAR100 contains colour images that need to be classified into one of 100 classes.

In [3]:
%cd ml-quant
!python -m pip install -U pip wheel --user -q 
!python -m pip install -r requirements.txt --user -q

/home/aishwarya/machine-hack.py-practice/2_General_ML_AI/15_Miscellaneous/14_ML_Quant/ml-quant
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.5.0 requires tensorboard~=2.5, but you have tensorboard 2.0.0 which is incompatible.


In [4]:
!python -m pip install flit --user -q
import os
os.environ['FLIT_ROOT_INSTALL'] = '1'
!flit install -s

[?1l>Extras to install for deps 'all': {'test', 'doc', '.none'}        [32mI-flit.install[m
Installing requirements                                           [32mI-flit.install[m
Collecting pytest-mypy
  Downloading pytest_mypy-0.8.1-py3-none-any.whl (6.7 kB)
Collecting pytest-flake8
  Downloading pytest_flake8-1.0.7-py2.py3-none-any.whl (6.4 kB)
Collecting pytest-cov
  Downloading pytest_cov-3.0.0-py3-none-any.whl (20 kB)
Collecting flake8-docstrings
  Downloading flake8_docstrings-1.6.0-py2.py3-none-any.whl (5.7 kB)
Collecting flake8-copyright
  Downloading flake8_copyright-0.2.2-py3-none-any.whl (5.0 kB)
Collecting sphinx-rtd-theme
  Downloading sphinx_rtd_theme-1.0.0-py2.py3-none-any.whl (2.8 MB)
Collecting sphinx-autodoc-typehints
  Downloading sphinx_autodoc_typehints-1.12.0-py3-none-any.whl (9.4 kB)
Collecting m2r
  Downloading m2r-0.2.1.tar.gz (16 kB)
  Preparing metadata (setup.py) ... done
C

In [None]:
import IPython
IPython.Application.instance().kernel.do_shutdown(True)

In [5]:
%load_ext tensorboard

Let’s see how a full-precision ResNet model performs on this dataset. We can train ResNet using the following command:

In [6]:
!python examples/cifar100/cifar100.py --config examples/cifar100/cifar100_fp.yaml --experiment-name cifar100-fp

Downloading https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz to data/cifar100/cifar-100-python.tar.gz
169009152it [04:53, 575844.79it/s]                                              
Extracting data/cifar100/cifar-100-python.tar.gz to data/cifar100/
Files already downloaded and verified
Traceback (most recent call last):
  File "examples/cifar100/cifar100.py", line 24, in <module>
    platform.run(experiment)
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/compute_platform.py", line 107, in run
    experiment.run(
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/experiment.py", line 108, in run
    train_epoch_metrics, test_epoch_metrics = self.task_fn(
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/tasks.py", line 131, in classification_task
    model = get_model(
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/initialization.py", line 121, in get_model
    raise ValueEr

Now let us see how a quantized model compares to this model. We will use knowledge distillation to train the quantized model, with the full-precision model serving as the teacher.

To make the quantized model refer to the full precision model, we need to edit the config file and set the teacher path.
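As a rough illustration of what knowledge distillation optimises, the sketch below computes a distillation loss between student and teacher logits in NumPy. It assumes the common formulation: KL divergence between temperature-softened distributions, scaled by the squared temperature. The function names and the temperature value are assumptions made for this example and are not read from the ml-quant configuration.

```python
import numpy as np

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """Mean KL(teacher || student) on softened distributions, times T^2."""
    log_p_t = log_softmax(teacher_logits, temperature)
    log_p_s = log_softmax(student_logits, temperature)
    kl = np.sum(np.exp(log_p_t) * (log_p_t - log_p_s), axis=-1)
    return float(kl.mean() * temperature ** 2)

# Toy batch: 2 samples, 3 classes (made-up numbers for illustration).
teacher = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
student = np.array([[1.5, 1.2, 0.3], [0.2, 1.9, 0.9]])
print("KD loss:", kd_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two softened distributions diverge.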

In [7]:
!python examples/cifar100/cifar100.py --config examples/cifar100/cifar100_ls1_weight_ls2_activation_kd.yaml --experiment-name cifar100-ls2

Files already downloaded and verified
Files already downloaded and verified
Traceback (most recent call last):
  File "examples/cifar100/cifar100.py", line 24, in <module>
    platform.run(experiment)
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/compute_platform.py", line 107, in run
    experiment.run(
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/experiment.py", line 108, in run
    train_epoch_metrics, test_epoch_metrics = self.task_fn(
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/tasks.py", line 124, in classification_task
    teacher, kd_loss = get_teacher_and_kd_loss(
  File "/home/aishwarya/anaconda3/lib/python3.8/site-packages/quant/common/tasks.py", line 59, in get_teacher_and_kd_loss
    with open(teacher_config_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'experiments/cifar100-teacher/config.yaml'
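The traceback shows the run looking for `experiments/cifar100-teacher/config.yaml`, while the full-precision run above was named `cifar100-fp`. A small pre-flight check like the one below can surface a missing teacher before the training script is launched. The helper name `missing_paths` is hypothetical, and the path is copied from the error message.

```python
from pathlib import Path

# Path taken from the FileNotFoundError in the traceback above.
TEACHER_CONFIG = Path("experiments/cifar100-teacher/config.yaml")

def missing_paths(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]

missing = missing_paths([TEACHER_CONFIG])
if missing:
    print("Teacher files not found:", missing)
else:
    print("Teacher config found, distillation run can start.")
```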


In [8]:
import tensorflow as tf

train_loss, test_loss = [], []
train_top1_accuracy, test_top1_accuracy = [], []
train_top5_accuracy, test_top5_accuracy = [], []

# Read the scalar summaries logged by the quantized run.
# The 'Top-1_Accuracy/test' and 'Top-5_Accuracy/train' tags are assumed
# by symmetry with the other tag names.
for e in tf.compat.v1.train.summary_iterator('experiments/cifar100-ls2/tensorboard/events.out.tfevents.1615783285.76b24c522ac8.17901.0'):
    for v in e.summary.value:
        if v.tag == 'Top-1_Accuracy/train':
            train_top1_accuracy.append(v.simple_value)
        elif v.tag == 'Top-1_Accuracy/test':
            test_top1_accuracy.append(v.simple_value)
        elif v.tag == 'Top-5_Accuracy/train':
            train_top5_accuracy.append(v.simple_value)
        elif v.tag == 'Top-5_Accuracy/test':
            test_top5_accuracy.append(v.simple_value)
        elif v.tag == 'Loss/train':
            train_loss.append(v.simple_value)
        elif v.tag == 'Loss/test':
            test_loss.append(v.simple_value)

Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`


NotFoundError: experiments/cifar100-ls2/tensorboard/events.out.tfevents.1615783285.76b24c522ac8.17901.0; No such file or directory
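The event file name above embeds a timestamp, host name and process ID, so it differs on every run and a hard-coded path will not be found on another machine. One way around this, sketched below, is to look up the newest event file in the experiment's `tensorboard` directory. The helper name `latest_event_file` is hypothetical, and the directory layout is assumed from the paths used in this notebook.

```python
import glob
import os

def latest_event_file(experiment_dir):
    """Return the newest TensorBoard event file of an experiment, or None."""
    pattern = os.path.join(experiment_dir, "tensorboard", "events.out.tfevents.*")
    candidates = glob.glob(pattern)
    # Pick the most recently modified file if several runs share the directory.
    return max(candidates, key=os.path.getmtime) if candidates else None

path = latest_event_file("experiments/cifar100-ls2")
if path is None:
    print("No event file found; has the training run completed?")
else:
    print("Reading metrics from", path)
```

The returned path can then be passed to `tf.compat.v1.train.summary_iterator` in place of the literal file name.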

In [None]:
import tensorflow as tf

train_loss_fp, test_loss_fp = [], []
train_top1_accuracy_fp, test_top1_accuracy_fp = [], []
train_top5_accuracy_fp, test_top5_accuracy_fp = [], []

# Read the scalar summaries logged by the full-precision run.
# The 'Top-1_Accuracy/test' and 'Top-5_Accuracy/train' tags are assumed
# by symmetry with the other tag names.
try:
    for e in tf.compat.v1.train.summary_iterator('experiments/cifar100-fp/tensorboard/events.out.tfevents.1615781108.76b24c522ac8.11165.0'):
        for v in e.summary.value:
            if v.tag == 'Top-1_Accuracy/train':
                train_top1_accuracy_fp.append(v.simple_value)
            elif v.tag == 'Top-1_Accuracy/test':
                test_top1_accuracy_fp.append(v.simple_value)
            elif v.tag == 'Top-5_Accuracy/train':
                train_top5_accuracy_fp.append(v.simple_value)
            elif v.tag == 'Top-5_Accuracy/test':
                test_top5_accuracy_fp.append(v.simple_value)
            elif v.tag == 'Loss/train':
                train_loss_fp.append(v.simple_value)
            elif v.tag == 'Loss/test':
                test_loss_fp.append(v.simple_value)
except tf.errors.NotFoundError:
    # The event file is missing (for example, the run did not finish); the lists stay empty.
    print('Full-precision event file not found; skipping.')

In [None]:
len(train_loss[::3][:-1]),len(test_loss)
import matplotlib.pyplot as plt

plt.rcParams['font.size'] = '22'
fig, ax = plt.subplots(1,2,figsize=(30,10))

ax[0].plot(train_loss[::3][:-1],label='train Loss Quantized Model')
ax[0].plot(test_loss,label='test Loss Quantized Model')
ax[0].legend(prop={"size":24})
ax[0].set_xlabel('Epochs', fontsize=24)
ax[0].set_ylabel('Loss', fontsize=24)

ax[1].plot(train_loss_fp[::15][:-1],label='train Loss Full precision')
ax[1].plot(test_loss_fp,label='test Loss Full Precision')
ax[1].legend(prop={"size":24})
ax[1].set_xlabel('Epochs', fontsize=24)
ax[1].set_ylabel('Loss', fontsize=24)

plt.show()

In [None]:
plt.rcParams['font.size'] = '22'
fig, ax = plt.subplots(1,2,figsize=(30,10))

ax[0].plot(train_top1_accuracy[::3][:-1],label='train top1 accuracy Quantized Model')
ax[0].plot(test_top1_accuracy,label='test top1 accuracy Quantized Model')
ax[0].legend(prop={"size":24})
ax[0].set_xlabel('Epochs', fontsize=24)
ax[0].set_ylabel('Top1 Accuracy', fontsize=24)

ax[1].plot(train_top1_accuracy_fp[::15][:-1],label='train top1 accuracy Full precision')
ax[1].plot(test_top1_accuracy_fp,label='test top1 accuracy Full Precision')
ax[1].legend(prop={"size":24})
ax[1].set_xlabel('Epochs', fontsize=24)
ax[1].set_ylabel('Top1 Accuracy', fontsize=24)

plt.show()

In [None]:
plt.rcParams['font.size'] = '22'
fig, ax = plt.subplots(1,2,figsize=(30,10))

ax[0].plot(train_top5_accuracy[::3][:-1],label='train top5 accuracy Quantized Model')
ax[0].plot(test_top5_accuracy,label='test top5 accuracy Quantized Model')
ax[0].legend(prop={"size":24})
ax[0].set_xlabel('Epochs', fontsize=24)
ax[0].set_ylabel('Top5 Accuracy', fontsize=24)

ax[1].plot(train_top5_accuracy_fp[::15][:-1],label='train top5 accuracy Full precision')
ax[1].plot(test_top5_accuracy_fp,label='test top5 accuracy Full Precision')
ax[1].legend(prop={"size":24})
ax[1].set_xlabel('Epochs', fontsize=24)
ax[1].set_ylabel('Top5 Accuracy', fontsize=24)

plt.show()