# Post-Training Quantization using the Model Compression Toolkit - A Quick-Start Guide

[Run this tutorial in Google Colab](https://colab.research.google.com/github/sony/model_optimization/blob/main/tutorials/example_keras_imagenet.ipynb)

## Overview

This tutorial shows how to quantize a pre-trained model with the Model Compression Toolkit (MCT), using MCT's post-training quantization as the example. Post-training quantization is a low-complexity yet effective quantization method: it requires no retraining, only a small unlabeled representative dataset for calibration. We will quantize the model and evaluate its accuracy before and after quantization.

## Summary

In this tutorial we will cover:

1. Post-Training Quantization using MCT.
2. Loading and preprocessing ImageNet's validation dataset.
3. Loading and preprocessing an unlabeled representative dataset from the ImageNet training set.
4. Accuracy evaluation of the floating-point and the quantized models.

## Setup

Install and import the relevant packages:


In [None]:
!pip install -q tensorflow
!pip install -q tensorflow-model-optimization
!pip install -q model-compression-toolkit

In [None]:
import tensorflow as tf
import keras
import model_compression_toolkit as mct

## Dataset preparation

Assuming we've downloaded ImageNet's training dataset to a folder, let's set the folder path:

In [None]:
TRAIN_DATASET_FOLDER = '/path/to/imagenet/training/dir'

Now, let's create two functions. The first preprocesses the dataset, and the second creates an unlabeled representative dataset for quantization calibration. We will use a batch size of 50 and 10 calibration iterations:

In [None]:
def imagenet_preprocess_input(images, labels):
    return tf.keras.applications.mobilenet_v2.preprocess_input(images), labels
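
For intuition, the MobileNetV2 `preprocess_input` function rescales pixel values from the range [0, 255] to [-1, 1]. The following is a minimal pure-Python sketch of that arithmetic on a tiny made-up image. The helper names `scale_pixel` and `scale_image` are hypothetical and are not part of Keras.

```python
def scale_pixel(value):
    """Map a pixel intensity in [0, 255] to [-1, 1] (hypothetical helper)."""
    return value / 127.5 - 1.0


def scale_image(rows):
    """Apply the same scaling to every pixel of a 2D list of intensities."""
    return [[scale_pixel(v) for v in row] for row in rows]


# A made-up one-row "image" covering the darkest, middle and brightest values.
print(scale_image([[0, 127.5, 255]]))  # [[-1.0, 0.0, 1.0]]
```

In the tutorial itself this scaling is applied by Keras to whole batches of images; the sketch only shows the per-pixel mapping.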

In [None]:
BATCH_SIZE = 50
n_iter = 10  # number of calibration batches

def get_representative_dataset():
    print('loading dataset, this may take a few minutes ...')
    dataset = tf.keras.utils.image_dataset_from_directory(
        directory=TRAIN_DATASET_FOLDER,
        batch_size=BATCH_SIZE,
        image_size=[224, 224],
        shuffle=True,
        crop_to_aspect_ratio=True,
        interpolation='bilinear')
    dataset = dataset.map(lambda x, y: (imagenet_preprocess_input(x, y)))

    def representative_dataset():
        # Yield one batch of images per calibration iteration; the labels are dropped.
        for _ in range(n_iter):
            yield dataset.take(1).get_single_element()[0].numpy()

    return representative_dataset

representative_dataset_gen = get_representative_dataset()
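
To illustrate the pattern used above (a callable that returns a generator yielding one batch per iteration), here is a minimal sketch of a calibration loop that consumes such a generator. It is independent of TensorFlow and MCT: the data is synthetic, the names `make_representative_dataset` and `collect_min_max` are hypothetical, and tracking a running minimum and maximum is only one simple way to gather statistics. It is not a description of MCT's internals.

```python
import random


def make_representative_dataset(n_iter, batch_size, seed=0):
    """Return a callable that yields `n_iter` synthetic batches (hypothetical helper)."""
    rng = random.Random(seed)

    def representative_dataset():
        for _ in range(n_iter):
            # Each batch is a flat list of values in [-1, 1], standing in for images.
            yield [rng.uniform(-1.0, 1.0) for _ in range(batch_size)]

    return representative_dataset


def collect_min_max(representative_dataset):
    """Iterate over all batches and track the observed value range."""
    lo, hi = float("inf"), float("-inf")
    n_batches = 0
    for batch in representative_dataset():
        lo = min(lo, min(batch))
        hi = max(hi, max(batch))
        n_batches += 1
    return lo, hi, n_batches


gen = make_representative_dataset(n_iter=10, batch_size=50)
lo, hi, n_batches = collect_min_max(gen)
print(f"batches seen: {n_batches}, observed range: [{lo:.3f}, {hi:.3f}]")
```

Passing a callable rather than a ready-made list lets the consumer decide when to start iterating, and keeps only one batch in memory at a time.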

## Model post-training quantization using MCT

Now for the main part.

First, let's load a pre-trained MobileNetV2 model from Keras, in 32-bit floating-point precision:

In [None]:
from keras.applications.mobilenet_v2 import MobileNetV2
float_model = MobileNetV2()

Now, we apply post-training quantization to the model. In this example, we use the default 8-bit precision and 10 calibration iterations over the representative dataset:

In [None]:
quantized_model, quantization_info = mct.ptq.keras_post_training_quantization_experimental(float_model, representative_dataset_gen)

That's it! Our model is now quantized.
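
To make the idea of 8-bit precision concrete, the following is a minimal sketch of symmetric uniform quantization of a few floating-point values to signed 8-bit integers and back. It is an illustration under simplifying assumptions (one scale for the whole list, derived from the maximum absolute value) and is not MCT's actual algorithm. The function names and the example weights are hypothetical.

```python
def quantize_int8(values):
    """Quantize floats to integers in [-127, 127] using a single max-abs scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]


weights = [-0.8, -0.1, 0.0, 0.25, 0.5]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("quantized:", q)
print("max reconstruction error:", max_error)
```

Because each value is rounded to the nearest step, the reconstruction error in this sketch is at most half a quantization step (`scale / 2`).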

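## Model evaluation

To evaluate our models, we first need to load the validation dataset. As before, let's assume we downloaded the ImageNet validation dataset to a folder with the path below. The folder is expected to contain one subdirectory per class, since `image_dataset_from_directory` infers the integer labels from the directory structure: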

In [None]:
TEST_DATASET_FOLDER = '/path/to/imagenet/test/dir'

def get_validation_dataset():
    dataset = tf.keras.utils.image_dataset_from_directory(
        directory=TEST_DATASET_FOLDER,
        batch_size=BATCH_SIZE,
        image_size=[224, 224],
        shuffle=False,
        crop_to_aspect_ratio=True,
        interpolation='bilinear')
    dataset = dataset.map(lambda x, y: (imagenet_preprocess_input(x, y)))
    return dataset

In [None]:
evaluation_dataset = get_validation_dataset()

Let's start with the floating-point model evaluation.

We need to compile the model before evaluation and set the loss and the evaluation metric:

In [None]:
float_model.compile(loss=keras.losses.SparseCategoricalCrossentropy(), metrics=["accuracy"])
results = float_model.evaluate(evaluation_dataset)
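
With integer class labels, the accuracy reported here is top-1 accuracy: the fraction of samples whose highest-scoring class matches the label. The following minimal pure-Python sketch shows that computation on made-up predictions. The function name `top1_accuracy` is hypothetical; Keras computes the metric internally.

```python
def top1_accuracy(probabilities, labels):
    """Fraction of samples whose arg-max class equals the integer label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Made-up class probabilities for four samples over three classes.
probabilities = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
]
labels = [1, 0, 2, 1]

print(top1_accuracy(probabilities, labels))  # 0.75
```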

Finally, let's evaluate the quantized model:

In [None]:
quantized_model.compile(loss=keras.losses.SparseCategoricalCrossentropy(), metrics=["accuracy"])
results = quantized_model.evaluate(evaluation_dataset)

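Comparing the two results, the quantized model shows only a very small accuracy degradation, while its weights are compressed by a factor of 4 (from 32-bit floating point to 8 bits).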
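
The factor of four follows from storage width alone: each weight stored as a 32-bit float is replaced by an 8-bit value. Below is a back-of-the-envelope sketch, assuming roughly 3.5 million parameters for MobileNetV2 (an approximate figure used only for illustration) and ignoring the small overhead of storing quantization parameters. The helper `model_size_bytes` is hypothetical.

```python
def model_size_bytes(num_params, bits_per_param):
    """Storage needed for the weights alone, in bytes."""
    return num_params * bits_per_param / 8


NUM_PARAMS = 3_500_000  # assumed, approximate parameter count

float_size = model_size_bytes(NUM_PARAMS, 32)
quantized_size = model_size_bytes(NUM_PARAMS, 8)
ratio = float_size / quantized_size

print(f"float32: {float_size / 1e6:.1f} MB")
print(f"int8:    {quantized_size / 1e6:.1f} MB")
print(f"compression ratio: {ratio:.0f}x")
```

The ratio does not depend on the parameter count, so the same factor applies to any model whose weights all go from 32 bits to 8 bits.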

## Conclusion

In this tutorial, we demonstrated how to quantize a pre-trained model using MCT with a few lines of code. We saw that we can achieve a 4x compression ratio with minimal accuracy degradation.





Copyright 2022 Sony Semiconductor Israel, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
