# Quantization using the Model Compression Toolkit - ImageNet dataset example

## Overview

This quick-start guide shows how to use the Model Compression Toolkit (MCT) to quantize a pre-trained model. As an example, we quantize a model pre-trained on ImageNet and then evaluate its accuracy on the ImageNet validation dataset.

## Summary

In this tutorial we will cover:

1. Loading and preprocessing the ImageNet validation dataset
2. Loading and preprocessing an unlabeled ImageNet representative dataset
3. Post-training quantization using MCT
4. Accuracy evaluation of the floating-point and the quantized models

## Setup

Install the relevant packages:


In [1]:
!pip install -q model-compression-toolkit
!pip install -q tensorflow
!pip install -q tensorflow-model-optimization

In [2]:
import tensorflow as tf
import keras
import model_compression_toolkit as mct
from keras.applications.mobilenet_v2 import MobileNetV2

## Dataset preparation

Define dataset folders:

In [3]:
TEST_DATASET_FOLDER = '/data/projects/swat/datasets_src/ImageNet/ILSVRC2012_img_val_TFrecords'
TRAIN_DATASET_FOLDER = '/data/projects/swat/datasets_src/ImageNet/ILSVRC2012_img_train'

Let us define a few helper functions to load the evaluation dataset and the unlabeled representative dataset (used for quantization calibration), and to perform preprocessing:

In [4]:
def imagenet_preprocess_input(images, labels):
    return tf.keras.applications.mobilenet_v2.preprocess_input(images), labels
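
As a rough illustration of what this preprocessing step does: the Keras documentation states that the MobileNetV2 `preprocess_input` scales pixel values to the range [-1, 1]. The sketch below reproduces that scaling in plain Python for a few sample values. It assumes the scaling is `x / 127.5 - 1`, and `scale_pixel` is a hypothetical helper written for this example, not part of Keras or MCT.

```python
def scale_pixel(value):
    """Map a pixel intensity in [0, 255] to [-1, 1] (assumed x / 127.5 - 1 scaling)."""
    return value / 127.5 - 1.0


# Darkest, mid-range, and brightest pixel intensities.
pixels = [0, 127.5, 255]
scaled = [scale_pixel(p) for p in pixels]
print(scaled)
```

The labels are passed through unchanged, which is why `imagenet_preprocess_input` returns them alongside the scaled images.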

In [5]:
def get_validation_dataset():
    dataset = tf.keras.utils.image_dataset_from_directory(
        directory=TEST_DATASET_FOLDER,
        batch_size=50,
        image_size=[224, 224],
        shuffle=False,
        crop_to_aspect_ratio=True,
        interpolation='bilinear')
    dataset = dataset.map(lambda x, y: (imagenet_preprocess_input(x, y)))
    return dataset

In [6]:
def get_representative_dataset():
    print('loading dataset, this may take few minutes ...')
    dataset = tf.keras.utils.image_dataset_from_directory(
        directory=TRAIN_DATASET_FOLDER,
        batch_size=50,
        image_size=[224, 224],
        shuffle=True,
        crop_to_aspect_ratio=True,
        interpolation='bilinear')
    dataset = dataset.map(lambda x, y: (imagenet_preprocess_input(x, y)))

    def representative_dataset():
        return dataset.take(1).get_single_element()[0].numpy()

    return representative_dataset
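
The helper above follows a closure pattern: the outer function prepares the data once and returns a zero-argument callable that hands back one batch each time it is called. The sketch below shows the same pattern with toy data in plain Python. The names `make_representative_dataset` and `toy_batches` are hypothetical, and cycling through a fixed list is a simplification of drawing a batch from the shuffled training set.

```python
import itertools


def make_representative_dataset(batches):
    """Return a zero-argument callable that hands back one batch per call."""
    batch_iter = itertools.cycle(batches)  # loop over the toy batches forever

    def representative_dataset():
        return next(batch_iter)

    return representative_dataset


# Toy stand-ins for image batches (hypothetical data).
toy_batches = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
gen = make_representative_dataset(toy_batches)

# A calibration loop would call the generator once per iteration.
calibration_batches = [gen() for _ in range(4)]
print(calibration_batches)
```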

## Floating point model evaluation

Now we would like to evaluate the 32-bit floating-point model, show how to compress it into an 8-bit model, and check the effect on model accuracy.

First, we load the evaluation dataset from the ImageNet folder:

In [7]:
evaluation_dataset = get_validation_dataset()

Found 50000 files belonging to 1000 classes.


Second, we load the pre-trained MobileNetV2 model in floating-point precision:

In [8]:
model = MobileNetV2()

Then we evaluate the model on the evaluation dataset. Note that the model must be compiled with a loss and an evaluation metric before evaluation:

In [9]:
model.compile(loss=keras.losses.SparseCategoricalCrossentropy(), metrics=["accuracy"])
results = model.evaluate(evaluation_dataset)
print('Float model accuracy: ' + str(results[1]))

Float model accuracy: 0.7185199856758118


## Model quantization using MCT and re-evaluation of the model

Next, we would like to quantize the model using MCT. To do so, we need to define a representative dataset generator, which is a function that returns a batch of unlabeled images:

In [10]:
representative_dataset_gen = get_representative_dataset()

loading dataset, this may take few minutes ...
Found 1281167 files belonging to 1000 classes.


Then we apply hardware-friendly post-training quantization to the model, using 10 iterations for calibration:

In [11]:
quantized_model, quantization_info = mct.keras_post_training_quantization(model, representative_dataset_gen, n_iter=10)

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [01:01<00:00,  6.15s/it]
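
To give a feel for what moving from 32-bit floats to 8 bits means numerically, here is a generic sketch of uniform quantization in plain Python. It is not a description of MCT's internal algorithm: the symmetric range, the fixed `threshold`, and the helper name `quantize_dequantize` are all assumptions made for illustration.

```python
def quantize_dequantize(x, threshold, n_bits=8):
    """Simulate signed uniform quantization of x within [-threshold, threshold)."""
    levels = 2 ** (n_bits - 1)            # 128 integer steps on each side for 8 bits
    scale = threshold / levels            # width of one quantization step
    q = round(x / scale)                  # nearest integer level
    q = max(-levels, min(levels - 1, q))  # clip to the representable range [-128, 127]
    return q * scale                      # map back to a float for comparison


values = [0.0, 0.3, -0.72, 1.5]
approx = [quantize_dequantize(v, threshold=1.0) for v in values]
for original, quantized in zip(values, approx):
    print(f'{original:+.4f} -> {quantized:+.7f}')
```

Values inside the range are rounded to the nearest step, while values outside it (such as 1.5 here) are clipped. Calibration on representative data is one way to choose a range that keeps both kinds of error small.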


Finally, we evaluate the quantized model. Note the small accuracy gap relative to the floating-point model (about 0.25 percentage points):

In [12]:
quantized_model.compile(loss=keras.losses.SparseCategoricalCrossentropy(), metrics=["accuracy"])
results = quantized_model.evaluate(evaluation_dataset)
print('Quantized model accuracy: ' + str(results[1]))

Quantized model accuracy: 0.7160000205039978
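
The gap between the two reported accuracies can be quantified with a few lines of arithmetic. This sketch only reuses the two numbers printed above; the variable names are chosen for this example.

```python
float_accuracy = 0.7185199856758118      # reported above for the float model
quantized_accuracy = 0.7160000205039978  # reported above for the quantized model

# Absolute gap in percentage points and drop relative to the float baseline.
gap_points = (float_accuracy - quantized_accuracy) * 100
relative_drop = (float_accuracy - quantized_accuracy) / float_accuracy * 100

print(f'Absolute gap: {gap_points:.2f} percentage points')
print(f'Relative drop: {relative_drop:.2f}%')
```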
