# PyTorch Model from Timm - Quantization for IMX500

[Run this tutorial in Google Colab](https://colab.research.google.com/github/sony/model_optimization/blob/main/tutorials/notebooks/imx500_notebooks/pytorch/pytorch_timm_classification_model_for_imx500.ipynb)

## Overview

This tutorial walks through a quick, basic process for preparing a pre-trained model for deployment with MCT (Model Compression Toolkit).
We use an existing pre-trained model from [Timm](https://github.com/huggingface/pytorch-image-models). You can substitute any other compatible timm model that suits your requirements.



## Setup
### Install the relevant packages

In [None]:
!pip install -q torch
!pip install onnx
!pip install timm

Install MCT (if it’s not already installed). To make the utility functions used in this tutorial available, we also copy the [MCT tutorials folder](https://github.com/sony/model_optimization/tree/main/tutorials) and add it to the system path.

In [None]:
import sys
import importlib.util

if not importlib.util.find_spec('model_compression_toolkit'):
    !pip install model_compression_toolkit
!git clone https://github.com/sony/model_optimization.git temp_mct && mv temp_mct/tutorials . && \rm -rf temp_mct
sys.path.insert(0,"tutorials")


### Download ImageNet validation set
Download only the validation split of the ImageNet dataset.

Note that for demonstration purposes we use the validation set for the model quantization routines. Usually, a subset of the training dataset is used, but loading it is a heavy procedure that is unnecessary for the sake of this demonstration.

This step may take several minutes...

In [None]:
import os
if not os.path.isdir('imagenet'):
    !mkdir imagenet
    !wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_devkit_t12.tar.gz
    !mv ILSVRC2012_devkit_t12.tar.gz imagenet/
    !wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar
    !mv ILSVRC2012_img_val.tar imagenet/
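
Before building the dataset, it can help to confirm that both archives ended up in the `imagenet` folder, since the later `torchvision.datasets.ImageNet` call reads them from that root. The following is a minimal sketch; the helper name `missing_imagenet_files` is hypothetical and not part of the tutorial utilities.

```python
import os

# File names taken from the download cell above.
REQUIRED_ARCHIVES = ("ILSVRC2012_devkit_t12.tar.gz", "ILSVRC2012_img_val.tar")

def missing_imagenet_files(root="imagenet", required=REQUIRED_ARCHIVES):
    """Return the required archive names that are not present under `root`."""
    return [name for name in required
            if not os.path.isfile(os.path.join(root, name))]

missing = missing_imagenet_files("imagenet")
if missing:
    print("Missing archives:", ", ".join(missing))
else:
    print("Both ImageNet archives found.")
```

If anything is reported missing, rerunning the download cell is usually enough.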

## Model Quantization

### Download a Pre-Trained Model 


In [None]:
import timm

selected_model = 'mobilenetv2_100.ra_in1k'

model = timm.create_model(selected_model, pretrained=True)
model.eval()
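
The `selected_model` string above can be swapped for another timm model name. As a rough sketch, `timm.list_models` can be used to browse names that ship with pretrained weights; the helper `find_pretrained_models` and the wildcard pattern are illustrative assumptions, and whether a given model is suitable for IMX500 still has to be checked separately.

```python
def find_pretrained_models(pattern="mobilenet*"):
    """List timm model names with pretrained weights that match a wildcard pattern."""
    import timm  # imported lazily, so the helper can be defined before timm is installed
    return timm.list_models(pattern, pretrained=True)
```

Calling `find_pretrained_models("mobilenetv2*")` returns a list of candidate names to assign to `selected_model`.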


### Post training quantization using Model Compression Toolkit 

We are now ready to apply MCT's post-training quantization (PTQ). We first define a representative dataset and then quantize the model. For demonstration purposes, the validation dataset serves as the representative dataset. Calibration uses 80 representative images: `n_iters = 20` iterations of `BATCH_SIZE = 4` images each.

In [None]:
import model_compression_toolkit as mct
from model_compression_toolkit.core.pytorch.pytorch_device_config import get_working_device
from timm.data import create_loader, resolve_data_config
from typing import Iterator, Tuple, List
import torchvision


BATCH_SIZE = 4
n_iters = 20
IMG_SIZE = 256
DATA_ARGS = {'img_size': IMG_SIZE}
device = get_working_device()

# Load representative dataset
data_config = resolve_data_config(args=DATA_ARGS,
                                  model=model)

# Extract ImageNet validation dataset using torchvision "datasets" module
val_dataset = torchvision.datasets.ImageNet(root='./imagenet', split='val')
    
representative_dataset = create_loader(
    val_dataset,
    input_size=data_config['input_size'],
    batch_size=BATCH_SIZE,
    interpolation=data_config['interpolation'],
    mean=data_config['mean'],
    std=data_config['std'],
    crop_pct=data_config['crop_pct'],
    device=device)

# Define representative dataset generator
def get_representative_dataset(n_iter: int, dataset_loader: Iterator[Tuple]):
    """
    Create a representative dataset generator. The generator yields single-element
        lists, each holding one batch of preprocessed image tensors of shape [Batch, C, H, W].
    Args:
        n_iter: number of iterations for MCT to calibrate on
        dataset_loader: loader that yields (images, labels) batches
    Returns:
        A representative dataset generator
    """       
    def representative_dataset() -> Iterator[List]:
        ds_iter = iter(dataset_loader)
        for _ in range(n_iter):
            yield [next(ds_iter)[0]]

    return representative_dataset

# Get representative dataset generator
representative_dataset_gen = get_representative_dataset(n_iter=n_iters,
                                                        dataset_loader=representative_dataset)

# Perform post training quantization with the default configuration
quant_model, _ = mct.ptq.pytorch_post_training_quantization(model, representative_dataset_gen)
print('Quantized model is ready')
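
The representative dataset generator above builds a fresh iterator every time it is called, which is what lets the same callable be reused for quantization and later for export. The sketch below illustrates that pattern with a plain list standing in for the timm loader; `make_representative_dataset` and `fake_loader` are hypothetical names. It uses `itertools.islice` instead of `next()`, a variation that ends cleanly if the loader holds fewer than `n_iter` batches.

```python
from itertools import islice

def make_representative_dataset(n_iter, dataset_loader):
    """Return a callable that yields at most `n_iter` single-input batches per call."""
    def representative_dataset():
        # A fresh pass over the loader on every call, so the callable is reusable.
        for images, _labels in islice(dataset_loader, n_iter):
            yield [images]
    return representative_dataset

# Stand-in loader: 25 (images, labels) batches of 4 items each (placeholder data).
BATCH_SIZE, N_ITERS = 4, 20
fake_loader = [([f"img_{b}_{i}" for i in range(BATCH_SIZE)], [0] * BATCH_SIZE)
               for b in range(25)]

gen = make_representative_dataset(N_ITERS, fake_loader)
n_images = sum(len(batch[0]) for batch in gen())
print(n_images)  # 20 iterations x 4 images = 80
```

Calling `gen()` a second time yields the same 20 batches again, because each call restarts from the beginning of the loader.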

### Model Export

We can now export the quantized model, ready for deployment, to an `.onnx` file. Please ensure that `save_model_path` is set correctly.

In [None]:
mct.exporter.pytorch_export_model(model=quant_model,
                                  save_model_path='./qmodel.onnx',
                                  repr_dataset=representative_dataset_gen,
                                  onnx_opset_version=17)
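
After exporting, a quick sanity check is to load the file with the `onnx` package installed earlier and list the graph's input and output names. This is only a sketch: the helper `describe_onnx_io` is a hypothetical name, and it does not validate the quantization operators inside the exported graph.

```python
import os

def describe_onnx_io(path="./qmodel.onnx"):
    """Load an ONNX file and return the names of its graph inputs and outputs."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No exported model found at {path}")
    import onnx  # imported only once the file is known to exist
    model = onnx.load(path)
    inputs = [value.name for value in model.graph.input]
    outputs = [value.name for value in model.graph.output]
    return inputs, outputs
```

For example, `inputs, outputs = describe_onnx_io('./qmodel.onnx')` can be run right after the export cell.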

## Evaluation on ImageNet dataset

### Floating point model evaluation
Please ensure that the dataset path has been set correctly before running this code cell.

In [None]:
from tutorials.resources.utils.pytorch_tutorial_tools import classification_eval

val_loader = create_loader(
    val_dataset,
    input_size=data_config['input_size'],
    batch_size=BATCH_SIZE,
    interpolation=data_config['interpolation'],
    mean=data_config['mean'],
    std=data_config['std'],
    crop_pct=data_config['crop_pct'],
    device=device)

# Evaluate the model on ImageNet
eval_results = classification_eval(model, val_loader)

# Print float model Accuracy results
print("Float model Accuracy: {:.2f}".format(100 * eval_results[0]))
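
The internals of `classification_eval` are not shown in this tutorial. Assuming its first return value is top-1 accuracy expressed as a fraction (the print statement above multiplies it by 100), the metric can be sketched in plain Python as follows; `top1_accuracy` and the toy numbers are illustrative only.

```python
def top1_accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    correct = 0
    for scores, label in zip(logits, labels):
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)

# Toy data: 4 samples, 3 classes (placeholder numbers, not model outputs).
logits = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 0.1, 0.9], [0.4, 0.3, 0.2]]
labels = [1, 0, 2, 1]
print("Top-1 accuracy: {:.2f}".format(100 * top1_accuracy(logits, labels)))  # 75.00
```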

### Quantized model evaluation
We can now evaluate the quantized model. A slight drop in accuracy relative to the float model is expected; it can be reduced by enlarging the representative dataset or by using MCT's advanced quantization methods, such as GPTQ (Gradient-Based Post-Training Quantization) or EPTQ (Enhanced Post-Training Quantization).

In [None]:
# Evaluate the quantized model on ImageNet
eval_results = classification_eval(quant_model, val_loader)

# Print quantized model Accuracy results
print("Quantized model Accuracy: {:.2f}".format(100 * eval_results[0]))
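
If the accuracy gap is larger than acceptable, the GPTQ option mentioned above can be tried. The sketch below assumes the `mct.gptq.get_pytorch_gptq_config` and `mct.gptq.pytorch_gradient_post_training_quantization` entry points as documented for MCT; argument names can differ between versions, so check them against the installed release. The wrapper name `quantize_with_gptq` and the value `n_epochs=5` are arbitrary placeholders.

```python
def quantize_with_gptq(float_model, representative_data_gen, n_epochs=5):
    """Run MCT gradient-based PTQ and return the quantized model."""
    import model_compression_toolkit as mct  # lazy import: MCT is installed in the setup cell
    gptq_config = mct.gptq.get_pytorch_gptq_config(n_epochs=n_epochs)
    quant_model, _ = mct.gptq.pytorch_gradient_post_training_quantization(
        float_model,
        representative_data_gen,
        gptq_config=gptq_config,
    )
    return quant_model
```

The result of `quantize_with_gptq(model, representative_dataset_gen)` can be passed to `classification_eval` in the same way as `quant_model`. GPTQ fine-tunes the quantized weights, so it takes noticeably longer than plain PTQ.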

Copyright 2024 Sony Semiconductor Israel, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.