# YOLOX-Tiny Object Detection - Quantization for IMX500

[Run this tutorial in Google Colab](https://colab.research.google.com/github/sony/model_optimization/blob/main/tutorials/notebooks/imx500_notebooks/pytorch/pytorch_yolox-tiny_for_imx500.ipynb)

## Overview

In this tutorial, we illustrate a basic and quick process for preparing a pre-trained model for deployment using MCT. Specifically, we demonstrate how to download a pre-trained PyTorch YOLOX-Tiny model, compress it, and make it deployment-ready using MCT's post-training quantization techniques.

We will use an existing pre-trained YOLOX-Tiny model based on [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) and integrate box decoding and NMS into the model. The model was slightly adjusted for quantization. We will quantize it using MCT's post-training quantization and evaluate both the floating-point model and the quantized model on the COCO dataset.


## Summary

In this tutorial we will cover:

1. Post-Training Quantization (PTQ) of the YOLOX object detection model using MCT.
2. Data preparation: loading and preprocessing validation and representative datasets from COCO.
3. Accuracy evaluation of the floating-point and the quantized models.

## Setup
### Install the relevant packages

In [None]:
!pip install -q torch
!pip install onnx
!pip install -q pycocotools
!pip install 'sony-custom-layers'

Install MCT (if it’s not already installed). Additionally, to use the utility functions this tutorial relies on, we also copy the [MCT tutorials folder](https://github.com/sony/model_optimization/tree/main/tutorials) and add it to the system path.

In [None]:
import sys
import os
import importlib.util  # the `util` submodule must be imported explicitly before calling find_spec

if not importlib.util.find_spec('model_compression_toolkit'):
    !pip install model_compression_toolkit
!git clone https://github.com/sony/model_optimization.git temp_mct && mv temp_mct/tutorials . && \rm -rf temp_mct
sys.path.insert(0,"tutorials")
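
The installation check in the cell above relies on `importlib.util.find_spec`, which returns `None` when a top-level module cannot be found on the import path. A minimal sketch of that check as a reusable helper is shown below; the helper name `is_installed` is hypothetical and is not part of MCT or the tutorial utilities.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module `module_name` can be found on the import path."""
    # find_spec returns None (instead of raising) when a top-level module is missing.
    return importlib.util.find_spec(module_name) is not None


# `json` ships with Python, so it is always found.
print(is_installed("json"))
# In this tutorial the same check would be applied to "model_compression_toolkit".
print(is_installed("model_compression_toolkit"))
```

Checking before installing keeps an already configured environment unchanged, which is useful when a specific MCT version is pinned.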

### Download COCO evaluation set

In [None]:
if not os.path.isdir('coco'):
    !wget -nc http://images.cocodataset.org/annotations/annotations_trainval2017.zip
    !unzip -q -o annotations_trainval2017.zip -d ./coco
    !echo Done loading annotations
    !wget -nc http://images.cocodataset.org/zips/val2017.zip
    !unzip -q -o val2017.zip -d ./coco
    !echo Done loading val2017 images
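
Because the download cell is skipped whenever a `coco` directory already exists, an interrupted earlier run can leave the dataset incomplete. The sketch below checks for the two paths that the later cells of this tutorial read (`./coco/val2017` and `./coco/annotations/instances_val2017.json`). The function name `missing_coco_paths` is hypothetical, and the check only verifies that the paths exist, not that the archives were fully extracted.

```python
from pathlib import Path
from typing import List

# Paths (relative to the dataset root) that the later cells of this tutorial use.
EXPECTED_PATHS = ("val2017", "annotations/instances_val2017.json")


def missing_coco_paths(root: str = "./coco") -> List[str]:
    """Return the expected relative paths that do not exist under `root`."""
    root_path = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (root_path / rel).exists()]


missing = missing_coco_paths()
if missing:
    print("Missing COCO files or folders:", missing)
else:
    print("COCO evaluation set looks complete.")
```

If anything is reported as missing, deleting the `coco` directory and rerunning the download cell is the simplest fix.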

## Quantization

### Download a Pre-Trained Model 

We begin by downloading a pre-trained YOLOX-Tiny model from the [YOLOX GitHub repository](https://github.com/Megvii-BaseDetection/YOLOX). The implementation used here is based on YOLOX and includes a slightly modified detection head (mainly the box-decoding part) that was adapted for quantization. For further details on the model's implementation, please refer to [MCT Models Garden - YOLOX](https://github.com/sony/model_optimization/tree/main/tutorials/mct_model_garden/models_pytorch/yolox).

In [None]:
# Download YOLOX-Tiny
!wget -nc https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_tiny.pth

from tutorials.mct_model_garden.models_pytorch.yolox.yolox import YOLOX
import yaml

yaml_path = "tutorials/mct_model_garden/models_pytorch/yolox/yolox.yaml"
with open(yaml_path, 'r', encoding='utf-8') as f:
    yolox_cfg = yaml.safe_load(f)

yolox_tiny_cfg = yolox_cfg['tiny']
model = YOLOX(yolox_tiny_cfg)
model.load_weights("yolox_tiny.pth")
model.eval()

### Post training quantization (PTQ) using Model Compression Toolkit (MCT)

Now we are all set to use MCT's post-training quantization. We first define a representative dataset and then quantize the model. Note that, for demonstration purposes, we use the evaluation dataset as the representative dataset. We calibrate the model using 120 representative images, divided into 30 iterations of `BATCH_SIZE` (4) images each.


In [None]:
import model_compression_toolkit as mct
from tutorials.mct_model_garden.evaluation_metrics.coco_evaluation import coco_dataset_generator
from tutorials.mct_model_garden.models_pytorch.yolox.yolox_preprocess import yolox_preprocess_chw_transpose
from tutorials.mct_model_garden.models_pytorch.yolox.yolox import YOLOXPostProcess
from typing import Iterator

REPRESENTATIVE_DATASET_FOLDER = './coco/val2017/'
REPRESENTATIVE_DATASET_ANNOTATION_FILE = './coco/annotations/instances_val2017.json'
BATCH_SIZE = 4
n_iters = 30

# Load representative dataset
representative_dataset = coco_dataset_generator(dataset_folder=REPRESENTATIVE_DATASET_FOLDER,
                                                annotation_file=REPRESENTATIVE_DATASET_ANNOTATION_FILE,
                                                preprocess=yolox_preprocess_chw_transpose,
                                                batch_size=BATCH_SIZE)


def get_representative_dataset(dataset: Iterator, n_iter: int):
    """
    This function creates a representative dataset generator. The generator yields numpy
        arrays of batches of shape: [Batch, C, H, W].
    Args:
        dataset: dataset iterator
        n_iter: number of iterations for MCT for calibration
    Returns:
        A representative dataset generator
    """       
    def _generator():
        for _ind in range(n_iter):
            batch, label = next(iter(dataset))
            yield [batch]

    return _generator

# Get representative dataset generator
representative_dataset_gen = get_representative_dataset(dataset=representative_dataset, n_iter=n_iters)

# Set IMX500 TPC
tpc = mct.get_target_platform_capabilities(fw_name="pytorch",
                                           target_platform_name='imx500',
                                           target_platform_version='v3')

# Define target Resource Utilization for mixed precision weights quantization.
# YOLOX-Tiny has about 5M parameters, i.e. about 5e6 bytes of weights at 'standard' 8-bit quantization.
# We set the target weights memory (in bytes) to 87% of that.
resource_utilization = mct.core.ResourceUtilization(weights_memory=5e6 * 0.87)

# Perform post training quantization
quant_model, _ = mct.ptq.pytorch_post_training_quantization(in_module=model,
                                                            representative_data_gen=representative_dataset_gen,
                                                            target_resource_utilization=resource_utilization,
                                                            target_platform_capabilities=tpc)

# Integrate the quantized model with box decoder and NMS
quant_model = YOLOXPostProcess(quant_model)

print('Quantized model is ready!')
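
The representative dataset passed to MCT in the cell above is a zero-argument callable that yields one list per calibration step, with one entry per model input. The sketch below reproduces that pattern on toy data, without COCO or MCT. The names `make_representative_gen` and `toy_dataset` are hypothetical. Unlike the tutorial cell, the sketch creates the iterator once, outside the loop: calling `next(iter(dataset))` inside the loop advances correctly only when `dataset` is already a generator object (which is assumed to be the case for `coco_dataset_generator`), whereas for a list or a similar re-iterable container it would return the first batch on every step.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def make_representative_gen(dataset: Iterable[Tuple[list, list]],
                            n_iter: int) -> Callable[[], Iterator[List[list]]]:
    """Return a zero-argument callable that yields `n_iter` single-input batches."""
    def _generator():
        it = iter(dataset)  # create the iterator once so that every step advances
        for _ in range(n_iter):
            batch, _labels = next(it)  # labels are not needed for calibration
            yield [batch]              # one list entry per model input
    return _generator


# Hypothetical stand-in data: five (images, labels) pairs.
toy_dataset = [([i, i], ["label"]) for i in range(5)]

gen = make_representative_gen(toy_dataset, n_iter=3)
batches = list(gen())
print(len(batches), batches)
```

With the tutorial's settings (`n_iters = 30`, `BATCH_SIZE = 4`), the same pattern feeds 30 x 4 = 120 images to calibration.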

### Export

Now we can export the quantized model, ready for deployment on IMX500, to an `.onnx` file. Please ensure that `save_model_path` has been set correctly.

In [None]:
mct.exporter.pytorch_export_model(model=quant_model,
                                  save_model_path='./model.onnx',
                                  repr_dataset=representative_dataset_gen)

## Evaluation on COCO dataset

### Floating point model evaluation
Next, we evaluate the floating-point model using the `cocoeval` library alongside additional dataset utilities. This lets us verify that the mAP matches that of the original model.
Note that we set the preprocessing according to [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX).
Please ensure that the dataset path has been set correctly before running this code cell.

In [None]:
from tutorials.mct_model_garden.evaluation_metrics.coco_evaluation import coco_evaluate
from tutorials.mct_model_garden.models_pytorch.yolox.yolox import model_predict

EVAL_DATASET_FOLDER = './coco/val2017'
EVAL_DATASET_ANNOTATION_FILE = './coco/annotations/instances_val2017.json'

# Define boxes resizing information to map between the model's output and the original image dimensions
output_resize = {'shape': yolox_tiny_cfg['img_size'], 'aspect_ratio_preservation': True, "align_center": False, 'normalized_coords': False}

# Integrate the floating-point model with box decoder and NMS
model = YOLOXPostProcess(model)

# Evaluate the floating-point model
eval_results = coco_evaluate(model=model,
                             dataset_folder=EVAL_DATASET_FOLDER,
                             annotation_file=EVAL_DATASET_ANNOTATION_FILE,
                             preprocess=yolox_preprocess_chw_transpose,
                             output_resize=output_resize,
                             batch_size=BATCH_SIZE,
                             model_inference=model_predict)

print("Floating-point model mAP: {:.4f}".format(eval_results[0]))
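
The `output_resize` dictionary above describes how predicted boxes are mapped from the model's input resolution back to each original image. The sketch below illustrates one plausible reading of those settings, and it is an assumption rather than the tutorial's actual implementation: the image is resized with its aspect ratio preserved, placed at the top-left corner of the padded input (`align_center: False`), and box coordinates are in pixels (`normalized_coords: False`). The function name, the `(x_min, y_min, x_max, y_max)` box layout and the 416x416 input size are hypothetical example choices; the real logic lives in the tutorial's evaluation utilities and may differ, for example in coordinate order or clipping.

```python
from typing import Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def map_box_to_original(box: Box,
                        model_input_hw: Tuple[int, int],
                        original_hw: Tuple[int, int]) -> Box:
    """Map a box from model-input pixel coordinates to original-image pixel coordinates.

    Assumes an aspect-ratio-preserving resize with top-left placement, so a single
    scale factor applies to both axes and no padding offset has to be subtracted.
    """
    in_h, in_w = model_input_hw
    orig_h, orig_w = original_hw
    # The resize factor is limited by the tighter of the two dimensions.
    scale = min(in_h / orig_h, in_w / orig_w)
    x_min, y_min, x_max, y_max = box
    return (x_min / scale, y_min / scale, x_max / scale, y_max / scale)


# Example: a 480x640 (height x width) image fed to an assumed 416x416 model input.
print(map_box_to_original((0.0, 0.0, 208.0, 208.0), (416, 416), (480, 640)))
```

With centered padding (`align_center: True`), the padding offset would also have to be subtracted before dividing by the scale.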

### Quantized model evaluation
Finally, we evaluate the quantized model with the same settings. A slight decrease in mAP relative to the floating-point model is expected; it can be further mitigated by either expanding the representative dataset or employing MCT's advanced quantization methods, such as GPTQ (Gradient-Based/Enhanced Post Training Quantization).

In [None]:
# Evaluate quantized model
eval_results = coco_evaluate(model=quant_model,
                             dataset_folder=EVAL_DATASET_FOLDER,
                             annotation_file=EVAL_DATASET_ANNOTATION_FILE,
                             preprocess=yolox_preprocess_chw_transpose,
                             output_resize=output_resize,
                             batch_size=BATCH_SIZE,
                             model_inference=model_predict)

print("Quantized model mAP: {:.4f}".format(eval_results[0]))

Copyright 2024 Sony Semiconductor Israel, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.