# Post-Training Quantization of a Nanodet-Plus Object Detection Model

[Run this tutorial in Google Colab](https://colab.research.google.com/github/sony/model_optimization/blob/main/tutorials/notebooks/imx500_notebooks/keras/example_keras_nanodet_plus_for_imx500.ipynb)

## Overview


In this tutorial, we demonstrate post-training quantization (PTQ) with the Model Compression Toolkit (MCT) for a pre-trained object detection model in Keras. Specifically, we integrate the post-processing, including the non-maximum suppression (NMS) layer, into the model. This integration aligns with the capabilities of the IMX500 target platform.

In this example, we use an existing pre-trained Nanodet-Plus model taken from [https://github.com/RangiLyu/nanodet](https://github.com/RangiLyu/nanodet). We convert the model to a TensorFlow model that includes box decoding and an NMS layer. We then quantize the model using MCT post-training quantization and evaluate both the floating-point model and the quantized model on the COCO dataset.


## Summary

In this tutorial we will cover:

1. Post-training quantization of a Keras object detection model, including the post-processing, using MCT.
2. Data preparation - loading and preprocessing validation and representative datasets from COCO.
3. Accuracy evaluation of the floating-point and the quantized models.

## Setup
Install the relevant packages.

In [None]:
TF_VER = '2.14.0'

!pip install -q tensorflow=={TF_VER}
!pip install -q pycocotools
!pip install 'huggingface-hub<=0.21.4'

Install MCT (if it's not already installed). To use the utility functions this tutorial needs, we also copy the [MCT tutorials folder](https://github.com/sony/model_optimization/tree/main/tutorials) and add it to the system path.


In [None]:
import sys
import os
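import importlib.util  # import the submodule explicitly; `import importlib` alone does not guarantee `importlib.util` is loaded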

if not importlib.util.find_spec('model_compression_toolkit'):
    !pip install model_compression_toolkit
!git clone https://github.com/sony/model_optimization.git temp_mct && mv temp_mct/tutorials . && \rm -rf temp_mct
sys.path.insert(0,"tutorials")

Finally, download the COCO evaluation set (annotations and `val2017` images).

In [None]:
if not os.path.isdir('coco'):
    !wget -nc http://images.cocodataset.org/annotations/annotations_trainval2017.zip
    !unzip -q -o annotations_trainval2017.zip -d ./coco
    !echo Done loading annotations
    !wget -nc http://images.cocodataset.org/zips/val2017.zip
    !unzip -q -o val2017.zip -d ./coco
    !echo Done loading val2017 images

## Floating Point Model

### Load the pre-trained weights of Nanodet-Plus
We begin by loading a pre-trained [Nanodet-Plus](https://huggingface.co/SSI-DNN/keras_nanodet_plus_x1.5_416x416) model. This implementation is based on [nanodet](https://github.com/RangiLyu/nanodet). For further insights into the model's implementation details, please refer to [mct_model_garden](https://github.com/sony/model_optimization/tree/main/tutorials/mct_model_garden/models_keras/nanodet). 

In [None]:
from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras('SSI-DNN/keras_nanodet_plus_x1.5_416x416')

### Generate Nanodet-Plus Keras model
In the following steps, we integrate the post-processing components: box decoding layers followed by TensorFlow's [tf.image.combined_non_max_suppression](https://www.tensorflow.org/api_docs/python/tf/image/combined_non_max_suppression) layer.


In [None]:
import tensorflow as tf
from keras.models import Model
from tutorials.mct_model_garden.models_keras.nanodet.nanodet_keras_model import nanodet_box_decoding

# Parameters of nanodet-plus-m-1.5x_416
INPUT_RESOLUTION = 416
INPUT_SHAPE = (INPUT_RESOLUTION, INPUT_RESOLUTION, 3)
SCALE_FACTOR = 1.5
BOTTLENECK_RATIO = 0.5
FEATURE_CHANNELS = 128

# Add Nanodet Box decoding layer (decode the model outputs to bounding box coordinates)
scores, boxes = nanodet_box_decoding(model.output, res=INPUT_RESOLUTION)

# Add Tensorflow NMS layer
outputs = tf.image.combined_non_max_suppression(
    boxes,
    scores,
    max_output_size_per_class=300,
    max_total_size=300,
    iou_threshold=0.65,
    score_threshold=0.001,
    pad_per_class=False,
    clip_boxes=False
    )

model = Model(model.input, outputs, name='Nanodet_plus_m_1.5x_416')

print('Model is ready for evaluation')
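
To make the `iou_threshold=0.65` argument above more concrete, here is a minimal pure-Python sketch of the intersection-over-union (IoU) measure that NMS relies on. It is an illustration only, not TensorFlow's implementation. The `iou` helper and the example boxes are hypothetical, and the `(y1, x1, y2, x2)` corner ordering is assumed to match the convention TensorFlow documents for its NMS ops.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    ya1, xa1, ya2, xa2 = box_a
    yb1, xb1, yb2, xb2 = box_b
    # Height and width of the overlap rectangle (zero if the boxes do not intersect)
    inter_h = max(0.0, min(ya2, yb2) - max(ya1, yb1))
    inter_w = max(0.0, min(xa2, xb2) - max(xa1, xb1))
    inter = inter_h * inter_w
    area_a = (ya2 - ya1) * (xa2 - xa1)
    area_b = (yb2 - yb1) * (xb2 - xb1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

IOU_THRESHOLD = 0.65  # same value as passed to the NMS layer above

kept = (0.0, 0.0, 10.0, 10.0)      # hypothetical higher-scoring box
near_dup = (0.0, 1.0, 10.0, 11.0)  # shifted by one unit: large overlap
far = (0.0, 5.0, 10.0, 15.0)       # shifted by five units: small overlap

iou_near = iou(kept, near_dup)
iou_far = iou(kept, far)
print(f"near duplicate suppressed: {iou_near > IOU_THRESHOLD}")
print(f"distant box suppressed:    {iou_far > IOU_THRESHOLD}")
```

A box whose IoU with an already selected, higher-scoring box of the same class exceeds the threshold is dropped. A lower threshold therefore prunes more aggressively.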

#### Evaluate the floating point model
Next, we evaluate the floating-point model using the `cocoeval` library alongside additional dataset utilities. We can verify that the mAP aligns with that of the original model.
Note that we set `BATCH_SIZE` to 5 and apply preprocessing that follows [Nanodet](https://github.com/RangiLyu/nanodet/tree/main).
Please ensure that the dataset path has been set correctly before running this code cell.

In [None]:
import cv2
from tutorials.mct_model_garden.evaluation_metrics.coco_evaluation import coco_dataset_generator, CocoEval

EVAL_DATASET_FOLDER = './coco/val2017'
EVAL_DATASET_ANNOTATION_FILE = './coco/annotations/instances_val2017.json'

BATCH_SIZE = 5

def nanodet_preprocess(x):
    img_mean = [103.53, 116.28, 123.675]
    img_std = [57.375, 57.12, 58.395]
    x = cv2.resize(x, (416, 416))
    x = (x - img_mean) / img_std
    return x

# Load COCO evaluation set
val_dataset = coco_dataset_generator(dataset_folder=EVAL_DATASET_FOLDER,
                                     annotation_file=EVAL_DATASET_ANNOTATION_FILE,
                                     preprocess=nanodet_preprocess,
                                     batch_size=BATCH_SIZE)

# Initialize the evaluation metric object
coco_metric = CocoEval(EVAL_DATASET_ANNOTATION_FILE)

# Iterate over the evaluation set
for batch_idx, (images, targets) in enumerate(val_dataset):
    
    # Run inference on the batch
    outputs = model(images)

    # Add the model outputs to metric object (a dictionary of outputs after postprocess: boxes, scores & classes)
    coco_metric.add_batch_detections(outputs, targets)
    if (batch_idx + 1) % 100 == 0:
        print(f'processed {(batch_idx + 1) * BATCH_SIZE} images')

# Print float model mAP results
print("Float model mAP: {:.4f}".format(coco_metric.result()[0]))
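
The normalization step inside `nanodet_preprocess` above can be checked in isolation. The sketch below reuses the same mean and standard deviation values but leaves out the `cv2.resize` call and uses a dummy image, so it needs only NumPy. The `normalize` helper and the dummy data are hypothetical. The values look like the usual ImageNet statistics in BGR order, which would suit images read with OpenCV, but that is an assumption about the dataset loader.

```python
import numpy as np

# Per-channel statistics copied from nanodet_preprocess
IMG_MEAN = np.array([103.53, 116.28, 123.675])
IMG_STD = np.array([57.375, 57.12, 58.395])

def normalize(image):
    """Apply (x - mean) / std per channel; NumPy broadcasts over the last axis."""
    return (image.astype(np.float64) - IMG_MEAN) / IMG_STD

# Dummy 2x2 "image" in which every pixel equals the channel means (placeholder data)
dummy = np.tile(IMG_MEAN, (2, 2, 1))
normalized = normalize(dummy)

print(normalized.shape)            # height, width, channels
print(np.allclose(normalized, 0))  # pixels at the mean map to zero
```

A pixel equal to the channel means maps to zero, and a pixel one standard deviation above the mean maps to one.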

## Quantize Model

### Post training quantization using Model Compression Toolkit 

Now we are ready to use MCT's post-training quantization. We will define a representative dataset and proceed with the model quantization. Please note that, for the sake of demonstration, we use the evaluation dataset as our representative dataset (and skip the download of the training dataset). We use 100 representative images for calibration (20 iterations of `BATCH_SIZE` images each).
As in the previous section, please ensure that the dataset path has been set correctly.

In [None]:
import model_compression_toolkit as mct
from typing import Iterator, Tuple, List

REPRESENTATIVE_DATASET_FOLDER = './coco/val2017'
REPRESENTATIVE_DATASET_ANNOTATION_FILE = './coco/annotations/instances_val2017.json'
n_iters = 20

# Load representative dataset
representative_dataset = coco_dataset_generator(dataset_folder=REPRESENTATIVE_DATASET_FOLDER,
                                                annotation_file=REPRESENTATIVE_DATASET_ANNOTATION_FILE,
                                                preprocess=nanodet_preprocess,
                                                batch_size=BATCH_SIZE)

# Define representative dataset generator
def get_representative_dataset(n_iter: int, dataset_loader: Iterator[Tuple]):
    """
    This function creates a representative dataset generator.

    Args:
        n_iter: number of iterations for MCT to calibrate on
        dataset_loader: iterator yielding (images, targets) batches
    Returns:
        A representative dataset generator
    """
    def representative_dataset() -> Iterator[List]:
        """
        Creates a representative dataset generator from the dataset loader. The generator yields
        single-element lists, each holding a batch of images of shape [Batch, H, W, C].

        Returns:
            A representative dataset generator
        """
        ds_iter = iter(dataset_loader)
        for _ in range(n_iter):
            yield [next(ds_iter)[0]]

    return representative_dataset

# Perform post-training quantization
quant_model, _ = mct.ptq.keras_post_training_quantization(model,
                                                          get_representative_dataset(n_iters, representative_dataset))

print('Quantized model is ready')
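
To see what the representative dataset wrapper above hands to MCT, the sketch below runs the same wrapper against a hypothetical stand-in loader (`dummy_loader`, which yields string placeholders instead of image arrays). It only illustrates the generator's behavior and says nothing about how MCT consumes the batches.

```python
from typing import Iterator, List

def get_representative_dataset(n_iter: int, dataset_loader):
    """Same wrapper as in the cell above, repeated so this sketch is self-contained."""
    def representative_dataset() -> Iterator[List]:
        ds_iter = iter(dataset_loader)
        for _ in range(n_iter):
            # Keep only the images (index 0) and wrap them in a single-element list
            yield [next(ds_iter)[0]]
    return representative_dataset

def dummy_loader(num_batches: int, batch_size: int):
    """Hypothetical stand-in for coco_dataset_generator: yields (images, targets)."""
    for b in range(num_batches):
        images = [f"img_{b}_{i}" for i in range(batch_size)]
        targets = [None] * batch_size
        yield images, targets

BATCH_SIZE = 5
N_ITERS = 20

gen_fn = get_representative_dataset(N_ITERS, dummy_loader(num_batches=50, batch_size=BATCH_SIZE))
batches = list(gen_fn())
total_images = sum(len(batch[0]) for batch in batches)
print(f"{len(batches)} calibration batches, {total_images} images in total")
```

Because the loader in this sketch is a one-shot Python generator, calling `gen_fn()` a second time continues from the next unread batch rather than restarting from the beginning. If the real loader behaves the same way, it needs to hold enough batches for every pass over the representative data.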

### Evaluate quantized model
Lastly, we can evaluate the performance of the quantized model. There is a slight decrease in performance that can be further mitigated by either expanding the representative dataset or employing MCT's advanced quantization methods, such as EPTQ (Enhanced Post Training Quantization).

In [None]:
# Re-load COCO evaluation set
val_dataset = coco_dataset_generator(dataset_folder=EVAL_DATASET_FOLDER,
                                     annotation_file=EVAL_DATASET_ANNOTATION_FILE,
                                     preprocess=nanodet_preprocess,
                                     batch_size=BATCH_SIZE)

# Initialize the evaluation metric object
coco_metric = CocoEval(EVAL_DATASET_ANNOTATION_FILE)

# Iterate over the evaluation set
for batch_idx, (images, targets) in enumerate(val_dataset):
    # Run inference on the batch
    outputs = quant_model(images)

    # Add the model outputs to metric object (a dictionary of outputs after postprocess: boxes, scores & classes)
    coco_metric.add_batch_detections(outputs, targets)
    if (batch_idx + 1) % 100 == 0:
        print(f'processed {(batch_idx + 1) * BATCH_SIZE} images')

# Print quantized model mAP results
print("Quantized model mAP: {:.4f}".format(coco_metric.result()[0]))
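
The slight accuracy decrease mentioned above comes from the rounding error that quantization introduces. The sketch below illustrates this with a plain symmetric 8-bit uniform quantizer applied to a few placeholder values. It is a simplified illustration under stated assumptions and not MCT's actual algorithm; the `quantize_dequantize` helper and the sample values are hypothetical.

```python
def quantize_dequantize(values, n_bits=8):
    """Symmetric uniform quantization to signed integers, followed by dequantization."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    q_max = 2 ** (n_bits - 1) - 1  # 127 for 8 bits
    scale = max_abs / q_max        # width of one quantization step
    restored = []
    for v in values:
        q = round(v / scale)               # nearest integer level
        q = max(-q_max, min(q_max, q))     # clip to the representable range
        restored.append(q * scale)         # map back to floating point
    return restored

weights = [-1.0, -0.33, 0.0, 0.1234, 0.5, 1.0]  # placeholder values, not real model weights
restored = quantize_dequantize(weights)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"largest rounding error: {max_error:.6f}")
```

With this scheme, the error per value is bounded by half a quantization step. Such per-value errors accumulate across layers, which is one reason a quantized model's mAP can differ from the floating-point result.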

Copyright 2024 Sony Semiconductor Israel, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.