# Post-Training Quantization of an EfficientDet Object Detection Model

[Run this tutorial in Google Colab](https://colab.research.google.com/github/sony/model_optimization/blob/main/tutorials/notebooks/keras/ptq/example_keras_effdet_lite0.ipynb)

## Overview

In this notebook, we'll demonstrate post-training quantization with MCT on a pre-trained object detection model in Keras. In addition, we'll integrate a post-processing custom layer from [sony-custom-layers](https://github.com/sony/custom_layers) into the model. This custom layer is supported by the imx500 target platform capabilities.

In this example, we will use an existing pre-trained EfficientDet model taken from [efficientdet-pytorch](https://github.com/rwightman/efficientdet-pytorch). We will convert the model to a Keras functional model that includes the custom [PostProcess Layer](https://github.com/sony/custom_layers/blob/main/sony_custom_layers/keras/object_detection/ssd_post_process.py). We will then quantize the model using MCT post-training quantization and evaluate the floating-point model and the quantized model on the COCO dataset.

We'll use the data loader and evaluation capabilities of [timm](https://github.com/huggingface/pytorch-image-models) that were used for the original PyTorch pretrained model. The conversion to the Keras model is not covered in this notebook. You can review the conversion code [here](https://github.com/sony/model_optimization/tree/main/tutorials/mct_model_garden/models_keras/efficientdet).

Steps:
* **Set up the environment**: Install and import the relevant packages
* **Initialize the dataset**: Download the COCO evaluation set and prepare the evaluation code
* **Keras float model**: Create the Keras model, assign the pretrained weights and evaluate it
* **Quantize Keras model**: Quantize the model and evaluate it

**Note**: The following code should be run on a GPU.

## Setup

Install and import the relevant packages.

In [None]:
!pip install -q tensorflow
!pip install -q mct-nightly
!pip install -q torch
!pip install -q torchvision
!pip install -q timm
!pip install -q effdet
!pip install -q sony-custom-layers

In [None]:
from typing import Dict, Optional
from time import time
import torch
import tensorflow as tf
# MCT is referenced as `mct` in the quantization and export cells below
import model_compression_toolkit as mct
from timm.utils import AverageMeter
from effdet.config import get_efficientdet_config
from effdet import create_dataset, create_loader, create_evaluator
from effdet.data import resolve_input_config

In order to convert the PyTorch model, you'll need to use the conversion code in the [MCT tutorials folder](https://github.com/sony/model_optimization/tree/main/tutorials), so we'll clone the MCT repository to a local folder and only use that code. The installed MCT package will be used for quantization. 
  **It's important to note that we use the most up-to-date MCT code available.**

In [None]:
!git clone https://github.com/sony/model_optimization.git local_mct

In [None]:
import sys
sys.path.insert(0,"./local_mct")
!pip install -r ./local_mct/requirements.txt
from tutorials.mct_model_garden.models_keras.efficientdet import EfficientDetKeras

## Initialize dataset

### Load the COCO evaluation set

In [None]:
!wget -nc http://images.cocodataset.org/annotations/annotations_trainval2017.zip
!unzip -q -o annotations_trainval2017.zip -d ./coco
!echo Done loading annotations
!wget -nc http://images.cocodataset.org/zips/val2017.zip
!unzip -q -o val2017.zip -d ./coco
!echo Done loading val2017 images
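
Before building the data loader, it can help to confirm that the archives were unpacked where the evaluation code will look. The sketch below assumes the usual COCO 2017 layout under `./coco` (an `annotations/instances_val2017.json` file and a `val2017/` image folder); `missing_coco_paths` is a hypothetical helper and is not part of effdet or MCT.

```python
from pathlib import Path


def missing_coco_paths(root="./coco"):
    """Return the expected COCO val2017 paths that are absent under root."""
    root = Path(root)
    expected = [
        root / "annotations" / "instances_val2017.json",  # assumed annotation file
        root / "val2017",  # assumed image folder
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_coco_paths()
print("COCO layout looks complete" if not missing else f"Missing: {missing}")
```

An empty list means both paths exist; anything else points at the download or unzip step to rerun.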

### Initialize the data loader and evaluation functions

These functions were adapted from the [efficientdet-pytorch](https://github.com/rwightman/efficientdet-pytorch) repository.

In [None]:
class TorchWrapper(torch.nn.Module):
    """
    A class to wrap the EfficientDet Keras model in a torch.nn.Module
    so it can be evaluated with timm's evaluation code
    """
    def __init__(self, keras_model: tf.keras.Model):
        super(TorchWrapper, self).__init__()
        self.model = keras_model

    @property
    def config(self):
        # a property used by the evaluation code
        return self.model.config

    def forward(self, x: torch.Tensor,
                img_info: Optional[Dict[str, torch.Tensor]] = None):
        """
        Mimics the forward inputs of the EfficientDet PyTorch model.
        Args:
            x: input images
            img_info: input image info for scaling the outputs

        Returns:
            A torch.Tensor of shape [Batch, Boxes, 6], the same as
            the PyTorch model

        """
        device = x.device
        keras_input = x.detach().cpu().numpy().transpose((0, 2, 3, 1))
        outputs = self.model(keras_input)

        outs = [torch.Tensor(o.numpy()).to(device) for o in outputs]
        # reorder boxes (y, x, y2, x2) to (x, y, x2, y2)
        outs[0] = outs[0][:, :, [1, 0, 3, 2]]
        # scale boxes to original image size
        outs[0] = outs[0] * img_info['img_scale'].view((-1, 1, 1))
        return torch.cat([outs[0], outs[1].unsqueeze(2),
                          outs[2].unsqueeze(2) + 1], 2)


def get_coco_dataloader(batch_size=16, split='val', config=None):
    """
    Get the torch data loader and evaluation object
    Args:
        batch_size: batch size for data loader
        split: dataset split
        config: model config

    Returns:
        The DataLoader and evaluation object for calculating accuracy

    """
    root = './coco'

    args = dict(interpolation='bilinear', mean=None,
                std=None, fill_color=None)
    dataset = create_dataset('coco', root, split)
    input_config = resolve_input_config(args, config)
    loader = create_loader(
        dataset,
        input_size=input_config['input_size'],
        batch_size=batch_size,
        use_prefetcher=True,
        interpolation=input_config['interpolation'],
        fill_color=input_config['fill_color'],
        mean=input_config['mean'],
        std=input_config['std'],
        num_workers=0,
        pin_mem=False,
    )
    evaluator = create_evaluator('coco', dataset, pred_yxyx=False)

    return loader, evaluator


def acc_eval(_model: tf.keras.Model, batch_size=16, config=None):
    """
    This function takes a Keras model, wraps it in a Torch model and runs evaluation
    Args:
        _model: Keras model
        batch_size: batch size of the data loader
        config: model config

    Returns:
        The COCO mAP of the model, as returned by the evaluator

    """
    # wrap Keras model in a Torch model so it can run in timm's evaluation code
    _model = TorchWrapper(_model)
    # Evaluate the input model
    val_loader, evaluator = get_coco_dataloader(batch_size=batch_size, config=config)

    batch_time = AverageMeter()
    end = time()
    last_idx = len(val_loader) - 1
    with torch.no_grad():
        for i, (input, target) in enumerate(val_loader):
            output = _model(input, img_info=target)

            evaluator.add_predictions(output, target)

            # measure elapsed time
            batch_time.update(time() - end)
            end = time()
            if i % 10 == 0 or i == last_idx:
                print(
                    f'Test: [{i:>4d}/{len(val_loader)}]  '
                    f'Time: {batch_time.val:.3f}s ({batch_time.avg:.3f}s, {input.size(0) / batch_time.avg:>7.2f}/s)  '
                )

    return evaluator.evaluate()
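
The output conversion in `TorchWrapper.forward` is easy to get wrong, so here is a minimal single-detection sketch in plain Python. It assumes, as the wrapper's comments state, that the Keras post-processing layer returns boxes as (y, x, y2, x2) and that class indices need to be shifted by one; `to_effdet_row` and the sample values are hypothetical.

```python
def to_effdet_row(box_yxyx, score, label, img_scale):
    """Convert one detection to the [x, y, x2, y2, score, class] row layout."""
    y1, x1, y2, x2 = box_yxyx
    # reorder to (x, y, x2, y2) and scale back to the original image size
    box_xyxy = [c * img_scale for c in (x1, y1, x2, y2)]
    # shift the class index by one, as the wrapper does
    return box_xyxy + [score, label + 1]


row = to_effdet_row([10.0, 20.0, 30.0, 40.0], score=0.5, label=0, img_scale=2.0)
print(row)  # [40.0, 20.0, 80.0, 60.0, 0.5, 1]
```

The wrapper performs the same steps on whole batches with tensor indexing and `torch.cat`, producing a tensor of shape [Batch, Boxes, 6].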

## Keras model

Create the Keras model and copy weights from pretrained PyTorch weights file. Saved as "model.keras".

In [None]:
model_name = 'tf_efficientdet_lite0'
config = get_efficientdet_config(model_name)

model = EfficientDetKeras(config, pretrained_backbone=False).get_model([*config.image_size] + [3])

### Evaluate Keras model

We evaluate the model to verify the conversion to a Keras model succeeded. The result will be compared to the quantized model evaluation.

In [None]:
float_map = acc_eval(model, batch_size=64, config=config)

## Quantize Keras model

In this section, the Keras model is quantized with MCT using the following parameters:
- **Target Platform**: IMX500-v1
- **Mixed-Precision** weights compression so the model will fit the IMX500 memory size

The quantized model is saved as "quant_model.keras".
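
To see why mixed precision helps meet a weights-memory target, the arithmetic can be sketched as follows. This assumes the budget is counted in bytes and ignores any per-layer overhead; the parameter counts and the budget are made-up illustration values, not EfficientDet measurements, and `weights_memory_bytes` is a hypothetical helper.

```python
def weights_memory_bytes(params_per_bitwidth):
    """Total weight storage in bytes for a {bit_width: num_parameters} mapping."""
    total_bits = sum(bits * n for bits, n in params_per_bitwidth.items())
    return total_bits / 8


# hypothetical model with 3,000,000 weights
all_8bit = weights_memory_bytes({8: 3_000_000})
mixed = weights_memory_bytes({8: 2_000_000, 4: 1_000_000})

budget = 2_500_000  # hypothetical budget in bytes
print(all_8bit <= budget)  # False: 3,000,000 bytes
print(mixed <= budget)     # True: 2,500,000 bytes
```

Lowering the bit width of some layers reduces the total until it fits the budget, which is the trade-off a mixed-precision search explores.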

In [None]:
loader, _ = get_coco_dataloader(split='val', config=config)


def get_representative_dataset(n_iter):
    """
    This function creates a representative dataset generator
    Args:
        n_iter: number of iterations for MCT to calibrate on

    Returns:
        A representative dataset generator

    """

    def representative_dataset():
        """
        Creates a representative dataset generator from a PyTorch data loader. The generator yields
        numpy arrays of batches of shape: [Batch, H, W, C]
        Returns:
            A representative dataset generator

        """
        ds_iter = iter(loader)
        for _ in range(n_iter):
            t = next(ds_iter)[0]
            # Convert the Torch tensor from the data loader to a numpy array and transpose to the
            # right shape: [B, C, H, W] -> [B, H, W, C]
            tf_shaped_tensor = t.detach().cpu().numpy().transpose((0, 2, 3, 1))
            yield [tf_shaped_tensor]

    return representative_dataset


# Set IMX500-v1 TPC
tpc = mct.get_target_platform_capabilities("tensorflow", 'imx500', target_platform_version='v1')
# set weights memory size, so the quantized model will fit the IMX500 memory
kpi = mct.core.KPI(weights_memory=2674291)
# set MixedPrecision configuration for compressing the weights
mp_config = mct.core.MixedPrecisionQuantizationConfig(use_hessian_based_scores=False)
core_config = mct.core.CoreConfig(mixed_precision_config=mp_config)
quant_model, _ = mct.ptq.keras_post_training_quantization(
    model,
    get_representative_dataset(20),
    target_kpi=kpi,
    core_config=core_config,
    target_platform_capabilities=tpc)
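
The representative dataset contract can be tried in isolation, without the COCO loader. The sketch below assumes the same convention as the cell above: a callable that returns a generator, where each item is a list holding one [B, H, W, C] numpy array per model input. The random data and the 320x320 size are placeholders, `random_nchw_batches` and `make_representative_dataset` are hypothetical names, and real calibration should use real images.

```python
import numpy as np


def random_nchw_batches(n_batches, batch_size=2, size=320, seed=0):
    """Stand-in for the PyTorch loader: yields (images, target) with NCHW images."""
    rng = np.random.default_rng(seed)
    for _ in range(n_batches):
        yield rng.random((batch_size, 3, size, size), dtype=np.float32), None


def make_representative_dataset(n_iter, loader_fn):
    def representative_dataset():
        ds_iter = iter(loader_fn(n_iter))
        for _ in range(n_iter):
            images = next(ds_iter)[0]
            # [B, C, H, W] -> [B, H, W, C]
            yield [images.transpose((0, 2, 3, 1))]

    return representative_dataset


gen = make_representative_dataset(3, random_nchw_batches)
shapes = [batch[0].shape for batch in gen()]
print(shapes)  # [(2, 320, 320, 3), (2, 320, 320, 3), (2, 320, 320, 3)]
```

Because the returned object is a function rather than a generator instance, it can be called again to restart iteration from the beginning.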

### Evaluate quantized Keras model

The quantized Keras model is evaluated in the same way as the original model.

In [None]:
quant_map = acc_eval(quant_model, batch_size=64, config=config)

print(f' ===>> Float model mAP = {100*float_map:2.3f}, Quantized model mAP = {100*quant_map:2.3f}')
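
To put the two numbers in context, the accuracy cost of quantization can be expressed as an absolute and a relative drop. This is a small sketch that assumes both values are fractions in [0, 1], consistent with the `100*` scaling in the print above; `map_drop` is a hypothetical helper and the inputs are illustration values, not measured results.

```python
def map_drop(float_map, quant_map):
    """Return (absolute drop in mAP points, relative drop in percent)."""
    absolute = 100 * (float_map - quant_map)
    relative = 100 * (float_map - quant_map) / float_map
    return absolute, relative


# hypothetical values, not measured results
abs_drop, rel_drop = map_drop(0.25, 0.24)
print(f"absolute drop: {abs_drop:.2f} mAP points, relative drop: {rel_drop:.1f}%")
```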

## Export and Load the quantized model
Lastly, we will demonstrate how to export the quantized model into a file and then load it.

We will use the `keras_export_model` function to save the quantized model, with its integrated custom quantizers, in the ".keras" file format.

In [None]:
# Export a keras model with mctq custom quantizers into a file
mct.exporter.keras_export_model(model=quant_model,
                                save_model_path='./quant_model.keras')

Then, we can load the saved model using the `keras_load_quantized_model` function. In this case, we must also pass the load function the extra custom layer integrated into the model, namely `SSDPostProcess`.

In [None]:
from sony_custom_layers.keras.object_detection.ssd_post_process import SSDPostProcess

custom_objects = {SSDPostProcess.__name__: SSDPostProcess} # An extra custom layer integrated in the model 
quant_model_from_file = mct.keras_load_quantized_model('./quant_model.keras', custom_objects=custom_objects)
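
The `custom_objects` argument above is a plain dictionary from class name to class. If a model used several custom layers, the same mapping could be built with a small helper like the one sketched here; `build_custom_objects` and the stub class are hypothetical and only illustrate the dictionary shape.

```python
def build_custom_objects(*classes):
    """Map each class name to the class itself, one entry per custom layer."""
    return {cls.__name__: cls for cls in classes}


class SSDPostProcessStub:  # hypothetical stand-in for the real layer class
    pass


custom = build_custom_objects(SSDPostProcessStub)
print(list(custom.keys()))  # ['SSDPostProcessStub']
```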

Copyright 2023 Sony Semiconductor Israel, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.