# EfficientDet and Mixed-Precision Post-Training Quantization in Keras using the Model Compression Toolkit (MCT)

[Run this tutorial in Google Colab](https://colab.research.google.com/github/SonySemiconductorSolutions/mct-model-optimization/blob/main/tutorials/notebooks/task_notebooks/keras/example_effdet_keras_mixed_precision_ptq.ipynb)

### Attention

The Model Compression Toolkit (MCT) used in this tutorial requires TensorFlow 2.15 or earlier, which is not compatible with the default Google Colab environment (Python 3.12 or later).

**If you are running this tutorial on Google Colab, you must change the runtime type to use Python 3.11 before proceeding.**  
For detailed instructions, please refer to the [README.md](../../../README.md).

## Overview
This quick-start guide explains how to use the **Model Compression Toolkit (MCT)** to quantize an EfficientDet model. We will load a pre-trained model and quantize it with MCT using **Mixed-Precision Post-Training Quantization (PTQ)**.

## Summary
In this tutorial, we will cover:

1. Loading and preprocessing the COCO dataset.
2. Constructing an unlabeled representative dataset.
3. Post-Training Quantization using MCT.
4. Accuracy evaluation of the floating-point and the quantized models.

## efficientdet-pytorch (Dependent External Repository)
This tutorial uses a pre-trained PyTorch model from the repository linked below and converts it to a Keras model.  
Installation instructions are provided in the **Setup** section.   
[efficientdet-pytorch](https://github.com/rwightman/efficientdet-pytorch)

### License (efficientdet-pytorch)
```
   Copyright 2020 Ross Wightman

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
```

## Additional Code Attribution

This tutorial uses custom model conversion code located in the `models/efficientdet/` and `models/utils/` directories.
These files facilitate the conversion of PyTorch EfficientDet models to Keras/TensorFlow format for use with MCT.

### Source Code Attribution

The following files contain code derived from open-source PyTorch implementations:

**efficientdet-pytorch**
- Source: https://github.com/rwightman/efficientdet-pytorch
- License: Apache License 2.0
- Files: `effdet_keras.py`, `torch2keras_weights_translation.py`
- Modifications: PyTorch layers converted to Keras/TensorFlow equivalents, weight loading adapted for Keras format

**pytorch-image-models (timm)**
- Source: https://github.com/huggingface/pytorch-image-models
- License: Apache License 2.0
- Files: `effnet_keras.py`, `effnet_blocks_keras.py`
- Modifications: `torch.nn.Module` classes converted to Keras layers

### License (efficientdet-pytorch)
Please refer to the license section earlier in this notebook.

### License (pytorch-image-models)
```
   Copyright 2019 Ross Wightman

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
```

For detailed attribution information, see the header comments in each file under `models/`.

## Setup  

First, install the relevant packages:  
This step may take several minutes...


In [None]:
!pip install tensorflow==2.15.*
!pip install numpy==1.26.4
!pip install opencv-python==4.9.0.80
!pip install pycocotools==2.0.10

# install efficientdet-pytorch(effdet) and dependencies
!pip install torch==2.6.0 torchvision==0.21.0
!pip install timm==0.9.16
!pip install effdet==0.4.1

In [None]:
import importlib.util
import sys
if not importlib.util.find_spec('model_compression_toolkit'):
    !pip install model_compression_toolkit
!git clone https://github.com/SonySemiconductorSolutions/mct-model-optimization.git temp_mct && mv temp_mct/tutorials/notebooks/task_notebooks/keras/models . && \rm -rf temp_mct
sys.path.insert(0,"models")

In [None]:
import tensorflow as tf
from typing import Dict, List, Tuple, Any
import random
import os
import cv2
import numpy as np
import itertools
from tqdm import tqdm
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

import model_compression_toolkit as mct
from edgemdt_cl.keras import SSDPostProcess
from edgemdt_cl.keras.object_detection import ScoreConverter

from effdet.config import get_efficientdet_config
from effdet.anchors import Anchors

### Various Settings
Here, you can configure the parameters listed below.  

#### Parameter setting
- IMG_HEIGHT, IMG_WIDTH  
  The height and width of the input images.
- SCORE_THR  
  The class-score threshold used for Non-Maximum Suppression (NMS) and for evaluation.
- IOU_THR  
  The Intersection over Union (IoU) threshold used for NMS.
- CALIB_ITER  
  The number of samples to use when generating representative data for quantization.
- WEIGHTS_COMPRESSION_RATIO  
  The target weights memory for mixed-precision quantization, given as a fraction of the weights memory of the 8-bit model.
- BATCH_SIZE  
  The number of images per batch during evaluation.

In [None]:
# Parameter setting
IMG_HEIGHT = 320
IMG_WIDTH = 320
SCORE_THR = 0.001
IOU_THR = 0.50
CALIB_ITER = 10
WEIGHTS_COMPRESSION_RATIO = 0.85
BATCH_SIZE = 16

Load the pre-trained PyTorch model and convert it to a Keras model.

In [None]:
from models.efficientdet import EfficientDetKeras
model_name = 'tf_efficientdet_lite0'
config = get_efficientdet_config(model_name)
input_shape = [*config.image_size] + [3]

float_model = EfficientDetKeras(config, pretrained_backbone=False).get_model(input_shape)

Next, we add the custom layer **SSDPostProcess** (from edgemdt_cl) as post-processing.  

SSDPostProcess decodes the EfficientDet inference results from anchor format to bounding-box format and runs Non-Maximum Suppression (NMS) to remove overlapping boxes.

In [None]:
# Make CustomLayer Instance
anchors = tf.constant(Anchors.from_config(config).boxes.detach().cpu().numpy())
ssd_pp = SSDPostProcess(anchors, [1, 1, 1, 1], [*config.image_size],
                        ScoreConverter.SIGMOID, score_threshold=SCORE_THR, iou_threshold=IOU_THR,
                        max_detections=config.max_det_per_image)

# Add CustomLayer to model (named `inputs` to avoid shadowing the built-in `input`)
inputs = tf.keras.layers.Input(shape=input_shape)
x_class, x_box = float_model(inputs)
outputs = ssd_pp((x_box, x_class))
full_float_model = tf.keras.Model(inputs=inputs, outputs=outputs)

The input and output formats of SSDPostProcess are shown below.  
For details, see the [API documentation](https://sonysemiconductorsolutions.github.io/aitrios-edge-mdt-cl/edgemdt_cl/keras.html#SSDPostProcess).

Inputs:  
&emsp;A list or tuple of:  
- rel_codes: Relative codes (encoded offsets).  
- scores: Scores or logits.  

Returns:  
&emsp;'CombinedNonMaxSuppression' named tuple:  
- nmsed_boxes: Selected boxes sorted by scores in descending order.
- nmsed_scores: Scores corresponding to the selected boxes.
- nmsed_classes: Labels corresponding to the selected boxes. 
- valid_detections: The number of valid detections out of max_detections (unused in this tutorial).
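
To make the two stages concrete, here is a minimal pure-Python sketch of anchor decoding followed by greedy NMS. It is an illustration only, not the edgemdt_cl implementation: the helper names (`decode_box`, `iou`, `greedy_nms`) are hypothetical, and the center-size encoding with per-coordinate scale factors is assumed to follow the common Faster R-CNN convention.

```python
import math

def decode_box(rel_code, anchor, scale_factors=(1.0, 1.0, 1.0, 1.0)):
    """Decode (ty, tx, th, tw) offsets against an anchor (ymin, xmin, ymax, xmax)."""
    ty, tx, th, tw = rel_code
    sy, sx, sh, sw = scale_factors
    ya_min, xa_min, ya_max, xa_max = anchor
    ha, wa = ya_max - ya_min, xa_max - xa_min
    yc_a, xc_a = ya_min + ha / 2, xa_min + wa / 2
    # Offsets shift the anchor center; log-scale terms resize the anchor.
    yc, xc = ty / sy * ha + yc_a, tx / sx * wa + xc_a
    h, w = math.exp(th / sh) * ha, math.exp(tw / sw) * wa
    return (yc - h / 2, xc - w / 2, yc + h / 2, xc + w / 2)

def iou(a, b):
    """Intersection over Union of two (ymin, xmin, ymax, xmax) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, score_thr, iou_thr, max_det):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted((i for i, s in enumerate(scores) if s > score_thr),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) == max_det:
            break
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep

# Toy boxes (made-up values): the second box overlaps the first heavily.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, score_thr=0.001, iou_thr=0.5, max_det=100)
print(kept)  # [0, 2]
```

With all-zero offsets, `decode_box` returns the anchor itself, which is a quick way to sanity-check the decoding convention.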

## Dataset preparation
### Download the COCO dataset

**Note**  
In this tutorial, we will use a subset of COCO train2017 for calibration during quantization and COCO val2017 for evaluation.

This step may take several minutes...

In [None]:
if not os.path.isdir('COCO_dataset'):
    !mkdir COCO_dataset
    !wget -P COCO_dataset http://images.cocodataset.org/annotations/annotations_trainval2017.zip
    !wget -P COCO_dataset http://images.cocodataset.org/zips/train2017.zip
    !wget -P COCO_dataset http://images.cocodataset.org/zips/val2017.zip
    !unzip COCO_dataset/annotations_trainval2017.zip -d COCO_dataset
    !unzip COCO_dataset/train2017.zip -d COCO_dataset
    !unzip COCO_dataset/val2017.zip -d COCO_dataset

Here, we set the paths to the annotation files and image folders of the downloaded dataset.

In [None]:
COCO_TRAIN_IMG_DIR = "COCO_dataset/train2017/"
COCO_VAL_IMG_DIR = "COCO_dataset/val2017/"
COCO_TRAIN_ANN_JSON = "COCO_dataset/annotations/instances_train2017.json"
COCO_VAL_ANN_JSON = "COCO_dataset/annotations/instances_val2017.json"

The classes below prepare the downloaded COCO dataset for calibration during quantization and for evaluation.  
We define a dataset class and a dataloader class for the COCO dataset.

In [None]:
class CocoDataset:
    """
    PyTorch-style COCO dataset class.
    The preprocessing matches the pipeline of EfficientDet (efficientdet-pytorch).
    """

    def __init__(self, img_dir: str, ann_json: str, img_size: Tuple[int, int] = (320, 320)):
        """
        Initialize COCO dataset.

        Args:
            img_dir (str): A directory path containing COCO images.
            ann_json (str): A file path to COCO annotation json file.
            img_size (Tuple[int, int]): Target image size for EfficientDet(efficientdet-pytorch) model.
        """
        self.img_dir = img_dir
        self.coco = COCO(ann_json)
        self.img_ids = self.coco.getImgIds()
        self.img_size = img_size
        
        # Normalization parameters matching EfficientDet(efficientdet-pytorch) configuration
        self.mean = np.array([0.5, 0.5, 0.5], dtype=np.float32)
        self.std = np.array([0.5, 0.5, 0.5], dtype=np.float32)
        self.fill_value = (self.mean * 255).astype(np.uint8)

    def __len__(self) -> int: 
        return len(self.img_ids)

    def __getitem__(self, idx: int) -> Dict[str, Any]:
        """
        Load and preprocess one sample of the COCO dataset.

        Args:
            idx (int): Index of the image in the dataset.

        Returns:
            Dict[str, Any]: A dictionary containing:
                'input' (np.ndarray): Preprocessed image.
                'id' (int): Image ID.
                'file_name' (str): Image file name.
                'ratio' (float): Scale factor used in preprocessing.
        """
        img_id = self.img_ids[idx]
        img_info = self.coco.loadImgs([img_id])[0]
        img_path = os.path.join(self.img_dir, img_info['file_name'])

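        org_img = cv2.imread(img_path)
        if org_img is None:
            # cv2.imread returns None instead of raising when the file cannot be read
            raise FileNotFoundError(f"Failed to read image: {img_path}")
        org_img = cv2.cvtColor(org_img, cv2.COLOR_BGR2RGB)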
        input_img, ratio = self.preprocess(input_img=org_img)

        sample = {
            'input': input_img,
            'id': img_id,
            'file_name': img_info['file_name'],
            'ratio': ratio
        }
        return sample
    
    def preprocess(self, input_img: np.ndarray) -> Tuple[np.ndarray, float]:
        """
        Preprocess image to match the pipeline of EfficientDet(efficientdet-pytorch).
        
        Args:
            input_img (np.ndarray): Input image in HWC format.

        Returns:
            Tuple[np.ndarray, float]:
                - Preprocessed image.
                - Scale factor used in resizing.
        """
        height, width = input_img.shape[:2]
        target_h, target_w = self.img_size
        
        # Calculate scale factor for letterbox resize
        img_scale = min(target_h / height, target_w / width)
        
        # Resize with bilinear interpolation
        scaled_h = int(height * img_scale)
        scaled_w = int(width * img_scale)
        resized_img = cv2.resize(input_img, (scaled_w, scaled_h), interpolation=cv2.INTER_LINEAR)
        
        # Pad with mean value
        padded_img = np.full((target_h, target_w, 3), self.fill_value, dtype=np.uint8)
        padded_img[:scaled_h, :scaled_w, :] = resized_img
        
        # Normalize: (x/255 - mean) / std
        normalized_img = (padded_img.astype(np.float32) / 255.0 - self.mean) / self.std

        return normalized_img, img_scale
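
As a worked example of the arithmetic in `preprocess`, the sketch below reproduces the scale computation and the normalization for a single image size without loading any image. The function names (`letterbox_params`, `normalize_pixel`) are hypothetical helpers introduced only for this illustration.

```python
def letterbox_params(height, width, target_h=320, target_w=320):
    """Return (scale, scaled_h, scaled_w) for an aspect-preserving resize."""
    scale = min(target_h / height, target_w / width)
    return scale, int(height * scale), int(width * scale)

def normalize_pixel(value, mean=0.5, std=0.5):
    """Map a pixel value in [0, 255] to the normalized model input range."""
    return (value / 255.0 - mean) / std

# A 480x640 (H x W) image is limited by its width: scale = 320 / 640 = 0.5
scale, scaled_h, scaled_w = letterbox_params(480, 640)
print(scale, scaled_h, scaled_w)  # 0.5 240 320

# Rows 240..319 of the 320x320 canvas are padding filled with the mean value.
print(normalize_pixel(0), normalize_pixel(255))  # -1.0 1.0
```

The padding value 127 normalizes to roughly -0.004, so the padded region is close to zero at the model input.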

In [None]:
class CocoDataLoader:
    """
    PyTorch-style dataloader class for CocoDataset.
    """

    def __init__(self, dataset: CocoDataset, batch_size: int, shuffle: bool = False):
        """
        Initialize COCO dataloader.

        Args:
            dataset (CocoDataset): Dataset to draw samples from.
            batch_size (int): Number of samples per batch.
            shuffle (bool): Whether to shuffle the dataset at the start of each iteration.
        """
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.count = 0
        self.inds = list(range(len(dataset)))

    def __iter__(self):
        self.count = 0
        if self.shuffle:
            random.shuffle(self.inds)

        return self

    def __next__(self) -> Dict[str, Any]:
        """
        Return the next batch of the COCO dataloader.

        Returns:
            Dict[str, Any]: A dictionary of batched values:
                'input' (np.ndarray): Preprocessed images.
                'id' (np.ndarray): Image IDs.
                'file_name' (np.ndarray): Image file names.
                'ratio' (np.ndarray): Scale factors used in preprocessing.
        """
        if self.count >= len(self.dataset):
            raise StopIteration

        batch_sample = {}
        batch_count = 0
        while batch_count < self.batch_size and self.count < len(self.dataset):
            index = self.inds[self.count]
            sample = self.dataset[index]
            for sample_key in sample.keys():
                batch_sample.setdefault(sample_key, []).append(sample[sample_key])
            self.count += 1
            batch_count += 1
        for sample_key in batch_sample.keys():
            batch_sample[sample_key] = np.array(batch_sample[sample_key])

        return batch_sample
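
The batching logic in `__next__` can be illustrated with toy data. The sketch below is a simplified stand-in, not the class above: `iterate_batches` is a hypothetical helper, the sample values are made up, and the lists are left as plain Python lists rather than converted to NumPy arrays.

```python
def iterate_batches(samples, batch_size):
    """Yield dicts of lists, grouping up to batch_size samples per batch."""
    for start in range(0, len(samples), batch_size):
        batch = {}
        for sample in samples[start:start + batch_size]:
            for key, value in sample.items():
                batch.setdefault(key, []).append(value)
        yield batch

# Five toy samples using two of the keys that CocoDataset returns
toy_samples = [{'id': i, 'ratio': 0.5} for i in range(5)]
batches = list(iterate_batches(toy_samples, batch_size=2))

print([len(b['id']) for b in batches])  # [2, 2, 1]
print(batches[-1])                      # {'id': [4], 'ratio': [0.5]}
```

As in the dataloader above, the last batch is smaller when the dataset size is not a multiple of the batch size.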

In [None]:
val_dataset = CocoDataset(
    img_dir = COCO_VAL_IMG_DIR, ann_json = COCO_VAL_ANN_JSON,
    img_size = (IMG_HEIGHT, IMG_WIDTH)
)
train_dataset = CocoDataset(
    img_dir = COCO_TRAIN_IMG_DIR, ann_json=COCO_TRAIN_ANN_JSON,
    img_size = (IMG_HEIGHT, IMG_WIDTH)
)

# For evaluation
val_dataloader = CocoDataLoader(
    val_dataset, batch_size=BATCH_SIZE, shuffle=False
)
# For calibration (no labels required)
calib_loader = CocoDataLoader(
    train_dataset, batch_size=1, shuffle=False
)

print(len(train_dataset))
print(len(val_dataset))

## Representative Dataset
For quantization with MCT, we need to define a representative dataset required by the PTQ algorithm. This dataset is a generator that yields a list containing one batch of images per iteration:

In [None]:
def representative_dataset_gen():
    for sample in itertools.islice(itertools.cycle(calib_loader), CALIB_ITER):
        yield [sample['input']]
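
The combination of `itertools.cycle` and `itertools.islice` caps the number of yielded batches and wraps around if the loader runs out first. The sketch below shows this behavior with a toy loader; `ToyLoader` and `take_calibration_batches` are hypothetical names used only for illustration.

```python
import itertools

class ToyLoader:
    """Hypothetical stand-in for a dataloader that yields three batches."""
    def __iter__(self):
        return iter(['batch0', 'batch1', 'batch2'])

def take_calibration_batches(loader, n_iter):
    # cycle() replays the items it saved during the first pass once the
    # loader is exhausted; islice() stops after n_iter items.
    return list(itertools.islice(itertools.cycle(loader), n_iter))

print(take_calibration_batches(ToyLoader(), 5))
# ['batch0', 'batch1', 'batch2', 'batch0', 'batch1']
print(take_calibration_batches(ToyLoader(), 2))
# ['batch0', 'batch1']
```

Because `cycle` replays saved items rather than iterating the loader again, a shuffling loader would not be reshuffled on the second pass.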

## Target Platform Capabilities (TPC)
In addition, MCT optimizes the model for dedicated hardware platforms. This is done using TPC (for more details, please visit our [documentation](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html)). Here, we use the TPC object for imx500 hardware with version 1.0:

In [None]:
tpc = mct.get_target_platform_capabilities(tpc_version='1.0', device_type='imx500')

## Mixed Precision Configurations
We will create a `MixedPrecisionQuantizationConfig` that defines the search options for mixed-precision:


In [None]:
configuration = mct.core.CoreConfig(
    mixed_precision_config=mct.core.MixedPrecisionQuantizationConfig(num_of_images=CALIB_ITER))

In [None]:
# Get Resource Utilization information to constrain your model's memory size.
resource_utilization_data = mct.core.keras_resource_utilization_data(
    full_float_model,
    representative_dataset_gen,
    configuration,
    target_platform_capabilities=tpc)
 
# Define target Resource Utilization for mixed precision weights quantization.
resource_utilization = mct.core.ResourceUtilization(resource_utilization_data.weights_memory * WEIGHTS_COMPRESSION_RATIO)
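
To see what the weights-memory target means, consider the toy calculation below. It only illustrates the budget arithmetic and is not MCT's mixed-precision search: the layer names and sizes are hypothetical, and `weights_memory_bytes` is a helper invented for this sketch.

```python
def weights_memory_bytes(param_counts, bit_widths):
    """Total weights memory in bytes for a per-layer bit-width assignment."""
    return sum(param_counts[name] * bit_widths[name] / 8 for name in param_counts)

# Hypothetical layer sizes (number of weights); not taken from EfficientDet
param_counts = {'conv1': 1000, 'conv2': 4000, 'head': 5000}

# Baseline: every layer at 8 bits, then scale by the compression ratio
baseline = weights_memory_bytes(param_counts, {name: 8 for name in param_counts})
target = baseline * 0.85  # same role as WEIGHTS_COMPRESSION_RATIO

all_8bit = {'conv1': 8, 'conv2': 8, 'head': 8}
mixed = {'conv1': 8, 'conv2': 8, 'head': 4}

print(weights_memory_bytes(param_counts, all_8bit) <= target)  # False
print(weights_memory_bytes(param_counts, mixed) <= target)     # True
```

In this toy case, keeping every layer at 8 bits exceeds the 85% budget, while lowering one large layer to 4 bits fits within it.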

## Post-Training Quantization using MCT
Now for the exciting part! Let's run PTQ on the model.

In [None]:
quantized_model, quantization_info = mct.ptq.keras_post_training_quantization(
                                        in_model=full_float_model,
                                        representative_data_gen=representative_dataset_gen,
                                        target_platform_capabilities=tpc,
                                        core_config=configuration,
                                        target_resource_utilization=resource_utilization)

## Model Evaluation
Now, we will create a function for evaluating a model.  
The evaluation results before and after quantization are printed in the cell output.

In [None]:
@tf.function
def inference_step(model: tf.keras.Model, 
                   input_imgs: np.ndarray) -> Any:
    """
    Model Inference Wrapper for @tf.function.

    Args:
        model (tf.keras.Model): Evaluation model.
        input_imgs (np.ndarray): Input image.
    
    Returns:
        Any: Model outputs.
    """
    return model(input_imgs, training=False)

def evaluate(model: tf.keras.Model, val_dataloader: CocoDataLoader,
             score_threshold: float = 0.1):
    """
    Evaluation of the COCO dataset.

    Args:
        model (tf.keras.Model): Evaluation model.
        val_dataloader (CocoDataLoader): Evaluation dataset.
        score_threshold (float): Score threshold.
    """
    model.trainable = False

    results = []
    for sample in tqdm(val_dataloader, desc="Evaluating"):
        input_imgs = sample['input']
        img_ids = sample['id']
        ratios = sample['ratio']

        nmsed_boxes, nmsed_scores, nmsed_classes, _ = inference_step(model, input_imgs)
        nmsed_boxes = nmsed_boxes.numpy()
        nmsed_scores = nmsed_scores.numpy()
        nmsed_classes = nmsed_classes.numpy()

        for batch_idx in range(len(img_ids)):
            img_id = img_ids[batch_idx]
            ratio = ratios[batch_idx]
            # boxes: [N, 4] (ymin, xmin, ymax, xmax), scores: [N], labels: [N]
            for box, score, label in zip(nmsed_boxes[batch_idx], nmsed_scores[batch_idx], nmsed_classes[batch_idx]):
                if score > score_threshold:
                    box /= ratio
                    y_min, x_min, y_max, x_max = box.tolist()
                    width = x_max - x_min
                    height = y_max - y_min
                    result = {
                        'image_id': int(img_id),
                        'category_id': int(label) + 1,  # Convert class index (0-based) to COCO category ID (1-based)
                        'bbox': [int(x_min), int(y_min), int(width), int(height)],
                        'score': float(score),
                    }
                    results.append(result)

    # Evaluate against the ground truth of the dataset wrapped by the dataloader
    coco_gt = val_dataloader.dataset.coco

    coco_dt = coco_gt.loadRes(results)
    evaluator = COCOeval(coco_gt, coco_dt, iouType='bbox')
    evaluator.evaluate()
    evaluator.accumulate()
    evaluator.summarize()
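
The box conversion inside `evaluate` can be checked in isolation. The sketch below mirrors that step with a hypothetical helper, `to_coco_bbox`, and a made-up box. No padding offset is needed because the preprocessing places the resized image in the top-left corner of the canvas.

```python
def to_coco_bbox(box, ratio):
    """Convert (ymin, xmin, ymax, xmax) in resized coordinates to COCO [x, y, w, h]."""
    # Undo the preprocessing scale to get back to original image coordinates.
    y_min, x_min, y_max, x_max = (coord / ratio for coord in box)
    return [int(x_min), int(y_min), int(x_max - x_min), int(y_max - y_min)]

# A box predicted on an image that was downscaled with ratio = 0.5
print(to_coco_bbox((10.0, 20.0, 110.0, 220.0), ratio=0.5))  # [40, 20, 400, 200]
```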

Let's start with the floating-point model evaluation.  
This step may take several minutes...

In [None]:
print("evaluating float model (COCO mAP)...")
evaluate(full_float_model, val_dataloader,
         score_threshold = SCORE_THR)

Finally, let's evaluate the quantized model:  
This step may take several minutes...

In [None]:
print("evaluating quantized model (COCO mAP)...")
evaluate(quantized_model, val_dataloader,
         score_threshold = SCORE_THR)

## Export and Load the quantized model
Lastly, we will demonstrate how to export the quantized model into a file and then load it.

We will use the `keras_export_model` function to save the quantized model, together with its integrated custom quantizers, in the `.keras` file format.

In [None]:
# Export a keras model with mctq custom quantizers into a file
mct.exporter.keras_export_model(model=quantized_model,
                                save_model_path='./effdet_keras_mixed_precision_ptq.keras')

Then, we can load the saved model using the `keras_load_quantized_model` function. In this case, we also have to pass the load function the extra custom layer integrated into the model, namely `SSDPostProcess`.

In [None]:
from edgemdt_cl.keras.object_detection.ssd_post_process import SSDPostProcess

custom_objects = {SSDPostProcess.__name__: SSDPostProcess} # An extra custom layer integrated in the model 
quant_model_from_file = mct.keras_load_quantized_model('./effdet_keras_mixed_precision_ptq.keras', custom_objects=custom_objects)

## Copyrights

Copyright 2025 Sony Semiconductor Solutions, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
