# YOLOX and Mixed-Precision Post-Training Quantization in PyTorch using the Model Compression Toolkit (MCT)

## Overview
This quick-start guide explains how to use the **Model Compression Toolkit (MCT)** to quantize a YOLOX model. We will load a pre-trained model and quantize it with MCT using **Mixed-Precision Post-Training Quantization (PTQ)**.

## Summary
In this tutorial, we will cover:

1. Loading and preprocessing the COCO dataset.
2. Constructing an unlabeled representative dataset.
3. Post-Training Quantization using MCT.
4. Accuracy evaluation of the floating-point and the quantized models.

## YOLOX (Dependent External Repository)
This tutorial uses the repository linked below. Installation instructions are provided in the **Setup** section.   
[YOLOX](https://github.com/Megvii-BaseDetection/YOLOX)

### License (YOLOX)
   Copyright (c) 2021-2022 Megvii Inc. All rights reserved.

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.

## Setup  

First, install the required packages.  
This step may take several minutes...


In [None]:
!pip install torch==2.6.0 torchvision==0.21.0
!pip install onnx==1.16.1
!pip install numpy==1.26.4
!pip install opencv-python==4.9.0.80
!pip install pycocotools==2.0.10
!pip install onnx-simplifier~=0.4.10
!pip install loguru
!pip install tqdm
!pip install thop
!pip install ninja
!pip install tabulate
!pip install psutil
!pip install tensorboard

Clone the YOLOX GitHub repository linked above. It is added to the Python path when the modules are imported below.

In [None]:
import os

if not os.path.isdir('YOLOX'):
    !git clone https://github.com/Megvii-BaseDetection/YOLOX.git

Download a pre-trained YOLOX-Tiny model.

In [None]:
if not os.path.isfile('yolox_tiny.pth'):
    !wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_tiny.pth

In [None]:
import importlib.util  # import the submodule explicitly so importlib.util.find_spec is always available
if not importlib.util.find_spec('model_compression_toolkit'):
    !pip install model_compression_toolkit

In [None]:
import sys
import itertools
from typing import Dict, Tuple, Any
import numpy as np
import cv2
import torch
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

import model_compression_toolkit as mct
from edgemdt_cl.pytorch import FasterRCNNBoxDecode, MulticlassNMS
sys.path.append('./YOLOX')
from yolox.exp import get_exp

### Various Settings
Here, you can configure the parameters listed below.  

#### Parameter setting
- IMG_HEIGHT, IMG_WIDTH  
  Height and width of the model's input images.
- SCORE_THR  
  Class-score threshold used by Non-Maximum Suppression (NMS) and by the evaluation.
- IOU_THR  
  Intersection-over-Union (IoU) threshold used by NMS.
- NUM_WORKERS  
  Number of worker processes used for parallel data loading.
- CALIB_ITER  
  Number of samples drawn when generating the representative dataset for quantization.
- WEIGHTS_COMPRESSION_RATIO  
  Target weights memory for mixed-precision quantization, expressed as a fraction of the weights memory of the 8-bit model.
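
To make the roles of SCORE_THR and IOU_THR concrete, here is a minimal pure-Python sketch of greedy NMS for a single class. It is only an illustration of the idea, not the `MulticlassNMS` implementation from edgemdt_cl; the helper names (`iou`, `greedy_nms`) and the demo boxes are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (y_min, x_min, y_max, x_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               score_thr: float, iou_thr: float) -> List[int]:
    """Return the indices of the kept boxes, highest score first."""
    # Step 1: drop candidates whose score does not exceed the score threshold.
    order = sorted((i for i, s in enumerate(scores) if s > score_thr),
                   key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Step 2: keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in kept):
            kept.append(i)
    return kept


# Hypothetical candidates for one class.
demo_boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30), (0, 0, 5, 5)]
demo_scores = [0.9, 0.8, 0.7, 0.05]
demo_kept = greedy_nms(demo_boxes, demo_scores, score_thr=0.1, iou_thr=0.65)
print(demo_kept)  # [0, 2]
```

Box 1 overlaps box 0 with an IoU of 0.81, which is above the 0.65 threshold, so it is suppressed. Box 3 is removed earlier because its score is below the score threshold.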

In [None]:
# Parameter setting
IMG_HEIGHT = 416
IMG_WIDTH = 416
SCORE_THR = 0.1
IOU_THR = 0.65
NUM_WORKERS = 0
CALIB_ITER = 10
WEIGHTS_COMPRESSION_RATIO = 0.85

Load a pre-trained YOLOX-Tiny model.  

In [None]:
exp = get_exp('./YOLOX/exps/default/yolox_tiny.py', None)
float_model = exp.get_model()
float_model.head.decode_in_inference = False
weights = torch.load('./yolox_tiny.pth', map_location=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
float_model.load_state_dict(weights["model"])

Next, we add the CustomLayer (edgemdt_cl) as post-processing.  

* FasterRCNNBoxDecode: Decodes YOLOX inference results from Anchor format to BoundingBox format.    
* MulticlassNMS: Executes the Non-Maximum Suppression to remove overlapping boxes.

Note: YOLOX returns boxes as (xc, yc, w, h), but FasterRCNNBoxDecode expects (yc, xc, h, w), so the boxes are reordered to (yc, xc, h, w) before decoding.
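
The reordering described in the note is a plain permutation of the four box coordinates. The sketch below shows it on nested Python lists; the function name `reorder_xywh_to_yxhw` and the demo values are hypothetical, and the wrapper class in this tutorial performs the same permutation on tensors.

```python
from typing import List, Sequence


def reorder_xywh_to_yxhw(boxes: Sequence[Sequence[float]]) -> List[List[float]]:
    """Reorder each box from (xc, yc, w, h) to (yc, xc, h, w)."""
    return [[yc, xc, h, w] for xc, yc, w, h in boxes]


# Hypothetical center-format boxes in (xc, yc, w, h) order.
demo_xywh = [[100.0, 50.0, 40.0, 20.0], [8.0, 16.0, 4.0, 2.0]]
demo_yxhw = reorder_xywh_to_yxhw(demo_xywh)
print(demo_yxhw)  # [[50.0, 100.0, 20.0, 40.0], [16.0, 8.0, 2.0, 4.0]]
```

Because the permutation only swaps x with y and w with h, applying it twice returns the original order.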

In [None]:
class YOLOXPostProcess(torch.nn.Module):
    """
    Wrapping YoloX with post process functionality: FasterRCNNBoxDecode and MulticlassNMS from edgemdt_cl.
    """

    def __init__(self, model: torch.nn.Module, img_size: Tuple = (416, 416),
                 score_threshold: float = 0.1, iou_threshold: float = 0.65,
                 max_detections: int = 200):
        """
        Args:
            model (torch.nn.Module): Model instance.
            img_size (tuple): Image size input of the model.
            score_threshold (float): Score threshold for non-maximum suppression.
            iou_threshold (float): Intersection over union threshold for non-maximum suppression.
            max_detections (int): The number of detections to return.
        """
        super(YOLOXPostProcess, self).__init__()
        self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
        self.img_size = img_size
        self.strides = [8, 16, 32]  # Strides used to build the anchors.
        self.model = model
        self.box_decoder = FasterRCNNBoxDecode(anchors=self.create_anchors(),
                                               scale_factors=[1,1,1,1],
                                               clip_window=[0,0,*img_size])
        self.nms = MulticlassNMS(score_threshold, iou_threshold, max_detections)

    def create_anchors(self) -> torch.tensor:
        """
        Create anchors for box decoding operation.

        Returns: 
            anchors (torch.tensor): Tensor of anchors.
        """
        fmap_grids = []
        fmap_strides = []
        hsizes = [self.img_size[0] // stride for stride in self.strides]
        wsizes = [self.img_size[1] // stride for stride in self.strides]
        for hsize, wsize, stride in zip(hsizes, wsizes, self.strides):
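            # indexing="ij" keeps the (row, column) behavior and avoids the deprecation warning.
            yv, xv = torch.meshgrid([torch.arange(hsize), torch.arange(wsize)], indexing="ij")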
            grid = torch.stack((xv, yv), 2).view(1, -1, 2)
            fmap_grids.append(grid)
            shape = grid.shape[:2]
            fmap_strides.append(torch.full((*shape, 1), stride))

        s = torch.cat(fmap_strides, dim=1).to(self.device)
        offsets = s * torch.cat(fmap_grids, dim=1).to(self.device)
        xc, yc = offsets[..., 0:1], offsets[..., 1:2]
        anchors = torch.concat([(2 * yc - s) / 2, (2 * xc - s) / 2,
                                (2 * yc + s) / 2, (2 * xc + s) / 2], dim=-1)
        anchors = anchors.squeeze(0)
        return anchors

    def forward(self, images: torch.tensor) -> Tuple:
        """
        Forward processing.

        Args:
            images (torch.tensor): Input images.

        Returns:
            nms_out.boxes: Bounding boxes after NMS processing.
            nms_out.scores: Scores after NMS processing.
            nms_out.labels: Labels after NMS processing.
        """        
        outputs = self.model(images)
        boxes = outputs[..., :4]
        # Convert from (xc, yc, w, h) to (yc, xc, h, w)
        xc, yc, w, h = boxes[..., 0], boxes[..., 1], boxes[..., 2], boxes[..., 3]
        boxes = torch.stack([yc, xc, h, w], dim=-1)
        # Box decoder
        boxes = self.box_decoder(boxes)
        scores = outputs[..., 5:] * outputs[..., 4:5]  # classes * scores
        # NMS
        nms_out = self.nms(boxes, scores)
        return nms_out.boxes, nms_out.scores, nms_out.labels

Wrap the YOLOX model with the post-processing module.

In [None]:
full_float_model = YOLOXPostProcess(model = torch.nn.Sequential(float_model.backbone, float_model.head),
                                    img_size = (IMG_HEIGHT, IMG_WIDTH),
                                    score_threshold = SCORE_THR, iou_threshold = IOU_THR)
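
As a sanity check on the anchors built in `create_anchors`, the following pure-Python sketch reproduces the same grid layout without PyTorch: one square anchor of side `stride` per grid cell, stored as (y_min, x_min, y_max, x_max). The function name `grid_anchor_boxes` is hypothetical and the sketch is meant only to show the expected count and values.

```python
from typing import List, Tuple


def grid_anchor_boxes(img_size: Tuple[int, int],
                      strides: List[int]) -> List[Tuple[float, float, float, float]]:
    """Build one (y_min, x_min, y_max, x_max) anchor per grid cell and stride."""
    anchors = []
    for stride in strides:
        hsize, wsize = img_size[0] // stride, img_size[1] // stride
        for gy in range(hsize):          # rows first, matching the flattened meshgrid order
            for gx in range(wsize):
                yc, xc = gy * stride, gx * stride
                half = stride / 2
                anchors.append((yc - half, xc - half, yc + half, xc + half))
    return anchors


demo_anchors = grid_anchor_boxes((416, 416), [8, 16, 32])
print(len(demo_anchors))   # 3549 = 52*52 + 26*26 + 13*13
print(demo_anchors[0])     # (-4.0, -4.0, 4.0, 4.0)
print(demo_anchors[-1])    # (368.0, 368.0, 400.0, 400.0)
```

For a 416x416 input, the three strides give 52x52, 26x26 and 13x13 grids, so the model produces 3549 candidate boxes per image before NMS.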

## Dataset preparation
### Download the COCO dataset

**Note**  
In this tutorial, we will use a subset of COCO train2017 for calibration during quantization and COCO val2017 for evaluation.

This step may take several minutes...

In [None]:
if not os.path.isdir('COCO_dataset'):
    !mkdir COCO_dataset
    !wget -P COCO_dataset http://images.cocodataset.org/annotations/annotations_trainval2017.zip
    !wget -P COCO_dataset http://images.cocodataset.org/zips/train2017.zip
    !wget -P COCO_dataset http://images.cocodataset.org/zips/val2017.zip
    !unzip COCO_dataset/annotations_trainval2017.zip -d COCO_dataset
    !unzip COCO_dataset/train2017.zip -d COCO_dataset
    !unzip COCO_dataset/val2017.zip -d COCO_dataset

Here, we set the paths to the annotation files and image folders of the downloaded dataset.

In [None]:
COCO_TRAIN_IMG_DIR = "COCO_dataset/train2017/"
COCO_VAL_IMG_DIR = "COCO_dataset/val2017/"
COCO_TRAIN_ANN_JSON = "COCO_dataset/annotations/instances_train2017.json"
COCO_VAL_ANN_JSON = "COCO_dataset/annotations/instances_val2017.json"

The following class prepares the downloaded COCO dataset for calibration during quantization and for evaluation.

In [None]:
class CocoDataset(Dataset):
    """
    Define the COCO dataset.
    """

    def __init__(self, img_dir: str, ann_json: str, img_size: Tuple = (416,416)):
        """
        Args:
            img_dir (str): Data folder path.
            ann_json (str): Annotation file name.
            img_size (tuple): Image size input of the model.
        """
        self.img_dir = img_dir
        self.coco = COCO(ann_json)
        self.img_ids = self.coco.getImgIds()
        self.img_size = img_size
        self.pad_values = 114   # Value used for padding in preprocessing.

    def __len__(self) -> int: 
        """
        Returns:
            len(self.img_ids) (int): Number of images.
        """
        return len(self.img_ids)

    def __getitem__(self, idx: int) -> Dict[str, Any]:
        """
        Args:
            idx (int): Index number.

        Returns:
            sample (dict): Store image information.
        """
        img_id = self.img_ids[idx]
        img_info = self.coco.loadImgs([img_id])[0]
        img_path = os.path.join(self.img_dir, img_info['file_name'])

        # Load and preprocess the image
        org_img = cv2.imread(img_path)
        input_img, ratio = self.preprocess(input_img=org_img)   
        input_tensor = torch.from_numpy(input_img).unsqueeze(0)

        sample = {
            'input': input_tensor,
            'id': img_id,
            'file_name': img_info['file_name'],
            'ratio': ratio
        }
        return sample
    
    def preprocess(self, input_img: np.ndarray) -> Tuple:
        """
        Preprocess an input image for the YOLOX model: resize with padding, then transpose to CHW (for the PyTorch implementation).

        Args:
            input_img (np.ndarray): Input image as a NumPy array.

        Returns:
            padded_img (np.ndarray): Preprocessed image as a NumPy array.
            ratio (float): Ratio when resizing image.
        """
        padded_img = np.ones((self.img_size[0], self.img_size[1], 3), dtype=np.uint8) * self.pad_values
        ratio = min(self.img_size[0] / input_img.shape[0], self.img_size[1] / input_img.shape[1])
        resized_img = cv2.resize(input_img, (int(input_img.shape[1] * ratio), int(input_img.shape[0] * ratio)),
                                 interpolation=cv2.INTER_LINEAR).astype(np.uint8)
        padded_img[: int(input_img.shape[0] * ratio), : int(input_img.shape[1] * ratio)] = resized_img

        padded_img = padded_img.transpose((2, 0, 1))
        padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)
        return padded_img, ratio
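
The geometry of the `preprocess` method can be checked by hand. The sketch below repeats only the arithmetic (no image data) for a hypothetical 480x640 image; `letterbox_geometry` is an illustrative helper name, not part of the tutorial code.

```python
from typing import Tuple


def letterbox_geometry(img_hw: Tuple[int, int],
                       target_hw: Tuple[int, int]) -> Tuple[float, Tuple[int, int], Tuple[int, int]]:
    """Return the resize ratio, the resized (h, w) and the padded (rows, cols)."""
    # A single ratio for both axes preserves the aspect ratio.
    ratio = min(target_hw[0] / img_hw[0], target_hw[1] / img_hw[1])
    resized_hw = (int(img_hw[0] * ratio), int(img_hw[1] * ratio))
    # The resized image is placed at the top-left; the rest stays at the pad value.
    pad_hw = (target_hw[0] - resized_hw[0], target_hw[1] - resized_hw[1])
    return ratio, resized_hw, pad_hw


demo_scale, demo_resized, demo_pad = letterbox_geometry((480, 640), (416, 416))
print(demo_scale, demo_resized, demo_pad)

# A coordinate predicted in model space maps back to the original image by dividing by the ratio.
x_model = 208.0
x_original = x_model / demo_scale
print(round(x_original, 3))
```

Here the width is the limiting side, so the ratio is about 0.65, the image is resized to 312x416, and the bottom 104 rows keep the padding value. This is the same ratio that the evaluation step later uses to rescale boxes.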

In [None]:
val_dataset = CocoDataset(
    img_dir = COCO_VAL_IMG_DIR, ann_json = COCO_VAL_ANN_JSON,
    img_size = (IMG_HEIGHT, IMG_WIDTH)
)
calib_dataset = CocoDataset(
    img_dir = COCO_TRAIN_IMG_DIR, ann_json=COCO_TRAIN_ANN_JSON,
    img_size = (IMG_HEIGHT, IMG_WIDTH)
)

# For evaluation (batch size 1)
val_dataloader = DataLoader(
    val_dataset, batch_size=1, shuffle=False,
    num_workers=NUM_WORKERS, collate_fn=lambda x: x[0]
)
# For calibration (no labels required)
calib_loader = DataLoader(
    calib_dataset, batch_size=1, shuffle=True,
    num_workers=NUM_WORKERS, collate_fn=lambda x: x[0]
)

print(len(calib_dataset))
print(len(val_dataset))

## Representative Dataset
For quantization with MCT, we need to define a representative dataset required by the PTQ algorithm. This dataset is a generator that returns a list of images:

In [None]:
def representative_dataset_gen():
    for sample in itertools.islice(itertools.cycle(calib_loader), CALIB_ITER):
        yield [sample['input']]
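
The `itertools.islice(itertools.cycle(...), n)` pattern used above yields exactly `n` items and wraps around if the source is shorter than `n`. The sketch below shows this with a plain list standing in for the DataLoader; `representative_batches` and the demo data are hypothetical names for illustration.

```python
import itertools
from typing import Iterable, Iterator, List


def representative_batches(loader: Iterable, n_iter: int) -> Iterator[List]:
    """Yield exactly n_iter single-element lists, wrapping around if needed."""
    for sample in itertools.islice(itertools.cycle(loader), n_iter):
        yield [sample]  # a list, as in the generator above


demo_loader = ["img0", "img1", "img2"]  # stand-in for a DataLoader
demo_out = list(representative_batches(demo_loader, 5))
print(demo_out)  # [['img0'], ['img1'], ['img2'], ['img0'], ['img1']]
```

Each call to the generator function starts a new pass over the loader. Note that `itertools.cycle` replays its cached items after the first pass, so a shuffling loader is only reshuffled when a new call begins.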

## Target Platform Capabilities (TPC)
In addition, MCT optimizes the model for dedicated hardware platforms. This is done using TPC (for more details, please visit our [documentation](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html)). Here, we use the default PyTorch TPC:

In [None]:
tpc = mct.get_target_platform_capabilities('pytorch', 'default')

## Mixed Precision Configurations
We will create a `MixedPrecisionQuantizationConfig` that defines the search options for mixed-precision:


In [None]:
configuration = mct.core.CoreConfig(
    mixed_precision_config=mct.core.MixedPrecisionQuantizationConfig(num_of_images=CALIB_ITER))

In [None]:
# Get Resource Utilization information to constrain your model's memory size.
resource_utilization_data = mct.core.pytorch_resource_utilization_data(
    full_float_model,
    representative_dataset_gen,
    configuration,
    target_platform_capabilities=tpc)
 
# Define target Resource Utilization for mixed precision weights quantization.
resource_utilization = mct.core.ResourceUtilization(resource_utilization_data.weights_memory * WEIGHTS_COMPRESSION_RATIO)
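
To see what the weights memory target means in practice, here is a small arithmetic sketch. It assumes that weights memory is measured in bytes and that the baseline is an all-8-bit model; the three-layer parameter counts and the helper `weights_bytes` are made up for illustration and do not come from MCT.

```python
from typing import Sequence


def weights_bytes(num_params: Sequence[int], bit_widths: Sequence[int]) -> float:
    """Total weights memory in bytes for a per-layer bit-width assignment."""
    return sum(n * b / 8 for n, b in zip(num_params, bit_widths))


# Hypothetical three-layer model (parameter counts are made up).
demo_params = [1000, 3000, 6000]
demo_ratio = 0.85  # plays the role of WEIGHTS_COMPRESSION_RATIO

demo_baseline = weights_bytes(demo_params, [8, 8, 8])  # every layer at 8 bits
demo_target = demo_baseline * demo_ratio               # budget for the mixed-precision search

for bits in ([8, 8, 8], [8, 8, 4], [4, 8, 8]):
    used = weights_bytes(demo_params, bits)
    print(bits, used, "fits" if used <= demo_target else "over budget")
```

With a ratio of 0.85, the all-8-bit assignment no longer fits, so at least some layers must drop to a lower bit-width. Under the stated assumptions, this corresponds to a parameter-weighted average of at most 6.8 bits per weight.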

## Post-Training Quantization using MCT
Now for the exciting part! Let's run PTQ on the model.

In [None]:
quantized_model, quantization_info = mct.ptq.pytorch_post_training_quantization(
                                        in_module=full_float_model,
                                        representative_data_gen=representative_dataset_gen,
                                        target_platform_capabilities=tpc,
                                        core_config=configuration,
                                        target_resource_utilization=resource_utilization)

## Model Evaluation
Now, we will create a function for evaluating a model.  
The inference results before and after quantization are displayed on the terminal.

In [None]:
@torch.no_grad()
def evaluate(model: torch.nn.Module, val_dataloader: DataLoader,
             score_threshold: float = 0.1):
    """
    Evaluation of the COCO dataset.

    Args:
        model (torch.nn.Module): Evaluation model.
        val_dataloader (DataLoader): Evaluation dataset.
        score_threshold (float): Score threshold.
    """
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()

    # Use the dataset attached to the dataloader argument instead of a global variable.
    class_ids = sorted(val_dataloader.dataset.coco.getCatIds())
    results = []
    for sample in tqdm(val_dataloader, desc="Evaluating"):
        input_img = sample['input'].to(device)
        img_id = sample['id']
        ratio = sample['ratio']
        boxes, scores, labels = model(input_img)

        # boxes: [N, 4] (ymin, xmin, ymax, xmax), scores: [N], labels: [N]
        for box, score, label in zip(boxes[0], scores[0], labels[0]):
            if score > score_threshold:
                box /= ratio
                # FasterRCNNBoxDecode returns boxes in (y, x) order.
                y_min, x_min, y_max, x_max = box.tolist()
                width = x_max - x_min
                height = y_max - y_min
                result = {
                    'image_id': img_id,
                    'category_id': class_ids[int(label)],
                    'bbox': [x_min, y_min, width, height],
                    'score': float(score),
                }
                results.append(result)

    # evaluation
    coco_gt = val_dataloader.dataset.coco

    coco_dt = coco_gt.loadRes(results)
    evaluator = COCOeval(coco_gt, coco_dt, iouType='bbox')
    evaluator.evaluate()
    evaluator.accumulate()
    evaluator.summarize()
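
The per-detection conversion inside `evaluate` can be isolated as a small function. The sketch below rescales a (y_min, x_min, y_max, x_max) box from model space to the original image and emits the COCO result layout `[x, y, width, height]`. The helper name `to_coco_result`, the demo values and the short ID list are hypothetical; COCO category IDs are not contiguous, which is why the label index is mapped through a sorted ID list.

```python
from typing import Dict, List, Sequence


def to_coco_result(box_yxyx: Sequence[float], score: float, label: int,
                   image_id: int, ratio: float, class_ids: List[int]) -> Dict:
    """Convert one detection into a COCO-style result dictionary."""
    # Undo the preprocessing resize to get original-image coordinates.
    y_min, x_min, y_max, x_max = (v / ratio for v in box_yxyx)
    return {
        'image_id': image_id,
        'category_id': class_ids[label],  # label index -> dataset category ID
        'bbox': [x_min, y_min, x_max - x_min, y_max - y_min],
        'score': float(score),
    }


demo_class_ids = [1, 2, 3, 5]  # hypothetical non-contiguous category IDs
demo_result = to_coco_result((52.0, 104.0, 156.0, 312.0), score=0.9, label=3,
                             image_id=42, ratio=0.5, class_ids=demo_class_ids)
print(demo_result)
```

With a ratio of 0.5, every coordinate doubles, so the box becomes x=208, y=104, width=416, height=208, and label index 3 maps to category ID 5.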

Let's start with the floating-point model evaluation.  
This step may take several minutes...

In [None]:
print("evaluating float model (COCO mAP)...")
evaluate(full_float_model, val_dataloader,
         score_threshold = SCORE_THR)

Finally, let's evaluate the quantized model:  
This step may take several minutes...

In [None]:
print("evaluating quantized model (COCO mAP)...")
evaluate(quantized_model, val_dataloader,
         score_threshold = SCORE_THR)

## Copyrights

Copyright 2025 Sony Semiconductor Solutions, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
