# 03. Quantization and Export

This notebook shows how to export a YOLOv5 model to ONNX and then reduce its precision, first by converting it to FP16 and then by applying INT8 post-training quantization.

## Objectives
1. Export to ONNX.
2. Apply FP16 Quantization.
3. Apply INT8 Post-Training Quantization (PTQ).

**Note**: Run this notebook from inside the `yolov5` directory, or make sure that directory is on your Python path.

In [None]:
# Setup
import os
if os.path.basename(os.getcwd()) != 'yolov5':
    %cd yolov5
    
import onnx
from onnxruntime.quantization import quantize_dynamic, quantize_static, CalibrationDataReader, QuantType
import cv2
import numpy as np
import glob

## 1. Export to ONNX (Base)
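Export either your own trained weights (e.g., `runs/train/exp/weights/best.pt`) or pretrained ones. The cell below uses the pretrained `yolov5n.pt`; the exported `yolov5n.onnx` is written next to the weights file.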

In [None]:
# Export to ONNX
!python export.py --weights yolov5n.pt --include onnx --opset 12

## 2. FP16 Quantization
YOLOv5's `export.py` can write an FP16 model directly with `--half`, which requires exporting on a GPU device. Some inference runtimes can also lower an FP32 model to FP16 themselves when they load it. To convert an existing FP32 ONNX file to FP16 explicitly:

In [None]:
from onnxconverter_common import float16

model_fp32 = 'yolov5n.onnx'
model_fp16 = 'yolov5n_fp16.onnx'

model = onnx.load(model_fp32)
model_fp16_onnx = float16.convert_float_to_float16(model)
onnx.save(model_fp16_onnx, model_fp16)
print(f"Exported {model_fp16}")
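As a rough illustration of what the conversion does to each tensor, the NumPy sketch below casts a float32 array to float16 and compares storage size and rounding error. The array is random stand-in data, not actual YOLOv5 weights, and the variable names are chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a weight tensor (assumption: real weights live in the ONNX initializers)
weights_fp32 = rng.standard_normal(10_000).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

# float16 stores 2 bytes per value instead of 4
size_ratio = weights_fp16.nbytes / weights_fp32.nbytes

# Rounding error introduced by the shorter mantissa (10 bits vs. 23 bits)
max_abs_error = float(np.abs(weights_fp16.astype(np.float32) - weights_fp32).max())

# Values beyond the float16 range overflow to infinity in a plain cast
fp16_max = float(np.finfo(np.float16).max)  # 65504.0
with np.errstate(over="ignore"):
    overflowed = np.float32(1e5).astype(np.float16)

print(f"size ratio: {size_ratio:.2f}, max abs error: {max_abs_error:.2e}")
print(f"float16 max: {fp16_max}, 1e5 cast to float16: {overflowed}")
```

The halved storage is why the FP16 file should come out at roughly half the FP32 size. The overflow case shows why it is worth validating detections after conversion rather than assuming they are unchanged.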

## 3. INT8 Quantization (Post-Training)
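Static INT8 quantization needs a calibration dataset that is representative of real inputs. During calibration these samples are run through the model and the observed activation ranges are recorded; the scale and zero point of each quantized tensor are derived from those ranges.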
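To make the role of calibration concrete, here is a minimal NumPy sketch of how an observed value range can be turned into an INT8 scale and zero point. It illustrates the general asymmetric affine scheme, not ONNX Runtime's internal implementation; the function names and sample values are hypothetical.

```python
import numpy as np

def int8_params(x_min, x_max, qmin=-128, qmax=127):
    """Return (scale, zero_point) mapping [x_min, x_max] onto [qmin, qmax]."""
    # Widen the range to include 0 so that zero is exactly representable
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate case: every observed value was zero
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map float values to int8 using the affine parameters."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)

def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate float values."""
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical activations "observed" during calibration
acts = np.array([-1.0, 0.0, 0.5, 3.0], dtype=np.float32)
scale, zp = int8_params(float(acts.min()), float(acts.max()))
restored = dequantize(quantize(acts, scale, zp), scale, zp)
max_error = float(np.abs(restored - acts).max())
print(f"scale={scale:.6f}, zero_point={zp}, max round-trip error={max_error:.6f}")
```

A wider observed range gives a larger scale and therefore coarser steps, which is why calibration images that do not resemble deployment data can hurt accuracy.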

### Calibration Data Reader

In [None]:
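class YOLODataReader(CalibrationDataReader):
    """Feeds preprocessed images to the calibrator, one sample per call."""

    def __init__(self, image_folder, input_name='images', size=(640, 640), max_samples=50):
        self.image_folder = image_folder
        # Sort so the calibration subset is reproducible (glob order is arbitrary)
        self.image_paths = sorted(glob.glob(os.path.join(image_folder, '*.jpg')))[:max_samples]
        self.input_name = input_name  # must match the input name of the ONNX graph
        self.size = size  # (width, height), as expected by cv2.resize
        self.enum_data = iter(self.image_paths)

    def get_next(self):
        # Return one input dict per call, or None once the samples are exhausted
        for image_path in self.enum_data:
            img = cv2.imread(image_path)
            if img is None:  # unreadable or corrupt file: skip it
                continue
            # Plain resize; YOLOv5 inference normally letterboxes instead
            img = cv2.resize(img, self.size)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = img.astype('float32') / 255.0  # scale pixels to [0, 1]
            img = img.transpose(2, 0, 1)  # HWC to CHW
            img = np.expand_dims(img, axis=0)  # add batch dimension
            return {self.input_name: img}
        return None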

# quant_format expects a QuantFormat member (QuantType has no QDQ attribute)
from onnxruntime.quantization import QuantFormat

# Define dataset path (ensure you have images here from Notebook 01)
train_images = '../data/datasets/4weed/images/train'
if not os.path.exists(train_images):
    print("Warning: Calibration dataset not found. Please run Notebook 01 first.")
else:
    dr = YOLODataReader(train_images)

    # Static quantization in QDQ format with INT8 weights
    quantize_static(
        model_input='yolov5n.onnx',
        model_output='yolov5n_int8.onnx',
        calibration_data_reader=dr,
        quant_format=QuantFormat.QDQ,
        weight_type=QuantType.QInt8
    )
    print("Exported yolov5n_int8.onnx")

## 4. Compare Size
Check the file size reduction.

In [None]:
def get_size(file_path):
    size = os.path.getsize(file_path)
    return size / (1024 * 1024)

print(f"FP32: {get_size('yolov5n.onnx'):.2f} MB")
print(f"FP16: {get_size('yolov5n_fp16.onnx'):.2f} MB")
if os.path.exists('yolov5n_int8.onnx'):
    print(f"INT8: {get_size('yolov5n_int8.onnx'):.2f} MB")
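To put the sizes in perspective, a small helper can express each file relative to the FP32 baseline. This is only a sketch: `size_report` is a hypothetical helper, and files that do not exist are skipped. As a rule of thumb the FP16 file should land near half the FP32 size, while the INT8 result depends on how many operators were quantized and on the extra quantize/dequantize nodes.

```python
import os

def size_report(paths, baseline_key="FP32"):
    """Return {label: (size_mb, ratio_to_baseline)} for the files that exist."""
    sizes = {
        label: os.path.getsize(path) / (1024 * 1024)
        for label, path in paths.items()
        if os.path.exists(path)
    }
    baseline = sizes.get(baseline_key)
    # The ratio is None when the baseline file is missing or empty
    return {
        label: (mb, mb / baseline if baseline else None)
        for label, mb in sizes.items()
    }

models = {
    "FP32": "yolov5n.onnx",
    "FP16": "yolov5n_fp16.onnx",
    "INT8": "yolov5n_int8.onnx",
}
for label, (mb, ratio) in size_report(models).items():
    note = f" ({ratio:.0%} of FP32)" if ratio is not None else ""
    print(f"{label}: {mb:.2f} MB{note}")
```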