# Post-Training Quantization + Conversion to IMX500 of a MobileNetV2 Keras Model

## Overview

This tutorial demonstrates how to apply Post-Training Quantization (PTQ) to a pretrained Keras model using the [**Model Compression Toolkit (MCT)**](https://github.com/sony/model_optimization), and how to convert the resulting model to a binary format that can be loaded onto the IMX500 using the [**IMX500-converter**](https://github.com/ssi-dnn/imx500-converter).

This example is not intended to evaluate MCT's PTQ performance, so it intentionally uses randomly generated data to speed up the process.

For the tutorial on MCT's PTQ, see the [*MCT PTQ Keras Tutorial*](https://github.com/sony/model_optimization/blob/main/tutorials/notebooks/imx500_notebooks/keras/example_keras_mobilenetv2_for_imx500.ipynb).

For tutorials on other quantization features of MCT, see the [*MCT Features Tutorials*](https://github.com/sony/model_optimization/blob/main/tutorials/notebooks/mct_features_notebooks/README.md).

## Summary

In this tutorial we cover the following steps:

1. Post-Training Quantization using MCT.
2. Converting the model to an IMX500-compatible representation using the IMX500-Converter.

## Setup

Install and import the relevant packages:


In [None]:
from importlib import util
TF_VER = '2.15.1'
EDGE_MDT_VER =  '0.0.0.dev0' #todo: can remove when we are published
try:
    import tensorflow as tf
except ImportError:
    print(f"Installing TensorFlow {TF_VER}")
    !pip install tensorflow=={TF_VER}

if not util.find_spec('edge_mdt') or not util.find_spec("uni.tensorflow"):
    print(f"Installing edge-mdt {EDGE_MDT_VER}")
    !pip install edge-mdt[tf]=={EDGE_MDT_VER}


In [None]:
import keras
import os

## Model Post-Training quantization using MCT

### Representative dataset construction
We're all set to use MCT's post-training quantization. To begin, we define a representative dataset generator. For demonstration purposes, it yields random data of the expected image shape instead of real images.
We then apply PTQ to our model using this generator. For more details on using MCT, refer to the MCT tutorials.

In [None]:
import numpy as np

# Define batch size and iterations
batch_size = 4
n_iter = 2

# Define representative dataset generator
def representative_dataset_gen():
    for _ in range(n_iter):
        yield [np.random.rand(batch_size, 224, 224, 3).astype(np.float32)]  # Yield random batch
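
Random batches differ on every run, which can make results harder to compare between runs. The following is a minimal sketch of a seeded variant of the generator above, not part of the original tutorial. The helper name `make_representative_dataset` and its parameters are hypothetical, introduced here only for illustration.

```python
import numpy as np

def make_representative_dataset(n_iter=2, batch_size=4, image_shape=(224, 224, 3), seed=0):
    """Return a generator function that yields `n_iter` random float32 batches."""
    def representative_dataset_gen():
        # The RNG is re-created on every call, so each pass over the data is identical.
        rng = np.random.default_rng(seed)
        for _ in range(n_iter):
            # Each item is a list with one array per model input (a single input here).
            yield [rng.random((batch_size, *image_shape), dtype=np.float32)]
    return representative_dataset_gen

# Quick sanity check of what the generator produces
gen = make_representative_dataset()
batches = list(gen())
print(len(batches), batches[0][0].shape, batches[0][0].dtype)
```

The returned function takes no arguments, so it can be used wherever `representative_dataset_gen` is used in this tutorial.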

### Quantizing the model

Now we are ready to quantize our model.

First, we load a pre-trained MobileNetV2 model from Keras, in 32-bit floating-point precision.

In [None]:
from keras.applications.mobilenet_v2 import MobileNetV2
float_model = MobileNetV2()

Next, we need to define a `TargetPlatformCapability` object, representing the HW specifications on which we wish to eventually deploy our quantized model.

In addition, the PTQ routine can be controlled through a Quantization Configuration, which exposes several key arguments to the user.

**Note** that the cell below passes no explicit configuration and therefore uses MCT's default quantization settings.

In [None]:
import model_compression_toolkit as mct
from edgemdt_tpc import get_target_platform_capabilities
# Target platform capabilities
tpc = get_target_platform_capabilities(tpc_version='1.0', device_type='imx500')

# Perform Post-Training Quantization (PTQ)
quantized_model, quantization_info = mct.ptq.keras_post_training_quantization(
    in_model=float_model,
    representative_data_gen=representative_dataset_gen,
    target_platform_capabilities=tpc
)
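
For readers who prefer to set the quantization configuration explicitly rather than rely on the defaults, the following is a hedged sketch. It assumes the `mct.core.QuantizationConfig`, `mct.core.QuantizationErrorMethod`, and `mct.core.CoreConfig` classes as they appear in the MCT 2.x API; the specific values are illustrative choices, not recommendations from this tutorial. The helper `build_core_config` is a hypothetical name, and the import is guarded so the snippet returns `None` where MCT is not installed.

```python
import importlib.util

def build_core_config():
    """Build a CoreConfig with explicit quantization settings, or None if MCT is missing."""
    if importlib.util.find_spec("model_compression_toolkit") is None:
        return None
    import model_compression_toolkit as mct

    # Assumed MCT 2.x API: error methods for threshold selection plus a few correction switches.
    q_config = mct.core.QuantizationConfig(
        activation_error_method=mct.core.QuantizationErrorMethod.MSE,
        weights_error_method=mct.core.QuantizationErrorMethod.MSE,
        weights_bias_correction=True,
        shift_negative_activation_correction=True,
        z_threshold=16,
    )
    return mct.core.CoreConfig(quantization_config=q_config)

core_config = build_core_config()
```

If the assumption about the API holds for your installed version, the resulting object can be passed as `core_config=core_config` to `mct.ptq.keras_post_training_quantization`, next to the `in_model`, `representative_data_gen`, and `target_platform_capabilities` arguments.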

That's it! Our model is now quantized.

## Model Conversion

### Exporting to Keras serialization
In order to convert our model to a binary that can be loaded onto the IMX500, we first need to serialize it to the Keras format. Please ensure that `save_model_path` has been set correctly.

In [None]:
quantized_model.compile(loss=keras.losses.SparseCategoricalCrossentropy(), metrics=["accuracy"])
save_folder="./mobilenet_tf"
os.makedirs(save_folder, exist_ok=True)
keras_path = os.path.join(save_folder, 'qmodel.keras')
mct.exporter.keras_export_model(model=quantized_model, save_model_path=keras_path)

Before we can run the IMX500 converter, we need to make sure Java 17 or later is installed. On Colab, the following distribution can be used:

In [None]:
!sudo apt install -y openjdk-17-jdk openjdk-17-jre
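
To confirm that the Java requirement is met before running the converter, the installed version can be checked from Python. This is a minimal sketch using only the standard library; the helper names `parse_java_major` and `installed_java_major` are hypothetical, and the parsing assumes the usual `version "..."` banner printed by `java -version`.

```python
import re
import shutil
import subprocess

def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None if not found."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Releases up to Java 8 report themselves as "1.x" (for example "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major

def installed_java_major():
    """Return the major version of the `java` found on PATH, or None if it is absent."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # The banner is commonly written to stderr, so both streams are inspected.
    return parse_java_major(result.stderr + result.stdout)

print(parse_java_major('openjdk version "17.0.9" 2023-10-17'))  # 17
print(parse_java_major('java version "1.8.0_292"'))             # 8
```

Calling `installed_java_major()` and comparing the result against 17 gives a quick check before the conversion step.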

### Running the IMX500 Converter
Now we can convert the model to create the PackerOut, which can be loaded onto the IMX500.

In [None]:
import subprocess
import sys

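# Basic command line
cmd = ["imxconv-tf", "-i", keras_path, "-o", save_folder, "--overwrite-output"]

# Put the current Python environment's bin directory first on PATH so the converter is found
env_bin_path = os.path.dirname(sys.executable)
os.environ["PATH"] = f"{env_bin_path}{os.pathsep}{os.environ['PATH']}"
env = os.environ.copy()

# check=True raises CalledProcessError if the converter exits with a non-zero status
subprocess.run(cmd, env=env, check=True)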
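
After the converter finishes, it can be useful to see what was written to the output folder. The sketch below uses only the standard library and makes no assumption about the names of the files the converter produces; the helpers `converter_available` and `list_outputs` are hypothetical names introduced for illustration.

```python
import os
import shutil

def converter_available(tool="imxconv-tf"):
    """Return True if the converter executable can be found on PATH."""
    return shutil.which(tool) is not None

def list_outputs(folder):
    """Return (relative_path, size_in_bytes) for every file under `folder`, sorted by path."""
    found = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            found.append((os.path.relpath(path, folder), os.path.getsize(path)))
    return sorted(found)

# Report the contents of the tutorial's output folder, if it exists
output_folder = "./mobilenet_tf"
if os.path.isdir(output_folder):
    for rel_path, size in list_outputs(output_folder):
        print(f"{rel_path}: {size} bytes")
```

Checking `converter_available()` before building the command line gives a clearer message than a failed `subprocess.run` when the converter is not installed.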

## Conclusion

In this tutorial, we demonstrated how to quantize a pre-trained model using MCT and then convert it to a binary suitable for IMX500 execution, all with a few lines of code. For full documentation of the IMX500 converter, see [here](https://github.com/ssi-dnn/imx500-converter).





Copyright 2025 Sony Semiconductor Israel, Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
