# Using a pretrained image model from Nvidia TAO model zoo on STM32 platforms with STEdgeAI Developer Cloud

This notebook shows how to take a pretrained image-classification model from the NGC model zoo, optimize and adapt it, and then deploy and benchmark it on STM32 targets.

The scripts below let you download and use one of the pretrained models from the [pretrained classification model repository of NVIDIA TAO](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_classification/version). The models are released under a [Creative Commons license](https://creativecommons.org/licenses/by/4.0/), as stated on the [explainability](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_classification/explainability) page for these models. This notebook demonstrates how to download the mobilenet_v2 model from the repository, convert it to ONNX, quantize it, and finally benchmark it using [STEdgeAI Developer Cloud](https://stm32ai.st.com/stm32-cube-ai-dc/). The pretrained image classification model comes from [NVIDIA TAO Toolkit](https://developer.nvidia.com/tao-toolkit); optimization and benchmarking are done with STEdgeAI Developer Cloud.

## License

This software component is licensed by ST under BSD-3-Clause license,
the "License"; 

You may not use this file except in compliance with the
License. 

You may obtain a copy of the License at: https://opensource.org/licenses/BSD-3-Clause

Copyright (c) 2024 STMicroelectronics. All rights reserved.

Copyright (c) 2024 Nvidia. All rights reserved.

#### (Optional) A. Set proxy variables if working behind corporate proxies.

The following section sets the proxies and the SSL verification flag for users working behind a corporate proxy. This setup is required for the notebook to reach the internet.

Replace `userName`, `password`, and `proxy_port` with your own username, password, and proxy port.

In [1]:
# set proxies
import os
# os.environ["http_proxy"]='http://userName:password.@example.com:proxy_port'
# os.environ["https_proxy"] = 'http://userName:password.@example.com:proxy_port'
# os.environ["NO_SSL_VERIFY"]="1"
# os.environ["SSL_VERIFY"]="False"
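
Credentials that contain characters such as `@` or `:` break the proxy URL unless they are percent-encoded. The following is a minimal sketch, assuming a hypothetical helper `build_proxy_url` and placeholder host, port, and password values that are not part of the original notebook:

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port):
    """Return an http proxy URL with percent-encoded credentials (hypothetical helper)."""
    # safe="" forces reserved characters such as '@' and ':' to be encoded
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


# Placeholder values for illustration only
proxy_url = build_proxy_url("userName", "p@ss:word", "example.com", 8080)
print(proxy_url)
```

The resulting string can then be assigned to the `http_proxy` and `https_proxy` environment variables shown in the cell above.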

## Install required modules

In [None]:
!pip install tf2onnx==1.14.0
!pip install opencv_python==4.6.0.66
!pip install onnxruntime==1.14.1
!pip install onnx==1.12.0
!pip install matplotlib==3.3.3
!pip install tensorflow==2.9

## Helpful functions to make the code more modular

In [1]:
import numpy as np
import cv2
import onnxruntime as ort
import json
import requests
import tensorflow as tf
import tf2onnx
def get_label_from_class_map_file(class_id:int,
                                  json_file_path:str ='classmap.json'):
    """
    Get the class label given the path to the class map JSON file and the class ID.

    Args:
    class_id (int): The ID of the class.
    json_file_path (str): Path to the class map JSON file.

    Returns:
    str: The label of the class corresponding to the given ID.
    """
    # Load the class map from the JSON file
    with open(json_file_path, 'r') as f:
        class_map = json.load(f)

    # Create a reverse mapping from IDs to class labels
    id_to_class_map = {v: k for k, v in class_map.items()}

    # Retrieve and return the class label for the given ID
    return id_to_class_map.get(class_id, "Unknown ID")


def load_image(image_path:str,
               input_shape:tuple=(224,224)):
    """
    Load the image in 'bgr' mode, and prepare it for the inference of the model
    This will include the image loading as pixel values between 0-255 in bgr mode
    subtract the mean values for these channel to make the images zero centered
    add a dimension for batch and finally transpose the images to have channel first
    Args:
    image_path (str): path to the image file
    input_shape (integer tuple): shape of the image after resize

    Returns:
    np array of 4D: The preprocessed image
    """
    channel_means = [np.float32(103.939), np.float32(116.779), np.float32(123.68)]
    # Load and preprocess the image
    image = cv2.imread(image_path)  # Load image in BGR format
    if image is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError(f'Could not read image: {image_path}')
    image_resized = cv2.resize(image, input_shape) # Resize to input_shape
    image_resized = image_resized - channel_means # changing pixels to zero-mean values
    
    # Convert image to float32 
    input_image = image_resized.astype(np.float32) 

    # Add a batch dimension (b, h, w, c)
    input_image = np.expand_dims(input_image, axis=0)

    # Transpose to (1, C, H, W) if the model expects channels first
    input_image = np.transpose(input_image, (0, 3, 1, 2))
    return input_image


def run_prediction(model_path, input_image):
    """
    Load the model and run the inference to get the output of the inference
    
    Args:
    model_path (str): path to the model file (the supported formats are hdf5 or onnx)
    input_image (np.array): the preprocessed image

    Returns:
    np array(s) of the predictions
    """
    if model_path.split('.')[-1] == 'onnx':
        try:
            # Load the ONNX model
            session = ort.InferenceSession(model_path)
            # Get model input name
            input_name = session.get_inputs()[0].name
            # Run inference
            return np.asarray(session.run(None, {input_name: input_image})).squeeze()
        except Exception as e:
            print(f'Error occurred while performing model inference!\nProblem with model file or input format!\n{e}')
            return None
    elif model_path.split('.')[-1] == 'hdf5':
        try:
            # load model and run inference
            model = tf.keras.models.load_model(model_path)
            return model.predict(input_image)
        except Exception as e:
            print(f'Error occurred while performing model inference!\nProblem with model file or input format!\n{e}')
            return None
    else:
        print('model is not supported!')
        return None

def download_file(url:str, save_path:str):
    """
    Downloads a file from the given URL and saves it to the specified location.

    Args:
    url (str): URL of the file to be downloaded
    save_path (str): Path where the file should be saved
    """
    try:
        # Send a HTTP request to the URL
        response = requests.get(url, stream=True, verify=False)
        response.raise_for_status()  # Check for HTTP errors

        # Open the file in write-binary mode and write the content
        with open(save_path, 'wb') as file:
            for chunk in response.iter_content(chunk_size=8192):
                file.write(chunk)

        print(f"File downloaded successfully and saved to {save_path}")

    except requests.exceptions.RequestException as e:
        print(f"Error occurred while downloading the file: {e}")
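
To see what `load_image` produces without needing an image file, the same preprocessing can be applied to a synthetic array. This is only a sketch: `preprocess_bgr` and the dummy image are illustrative assumptions, and the resize step is skipped.

```python
import numpy as np

# Caffe-style BGR channel means, same values as in load_image
CHANNEL_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def preprocess_bgr(image_bgr):
    """Zero-center an HxWx3 BGR uint8 image and return a (1, 3, H, W) float32 tensor."""
    centered = image_bgr.astype(np.float32) - CHANNEL_MEANS
    # add the batch axis, then move channels first: (1, H, W, C) -> (1, C, H, W)
    return np.transpose(centered[np.newaxis, ...], (0, 3, 1, 2))


# Uniform gray image standing in for a real photo
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
tensor = preprocess_bgr(dummy)
print(tensor.shape, tensor.dtype)
```

Checking the shape and dtype this way is a quick sanity test before feeding real images to the ONNX session.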

### Downloading the pretrained model

This notebook uses [`mobilenet_v2`](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_classification/files?version=mobilenet_v2) as an example, but any other model from the [repository](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_classification/version) can be used. 

__NOTE__: If a different model is used, all sections of the notebook have to be adapted so that the file paths and names remain valid.

In [None]:
model_url = 'https://api.ngc.nvidia.com/v2/models/org/nvidia/team/tao/pretrained_classification/mobilenet_v2/files?redirect=true&path=mobilenet_v2.hdf5'
model_save_path = 'mobilenet_v2.hdf5'
download_file(url=model_url, save_path=model_save_path)
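
Because `download_file` only prints a message when a request fails, it can help to confirm that the saved file really is an HDF5 file before loading it. The sketch below uses a hypothetical helper, `looks_like_hdf5`, and assumes the HDF5 signature sits at offset 0 of the file, which is the usual case. The temporary files exist only for the demonstration.

```python
import os
import tempfile

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of an HDF5 file


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature (hypothetical helper)."""
    try:
        with open(path, "rb") as f:
            return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
    except OSError:
        # missing or unreadable file
        return False


# Demonstration with two throwaway files
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "model.hdf5")
    bad_path = os.path.join(tmp, "error_page.hdf5")
    with open(good_path, "wb") as f:
        f.write(HDF5_SIGNATURE + b"\x00" * 16)
    with open(bad_path, "wb") as f:
        f.write(b"<html>Access denied</html>")
    good_result = looks_like_hdf5(good_path)
    bad_result = looks_like_hdf5(bad_path)

print(good_result, bad_result)
```

In the notebook, the same check could be applied to `model_save_path` right after the download.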

## Convert the downloaded model to onnx format

The downloaded pretrained models work as-is on GPU machines but cannot be used with TensorFlow 2.x on CPU machines. To make them cross-compatible, the cell below converts the downloaded model to ONNX format.

In [None]:
import tensorflow as tf
import tf2onnx

input_model_path = './mobilenet_v2.hdf5'
opset_version = 15  # You can change this to the desired opset version
output_model_path = "mobilenet_v2.onnx"

# Load your pretrained model from the H5 file
model = tf.keras.models.load_model(input_model_path)

# Define the input signature
spec = (tf.TensorSpec(model.inputs[0].shape, tf.float32, name="input"),)

# Convert the model to ONNX format
tf2onnx.convert.from_keras(model, input_signature=spec, opset=opset_version, output_path=output_model_path)
print(f"Model has been converted to ONNX format with opset version {opset_version} and saved to {output_model_path}")

## Download some test images from the Open Images dataset to perform the test

In [None]:
import pandas as pd
import os

#create the test_images directory
test_image_dir = './test_images/'
if not os.path.isdir(test_image_dir):
    os.makedirs(test_image_dir)

# loading the csv file with the url and labels of the images and downloading the images one by one
image_urls = pd.read_csv('sample_image_paths.csv',header=0)
for i in range(len(image_urls)):
    download_file(url=image_urls.iloc[i].url,save_path= os.path.join(test_image_dir, image_urls.iloc[i].label))
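
The loop above reads a `url` and a `label` column from `sample_image_paths.csv` and uses the label as the file name inside `test_images`. The sketch below illustrates that layout with the standard `csv` module; the two rows are made-up placeholders, and the real file may contain different entries.

```python
import csv
import io
import os

# Hypothetical content; the real sample_image_paths.csv may hold different rows
sample_csv = io.StringIO(
    "url,label\n"
    "https://example.com/images/0001.jpg,cookie.jpg\n"
    "https://example.com/images/0002.jpg,teapot.jpg\n"
)

test_image_dir = "./test_images/"
# Pair every URL with the path where the image would be saved
targets = [
    (row["url"], os.path.join(test_image_dir, row["label"]))
    for row in csv.DictReader(sample_csv)
]
for url, save_path in targets:
    print(url, "->", save_path)
```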

## Running inference on the converted model

In [None]:
onnx_model_path = 'mobilenet_v2.onnx'
test_image_path = './test_images/cookie.jpg'
input_shape=(224,224) # open the downloaded model in netron to confirm the shape if different from 224x224

input_image = load_image(test_image_path, input_shape=input_shape)

# run prediction on the onnx model
result = run_prediction(model_path=onnx_model_path, input_image=input_image)

print(f'prediction on {test_image_path}')
print(f'\tconfidence : {np.max(result)}\n\tlabel : {get_label_from_class_map_file(np.argmax(result))}')
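
The cell above reports only the single best class. When a prediction looks doubtful, listing the top few candidates can be more informative. A minimal sketch, assuming a hypothetical helper `top_k_labels` and a made-up class map that uses the same label-to-ID layout as `classmap.json`:

```python
def top_k_labels(scores, class_map, k=3):
    """Return the k highest-scoring (label, score) pairs (hypothetical helper)."""
    # the class map stores label -> id, so invert it first
    id_to_label = {idx: label for label, idx in class_map.items()}
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [(id_to_label.get(i, "Unknown ID"), float(scores[i])) for i in ranked]


# Made-up scores and class map for illustration
class_map = {"cookie": 0, "teapot": 1, "banana": 2, "laptop": 3}
scores = [0.70, 0.05, 0.20, 0.05]
top3 = top_k_labels(scores, class_map, k=3)
print(top3)
```

The helper accepts any indexable sequence of scores, so the array returned by `run_prediction` could be passed in directly.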

## Quantizing the onnx model using QDQ post-training quantization
The following cell quantizes the ONNX model into QDQ format so that it is more optimized and better suited for edge AI applications.

In [None]:
# quantize the model
import sys
sys.path.append('../classification_tf2/')
import onnx_utils
onnx_utils.quantize_onnx_model(input_model = './mobilenet_v2.onnx', 
                              calibration_dataset_path = './test_images/',
                              preproc_mode='caffe')

## Running inference on the quantized model

In [None]:
onnx_model_path = 'mobilenet_v2_QDQ_quant.onnx'
test_image_path = './test_images/cookie.jpg'

input_shape=(224,224) # open the downloaded model in netron to confirm the shape if different from 224x224

input_image = load_image(test_image_path, input_shape=input_shape)

# run prediction on the onnx model
result = run_prediction(model_path=onnx_model_path, input_image=input_image)

print(f'prediction on {test_image_path}')
print(f'\tconfidence : {np.max(result)}\n\tlabel : {get_label_from_class_map_file(np.argmax(result))}')
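
A single image gives only a rough idea of how much quantization changed the model. One simple check is the fraction of test images on which the float and quantized models agree on the top class. The sketch below assumes a hypothetical helper `top1_agreement` and uses made-up score vectors in place of real model outputs:

```python
def top1_agreement(float_outputs, quant_outputs):
    """Fraction of samples where both models predict the same class (hypothetical helper)."""
    if len(float_outputs) != len(quant_outputs):
        raise ValueError("both lists must have the same number of samples")
    if len(float_outputs) == 0:
        return 0.0

    def argmax(values):
        return max(range(len(values)), key=lambda i: values[i])

    matches = sum(argmax(f) == argmax(q) for f, q in zip(float_outputs, quant_outputs))
    return matches / len(float_outputs)


# Made-up outputs for three images and two classes
float_outputs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
quant_outputs = [[0.8, 0.2], [0.3, 0.7], [0.45, 0.55]]
agreement = top1_agreement(float_outputs, quant_outputs)
print(f"top-1 agreement: {agreement:.2f}")
```

In practice the two lists would be filled by calling `run_prediction` on every image in `test_images` with each model.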

## Benchmarking the optimized model using STEdgeAI Developer Cloud

Getting the package for connecting to STEdgeAI Developer Cloud.

In [None]:
!pip install gitdir
!pip install ipywidgets

In [8]:
!gitdir https://github.com/STMicroelectronics/stm32ai-modelzoo-services/tree/main/common/stm32ai_dc

import os
import shutil

# Reorganize local folders
if os.path.exists('./stm32ai_dc'):
    shutil.rmtree('./stm32ai_dc')
shutil.move('./common/stm32ai_dc', './stm32ai_dc')
shutil.rmtree('./common')

Downloaded: README.md
Downloaded: __init__.py
Downloaded: benchmark_service.py
Downloaded: cloud.py
Downloaded: endpoints.py
Downloaded: file_service.py
Downloaded: generate_nbg_service.py
Downloaded: helpers.py
Downloaded: login_service.py
Downloaded: stm32ai_service.py
Downloaded: user_service.py
Downloaded: errors.py
Downloaded: stm32ai.py
Downloaded: types.py

✔ Download complete


##### Import, helper and UI functions

In [2]:
import os
import sys
from typing import List
import matplotlib.pyplot as plt
import seaborn as sns
import ipywidgets as widgets

from stm32ai_dc import (CliLibraryIde, CliLibrarySerie, CliParameters, MpuParameters, MpuEngine, AtonParameters,
                        CloudBackend, Stm32Ai)

sys.path.append(os.path.abspath('stm32ai'))
os.environ['STATS_TYPE'] = 'stm32ai_tao'

# create a directory for outputs for stm32ai developer cloud operations
stm32ai_output_dir = './results_mobilenetv2/stm32ai_outputs'
os.makedirs(stm32ai_output_dir, exist_ok=True)


def get_mpu_options(board_name: str = None) -> tuple:
    """
    Get MPU benchmark options depending on MPU board selected
    Each MPU board has different settings,
    i.e. different number of cpu_cores or engine (CPU only or HW_Accelerator also)
    Input:
        board_name:str, name of the mpu board
    Returns:
        tuple: engine_used and num_cpu_cores.
    """

    #define configuration by MPU board
    STM32MP257F_EV1 = {
        "engine": MpuEngine.HW_ACCELERATOR,
        "cpu_cores": 2
    }

    STM32MP157F_DK2 = {
        "engine": MpuEngine.CPU,
        "cpu_cores": 2
    }

    STM32MP135F_DK = {
        "engine": MpuEngine.CPU,
        "cpu_cores": 1
    }

    #recover parameters based on board name:
    if board_name == "STM32MP257F-EV1":
        engine_used = STM32MP257F_EV1.get("engine")
        num_cpu_cores = STM32MP257F_EV1.get("cpu_cores")
    elif board_name == "STM32MP157F-DK2":
        engine_used = STM32MP157F_DK2.get("engine")
        num_cpu_cores = STM32MP157F_DK2.get("cpu_cores")
    elif board_name == "STM32MP135F-DK":
        engine_used = STM32MP135F_DK.get("engine")
        num_cpu_cores = STM32MP135F_DK.get("cpu_cores")
    else :
        engine_used = MpuEngine.CPU
        num_cpu_cores = 1

    return engine_used, num_cpu_cores

def analyze_footprints(report: object = None) -> None:
    """
    Analyzes the memory footprint of a STEdgeAI model.

    Args:
        report: A report object containing information about the model.

    Returns:
        None
    """
    activations_ram: float = report.ram_size / 1024
    runtime_ram: float = report.estimated_library_ram_size / 1024
    total_ram: float = activations_ram + runtime_ram
    weights_rom: float = report.rom_size / 1024
    code_rom: float = report.estimated_library_flash_size / 1024
    total_flash: float = weights_rom + code_rom
    macc: float = report.macc / 1e6
    print("[INFO] : STEdgeAI model memory footprint")
    print("[INFO] : MACCs : {} (M)".format(macc))
    print("[INFO] : Total Flash : {0:.1f} (KiB)".format(total_flash))
    print("[INFO] :     Flash Weights  : {0:.1f} (KiB)".format(weights_rom))
    print("[INFO] :     Estimated Flash Code : {0:.1f} (KiB)".format(code_rom))
    print("[INFO] : Total RAM : {0:.1f} (KiB)".format(total_ram))
    print("[INFO] :     RAM Activations : {0:.1f} (KiB)".format(activations_ram))
    print("[INFO] :     RAM Runtime : {0:.1f} (KiB)".format(runtime_ram))


def benchmark_model(stmai:object,
                    model_path:str,
                    model_name:str,
                    optimization:str,
                    from_model:str,
                    board_name:str,
                    allocateInputs:bool =True,
                    allocateOutputs:bool=True) -> float:
    """
    Benchmarks the given model to calculate the footprint on a STM32 Target board.

    Args:
        stmai:object, an object of stm32ai_dc
        model_path:str, path to the model file
        model_name:str, name of the model file as uploaded to the dev cloud
        optimization:str, the way the model is to be optimized, available options ['balanced', 'time', 'ram']
        from_model:str, if the model is coming from zoo or is a custom model from the user
        board_name:str, target board name from one of the available boards on the dev cloud
        allocateInputs:bool, If set to "True", activations buffer will be also used to handle the input buffers. 
        allocateOutputs:bool, If set to "True", activations buffer will be also used to handle the output buffers.

    Returns:
        fps: frames per second (1/inference_time)
    """
    print(f"Benchmarking on: {board_name}")
    if "mp" in board_name.lower():
        # if mpu is selected as the target
        model_extension = os.path.splitext(model_path)[1]
        # only supported options are quantized tflite or onnx models
        if model_extension in ['.onnx', '.tflite']:
            if "stm32mp2" in board_name.lower(): # if mp2 is selected as the target board optimize the model to generate a .nbg file
                optimized_model_path = os.path.dirname(model_path) + "/"
                try:
                    stmai.upload_model(model_path)
                    model = model_name
                    res = stmai.generate_nbg(model)
                    stmai.download_model(res, optimized_model_path + res)
                    model_path=os.path.join(optimized_model_path,res)
                    nb_model_name = os.path.splitext(os.path.basename(model_path))[0] + ".nb"
                    rename_model_path=os.path.join(optimized_model_path,nb_model_name)
                    os.rename(model_path, rename_model_path)
                    model_path = rename_model_path
                    print("[INFO] : Optimized Model Name:", model_name)
                    print("[INFO] : Optimization done ! Model available at :",optimized_model_path)
                    model_name = nb_model_name
                except Exception as e:
                    print(f"[FAIL] : Model optimization via Cloud failed : {e}.")
                    print("[INFO] : Use default model instead of optimized ...")
        else:
            print("[ERROR]: Only .tflite or .onnx models can be benchmarked for MPU")
            fps = 0
            return fps

        engine, nbCores = get_mpu_options(board_name)
        stmai_params = MpuParameters(model=model_name,
                                     nbCores=nbCores,
                                     engine=engine)

    elif 'stm32n6' in board_name.lower():
        model_extension = os.path.splitext(model_path)[1]
        if model_extension in ['.onnx', '.tflite']:
            stmai_params = CliParameters(model=model_name,
                                         target='stm32n6',
                                         stNeuralArt='default',
                                         atonnOptions=AtonParameters())
        else:
            print("[ERROR]: Only .tflite or .onnx models can be benchmarked for N6")
            fps = 0
            return fps
        
    else:
        # target board is an MCU, prepare stm32ai parameters
        stmai_params = CliParameters(model=model_name,
                                     optimization=optimization,
                                     allocateInputs=allocateInputs,
                                     allocateOutputs=allocateOutputs,
                                     fromModel=from_model)
    # running the benchmarking with prepared params
    benchmark_report_dir = f'{stm32ai_output_dir}/benchmark_reports/'
    os.makedirs(benchmark_report_dir, exist_ok=True)


    try:
        result = stmai.benchmark(stmai_params, board_name)
        fps = analyze_inference_time(report=result,
                                     target_mpu="mp" in board_name.lower())
        # Save the result in outputs folder
        with open(f'./{benchmark_report_dir}/{model_name}_{board_name}.txt', 'w') as file_benchmark:
            file_benchmark.write(f'{result}')
        return fps

    except Exception as e:
        print(f"Benchmarking failed on board: {board_name} ({e})")
        fps = 0
        return fps

def analyze_inference_time(report: object = None,
                           target_mpu = False) -> float:
    """
    Analyzes the inference time of a STEdgeAI model, prints the report and return the FPS.
    Args:
        report: A report object containing information about the model.
        target_mpu: a boolean (True: if target is MPU, False: otherwise)

    Returns:
        The frames per second (FPS) of the model.
    """

    inference_time: float = report.duration_ms
    fps: float = 1000.0/inference_time
    if not target_mpu:
        # in mpu benchmark result report we do not have cycles
        cycles: int = report.cycles
        print("[INFO] : Number of cycles : {} ".format(cycles))
    print("[INFO] : Inference Time : {0:.1f} (ms)".format(inference_time))
    print("[INFO] : FPS : {0:.1f}".format(fps))
    return fps


# UI widgets

# optimization options
optimization: List[str] = ["balanced", "time", "ram"]
optim_dropdown: widgets.Dropdown = widgets.Dropdown(
    options=optimization,
    value=optimization[0],
    description='Optim:',
    disabled=False
)

# STM32MCU series for code generation target
series_name: List[str] = [
    "STM32H7", "STM32F7", "STM32F4", "STM32L4", "STM32G4",
    "STM32F3", "STM32U5", "STM32L5", "STM32F0", "STM32L0",
    "STM32G0", "STM32C0", "STM32WL", "STM32H5"
]
series_dropdown: widgets.Dropdown = widgets.Dropdown(
    options=series_name,
    value=series_name[0],
    description='Series:',
    disabled=False
)

# options for the IDE while code generation
IDE_name: List[str] = ["gcc", "iar", "keil"]
ide_dropdown: widgets.Dropdown = widgets.Dropdown(
    options=IDE_name,
    value=IDE_name[0],
    description='IDE:',
    disabled=False
)
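
The footprint arithmetic in `analyze_footprints` can be tried offline with a stand-in report object. In the sketch below, `FootprintReport` is a hypothetical dataclass that only mirrors the four size attributes read by that function, and all sizes are made-up values in bytes.

```python
from dataclasses import dataclass


@dataclass
class FootprintReport:
    """Stand-in for the cloud report; field names follow analyze_footprints."""
    ram_size: int
    estimated_library_ram_size: int
    rom_size: int
    estimated_library_flash_size: int


def footprint_kib(report):
    """Return (total RAM, total flash) in KiB."""
    total_ram = (report.ram_size + report.estimated_library_ram_size) / 1024
    total_flash = (report.rom_size + report.estimated_library_flash_size) / 1024
    return total_ram, total_flash


# Made-up sizes in bytes
report = FootprintReport(
    ram_size=512 * 1024,
    estimated_library_ram_size=16 * 1024,
    rom_size=3 * 1024 * 1024,
    estimated_library_flash_size=64 * 1024,
)
total_ram, total_flash = footprint_kib(report)
print(f"Total RAM: {total_ram:.1f} KiB, Total Flash: {total_flash:.1f} KiB")
```

Comparing these totals with the RAM and flash available on a candidate board gives a first indication of whether the model can fit.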

### A. Login to STEdgeAI Developer Cloud
Set environment variables with your credentials to access STEdgeAI Developer Cloud.

If you don't have an account yet, go to https://stm32ai-cs.st.com/home and click on sign in to create one. 

Then set the environment variables below with your credentials.

In [None]:
import getpass
# Set environment variables with your credentials to access 
# STEdgeAI Developer Cloud services
# Fill the username with your login address 
username = 'your.email@example.com'
os.environ['stmai_username'] = username
print('Enter your password')
password = getpass.getpass()
os.environ['stmai_password'] = password

In [None]:
# Log in STEdgeAI Developer Cloud 
try:
    stmai = Stm32Ai(CloudBackend(str(username), str(password)))
    print("Successfully Connected!")
except Exception as e:
    print("Error: please verify your credentials")
    print(e)
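
When the notebook is run repeatedly, reading the credentials from the environment first and prompting only if the password is missing avoids retyping it. A minimal sketch, assuming a hypothetical helper `get_credentials`; the variable names `stmai_username` and `stmai_password` are the ones set in the cell above, and the demo values are placeholders.

```python
import getpass
import os


def get_credentials(env=None, prompt=getpass.getpass):
    """Return (username, password), prompting only when the password is not set."""
    env = os.environ if env is None else env
    username = env.get("stmai_username")
    if not username:
        raise ValueError("stmai_username is not set")
    # fall back to an interactive prompt when no password is stored
    password = env.get("stmai_password") or prompt("Password: ")
    return username, password


# Demonstration with a fake environment and a fake prompt (no real credentials)
fake_env = {"stmai_username": "your.email@example.com"}
username, password = get_credentials(env=fake_env, prompt=lambda _: "not-a-real-password")
print(username)
```

Passing the environment and the prompt function as arguments keeps the helper easy to exercise without typing anything.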

### B. Upload the model on STEdgeAI Developer Cloud

In [None]:
model_path = './mobilenet_v2_QDQ_quant.onnx'
# the model name on the cloud is the uploaded file name
model_name = os.path.basename(model_path)
from_model = 'user'

try:
    stmai.upload_model(model_path)
    print(f'Model {model_name} is uploaded !')
except Exception as e:
    print("ERROR: ", e)

### C. Select the STEdgeAI optimization setting
| Configuration | Description |
| --- | --- |
| balanced | default compromise between RAM footprint and latency. |
| time | optimize for latency. |
| ram | optimize for minimal RAM footprint. |

In [None]:
display(optim_dropdown)

### D. Analyze your model memory footprints for STM32MCU targets
When analyzing the model footprints for STM32MCU targets, the following parameters can be configured for the stm32ai.analyze call:

CliParameters (options of STEdgeAI):

| Parameter | Description |
| --- | --- |
| model | Model name corresponding to the file name uploaded. This parameter is __required__. |
| optimization | Optimization setting: "balanced", "time" or "ram". This parameter is __required__. |
| allocateInputs | If set to "True", activations buffer will be also used to handle the input buffers. This parameter is __optional__. Default value is "True". |
| allocateOutputs | If set to "True", activations buffer will be also used to handle the output buffers. This parameter is __optional__. Default value is "True". |
| noOnnxOptimizer | If set to "True", disables the ONNX optimizer pass. This parameter is __optional__. Default value is "False". |
| fromModel | To identify the origin model when coming from ST model zoo. This parameter is __optional__. Default value is "user".|
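
The required and optional parameters in the table can be summarized in a few lines of plain Python. The sketch below builds an ordinary dictionary and is not the real `CliParameters` class; `build_analyze_options` is a hypothetical helper, and the defaults are the ones listed in the table.

```python
VALID_OPTIMIZATIONS = ("balanced", "time", "ram")

# Defaults of the optional parameters, as listed in the table
DEFAULTS = {
    "allocateInputs": True,
    "allocateOutputs": True,
    "noOnnxOptimizer": False,
    "fromModel": "user",
}


def build_analyze_options(model, optimization, **overrides):
    """Merge required and optional parameters into one dict (hypothetical helper)."""
    if optimization not in VALID_OPTIMIZATIONS:
        raise ValueError(f"optimization must be one of {VALID_OPTIMIZATIONS}")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")
    return {"model": model, "optimization": optimization, **DEFAULTS, **overrides}


options = build_analyze_options("mobilenet_v2_QDQ_quant.onnx", "balanced", allocateOutputs=False)
print(options)
```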

In [None]:
# Analyze RAM/Flash model memory footprints after optimization by STEdgeAI
optimization = optim_dropdown.value
print(f'Analyzing model : {model_name}, using optimization : {optimization}')
# The runtime library footprint varies slightly depending on the STM32 series
# For an estimation, the series defaults to STM32F4
try:
  result = stmai.analyze(CliParameters(model=model_path,
                                       optimization=optimization,
                                       allocateInputs=True,
                                       allocateOutputs=True,
                                       fromModel=from_model))
  # analyze and print the summary of footprint report
  analyze_footprints(report=result)
  
  # Save the result in outputs folder
  stm32ai_analysis_dir = f'{stm32ai_output_dir}/analysis_report'
  os.makedirs(stm32ai_analysis_dir, exist_ok=True)
  with open(f'./{stm32ai_analysis_dir}/{model_name}_analyze.txt', 'w') as file_analyze:
    file_analyze.write(f'{result}')
except Exception as e:
    print("Error: ", e)

### E. Benchmark your model on a STM32 target
Starting from STEdgeAI Developer Cloud version 10.0.0, models can be benchmarked on STM32MCU, STM32MPU, and STM32NPU target boards.

The table below lists the parameters used when benchmarking on STM32MCU targets (CliParameters options of STEdgeAI):

| Parameter | Description |
| --- | --- |
| model | Model name corresponding to the file name uploaded. This parameter is required. |
| optimization | Optimization setting: "balanced", "time" or "ram". This parameter is required. |
| allocateInputs | If set to "True", activations buffer will be also used to handle the input buffers. This parameter is optional. Default value is "True". |
| allocateOutputs | If set to "True", activations buffer will be also used to handle the output buffers. This parameter is optional. Default value is "True". |
| noOnnxOptimizer | If set to "True", disables the ONNX optimizer pass. This parameter is optional. Default value is "False". Applies only to ONNX files and is ignored otherwise. |
| fromModel | To identify the origin model when coming from ST model zoo. This parameter is optional. Default value is "user". |


For STM32MPU targets, only the following parameters are needed:

| Parameter | Description |
| --- | --- |
| model | Model name corresponding to the file name uploaded. This parameter is __required__. |
| nbCores | Number of CPU cores used for benchmarking. This parameter is __set by the code__ depending on the type of MPU. The value should be an integer "1", or "2". |
| engine | Choice of the hardware engine used on the board for benchmarking. This parameter is __set by the code__ depending on the target MPU. For STM32MP1X boards it is "MpuEngine.CPU" and for STM32MP2X this is "MpuEngine.HW_ACCELERATOR". |

* Note that in the code section below, the `board_name` to benchmark the model on should be a string.
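
The per-board MPU settings can also be expressed as a lookup table. The sketch below mirrors the values used by `get_mpu_options`, but represents the engine as a plain string instead of the `MpuEngine` enum so that it runs without the `stm32ai_dc` package; `lookup_mpu_options` is a hypothetical name.

```python
# (engine, number of CPU cores) per board, same values as in get_mpu_options
MPU_OPTIONS = {
    "STM32MP257F-EV1": ("HW_ACCELERATOR", 2),
    "STM32MP157F-DK2": ("CPU", 2),
    "STM32MP135F-DK": ("CPU", 1),
}
DEFAULT_MPU_OPTIONS = ("CPU", 1)  # fallback for boards not listed above


def lookup_mpu_options(board_name):
    """Return (engine, cpu_cores) for a board name (hypothetical helper)."""
    return MPU_OPTIONS.get(board_name, DEFAULT_MPU_OPTIONS)


for name in ("STM32MP257F-EV1", "STM32MP135F-DK", "SOME-OTHER-BOARD"):
    print(name, lookup_mpu_options(name))
```

A table like this keeps the board settings in one place, so supporting a new board means adding a single entry.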

In [None]:
# Get the available board on STEdgeAI Developer Cloud
boards = stmai.get_benchmark_boards()
board_names = [boards[i].name for i in range(len(boards))]
print("Available boards:", board_names)

#### Option 1. Benchmark on all available STM32 boards

In [None]:
# Benchmark the model on all STEdgeAI Developer Cloud boards
print(model_name)
fps_array = []
# loop through all boards
for board_name in board_names:
        fps_array.append(benchmark_model(stmai=stmai,
                                         model_path=model_path,
                                         model_name=model_name,
                                         optimization=optimization,
                                         from_model=from_model,
                                         board_name=board_name,
                                         allocateInputs= True,
                                         allocateOutputs=True))

In [None]:
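# Display the frames-per-second benchmark, sorted from fastest to slowest
# Sorting (fps, board) pairs keeps boards distinct when FPS values tie (e.g. several failed runs at 0)
ranked = sorted(zip(fps_array, board_names), key=lambda pair: pair[0], reverse=True)
sorted_fps = [fps for fps, _ in ranked]
sorted_boards = [name for _, name in ranked]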
fig = plt.figure(1, figsize=(15, 8), tight_layout=True)
# colors = sns.color_palette()

colors = ['#4C72B0', '#55A868', '#C44E52', '#8172B2', '#CCB974',
          '#64B5CD', '#B4A7D6', '#AEC7E8', '#FFA07A', '#FFC0CB',
          '#FFFFB3', '#8DD3C7', '#BEBADA', '#FDB462', '#FB8072',
          '#FF6347', '#4682B4', '#6A5ACD', '#7FFF00', '#D2691E']

plt.bar(sorted_boards, sorted_fps, color=colors[:len(boards)], width=0.7)
plt.ylabel('FPS', fontsize=15)
plt.yticks(fontsize=12)
plt.xticks(sorted_boards, rotation = 75)
plt.title('STM32 FPS benchmark')
plt.show()

#### Option 2. Benchmark on a selected board

In [None]:
# Select a board among the available boards
board_dropdown = widgets.Dropdown(
    options = board_names,
    value = 'STM32N6570-DK',
    description ='Board:',
    disabled = False,)

display(board_dropdown)

In [None]:
board_name = board_dropdown.value
print(model_name, board_name)
fps = benchmark_model(stmai=stmai,
                      model_path=model_path,
                      model_name=model_name,
                      optimization=optimization,
                      from_model=from_model,
                      board_name=board_name,
                      allocateInputs= True,
                      allocateOutputs=True)
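
The FPS value returned by `benchmark_model` is the inverse of the measured inference time, and failed runs are reported as 0. A minimal sketch of that conversion, with an added guard for missing or non-positive durations that is an assumption of this example and not part of `analyze_inference_time`:

```python
def fps_from_latency_ms(duration_ms):
    """Convert an inference time in milliseconds to frames per second."""
    if duration_ms is None or duration_ms <= 0:
        # treat missing or invalid measurements like a failed benchmark
        return 0.0
    return 1000.0 / duration_ms


# Made-up latencies for illustration
for latency in (12.5, 20.0, 0):
    print(latency, "ms ->", fps_from_latency_ms(latency), "FPS")
```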

### F. Generate your model optimized C code for STM32MCU targets

To deploy the model on an STM32MCU target, the C code of the optimized model has to be generated. The table below lists the parameters of the stm32ai.generate call (CliParameters of STEdgeAI):

| Parameter | Description |
| --- | --- |
| model | Model name corresponding to the file name uploaded. This parameter is required. |
| optimization | Optimization setting: "balanced", "time" or "ram". This parameter is required. |
| allocateInputs | If set to "True", activations buffer will be also used to handle the input buffers. This parameter is optional. Default value is "True". |
| allocateOutputs | If set to "True", activations buffer will be also used to handle the output buffers. This parameter is optional. Default value is "True". |
| noOnnxOptimizer | If set to "True", disables the ONNX optimizer pass. This parameter is optional. Default value is "False". Applies only to ONNX files and is ignored otherwise. |
| includeLibraryForSerie | Include the runtime library for the given STM32 series. This parameter is optional. |
| fromModel | To identify the origin model when coming from ST model zoo. This parameter is optional. |



### NOTE

This step is not needed when deploying on an MPU: the .tflite model can be deployed directly on STM32MPUs. For STM32MP2x, an optimized version of the model should already be available next to the original model, with the same name and the ".nb" extension.

In [None]:
display(series_dropdown)
display(ide_dropdown)

In [None]:
series = series_dropdown.value
IDE = ide_dropdown.value
print(f'Generating optimized C code of {model_name} model, for {series} series boards!\n')
# Generate model .c/.h code + Lib/Inc on STEdgeAI Developer Cloud
stm32ai_code_dir = f'{stm32ai_output_dir}/generated_code'
os.makedirs(stm32ai_code_dir, exist_ok=True)
result = stmai.generate(CliParameters(
    model=model_name,
    output=stm32ai_code_dir,
    optimization=optimization,
    allocateInputs=True,
    allocateOutputs=True,
    includeLibraryForSerie=CliLibrarySerie(series),
    includeLibraryForIde=CliLibraryIde(IDE),
    fromModel=from_model
))
!ls "{stm32ai_code_dir}"
# print the first 20 lines of the report
if os.path.isfile(f'./{stm32ai_code_dir}/network_generate_report.txt'):
  print("\n\n---- code generation report ----\n","*" * 80)
  with open(f'./{stm32ai_code_dir}/network_generate_report.txt', 'r') as f:
    # zip stops early if the report has fewer than 20 lines
    for _, line in zip(range(20), f): print(line)
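
The `!ls` command above depends on a Unix-like shell. A portable way to inspect the generated files is to group them by extension with `pathlib`. In the sketch below, `group_by_extension` is a hypothetical helper, and the file names written to the temporary directory are placeholders (only `network_generate_report.txt` appears in this notebook).

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def group_by_extension(directory):
    """Map each file extension to the sorted list of file names found under directory."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            groups[path.suffix or "(none)"].append(path.name)
    return dict(groups)


# Demonstration on a temporary directory with placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("network.c", "network.h", "network_data.c", "network_generate_report.txt"):
        (Path(tmp) / name).write_text("")
    groups = group_by_extension(tmp)

print(groups)
```

In the notebook, the helper would be called with `stm32ai_code_dir` once code generation has finished.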


#### You are ready to integrate your model in your STM32 application !

#### (Optional) : Delete your model from your STEdgeAI Developer Cloud space

In [None]:
if stmai.delete_model(model_name):
    print(f'{model_name} deleted from STEdgeAI developer Cloud workspace!')

## Deployment on STM32N6, STM32H7* and STM32MPU

The `QDQ` quantized models can be deployed using the [stm32ai-modelzoo-services](https://github.com/STMicroelectronics/stm32ai-modelzoo-services) as an [image_classification](https://github.com/STMicroelectronics/stm32ai-modelzoo-services/tree/main/image_classification) model. For more details on how to do that, please refer to [Deploying Image Classification models on STM32MCU](https://github.com/STMicroelectronics/stm32ai-modelzoo-services/blob/main/image_classification/deployment/README.md) and [Deploying Image Classification Models on STM32MPU](https://github.com/STMicroelectronics/stm32ai-modelzoo-services/blob/main/image_classification/deployment/README_MPU.md).