# Model Optimization Notebook

This Jupyter Notebook demonstrates how to download the COCO validation images, preprocess a subset of them, and convert a YOLO NAS ONNX model to several SNPE DLC variants for deployment. The workflow includes:

- Importing necessary libraries for image processing and file management.
- Downloading and extracting the COCO 2017 validation images and selecting a small subset of them.
- Preprocessing images to the required input format for model inference.
- Converting the YOLO NAS ONNX model to SNPE DLC format, including quantization and graph preparation for specific hardware targets.
- Documenting each step for reproducibility and clarity.

This notebook serves as a practical guide for deploying deep learning models on Qualcomm platforms using the SNPE toolkit.

## How to Use

1. **Build and start the Docker Compose environment** as described in the project documentation.
2. **Access this notebook** in your browser at:  
    [http://127.0.0.1:8888/notebooks/model_optimization.ipynb](http://127.0.0.1:8888/notebooks/model_optimization.ipynb)
3. **Run all cells** in order to optimize the ONNX model for Qualcomm chipsets.

In [None]:
# Import necessary libraries.
import glob
import os

import cv2 as cv
import numpy as np

## Data Preprocessing

The `preprocess` function resizes an input image to 320x320 pixels, normalizes its pixel values to the range [0, 1], and returns the processed image as a NumPy array of type float32, preparing it for model inference.

In [None]:
def preprocess(original_image: np.ndarray) -> np.ndarray:
    """
    Preprocess the input image for model inference.

    Args:
        original_image (np.ndarray): The input image in BGR format.

    Returns:
        np.ndarray: The preprocessed image in the format expected by the model.
    """

    # Resize the image to 320x320 pixels and normalize pixel values to [0, 1].
    resized_image = cv.resize(original_image, (320, 320))
    return (resized_image / 255.).astype(np.float32)

### Getting the COCO Dataset
The COCO (Common Objects in Context) dataset is a large-scale image dataset designed for object detection, segmentation, and captioning tasks. In this pipeline, we use a subset of the COCO validation images to test and optimize our deep learning model. The images are downloaded, preprocessed, and converted into a raw format suitable for model inference and quantization steps. This ensures that the model is evaluated and optimized using real-world, diverse data representative of common objects and scenes.

In [None]:
if not os.path.exists("val2017.zip"):
    !wget http://images.cocodataset.org/zips/val2017.zip -q --show-progress

if not os.path.exists("val2017"):
    !unzip val2017.zip

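# Create the output directory, then convert images only if too few .raw files exist.
os.makedirs("raw", exist_ok=True)
if len(glob.glob("raw/*.raw")) < 15:
    # Sort so the same 15 images are selected on every run.
    for filename in sorted(glob.glob("val2017/*.jpg"))[:15]:
        image = cv.imread(filename)
        if image is None:  # Skip files that OpenCV cannot decode.
            continue
        image = preprocess(image)
        image.tofile(
            filename.replace("val2017", "raw").replace(".jpg", ".raw")
        )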

!zsh -c 'find raw -name "*.raw" > ./raw/input.txt'
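Because `.raw` files carry no header, it can be useful to confirm that each one has the size and layout the model expects before passing the list to the quantizer. The following is a minimal sketch of such a check, assuming the 320x320, 3-channel, float32 layout produced by `preprocess`. The names `load_raw`, `EXPECTED_SHAPE`, and `EXPECTED_BYTES` are hypothetical helpers, not part of SNPE, and the demo uses synthetic data so it runs without the COCO download.

```python
import os
import tempfile

import numpy as np

# Assumed layout: height x width x channels, float32 (what `preprocess` writes).
EXPECTED_SHAPE = (320, 320, 3)
EXPECTED_BYTES = int(np.prod(EXPECTED_SHAPE)) * np.dtype(np.float32).itemsize


def load_raw(path: str) -> np.ndarray:
    """Load a headerless float32 .raw file and restore its image shape."""
    size = os.path.getsize(path)
    if size != EXPECTED_BYTES:
        raise ValueError(f"{path}: expected {EXPECTED_BYTES} bytes, got {size}")
    return np.fromfile(path, dtype=np.float32).reshape(EXPECTED_SHAPE)


# Round-trip demo on a synthetic image with values in [0, 1).
rng = np.random.default_rng(0)
fake_image = rng.random(EXPECTED_SHAPE, dtype=np.float32)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.raw")
    fake_image.tofile(demo_path)
    restored = load_raw(demo_path)

print(restored.shape, restored.dtype, EXPECTED_BYTES)
```

In the notebook, the same helper could be applied to every path listed in `./raw/input.txt` to catch truncated or wrongly shaped files early.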

## Model Optimization
This section covers the process of optimizing a deep learning model for deployment on Qualcomm® chipsets using the Qualcomm® Neural Processing SDK for AI (SNPE). The workflow includes converting a YOLO NAS ONNX model to the SNPE DLC format, quantizing the model for efficient inference, and preparing the model for specific hardware targets.

### 1. Model Conversion

The first step is to convert the ONNX model to the SNPE Deep Learning Container (DLC) format. This is achieved using the `snpe-onnx-to-dlc` tool, which translates the ONNX model into a format compatible with Qualcomm® hardware accelerators.

**Command:**
```
snpe-onnx-to-dlc -i /models/yolo_nas_s.onnx -o /models/yolo_nas_s_fp32.dlc
```

In [None]:
!zsh -c 'snpe-onnx-to-dlc -i /models/yolo_nas_s.onnx -o /models/yolo_nas_s_fp32.dlc'

### 2. Model Inspection

After conversion, the `snpe-dlc-info` tool is used to inspect the DLC file. This step ensures that the model has been correctly converted and provides information about the model's input and output tensors.

**Command:**
```
snpe-dlc-info -i /models/yolo_nas_s_fp32.dlc
```

In [None]:
!zsh -c 'snpe-dlc-info -i /models/yolo_nas_s_fp32.dlc'

### 3. Model Quantization

Quantization reduces the model size and typically increases inference speed by representing floating-point weights and activations as 8-bit integers. The `snpe-dlc-quantize` tool runs the calibration dataset (prepared in the previous steps) through the model to estimate the value ranges it needs for INT8 precision.

**Command:**
```
snpe-dlc-quantize --input_dlc /models/yolo_nas_s_fp32.dlc --input_list ./raw/input.txt --output_dlc /models/yolo_nas_s_int8.dlc
```

In [None]:
!zsh -c 'snpe-dlc-quantize --input_dlc /models/yolo_nas_s_fp32.dlc --input_list ./raw/input.txt --output_dlc /models/yolo_nas_s_int8.dlc'
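To make the idea behind 8-bit quantization concrete, here is a minimal NumPy sketch of generic affine (scale and zero-point) quantization. It is an illustration only and is not claimed to match the algorithm `snpe-dlc-quantize` uses internally; the function names `quantize_uint8` and `dequantize` are hypothetical.

```python
from typing import Tuple

import numpy as np


def quantize_uint8(values: np.ndarray) -> Tuple[np.ndarray, float, int]:
    """Affine-quantize float values to 8 bits; return (q, scale, zero_point)."""
    # Extend the range to include 0 so that zero stays representable.
    lo = min(float(values.min()), 0.0)
    hi = max(float(values.max()), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # Fall back to 1.0 for all-zero input.
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(values / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point


def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map 8-bit codes back to approximate float values."""
    return (q.astype(np.float32) - zero_point) * scale


weights = np.linspace(-1.0, 1.0, 11, dtype=np.float32)
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
max_error = float(np.abs(weights - restored).max())
print(f"scale={scale:.6f} zero_point={zero_point} max_error={max_error:.6f}")
```

The round-trip error is bounded by roughly one quantization step (`scale`), which is why the value range matters: a wider range means a larger step and coarser values.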

### 4. Post-Quantization Inspection

After quantization, the model is inspected again to verify the changes and ensure the quantized model is ready for deployment.

**Command:**
```
snpe-dlc-info -i /models/yolo_nas_s_int8.dlc
```

In [None]:
!zsh -c 'snpe-dlc-info -i /models/yolo_nas_s_int8.dlc'


### 5. Hardware-Specific Graph Preparation

To further optimize the model for a specific Qualcomm® SoC (e.g., SM7325), the `snpe-dlc-graph-prepare` tool is used. This step configures the model's output tensors and prepares it for execution on the target hardware's HTP (Hexagon Tensor Processor).

**Command:**
```
snpe-dlc-graph-prepare --input_dlc /models/yolo_nas_s_int8.dlc --set_output_tensors=output_bboxes,output_classes --htp_socs=sm7325 --output_dlc=/models/yolo_nas_s_int8_htp_sm7325.dlc
```

In [None]:
!zsh -c 'snpe-dlc-graph-prepare --input_dlc /models/yolo_nas_s_int8.dlc --set_output_tensors=output_bboxes,output_classes --htp_socs=sm7325 --output_dlc=/models/yolo_nas_s_int8_htp_sm7325.dlc'

### 6. Final Model Inspection

A final inspection confirms that the model is correctly prepared for the target hardware and ready for deployment.

**Command:**
```
snpe-dlc-info -i /models/yolo_nas_s_int8_htp_sm7325.dlc
```

In [None]:
!zsh -c 'snpe-dlc-info -i /models/yolo_nas_s_int8_htp_sm7325.dlc'
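The six steps above can also be collected into one script so that a failing step stops the pipeline. The sketch below reuses exactly the commands and flags from the cells above and adds no new ones. The helpers `missing_tools` and `run_pipeline` are hypothetical names, and the demo only reports which tools are missing from `PATH` rather than running anything.

```python
import shutil
import subprocess
from typing import List

MODEL_DIR = "/models"  # Same model directory used throughout this notebook.
FP32 = f"{MODEL_DIR}/yolo_nas_s_fp32.dlc"
INT8 = f"{MODEL_DIR}/yolo_nas_s_int8.dlc"
HTP = f"{MODEL_DIR}/yolo_nas_s_int8_htp_sm7325.dlc"

# The notebook's commands, expressed as argument lists (no shell needed).
PIPELINE: List[List[str]] = [
    ["snpe-onnx-to-dlc", "-i", f"{MODEL_DIR}/yolo_nas_s.onnx", "-o", FP32],
    ["snpe-dlc-info", "-i", FP32],
    ["snpe-dlc-quantize", "--input_dlc", FP32,
     "--input_list", "./raw/input.txt", "--output_dlc", INT8],
    ["snpe-dlc-info", "-i", INT8],
    ["snpe-dlc-graph-prepare", "--input_dlc", INT8,
     "--set_output_tensors=output_bboxes,output_classes",
     "--htp_socs=sm7325", f"--output_dlc={HTP}"],
    ["snpe-dlc-info", "-i", HTP],
]


def missing_tools(commands: List[List[str]]) -> List[str]:
    """Return the executables in `commands` that are not found on PATH."""
    return sorted({cmd[0] for cmd in commands if shutil.which(cmd[0]) is None})


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order; raise on a missing tool or a failing step."""
    missing = missing_tools(commands)
    if missing:
        raise FileNotFoundError(f"SNPE tools not on PATH: {', '.join(missing)}")
    for cmd in commands:
        subprocess.run(cmd, check=True)  # check=True raises on non-zero exit.


# Demo: report availability only; call run_pipeline(PIPELINE) to execute.
print("Missing tools:", missing_tools(PIPELINE) or "none")
```

Passing argument lists to `subprocess.run` avoids shell quoting issues, and `check=True` means a failed conversion is not silently followed by quantization of a stale file.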

By following these steps, the model is optimized for efficient inference on Qualcomm® platforms, leveraging hardware acceleration for real-time AI applications. Because INT8 quantization can reduce accuracy, compare the quantized model's outputs with those of the FP32 model on representative data before deploying it to edge devices powered by Qualcomm® chipsets.