<a href="https://colab.research.google.com/github/marcory-hub/hailo/blob/main/DFC.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1. Prepare your dataset


1. Dataset Structure:
  
  Create an annotated dataset with the folder and naming structure shown below. If you do not have your own dataset, you can download the [hornet3000+ dataset](https://www.kaggle.com/datasets/marcoryvandijk/vespa-velutina-v-crabro-vespulina-vulgaris). To annotate your own images, take a look at CVAT or Roboflow and save the annotations in YOLO format.

```
data/
  train/
    images/
      image_1.jpg  # Image file
      image_2.png  # Image file
      etc...
    labels/
      image_1.txt  # Label file
      image_2.txt  # Label file
      etc...
  val/
    images/
      image_3.jpg  # Image for validation
      image_4.png  # Image for validation
      ...
    labels/
      image_3.txt  # Label file for validation
      image_4.txt  # Label file for validation
      ...
```
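
A mismatch between `images/` and `labels/` is easy to miss, so it can help to compare file stems per split before training. The sketch below assumes the layout shown above; the helper name `find_unpaired` is hypothetical. It reports images without a label file even though a trainer may accept those as background images.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def find_unpaired(split_dir):
    """Return (images without a label, labels without an image) for one split folder."""
    split = Path(split_dir)
    images = {p.stem for p in (split / "images").iterdir()
              if p.suffix.lower() in IMAGE_SUFFIXES}
    labels = {p.stem for p in (split / "labels").iterdir()
              if p.suffix.lower() == ".txt"}
    return sorted(images - labels), sorted(labels - images)

# Example call (path is an assumption):
# missing_labels, missing_images = find_unpaired("data/train")
```

Two empty lists mean every image has a label file with the same stem and the other way round.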

2. Data.yaml Configuration (Optional)

  Add `data.yaml` (the configuration file the training script uses to locate the data) to the dataset folder. It contains:
  - the absolute path to the training images (train)
  - the absolute path to the validation images (val)
  - the number of classes (nc)
  - the class names (names)

  For example, a dataset with 3 types of insects would look like this:
```
train: /content/data/train # path to train images
val: /content/data/val # path to val images
nc: 3
names: ['Vespa_velutina', 'Vespa_crabro', 'Vespula_vulgaris']
```
Make sure to adjust the `nc` (number of classes) and `names` accordingly.
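
One way to keep `nc` and `names` from drifting apart is to generate the file text from the list of class names. This is a minimal sketch with a hypothetical helper `render_data_yaml`; it only builds the four keys shown above and does not use a YAML library.

```python
def render_data_yaml(train_dir, val_dir, names):
    """Build the text of a data.yaml; nc is derived from names so they cannot disagree."""
    quoted = ", ".join(f"'{name}'" for name in names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        f"nc: {len(names)}\n"
        f"names: [{quoted}]\n"
    )

yaml_text = render_data_yaml(
    "/content/data/train",
    "/content/data/val",
    ["Vespa_velutina", "Vespa_crabro", "Vespula_vulgaris"],
)
print(yaml_text)
```

Writing `yaml_text` to `data.yaml` with a plain `open(..., "w")` is enough for a file this small.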

3. Zip Your Dataset (Optional):

  Only if you use your own dataset: zip the data folder to a file named `dataset.zip` (or a custom name). On a Mac, use `ditto -c -k --norsrc --keepParent images dataset.zip` to exclude Finder files from the zipped file.
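
  If you prefer a platform-independent route, the same idea can be sketched with Python's standard `zipfile` module. The helper names `is_finder_file` and `zip_dataset` are hypothetical, and the list of skipped names (`.DS_Store`, `._*`, `__MACOSX`) is an assumption about which Finder files you want to exclude.

```python
import zipfile
from pathlib import Path

def is_finder_file(path):
    """True for macOS metadata: .DS_Store, AppleDouble '._*' files, __MACOSX folders."""
    return (path.name == ".DS_Store"
            or path.name.startswith("._")
            or "__MACOSX" in path.parts)

def zip_dataset(folder, zip_path):
    """Zip `folder` (keeping its top-level name) and skip Finder metadata."""
    folder = Path(folder)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file() and not is_finder_file(path):
                # Store paths relative to the parent so the archive unpacks to <folder>/...
                archive.write(path, path.relative_to(folder.parent).as_posix())

# Example call (paths are assumptions):
# zip_dataset("data", "dataset.zip")
```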

4. Copy the zipped dataset:
  
  Copy the zipped dataset and the `data.yaml` file to your Google Drive.



## 2. Unzip dataset.zip and rename the folder on google drive

Adjust the names of the `dataset_path` and `dataset_filename` in the boxes on the right.

In [None]:
## 2. Unzip dataset.zip and rename the folder on google drive
from google.colab import drive

drive.mount('/content/gdrive')

In [None]:
import os

# Define Paths with Parameters
dataset_path = "/content/gdrive/MyDrive/vespA/100_1_dataset.zip"  # @param {type:"string"}
dataset_filename = "100_1_sampleSize"  # @param {type:"string"}

# Unzip the Dataset (using the defined path)
!unzip {dataset_path} -d '/content/'

# Rename the Extracted Folder
old_path = f'/content/{dataset_filename}'
new_path = '/content/dataset'
os.rename(old_path, new_path)

Optional: check that the paths are correct.

The output should be:
- train valid
- images labels (for both `train` and `valid`)
- /content/gdrive/MyDrive/vespA/data.yaml


In [None]:
# Optional: check the data paths

!ls '/content/dataset/'
!ls '/content/dataset/train/'
!ls '/content/dataset/valid/'
!ls '/content/gdrive/MyDrive/vespA/data.yaml'

## 3. Train YOLO11 model

1. Install ultralytics

In [None]:
#Installing the python package
!pip install ultralytics

#Verifying the installation
!pip show ultralytics
import ultralytics
ultralytics.checks()
from ultralytics import YOLO
from IPython.display import Image

Ultralytics 8.3.49 🚀 Python-3.10.12 torch-2.5.1+cu121 CPU (Intel Xeon 2.20GHz)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 33.2/107.7 GB disk)


2. Retrain model

TODO fix log W&B and project name with params

In [None]:
from ultralytics import YOLO

model_name = "yolo11s" #@param {type:"string"}
dataset_path = "/content/gdrive/MyDrive/vespA/data.yaml" #@param {type:"string"}

# Get pre-trained model
model = YOLO(f"{model_name}.pt")

# Train/fine-tune model
model.train(project="yolo11vespA",  # wandb project name
            name="train1",          # run name wandb
            data=dataset_path,
            epochs=3,
            imgsz=640,
            fraction=0.1)

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11s.pt to 'yolo11s.pt'...


100%|██████████| 18.4M/18.4M [00:00<00:00, 176MB/s]


Ultralytics 8.3.49 🚀 Python-3.10.12 torch-2.5.1+cu121 CPU (Intel Xeon 2.20GHz)
engine/trainer: task=detect, mode=train, model=yolo11s.pt, data=/content/gdrive/MyDrive/vespA/data.yaml, epochs=3, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=yolo11vespA, name=train1, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=0.1, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=Tr

100%|██████████| 755k/755k [00:00<00:00, 58.6MB/s]


Overriding model.yaml nc=80 with nc=5

                   from  n    params  module                                       arguments                     
  0                  -1  1       928  ultralytics.nn.modules.conv.Conv             [3, 32, 3, 2]                 
  1                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  2                  -1  1     26080  ultralytics.nn.modules.block.C3k2            [64, 128, 1, False, 0.25]     
  3                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]              
  4                  -1  1    103360  ultralytics.nn.modules.block.C3k2            [128, 256, 1, False, 0.25]    
  5                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
  6                  -1  1    346112  ultralytics.nn.modules.block.C3k2            [256, 256, 1, True]           
  7                  -1  1   1180672  ultralytics

train: Scanning /content/dataset/train/labels... 205 images, 0 backgrounds, 0 corrupt: 100%|██████████| 205/205 [00:00<00:00, 262.55it/s]
train: New cache created: /content/dataset/train/labels.cache

albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, num_output_channels=3, method='weighted_average'), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))

  check_for_updates()
val: Scanning /content/dataset/valid/labels... 582 images, 0 backgrounds, 0 corrupt: 100%|██████████| 582/582 [00:01<00:00, 376.04it/s]
val: New cache created: /content/dataset/valid/labels.cache





Plotting labels to yolo11vespA/train1/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.001111, momentum=0.9) with parameter groups 81 weight(decay=0.0), 88 weight(decay=0.0005), 87 bias(decay=0.0)
TensorBoard: model graph visualization added ✅
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to yolo11vespA/train1
Starting training for 3 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        1/3         0G      1.775      5.782       1.95         29        640:  15%|█▌        | 2/13 [02:14<12:20, 67.34s/it]


KeyboardInterrupt: 

3. Validate model

The model path currently points to Google Drive; adjust it once the workflow below is debugged.

In [None]:
!yolo task=detect mode=val \
model="/content/gdrive/MyDrive/best.pt" \
    data="/content/gdrive/MyDrive/vespA/data.yaml"

Ultralytics 8.3.49 🚀 Python-3.10.12 torch-2.5.1+cu121 CPU (Intel Xeon 2.20GHz)
YOLO11s summary (fused): 238 layers, 9,414,735 parameters, 0 gradients, 21.3 GFLOPs
val: Scanning /content/dataset/valid/labels.cache... 582 images, 0 backgrounds, 0 corrupt: 100% 582/582 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95):  41% 15/37 [00:39<00:57,  2.64s/it]
Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/cfg/__init__.py", line 972, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/engine/model.py", line 639, in val
    validator(model=self.model)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.

4. Optional: Zip and download runs/train folder

TODO: parameters for folder path and file name



In [None]:
from google.colab import files

train_folder_path = "/content/yolo11vespA/train1"
download_file_name = "/content/train1.zip"
try:
  # Zipping the folder
  !zip -r {download_file_name} {train_folder_path}
  # Downloading the zipped file
  files.download(download_file_name)
except Exception as e:
  print(f"An error occurred: {e}")
  print("Click 'Runtime' -> 'Restart session' and try running the code again.")


  adding: content/yolo11vespA/train1/ (stored 0%)
  adding: content/yolo11vespA/train1/train_batch1.jpg (deflated 1%)
  adding: content/yolo11vespA/train1/weights/ (stored 0%)
  adding: content/yolo11vespA/train1/events.out.tfevents.1733931970.8f797b4d64d2.2840.0 (deflated 94%)
  adding: content/yolo11vespA/train1/args.yaml (deflated 53%)
  adding: content/yolo11vespA/train1/train_batch0.jpg (deflated 3%)
  adding: content/yolo11vespA/train1/labels.jpg (deflated 31%)
  adding: content/yolo11vespA/train1/labels_correlogram.jpg (deflated 43%)



4. Convert .pt file to ONNX via CLI

TODO: parameter for the location of best.pt
For now the path is set to Google Drive for debugging.

In [None]:
!yolo export model=/content/gdrive/MyDrive/best.pt format=onnx opset=9 # export custom trained model

Ultralytics 8.3.49 🚀 Python-3.10.12 torch-2.5.1+cu121 CPU (Intel Xeon 2.20GHz)
YOLO11s summary (fused): 238 layers, 9,414,735 parameters, 0 gradients, 21.3 GFLOPs

PyTorch: starting from '/content/gdrive/MyDrive/best.pt' with input shape (1, 3, 320, 320) BCHW and output shape(s) (1, 9, 2100) (18.2 MB)
requirements: Ultralytics requirements ['onnx>=1.12.0', 'onnxslim', 'onnxruntime'] not found, attempting AutoUpdate...
Collecting onnx>=1.12.0
  Downloading onnx-1.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (16 kB)
Collecting onnxslim
  Downloading onnxslim-0.1.43-py3-none-any.whl.metadata (4.2 kB)
Collecting onnxruntime
  Downloading onnxruntime-1.20.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (4.5 kB)
Collecting coloredlogs (from onnxruntime)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime)
  Downloading humanfriendly-10.0

OR

Convert the .pt file to ONNX via Python code

TODO: parameters
For now the checkpoint path is set to Google Drive for debugging.

In [None]:
import torch

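# Load the checkpoint on the CPU. weights_only=False is needed because the
# checkpoint stores the full model object; only load files you trust.
checkpoint = torch.load('/content/gdrive/MyDrive/best.pt', map_location='cpu', weights_only=False)
model = checkpoint['model']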

model = model.float()
model.eval()

# Dummy input in FP32
dummy_input = torch.randn(16, 3, 640, 640, dtype=torch.float)

# Export to ONNX
torch.onnx.export(
    model,
    dummy_input,
    "modified_run_3.onnx",
    export_params=True,
    opset_version=9,  # Adjust opset version if needed, changed from 11 to 9
    do_constant_folding=True,
    input_names=['input'],
    output_names=['output']
)
print("ONNX model exported successfully!")

  checkpoint = torch.load('/content/gdrive/MyDrive/best.pt')
  if self.format != "imx" and (self.dynamic or self.shape != shape):
  for i, stride in enumerate(strides):
ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode).
We recommend using opset 11 and above for models using this operator.


ONNX model exported successfully!


5. Verify the ONNX model

TODO: parameters for the ONNX model path and input shape

In [None]:
import onnx
import onnxruntime as ort
import torch

# Load the ONNX model
onnx_model = onnx.load("/content/gdrive/MyDrive/best_opset9.onnx")
onnx.checker.check_model(onnx_model)
print("ONNX model is valid!")

# Test the ONNX model with ONNX Runtime
dummy_input = torch.randn(1, 3, 320, 320).numpy() # change to 640 again after debugging
ort_session = ort.InferenceSession("/content/gdrive/MyDrive/best_opset9.onnx")
outputs = ort_session.run(None, {"images": dummy_input})
print(outputs[0])

ONNX model is valid!
[[[     6.0062      10.285      21.618 ...      214.13      252.06       282.9]
  [      6.665      5.3329      5.0171 ...      266.74      268.46      272.43]
  [     12.015      18.834      21.744 ...      254.57      206.63      188.73]
  ...
  [  0.0016357   0.0014248  0.00082025 ...  0.00015348  0.00014409  0.00031254]
  [  0.0002355  0.00030908  0.00029042 ...   0.0060495   0.0071386    0.007114]
  [ 0.00030759  0.00025287  0.00017902 ...  0.00024694  0.00030869  0.00037241]]]
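
The export log above reports an output shape of (1, 9, 2100) for a 320x320 input. That shape can be reproduced by hand, assuming a detection head with strides 8, 16 and 32 (the strides used in the NMS configuration later in this notebook) and one output row per box coordinate plus one per class. The class count of 5 comes from the training log ("Overriding model.yaml nc=80 with nc=5"); the function name is hypothetical.

```python
def expected_output_shape(imgsz, num_classes, strides=(8, 16, 32), batch=1):
    """Shape of the raw detection output: (batch, 4 box values + classes, grid cells)."""
    cells = sum((imgsz // s) ** 2 for s in strides)
    return (batch, 4 + num_classes, cells)

print(expected_output_shape(320, 5))  # (1, 9, 2100)
print(expected_output_shape(640, 5))  # (1, 9, 8400)
```

Comparing this with `outputs[0].shape` is a quick sanity check that the export used the image size and class count you expect.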


## 4. Hailo compilation

The Hailo compilation process involves three main steps:

### 1. Model Conversion:
- input: yolo.onnx
- output: .har

The YOLO11 model is parsed into a format that the Hailo toolchain can work with. This involves converting the model's structure and operations into a Hailo-specific representation.

### 2. Model Quantization:
- input: .har (Hailo model state, native weights) plus a calibration set
- output: quantized .har

The model's precision is reduced from high-precision floating-point numbers to lower-precision integers (usually 8-bit). This significantly reduces the model's size and memory footprint, making it more efficient for the Hailo device.
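
To make the idea concrete, here is a toy affine quantization of a few float values to 8-bit codes and back. It is an illustration of the principle only and not Hailo's actual algorithm, which works per layer and uses calibration statistics; the function names are hypothetical.

```python
def quantize_uint8(values):
    """Affine quantization of floats to 0..255; returns (codes, scale, zero_point)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0   # fall back to 1.0 for constant input
    zero_point = round(-lo / scale)
    codes = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point

def dequantize(codes, scale, zero_point):
    """Map the 8-bit codes back to approximate float values."""
    return [(c - zero_point) * scale for c in codes]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, zero_point = quantize_uint8(weights)
restored = dequantize(codes, scale, zero_point)
print(codes)
print([round(r, 3) for r in restored])
```

The restored values differ from the originals by at most about one quantization step (`scale`), which is the precision given up in exchange for the smaller representation.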

### 3. Model Compilation:

- input: quantized .har
- output: .hef

The quantized model is compiled into a specific binary format called HEF (Hailo Executable Format). This format is optimized for the Hailo device's architecture and allows for efficient execution of the model's operations.

To make this work in Colab, download the Dataflow Compiler (DFC) from the Hailo website, copy it to Google Drive, and install it in a virtual environment in Colab.

1. Make virtual environment in Colab

In [None]:
!sudo apt-get update
!sudo apt-get install -y python3-dev python3-distutils python3-tk libfuse2 graphviz libgraphviz-dev

# Will need a venv to install the DFC in
!pip install --upgrade pip virtualenv
!virtualenv my_env

Get:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,626 B]
Get:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease [1,581 B]
Get:3 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:5 https://

2. Download the Hailo Dataflow Compiler (Python 3.10) from https://hailo.ai/developer-zone/software-downloads/ (you need to create an account).

3. Then copy the .whl file to Google Drive.


In [None]:
#Installing the WHL file for Hailo DFC
!my_env/bin/pip install /content/gdrive/MyDrive/hailo_dataflow_compiler-3.29.0-py3-none-linux_x86_64.whl

# Making sure it's installed properly
!my_env/bin/hailo --version
!my_env/bin/hailo -h

Processing ./gdrive/MyDrive/hailo_dataflow_compiler-3.29.0-py3-none-linux_x86_64.whl
Collecting absl-py (from hailo-dataflow-compiler==3.29.0)
  Downloading absl_py-2.1.0-py3-none-any.whl.metadata (2.3 kB)
Collecting annotated-types==0.4.0 (from hailo-dataflow-compiler==3.29.0)
  Downloading annotated_types-0.4.0-py3-none-any.whl.metadata (13 kB)
Collecting argcomplete (from hailo-dataflow-compiler==3.29.0)
  Downloading argcomplete-3.5.2-py3-none-any.whl.metadata (16 kB)
Collecting contextlib2 (from hailo-dataflow-compiler==3.29.0)
  Downloading contextlib2-21.6.0-py2.py3-none-any.whl.metadata (4.1 kB)
Collecting future (from hailo-dataflow-compiler==3.29.0)
  Downloading future-1.0.0-py3-none-any.whl.metadata (4.0 kB)
Collecting jsonref (from hailo-dataflow-compiler==3.29.0)
  Downloading jsonref-1.1.0-py3-none-any.whl.metadata (2.7 kB)
Collecting jsonschema (from hailo-dataflow-compiler==3.29.0)
  Downloading jsonschema-4.23.0-py3-none-any.whl.metadata (7.9 kB)
Collecting matplotlib

### 1. Model conversion: ONNX to HAR

HAR (Hailo Archive) is a tar.gz archive that contains the representation of the graph structure and the weights. It includes an HN file (the model as a textual JSON file) and an NPZ file (the weights as NumPy arrays). The later DFC steps take this archive as input.

Set `chosen_hw_arch = "hailo8l"` for the Raspberry Pi AI Kit, which is built around the Hailo-8L; use `"hailo8"` for a Hailo-8 device.

Identifying the six end node names via www.netron.app

The YOLO end nodes are the nodes right before the post-processing operations at the very bottom of the model. There are 2 end nodes per feature map. I searched for `onnx::Reshape` to find the two `Conv` layers that feed each `onnx::Reshape`.

For this YOLO11 model, these are the end nodes (the layer index `model.23` can differ for other YOLO versions, so check it in Netron):
```
"/model.23/cv2.2/cv2.2.2/Conv",
"/model.23/cv3.2/cv3.2.2/Conv",
"/model.23/cv2.1/cv2.1.2/Conv",
"/model.23/cv3.1/cv3.1.2/Conv",
"/model.23/cv2.0/cv2.0.2/Conv",
"/model.23/cv3.0/cv3.0.2/Conv",
```
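
The six names follow a regular pattern: two `Conv` nodes per scale, one on the `cv2` branch (box regression) and one on the `cv3` branch (class scores). A small sketch can generate them, which avoids copy-paste mistakes. The helper name is hypothetical, and `head_index=23` is an assumption that matches the names listed above; confirm the names in Netron for your own export.

```python
def yolo_end_nodes(head_index=23, scales=(0, 1, 2)):
    """Build the two Conv end-node names (cv2 = boxes, cv3 = classes) for each scale."""
    return [
        f"/model.{head_index}/cv{branch}.{scale}/cv{branch}.{scale}.2/Conv"
        for scale in scales
        for branch in (2, 3)
    ]

for name in yolo_end_nodes():
    print(name)
```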


In [None]:
with open("translate_model.py", "w") as f:
    f.write("""

from hailo_sdk_client import ClientRunner

print("Starting model translation...")

# Define the ONNX model path and configuration
chosen_hw_arch = "hailo8l" # Replace with your hardware architecture
onnx_model_name = "best_opset9"
onnx_path = "/content/gdrive/MyDrive/best_opset9.onnx"  # Replace with your ONNX model path


# Initialize the ClientRunner
runner = ClientRunner(hw_arch=chosen_hw_arch)

# Use the recommended end node names for translation
end_node_names = [
  "/model.23/cv2.0/cv2.0.2/Conv",
  "/model.23/cv3.0/cv3.0.2/Conv",
  "/model.23/cv2.1/cv2.1.2/Conv",
  "/model.23/cv3.1/cv3.1.2/Conv",
  "/model.23/cv2.2/cv2.2.2/Conv",
  "/model.23/cv3.2/cv3.2.2/Conv",
]

try:
    # Translate the ONNX model to Hailo's format
    hn, npz = runner.translate_onnx_model(
        onnx_path,
        onnx_model_name,
        end_node_names=end_node_names,
        net_input_shapes={"images": [1, 3, 320, 320]},  # Adjust input shapes if needed
    )
    print("Model translation successful.")
except Exception as e:
    print(f"Error during model translation: {e}")
    raise

# Save the Hailo model HAR file
hailo_model_har_name = f"{onnx_model_name}_hailo_model.har"
try:
    runner.save_har(hailo_model_har_name)
    print(f"HAR file saved as: {hailo_model_har_name}")
except Exception as e:
    print(f"Error saving HAR file: {e}")


""")

In [None]:
# Run model in CLI
!my_env/bin/python translate_model.py

Starting model translation...
[info] Translation started on ONNX model best_opset9
[info] Restored ONNX model best_opset9 (completion time: 00:00:00.35)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:01.12)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv2.2/cv2.2.2/Conv /model.23/cv3.2/cv3.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'best_opset9/input_layer1'.
[info] End nodes mapped from original model: '/model.23/cv2.0/cv2.0.2/Conv', '/model.23/cv3.0/cv3.0.2/Conv', '/model.23/cv2.1/cv2.1.2/Conv', '/model.23/cv3.1/cv3.1.2/Conv', '/model.23/cv2.2/cv2.2.2/Conv', '/model.23/cv3.2/cv3.2.2/Conv'.
[info] Translation c

3. Optimize model

- input: HAR file in Hailo Model state (before optimization; with native weights)
- output: quantized HAR file with quantized weights

In [None]:
with open("har_model.py", "w") as f:

    f.write("""

from hailo_sdk_client import ClientRunner

# Load the HAR file
har_path = "/content/best_opset9_hailo_model.har"

runner = ClientRunner(har=har_path)

from pprint import pprint

try:
    # Access the HailoNet as an OrderedDict
    hn_dict = runner.get_hn()  # Or use runner._hn if get_hn() is unavailable
    print("Inspecting layers from HailoNet (OrderedDict):")

    # Pretty-print each layer
    for key, value in hn_dict.items():
        print(f"Key: {key}")
        pprint(value)
        print("\\n" + "="*80 + "\\n", flush=True)


except Exception as e:
    print(f"Error while inspecting hn_dict: {e}")

""")

In [None]:
# Run model in CLI
!my_env/bin/python har_model.py


Inspecting layers from HailoNet (OrderedDict):
Key: name
'best_opset9'


Key: net_params
OrderedDict([('version', '1.0'),
             ('stage', 'HN'),
             ('clusters_placement', [[]]),
             ('clusters_to_skip', []),
             ('output_layers_order',
              ['best_opset9/conv51',
               'best_opset9/conv54',
               'best_opset9/conv62',
               'best_opset9/conv65',
               'best_opset9/conv77',
               'best_opset9/conv80']),
             ('transposed_net', False),
             ('net_scopes', ['best_opset9'])])


Key: layers
OrderedDict([('best_opset9/input_layer1',
              OrderedDict([('type', 'input_layer'),
                           ('input', []),
                           ('output', ['best_opset9/conv1']),
                           ('input_shapes', [[-1, 320, 320, 3]]),
                           ('output_shapes', [[-1, 320, 320, 3]]),
                           ('original_names', ['images']),
              

Now you can scroll through the output to verify which layer corresponds to which end node in your ONNX model. In this dict, each layer is stored under a new name, and its original name is stored within the layer under the key `original_names`. You will need this when generating an NMS config file for your model; you can find example NMS configs here.
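
Instead of scrolling, the lookup can be done in code. The sketch below assumes the dict has the structure printed above (a `layers` entry whose values carry an `original_names` list); the helper name is hypothetical and the toy dict is illustrative, not the real output of `runner.get_hn()`.

```python
def map_original_names(hn_dict):
    """Return {original ONNX node name: Hailo layer name} from an HN-style dict."""
    mapping = {}
    for layer_name, layer in hn_dict.get("layers", {}).items():
        for original in layer.get("original_names", []):
            mapping[original] = layer_name
    return mapping

# Toy dict that mimics the printed structure; the conv51 entry is illustrative only.
toy_hn = {
    "layers": {
        "best_opset9/input_layer1": {"original_names": ["images"]},
        "best_opset9/conv51": {"original_names": ["/model.23/cv2.0/cv2.0.2/Conv"]},
    }
}
print(map_original_names(toy_hn))
```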

In [None]:
import json
import os
from google.colab import drive

# Mount Google Drive
drive.mount('/content/gdrive/', force_remount=True)

# Updated NMS layer configuration dictionary
nms_layer_config = {
    "nms_scores_th": 0.3,
    "nms_iou_th": 0.7,
    "image_dims": [640, 640],  # must match the input size the model was exported with
    "max_proposals_per_class": 25,
    "classes": 1,  # must match the number of classes (nc) of the trained model
    "regression_length": 16,
    "background_removal": False,
    "background_removal_index": 0,
    # One decoder per stride; reg_layer/cls_layer are taken from
    # 'output_layers_order' in the HailoNet printed above.
    "bbox_decoders": [
        {
            "name": "best_opset9/bbox_decoder51",
            "stride": 8,
            "reg_layer": "best_opset9/conv51",
            "cls_layer": "best_opset9/conv54"
        },
        {
            "name": "best_opset9/bbox_decoder62",
            "stride": 16,
            "reg_layer": "best_opset9/conv62",
            "cls_layer": "best_opset9/conv65"
        },
        {
            "name": "best_opset9/bbox_decoder77",
            "stride": 32,
            "reg_layer": "best_opset9/conv77",
            "cls_layer": "best_opset9/conv80"
        }
    ]
}

# Path to save the updated JSON configuration
output_dir = "/content/"
os.makedirs(output_dir, exist_ok=True)  # Create the directory if it doesn't exist
output_path = os.path.join(output_dir, "nms_layer_config.json")

# Save the updated configuration as a JSON file
with open(output_path, "w") as json_file:
    json.dump(nms_layer_config, json_file, indent=4)

print(f"NMS layer configuration saved to {output_path}")

Mounted at /content/gdrive/
NMS layer configuration saved to nms_layer_config.json
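
Layer names in the decoders are easy to mistype, so a small check against `output_layers_order` from the HailoNet can catch mistakes before the optimization step. This is a sketch with a hypothetical helper; it only flags names that do not literally appear in the list you pass in, and it makes no claim about which spellings the DFC itself accepts.

```python
def check_nms_config(config, output_layers):
    """Return a list of problems; an empty list means all layer names were found."""
    problems = []
    known = set(output_layers)
    for decoder in config["bbox_decoders"]:
        for key in ("reg_layer", "cls_layer"):
            if decoder[key] not in known:
                problems.append(
                    f"{decoder['name']}: {key} '{decoder[key]}' is not an output layer"
                )
    return problems

# Toy input: the cls_layer lacks the 'best_opset9/' scope prefix on purpose.
output_layers = ["best_opset9/conv51", "best_opset9/conv54"]
config = {"bbox_decoders": [{"name": "best_opset9/bbox_decoder51", "stride": 8,
                             "reg_layer": "best_opset9/conv51", "cls_layer": "conv54"}]}
print(check_nms_config(config, output_layers))
```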


After this, I made calibration data for the optimization step.

Model optimization for Hailo hardware

This step prepares the model for deployment on Hailo hardware. A key part of it is quantization: converting the model's parameters from floating-point numbers to integers, which is essential for efficient execution on Hailo hardware.

Timing and resource considerations:

- Timing: optimization runs after model translation, that is, after the model has been converted from its original framework (e.g. TensorFlow, PyTorch).
- Hardware requirements: a machine with a GPU is recommended for the optimization step.
- Calibration data: for the best results, prepare a calibration dataset of at least 1024 samples.

Calibration set:
- at least 1024 images
- annotated
- representative

Some conflicting information:

- not more than 64: https://community.hailo.ai/t/should-i-use-as-many-images-as-possible-for-calibration/155
- at least 1024: https://hailo.ai/developer-zone/documentation/dataflow-compiler-v3-29-0/?sp_referrer=sdk/model_optimization.html

No labels are needed. Calibration is a way to collect statistics on the data that the network is going to process, which allows a smarter choice of the quantization bins. The statistics are collected per layer.

Both sizes would probably work, but the set should be representative: if you have a real video of a scene, taking 100 consecutive frames would not be a good choice (https://community.hailo.ai/t/calibration-dataset-details/6967).
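
One simple way to avoid a run of near-identical consecutive frames is to draw the calibration images at random from the whole pool, with a fixed seed so the selection is reproducible. This is a sketch under that assumption; the helper name, the count of 64 and the file names are illustrative.

```python
import random

def pick_calibration_images(paths, count, seed=0):
    """Pick `count` paths at random (reproducibly) instead of taking the first ones."""
    paths = sorted(paths)
    if count >= len(paths):
        return paths
    return sorted(random.Random(seed).sample(paths, count))

frames = [f"frame_{i:04d}.jpg" for i in range(1000)]
subset = pick_calibration_images(frames, 64)
print(len(subset), subset[:3])
```

Random sampling does not guarantee coverage of every scene or class; if the dataset is unbalanced, sampling per class or per recording may be a better fit.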



Unzip the images from the .zip file



In [None]:
## 2. Unzip dataset.zip and rename the folder on google drive
from google.colab import drive

drive.mount('/content/gdrive')

import os

# Define Paths with Parameters
dataset_path = "/content/gdrive/MyDrive/vespA/100_1_dataset.zip"  # @param {type:"string"}
dataset_filename = "100_1_sampleSize"  # @param {type:"string"}

# Unzip the Dataset (using the defined path)
!unzip {dataset_path} -d '/content/'

# Rename the Extracted Folder
old_path = f'/content/{dataset_filename}'
new_path = '/content/dataset'
os.rename(old_path, new_path)

In [None]:
import numpy as np
from PIL import Image
import os
from google.colab import drive


# Paths to directories and files
image_dir = '/content/dataset/valid/images'
output_dir = '/content/output_dir'
os.makedirs(output_dir, exist_ok=True)  # Create the directory if it doesn't exist

# File paths for saving calibration data
calibration_data_path = os.path.join(output_dir, "calibration_data.npy")
processed_data_path = os.path.join(output_dir, "processed_calibration_data.npy")

# Initialize an empty list for calibration data
calib_data = []

# Process all image files in the directory
for img_name in os.listdir(image_dir):
    img_path = os.path.join(image_dir, img_name)
    if img_name.lower().endswith(('.jpg', '.jpeg', '.png')):
        img = Image.open(img_path).convert("RGB").resize((640, 640))  # Force 3 channels, then resize
        img_array = np.array(img) / 255.0  # Normalize to [0, 1]
        calib_data.append(img_array)

# Convert the calibration data to a NumPy array
calib_data = np.array(calib_data)

# Save the normalized calibration data
np.save(calibration_data_path, calib_data)
print(f"Normalized calibration dataset saved with shape: {calib_data.shape} to {calibration_data_path}")

# Scale the normalized data back to [0, 255]
processed_calibration_data = calib_data * 255.0

# Save the processed calibration data
np.save(processed_data_path, processed_calibration_data)
print(f"Processed calibration dataset saved with shape: {processed_calibration_data.shape} to {processed_data_path}")

Modified version because of excessive RAM usage (still to be improved further)

In [None]:
import numpy as np
from PIL import Image
import os

# Paths to directories and files
image_dir = '/content/dataset/valid/images'
output_dir = '/content/output_dir'
os.makedirs(output_dir, exist_ok=True)  # Create the directory if it doesn't exist

# File paths for saving processed data
calibration_data_path = os.path.join(output_dir, "calibration_data.npy")
processed_data_path = os.path.join(output_dir, "processed_calibration_data.npy")

# Process and save each image incrementally to avoid high memory usage
with open(calibration_data_path, 'wb') as calib_file, open(processed_data_path, 'wb') as processed_file:
    for img_name in os.listdir(image_dir):
        img_path = os.path.join(image_dir, img_name)

        if img_name.lower().endswith(('.jpg', '.jpeg', '.png')):
            # Resize and normalize the image
            img = Image.open(img_path).convert("RGB").resize((640, 640))  # Force 3 channels, then resize
            img_array = np.array(img) / 255.0  # Normalize to [0, 1]

            # Append the normalized data directly to the file
            np.save(calib_file, img_array, allow_pickle=False)
            print(f"Saved {img_name} normalized data to calibration file.")

            # Scale the normalized data back to [0, 255] and save incrementally
            processed_calibration_data = img_array * 255.0
            np.save(processed_file, processed_calibration_data, allow_pickle=False)
            print(f"Saved {img_name} processed calibration data to file.")

print("All images processed and saved.")

Saved train_vvel0359.jpg normalized data to calibration file.
Saved train_vvel0359.jpg processed calibration data to file.
Saved test_vcra0104.jpg normalized data to calibration file.
Saved test_vcra0104.jpg processed calibration data to file.
Saved train_vcra1859.jpg normalized data to calibration file.
Saved train_vcra1859.jpg processed calibration data to file.
Saved train_vvel0355.jpg normalized data to calibration file.
Saved train_vvel0355.jpg processed calibration data to file.
Saved valid_vzon0185.jpg normalized data to calibration file.
Saved valid_vzon0185.jpg processed calibration data to file.
Saved train_vcra0299.jpg normalized data to calibration file.
Saved train_vcra0299.jpg processed calibration data to file.
Saved train_vzon0930.jpg normalized data to calibration file.
Saved train_vzon0930.jpg processed calibration data to file.
Saved train_vvel0020.jpg normalized data to calibration file.
Saved train_vvel0020.jpg processed calibration data to file.
Saved train_vzon07
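
A caveat about the incremental version: calling `np.save` repeatedly on one file handle writes several separate arrays back to back, and `np.load` on that path returns only the first one. A possible alternative that keeps memory use low but still produces a single `(N, H, W, 3)` array is a memory-mapped `.npy` file. The sketch below assumes the frames are already resized HxWx3 arrays with values in 0..255; the helper name is hypothetical, and whether `runner.optimize` accepts the resulting file is not verified here.

```python
import numpy as np
from numpy.lib.format import open_memmap

def write_calibration_npy(frames, count, out_path, size=640):
    """Stream up to `count` HxWx3 frames into one (count, size, size, 3) float32 .npy file."""
    out = open_memmap(out_path, mode="w+", dtype=np.float32,
                      shape=(count, size, size, 3))
    for index, frame in zip(range(count), frames):
        out[index] = frame   # values are stored as float32, range unchanged
    out.flush()
    del out                  # close the memory map

# Example call (names are assumptions): frames could be a generator that yields
# np.array(Image.open(p).convert("RGB").resize((640, 640))) for each image path.
```

Only one frame is held in memory at a time, and `np.load(out_path)` later returns the full array in one piece.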

Now we are finally ready to optimize the model with the script below. You can find sample .alls files here; I used yolo10nms.json as a base to create my .alls file.

Note that `change_output_activation` is applied to the classification (cls) layers; you can verify these with Netron as described above.



In [None]:
import os
from hailo_sdk_client import ClientRunner

# Define your model's HAR file name
model_name = "best_opset9"
hailo_model_har_name = f"{model_name}_hailo_model.har"
hailo_model_har_name = "modified_run_3_renamed_hailo_model.har"

# Ensure the HAR file exists
assert os.path.isfile(hailo_model_har_name), "Please provide a valid path for the HAR file"

# Initialize the ClientRunner with the HAR file
runner = ClientRunner(har=hailo_model_har_name)

# Define the model script: add an on-chip normalization layer (mean 0, std 255)
# so the model accepts 0-255 pixel values, apply sigmoid to the cls layers
# (the cls_layer entries of the NMS config), and attach NMS post-processing.
alls = """
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv54, sigmoid)
change_output_activation(conv65, sigmoid)
change_output_activation(conv80, sigmoid)
nms_postprocess("/content/nms_layer_config.json", meta_arch=yolov8, engine=cpu)
performance_param(compiler_optimization_level=max)
"""

# Load the model script into the ClientRunner
runner.load_model_script(alls)

# Define a calibration dataset
# Replace 'calib_dataset' with the actual dataset you're using for calibration
# For example, if it's a directory of images, prepare the dataset accordingly
calib_dataset = "/content/output_dir/processed_calibration_data.npy"

# Perform optimization with the calibration dataset
runner.optimize(calib_dataset)

# Save the optimized model to a new Quantized HAR file
quantized_model_har_path = f"{model_name}_quantized_model.har"
runner.save_har(quantized_model_har_path)

print(f"Quantized HAR file saved to: {quantized_model_har_path}")
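Optimization takes a while, so it can be worth checking the calibration array before passing it to `runner.optimize`. The sketch below is only a sanity check under stated assumptions: it assumes the calibration data is a 4-dimensional array in (N, H, W, C) order with 3 channels and a 640x640 input size. Adjust `expected_hw` to your model. `check_calibration_array` is a hypothetical helper, not part of the Hailo SDK.

```python
import numpy as np


def check_calibration_array(calib, expected_hw=(640, 640)):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if calib.ndim != 4:
        problems.append(f"expected 4 dimensions (N, H, W, C), got shape {calib.shape}")
        return problems
    n, h, w, c = calib.shape
    if (h, w) != tuple(expected_hw):
        problems.append(f"expected height/width {tuple(expected_hw)}, got {(h, w)}")
    if c != 3:
        problems.append(f"expected 3 channels in the last axis, got {c}")
    if not np.isfinite(calib).all():
        problems.append("array contains NaN or infinite values")
    return problems


# Synthetic stand-in for the real calibration file (assumption: 2 RGB images)
demo = np.random.randint(0, 256, size=(2, 640, 640, 3)).astype(np.float32)
print(check_calibration_array(demo))
```

To check the real file, load it with `np.load("/content/output_dir/processed_calibration_data.npy")` and pass the result to the function.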

In [None]:
with open("optimize_model.py", "w") as f:

    f.write("""

import os
from hailo_sdk_client import ClientRunner

# Define your model's HAR file name
model_name = "best_opset9"
hailo_model_har_name = f"{model_name}_hailo_model.har"
hailo_model_har_name = "best_opset9_hailo_model.har"

# Ensure the HAR file exists
assert os.path.isfile(hailo_model_har_name), "Please provide a valid path for the HAR file"

# Initialize the ClientRunner with the HAR file
runner = ClientRunner(har=hailo_model_har_name)

# Define the model script to add a normalization layer
# Normalization for [0, 1] range
alls = \"\"\"
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv54, sigmoid)
change_output_activation(conv62, sigmoid)
change_output_activation(conv80, sigmoid)
nms_postprocess("/content/nms_layer_config.json", meta_arch=yolov8, engine=cpu)
performance_param(compiler_optimization_level=max)
\"\"\"

# Load the model script into the ClientRunner
runner.load_model_script(alls)

# Define a calibration dataset
# Replace 'calib_dataset' with the actual dataset you're using for calibration
# For example, if it's a directory of images, prepare the dataset accordingly
calib_dataset = "/content/output_dir/processed_calibration_data.npy"

# Perform optimization with the calibration dataset
runner.optimize(calib_dataset)

# Save the optimized model to a new Quantized HAR file
quantized_model_har_path = f"{model_name}_quantized_model.har"
runner.save_har(quantized_model_har_path)

print(f"Quantized HAR file saved to: {quantized_model_har_path}")

""")

Now run it

In [None]:
!my_env/bin/python optimize_model.py

[info] ParsedPerformanceParam command, setting optimization_level(max=2)
[info] Loading model script commands to best_opset9 from string
[info] ParsedPerformanceParam command, setting optimization_level(max=2)
[info] The activation function of layer best_opset9/conv42 was replaced by a Sigmoid
[info] The activation function of layer best_opset9/conv65 was replaced by a Sigmoid
Traceback (most recent call last):
  File "/content/optimize_model.py", line 37, in <module>
    runner.optimize(calib_dataset)
  File "/content/my_env/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/content/my_env/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 2093, in optimize
    self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
  File "/content/my_env/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wra

Compiling model

In [None]:
from hailo_sdk_client import ClientRunner

# Define the quantized model HAR file
model_name = "modified_run_3_renamed"
quantized_model_har_path = f"{model_name}_quantized_model.har"

# Initialize the ClientRunner with the HAR file
runner = ClientRunner(har=quantized_model_har_path)
print("[info] ClientRunner initialized successfully.")

# Compile the model
try:
    hef = runner.compile()
    print("[info] Compilation completed successfully.")
except Exception as e:
    print(f"[error] Failed to compile the model: {e}")
    raise
file_name = f"{model_name}.hef"
with open(file_name, "wb") as f:
    f.write(hef)
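The next cell runs `compile_model.py` with the interpreter from `my_env`, so that file has to exist on disk first. A minimal sketch that writes it, mirroring how `optimize_model.py` was created earlier: the script body is the compile cell above, and the `ast.parse` call only checks the syntax, so it does not need `hailo_sdk_client` to be importable in the notebook kernel.

```python
import ast
from pathlib import Path

# Same code as the compile cell above, stored as a standalone script
COMPILE_SCRIPT = '''\
from hailo_sdk_client import ClientRunner

# Define the quantized model HAR file
model_name = "modified_run_3_renamed"
quantized_model_har_path = f"{model_name}_quantized_model.har"

# Initialize the ClientRunner with the HAR file
runner = ClientRunner(har=quantized_model_har_path)
print("[info] ClientRunner initialized successfully.")

# Compile the model
try:
    hef = runner.compile()
    print("[info] Compilation completed successfully.")
except Exception as e:
    print(f"[error] Failed to compile the model: {e}")
    raise

# Write the compiled binary to disk
file_name = f"{model_name}.hef"
with open(file_name, "wb") as f:
    f.write(hef)
'''

ast.parse(COMPILE_SCRIPT)  # syntax check only; nothing is executed
Path("compile_model.py").write_text(COMPILE_SCRIPT)
print("Wrote compile_model.py")
```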

Now run it

In [None]:
!my_env/bin/python compile_model.py


Zip and download the results to your local computer
Data is lost when you close the page!

If an error occurs:
Click Runtime in the toolbar at the top
Select Restart session / Sessie opnieuw starten from the drop-down menu
Run this code block again

In [None]:
# DON'T FORGET TO DOWNLOAD THE RESULTS!!!

# Data is lost when closing the page!!!

# IN MOST CASES AN ERROR OCCURS:
# Click `Runtime`, select `Restart session` / `Sessie opnieuw opstarten`
# then run this codeblock again

from google.colab import files

try:
  !zip -r /content/runs.zip /content/runs
  files.download('/content/runs.zip')
except Exception as e:
  print(f"An error occurred: {e}")
  print("Click 'Runtime' -> 'Restart session' and try running the code again.")
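The cell above archives `/content/runs`. The HAR and HEF files from this section are saved with relative paths, so they probably sit in the working directory rather than in `runs`. The sketch below bundles them into a separate archive, assuming that is where they were written. `zip_artifacts` is a hypothetical helper, and the demo uses a temporary directory with empty placeholder files so it runs anywhere.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_artifacts(directory, archive_path, suffixes=(".har", ".hef")):
    """Zip every file in `directory` whose suffix is in `suffixes`; return the names added."""
    directory = Path(directory)
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(directory.iterdir()):
            if path.is_file() and path.suffix in suffixes:
                zf.write(path, arcname=path.name)  # store without the folder prefix
                added.append(path.name)
    return added


# Demo with placeholder files (assumption: real files live in one folder)
with tempfile.TemporaryDirectory() as tmp:
    for name in ("model.har", "model.hef", "notes.txt"):
        (Path(tmp) / name).write_bytes(b"")
    demo_added = zip_artifacts(tmp, Path(tmp) / "hailo_artifacts.zip")
    print(demo_added)
```

In Colab, call `zip_artifacts("/content", "/content/hailo_artifacts.zip")` and then download the archive with `files.download("/content/hailo_artifacts.zip")`.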