<a href="https://colab.research.google.com/github/marcory-hub/hailo-colab/blob/main/dataset_yolo_onnx.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# From Dataset to YOLO11 ONNX file

1. **Prepare Files:** Download a dataset in YOLO format or create your own.
2. **Zip and Upload:** Zip your dataset and upload it to your Google Drive.
3. **Colab Access:** Unzip the dataset in your Colab notebook environment.
4. **Train YOLO11:** Use the provided code to train a YOLO11 model.
5. **Convert to ONNX:** Convert the trained model `best.pt` to ONNX format `best.onnx` for Hailo's Dataflow Compiler.
6. **Save model:** Zip the training folder, which contains `best.pt` and `best.onnx` in its `weights` subfolder.

### Before you start

1. **Check online** for the most recent information (for example on GitHub) before you install Ultralytics. A miner was injected into ultralytics 8.3.41, 8.3.42, 8.3.45, and 8.3.46.
2. **Select runtime type:** Select a runtime type with a GPU (T4 (free) or A100). A GPU is needed to train the YOLO model.

## 1. Prepare Files

To train a YOLO11 model with your data, you need images and labels in the correct folder structure and a YAML file.
1. **Dataset:** a collection of images and their corresponding labels. Labels tell the model what objects are in the image and where they are located. You can either download an existing dataset in YOLO format from online sources like the [hornet3000+ dataset](https://www.kaggle.com/datasets/marcoryvandijk/vespa-velutina-v-crabro-vespulina-vulgaris) or create your own dataset using tools like CVAT or Roboflow to annotate images with bounding boxes around the objects you want the model to detect.

2. **data.yaml:** a configuration file that tells the training script where to find your dataset and what it contains.
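
To make the label format concrete: in the YOLO detection format, each image has a text file with one line per object, `class x_center y_center width height`, where the four coordinates are normalized to the range 0 to 1. The sketch below parses and range-checks a single line. The names `YoloBox` and `parse_label_line` are hypothetical helpers for illustration, not part of Ultralytics.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    """One object in a YOLO label file (hypothetical helper type)."""
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_label_line(line: str, num_classes: int) -> YoloBox:
    """Parse one label line and check that its values are in range."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    coords = [float(p) for p in parts[1:]]
    if not 0 <= class_id < num_classes:
        raise ValueError(f"class id {class_id} is outside 0..{num_classes - 1}")
    if any(not 0.0 <= c <= 1.0 for c in coords):
        raise ValueError(f"coordinates must be normalized to [0, 1]: {line!r}")
    return YoloBox(class_id, *coords)


# Class 0 (Vespa_velutina in this notebook), centered, 20% wide and 30% high
box = parse_label_line("0 0.5 0.5 0.2 0.3", num_classes=3)
print(box)
```

Running a check like this over your label files before training can catch class numbers that do not match `nc` in data.yaml.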

### Steps to prepare your files

1. Organize your dataset:
  - If you created your own dataset, make sure it follows this structure. Use these exact folder names (only the jpg and txt files in these folders have custom names).
    ```
    dataset\
      train\
        images\
        labels\
      valid\
        images\
        labels\
      test\ (optional)
        images\
        labels\
    ```

2. Create data.yaml:
  - The example below shows how to fill it out based on your dataset location. Adjust the number of classes (nc) and the class names (names) to match your dataset, listing the names in the order of the class numbers (so 0 is Vespa_velutina in this example).

    ```
    train: /content/dataset/train # path to train images
    val: /content/dataset/valid # path to val images
    nc: 3
    names: ['Vespa_velutina', 'Vespa_crabro', 'Vespula_vulgaris']
    ```
  - Save this file as `data.yaml` in the root of your dataset folder.
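
Before zipping, it can help to confirm that the folder layout matches the structure above. The following is a minimal sketch using only the Python standard library. `find_dataset_problems` and `IMAGE_SUFFIXES` are hypothetical names, and the list of image extensions is an assumption you may need to extend.

```python
from pathlib import Path

# Assumption: extend this set if your dataset uses other image formats
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_dataset_problems(root: str, splits=("train", "valid")) -> list[str]:
    """Return a list of human-readable problems found in the dataset folder."""
    problems = []
    root_path = Path(root)
    if not (root_path / "data.yaml").is_file():
        problems.append("missing data.yaml")
    for split in splits:
        images_dir = root_path / split / "images"
        labels_dir = root_path / split / "labels"
        if not images_dir.is_dir() or not labels_dir.is_dir():
            problems.append(f"{split}: missing images/ or labels/ folder")
            continue
        # Report every image that has no label file with the same stem
        for image in sorted(images_dir.iterdir()):
            if image.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            if not (labels_dir / f"{image.stem}.txt").is_file():
                problems.append(f"{split}: no label file for {image.name}")
    return problems


# Usage (the path is an example): print(find_dataset_problems("dataset"))
```

An image without a label file is not necessarily an error, since it may be an intentional background image, so treat the output as a list of things to review.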

## 2. Zip and Upload


1. Zip your dataset folder:
  - This step speeds up uploading your dataset to Google Drive. On macOS, use `ditto -c -k --norsrc --keepParent dataset dataset.zip` (where `dataset` is your dataset folder) to exclude Finder metadata files from the zipped file.
2. Upload the zipped folder to your Google Drive.
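
If you prefer a platform-independent alternative to the `ditto` command, the zip can also be created with Python's `zipfile` module. This is a sketch under assumptions: `zip_dataset` and `SKIP_NAMES` are hypothetical names, and the skip rules only cover common macOS and Windows metadata files.

```python
import zipfile
from pathlib import Path

# Assumption: metadata files that should not end up in the archive
SKIP_NAMES = {".DS_Store", "Thumbs.db"}


def zip_dataset(folder: str, zip_path: str) -> int:
    """Zip `folder`, keeping it as the top-level entry; return the file count.

    Write the zip file outside `folder`, otherwise it would try to add itself.
    """
    folder_path = Path(folder).resolve()
    count = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder_path.rglob("*")):
            if not path.is_file():
                continue
            # Skip Finder/Explorer metadata and AppleDouble "._" files
            if path.name in SKIP_NAMES or path.name.startswith("._"):
                continue
            arcname = path.relative_to(folder_path.parent).as_posix()
            archive.write(path, arcname=arcname)
            count += 1
    return count


# Usage (paths are examples): zip_dataset("dataset", "dataset.zip")
```

Keeping the folder itself as the top-level entry mirrors `--keepParent`, so unzipping in Colab yields a single folder that can be renamed.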


## 3. Colab Access

To access the dataset files needed to (re-)train the YOLO11 model, run the following code blocks:

1. Mount Google Drive.
2. Unzip the dataset in the Colab folder.
3. Rename the extracted folder to `dataset` (this exact name is needed so the YOLO training code blocks can access the data).
4. Optional: check the presence of data.yaml and the folder structure.

1. Mount google drive.

In [None]:
# Mount google drive
from google.colab import drive

drive.mount('/content/gdrive')

2. Unzip the dataset.

  Adjust the value in the form field on the right side of the cell if needed.

- **dataset_path:** The path to your zipped dataset file on your Google Drive. With the mount point used above, it starts with "/content/gdrive/MyDrive/" followed by the path to your file. To copy it, click the folder icon in the left sidebar, navigate to "gdrive" -> "MyDrive", right-click the zip file, and copy the path.


In [None]:
# Unzip the dataset

# Set the path to the zipped dataset
dataset_path = "/content/gdrive/MyDrive/hailo/vespA17000.zip"  # @param {type:"string"}

# Unzip the dataset into /content/ (the quotes keep paths with spaces intact)
!unzip "{dataset_path}" -d '/content/'


3. Rename the folder with the dataset.

- **dataset_foldername:** The exact name of the folder that was extracted from the zip file, without a file extension (e.g., dataset_cats). The cell below renames this folder to `dataset` so the training cells can find the data.

In [None]:
# Rename the extracted folder
import os

# Set the name of the extracted dataset folder
dataset_foldername = "vespA_combined2024"  # @param {type:"string"}

# Rename the extracted folder to dataset
old_path = f'/content/{dataset_foldername}'
new_path = '/content/dataset'
os.rename(old_path, new_path)
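
As an alternative to typing the folder name by hand, the unzip and rename steps can be combined in Python by reading the top-level folder name from the archive. This is a sketch, assuming the zip contains exactly one top-level folder. `extract_and_rename` is a hypothetical helper name.

```python
import os
import zipfile


def extract_and_rename(zip_path: str, extract_to: str, target_name: str = "dataset") -> str:
    """Extract the archive and rename its single top-level folder to `target_name`."""
    with zipfile.ZipFile(zip_path) as archive:
        # Collect top-level entries, ignoring the __MACOSX folder that Finder adds
        top_level = {
            name.split("/")[0]
            for name in archive.namelist()
            if not name.startswith("__MACOSX")
        }
        if len(top_level) != 1:
            raise ValueError(f"expected one top-level folder, found {sorted(top_level)}")
        archive.extractall(extract_to)
    old_path = os.path.join(extract_to, top_level.pop())
    new_path = os.path.join(extract_to, target_name)
    if old_path != new_path:
        # Fails if new_path already exists and is not empty
        os.rename(old_path, new_path)
    return new_path


# Usage (paths are examples):
# extract_and_rename("/content/gdrive/MyDrive/hailo/vespA17000.zip", "/content")
```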

4. Optional: Check if the path is correct

  The output should be:
  ```
  data.yaml train valid
  images labels
  images labels
  ```


In [None]:
# Optional: check the dataset folder structure and data.yaml

!ls '/content/dataset/'
!ls '/content/dataset/train/'
!ls '/content/dataset/valid/'


## 4. Train YOLO11

To train the YOLO11 model, you need the dataset in the structure described in step 1.

Alternatively, you can train or download a model from Ultralytics Hub and convert its `best.pt` to `best.onnx` in step 5.


1. Install ultralytics and wandb (Weights & Biases). The wandb install is optional: if you do not want to monitor your training metrics, you can remove the wandb lines of code.
2. Optional: Track your YOLO training with Weights & Biases.
3. Train your YOLO11 model.



1. Install ultralytics package and wandb

In [None]:
#Installing ultralytics and wandb
!pip install -U ultralytics wandb

2. Optional: Track with Weights and Biases

  Ultralytics YOLO11 integrates with Weights & Biases for enhanced experiment tracking, model checkpointing, and visualization of model performance. For more information, see [Weights & Biases](https://docs.ultralytics.com/integrations/weights-biases/).
  1. Create an account and a project in [W&B](https://wandb.ai).
  2. Store your API key in the Colab secrets manager: in your Colab notebook, click the key icon in the left sidebar.
  3. Add a secret with the name WANDB_API_KEY and paste your W&B API key as its value. You can find your API key [here](https://wandb.ai/authorize).

In [None]:
import wandb
from ultralytics import YOLO
from google.colab import userdata

# Retrieve the API key from Colab secrets
wandb_api_key = userdata.get('WANDB_API_KEY')

# Initialize your Weights & Biases environment
wandb.login(key=wandb_api_key)

3. Train your YOLO11 model.

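Apart from the arguments set in the cell below, training uses the default hyperparameter values. More information can be found in the [YOLO11 documentation](https://docs.ultralytics.com/models/yolo11/#performance-metrics). If you skipped step 2, remove the W&B lines from this cell.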

In [None]:
import wandb
from ultralytics import YOLO
from google.colab import userdata

# Retrieve the API key from Colab secrets
wandb_api_key = userdata.get('WANDB_API_KEY')

# Initialize your Weights & Biases environment
wandb.login(key=wandb_api_key)

# Load a YOLO model
model = YOLO("yolo11n.pt")

# Train and Fine-Tune the Model
model.train(data="/content/dataset/data.yaml",
            epochs=100,
            imgsz=640,
            batch=16,
            patience=20,
            project="ultralytics",
            name="yolo11n"
            )

## 5. Convert .pt file to ONNX

1. Export to [ONNX](https://onnx.ai/onnx/intro/) format (Open Neural Network Exchange). This is an input format that Hailo's Dataflow Compiler can handle.
- Changing the opset is similar to upgrading a library. ONNX and ONNX runtimes must remain backward compatible. By default, the export uses the latest opset version.
- CHECK: change model_path to /content/ultralytics/...

In [None]:
# Define the path to the trained model
model_path = "/content/ultralytics/yolo11n/weights/best.pt"  #@param {type:"string"}

# Import libraries
from ultralytics import YOLO

# Load the trained YOLO11 model
model = YOLO(model_path)

# Export the model to ONNX format
model.export(format="onnx")  # creates 'best.onnx' next to 'best.pt'

2. Verify ONNX model validity. Expected output (example):

```
  ONNX model is valid!
  [[[5.85030460e+00 1.06851845e+01 1.97524452e+01 ... 2.14798126e+02
   2.52929199e+02 2.87759094e+02]
  ...
  [2.34663486e-04 1.97619200e-04 1.53064728e-04 ... 2.67028809e-04
   3.86625528e-04 4.52518463e-04]]]
   ```

In [None]:
# Check the .onnx file

import onnx
import onnxruntime as ort
import torch

# Set path to .onnx file
onnx_model_path = '/content/ultralytics/yolo11n/weights/best.onnx'  # Path to your ONNX file

# Load the ONNX model
onnx_model = onnx.load(onnx_model_path)  # Load the ONNX model
onnx.checker.check_model(onnx_model)  # Validate the model
print("ONNX model is valid!")

# Test the ONNX model with ONNX Runtime
dummy_input = torch.randn(1, 3, 640, 640).numpy()  # Adjust size to match model input (check Netron)
ort_session = ort.InferenceSession(onnx_model_path)  # Pass file path instead of the loaded model
outputs = ort_session.run(None, {"images": dummy_input})  # Match input name to ONNX model
print(outputs[0])
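
Besides printing the values, you can compare the shape of `outputs[0]` with the shape you expect. The sketch below computes that shape under stated assumptions: a standard detection head with strides 8, 16, and 32, a batch size of 1, and an export without embedded NMS. `expected_output_shape` is a hypothetical helper name.

```python
def expected_output_shape(num_classes: int, imgsz: int = 640, strides=(8, 16, 32)) -> tuple:
    """Return (batch, 4 box values + class scores, number of predictions)."""
    # Each stride contributes one prediction per cell of its feature map
    num_predictions = sum((imgsz // stride) ** 2 for stride in strides)
    return (1, 4 + num_classes, num_predictions)


# Three classes at 640x640: 80*80 + 40*40 + 20*20 = 8400 predictions
print(expected_output_shape(num_classes=3))  # (1, 7, 8400)
```

If `outputs[0].shape` differs from this, check the input size and the number of classes in data.yaml first.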



## 6. Zip the folder

Zip the folder with the YOLO training session(s).

If an **error** occurs, select Runtime -> Restart session and run this code block again.

This code saves the zipped file to Google Drive and downloads it to your local computer.


In [None]:
from google.colab import drive
from google.colab import files

import os
import shutil

# Set foldername and filename
train_folder_path = "/content/ultralytics/" #@param {type:"string"}
download_file_name = "/content/yolo11n.zip" #@param {type:"string"}

# mount google drive
drive.mount('/content/gdrive')

# zip the ultralytics folder with training result(s) folders
try:
  # Zipping the folder
  !zip -r {download_file_name} {train_folder_path}
  # Downloading the zipped file
  files.download(download_file_name)
except Exception as e:
  print(f"An error occurred: {e}")
  print("Click 'Runtime' -> 'Restart session' and try running the code again.")

# Destination directory in Google Drive (must match the mount point above)
drive_path = '/content/gdrive/MyDrive'
dest_dir = os.path.join(drive_path, 'ultralytics')

# Create the destination directory if it doesn't exist
os.makedirs(dest_dir, exist_ok=True)

# Copy the zipped file to the destination directory
shutil.copy(download_file_name, dest_dir)
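
The zip and copy steps can also be done without the `!zip` shell command, which makes errors surface as ordinary Python exceptions. This is a minimal sketch using only the standard library. `archive_and_copy` is a hypothetical helper name, and the paths in the usage comment are the ones used in this notebook.

```python
import os
import shutil


def archive_and_copy(train_folder: str, archive_base: str, dest_dir: str) -> str:
    """Zip `train_folder` to `archive_base`.zip and copy the zip into `dest_dir`."""
    folder = os.path.abspath(train_folder)  # also drops a trailing slash
    zip_path = shutil.make_archive(
        archive_base,
        "zip",
        root_dir=os.path.dirname(folder),    # archive paths are relative to this
        base_dir=os.path.basename(folder),   # keep the folder as top-level entry
    )
    os.makedirs(dest_dir, exist_ok=True)
    return shutil.copy(zip_path, dest_dir)


# Usage with the paths from this notebook (archive_base has no .zip extension):
# archive_and_copy("/content/ultralytics/", "/content/yolo11n",
#                  "/content/gdrive/MyDrive/ultralytics")
```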