<img src="https://raw.githubusercontent.com/maxsitt/insect-detect-docs/main/docs/assets/logo.png" width="500">

# YOLOv8 detection model training for deployment on Luxonis OAK

[![DOI](https://zenodo.org/badge/580963598.svg)](https://zenodo.org/badge/latestdoi/580963598)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://choosealicense.com/licenses/agpl-3.0/)

Author: &nbsp; Maximilian Sittinger &nbsp;
[<img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" width="24">](https://github.com/maxsitt) &nbsp;
[<img src="https://upload.wikimedia.org/wikipedia/commons/0/06/ORCID_iD.svg" width="24">](https://orcid.org/0000-0002-4096-8556)

- [**Insect Detect Docs**](https://maxsitt.github.io/insect-detect-docs/) 📑
- [`insect-detect-ml`](https://github.com/maxsitt/insect-detect-ml) GitHub repo

&nbsp;

**Train a [YOLOv8](https://github.com/ultralytics/ultralytics) object detection model on your own custom dataset!**

- Go to **File** in the top menu bar and choose **Save a copy in Drive** before running the notebook.
- Go to **Runtime** and make sure that **GPU** is selected as Hardware accelerator under **Change runtime type**.
- If you are using Firefox, please make sure to allow notifications for this website.
- Importing your dataset from [Roboflow](https://roboflow.com/) is recommended, but not required.
> Otherwise, choose the option [`Upload dataset from Google Drive`](#scrollTo=RxOnnOadc5vR).
- Connecting to Google Drive is recommended, but not required.
> Otherwise, choose the options [`Upload dataset from your local file system`](#scrollTo=qKTCWdtkOUw7) (slower!) and [`Download results`](#scrollTo=h90_4rFQx0mp).

&nbsp;

---

**References**

1. Official YOLOv8 tutorial notebook by Ultralytics &nbsp;
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb)
2. Roboflow tutorial notebook for YOLOv8 training &nbsp;
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-yolov8-object-detection-on-custom-dataset.ipynb)
3. DepthAI tutorial notebook for YOLOv8 training &emsp;
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/luxonis/depthai-ml-training/blob/master/colab-notebooks/YoloV8_training.ipynb)

# Initialization

## Show GPU + CPU and Linux distribution

In [None]:
!nvidia-smi -L
print("\nCPU:")
!grep "model name" /proc/cpuinfo
print("\nLinux distribution:")
!grep "PRETTY_NAME" /etc/os-release

## YOLOv8 setup

In [None]:
%pip install -q ultralytics

import ultralytics
ultralytics.checks()

## Upload dataset from Roboflow

If you are not sure how to export your annotated dataset, check the [Roboflow docs](https://docs.roboflow.com/exporting-data).

> Alternatively, you can upload your dataset ([YOLOv8 format](https://roboflow.com/formats/yolov8-pytorch-txt)) from [**Google Drive**](#scrollTo=RxOnnOadc5vR) or from your [**local file system**](#scrollTo=qKTCWdtkOUw7) in the next steps.

In [None]:
%pip install -q roboflow

**Copy only the last three lines of your Download Code and insert them in the next code cell:**

In [None]:
from pathlib import Path
from roboflow import Roboflow

%cd /content

### Paste your Download Code here:
rf = Roboflow(api_key="XXXXXXXXXXXXXXXXXXXX")
project = rf.workspace("maximilian-sittinger").project("insect_detect_detection")
dataset = project.version(7).download("yolov8")
###

dataset_location = dataset.location

print(f"\nLocation of dataset: {dataset_location}")
print(f"\nTotal number of images: {len(list(Path(dataset_location).glob('**/*.jpg')))}")

if Path(f"{dataset_location}/train/images").exists():
  print(f"\nNumber of training images: {len(list(Path(f'{dataset_location}/train/images').glob('*.jpg')))}")
if Path(f"{dataset_location}/valid/images").exists():
  print(f"Number of validation images: {len(list(Path(f'{dataset_location}/valid/images').glob('*.jpg')))}")
if Path(f"{dataset_location}/test/images").exists():
  print(f"Number of test images: {len(list(Path(f'{dataset_location}/test/images').glob('*.jpg')))}")
print("\nContent of data.yaml file:")
%cat {dataset_location}/data.yaml

## Recommended: Connect to Google Drive

In [None]:
from google.colab import drive
drive.mount("/content/drive")

In [None]:
#@title ## Upload dataset from Google Drive {display-mode: "form"}

#@markdown ### Google Drive path to your (zipped) dataset folder:
dataset_path = "/content/drive/MyDrive/yolov8_dataset.zip" #@param {type: "string"}
#@markdown - Please make sure to compress your dataset folder into a **.zip** file for a much faster upload!
#@markdown - Dataset has to be in [YOLOv8 format](https://roboflow.com/formats/yolov8-pytorch-txt).

from pathlib import Path

dataset_location = f"/content/{Path(dataset_path).stem}"

print("Uploading dataset from Google Drive...\n")
!rsync -ah --info=progress2 --no-i-r {dataset_path} /content
if Path(dataset_path).suffix == ".zip":
  import zipfile
  zip_path = f"/content/{Path(dataset_path).stem}.zip"
  if len(list(zipfile.Path(zip_path).iterdir())) > 1:
    !unzip -uq {zip_path} -d {dataset_location}
  else:
    !unzip -uq {zip_path} -d /content
  %rm {zip_path}
print("\nDataset was successfully uploaded!")

print(f"\nLocation of dataset: {dataset_location}")
print(f"\nTotal number of images: {len(list(Path(dataset_location).glob('**/*.jpg')))}")

if Path(f"{dataset_location}/train/images").exists():
  print(f"\nNumber of training images: {len(list(Path(f'{dataset_location}/train/images').glob('*.jpg')))}")
if Path(f"{dataset_location}/valid/images").exists():
  print(f"Number of validation images: {len(list(Path(f'{dataset_location}/valid/images').glob('*.jpg')))}")
if Path(f"{dataset_location}/test/images").exists():
  print(f"Number of test images: {len(list(Path(f'{dataset_location}/test/images').glob('*.jpg')))}")
print("\nContent of data.yaml file:")
%cat {dataset_location}/data.yaml

In [None]:
#@title ## Upload dataset from your local file system {display-mode: "form"}

#@markdown ### Name of your zipped dataset folder:
dataset_name = "yolov8_dataset" #@param {type: "string"}
#@markdown - Please make sure to compress your dataset folder into a **.zip** file before uploading!
#@markdown - The .zip file must have the same name as the dataset folder.
#@markdown - Dataset has to be in [YOLOv8 format](https://roboflow.com/formats/yolov8-pytorch-txt).

from pathlib import Path
import zipfile
from google.colab import files

dataset_location = f"/content/{dataset_name}"

uploaded = files.upload()

if len(list(zipfile.Path(f"{dataset_name}.zip").iterdir())) > 1:
  !unzip -uq {dataset_name}.zip -d {dataset_location}
else:
  !unzip -uq {dataset_name}.zip -d /content
%rm {dataset_name}.zip

print(f"\nLocation of dataset: {dataset_location}")
print(f"\nTotal number of images: {len(list(Path(dataset_location).glob('**/*.jpg')))}")

if Path(f"{dataset_location}/train/images").exists():
  print(f"\nNumber of training images: {len(list(Path(f'{dataset_location}/train/images').glob('*.jpg')))}")
if Path(f"{dataset_location}/valid/images").exists():
  print(f"Number of validation images: {len(list(Path(f'{dataset_location}/valid/images').glob('*.jpg')))}")
if Path(f"{dataset_location}/test/images").exists():
  print(f"Number of test images: {len(list(Path(f'{dataset_location}/test/images').glob('*.jpg')))}")
print("\nContent of data.yaml file:")
%cat {dataset_location}/data.yaml

## Edit `data.yaml`

Check the `data.yaml` file in your dataset folder to make sure the paths to the train, valid and test folders are correct.

- Open your dataset folder in the File Explorer (Folder symbol on the left side bar).
- Double-click on the `data.yaml` file to open it in the editor on the right.

  Make sure that the paths to the train, valid and test folders are as follows:

  ``` yaml
  train: train/images
  val: valid/images
  test: test/images
  ```

- Save your changes with **Ctrl + S** and close the editor.
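
The same check can be scripted instead of done by eye. The following is a minimal sketch and not part of the original notebook: `EXPECTED_SPLITS`, `read_split_paths` and `find_mismatches` are hypothetical names, and the parser only handles flat `key: value` lines, which is assumed to be sufficient for the three split entries.

```python
# Expected relative split paths, as listed above.
EXPECTED_SPLITS = {
    "train": "train/images",
    "val": "valid/images",
    "test": "test/images",
}


def read_split_paths(yaml_text):
    """Collect the train/val/test entries from flat `key: value` lines."""
    found = {}
    for line in yaml_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in EXPECTED_SPLITS:
            found[key.strip()] = value.strip().strip("'\"")
    return found


def find_mismatches(yaml_text):
    """Return {key: (found, expected)} for every split entry that differs."""
    found = read_split_paths(yaml_text)
    return {
        key: (found.get(key), expected)
        for key, expected in EXPECTED_SPLITS.items()
        if found.get(key) != expected
    }


# Example content with a wrong `train` path.
example = (
    "train: ../train/images\n"
    "val: valid/images\n"
    "test: test/images\n"
    "nc: 1\n"
    "names: ['insect']\n"
)
print(find_mismatches(example))
```

In the notebook, the text could be read from the dataset folder with `Path(f"{dataset_location}/data.yaml").read_text()`; an empty dictionary means all three entries match.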

# Model training

In [None]:
#@title ## Optional: Select external logger {display-mode: "form"}

logger = "Weights&Biases" #@param ["Weights&Biases", "Comet", "ClearML"]

#@markdown > More info: [YOLOv8 logging](https://docs.ultralytics.com/modes/train/#logging)

if logger == "Weights&Biases":
  %pip install -q wandb
  import wandb
  wandb.login()
elif logger == "Comet":
  %pip install -q comet_ml
  import comet_ml
  comet_ml.init()
elif logger == "ClearML":
  %pip install -q clearml
  import clearml
  clearml.browser_login()

## Tensorboard logger

> If you are using Firefox, **disable Enhanced Tracking Protection** for this website (click on the shield to the left of the address bar) for the Tensorboard logger to work correctly!

In [None]:
%load_ext tensorboard
%tensorboard --logdir /content/runs/detect

## Train YOLOv8 detection model

- `name` name of the training run
- `imgsz` input image size (recommended: same size as for inference)
- `batch` specify batch size (recommended: 32)
- `epochs` set the number of training [epochs](https://machine-learning.paperspace.com/wiki/epoch) (recommended: 100-300+)
- `data` path to `data.yaml` file
- `model` specify the [pretrained model weights](https://github.com/ultralytics/ultralytics#models)
> `model=yolov8n.pt` YOLOv8n model (recommended)  
  `model=yolov8s.pt` YOLOv8s model
- `cache` cache images in RAM for faster training
- `patience` number of epochs without observable improvement after which training is stopped early (default: 50)

> More information on YOLOv8 [model training](https://docs.ultralytics.com/modes/train/) 🚀
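
To make the mapping between these arguments and the command line explicit, the sketch below assembles a `yolo detect train` call from a dictionary. It is an illustration only: `build_train_command` is a hypothetical helper, the values mirror the defaults of the training cell, and the `data` path is a placeholder.

```python
import shlex


def build_train_command(args):
    """Join `key=value` pairs into a `yolo detect train` command string."""
    pairs = [f"{key}={value}" for key, value in args.items()]
    # shlex.join quotes any pair that contains spaces or other unsafe characters.
    return shlex.join(["yolo", "detect", "train", *pairs])


train_args = {
    "name": "YOLOv8n_320_batch32_epochs200",
    "imgsz": 320,
    "batch": 32,
    "epochs": 200,
    "data": "/content/yolov8_dataset/data.yaml",  # placeholder path
    "model": "yolov8n.pt",
    "cache": True,
    "patience": 50,
}

print(build_train_command(train_args))
```

Keeping the arguments in one dictionary makes it easy to log the exact settings of a run next to its results.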

In [None]:
training_run_name = "YOLOv8n_320_batch32_epochs200" #@param {type: "string"}
#@markdown Add UTC timestamp in front of training run name:
add_timestamp = True #@param {type:"boolean"}
#@markdown ---

image_size = 320 #@param {type: "integer"}
batch_size = 32 #@param {type:"slider", min:32, max:128, step:32}
number_epochs = 200 #@param {type:"slider", min:10, max:500, step:10}
model = "yolov8n.pt" #@param ["yolov8n.pt", "yolov8s.pt"]

if add_timestamp:
  from datetime import datetime, timezone
  utc_timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H-%M")
  train_run_name = f"{utc_timestamp}_{training_run_name}"
else:
  train_run_name = training_run_name

%cd /content

!yolo detect train \
name={train_run_name} \
imgsz={image_size} \
batch={batch_size} \
epochs={number_epochs} \
data={dataset_location}/data.yaml \
model={model} \
cache=True \
#patience=0 # disable EarlyStopping (default: 50)

In [None]:
#@title ## Export to Google Drive or Download training results {display-mode: "form"}

training_results = "Export_Google_Drive" #@param ["Export_Google_Drive", "Download"]
#@markdown ---

#@markdown ### Path for saving training results in Google Drive:
GDrive_save_path = "/content/drive/MyDrive/Training_results/YOLOv8" #@param {type: "string"}

if training_results == "Export_Google_Drive":
  print("Exporting training results to Google Drive...\n")
  !rsync -ah --mkpath --info=progress2 --no-i-r /content/runs/detect/{train_run_name} {GDrive_save_path}
  print("\nTraining results were successfully exported!")
elif training_results == "Download":
  from google.colab import files
  %cd /content/runs/detect
  !zip -rq {train_run_name}.zip {train_run_name}
  %cd -
  files.download(f"/content/runs/detect/{train_run_name}.zip")

# Model validation

Test the performance of your model on the validation and/or test dataset.

> Copy the validation results (cell output) and save them to a .txt file, as they will not be saved automatically.
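
As an alternative to copying the output by hand, the command output could be captured and written to a text file. The sketch below is one possible approach under stated assumptions: `run_and_log` is a hypothetical helper, the command is started with `subprocess` instead of the `!` shell syntax, and a stand-in command is used so that the example runs anywhere.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(command, log_path):
    """Run a command, save its combined stdout/stderr to a text file, and return it."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
        check=False,
    )
    Path(log_path).write_text(result.stdout)
    return result.stdout


# Stand-in command; in the notebook this would be the `yolo detect val` call
# passed as an argument list.
log_file = Path(tempfile.gettempdir()) / "val_results.txt"
output = run_and_log([sys.executable, "-c", "print('validation output')"], log_file)
print(output)
```

The log file could then be exported together with the other validation results.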

In [None]:
task = "val" #@param ["val", "test"]
#@markdown > Use `task: test` to validate on the dataset test split.

val_run_name = f"{train_run_name}_validate_{task}"

%cd /content

!yolo detect val \
name={val_run_name} \
model=/content/runs/detect/{train_run_name}/weights/best.pt \
data={dataset_location}/data.yaml \
imgsz={image_size} \
split={task}

In [None]:
#@title ## Export to Google Drive or Download validation results {display-mode: "form"}

validation_results = "Export_Google_Drive" #@param ["Export_Google_Drive", "Download"]
#@markdown ---

#@markdown ### Path for saving validation results in Google Drive:
GDrive_save_path = "/content/drive/MyDrive/Training_results/YOLOv8" #@param {type: "string"}

if validation_results == "Export_Google_Drive":
  print("Exporting validation results to Google Drive...\n")
  !rsync -ah --mkpath --info=progress2 --no-i-r /content/runs/detect/{val_run_name} {GDrive_save_path}/{train_run_name}
  print("\nValidation results were successfully exported!")
elif validation_results == "Download":
  from google.colab import files
  %cd /content/runs/detect
  !zip -rq {val_run_name}.zip {val_run_name}
  %cd -
  files.download(f"/content/runs/detect/{val_run_name}.zip")

# Model inference

Use your model to detect insects on images in the dataset test split.
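
The inference cell exposes a confidence threshold and an IoU threshold, which together decide which boxes survive post-processing. The sketch below illustrates the idea with a greedy non-maximum suppression in plain Python. It is a simplified illustration, not the Ultralytics implementation; `box_iou` and `filter_detections` are hypothetical names, and boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_threshold, iou_threshold):
    """Drop low-confidence boxes, then greedily suppress overlapping ones.

    detections: list of (box, score) pairs.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(box_iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((10, 10, 60, 60), 0.9),      # insect A
    ((15, 10, 65, 60), 0.7),      # near-duplicate of insect A (IoU about 0.82)
    ((100, 100, 150, 150), 0.4),  # low-confidence detection
]
print(len(filter_detections(detections, conf_threshold=0.5, iou_threshold=0.5)))  # 1
print(len(filter_detections(detections, conf_threshold=0.5, iou_threshold=0.9)))  # 2
print(len(filter_detections(detections, conf_threshold=0.3, iou_threshold=0.5)))  # 2
```

In this sketch, a lower confidence threshold lets more boxes through, while a lower IoU threshold removes more overlapping boxes of the same object.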

In [None]:
#@markdown #### Decrease confidence threshold to detect objects with lower confidence score:
confidence_threshold = 0.5 #@param {type:"slider", min:0.1, max:1, step:0.1}
#@markdown #### Decrease IoU threshold if the same object is detected multiple times:
iou_threshold = 0.5 #@param {type:"slider", min:0.1, max:1, step:0.1}

det_run_name = f"{train_run_name}_detect"

%cd /content

!yolo detect predict \
name={det_run_name} \
model=/content/runs/detect/{train_run_name}/weights/best.pt \
source={dataset_location}/test/images \
imgsz={image_size} \
conf={confidence_threshold} \
iou={iou_threshold} \
save=True \
line_width=1 # bounding box line thickness and label size (default: 3)

In [None]:
#@title ## Export to Google Drive or Download inference results {display-mode: "form"}

inference_results = "Export_Google_Drive" #@param ["Export_Google_Drive", "Download"]
#@markdown ---

#@markdown ### Path for saving inference results in Google Drive:
GDrive_save_path = "/content/drive/MyDrive/Training_results/YOLOv8" #@param {type: "string"}

%cd /content/runs/detect
!zip -rq {det_run_name}.zip {det_run_name}
%cd -

if inference_results == "Export_Google_Drive":
  print("\nExporting inference results to Google Drive...\n")
  !rsync -ah --mkpath --info=progress2 --no-i-r /content/runs/detect/{det_run_name}.zip {GDrive_save_path}/{train_run_name}
  print("\nInference results were successfully exported!")
elif inference_results == "Download":
  from google.colab import files
  files.download(f"/content/runs/detect/{det_run_name}.zip")

## Show inference results on test images

In [None]:
from pathlib import Path
from IPython.display import Image, display

for img in Path(f"/content/runs/detect/{det_run_name}").glob("*.jpg"):
  display(Image(img))
  print("\n")

# Model conversion

**Go to [tools.luxonis.com](https://tools.luxonis.com/):**

- Select `YoloV8 (detection only)` as Yolo Version.
- Rename your model weights file from `best.pt` to e.g. `yolov8n_320.pt`.
- Select your model weights file (`yolov8n_320.pt`) for upload.
- Use your image size (e.g. `320`) as input image shape.
- Open the `Advanced options` and choose `Shaves: 4`.
- Hit `Submit` to upload your PyTorch model weights and download the converted ONNX, OpenVINO and .blob model.

> Follow the instructions in the README of the [`luxonis/tools`](https://github.com/luxonis/tools) GitHub repo to run the model conversion locally.

---

The recommended number of shaves for deployment with the Insect Detect camera trap is **4-5**.

> More information about SHAVES can be found at the [DepthAI FAQ](https://docs.luxonis.com/en/latest/pages/faq/#what-are-the-shaves).

> More information on model conversion can be found at the [DepthAI Docs](https://docs.luxonis.com/en/latest/pages/model_conversion/).

## Generate JSON config file

After you follow the conversion steps at [tools.luxonis.com](https://tools.luxonis.com/), a .json config file with model-specific settings is created together with the converted .blob model.

To set the correct class/label name(s) and adjust the confidence or IoU threshold, you can change these values directly in the .json file you downloaded from tools.luxonis.com or create your own .json config file in the following step.
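
Whichever route you take, the number of classes should agree with the list of label names, and both thresholds should lie between 0 and 1. A minimal sketch of such a consistency check follows. `check_config` is a hypothetical helper, and only keys that the config cell in this section sets are inspected; the actual file may contain more.

```python
import json


def check_config(config):
    """Return a list of problems found in a .json config dictionary."""
    problems = []
    nn_config = config["nn_config"]
    metadata = nn_config["NN_specific_metadata"]
    labels = config["mappings"]["labels"]

    if metadata["classes"] != len(labels):
        problems.append(
            f"classes is {metadata['classes']} but {len(labels)} label(s) are listed"
        )
    width, _, height = nn_config["input_size"].partition("x")
    if not (width.isdigit() and height.isdigit()):
        problems.append(f"input_size {nn_config['input_size']!r} is not WIDTHxHEIGHT")
    for key in ("iou_threshold", "confidence_threshold"):
        if not 0 <= metadata[key] <= 1:
            problems.append(f"{key} must be between 0 and 1")
    return problems


# Example config with a class count that does not match the label list.
example = json.loads("""
{
    "nn_config": {
        "input_size": "320x320",
        "NN_specific_metadata": {
            "classes": 2,
            "iou_threshold": 0.5,
            "confidence_threshold": 0.5
        }
    },
    "mappings": {"labels": ["insect"]}
}
""")
print(check_config(example))
```

An empty list means that none of the checked values look inconsistent.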

In [None]:
#@markdown ### Name of the JSON config file:
json_name = "yolov8_320" #@param {type: "string"}
#@markdown ---

image_size = 320 #@param {type: "integer"}
number_classes = 1 #@param {type: "integer"}
#@markdown ---

#@markdown #### For several classes/labels: **["class1", "class2", "class3"]**
labels = ["insect"] #@param {type: "raw"}
#@markdown ---

#@markdown #### Decrease confidence threshold to detect objects with lower confidence score:
confidence_threshold = 0.5 #@param {type:"slider", min:0.1, max:1, step:0.1}
#@markdown #### Decrease IoU threshold if the same object is detected multiple times:
iou_threshold = 0.5 #@param {type:"slider", min:0.1, max:1, step:0.1}

import json
from google.colab import files

!wget -q https://raw.githubusercontent.com/luxonis/depthai-experiments/master/gen2-yolo/device-decoding/json/yolov5.json -P /content/

with open("/content/yolov5.json", "r") as json_template:
  json_data = json.load(json_template)
  json_data["nn_config"]["input_size"] = f"{image_size}x{image_size}"
  json_data["nn_config"]["NN_specific_metadata"]["classes"] = number_classes
  json_data["nn_config"]["NN_specific_metadata"]["anchors"] = []
  json_data["nn_config"]["NN_specific_metadata"]["anchor_masks"] = {}
  json_data["nn_config"]["NN_specific_metadata"]["iou_threshold"] = iou_threshold
  json_data["nn_config"]["NN_specific_metadata"]["confidence_threshold"] = confidence_threshold
  json_data["mappings"]["labels"] = labels

with open(f"/content/{json_name}.json", "w") as json_file:
  json.dump(json_data, json_file, indent = 4)

files.download(f"/content/{json_name}.json")

# Model deployment

That's it! You trained your own [YOLOv8](https://github.com/ultralytics/ultralytics) object detection model on your custom dataset and converted it to .blob format, which is required to run inference on the [Luxonis OAK devices](https://docs.luxonis.com/projects/hardware/en/latest/).

> To deploy the YOLOv8 model on your OAK device you can check out the Luxonis GitHub repository for [on-device decoding](https://github.com/luxonis/depthai-experiments/tree/master/gen2-yolo/device-decoding) or use the deployment options from the [**Insect Detect Docs**](https://maxsitt.github.io/insect-detect-docs/software/programming/) (e.g. for continuous automated insect monitoring with the DIY camera trap).