<a href="https://colab.research.google.com/github/ValentinBhend/Ballmaze/blob/main/Object_detection/YOLO_train_export.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## How the model used in the Ballmaze project was trained (as of 21.03.2025)
A model trained with this notebook is available at https://github.com/ValentinBhend/Ballmaze

#### If this notebook runs in Google Colab, it should handle the dataset file paths on its own. Otherwise, copy the downloaded dataset to the right path by hand, or change the Ultralytics default dataset path (so this doesn't mess with your local files :). If the dataset is not found, there should be an error message describing what to do.
**Don't forget to select a GPU in Colab, else you might not live to see it finish.**
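
For a run outside Colab, a quick check like the one below can tell you whether the dataset ended up where it is expected before any training starts. It is only a sketch: the folder names are taken from the dataset configuration further down, while the helper `missing_dataset_dirs` and the assumption that the data sits in a `datasets` folder in the current directory are illustrative, not part of the project.

```python
from pathlib import Path

# Folders the dataset .yaml in this notebook refers to (relative to the datasets root).
EXPECTED_DIRS = [
    "dataset_128/images/train", "dataset_128/images/val",
    "dataset_480_1/images/train", "dataset_480_1/images/val",
    "dataset_480/images/train", "dataset_480/images/val",
]


def missing_dataset_dirs(datasets_root, expected=EXPECTED_DIRS):
    """Return the expected folders that do not exist under datasets_root."""
    root = Path(datasets_root)
    return [d for d in expected if not (root / d).is_dir()]


# Assumption: the data was copied to "datasets" in the current directory.
missing = missing_dataset_dirs("datasets")
print("All dataset folders found." if not missing else f"Missing: {missing}")
```

If folders are reported as missing, either copy the data there or point the check (and Ultralytics) at the folder you actually use.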

In [None]:
import sys
IN_COLAB = 'google.colab' in sys.modules
if IN_COLAB:
  %pip install ultralytics
  # Needed for the OpenVINO export; versions and requirements may change (as of 21.03.2025)
  %pip install 'openvino>=2024.0.0,!=2025.0.0'
  %pip install 'nncf>=2.14.0'

##### Load the dataset from Kaggle like this, or download it as a zip.
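For the zip route, a minimal sketch using only the standard library could look like this. The helper `extract_dataset` and the archive name in the comment are hypothetical, and the sketch assumes the archive contains a top-level `datasets` folder like the kagglehub download does. The cell below covers the kagglehub route.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, target_dir="."):
    """Extract a dataset archive into target_dir and return its top-level entries."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        # Collect the first path component of every archive member.
        top_level = sorted({Path(name).parts[0] for name in archive.namelist()})
    return top_level


# Hypothetical file name; use whatever your downloaded archive is called.
# print(extract_dataset("ballmaze-labeled-images.zip"))
```

The returned list makes it easy to see whether the archive really unpacked into a `datasets` folder or into something else that needs renaming.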

In [None]:
import kagglehub

# Download the latest version of the dataset
path_kaggle = kagglehub.dataset_download("valentinbhend/ballmaze-labeled-images")

print(f"Path to dataset files: {path_kaggle}. Outside Colab, copy the 'datasets' folder from there to the current path.")

##### Define the dataset .yaml file like this, or write one by hand.
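If you prefer to write the file by hand, its content could look like the text in this sketch, which mirrors the dictionary used in the cell below. The file name `dataset_by_hand.yaml` is a hypothetical choice so that `dataset.yaml` is not overwritten.

```python
from pathlib import Path

# Hand-written equivalent of the dictionary-based configuration in this notebook.
DATASET_YAML = """\
train:
  - dataset_128/images/train
  - dataset_480_1/images/train
  - dataset_480/images/train
val:
  - dataset_128/images/val
  - dataset_480_1/images/val
  - dataset_480/images/val
nc: 1  # number of classes
names:
  - Ball
"""

# Hypothetical file name, chosen so it does not clash with dataset.yaml.
Path("dataset_by_hand.yaml").write_text(DATASET_YAML)
```

To train with this file instead, pass its name as the `data` argument.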

In [None]:
import yaml
# Define dataset configuration as a dictionary. Or use an existing .yaml file.
dataset = {
    'train': ["dataset_128/images/train", "dataset_480_1/images/train", "dataset_480/images/train"],
    'val': ["dataset_128/images/val", "dataset_480_1/images/val", "dataset_480/images/val"],
    'nc': 1,  # Number of classes
    'names': ['Ball']
}
dataset_file = 'dataset.yaml'
with open(dataset_file, 'w') as f:
    yaml.dump(dataset, f)

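if IN_COLAB:
  import os
  # Copy the downloaded "datasets" folder to /content/datasets.
  # Checking "/content/" itself would always find an existing folder and skip the copy.
  source = os.path.join(path_kaggle, "datasets")
  destination = "/content/datasets"
  if not os.path.exists(destination):
    !cp -r "{source}" "{destination}"
  else:
    print(f"The folder {destination} already exists, so nothing was copied. If this is the first run, a folder with this name was already there.")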

##### Train the model with the data defined in the .yaml file.
**!! Run with 1-2 epochs first to see if all packages are installed and working !!**
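
Before even the short trial run, a cheap pre-check can show whether the packages are installed at all. The sketch below only looks the packages up without importing them, so it does not replace the 1-2 epoch test. The list of names is an assumption based on the imports and installs in this notebook, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Top-level import names of the packages this notebook uses.
REQUIRED = ["ultralytics", "torch", "yaml", "kagglehub", "openvino", "nncf"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
print("All packages found." if not missing else f"Not installed: {missing}")
```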

In [None]:
from ultralytics import YOLO
import torch
# Load a pretrained model
model = YOLO('yolo11n.pt')

device = "cuda" if torch.cuda.is_available() else "cpu"


# Train the model. 500 epochs is probably more than needed; with patience=400,
# training stops early after 400 epochs without improvement.
results = model.train(data="dataset.yaml", epochs=500, patience=400, warmup_epochs=50,
                      imgsz=128, int8=True, single_cls=True, max_det=1,
                      device=device, seed=0, deterministic=True)

##### Export the model to OpenVINO or other.
If the model is only used for inference, a PyTorch model (.pt) is usually much slower than an exported one.
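
To see how large the difference is on your own hardware, a small timing helper along these lines may be useful. It is a generic sketch, not part of the project: `mean_latency_ms` is a hypothetical name, and `predict` stands for any callable that runs inference on one sample.

```python
import time


def mean_latency_ms(predict, sample, runs=50, warmup=5):
    """Average wall-clock time per call of predict(sample), in milliseconds."""
    for _ in range(warmup):  # warm-up calls are not timed
        predict(sample)
    start = time.perf_counter()
    for _ in range(runs):
        predict(sample)
    return (time.perf_counter() - start) * 1000 / runs
```

A loaded Ultralytics `YOLO` object can be called directly on an image, so the same helper could be run once with the `.pt` model and once with the exported model to compare the two numbers.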

In [None]:
# Load a trained model (if it is no longer in memory)
# model = YOLO("path/to/weights/best.pt")

# Export the model
target_device = "cpu"  # "cpu" in my case; could also be "cuda", depending on the device the model will be deployed on
model.export(data="dataset.yaml", format="openvino", device=target_device,
             imgsz=128, int8=True, single_cls=True, max_det=1)