<a href="https://colab.research.google.com/github/bernardlawes/vision-train-py/blob/master/notebooks/train_yolo_cutom.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Train a Custom YOLO Model

## Verify NVIDIA GPU Availability
Make sure your Google Colab notebook is running on a GPU runtime. The command below should list an NVIDIA GPU.

In [None]:
!nvidia-smi
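
The same check can be done from Python so that later cells can stop early on a CPU-only runtime. This is a minimal sketch using only the standard library; the helper name `gpu_available` is my own and not part of any library.

```python
import shutil
import subprocess

def gpu_available():
    """Return True if nvidia-smi is on the PATH and exits successfully."""
    # On CPU-only runtimes the nvidia-smi binary is usually missing entirely
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

if not gpu_available():
    print("No NVIDIA GPU detected. In Colab, use Runtime > Change runtime type and select a GPU.")
```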

## Gather and Label Training Images
If you label your own data, use open-source software such as Label Studio. Alternatively, use a pre-made dataset from Roboflow Universe, Kaggle, or Google Open Images V7. To save time, I will use a pre-labeled dataset from Roboflow.

<p align="center"><img src="https://img-blog.csdnimg.cn/1d6b4ff8c172411885a7dae188fb4281.png" height="360" /></p>

# Install Required Libraries
In this case, we only need the Ultralytics package, which provides YOLO.

In [None]:
!pip install ultralytics

In [None]:
import ultralytics
from ultralytics import YOLO
from IPython.display import Image

# Install and Import Roboflow
This is only necessary to access datasets on Roboflow Universe.

In [None]:
!pip install roboflow

Import Data from Roboflow

In [None]:
from google.colab import userdata  # Colab Secrets store; must contain ROBOFLOW_API_KEY
from roboflow import Roboflow
rf = Roboflow(api_key=userdata.get('ROBOFLOW_API_KEY'))
project = rf.workspace("pest-vision-major").project("pest-vision-major-zfqvh")
version = project.version(6)
dataset = version.download("yolov11")

# Download Data into Colab

Download evanjuras' coin dataset into Colab

In [None]:
!wget -O /content/data.zip https://s3.us-west-1.amazonaws.com/evanjuras.com/resources/YOLO_coin_data_12DEC30.zip

Unzip the Data

In [None]:
!unzip -q /content/data.zip -d /content/custom_data
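
Before splitting, it can help to confirm that every image has a matching label file. The sketch below assumes the unzipped folder contains `images` and `labels` subfolders with same-named files (for example `img1.jpg` and `img1.txt`); that layout and the helper name `find_unlabeled_images` are assumptions, so adjust the paths to match what the zip actually contains.

```python
from pathlib import Path

# Extensions treated as images here; extend this set if the dataset uses others
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def find_unlabeled_images(images_dir, labels_dir):
    """Return sorted names of images that have no same-named .txt label file."""
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(
        p.name
        for p in Path(images_dir).iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS and p.stem not in label_stems
    )

# Example call (assumed folder layout):
# find_unlabeled_images("/content/custom_data/images", "/content/custom_data/labels")
```

An empty list means every image found has a label file.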

# Split Images into Train and Validation Folders

Download evanjuras' script used to split the data into train and validation folders

In [None]:
!wget -O /content/train_val_split.py https://raw.githubusercontent.com/EdjeElectronics/Train-and-Deploy-YOLO-Models/refs/heads/main/utils/train_val_split.py

Run the Python script to split the data

In [None]:
# TO DO: Improve robustness of train_val_split.py script so it can handle nested data folders, etc
!python train_val_split.py --datapath="/content/custom_data" --train_pct=0.9

Created folder at /content/data/train/images.
Created folder at /content/data/train/labels.
Created folder at /content/data/validation/images.
Created folder at /content/data/validation/labels.
Number of image files: 750
Number of annotation files: 750
Images moving to train: 675
Images moving to validation: 75
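
The counts above follow from the 90% train fraction: 750 × 0.9 = 675 training images, leaving 75 for validation. Below is a minimal sketch of a random split of this kind. It is not the code inside `train_val_split.py`; the function name `split_train_val` and the fixed seed are my own choices for illustration.

```python
import random

def split_train_val(file_names, train_pct=0.9, seed=0):
    """Shuffle file names and split them into (train, validation) lists."""
    shuffled = sorted(file_names)  # sort first so the result depends only on the seed
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_pct)
    return shuffled[:n_train], shuffled[n_train:]

# 750 placeholder file names, matching the image count reported above
names = [f"img_{i:03d}.jpg" for i in range(750)]
train, val = split_train_val(names, train_pct=0.9)
print(len(train), len(val))  # 675 75
```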


# Python function to create data.yaml config file
1. Reads "classes.txt" file to get list of class names
2. Creates data dictionary with correct paths to folders, number of classes, and names of classes
3. Writes data in YAML format to data.yaml

In [None]:
# Python function to automatically create data.yaml config file
# 1. Reads "classes.txt" file to get list of class names
# 2. Creates data dictionary with correct paths to folders, number of classes, and names of classes
# 3. Writes data in YAML format to data.yaml

import yaml
import os

def create_data_yaml(path_to_classes_txt, path_to_data_yaml):

  # Read classes.txt to get class names
  if not os.path.exists(path_to_classes_txt):
    print(f'classes.txt file not found! Please create a classes.txt labelmap and move it to {path_to_classes_txt}')
    return
  with open(path_to_classes_txt, 'r') as f:
    classes = []
    for line in f.readlines():
      if len(line.strip()) == 0: continue
      classes.append(line.strip())
  number_of_classes = len(classes)

  # Create data dictionary
  data = {
      'path': '/content/data',
      'train': 'train/images',
      'val': 'validation/images',
      'nc': number_of_classes,
      'names': classes
  }

  # Write data to YAML file
  with open(path_to_data_yaml, 'w') as f:
    yaml.dump(data, f, sort_keys=False)
  print(f'Created config file at {path_to_data_yaml}')

  return

# Define path to classes.txt and run function
path_to_classes_txt = '/content/custom_data/classes.txt'
path_to_data_yaml = '/content/data.yaml'

create_data_yaml(path_to_classes_txt, path_to_data_yaml)

print('\nFile contents:\n')
!cat /content/data.yaml

Created config file at /content/data.yaml

File contents:

path: /content/data
train: train/images
val: validation/images
nc: 4
names:
- penny
- nickel
- dime
- quarter
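
A quick consistency check on the generated config can catch a mismatch between `nc` and the class list before training starts. This sketch works on a plain dictionary with the same keys the function above writes; the helper name `check_data_config` is hypothetical, and the key list reflects this notebook's file rather than an official list of required fields.

```python
def check_data_config(data):
    """Return a list of problems found in a data.yaml-style dictionary."""
    problems = []
    # Keys written by create_data_yaml in this notebook
    for key in ("path", "train", "val", "nc", "names"):
        if key not in data:
            problems.append(f"missing key: {key}")
    if "nc" in data and "names" in data and data["nc"] != len(data["names"]):
        problems.append(f"nc is {data['nc']} but {len(data['names'])} names are listed")
    return problems

config = {
    "path": "/content/data",
    "train": "train/images",
    "val": "validation/images",
    "nc": 4,
    "names": ["penny", "nickel", "dime", "quarter"],
}
print(check_data_config(config))  # []
```

To check the real file, load it with `yaml.safe_load` and pass the resulting dictionary to the function.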


# Train the Model

In [None]:
!yolo detect train data=/content/data.yaml model=yolo11s.pt epochs=80 imgsz=640
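
As far as I know, if the training cell is run more than once, Ultralytics saves each new run to an incremented folder (`train2`, `train3`, and so on) instead of overwriting `train`. Later cells that hardcode `runs/detect/train` would then point at an older run. The sketch below picks the most recently modified run folder; the helper name `latest_run_dir` is my own, and the default root assumes the notebook is run from `/content`.

```python
from pathlib import Path

def latest_run_dir(runs_root="/content/runs/detect", prefix="train"):
    """Return the most recently modified run folder starting with prefix, or None."""
    root = Path(runs_root)
    if not root.is_dir():
        return None
    candidates = [
        p for p in root.iterdir() if p.is_dir() and p.name.startswith(prefix)
    ]
    if not candidates:
        return None
    # Newest modification time wins
    return max(candidates, key=lambda p: p.stat().st_mtime)

run_dir = latest_run_dir()
if run_dir is not None:
    print(f"Latest weights: {run_dir / 'weights' / 'best.pt'}")
```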

# Test the Model with validation data

In [None]:
!yolo detect predict model=runs/detect/train/weights/best.pt source=data/validation/images save=True

Display the images overlaid with predictions

In [None]:
import glob
from IPython.display import Image, display
for image_path in glob.glob('/content/runs/detect/predict/*.jpg')[:10]:
  display(Image(filename=image_path, height=400))
  print('\n')

# Download YOLO Model

In [None]:
# Create "my_model" folder to store model weights and train results
!mkdir -p /content/my_model
!cp /content/runs/detect/train/weights/best.pt /content/my_model/my_model.pt
!cp -r /content/runs/detect/train /content/my_model

# Zip into "my_model.zip"
%cd /content/my_model
!zip /content/my_model.zip my_model.pt
!zip -r /content/my_model.zip train
%cd /content
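
The same packaging can be done in Python with the standard `zipfile` module, which avoids changing directories. This is a sketch rather than a replacement for the cell above; the function name `zip_model` is hypothetical, and the archive layout (`my_model.pt` plus a `train` folder) mirrors what the shell commands produce.

```python
import zipfile
from pathlib import Path

def zip_model(weights_path, train_dir, zip_path):
    """Write the weights as my_model.pt plus the whole training folder into one zip."""
    train_dir = Path(train_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(weights_path, arcname="my_model.pt")
        # Store every training artifact under a top-level "train/" folder
        for file in sorted(train_dir.rglob("*")):
            if file.is_file():
                arcname = (Path("train") / file.relative_to(train_dir)).as_posix()
                zf.write(file, arcname=arcname)
    return zip_path

# Example call using the paths from this notebook:
# zip_model("/content/runs/detect/train/weights/best.pt",
#           "/content/runs/detect/train",
#           "/content/my_model.zip")
```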

The cell below downloads the file through the browser, which can take a long time.
Alternatively, download 'my_model.zip' from the file browser in Colab's sidebar.

In [None]:
from google.colab import files
files.download('/content/my_model.zip')