<a href="https://colab.research.google.com/github/AlessandriniAntoine/Eden_Robotics/blob/ros/Python/vision/detection/yolo/Yolov5_Training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Custom Yolo Object Detector

This tutorial is based on the [YOLOv5 repository](https://github.com/ultralytics/yolov5) by [Ultralytics](https://www.ultralytics.com/). 

To prepare your dataset, you can use the [Roboflow](https://roboflow.com/) website. It gives you tools to annotate images and to split the data into training, validation and test sets. You can also annotate images locally using the [labelImg](https://pypi.org/project/labelImg/) package. In that case, you upload the data manually into the dataset folder.
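
If you annotate locally, the split has to be done by hand. A minimal sketch of a reproducible train/valid/test split; the function name `split_dataset`, the 70/20/10 ratios and the file names are illustrative assumptions, not something Roboflow or labelImg prescribes.

```python
import random

def split_dataset(items, train_ratio=0.7, valid_ratio=0.2, seed=0):
    """Shuffle items and split them into train, valid and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_ratio)
    n_valid = int(len(shuffled) * valid_ratio)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains goes to the test set
    return train, valid, test

# hypothetical file names, for illustration only
images = [f"img_{i:03d}.jpg" for i in range(100)]
train, valid, test = split_dataset(images)
print(len(train), len(valid), len(test))
```

Each image's label file has to be moved into the same split as the image itself.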

### Steps Covered in this Tutorial

To train our detector we take the following steps:

* Install YOLOv5 dependencies
* Download custom YOLOv5 object detection data
* Write our YOLOv5 Training configuration
* Run YOLOv5 training
* Evaluate YOLOv5 performance
* Visualize YOLOv5 training data
* Run YOLOv5 inference on test images
* Export the saved YOLOv5 model to ONNX format for future inference


# Install Dependencies

_(Remember to choose GPU in Runtime if not already selected. Runtime --> Change Runtime Type --> Hardware accelerator --> GPU)_

In [None]:
!git clone https://github.com/ultralytics/yolov5  # clone repo

Cloning into 'yolov5'...
remote: Enumerating objects: 14906, done.
remote: Total 14906 (delta 0), reused 0 (delta 0), pack-reused 14906
Receiving objects: 100% (14906/14906), 13.95 MiB | 32.77 MiB/s, done.
Resolving deltas: 100% (10237/10237), done.


In [None]:
# pin the cloned YOLOv5 repository to a known commit
%cd /content/yolov5
!git reset --hard fbe67e465375231474a2ad80a4389efc77ecff99
!pip install -qr requirements.txt  # install dependencies (ignore errors)

/content/yolov5
HEAD is now at fbe67e4 Fix `OMP_NUM_THREADS=1` for macOS (#8624)

In [None]:
!rm -rf sample_data

## Import Packages

In [None]:
# import the packages used in this notebook
import torch
import os
import yaml
import google.colab  # used later for google.colab.files.download

from IPython.display import Image, clear_output  # to display images

# clear_output()
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

Setup complete. Using torch 1.13.0+cu116 _CudaDeviceProperties(name='Tesla T4', major=7, minor=5, total_memory=15109MB, multi_processor_count=40)


## Define Paths

In [None]:
paths = {
    'DATASET' : '',
    'DATASETS' : '/content/datasets',
    'MODELS' : '/content/models',
    'YOLOV5' : '/content/yolov5',
}

In [None]:
for path in paths.values():
    # skip entries that are not set yet ('DATASET' is filled in later)
    if path:
        os.makedirs(path, exist_ok=True)


# Download Correctly Formatted Custom Dataset 

We'll download our dataset from Roboflow. Use the "**YOLOv5 PyTorch**" export format. Note that the Ultralytics implementation calls for a YAML file defining where your training and test data are. The Roboflow export also writes this file for us.
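
For reference, each image in this format is paired with a text label file holding one line per object: the class index followed by the box centre, width and height, all normalised by the image size. A minimal sketch of that conversion from a pixel-space box; the helper name `to_yolo_line` and the sample numbers are made up for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a 'class x_center y_center width height' line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# a 208x208 px box inside a 416x416 image (illustrative values)
print(to_yolo_line(0, 104, 52, 312, 260, 416, 416))
# prints: 0 0.500000 0.375000 0.500000 0.500000
```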

To get your data into Roboflow, follow the [Getting Started Guide](https://blog.roboflow.ai/getting-started-with-roboflow/).

If you want to use **Roboflow** to upload your dataset, run the following cells.

In [None]:
!pip install -q roboflow


In [None]:
from roboflow import Roboflow
rf = Roboflow(model_format="yolov5", notebook="ultralytics")

upload and label your dataset, and get an API KEY here: https://app.roboflow.com/?model=yolov5&ref=ultralytics


In [None]:
os.environ['DATASET_DIRECTORY'] = paths['DATASETS']

In [None]:
%cd {paths['DATASETS']}
# after following the link above, you receive Python code with these fields filled in
from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("eden-ssr4z").project("yolov5-lovpt")
dataset = project.version(2).download("yolov5")
paths['DATASET'] = dataset.location

If you upload your dataset by hand, run the next cells to specify the name of the dataset. In that case, don't forget to also upload the `data.yaml` file.

In [None]:
dataset_name = 'Yolov5'
paths['DATASET'] = os.path.join(paths['DATASETS'],dataset_name)
paths['TEST'] = os.path.join(paths['DATASETS'],dataset_name,'test')
paths['TRAIN'] = os.path.join(paths['DATASETS'],dataset_name,'train')
paths['VALID'] = os.path.join(paths['DATASETS'],dataset_name,'valid')

In [None]:
!mkdir -p {paths['DATASET']}

In [None]:
!unzip {os.path.join(paths['DATASETS'],dataset_name,'test.zip')} -d {paths['TEST']}
!rm {os.path.join(paths['DATASETS'],dataset_name,'test.zip')}
!unzip {os.path.join(paths['DATASETS'],dataset_name,'train.zip')} -d {paths['TRAIN']}
!rm {os.path.join(paths['DATASETS'],dataset_name,'train.zip')}
!unzip {os.path.join(paths['DATASETS'],dataset_name,'valid.zip')} -d {paths['VALID']}
!rm {os.path.join(paths['DATASETS'],dataset_name,'valid.zip')}

At this point `paths['DATASET']` points to the dataset folder, whether you used Roboflow or uploaded the data by hand.
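
Before going further it can help to confirm that the folder has the expected contents. A minimal sketch, assuming the `<split>/images` and `<split>/labels` layout used in this notebook; `summarize_dataset` is a hypothetical helper and the demo runs on a throwaway folder rather than on the real dataset.

```python
import os
import tempfile

IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}

def summarize_dataset(dataset_dir, splits=('train', 'valid', 'test')):
    """Count image files and .txt label files in every split folder."""
    summary = {}
    for split in splits:
        image_dir = os.path.join(dataset_dir, split, 'images')
        label_dir = os.path.join(dataset_dir, split, 'labels')
        images = os.listdir(image_dir) if os.path.isdir(image_dir) else []
        labels = os.listdir(label_dir) if os.path.isdir(label_dir) else []
        summary[split] = {
            'images': sum(os.path.splitext(n)[1].lower() in IMAGE_EXTENSIONS for n in images),
            'labels': sum(n.endswith('.txt') for n in labels),
        }
    return summary

# demo on a throwaway folder holding a single training sample
with tempfile.TemporaryDirectory() as root:
    for kind, file_name in (('images', 'a.jpg'), ('labels', 'a.txt')):
        folder = os.path.join(root, 'train', kind)
        os.makedirs(folder)
        open(os.path.join(folder, file_name), 'w').close()
    demo_summary = summarize_dataset(root)
print(demo_summary)
```

In the notebook you would call `summarize_dataset(paths['DATASET'])`; a split with zero images, or with far fewer labels than images, usually points to a failed upload or unzip.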

# Define File Paths

In [None]:
files = {
    'CONFIG_YAML' : os.path.join(paths['DATASET'],'config.yaml'),
    'DATA_YAML' : os.path.join(paths['DATASET'],'data.yaml'),
    'DETECT_PY' : os.path.join(paths['YOLOV5'],'detect.py'),
    'EXPORT_PY' : os.path.join(paths['YOLOV5'],'export.py'),
    'TRAIN_PY' : os.path.join(paths['YOLOV5'],'train.py'),
    'VAL_PY' : os.path.join(paths['YOLOV5'],'val.py'),
}

In [None]:
# this is the YAML file Roboflow wrote for us that we're loading into this notebook with our data
%cat {files['DATA_YAML']}

# Define Model Configuration and Architecture

We first write the dataset YAML file, which tells YOLOv5 where the training, validation and test images are and which classes they contain.

You do not need to edit these cells, but you may.

If you did not use Roboflow to load your data, run the next cells

In [None]:
labels = ['pen','pencil','scissors','eraser']

In [None]:
dataset_yaml = {
    'path': paths['DATASET'],
    'train': os.path.join('train', 'images'),
    'val': os.path.join('valid', 'images'),
    'test': os.path.join('test', 'images'),
    'names': dict(enumerate(labels)),
    'nc' : len(labels)
}

In [None]:
with open(files['DATA_YAML'], 'w') as f:
    documents = yaml.dump(dataset_yaml, f)

If you uploaded the data using Roboflow, run the next cell.

In [None]:
with open(files['DATA_YAML'], 'r') as stream:
    dataset_yaml  = yaml.safe_load(stream)
dataset_yaml['path'] = paths['DATASET']
with open(files['DATA_YAML'], 'w') as f:
    documents = yaml.dump(dataset_yaml, f)

# Define Model Architecture

We will write a YAML file that defines the parameters of our model, such as the number of classes, the anchors, and each layer. This file is taken from the YOLOv5 model variant (s, n, x...) you want to use. We only modify the number of classes.

In [None]:
# define number of classes based on YAML
with open(files['DATA_YAML'], 'r') as stream:
    num_classes = str(yaml.safe_load(stream)['nc'])

In [None]:
# define model type
model_type = 'yolov5s'
path = os.path.join(paths['YOLOV5'],'models',f'{model_type}.yaml')
%cat {path}

In [None]:
#customize iPython writefile so we can write variables
from IPython.core.magic import register_line_cell_magic

@register_line_cell_magic
def writetemplate(line, cell):
    with open(line, 'w') as f:
        f.write(cell.format(**globals()))
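
The magic relies on `str.format`, so every `{name}` in the cell body is replaced by the variable of that name. A small standalone sketch of the same substitution; the helper `render` and the template text are illustrative. Note that a literal brace in a template has to be doubled, which matters if you ever add a YAML flow mapping to the cell.

```python
def render(template, variables):
    """Fill {name} placeholders the same way the cell magic does."""
    return template.format(**variables)

template = "nc: {num_classes}  # number of classes\nextra: {{a: 1}}\n"
print(render(template, {"num_classes": "4"}))
```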

In [None]:
%%writetemplate {files['CONFIG_YAML']}

# parameters
nc: {num_classes}  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
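
`depth_multiple` and `width_multiple` are what separate the YOLOv5 variants: the repeat counts and channel counts listed in the layers above are scaled by them when the model is built. A sketch of that scaling, assuming it mirrors the YOLOv5 model parser (repeats rounded with a floor of one, channels rounded up to a multiple of 8); the helper names are hypothetical.

```python
import math

def scale_depth(n, depth_multiple):
    """Scale a layer's repeat count; single layers are left untouched."""
    return max(round(n * depth_multiple), 1) if n > 1 else n

def scale_width(channels, width_multiple, divisor=8):
    """Scale a channel count and round it up to a multiple of `divisor`."""
    return math.ceil(channels * width_multiple / divisor) * divisor

# values taken from the yolov5s configuration above
for n in (3, 6, 9):
    print(f"repeats {n} -> {scale_depth(n, 0.33)}")
for c in (64, 128, 256, 512, 1024):
    print(f"channels {c} -> {scale_width(c, 0.50)}")
```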

## Load Model

If you want to continue training from a previous model, you can upload it to the models folder.

In [None]:
model_number = 1
model_zip = os.path.join(paths['MODELS'],f'model_{model_number}.zip')
model_path = os.path.join(paths['MODELS'],f'model_{model_number}')


In [None]:
!mkdir -p {model_path}
!unzip {model_zip} -d {model_path}
!rm {model_zip}

Archive:  /content/models/model_1.zip
  inflating: /content/models/best.pt  


# Train Custom YOLOv5 Detector


Here, we are able to pass a number of arguments:
- **img:** define input image size
- **batch:** determine batch size
- **epochs:** define the number of training epochs. (Note: runs of 3000+ epochs are common here.)
- **data:** set the path to our yaml file
- **cfg:** specify our model configuration
- **weights:** specify a custom path to weights. (Note: if you want to start training from a personal model, you have to upload it to the models folder.)
- **name:** result names
- **nosave:** only save the final checkpoint
- **cache:** cache images for faster training
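
These flags can be assembled programmatically before being handed to the shell. A minimal sketch that builds the argument list and quotes it with `shlex.join`, so that paths containing spaces and an empty `--weights` value survive; the function name `build_train_command` and the example paths are illustrative, and only flags from the list above are used.

```python
import shlex

def build_train_command(train_py, data_yaml, cfg_yaml, img=416, batch=16,
                        epochs=350, weights='', name='exp', cache=True):
    """Return a shell-safe training command string."""
    args = [
        'python', train_py,
        '--img', str(img),
        '--batch', str(batch),
        '--epochs', str(epochs),
        '--data', data_yaml,
        '--cfg', cfg_yaml,
        '--weights', weights,  # an empty string means training from scratch
        '--name', name,
    ]
    if cache:
        args.append('--cache')
    return shlex.join(args)

# hypothetical paths, for illustration only
print(build_train_command('train.py', '/content/my data/data.yaml', 'config.yaml'))
```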

In [None]:
img_size = 416
epochs = 350
model_number = 1 # model from where to start, None if start from scratch

In [None]:
if model_number is not None:
    model_path = os.path.join(paths['MODELS'],f'model_{model_number}')
    weights_path = os.path.join(model_path,'best.pt')
else : 
    weights_path = ''

In [None]:
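import re

# parse the numeric suffix of every existing 'model_<n>' folder
# (handles multi-digit numbers and ignores other entries such as zip files)
matches = (re.fullmatch(r'model_(\d+)', name) for name in os.listdir(paths['MODELS']))
model_numbers = [int(m.group(1)) for m in matches if m]
new_model_number = max(model_numbers, default=-1) + 1
new_model_name = os.path.join(f'model_{new_model_number}','train')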

In [None]:
print(new_model_name)
print(weights_path)

model_2/train
/content/models/model_1/best.pt


In [None]:
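# caution: this also deletes the best.pt selected above as starting weights; skip it when resuming from model_1
!rm -rf /content/models/model_1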

In [None]:
%%time
# train on the custom data for `epochs` epochs and time the run
# (%%time has to be the first line of the cell)
command = (
    f'python {files["TRAIN_PY"]} --img {img_size} --batch 16 --epochs {epochs} '
    f'--data {files["DATA_YAML"]} --cfg {files["CONFIG_YAML"]} '
    f'--project {paths["MODELS"]} --name {new_model_name} '
    f"--weights '{weights_path}' --cache"  # an empty weights_path trains from scratch
)
!{command}

# Evaluate Custom YOLOv5 Detector Performance

Training losses and performance metrics are logged to TensorBoard and saved in the run directory defined by the **--project** and **--name** flags when we train. In our case, this is `model_<number>/train` inside the models folder. The metrics are written to `results.csv`, which is plotted as `results.png` after training completes.


A partially completed `results.csv` file can be plotted with `from utils.plots import plot_results; plot_results('path/to/results.csv')`.

In [None]:
model_number = 2

In [None]:
test_model_name = os.path.join(f'model_{model_number}','test')
weights_path = os.path.join(paths['MODELS'],f'model_{model_number}','train','weights','best.pt')

In [None]:
!python {files["VAL_PY"]} --weights {weights_path} --data {files["DATA_YAML"]} --img {img_size} --project {paths["MODELS"]} --name {test_model_name}

val: data=/content/datasets/Yolov5/data.yaml, weights=['/content/models/model_2/train/weights/best.pt'], batch_size=32, imgsz=416, conf_thres=0.001, iou_thres=0.6, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=/content/models, name=model_2/test, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v6.1-306-gfbe67e4 Python-3.8.16 torch-1.13.0+cu116 CUDA:0 (Tesla T4, 15110MiB)

Fusing layers... 
config summary: 213 layers, 7020913 parameters, 0 gradients, 15.8 GFLOPs
val: Scanning '/content/datasets/Yolov5/valid/labels.cache' images and labels... 83 found, 3 missing, 0 empty, 0 corrupt: 100% 86/86 [00:00<?, ?it/s]
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 3/3 [00:02<00:00,  1.05it/s]
                 all         86         99      0.909      0.876      0.893      0.619
                   0         86         31
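
The `mAP@.5` column counts a prediction as correct when its box overlaps a ground-truth box of the same class with an intersection over union (IoU) of at least 0.5, while `mAP@.5:.95` averages over IoU thresholds from 0.5 to 0.95. A minimal sketch of the IoU computation for two boxes given as `(x_min, y_min, x_max, y_max)`; the function name and the sample boxes are illustrative.

```python
def box_iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

prediction = (0, 0, 10, 10)
ground_truth = (5, 5, 15, 15)
print(f"IoU = {box_iou(prediction, ground_truth):.3f}")  # 25 / 175
```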

In [None]:
# Start TensorBoard
# Launch after you have started training
# logs are saved under the models folder (the --project path used for training)
%load_ext tensorboard
%tensorboard --logdir {paths['MODELS']}

# Run Inference With Trained Weights
Run inference with the trained checkpoint on the contents of the `test/images` folder downloaded from Roboflow.

In [None]:
model_number = 1
conf = 0.4

In [None]:
model_name = os.path.join(f'model_{model_number}','detect')
weights_path = os.path.join(paths['MODELS'],f'model_{model_number}','train','weights','best.pt')
test_path = os.path.join(paths['DATASET'],'test','images')

In [None]:
!python {files['DETECT_PY']} --weights {weights_path} --img {img_size} --conf {conf} --source {test_path} --project {paths['MODELS']} --name {model_name}
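
The `--conf` value is the minimum confidence a detection needs in order to be kept. A toy sketch of that filtering on already decoded detections; the `(label, confidence)` tuple layout and the sample values are made up for illustration and are not the format `detect.py` uses internally.

```python
def filter_by_confidence(detections, conf_threshold=0.4):
    """Keep only detections whose confidence exceeds the threshold."""
    return [d for d in detections if d[1] > conf_threshold]

# made-up detections: (label, confidence)
detections = [('pen', 0.91), ('eraser', 0.35), ('scissors', 0.62), ('pencil', 0.12)]
print(filter_by_confidence(detections))
```

Raising the threshold trades missed objects for fewer false positives, so it is worth trying a few values on the test images.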

In [None]:
#display inference on ALL test images
#this looks much better with longer training above

import glob
from IPython.display import Image, display

path = os.path.join(paths['MODELS'],model_name,'*.jpg')  # assuming JPG images
for imageName in glob.glob(path):
    display(Image(filename=imageName))
    print("\n")

## Export to ONNX Format

In [None]:
model_number = 2

In [None]:
weights_path = os.path.join(paths['MODELS'],f'model_{model_number}','train','weights','best.pt')

In [None]:
!python {files['EXPORT_PY']} --weights {weights_path} --imgsz {img_size} {img_size} --include onnx
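
When the exported ONNX model is later run outside this notebook, input images have to match the size given to `--imgsz`. One common way to get there, and the one assumed here, is an aspect-ratio-preserving "letterbox" resize followed by padding. A minimal sketch of the geometry only; the function name and return layout are illustrative, and the actual resizing would be done with an image library.

```python
def letterbox_params(img_w, img_h, target=416):
    """Return the scale, resized size and (left, top, right, bottom) padding."""
    scale = min(target / img_w, target / img_h)  # fit the longer side
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_x, pad_y = target - new_w, target - new_h
    left, top = pad_x // 2, pad_y // 2  # centre the image in the square
    return scale, (new_w, new_h), (left, top, pad_x - left, pad_y - top)

scale, size, padding = letterbox_params(640, 480)
print(scale, size, padding)
```

The same scale and padding are needed again after inference to map predicted boxes back to the original image coordinates.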

In [None]:
!pip uninstall onnx --yes

Found existing installation: onnx 1.13.0
Uninstalling onnx-1.13.0:
  Successfully uninstalled onnx-1.13.0


## Download Model Data


In [None]:
model_number = 2

In [None]:
model_path = os.path.join(paths['MODELS'],f'model_{model_number}')
zip_name = f'model_{model_number}.zip'

In [None]:
print(model_path)

/content/models/model_2


In [None]:
!cd {model_path} && zip -r {zip_name} test train
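
The same archive can be built from Python, which avoids depending on the `zip` binary. A minimal sketch using the standard-library `zipfile` module; `zip_subfolders` is a hypothetical helper that stores paths relative to the model folder, as the shell command above does, and the demo runs on a throwaway folder with a placeholder file.

```python
import os
import tempfile
import zipfile

def zip_subfolders(root, zip_name, subfolders=('test', 'train')):
    """Archive the given subfolders of `root` into root/zip_name."""
    zip_path = os.path.join(root, zip_name)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for sub in subfolders:
            # missing subfolders are skipped silently: os.walk yields nothing
            for folder, _, file_names in os.walk(os.path.join(root, sub)):
                for file_name in file_names:
                    full_path = os.path.join(folder, file_name)
                    archive.write(full_path, os.path.relpath(full_path, root))
    return zip_path

# demo on a throwaway folder with a placeholder weights file
with tempfile.TemporaryDirectory() as root:
    weights_dir = os.path.join(root, 'train', 'weights')
    os.makedirs(weights_dir)
    with open(os.path.join(weights_dir, 'best.pt'), 'w') as f:
        f.write('placeholder')
    archive_path = zip_subfolders(root, 'model_0.zip')
    with zipfile.ZipFile(archive_path) as archive:
        demo_names = archive.namelist()
print(demo_names)
```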

In [None]:
google.colab.files.download(os.path.join(model_path,f'model_{model_number}.zip')) 


# Export Trained Weights for Future Inference

Now that you have trained your custom detector, you can export the trained weights to Google Drive and use them for inference on another device.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
best_weights = os.path.join(paths['MODELS'], f'model_{model_number}', 'train', 'weights', 'best.pt')
!cp {best_weights} "/content/gdrive/My Drive"