# YOLOv5s General

This repository covers training and evaluation of a YOLOv5s object detection model with either a single binary class (BIN: object/no-object) or multiple classes (MC).

For more information, see Roboflow's tutorial on YOLOv5s, available as a notebook, a blog post and a video. All three are intuitive and easy to follow.

To run YOLOv5l instead, change the model configuration (YAML) file in cell 6. This is an easy fix.
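As a rough sketch of what that change involves: the model variants differ mainly in the two scaling parameters at the top of the configuration file. The values for "s" below are the ones used in this notebook; the values for "l" are an assumption based on the upstream `yolov5l.yaml` and should be checked against the checkout you are using. The names `MULTIPLES` and `scaling_header` are hypothetical helpers, not part of YOLOv5.

```python
# Scaling parameters per model variant.
# "s" matches the config written in this notebook; "l" is assumed from
# the upstream yolov5l.yaml and should be verified in your checkout.
MULTIPLES = {
    "s": {"depth_multiple": 0.33, "width_multiple": 0.50},
    "l": {"depth_multiple": 1.0, "width_multiple": 1.0},
}


def scaling_header(variant: str, num_classes: int) -> str:
    """Return the parameter lines that open a YOLOv5 model YAML."""
    m = MULTIPLES[variant]
    return (
        f"nc: {num_classes}  # number of classes\n"
        f"depth_multiple: {m['depth_multiple']}  # model depth multiple\n"
        f"width_multiple: {m['width_multiple']}  # layer channel multiple\n"
    )


print(scaling_header("l", 1))
```

The backbone and head sections of the file stay the same in this sketch; only the header lines change.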

# Setup

These cells clone the YOLOv5 repository and install its requirements (requirements.txt).

In [None]:
# clone YOLOv5 and reset to a specific git checkpoint that has been verified working
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
!git reset --hard 68211f72c99915a15855f7b99bf5d93f5631330f

In [None]:
# install dependencies as necessary
!pip install -qr requirements.txt  # install dependencies (ignore errors)
import torch

from IPython.display import Image, clear_output  # to display images
from utils.google_utils import gdrive_download  # to download models/datasets

# clear_output()
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

# Download Correctly Formatted Custom Dataset 

This project uses Roboflow.com to convert the data to the correct format. If you have annotated data in any format, upload it to Roboflow.com and the application converts it to the desired format (here: YOLOv5). To import data from Roboflow you need your own key, which goes inside the quotation marks " " in the cell below. Because the key is sensitive, it has been removed from this repository.

If your data is already in YOLOv5 format, you can import or upload it directly. This project runs on Google Colab, so "/content/" is the path where all files are added, and the YOLOv5 data must be placed there as well. If you add data without going through Roboflow, make sure it still ends up in the correct path.
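When uploading data by hand, a quick check of the folder layout can save a failed training run. The sketch below assumes the layout implied by the rest of this notebook (`train`, `valid` and `test` folders under `/content`, each with `images` and `labels` subfolders); the function name `missing_dataset_dirs` is hypothetical.

```python
from pathlib import Path


def missing_dataset_dirs(root, splits=("train", "valid", "test")):
    """Return the expected image/label folders that do not exist under root."""
    root = Path(root)
    expected = [root / split / sub for split in splits for sub in ("images", "labels")]
    return [str(p) for p in expected if not p.is_dir()]


missing = missing_dataset_dirs("/content")
print("Missing folders:", missing if missing else "none")
```

An empty result means every expected folder exists; it does not verify that the label files themselves are valid.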

In [None]:
# Export code snippet and paste here
%cd /content
!curl -L " " > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

In [None]:
# this is the YAML file Roboflow wrote for us that we're loading into this notebook with our data
%cat data.yaml

# Change data from MC to BIN

### Note: this step only matters if you want to convert a multi-class dataset (e.g. 5 or 6 classes) to binary classes (object/no-object). If you want to keep the data as it is, whether multi-class or binary, do NOT run the next two cells.

These cells change the annotations from multiple classes to binary (object/no-object).

In [None]:
%%writefile /content/data.yaml
train: ../train/images
val: ../valid/images
test: ../test/images

nc: 1
names: ['Maritime Object']

Overwriting /content/data.yaml


In [None]:
import os

directory = '/content/test/labels/'  # Run once each for test, train and valid

for navn in os.listdir(directory):
    if not navn.endswith(".txt"):
        continue
    path = os.path.join(directory, navn)

    # Read all annotation lines; the context manager closes the file afterwards
    with open(path, "r") as f:
        lines = f.readlines()

    # Create a list to hold the rewritten lines
    new_file = []
    for line in lines:
        # Each YOLOv5 label line is "class x_center y_center width height"
        line_splitted = line.split()
        if len(line_splitted) < 5:
            continue  # skip blank or malformed lines

        x_cen, y_cen, width, height = (float(v) for v in line_splitted[1:5])

        # Replace the class id with 0 and keep the bounding box unchanged
        new_file.append(f"0 {x_cen} {y_cen} {width} {height}")

    # Overwrite the file, one annotation per line, without a trailing newline
    with open(path, "w") as f:
        f.write("\n".join(new_file))
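Before overwriting label files, it can help to try the relabeling on a string first. The following is a minimal sketch of the same idea as a pure function: it sets every class id to 0 and leaves the box coordinates as they are. The name `to_single_class` is hypothetical, and unlike the cell above it keeps the coordinate text verbatim instead of converting it to float and back.

```python
def to_single_class(label_text: str) -> str:
    """Rewrite YOLO label lines so that every box gets class id 0."""
    out = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        # Keep x_center, y_center, width, height exactly as written
        out.append(" ".join(["0"] + parts[1:5]))
    return "\n".join(out)


sample = "3 0.5 0.5 0.2 0.1\n1 0.25 0.75 0.1 0.3\n"
print(to_single_class(sample))
```

Both sample boxes come back with class id 0 and unchanged coordinates.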

# Define Model Configuration and Architecture

Adapt the model configuration YAML file to the custom dataset. This is simply the configuration file for the YOLOv5 model (here: YOLOv5s), with the number of classes read from data.yaml.

In [None]:
# define number of classes based on YAML
import yaml
with open("data.yaml", 'r') as stream:
    num_classes = str(yaml.safe_load(stream)['nc'])

In [None]:
num_classes

'1'

In [None]:
# customize IPython writefile so we can write variables
from IPython.core.magic import register_line_cell_magic

@register_line_cell_magic
def writetemplate(line, cell):
    with open(line, 'w') as f:
        f.write(cell.format(**globals()))
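To see what the magic above does without writing a file, here is a small sketch of the same substitution using plain `str.format`. The helper name `render_template` is hypothetical; the magic itself passes `globals()` instead of an explicit dictionary.

```python
def render_template(cell: str, variables: dict) -> str:
    """Fill {name} placeholders the way the magic does with globals()."""
    return cell.format(**variables)


cell = (
    "nc: {num_classes}  # number of classes\n"
    "depth_multiple: 0.33  # model depth multiple\n"
)
print(render_template(cell, {"num_classes": "1"}))

# Literal braces in a template must be doubled, otherwise str.format
# tries to treat them as placeholders.
print(render_template("flow: {{a: 1}}", {}))
```

This is why the model YAML in the next cell uses square brackets only: a stray `{` or `}` in the template would need escaping.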


In [None]:
%%writetemplate /content/yolov5/models/custom_yolov5s.yaml

# parameters
nc: {num_classes}  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, BottleneckCSP, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, BottleneckCSP, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

# Train the YOLOv5 Detector

This cell trains the model. If you already have trained weights, do not run it!

Arguments:
- **img:** define input image size
- **batch:** determine batch size
- **epochs:** define the number of training epochs. (Note: 3000+ epochs are common here!)
- **data:** set the path to our yaml file
- **cfg:** specify our model configuration
- **weights:** specify a custom path to weights. (Note: you can download weights from the Ultralytics Google Drive [folder](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J))
- **name:** result names
- **nosave:** only save the final checkpoint
- **cache:** cache images for faster training
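The arguments above can also be assembled programmatically, which makes it easier to vary one setting at a time. This is only a sketch: `build_train_command` is a hypothetical helper, the defaults mirror the training cell in this notebook, and the weights path in the example is a placeholder.

```python
import shlex


def build_train_command(data, cfg, weights, name, img=640, batch=16,
                        epochs=750, cache=True, nosave=False):
    """Assemble the train.py call from the arguments listed above."""
    cmd = [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--cfg", cfg,
        "--weights", weights,
        "--name", name,
    ]
    if cache:
        cmd.append("--cache")   # cache images for faster training
    if nosave:
        cmd.append("--nosave")  # only save the final checkpoint
    return cmd


cmd = build_train_command(
    data="../data.yaml",
    cfg="./models/custom_yolov5s.yaml",
    weights="path/to/weights.pt",  # placeholder, replace with your own file
    name="yolov5s_results",
)
print(shlex.join(cmd))
```

The printed line can be pasted after `!` in a notebook cell, or the list can be passed to `subprocess.run`.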

In [None]:
%%time
# time the training run (%%time must be the first line of the cell)
%cd /content/yolov5/
!python train.py --img 640 --batch 16 --epochs 750 --data '../data.yaml' --cfg ./models/custom_yolov5s.yaml --weights /content/gdrive/MyDrive/0Thesis/Hi-Res/BIN_v5small_NormBox/runs/train/yolov5s_results/weights/best.pt --name yolov5s_results --cache

# Evaluate Custom YOLOv5 Detector Performance
This cell visualizes the training run in TensorBoard. Do not run it unless training has been run, as it needs the training log files.

In [None]:
# Start tensorboard
# logs save in the folder "runs"
%load_ext tensorboard
#%tensorboard --logdir runs

%tensorboard --logdir .../runs

# Export Trained Weights for Future Inference

Export the trained weights to Google Drive. Do this during training so that the weights are also stored "locally" (outside of the runtime).

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
%mv source destination
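If you prefer to keep a copy in the runtime rather than move the file, the same step can be sketched in Python. `backup_weights` is a hypothetical helper, and both paths in the commented usage are assumptions: the source follows the `runs/train/<name>/weights/best.pt` pattern seen in the training cell, and the destination is a placeholder folder on the mounted Drive.

```python
import shutil
from pathlib import Path


def backup_weights(src, dest_dir):
    """Copy a weights file into dest_dir, creating the folder if needed."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    if not src.is_file():
        raise FileNotFoundError(f"No weights file at {src}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves the file's timestamps
    return Path(shutil.copy2(src, dest_dir / src.name))


# Example usage (paths are assumptions, adjust to your own run and Drive):
# backup_weights(
#     "/content/yolov5/runs/train/yolov5s_results/weights/best.pt",
#     "/content/gdrive/MyDrive/my_project/weights",
# )
```

Copying instead of moving leaves the original in place, so a later evaluation cell can still find it in the runtime.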

# Evaluation
This is the evaluation. The cells report the results of the model on the test data. To evaluate on the validation data instead, change one line in the configuration file below (/content/data.yaml) so that "test: ../test/images" becomes "test: ../valid/images".

#### Important:
- "nc" is the number of classes. Make sure that this number is correct (MC: often 5, BIN: 1).

- The names must also be correct and in the correct order. Cell number 4 in this project shows the content of the original data.yaml file. Alternatively, run %cat data.yaml before the cell below to see the names and their order.

- Most projects use img-size 640, but some use a higher resolution (for instance 2k).
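The first two points can be checked automatically before running the evaluation. The sketch below works on a plain dictionary, such as the one `yaml.safe_load` returns for data.yaml; the function name `check_data_config` is hypothetical.

```python
def check_data_config(config: dict) -> list:
    """Return a list of problems found in a data.yaml-style dictionary."""
    problems = []
    names = config.get("names", [])
    nc = config.get("nc")
    if nc != len(names):
        problems.append(f"nc is {nc} but {len(names)} names are listed")
    if "test" not in config:
        problems.append("no 'test' entry for the evaluation images")
    return problems


bin_config = {
    "train": "../train/images",
    "test": "../test/images",
    "nc": 1,
    "names": ["Maritime Object"],
}
print(check_data_config(bin_config))  # prints []
```

The check cannot tell whether the names are in the right order; that still has to be compared by eye against the original data.yaml.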

In [None]:
%%writefile /content/data.yaml
train: ../train/images
test: ../test/images

nc: 1
names: ['Maritime Object']

Overwriting /content/data.yaml


In [None]:
%cd /content/yolov5/
!python test.py --weights <.../name_of_weights.pt> --img-size 640 --conf-thres 0.001 --data /content/data.yaml --task 'test' --verbose

#!python test.py --weights /content/gdrive/MyDrive/0Thesis/Hi-Res/BIN_v5small_NormBox/best.pt --img-size 640 --conf-thres 0.001 --data /content/data.yaml --task 'test' --verbose

/content/yolov5
Namespace(augment=False, batch_size=32, conf_thres=0.001, data='/content/data.yaml', device='', exist_ok=False, img_size=640, iou_thres=0.6, name='exp', project='runs/test', save_conf=False, save_json=False, save_txt=False, single_cls=False, task='test', verbose=True, weights=['/content/gdrive/MyDrive/0Thesis/Hi-Res/BIN_v5small_NormBox/best.pt'])
Using torch 1.7.0+cu101 CUDA:0 (Tesla P100-PCIE-16GB, 16280MB)

Fusing layers... 
Model Summary: 232 layers, 7246518 parameters, 0 gradients
Scanning '../test/labels.cache' for images and labels... 81 found, 0 missing, 27 empty, 0 corrupted: 100% 81/81 [00:00<00:00, 834738.63it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 3/3 [00:05<00:00,  1.97s/it]
                 all          81          89       0.747       0.955       0.945        0.49
Speed: 3.2/2.1/5.3 ms inference/NMS/total per 640x640 image at batch-size 32
Results saved to runs/test/exp9
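The log above reports precision (P) and recall (R) separately. If a single number is wanted, the F1 score (the harmonic mean of the two) can be computed from them. This is a small worked example using the values from the "all" row; F1 is not part of the test.py output shown here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# P and R taken from the "all" row of the log above
print(round(f1_score(0.747, 0.955), 3))  # prints 0.838
```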
