# YOLOv5 helmet detection

**The goal of this notebook is to explore a method for object detection, specifically helmet detection, using the YOLOv5 model.**

YOLOv5 (You only look once) 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.


### References

* [Train] NFL Extra Images YOLOv5 with W&B - https://www.kaggle.com/ayuraj/train-nfl-extra-images-yolov5-with-w-b/ 
* YOLOv5 repository - https://github.com/ultralytics/yolov5
* NFL Helmet Assignment - Getting Started Guide - https://www.kaggle.com/robikscube/nfl-helmet-assignment-getting-started-guide
* MMDet CascadeRCNN helmet detection for beginners - https://www.kaggle.com/eneszvo/mmdet-cascadercnn-helmet-detection-for-beginners

### Setup

The directory structure that YOLOv5 requires:

```
/parent_folder
    /dataset
         /images
         /labels
    /yolov5
```

In [None]:
%cd ../
!mkdir tmp
%cd tmp

In [None]:
# Download YOLOv5
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
# Install dependencies
%pip install -qr requirements.txt  # install dependencies

%cd ../
import torch
print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")

In [None]:
import os
import gc
import cv2
import numpy as np
import pandas as pd
from tqdm import tqdm
from shutil import copyfile
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from IPython.display import Video, display
import subprocess

In [None]:
%cd ../

TRAIN_PATH = 'input/nfl-health-and-safety-helmet-assignment/images/'
BATCH_SIZE = 16
EPOCHS = 2

print(f'Number of extra images: {len(os.listdir(TRAIN_PATH))}') 

In [None]:
# Load image level csv file
extra_df = pd.read_csv('input/nfl-health-and-safety-helmet-assignment/image_labels.csv')
print('Number of ground truth bounding boxes: ', len(extra_df))

# Number of unique labels
label_to_id = {label: i for i, label in enumerate(extra_df.label.unique())}
print('Unique labels: ', label_to_id)

# Group together bbox coordinates belonging to the same image. 
# key is the name of the image, value is a dataframe with label and bbox coordinates. 
image_bbox_label = {} 
for image, df in extra_df.groupby('image'): 
    image_bbox_label[image] = df.reset_index(drop=True)

# Visualize
extra_df.head()

In [None]:
# Create train and validation split.
train_names, valid_names = train_test_split(list(image_bbox_label), test_size=0.2, random_state=42)
print(f'Size of dataset: {len(image_bbox_label)}, '
      f'training images: {len(train_names)}, '
      f'validation images: {len(valid_names)}')

The required folder structure for the dataset directory is:

```
/parent_folder
    /dataset
         /images
             /train
             /valid
         /labels
             /train
             /valid
    /yolov5
```
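
YOLOv5 pairs each image with its label file by swapping the `images` path component for `labels` and the image extension for `.txt`. The following is a minimal sketch of a check for that pairing, assuming the layout above. `find_unlabeled_images` and `IMAGE_EXTENSIONS` are hypothetical names introduced here, not part of YOLOv5.

```python
from pathlib import Path

# Assumption: the image formats worth checking; adjust to the dataset at hand.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(dataset_root):
    """Return images under <root>/images/<split>/ with no <root>/labels/<split>/<stem>.txt."""
    root = Path(dataset_root)
    missing = []
    for image_path in sorted((root / "images").glob("*/*")):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # ignore non-image files
        split = image_path.parent.name
        label_path = root / "labels" / split / f"{image_path.stem}.txt"
        if not label_path.exists():
            missing.append(image_path)
    return missing
```

Once the images are copied and the label files are written, `find_unlabeled_images('tmp/nfl_extra')` should return an empty list.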

In [None]:
os.makedirs('tmp/nfl_extra/images/train', exist_ok=True)
os.makedirs('tmp/nfl_extra/images/valid', exist_ok=True)

os.makedirs('tmp/nfl_extra/labels/train', exist_ok=True)
os.makedirs('tmp/nfl_extra/labels/valid', exist_ok=True)

# Copy the images into the matching split folder.
for img_name in tqdm(train_names):
    copyfile(os.path.join(TRAIN_PATH, img_name), f'tmp/nfl_extra/images/train/{img_name}')

for img_name in tqdm(valid_names):
    copyfile(os.path.join(TRAIN_PATH, img_name), f'tmp/nfl_extra/images/valid/{img_name}')

In [None]:
import yaml

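data_yaml = dict(
    train = '../nfl_extra/images/train',
    val = '../nfl_extra/images/valid',
    nc = len(label_to_id),  # number of classes, derived from the data rather than hard-coded
    names = list(label_to_id)  # class names in the same order as their ids
)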

# Note that the file is created in the yolov5/data/ directory.
with open('tmp/yolov5/data/data.yaml', 'w') as outfile:
    yaml.dump(data_yaml, outfile, default_flow_style=True)
    
%cat tmp/yolov5/data/data.yaml
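
YOLOv5 expects `nc` to equal the number of entries in `names`, so it can be worth checking the configuration before training. Below is a minimal sketch that works on a plain dictionary. `check_dataset_config` is a hypothetical helper, and the class names in the example are placeholders.

```python
def check_dataset_config(cfg):
    """Verify that a YOLOv5-style dataset config dict is internally consistent."""
    required = ("train", "val", "nc", "names")
    missing = [key for key in required if key not in cfg]
    if missing:
        raise KeyError(f"missing keys: {missing}")
    if cfg["nc"] != len(cfg["names"]):
        raise ValueError(f"nc={cfg['nc']} but {len(cfg['names'])} names were given")
    if len(set(cfg["names"])) != len(cfg["names"]):
        raise ValueError("class names must be unique")
    return cfg


# Illustrative values only; the real names come from the label column.
example_cfg = dict(
    train="../nfl_extra/images/train",
    val="../nfl_extra/images/valid",
    nc=2,
    names=["class_a", "class_b"],
)
check_dataset_config(example_cfg)
```

In this notebook the same check could be run as `check_dataset_config(data_yaml)` before the file is written.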

In [None]:
def get_yolo_format_bbox(img_w, img_h, box):
    """
    Convert a bounding box to YOLO format.

    Input:
    img_w - Original/Scaled image width
    img_h - Original/Scaled image height
    box - Row with the pixel fields left, width, top, height

    Output:
    Return the normalized YOLO box as [x_center, y_center, width, height].
    """
    w = box.width  # box width in pixels
    h = box.height  # box height in pixels
    xc = box.left + w / 2  # xmin + width/2, kept unrounded to preserve sub-pixel centers
    yc = box.top + h / 2  # ymin + height/2

    return [xc / img_w, yc / img_h, w / img_w, h / img_h]  # x_center y_center width height
    
# Iterate over each image and write the labels and bbox coordinates to a .txt file.
train_set = set(train_names)  # set lookup avoids scanning the list for every image
for img_name, df in tqdm(image_bbox_label.items()):
    # Open the image file to get its height and width.
    img = cv2.imread(os.path.join(TRAIN_PATH, img_name))
    height, width, _ = img.shape

    # Every image belongs to exactly one split, so anything not in train is in valid.
    split = 'train' if img_name in train_set else 'valid'
    stem = os.path.splitext(img_name)[0]
    file_name = f'tmp/nfl_extra/labels/{split}/{stem}.txt'

    with open(file_name, 'w') as f:
        for row in df.itertuples(index=False):
            # One line per box: class id followed by the YOLO-format coordinates.
            bbox = get_yolo_format_bbox(width, height, row)
            values = [label_to_id[row.label]] + bbox
            f.write(' '.join(str(v) for v in values))
            f.write('\n')
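
To make the conversion concrete, here is a small worked example, written as a sketch that is independent of the notebook's dataframe. `to_yolo` and `from_yolo` are hypothetical helpers, and the box and image size are made-up values. The center is computed as `left + width / 2` and `top + height / 2`, then everything is divided by the image size.

```python
def to_yolo(left, top, width, height, img_w, img_h):
    """Pixel box (left, top, width, height) -> normalized (x_center, y_center, w, h)."""
    xc = (left + width / 2) / img_w
    yc = (top + height / 2) / img_h
    return xc, yc, width / img_w, height / img_h


def from_yolo(xc, yc, w, h, img_w, img_h):
    """Normalized YOLO box -> pixel (left, top, width, height)."""
    width, height = w * img_w, h * img_h
    return xc * img_w - width / 2, yc * img_h - height / 2, width, height


# A 64x72 box whose top-left corner is at (640, 180) in a 1280x720 frame.
yolo_box = to_yolo(left=640, top=180, width=64, height=72, img_w=1280, img_h=720)
print(yolo_box)  # (0.525, 0.3, 0.05, 0.1)

# Converting back should recover the original pixel box up to float error.
print(from_yolo(*yolo_box, img_w=1280, img_h=720))
```

All four values lie in the range 0 to 1, which is a quick way to spot a box that was normalized with the wrong image size.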

In [None]:
%cd tmp/yolov5/

In [None]:
# Turn off W&B syncing if you don't need it.

os.environ['WANDB_MODE'] = 'offline'

In [None]:
# Recent YOLOv5 versions spell the checkpoint flag --save-period (older ones used --save_period).
!python train.py --img 720 \
                 --batch {BATCH_SIZE} \
                 --epochs {EPOCHS} \
                 --data data.yaml \
                 --weights yolov5s.pt \
                 --save-period 1 \
                 --project nfl-extra

In [None]:
data_dir = '/kaggle/input/nfl-health-and-safety-helmet-assignment/'
example_video = f'{data_dir}/test/57906_000718_Endzone.mp4'

# Display the example video, scaled down
frac = 0.65
display(Video(example_video, embed=True, height=int(720*frac), width=int(1280*frac)))

In [None]:
# create frames from video
img_ext = 'png'
image_name = '57906_000718_Endzone'
frame_dir = '/kaggle/tmp/mp4_img/'
os.makedirs(frame_dir, exist_ok=True)

# Pass the arguments as a list so that no shell quoting is needed.
cmd = ['ffmpeg', '-i', example_video, '-qscale:v', '2',
       os.path.join(frame_dir, f'{image_name}_%d.{img_ext}')]
print(' '.join(cmd))
subprocess.run(cmd)

In [None]:
# output folder name
project_name = '57906_000718_Endzone'
# best weights after training
best_weights = 'nfl-extra/exp/weights/best.pt'

### Inference

In [None]:
!python detect.py --weights {best_weights} \
                  --source {frame_dir} \
                  --img 720 \
                  --save-txt \
                  --save-conf \
                  --project {project_name}
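
With `--save-txt` and `--save-conf`, `detect.py` writes one text file per frame into a `labels` folder inside the run directory. Each line holds one detection: the class id, the normalized `x_center y_center width height`, and then the confidence. Below is a minimal sketch of reading such lines back into pixel boxes. `parse_detections` is a hypothetical helper and the sample line is invented for illustration.

```python
def parse_detections(lines, img_w, img_h):
    """Turn 'cls xc yc w h conf' lines into dicts with pixel-space boxes."""
    detections = []
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        xc, yc, w, h, conf = map(float, parts[1:])
        detections.append({
            "class_id": class_id,
            "left": (xc - w / 2) * img_w,
            "top": (yc - h / 2) * img_h,
            "width": w * img_w,
            "height": h * img_h,
            "conf": conf,
        })
    return detections


sample = ["0 0.5 0.5 0.1 0.2 0.87"]  # made-up detection, not real model output
print(parse_detections(sample, img_w=1280, img_h=720))
```

The class id maps back to a label name through the `names` list in `data.yaml`.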

In [None]:
# make video from frames
video_name = '57906_000718_Endzone_fps60.mp4'
tmp_video_path = os.path.join('/kaggle/working/', f'tmp_{video_name}')
video_path = os.path.join('/kaggle/working/', video_name)

frame_rate = 60

result_dir = os.path.join(project_name, 'exp')
# Keep only the annotated frames; detect.py also writes a 'labels' folder here.
images = [f for f in os.listdir(result_dir) if f.endswith(f'.{img_ext}')]
# Sort by the frame index at the end of the file name, e.g. ..._12.png -> 12.
images.sort(key=lambda x: int(os.path.splitext(x)[0].split('_')[-1]))

frame = cv2.imread(os.path.join(result_dir, images[0]))
height, width, layers = frame.shape

video = cv2.VideoWriter(tmp_video_path, cv2.VideoWriter_fourcc(*'mp4v'),
                        frame_rate, (width, height))

for f in images:
    img = cv2.imread(os.path.join(result_dir, f))
    video.write(img)

video.release()

# Not all browsers support the codec, we will re-load the file at tmp_video_path
# and convert to a codec that is more broadly readable using ffmpeg

if os.path.exists(video_path):
    os.remove(video_path)
    
subprocess.run(["ffmpeg", "-i", tmp_video_path, "-crf", "18", "-preset", "veryfast",
                "-vcodec","libx264", video_path,])

os.remove(tmp_video_path)

### Results

In [None]:
frac = 0.65
display(Video(video_path, embed=True, height=int(720*frac), width=int(1280*frac)))