# NFL helmet with Yolov5-deepsort starter

In this competition, the NFL asks Kagglers to develop a solution for tracking the identity of each player's helmet, in order to better understand player collisions during a game. In machine learning terms, this is a "Multi-Object Tracking" (MOT) problem, which combines object detection with linking the detected boxes across frames. The Papers with Code leaderboard for state-of-the-art methods: https://paperswithcode.com/task/multi-object-tracking

The code below uses YOLOv5 to detect helmets and DeepSORT (https://arxiv.org/abs/1703.07402), an old but simple algorithm, to link the detection boxes across frames into player identities.
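To make the "link detection boxes across frames" step concrete, here is a minimal sketch of tracking-by-detection association. It is a deliberate simplification and not DeepSORT itself: DeepSORT additionally uses a Kalman filter for motion prediction, an appearance embedding, and Hungarian matching, whereas this sketch only matches boxes greedily by IoU. The function names `iou` and `associate`, the threshold of 0.3, and the example boxes are all illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match existing tracks to new detections by IoU.

    tracks: dict {track_id: box from the previous frame}
    detections: list of boxes from the current frame
    Returns (matches, unmatched), where matches maps track_id -> detection
    index and unmatched lists detection indices that matched no track.
    """
    # Score every (track, detection) pair and visit the best overlaps first.
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, j in pairs:
        if score < iou_threshold:
            break  # all remaining pairs overlap even less
        if tid in matches or j in used:
            continue  # each track and each detection is matched at most once
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched


# Hypothetical boxes: two tracked helmets and three detections in the next frame.
tracks = {1: (10, 10, 30, 30), 2: (100, 50, 120, 70)}
detections = [(102, 51, 122, 71), (12, 11, 32, 31), (300, 300, 320, 320)]
matches, unmatched = associate(tracks, detections)
print(matches)    # track 1 keeps detection 1, track 2 keeps detection 0
print(unmatched)  # detection 2 matched nothing and would start a new track
```

When two helmets overlap heavily, overlap alone cannot tell them apart, which is one reason DeepSORT adds appearance features on top of motion.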

In [None]:
import numpy as np
import pandas as pd
import os

In [None]:
!git clone --recurse-submodules https://github.com/etrain-xyz/Yolov5_DeepSort_Pytorch.git

In [None]:
!ls Yolov5_DeepSort_Pytorch/yolov5

In [None]:
!cd Yolov5_DeepSort_Pytorch && pip install -r requirements.txt

The YOLOv5 helmet detection weights come from https://www.kaggle.com/duythanhng/yolov5-v6-0-helmet-detection

In [None]:
!cp "../input/yolov5-v6-0-helmet-detection/yolov5_weight/exp/weights/best.pt" Yolov5_DeepSort_Pytorch

In [None]:
!add-apt-repository --yes ppa:ubuntu-toolchain-r/test
!apt-get update --yes
!apt-get upgrade --yes
!apt install --yes gcc-9 libstdc++6

In [None]:
%cd "/kaggle/working/Yolov5_DeepSort_Pytorch"

In [None]:
import sys
# sys.path.insert(0, './yolov5')
sys.path.append('/kaggle/working/Yolov5_DeepSort_Pytorch/yolov5')

from yolov5.models.experimental import attempt_load
from yolov5.utils.downloads import attempt_download
from yolov5.utils.datasets import LoadImages, LoadStreams
from yolov5.utils.general import check_img_size, non_max_suppression, scale_coords, check_imshow, xyxy2xywh
from yolov5.utils.torch_utils import select_device, time_sync
from yolov5.utils.plots import Annotator, colors
from deep_sort_pytorch.utils.parser import get_config
from deep_sort_pytorch.deep_sort import DeepSort
import argparse
import os
import platform
import shutil
import time
from pathlib import Path
import cv2
import torch
import torch.backends.cudnn as cudnn

def detect(source):
    out = "inference/output"
    imgsz = 1280
    yolo_weights = "best.pt"
    deep_sort_weights = "deep_sort_pytorch/deep_sort/deep/checkpoint/ckpt.t7"
    config_deepsort = "deep_sort_pytorch/configs/deep_sort.yaml"
    augment = False
    conf_thres = 0.4
    iou_thres = 0.5
    classes = 0
    agnostic_nms = False
    
    # initialize deepsort
    cfg = get_config()
    cfg.merge_from_file(config_deepsort)
    
    attempt_download(deep_sort_weights, repo='mikel-brostrom/Yolov5_DeepSort_Pytorch')
    deepsort = DeepSort(cfg.DEEPSORT.REID_CKPT,
                        max_dist=cfg.DEEPSORT.MAX_DIST, min_confidence=cfg.DEEPSORT.MIN_CONFIDENCE,
                        max_iou_distance=cfg.DEEPSORT.MAX_IOU_DISTANCE,
                        max_age=cfg.DEEPSORT.MAX_AGE, n_init=cfg.DEEPSORT.N_INIT, nn_budget=cfg.DEEPSORT.NN_BUDGET,
                        use_cuda=True)

    # Initialize
    device = select_device('0')

    # Start every run from a clean output folder
    if os.path.exists(out):
        shutil.rmtree(out)  # delete output folder
    os.makedirs(out)  # make new output folder

    half = device.type != 'cpu'  # half precision only supported on CUDA
    # Load model
    model = attempt_load(yolo_weights, map_location=device)  # load FP32 model
    stride = int(model.stride.max())  # model stride
    imgsz = check_img_size(imgsz, s=stride)  # check img_size
    names = model.module.names if hasattr(model, 'module') else model.names  # get class names
    if half:
        model.half()  # to FP16

    # Set Dataloader
    vid_path, vid_writer = None, None
    
    dataset = LoadImages(source, img_size=imgsz, stride=stride)

    # Get names and colors
    names = model.module.names if hasattr(model, 'module') else model.names

    # Run inference
    if device.type != 'cpu':
        model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
    t0 = time.time()

    save_path = str(Path(out))
    # extract what is in between the last '/' and last '.'
    txt_file_name = source.split('/')[-1].split('.')[0]
    txt_path = str(Path(out)) + '/' + txt_file_name + '.txt'

    for frame_idx, (path, img, im0s, vid_cap) in enumerate(dataset):
        img = torch.from_numpy(img).to(device)
        img = img.half() if half else img.float()  # uint8 to fp16/32
        img /= 255.0  # 0 - 255 to 0.0 - 1.0
        if img.ndimension() == 3:
            img = img.unsqueeze(0)

        # Inference
        t1 = time_sync()
        pred = model(img, augment=augment)[0]

        # Apply NMS
        pred = non_max_suppression(pred, conf_thres, iou_thres, classes=classes, agnostic=agnostic_nms)
        t2 = time_sync()

        # Process detections
        for i, det in enumerate(pred):  # detections per image
            p, s, im0 = path, '', im0s

            s += '%gx%g ' % img.shape[2:]  # print string
            save_path = str(Path(out) / Path(p).name)

            annotator = Annotator(im0, line_width=2, pil=False)  # draw with OpenCV rather than PIL

            if det is not None and len(det):
                # Rescale boxes from img_size to im0 size
                det[:, :4] = scale_coords(
                    img.shape[2:], det[:, :4], im0.shape).round()

                # Print results
                for c in det[:, -1].unique():
                    n = (det[:, -1] == c).sum()  # detections per class
                    s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

                xywhs = xyxy2xywh(det[:, 0:4])
                confs = det[:, 4]
                clss = det[:, 5]

                # pass detections to deepsort
                outputs = deepsort.update(xywhs.cpu(), confs.cpu(), clss.cpu(), im0)
                
                # draw boxes for visualization
                if len(outputs) > 0:
                    for j, (output, conf) in enumerate(zip(outputs, confs)): 
                        
                        bboxes = output[0:4]
                        id = output[4]
                        cls = output[5]

                        c = int(cls)  # integer class
                        label = f'{id} {names[c]} {conf:.2f}'
                        annotator.box_label(bboxes, label, color=colors(id, True))
            else:
                deepsort.increment_ages()

            # Print time (inference + NMS)
            print('%sDone. (%.3fs)' % (s, t2 - t1))

            # Stream results
            im0 = annotator.result()

            # Save results (image with detections)
            if vid_path != save_path:  # new video
                vid_path = save_path
                if isinstance(vid_writer, cv2.VideoWriter):
                    vid_writer.release()  # release previous video writer
                if vid_cap:  # video
                    fps = vid_cap.get(cv2.CAP_PROP_FPS)
                    w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                    h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                else:  # stream
                    fps, w, h = 30, im0.shape[1], im0.shape[0]
                    save_path += '.mp4'

                vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
            vid_writer.write(im0)

    print('Results saved to %s' % os.getcwd() + os.sep + out)
    print('Done. (%.3fs)' % (time.time() - t0))


In [None]:
detect("../../input/nfl-health-and-safety-helmet-assignment/test/57906_000718_Endzone.mp4")

Let's see the output:

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('TofMADTFkjI', width=800, height=450)

What makes multi-object tracking a difficult problem? The video above already shows clearly that some helmets are not tracked consistently throughout the clip. Besides the false positive and false negative problems inherited from ordinary object detection, there are other issues.

![](https://i.imgur.com/7IBr47d.png)
![](https://i.imgur.com/DyUtK4y.png)

For example, in the two images above, the identities of two players are swapped after they come close together.
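An identity switch like this can be counted once each ground-truth player has been matched to a predicted track ID in every frame. The sketch below assumes that matching has already been done; the function name `count_id_switches` and the player labels `H23` and `V45` are hypothetical and used only for illustration.

```python
def count_id_switches(frames):
    """Count how often a ground-truth object changes its predicted track ID.

    frames: list of dicts {ground_truth_label: predicted_track_id}, one per frame.
    """
    last_id = {}  # most recent track ID seen for each ground-truth label
    switches = 0
    for assignment in frames:
        for gt_label, track_id in assignment.items():
            if gt_label in last_id and last_id[gt_label] != track_id:
                switches += 1
            last_id[gt_label] = track_id
    return switches


# Hypothetical sequence: the two players cross in the third frame and swap IDs.
frames = [
    {"H23": 1, "V45": 2},
    {"H23": 1, "V45": 2},
    {"H23": 2, "V45": 1},
    {"H23": 2, "V45": 1},
]
print(count_id_switches(frames))  # 2, one switch per player
```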

![](https://i.imgur.com/7QzHM6q.png)
![](https://i.imgur.com/1RHItha.png)

Also, the model sometimes assigns more than one ID to the same object.

Therefore, multi-object tracking papers usually report several metrics to judge a model's performance:

![TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking](https://i.imgur.com/tPMqbpv.png)
![Real-Time Multiple Object Tracking A Study on the Importance of Speed](https://i.imgur.com/jG5wRTE.png)
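One metric commonly reported in MOT papers is MOTA (Multiple Object Tracking Accuracy) from the CLEAR MOT metrics, which folds the three error types discussed above into a single number: MOTA = 1 - (FN + FP + IDSW) / GT, where the counts are summed over all frames and GT is the total number of ground-truth boxes. The sketch below only evaluates that formula; the function name `mota` and the example counts are made up for illustration and are not results from this notebook.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Made-up counts: 100 ground-truth boxes, 10 missed, 5 spurious, 2 ID switches.
score = mota(false_negatives=10, false_positives=5, id_switches=2, num_gt=100)
print(round(score, 2))  # 0.83
```

MOTA can be negative when the tracker makes more errors than there are ground-truth boxes, and because it weights all three error types equally, papers usually report identity-focused metrics alongside it.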

I hope this notebook is useful!

In [None]:
%cd /kaggle/working/Yolov5_DeepSort_Pytorch
!rm -rf .git inference 
%cd /kaggle/working
!zip -r yolov5-deepsort-pytorch.zip Yolov5_DeepSort_Pytorch
!mkdir -p yolov5_deepsort_pytorch_source
!mv yolov5-deepsort-pytorch.zip ./yolov5_deepsort_pytorch_source/
!rm -rf Yolov5_DeepSort_Pytorch