# Object Tracking Using Yolov5 + DeepSort

The detections generated by YOLOv5, a family of object detection architectures and models pretrained on the COCO dataset, are passed to StrongSORT, which combines motion information with appearance information from OSNet to track the objects. It can track any object that your YOLOv5 model was trained to detect.

In [1]:
!git clone --recurse-submodules https://github.com/mikel-brostrom/Yolov5_DeepSort_Pytorch.git

Submodule path 'strong_sort/deep/reid': checked out '4a0793780bd13f53ec2ca753a94dcef62dc9e955'
Submodule path 'yolov5': checked out '47233e1698b89fc437a4fb9463c815e9171be955'


Cloning into 'Yolov5_DeepSort_Pytorch'...
Submodule 'strong_sort/deep/reid' (https://github.com/KaiyangZhou/deep-person-reid) registered for path 'strong_sort/deep/reid'
Submodule 'yolov5' (https://github.com/ultralytics/yolov5.git) registered for path 'yolov5'
Cloning into 'D:/yolov5_deepsort_env/Yolov5_DeepSort_Pytorch/strong_sort/deep/reid'...
Cloning into 'D:/yolov5_deepsort_env/Yolov5_DeepSort_Pytorch/yolov5'...


In [2]:
cd Yolov5_DeepSort_Pytorch

D:\yolov5_deepsort_env\Yolov5_DeepSort_Pytorch


In [3]:
#!pip install -r requirements.txt

In [1]:
!python --version

Python 3.7.6


In [4]:
#!pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html

In [5]:
import torch
torch.cuda.is_available() 
torch.cuda.get_device_name(0)

'NVIDIA GeForce RTX 3090'
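
In a notebook cell only the value of the last expression is displayed, so the result of torch.cuda.is_available() in the cell above is never shown. A minimal sketch that reports the check explicitly, and does not raise when no GPU (or no PyTorch install) is present, might look like this. The helper name describe_gpu is hypothetical and not part of the repository.

```python
def describe_gpu():
    """Return the name of the first CUDA device, or None if none is usable."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_name(0)


name = describe_gpu()
print(f"CUDA device: {name}" if name else "No CUDA device found")
```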

# Tracking can be run on most video formats

In [11]:
# python track.py --source 0  # webcam
#                            img.jpg  # image
#                            vid.mp4  # video
#                            path/  # directory
#                            path/*.jpg  # glob
#                            'https://youtu.be/Zgi9g1ksQHc'  # YouTube
#                            'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream
                        

!python track.py --source video.mp4 --save-vid --strong-sort-weights osnet_x0_25_msmt17.pt

# Passing --strong-sort-weights osnet_x0_25_msmt17.pt downloads these weights automatically into your current working directory.
# After this command finishes, the results are saved in a runs folder.

#!python track.py --source img.jpg



  'Cython evaluation (very fast so highly recommended) is '
[34m[1mtrack: [0myolo_weights=D:\yolov5_deepsort_env\Yolov5_DeepSort_Pytorch\weights\yolov5m.pt, strong_sort_weights=osnet_x0_25_msmt17.pt, config_strongsort=strong_sort/configs/strong_sort.yaml, source=video.mp4, imgsz=[640, 640], conf_thres=0.5, iou_thres=0.5, max_det=1000, device=, show_vid=False, save_txt=False, save_conf=False, save_crop=False, save_vid=True, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\track, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, hide_class=False, half=False, dnn=False
YOLOv5  2022-7-1 Python-3.7.6 torch-1.8.1+cu111 CUDA:0 (NVIDIA GeForce RTX 3090, 24576MiB)

Fusing layers... 
YOLOv5m summary: 290 layers, 21172173 parameters, 0 gradients
Downloading...
From: https://drive.google.com/uc?id=1Kkx2zW89jq_NETu4u42CFZTMVD5Hwm6e
To: D:\yolov5_deepsort_env\Yolov5_DeepSort_Pytorch\osnet_x0_25_msmt17.pt

Model: osnet_x0_25
- params: 203,568
- flops: 82,316,000
Successfully loaded pretrained weights from "osnet_x0_25_msmt17.pt"
** The following layers are discarded due to unmatched keys or layer size: ['classifier.weight', 'classifier.bias']
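
The log above shows project=runs\track and name=exp, so the results land in a folder under runs/track. A minimal sketch for locating the newest run folder and listing its contents, assuming each run gets its own sub-folder there (the helper latest_run is hypothetical, not part of the repository):

```python
from pathlib import Path
from typing import Optional


def latest_run(project: str = "runs/track") -> Optional[Path]:
    """Return the most recently modified run folder under `project`, or None."""
    root = Path(project)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    # Pick by modification time so the sketch does not depend on folder naming.
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


run_dir = latest_run()
if run_dir is None:
    print("No run folder found yet")
else:
    for item in sorted(run_dir.iterdir()):
        print(item.name)
```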


# Select object detection and ReID model
#### Yolov5
There is a clear trade-off between inference speed and accuracy. To match your speed and accuracy needs, you can select any model from the YOLOv5 family; it is downloaded automatically.

=============================================================================================================================
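If the name passed to --strong-sort-weights has no download URL associated with it, the tracker prints the following message and lists the ReID models it can download automatically:

No URL associated to the chosen DeepSort weights. Choose between:

Available ReID models for automatic download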
['resnet50_market1501.pt', 'resnet50_dukemtmcreid.pt', 'resnet50_msmt17.pt', 'resnet50_fc512_market1501.pt', 'resnet50_fc512_dukemtmcreid.pt', 'resnet50_fc512_msmt17.pt', 'mlfn_market1501.pt', 'mlfn_dukemtmcreid.pt', 'mlfn_msmt17.pt', 'hacnn_market1501.pt', 'hacnn_dukemtmcreid.pt', 'hacnn_msmt17.pt', 'mobilenetv2_x1_0_market1501.pt', 'mobilenetv2_x1_0_dukemtmcreid.pt', 'mobilenetv2_x1_0_msmt17.pt', 'mobilenetv2_x1_4_market1501.pt', 'mobilenetv2_x1_4_dukemtmcreid.pt', 'mobilenetv2_x1_4_msmt17.pt', 'osnet_x1_0_market1501.pt', 'osnet_x1_0_dukemtmcreid.pt', 'osnet_x1_0_msmt17.pt', 'osnet_x0_75_market1501.pt', 'osnet_x0_75_dukemtmcreid.pt', 'osnet_x0_75_msmt17.pt', 'osnet_x0_5_market1501.pt', 'osnet_x0_5_dukemtmcreid.pt', 'osnet_x0_5_msmt17.pt', 'osnet_x0_25_market1501.pt', 'osnet_x0_25_dukemtmcreid.pt', 'osnet_x0_25_msmt17.pt', 'osnet_ibn_x1_0_msmt17.pt', 'osnet_ain_x1_0_msmt17.pt']
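
The names in the list above appear to follow the pattern architecture_dataset.pt, with market1501, dukemtmcreid and msmt17 as the training datasets. That reading is an assumption drawn from the list itself, not a documented rule. Under that assumption, a small sketch for filtering the list (split_reid_name is a hypothetical helper, and only a few names from the list are repeated here):

```python
KNOWN_DATASETS = ("market1501", "dukemtmcreid", "msmt17")


def split_reid_name(filename):
    """Split e.g. 'osnet_x0_25_msmt17.pt' into ('osnet_x0_25', 'msmt17')."""
    stem = filename[:-3] if filename.endswith(".pt") else filename
    for dataset in KNOWN_DATASETS:
        suffix = "_" + dataset
        if stem.endswith(suffix):
            return stem[: -len(suffix)], dataset
    raise ValueError(f"Unrecognised ReID weight name: {filename}")


# A few entries copied from the list printed by the tracker.
models = [
    "resnet50_msmt17.pt",
    "osnet_x0_25_market1501.pt",
    "osnet_x0_25_msmt17.pt",
    "osnet_ain_x1_0_msmt17.pt",
]

# Keep only OSNet variants trained on MSMT17.
selected = []
for m in models:
    arch, dataset = split_reid_name(m)
    if arch.startswith("osnet") and dataset == "msmt17":
        selected.append(m)
print(selected)
```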

In [13]:
# python track.py --source 0 --yolo-weights yolov5n.pt --img 640
#                                           yolov5s.pt
#                                           yolov5m.pt
#                                           yolov5l.pt
#                                           yolov5x.pt --img 1280

!python track.py --source video.mp4 --yolo-weights weights/yolov5m.pt --img 640  --strong-sort-weights osnet_x0_25_msmt17.pt  --save-vid


Model: osnet_x0_25
- params: 203,568
- flops: 82,316,000
Successfully loaded pretrained weights from "osnet_x0_25_msmt17.pt"
** The following layers are discarded due to unmatched keys or layer size: ['classifier.weight', 'classifier.bias']


  'Cython evaluation (very fast so highly recommended) is '
[34m[1mtrack: [0myolo_weights=['weights/yolov5m.pt'], strong_sort_weights=osnet_x0_25_msmt17.pt, config_strongsort=strong_sort/configs/strong_sort.yaml, source=video.mp4, imgsz=[640, 640], conf_thres=0.5, iou_thres=0.5, max_det=1000, device=, show_vid=False, save_txt=False, save_conf=False, save_crop=False, save_vid=True, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\track, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, hide_class=False, half=False, dnn=False
YOLOv5  2022-7-1 Python-3.7.6 torch-1.8.1+cu111 CUDA:0 (NVIDIA GeForce RTX 3090, 24576MiB)

Fusing layers... 
YOLOv5m summary: 290 layers, 21172173 parameters, 0 gradients
video 1/1 (1/388) D:\yolov5_deepsort_env\Yolov5_DeepSort_Pytorch\video.mp4: 384x640 2 persons, 7 cars, 3 traffic lights, Done. YOLO:(0.014s), StrongSORT:(0.044s)
video 1/1 (2/388) D:\yolov5_deepsort_env\Yolo
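
Each per-frame log line reports how long the detector and the tracker took. A minimal sketch for turning one such line into an approximate frame rate follows. The regular expression is written against the line format shown above, frame_seconds is a hypothetical helper, and the result ignores video decoding and saving time.

```python
import re

# Matches the timing suffix of a per-frame log line.
TIMING = re.compile(r"YOLO:\((?P<yolo>[\d.]+)s\), StrongSORT:\((?P<sort>[\d.]+)s\)")


def frame_seconds(line):
    """Return (yolo_s, strongsort_s) for one log line, or None if it has no timings."""
    match = TIMING.search(line)
    if match is None:
        return None
    return float(match.group("yolo")), float(match.group("sort"))


line = (
    "video 1/1 (1/388) video.mp4: 384x640 2 persons, 7 cars, 3 traffic lights, "
    "Done. YOLO:(0.014s), StrongSORT:(0.044s)"
)
yolo_s, sort_s = frame_seconds(line)
fps = 1.0 / (yolo_s + sort_s)
print(f"YOLO {yolo_s:.3f}s + StrongSORT {sort_s:.3f}s -> about {fps:.1f} frames/s")
```

For the first frame logged above this gives roughly 17 frames per second for detection and tracking combined.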

By default the tracker tracks all MS COCO classes.

If you only want to track persons, I recommend using weights trained specifically for person detection, such as the CrowdHuman weights below, for better performance.

python3 track.py --source 0 --yolo-weights yolov5/weights/crowdhuman_yolov5m.pt --classes 0  # tracks persons only

In [16]:
!python track.py --source video.mp4 --yolo-weights weights/yolov5m.pt --strong-sort-weights osnet_x0_25_msmt17.pt --save-vid --classes 0  

# tracks persons, only

Model: osnet_x0_25
- params: 203,568
- flops: 82,316,000
Successfully loaded pretrained weights from "osnet_x0_25_msmt17.pt"
** The following layers are discarded due to unmatched keys or layer size: ['classifier.weight', 'classifier.bias']


  'Cython evaluation (very fast so highly recommended) is '
[34m[1mtrack: [0myolo_weights=['weights/yolov5m.pt'], strong_sort_weights=osnet_x0_25_msmt17.pt, config_strongsort=strong_sort/configs/strong_sort.yaml, source=video.mp4, imgsz=[640, 640], conf_thres=0.5, iou_thres=0.5, max_det=1000, device=, show_vid=False, save_txt=False, save_conf=False, save_crop=False, save_vid=True, nosave=False, classes=[0], agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\track, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, hide_class=False, half=False, dnn=False
YOLOv5  2022-7-1 Python-3.7.6 torch-1.8.1+cu111 CUDA:0 (NVIDIA GeForce RTX 3090, 24576MiB)

Fusing layers... 
YOLOv5m summary: 290 layers, 21172173 parameters, 0 gradients
video 1/1 (1/388) D:\yolov5_deepsort_env\Yolov5_DeepSort_Pytorch\video.mp4: 384x640 2 persons, Done. YOLO:(0.023s), StrongSORT:(0.023s)
video 1/1 (2/388) D:\yolov5_deepsort_env\Yolov5_DeepSort_Pytorch\video.m
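
The commands run in the cells above differ only in a handful of flags. A minimal sketch that assembles such a command as an argument list, using only flags that appear in this notebook, is shown below. build_track_command is a hypothetical helper, and the sketch only builds the command rather than running it.

```python
import sys


def build_track_command(source, yolo_weights, reid_weights, classes=None, save_vid=True):
    """Assemble the argument list for track.py without running it."""
    cmd = [
        sys.executable, "track.py",
        "--source", str(source),
        "--yolo-weights", yolo_weights,
        "--strong-sort-weights", reid_weights,
    ]
    if classes:
        # Class indices are passed as separate arguments after the flag.
        cmd += ["--classes"] + [str(c) for c in classes]
    if save_vid:
        cmd.append("--save-vid")
    return cmd


cmd = build_track_command(
    "video.mp4", "weights/yolov5m.pt", "osnet_x0_25_msmt17.pt", classes=[0]
)
print(" ".join(cmd[1:]))
# To run it from the repository root: subprocess.run(cmd, check=True)
```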

If you want to track a subset of the MS COCO classes, add their zero-based indices after the --classes flag.


python3 track.py --source 0 --yolo-weights yolov5s.pt --classes 15 16  # tracks cats and dogs only

In [17]:
#!python track.py --source 0 --yolo-weights weights/yolov5m.pt --classes 15 16

# tracks cats and dogs only
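
Class indices are zero-based positions in the model's class list. A minimal sketch for looking them up by name, using the first 18 entries of the 80-class COCO list that the pretrained YOLOv5 models use (the list is truncated here for brevity, and class_indices is a hypothetical helper; for a custom model, check the names stored with your own weights):

```python
# First 18 of the 80 COCO class names, in index order.
COCO_NAMES = [
    "person", "bicycle", "car", "motorcycle", "airplane", "bus",
    "train", "truck", "boat", "traffic light", "fire hydrant",
    "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse",
]


def class_indices(names):
    """Map class names to the integer indices expected by --classes."""
    return [COCO_NAMES.index(name) for name in names]


print(class_indices(["person"]))      # [0]
print(class_indices(["cat", "dog"]))  # [15, 16]
```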

In [None]:
#python3 track.py --source ... --save-txt