### People Detection in Aerial Video

I reviewed and experimented with several object detection models, including Faster R-CNN, EfficientDet, and the custom YOLOv3 we built from scratch. Of these, the latest YOLOv5 by Ultralytics is much more efficient than the rest in both inference time and memory use. The YOLOv5 implementation is also written entirely in PyTorch and is fairly intuitive.
Hence, I chose YOLOv5 for this task.

Please note that the provided test video (https://drive.google.com/file/d/1L0ee-kdtwayN-tlCzXyWVUCqOGwmLj_A/view) has been saved at "data/videos/TopDown_AerialVideo_1080.mp4".

Let's start by installing the necessary dependencies.

In [1]:
%pip install setuptools --upgrade
%pip install -r requirements.txt

Requirement already up-to-date: setuptools in c:\users\shrin\anaconda3\envs\venv\lib\site-packages (56.0.0)
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


The authors of YOLOv5 provide four standard models: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. YOLOv5s is the smallest and fastest but also the least accurate; YOLOv5x is the largest and the most accurate. All models run on PyTorch.

Since object detection on aerial footage needs to run in real time, let's proceed with **YOLOv5s** for the **quickest inference**.

Pretrained YOLOv5s weights for the COCO dataset are publicly available. COCO has 80 classes, including person, car, and dog. Our problem statement requires us to detect only persons in the frame. Hence, the module detect.py has been modified to consider only detections of the class 'person' and to draw red bounding boxes around them, each annotated with its confidence score.
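The person-only filtering described above can be sketched roughly as follows. This is a minimal illustration, not the actual change made to detect.py: the helper names `keep_persons` and `draw_persons` are hypothetical, each detection is assumed to be a row of `(x1, y1, x2, y2, conf, cls)` in pixel coordinates, and the `person 0.81` label format is an assumption. It relies on 'person' being class index 0 in the COCO class list that ships with YOLOv5.

```python
# Hypothetical helpers illustrating person-only filtering and drawing.
# Each detection is assumed to be (x1, y1, x2, y2, conf, cls) in pixels.

PERSON_CLASS_ID = 0      # 'person' is index 0 in the COCO class list
RED_BGR = (0, 0, 255)    # OpenCV uses BGR channel order, so this is red


def keep_persons(detections, conf_thres=0.25):
    """Return only 'person' detections at or above the confidence threshold."""
    return [
        det for det in detections
        if int(det[5]) == PERSON_CLASS_ID and det[4] >= conf_thres
    ]


def draw_persons(frame, detections, thickness=3):
    """Draw a red box and a confidence label for each person, in place."""
    import cv2  # imported lazily so keep_persons works without OpenCV

    for x1, y1, x2, y2, conf, _cls in keep_persons(detections):
        top_left = (int(x1), int(y1))
        bottom_right = (int(x2), int(y2))
        cv2.rectangle(frame, top_left, bottom_right, RED_BGR, thickness)
        # Place the label just above the box, clamped to the frame's top edge.
        label_origin = (int(x1), max(int(y1) - 5, 0))
        cv2.putText(frame, f"person {conf:.2f}", label_origin,
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, RED_BGR, 2)
    return frame
```

The stock detect.py also exposes a `--classes` option (it appears as `classes=None` in the run output below); passing `--classes 0` should restrict detections to persons without any code change, although that is not the approach taken here.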

Let's run the detections. It is **recommended to use a GPU-enabled system** so that inference completes quickly.

In [2]:
!python detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source data/videos/TopDown_AerialVideo_1080.mp4

Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', exist_ok=False, hide_conf=False, hide_labels=False, img_size=640, iou_thres=0.45, line_thickness=3, name='exp', nosave=False, project='runs/detect', save_conf=False, save_crop=False, save_txt=False, source='data/videos/TopDown_AerialVideo_1080.mp4', update=False, view_img=False, weights=['yolov5s.pt'])

YOLOv5  v5.0-41-g4890499 torch 1.7.1 CUDA:0 (GeForce 940MX, 2048.0MB)

Fusing layers... 
Model Summary: 224 layers, 7266973 parameters, 0 gradients, 17.0 GFLOPS



video 1/1 (1/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 12 persons, 1 car, Done. (0.173s)
video 1/1 (2/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 12 persons, 1 car, Done. (0.038s)
video 1/1 (3/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 10 persons, 1 car, Done. (0.038s)
video 1/1 (4/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 10 persons, 1 car, Done. (0.038s)
video 1/1 (5/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 9 persons, 1 car, 1 bird, Done. (0.038s)
video 1/1 (6/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 11 persons, 1 car, Done. (0.039s)
video 1/1 (7/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 10 p

video 1/1 (293/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 3 persons, Done. (0.040s)
video 1/1 (294/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 4 persons, Done. (0.039s)
video 1/1 (295/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 2 persons, Done. (0.042s)
video 1/1 (296/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 1 person, Done. (0.039s)
video 1/1 (297/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 1 person, 1 bird, Done. (0.038s)
video 1/1 (298/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 Done. (0.039s)
video 1/1 (299/481) E:\Misc\people_detection_submmission\yolov5\data\videos\TopDown_AerialVideo_1080.mp4: 384x640 2 persons, Done. (0.040s)
video 1/1 (300/481) E:\Mi

The output video with the detections is saved at "runs/detect/exp/TopDown_AerialVideo_1080.mp4".

You can download the same detections output file at: https://drive.google.com/file/d/17Y75Kafoc56hP9naY6Ej16fJH-zt_c6D/view?usp=sharing
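
To check the real-time claim against the run log, the per-frame timings printed by detect.py can be summarized as below. This is a rough sketch: `parse_inference_times` and `summarize` are hypothetical helper names, the sample log lines are copied from the output above with the file path shortened, and the first frame is skipped as warm-up by assumption. The logged time most likely covers only model inference and non-maximum suppression, so end-to-end throughput including video decoding and drawing will be lower.

```python
import re

# Matches the trailing "Done. (0.038s)" timing printed for each frame.
TIME_PATTERN = re.compile(r"Done\. \((\d+\.\d+)s\)")


def parse_inference_times(log_text):
    """Extract the per-frame times (in seconds) from detect.py output."""
    return [float(value) for value in TIME_PATTERN.findall(log_text)]


def summarize(times, skip_warmup=1):
    """Return (mean seconds per frame, frames per second) after warm-up."""
    steady = times[skip_warmup:]
    if not steady:
        raise ValueError("no timings left after skipping warm-up frames")
    mean_s = sum(steady) / len(steady)
    return mean_s, 1.0 / mean_s


# First three log lines from the run above, with the path shortened.
sample_log = """\
video 1/1 (1/481) TopDown_AerialVideo_1080.mp4: 384x640 12 persons, 1 car, Done. (0.173s)
video 1/1 (2/481) TopDown_AerialVideo_1080.mp4: 384x640 12 persons, 1 car, Done. (0.038s)
video 1/1 (3/481) TopDown_AerialVideo_1080.mp4: 384x640 10 persons, 1 car, Done. (0.038s)
"""

times = parse_inference_times(sample_log)
mean_s, fps = summarize(times)
print(f"{mean_s * 1000:.1f} ms per frame, about {fps:.1f} FPS")
```

On the lines sampled here, the steady-state time of about 38 ms per frame corresponds to roughly 26 frames per second on the GeForce 940MX used for this run.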