# Object Detection using the YOLO V4 pre-trained model

*by Georgios K. Ouzounis, June 10th, 2021*

In this exercise we experiment with object detection in streaming video using the pretrained YOLO V4 model. This is only a demo and will run very slowly in the virtual environment. For substantially better performance, put all the relevant code in a .py file and run it locally.

## Setup

In [None]:
# import the relevant libraries
import numpy as np
import cv2 # openCV
from google.colab.patches import cv2_imshow

In [None]:
# check the opencv version
print(cv2.__version__)

In [None]:
# if the OpenCV version is < 4.4.0, update to the release pinned below; otherwise skip this step
!pip install opencv-python==4.5.2.52
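
The version check above can also be done in code rather than by eye. The sketch below is one possible way to do it; `version_tuple` and `needs_upgrade` are hypothetical helper names, not part of OpenCV, and the 4.4.0 minimum is taken from the comment in the cell above.

```python
import re

def version_tuple(version, width=3):
    """Turn a string such as '4.5.2' or '4.5.2-dev' into a tuple of ints."""
    parts = []
    for piece in version.split(".")[:width]:
        # keep only the leading digits of each piece ('2-dev' -> 2)
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # pad short versions such as '4.4' so comparisons stay consistent
    parts += [0] * (width - len(parts))
    return tuple(parts)

def needs_upgrade(installed, minimum="4.4.0"):
    """Return True when the installed version is older than the minimum."""
    return version_tuple(installed) < version_tuple(minimum)

# In the notebook you would pass cv2.__version__ instead of a literal string.
print(needs_upgrade("4.1.2"))
print(needs_upgrade("4.5.2"))
```

If `needs_upgrade(cv2.__version__)` is true, run the `pip install` cell and restart the runtime so the new version is picked up.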

## Get the model



In [None]:
# first create a directory to store the model
%mkdir model

In [None]:
# enter the directory and download the necessary files 
%cd model
!wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights
!wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4.cfg
!wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/data/coco.names
%cd ..

## Customize the YOLO detector

class labels:

In [None]:
class_labels_path = "/content/model/coco.names"
# read one class label per line and close the file afterwards
with open(class_labels_path) as labels_file:
    class_labels = labels_file.read().strip().split("\n")
class_labels

bounding box color definitions: two options

In [None]:
# declare repeating bounding box colors for each class
# 1st: create a list of colors as comma-separated RGB strings
# Example: Red, Green, Blue, Yellow, Magenta
# note: OpenCV drawing functions read colors as BGR, so red and blue swap
# (and yellow shows as cyan) unless the channels are reordered before drawing
class_colors = ["255,0,0","0,255,0","0,0,255","255,255,0","255,0,255"]

# 2nd: split each string on the commas and convert each piece to an integer
class_colors = [np.array(every_color.split(",")).astype("int") for every_color in class_colors]

# 3rd: convert the list of arrays to a numpy array
class_colors = np.array(class_colors)

# 4th: tile this to get 80 class colors, i.e. as many as the classes (16 repeats of the 5 colors).
# If you want unique colors for each class you may randomize the color generation
# or set them manually
class_colors = np.tile(class_colors,(16,1))

or random colors:

In [None]:
# the upper bound of randint is exclusive, so use 256 to allow the value 255
class_colors = np.random.randint(0, 256, size=(len(class_labels), 3), dtype="uint8")
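
Random colors can land close to each other. If you want colors that are guaranteed to differ, one alternative (not used in this notebook) is to space the hues evenly around the color wheel. The sketch below uses only the standard library; `distinct_colors` is a hypothetical helper name.

```python
import colorsys

def distinct_colors(n):
    """Return n fully saturated colors with evenly spaced hues, as 0-255 integer triples."""
    colors = []
    for i in range(n):
        # hue runs from 0 up to (but not including) 1; saturation and value stay at 1
        r, g, b = colorsys.hsv_to_rgb(i / n, 1.0, 1.0)
        colors.append((round(r * 255), round(g * 255), round(b * 255)))
    return colors

palette = distinct_colors(80)
print(len(palette), len(set(palette)))
```

In the notebook this could replace the random palette with `class_colors = np.array(distinct_colors(len(class_labels)), dtype="uint8")`.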

Declare remaining parameters

In [None]:
# for the image2blob conversion
scalefactor = 1.0/255.0
new_size = (416, 416)

# for the NMS
score_threshold = 0.5
nms_threshold = 0.4
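
To make the roles of `score_threshold` and `nms_threshold` concrete, here is a minimal pure-Python sketch of greedy non-maximum suppression (NMS). It illustrates the general technique only, not the code this notebook actually runs; OpenCV provides `cv2.dnn.NMSBoxes`, which takes the same two thresholds. The function names `iou` and `greedy_nms` and the sample boxes are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, score_threshold, nms_threshold):
    """Return the indices of the boxes that survive NMS, best score first."""
    # 1. discard low-confidence detections
    candidates = [i for i, score in enumerate(scores) if score > score_threshold]
    # 2. visit the remaining boxes from highest to lowest score
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # 3. keep a box only if it does not overlap too much with a kept box
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)
    return kept

boxes = [(100, 100, 50, 80), (104, 102, 50, 80), (300, 200, 40, 40), (10, 10, 20, 20)]
scores = [0.9, 0.8, 0.7, 0.3]
print(greedy_nms(boxes, scores, score_threshold=0.5, nms_threshold=0.4))  # prints [0, 2]
```

Box 1 is dropped because it overlaps box 0 with an IoU of about 0.81, well above 0.4; box 3 is dropped because its score is below 0.5. Lowering `nms_threshold` suppresses more overlapping boxes, and raising `score_threshold` discards more weak detections.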

## Load the model

In [None]:
# Load the pre-trained model 
yolo_model = cv2.dnn.readNetFromDarknet('model/yolov4.cfg','model/yolov4.weights')

In [None]:
# Read the network layers/components. The YOLO V4 neural network has 379 components.
# They consist of convolutional layers (conv), rectified linear units (relu) etc.:
model_layers = yolo_model.getLayerNames()

In [None]:
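# Find the names of the output layers. getUnconnectedOutLayers() returns 1-based indices,
# as an Nx1 array in older OpenCV releases and as a flat array in newer ones, so flatten first
output_layers = [model_layers[i - 1] for i in np.array(yolo_model.getUnconnectedOutLayers()).flatten()]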

## Run the model on the live video feed using NMS


Install the following two packages to access video content from www.youtube.com:

In [None]:
!pip install pafy

In [None]:
!pip install youtube-dl

Get any video. We selected this one because it shows scenes of city life.

In [None]:
import pafy

url = "https://www.youtube.com/watch?v=_MMpKnfT5oU"
video = pafy.new(url)
best = video.getbest(preftype="mp4")

mount your Google Drive and get the following file (customize the path; the file is included in the git repo):

In [None]:
%cp /content/drive/MyDrive/object_detection/object_detection_functions.py .

In [None]:
from object_detection_functions import object_detection_analysis_with_nms
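
The source of `object_detection_analysis_with_nms` is not shown in this notebook. As a rough guide to what such post-processing typically involves, the sketch below decodes a single detection row of the form commonly produced by YOLO output layers in OpenCV: box center, width and height (all normalized to the 0-1 range), an objectness score, and then one score per class. The function name `decode_detection`, the row layout assumption, and the sample row with only three classes are illustrative, not taken from the helper file.

```python
def decode_detection(row, frame_width, frame_height):
    """Convert one detection row into (class_id, confidence, (x, y, w, h)) in pixels."""
    center_x, center_y, width, height = row[0:4]
    class_scores = row[5:]  # row[4] is the objectness score
    # the predicted class is the one with the highest score
    class_id = max(range(len(class_scores)), key=lambda i: class_scores[i])
    confidence = class_scores[class_id]
    # scale the normalized box to the frame and move from center to top-left corner
    box_w = int(width * frame_width)
    box_h = int(height * frame_height)
    x = int(center_x * frame_width - box_w / 2)
    y = int(center_y * frame_height - box_h / 2)
    return class_id, confidence, (x, y, box_w, box_h)

# a made-up detection with three class scores instead of the 80 COCO classes
row = [0.5, 0.5, 0.25, 0.5, 0.9, 0.05, 0.8, 0.1]
print(decode_detection(row, 640, 480))  # prints (1, 0.8, (240, 120, 160, 240))
```

Boxes and confidences decoded this way are what a non-maximum suppression step would then filter before drawing.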

**WARNING:** this will be a very slow loop, in part due to the cv2_imshow() command. Every processed frame is displayed below the previous one. To break the loop, go to Runtime -> Interrupt execution.

---



In [None]:
cap = cv2.VideoCapture(best.url)

new_width = 640
new_height = 480
dim = (new_width, new_height)

if cap.isOpened():
  while True:
    # get the current frame from the video stream
    ret, frame = cap.read()

    # stop when the stream ends or a frame cannot be read
    if not ret:
      break

    frame = cv2.resize(frame, dim, interpolation=cv2.INTER_AREA)

    blob = cv2.dnn.blobFromImage(frame, scalefactor, new_size, swapRB=True, crop=False)

    # input the pre-processed blob into the model
    yolo_model.setInput(blob)

    # compute the forward pass for the input, storing the results per output layer in a list
    obj_detections_in_layers = yolo_model.forward(output_layers)

    # get the object detections drawn on the frame
    frame, winner_boxes = object_detection_analysis_with_nms(frame, class_labels, class_colors, obj_detections_in_layers, score_threshold, nms_threshold)

    # display the frame
    cv2_imshow(frame)
    # if running outside Colab notebooks use (cv2.imshow needs a window name):
    # cv2.imshow("frame", frame)

    # terminate the while loop if the 'q' key is pressed - applicable outside the notebooks
    if cv2.waitKey(1) & 0xFF == ord('q'):
      break

  # release the video stream and close any OpenCV windows
  cap.release()
  cv2.destroyAllWindows()


