#CM4709 Computer Vision
#Lab 06 Object Detection using YOLO

##Aims
1. Use YOLO for object detection.
1. Use YOLO in OpenCV.

##Uploading Testing Images
We will need some testing images.
There are a few in Moodle. You can also add some of your own.
Upload them to a folder in your GoogleDrive, e.g. `cm4709/Lab06/data`.

##Mounting GoogleDrive

Uploading to GoogleDrive is faster and easier than uploading to a Colab runtime.
The following standard code connects your GoogleDrive space to Colab.
After this, your image folder should be visible in your Colab runtime.

In [None]:
from google.colab import drive

gdriveMountPoint='/content/gdrive'  #mounting point for GoogleDrive
drive.mount(gdriveMountPoint)       #mount GoogleDrive

##Download and Compile Darknet

There are a few implementations of YOLO:
1. [Darknet](https://pjreddie.com/darknet/): This is the official release and is the fastest.
1. [Darkflow](https://github.com/thtrieu/darkflow): This is the TensorFlow version of Darknet.
1. [OpenCV](https://opencv-tutorial.readthedocs.io/en/latest/yolo/yolo.html): This is the OpenCV Implementation of YOLO.

The following shell commands download Darknet and compile it. After the compilation, you should see a `darknet` folder in your Colab runtime.

Note: Every time you connect to a Colab runtime, the files you saved previously are gone. You may need to download and compile Darknet every time. Luckily the process is fast.

In [None]:
#Path/folder where Darknet will be downloaded.
#
darknetHome='darknet'

!git clone https://github.com/pjreddie/darknet
!cd $darknetHome; make

##Get the Pre-trained Weights

To do prediction, we also need the weights of a pre-trained YOLO model. We will use a YOLO model trained on the [COCO dataset](https://cocodataset.org/) which contains 80 classes of everyday objects.

Assuming that Darknet is downloaded and compiled into the `darknet` folder, we will download the YOLOv3 weights into the same folder.
After running the following shell commands, you should see a file `yolov3.weights` in the `darknet` folder.

In [None]:
#darknetHome='darknet'  #already defined above

! cd $darknetHome; wget https://pjreddie.com/media/files/yolov3.weights

##Our Testing Image

The following code uses OpenCV to show our testing image. Feel free to change the filename.

In [None]:
import cv2 as cv
from matplotlib import pyplot as plt

imageFile='mumbai-traffic.jpg'                    #image file name
imageFolder='/cm4709/Lab06/data'                  #image folder within GoogleDrive
imagePath=gdriveMountPoint+'/MyDrive'+imageFolder+'/'+imageFile #absolute path to image file
print('Image: '+imagePath)

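#load image file
img=cv.imread(imagePath)
if img is None:   #cv.imread returns None if the file cannot be read
  raise FileNotFoundError('Cannot read image: '+imagePath)

print('Image shape: ',img.shape)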

#get image height and width, which we will need later
(height,width,channels)=img.shape

#show image
plt.figure(figsize=(10,10))
plt.imshow(cv.cvtColor(img, cv.COLOR_BGR2RGB))


##Running YOLO/Darknet from Command Line

We can now run YOLO on our image using the following shell commands.
Change the filename or path if you wish.

YOLO will print out the objects detected, with class labels and confidences.

In [None]:
#These variables are defined above.
#
#imageFile='mumbai-traffic.jpg'                    #image file name
#imageFolder='/cm4709/Lab06/data'                  #image folder within GoogleDrive
#imagePath=gdriveMountPoint+'/MyDrive'+imageFolder+'/'+imageFile #absolute path to image file
#darknetHome='darknet'

!cd $darknetHome; ./darknet detect cfg/yolov3.cfg yolov3.weights $imagePath


##Visualising the Detection

Apart from printing out the result, Darknet also saves an annotated image called `predictions.jpg` in the `darknet` folder.
You can view this file using OpenCV.


In [None]:
import cv2 as cv
from matplotlib import pyplot as plt

#The following variable is already defined.
#
#darknetHome='darknet'

#Path to output image file.
#
outputFile=darknetHome+'/predictions.jpg'

#Show output using OpenCV.
#
outputImg=cv.imread(outputFile)
plt.figure(figsize=(10,10))
plt.imshow(cv.cvtColor(outputImg, cv.COLOR_BGR2RGB))


##Seeing the COCO Classes

This version of YOLO is trained with the [COCO dataset](https://cocodataset.org/). COCO class names are in the `darknet/data/coco.names` file.
The following code prints out the number of classes and the class labels.

In [None]:
classes=[]

with open('darknet/data/coco.names','r') as f:
  classes=[line.strip() for line in f.readlines()]

print('No. of classes: ',len(classes))
print(classes)

##Using YOLO in OpenCV

YOLO is now supported in the OpenCV Deep Neural Network (DNN) module. Instead of compiling Darknet, we can simply load YOLO within OpenCV.

In [None]:
import numpy as np


net=cv.dnn.readNet('darknet/yolov3.weights','darknet/cfg/yolov3.cfg')

#layer_names=net.getLayerNames()
#output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]

output_layers=net.getUnconnectedOutLayersNames()
print('No. of output layers: ',len(output_layers))

#generate random colours for the classes
colours = np.random.uniform(0, 255, size=(len(classes), 3))


To feed an image into YOLO, we need to load and convert it into a blob.
We use the [`cv2.dnn.blobFromImage`](https://docs.opencv.org/3.4/d6/d0f/group__dnn.html#ga29f34df9376379a603acd8df581ac8d7) function to convert an image into a blob.

Notes:
1. We need to scale each pixel value from `0-255` to `0-1.0`. Thus the scaling factor is `1/255`.
1. OpenCV stores images in BGR instead of RGB. When we convert the image into a blob, we need to swap the R and B channels.
1. YOLO takes images in 3 sizes: $320 \times 320$, $416 \times 416$, and $608 \times 608$. We are using $416 \times 416$ here.
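
To make the first two notes concrete, here is a minimal NumPy sketch of the scaling, channel swap, and layout change. The helper name `to_blob` is hypothetical, and the sketch assumes the image is already at the network input size, so it skips the resizing that `cv2.dnn.blobFromImage` also performs.

```python
import numpy as np

def to_blob(image_bgr, scale=1 / 255.0):
    """Mimic the scaling, channel swap, and layout change done for a blob.

    Assumes image_bgr is already at the network input size, so no
    resizing is done here.
    """
    scaled = image_bgr.astype(np.float32) * scale   # 0-255 -> 0-1.0
    rgb = scaled[:, :, ::-1]                        # BGR -> RGB
    chw = np.transpose(rgb, (2, 0, 1))              # height,width,channel -> channel,height,width
    return chw[np.newaxis, ...]                     # add a batch dimension in front

# A tiny 2x2 "image" whose pixels are pure blue in BGR order.
tiny = np.zeros((2, 2, 3), dtype=np.uint8)
tiny[:, :, 0] = 255

blob_sketch = to_blob(tiny)
print(blob_sketch.shape)        # (1, 3, 2, 2)
print(blob_sketch[0, :, 0, 0])  # blue is now the last channel
```

The 4-dimensional shape (batch, channels, height, width) is the same layout that the `Blob shape` printout reports for the real blob.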

In [None]:
#img is the image loaded above.
#
blob=cv.dnn.blobFromImage(img,1/255.0,(416, 416),swapRB=True,crop=False)
print('Blob shape: ',blob.shape)

net.setInput(blob)

#get a list of detections
outs = net.forward(output_layers)


##Understanding the Detection Output

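The YOLOv3 model we use has 3 output layers, so the detection output is a tuple of 3 elements.
With a $416 \times 416$ input, the layers output $13 \times 13 \times 3$, $26 \times 26 \times 3$, and $52 \times 52 \times 3$ detections respectively, because each grid cell predicts 3 boxes.
Each detection is a Y vector from a grid cell.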
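
As a rough check on the output sizes, the sketch below computes how many detections each output layer produces. It assumes the standard YOLOv3 downsampling factors of 32, 16, and 8 and 3 boxes per grid cell; the variable names are for illustration only.

```python
input_size = 416          # network input width and height
strides = [32, 16, 8]     # assumed downsampling factor of each output layer
boxes_per_cell = 3        # boxes predicted by each grid cell

grid_sizes = [input_size // s for s in strides]
detections_per_layer = [g * g * boxes_per_cell for g in grid_sizes]

print(grid_sizes)                  # [13, 26, 52]
print(detections_per_layer)        # [507, 2028, 8112]
print(sum(detections_per_layer))   # 10647
```

Under these assumptions, `len(outs[0])` should be 507 for a $416 \times 416$ input.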

Each Y vector has 85 elements:
1. Elements 0-3 are bounding box values.
1. Element 4 is the box confidence score.
1. Elements 5-84 are confidence scores of the 80 classes.
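
The layout can be illustrated on a made-up vector. The values below are invented for the example and do not come from a real detection.

```python
import numpy as np

num_classes = 80
vector = np.zeros(5 + num_classes, dtype=np.float32)  # made-up example values
vector[:4] = [0.5, 0.5, 0.25, 0.5]   # elements 0-3: bounding box values
vector[4] = 0.9                      # element 4: box confidence score
vector[5 + 2] = 0.8                  # elements 5-84: here class ID 2 scores highest

box = vector[:4]
box_confidence = vector[4]
class_scores = vector[5:]
best_class = int(np.argmax(class_scores))  # index of the highest class score

print(len(vector))        # 85
print(len(class_scores))  # 80
print(best_class)         # 2
```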

The following code examines the structure of the detection result.

In [None]:
print('No. of layer outputs: ',len(outs))
detections=outs[0]
print('No. of detections in 1 layer output: ',len(detections))
vector=detections[0]
print('Bounding box: ',vector[:4])
print('Box confidence: ',vector[4])
print('Class confidence scores: ',vector[5:])


##Processing the Detections

We can now process the detections and collect information into 3 lists:
1. `class_ids`: A list of class IDs. Each class ID is an index into the `classes` list/array of class names.
1. `confidences`: Confidence score of the class detected.
1. `boxes`: x, y, w, h values of bounding boxes.

Notes:
1. In the code below, we ignore detections whose confidence is 0.5 or below.
1. The bounding box x, y, w, h values are relative to the whole image (as fractions of its width and height), not to a grid cell. You can see this in the calculation below. This is easier to process because we don't need to know which grid cell produced the detection.
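
The box conversion used in this step can be tried on its own with made-up numbers. The helper name `to_pixel_box` is hypothetical; it assumes the first four elements are the centre x, centre y, width, and height as fractions of the image size.

```python
def to_pixel_box(detection, image_width, image_height):
    """Convert relative (centre x, centre y, w, h) to pixel (x, y, w, h)."""
    center_x = int(detection[0] * image_width)
    center_y = int(detection[1] * image_height)
    w = int(detection[2] * image_width)
    h = int(detection[3] * image_height)
    x = int(center_x - w / 2)   # top-left x
    y = int(center_y - h / 2)   # top-left y
    return [x, y, w, h]

# A box centred in a 400x200 image, a quarter as wide and half as tall.
example_box = to_pixel_box([0.5, 0.5, 0.25, 0.5], image_width=400, image_height=200)
print(example_box)   # [150, 50, 100, 100]
```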

In [None]:
#variables to hold class IDs, confidences, and bounding boxes
#
class_ids=[]
confidences=[]
boxes = []

for out in outs:
  for detection in out:           #each detection is a Y vector from a grid cell
    scores = detection[5:]        #class scores
    class_id = np.argmax(scores)  #find index of max score. This is the class ID number.
    confidence = scores[class_id] #find confidence score of this class
    if confidence > 0.5:          #only accept if confidence>0.5
      # Object detected
      center_x = int(detection[0] * width)  #calculate box centre x
      center_y = int(detection[1] * height) #centre y
      w = int(detection[2] * width)   #box width
      h = int(detection[3] * height)  #box height
      # Rectangle coordinates
      x = int(center_x - w / 2)   #box top-left x
      y = int(center_y - h / 2)   #top-left y
      boxes.append([x, y, w, h])            #add bounding box into list
      confidences.append(float(confidence)) #add confidence score into list
      class_ids.append(class_id)            #add class id into list

print('No. of detections remaining: ',len(class_ids))
print('\nClass ID list:')
print(class_ids)
print('\nConfidence score list:')
print(confidences)
print('\nBounding box list:')
print(boxes)

##Non-max Suppression

The list of results still contains redundant/duplicate bounding boxes.
We need to perform Non-max Suppression.
Luckily this is already implemented in the [`cv2.dnn.NMSBoxes(...)`](https://docs.opencv.org/3.4/d6/d0f/group__dnn.html#ga9d118d70a1659af729d01b10233213ee) function.

Note: The last parameter is the IoU threshold. A higher value means more overlap is allowed between the boxes that are kept.
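
To show what the IoU threshold controls, here is a simplified sketch of IoU and greedy non-max suppression in plain Python. The functions `iou` and `simple_nms` are hypothetical helpers written for illustration; they are not how OpenCV implements `NMSBoxes`, and they do not apply a score threshold.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as [x, y, w, h]."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def simple_nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one box far away (made-up values).
demo_boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 10, 10]]
demo_scores = [0.9, 0.8, 0.7]
print(simple_nms(demo_boxes, demo_scores, 0.5))   # [0, 2]
print(simple_nms(demo_boxes, demo_scores, 0.7))   # [0, 1, 2]
```

The first two boxes have an IoU of about 0.68, so the lower-scoring one is dropped at a threshold of 0.5 but kept at 0.7.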

In [None]:
#perform non-max suppression
#
#returns indices of detections to remain
#
indices=cv.dnn.NMSBoxes(boxes,confidences,0.5,0.5)
print('No. of detections after Non-max suppression: ',len(indices))

In [None]:
#create a copy of the image
imgCopy=img.copy()

font = cv.FONT_HERSHEY_SIMPLEX

#go through the bounding boxes
for i in range(len(boxes)):
  if i in indices:            #1 of the boxes remaining
    x, y, w, h = boxes[i]     #get bounding box values
    label = str(classes[class_ids[i]])  #get class label
    colour = colours[class_ids[i]]      #get colour of this class (colours has one entry per class)
    cv.rectangle(imgCopy, (x, y), (x + w, y + h), colour, 2)    #draw bounding box
    cv.putText(imgCopy, label, (x, y -10), font, 0.5, colour)  #draw class label

plt.figure(figsize=(10,10))
plt.imshow(cv.cvtColor(imgCopy, cv.COLOR_BGR2RGB))