YOLO: You Only Look Once (real-time object detection algorithm)


1. It divides the image into an N × N grid of cells.
1. Bounding box regression: the whole image is passed through the network once, and for each grid cell YOLO predicts bounding boxes together with the probability that the cell contains each class. The class with the highest probability is chosen for that cell.
1. Intersection over Union (IoU): a metric that measures the overlap between the predicted bounding box and the ground-truth bounding box. Non-max suppression uses the same metric to remove duplicate detections: among boxes that overlap heavily, it keeps the one with the highest class probability and discards the boxes whose IoU with it exceeds a threshold.


    IoU = area(B1 ∩ B2) / area(B1 ∪ B2)
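
The IoU formula and the non-max suppression step can be sketched in a few lines of plain Python. This is a minimal illustration, not the routine OpenCV uses internally: the `(x1, y1, x2, y2)` box layout, the function names, and the 0.5 threshold are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Only boxes that overlap the kept box weakly survive to the next round
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Two near-duplicate boxes and one separate box (made-up values)
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores, iou_threshold=0.5)
print(kept)  # indices of the boxes that survive
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed; the third box does not overlap either and is kept.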
  


We then get a class probability map, in which each grid cell is labelled with its most probable class.
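
As a rough illustration of how a class probability map comes out of per-cell probabilities, the sketch below takes the most probable class in each cell of a tiny grid. The 2 × 2 grid, the three class names, and the probability values are invented for the example.

```python
# Hypothetical 2x2 grid with 3 classes; each cell holds its class probabilities
class_names = ["car", "bus", "truck"]
cell_probs = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
]


def class_probability_map(probs):
    """Return, for every grid cell, the index of its most probable class."""
    return [
        [max(range(len(cell)), key=cell.__getitem__) for cell in row]
        for row in probs
    ]


class_map = class_probability_map(cell_probs)
for row in class_map:
    print([class_names[i] for i in row])
```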



# Network Architecture
- The YOLO network has 24 convolutional layers followed by 2 fully connected layers. The convolutional layers are pre-trained on the ImageNet classification task at half the resolution (224 × 224 input images); the input resolution is then doubled to 448 × 448 for detection.
- The network alternates 1 × 1 reduction layers with 3 × 3 convolutional layers to reduce the feature space from the preceding layers.
- The last 4 convolutional layers, together with the 2 fully connected layers, are added after pre-training to train the network for object detection.
- The last layer predicts the class probabilities and the bounding box coordinates.
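
To see why a 1 × 1 reduction layer in front of a 3 × 3 convolution is worthwhile, it helps to count weights. The sketch below compares a direct 3 × 3 convolution with a reduce-then-convolve pair. The channel counts (512 in, 256 after reduction, 512 out) are illustrative assumptions, and biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


in_ch = 512  # assumed number of input channels

# Direct 3x3 convolution applied to all 512 input channels
direct = conv_weights(3, in_ch, 512)

# 1x1 reduction to 256 channels, followed by the 3x3 convolution
reduced = conv_weights(1, in_ch, 256) + conv_weights(3, 256, 512)

print(direct, reduced)  # 2359296 1310720
```

Under these assumptions the reduced pair needs a little over half the weights of the direct convolution.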

# OpenCV DNN Module

OpenCV's DNN module is used to run YOLOv3.

Classify : 
- Cars
- Heavy Motor Vehicles
- Light Motor Vehicles


Count the number of vehicles.

Store the data for later analysis.


In [1]:
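# Installing modules (requires Python 3)

! pip3 install opencv-python
# opencv-contrib-python adds the extra contrib modules. The PyPI wheels are CPU-only,
# so running models on a GPU needs an OpenCV build compiled with CUDA support.
! pip3 install opencv-contrib-python
! pip3 install numpy
! pip3 install matplotlib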

Collecting opencv-python
  Downloading opencv_python-4.6.0.66-cp36-abi3-macosx_10_15_x86_64.whl (46.4 MB)
Collecting numpy>=1.14.5
  Using cached numpy-1.23.4-cp39-cp39-macosx_10_9_x86_64.whl (18.1 MB)
Installing collected packages: numpy, opencv-python
Successfully installed numpy-1.23.4 opencv-python-4.6.0.66
You should consider upgrading via the '/Users/daver/Desktop/mini/env/bin/python3 -m pip install --upgrade pip' command.
Collecting opencv-contrib-python
  Downloading opencv_contrib_python-4.6.0.66-cp36-abi3-macosx_10_15_x86_64.whl (56.1 MB)

In [2]:
# Create Tracker File
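
The contents of `tracker.py` are not shown in this cell. As a rough guess at what a tracker named `EuclideanDistTracker` might look like, here is a minimal centroid tracker: a detection keeps an existing ID when its centre lies close to a centre seen in the previous frame, and gets a new ID otherwise. The `update` signature, the `max_distance` parameter, and its default of 25 pixels are assumptions made for this sketch.

```python
import math


class EuclideanDistTracker:
    """Hypothetical sketch: assigns IDs by matching to a nearby previous centre."""

    def __init__(self, max_distance=25):
        self.center_points = {}  # object id -> (cx, cy) from the last frame
        self.id_count = 0        # next unused object id
        self.max_distance = max_distance

    def update(self, rects):
        """rects: list of (x, y, w, h). Returns a list of [x, y, w, h, id]."""
        tracked = []
        for x, y, w, h in rects:
            cx, cy = x + w // 2, y + h // 2

            # Reuse the first known id whose centre is within max_distance
            match = None
            for obj_id, (px, py) in self.center_points.items():
                if math.hypot(cx - px, cy - py) < self.max_distance:
                    match = obj_id
                    break

            if match is None:
                match = self.id_count
                self.id_count += 1

            self.center_points[match] = (cx, cy)
            tracked.append([x, y, w, h, match])

        # Forget objects that were not detected in this frame
        seen = {t[4] for t in tracked}
        self.center_points = {i: c for i, c in self.center_points.items() if i in seen}
        return tracked


demo_tracker = EuclideanDistTracker()
frame1 = demo_tracker.update([(100, 100, 40, 40)])
frame2 = demo_tracker.update([(105, 102, 40, 40), (300, 50, 30, 30)])
print(frame1)
print(frame2)
```

In this run the first box moves only a few pixels between frames and keeps ID 0, while the far-away second box receives a new ID. A first-match rule like this can give two nearby detections the same ID, so a real tracker would need a stricter assignment step.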

# Vehicle Counter


In [3]:
# Import necessary packages

import cv2
import csv
import collections
import numpy as np
from tracker import *

# Initialize Tracker
tracker = EuclideanDistTracker()


# Detection confidence threshold
confThreshold = 0.2 # minimum confidence score threshold for detection
nmsThreshold = 0.2 # Non-Max suppression threshold




ModuleNotFoundError: No module named 'cv2'

In [None]:
# Middle cross line position
# crossing line positions that will be used to count vehicles
# (ToDo : Modify According to your need)
middle_line_position = 225   
up_line_position = middle_line_position - 15
down_line_position = middle_line_position + 15
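
One way the three lines could be used for counting is sketched below: a vehicle is remembered when its centre enters the band just above or just below the middle line, and it is counted once it leaves past the opposite outer line. Only the three `*_line_position` names come from this notebook; `count_crossing`, the ID sets, and `counts` are hypothetical. Image y coordinates grow downwards, so "down" means increasing y.

```python
middle_line_position = 225
up_line_position = middle_line_position - 15
down_line_position = middle_line_position + 15

temp_up_ids = set()    # ids seen in the band between the up line and the middle line
temp_down_ids = set()  # ids seen in the band between the middle line and the down line
counts = {"up": 0, "down": 0}


def count_crossing(obj_id, center_y):
    """Update the counters from one tracked centre point (hypothetical helper)."""
    if up_line_position < center_y < middle_line_position:
        temp_up_ids.add(obj_id)
    elif middle_line_position < center_y < down_line_position:
        temp_down_ids.add(obj_id)
    elif center_y <= up_line_position and obj_id in temp_down_ids:
        # Entered from below and left past the upper line: moving up
        counts["up"] += 1
        temp_up_ids.discard(obj_id)
        temp_down_ids.discard(obj_id)
    elif center_y >= down_line_position and obj_id in temp_up_ids:
        # Entered from above and left past the lower line: moving down
        counts["down"] += 1
        temp_up_ids.discard(obj_id)
        temp_down_ids.discard(obj_id)


# Vehicle 1 moves down the frame, vehicle 2 moves up (made-up centre positions)
for y in (215, 230, 245):
    count_crossing(1, y)
for y in (235, 220, 205):
    count_crossing(2, y)
print(counts)
```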

In [None]:
# Store the COCO dataset class names in a list
classesFile = "coco.names"
with open(classesFile) as f:
    classNames = f.read().strip().split('\n')
print(classNames)
print(len(classNames))

In [None]:
# YOLOv3 is trained on the COCO dataset, but we only need a few of its classes.
# Class indices for the required detection classes
# (car, motorbike, bus, truck, matching the counters drawn on each frame)
required_class_index = [2, 3, 5, 7]

detected_classNames = []
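
To make the index list concrete, the sketch below looks the indices up in the first eight entries of the class list and filters a handful of detections. It assumes the ordering of the Darknet `coco.names` file; `coco_head`, `is_required`, and the sample detections are invented for the example.

```python
# First eight entries of the Darknet coco.names file (assumed ordering)
coco_head = ["person", "bicycle", "car", "motorbike",
             "aeroplane", "bus", "train", "truck"]
required_class_index = [2, 3, 5, 7]

required_names = [coco_head[i] for i in required_class_index]
print(required_names)


def is_required(class_id):
    """True when the detected class is one of the vehicle classes we count."""
    return class_id in required_class_index


# Hypothetical class ids returned by the detector for one frame
detected_ids = [0, 2, 7, 1, 3]
detected_classNames = [coco_head[i] for i in detected_ids if is_required(i)]
print(detected_classNames)
```

Detections of people and bicycles (indices 0 and 1) are dropped, and only the vehicle classes reach the counting step.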

In [None]:
# Model files
modelConfiguration = 'yolov3-320.cfg'
modelWeights = 'yolov3-320.weights'

# Configure the network model
net = cv2.dnn.readNetFromDarknet(modelConfiguration, modelWeights)

# Configure the network backend
# These two lines select the CUDA backend; comment them out when running on a CPU
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

# Define random colour for each class
np.random.seed(32)
colors = np.random.randint(0, 255, size=(len(classNames), 3), dtype='uint8') # Setting random colour for each class

In [None]:
# Read Frames from a video file

# Initialize the videocapture object
cap = cv2.VideoCapture('video.mp4') # VideoCapture Object
input_size = 320



def realTime():
    while True:
        success, img = cap.read() # Read each frame
        if not success: # Stop at the end of the video or on a read error
            break
        img = cv2.resize(img,(0,0),None,0.5,0.5) # Scale each frame down by 50%
        ih, iw, channels = img.shape
        blob = cv2.dnn.blobFromImage(img, 1 / 255, (input_size, input_size), [0, 0, 0], 1, crop=False)

        # Set the input of the network
        net.setInput(blob)
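        layersNames = net.getLayerNames()
        # getUnconnectedOutLayers() returns 1-based indices; flatten() handles both the
        # nested arrays of older OpenCV releases and the flat arrays of newer ones
        outputNames = [layersNames[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]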
        # Feed data to the network
        outputs = net.forward(outputNames)
    
        # Find the objects from the network output
        postProcess(outputs,img)

        # Draw the crossing lines

        cv2.line(img, (0, middle_line_position), (iw, middle_line_position), (255, 0, 255), 2)
        cv2.line(img, (0, up_line_position), (iw, up_line_position), (0, 0, 255), 2)
        cv2.line(img, (0, down_line_position), (iw, down_line_position), (0, 0, 255), 2)

        # Draw counting texts in the frame
        cv2.putText(img, "Up", (110, 20), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Down", (160, 20), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Car:        "+str(up_list[0])+"     "+ str(down_list[0]), (20, 40), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Motorbike:  "+str(up_list[1])+"     "+ str(down_list[1]), (20, 60), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Bus:        "+str(up_list[2])+"     "+ str(down_list[2]), (20, 80), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Truck:      "+str(up_list[3])+"     "+ str(down_list[3]), (20, 100), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)

        # Show the frames
        cv2.imshow('Output', img)

        if cv2.waitKey(1) == ord('q'):
            break

    # Write the vehicle counting information to a CSV file

    with open("data.csv", 'w', newline='') as f1: # newline='' avoids blank rows on Windows
        cwriter = csv.writer(f1)
        cwriter.writerow(['Direction', 'car', 'motorbike', 'bus', 'truck'])
        up_list.insert(0, "Up")
        down_list.insert(0, "Down")
        cwriter.writerow(up_list)
        cwriter.writerow(down_list)
    # print("Data saved at 'data.csv'")
    # Finally release the capture object and destroy all active windows
    cap.release()
    cv2.destroyAllWindows()
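
The `postProcess` call in the loop receives the raw network outputs. Assuming the usual YOLOv3 layout exposed by OpenCV's DNN module, where each output row is `[center_x, center_y, width, height, objectness, class scores...]` with box values normalised to the 0..1 range, a single row might be decoded roughly as follows. `decode_detection` and its parameters are hypothetical names for this sketch, not the notebook's own helper.

```python
def decode_detection(row, frame_w, frame_h, conf_threshold=0.2):
    """Turn one output row into (class_id, confidence, x, y, w, h), or None."""
    scores = row[5:]  # per-class scores follow the four box values and objectness
    class_id = max(range(len(scores)), key=scores.__getitem__)
    confidence = scores[class_id]
    if confidence <= conf_threshold:
        return None  # too uncertain to keep

    # Scale the normalised box to pixels and convert centre to top-left corner
    w = int(row[2] * frame_w)
    h = int(row[3] * frame_h)
    x = int(row[0] * frame_w - w / 2)
    y = int(row[1] * frame_h - h / 2)
    return class_id, confidence, x, y, w, h


# Made-up row with three classes instead of COCO's 80
row = [0.5, 0.5, 0.25, 0.5, 0.9, 0.05, 0.1, 0.8]
detection = decode_detection(row, frame_w=640, frame_h=360)
print(detection)
```

Boxes decoded this way would still have to pass through non-max suppression before being handed to the tracker.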