# Milestone 2

## Video Link & Samples
https://drive.google.com/drive/folders/1NyixRXYeqys-zsf9zgNGbhpxO6PX9BzP?usp=sharing

## Problem Definition

Detecting, tracking, and segmenting vehicles is an important and emerging research area for intelligent transportation systems. Image processing plays a key role in detecting vehicles in traffic surveillance videos.
Traffic monitoring through image processing enables better control of traffic flow and helps identify reckless drivers and speed violators.
Counting the detected vehicles also helps when forecasting traffic jams and road congestion.

## Algorithm 
- In this project we use YOLO, a real-time object detection algorithm that can detect and classify multiple objects in the same frame. It divides the image into an NxN grid, the model assigns class probabilities to each grid cell, and the class with the maximum probability is chosen.
- The configuration and weights of the YOLO network are available online. Because the network can be quite challenging to train, we use those pretrained files.
- We use the DNN module from cv2 to work with YOLO directly via the two files already mentioned.
- The program is divided into two parts: the first is the tracker for vehicle detection using OpenCV, and the second is the main detection program.
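
As a rough illustration of "the class with the maximum probability is chosen", the sketch below picks the best class from a single detection row. It assumes the YOLOv3 row layout `[cx, cy, w, h, objectness, score_0, ..., score_79]`. The helper name `pick_class` and the sample row are hypothetical and are not part of the project code.

```python
# Illustrative only: pick the most probable class from one detection row.
# Assumed layout: [cx, cy, w, h, objectness, score_0, ..., score_79].

VEHICLE_CLASS_IDS = [2, 3, 5, 7]  # car, motorbike, bus, truck in coco.names


def pick_class(det, conf_threshold=0.2, wanted=VEHICLE_CLASS_IDS):
    """Return (class_id, confidence) for a wanted class, or None if rejected."""
    scores = det[5:]  # skip the box and objectness values
    class_id = max(range(len(scores)), key=scores.__getitem__)  # argmax
    confidence = scores[class_id]
    if class_id in wanted and confidence > conf_threshold:
        return class_id, confidence
    return None


# Hypothetical row whose highest score belongs to class 2 ("car").
row = [0.5, 0.5, 0.2, 0.1, 0.9] + [0.0] * 80
row[5 + 2] = 0.85
print(pick_class(row))  # (2, 0.85)
```

The notebook does the same selection with `np.argmax`; plain Python is used here only to keep the sketch dependency-free.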

## Steps

##### 1. We pre-process the image or video and feed those frames forward through the network
##### 2. Then comes the custom-made post-processing function
  ###### 2.1. We define empty lists and, using two for loops, iterate through each output vector to collect the confidence score and classId index
  ###### 2.2. We check whether the class confidence score is greater than our defined confThreshold
  ###### 2.3. Using the NMSBoxes() method, we reduce the number of boxes and keep the best detection
  ###### 2.4. Draw the bounding box
  ###### 2.5. Call the frequency counter to keep track of the number of vehicles
##### 3. The final part is showing the post-processed image on screen
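
To make step 2.3 more concrete, here is a minimal pure-Python sketch of greedy non-maximum suppression. It shows the general idea behind `NMSBoxes()` and is not OpenCV's actual implementation. The helper names `iou` and `greedy_nms` and the sample boxes are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold, iou_threshold):
    """Return the indices of the boxes that are kept, best score first."""
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes cover the same vehicle.
boxes = [[100, 100, 50, 50], [105, 102, 50, 50], [300, 300, 40, 40]]
scores = [0.9, 0.6, 0.8]
print(greedy_nms(boxes, scores, 0.2, 0.2))  # [0, 2]
```

The lower-scoring duplicate (index 1) is dropped because it overlaps heavily with index 0, while the separate box at index 2 survives.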

## Links to the weights, cfg, and names files

### Because these files are large, here is the link for downloading them
- yolov3-320.cfg
- yolov3-320.weights
- coco.names
- https://pjreddie.com/darknet/yolo/

In [1]:
#Import Necessary packages
import math
import cv2
import csv
import collections
import numpy as np

In [2]:

class EuclideanDistTracker:
    def __init__(self):
        # Store the center positions of the objects
        self.center_points = {}
        # Keep the count of the IDs
        # Each time a new object is detected, the count increases by one
        self.id_count = 0


    def update(self, objects_rect):
        # Objects boxes and ids
        objects_bbs_ids = []

        # Get center point of new object
        for rect in objects_rect:
            x, y, w, h, index = rect
            cx = (x + x + w) // 2
            cy = (y + y + h) // 2

            # Find out if that object was detected already
            same_object_detected = False
            for id, pt in self.center_points.items():
                dist = math.hypot(cx - pt[0], cy - pt[1])

                if dist < 25:
                    self.center_points[id] = (cx, cy)
                    # print(self.center_points)
                    objects_bbs_ids.append([x, y, w, h, id, index])
                    same_object_detected = True
                    break

            # New object is detected we assign the ID to that object
            if same_object_detected is False:
                self.center_points[self.id_count] = (cx, cy)
                objects_bbs_ids.append([x, y, w, h, self.id_count, index])
                self.id_count += 1

        # Clean the dictionary by center points to remove IDS not used anymore
        new_center_points = {}
        for obj_bb_id in objects_bbs_ids:
            _, _, _, _, object_id, index = obj_bb_id
            center = self.center_points[object_id]
            new_center_points[object_id] = center

        # Update dictionary with IDs not used removed
        self.center_points = new_center_points.copy()
        return objects_bbs_ids



def ad(a, b):
    return a+b
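
The matching rule inside `update` can be looked at in isolation. The sketch below is a simplified stand-in, not the tracker itself: the function name `match_center` and the sample centres are hypothetical, and the 25-pixel limit mirrors the threshold used in the class above.

```python
import math


def match_center(known_centers, cx, cy, max_dist=25):
    """Return the ID of the first stored centre within max_dist pixels, else None."""
    for object_id, (px, py) in known_centers.items():
        if math.hypot(cx - px, cy - py) < max_dist:
            return object_id
    return None


# Hypothetical centres stored from the previous frame: {object_id: (cx, cy)}
previous = {0: (100, 100), 1: (300, 220)}

print(match_center(previous, 110, 105))  # 0 (moved about 11 px)
print(match_center(previous, 400, 400))  # None (would receive a new ID)
```

Because the first centre within range wins, two vehicles that pass close to each other may swap or share IDs. That trade-off is worth keeping in mind when tuning the distance threshold.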

In [3]:
# Function for finding the center of a rectangle
def find_center(x, y, w, h):
    x1=int(w/2)
    y1=int(h/2)
    cx = x+x1
    cy=y+y1
    return cx, cy

In [4]:
# Function for finding the detected objects from the network output
def postProcess(outputs,img):
    global detected_classNames
    height, width = img.shape[:2]
    boxes = []
    classIds = []
    confidence_scores = []
    detection = []
    for output in outputs:
        for det in output:
            scores = det[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if classId in required_class_index:
                if confidence > confThreshold:
                    # print(classId)
                    w,h = int(det[2]*width) , int(det[3]*height)
                    x,y = int((det[0]*width)-w/2) , int((det[1]*height)-h/2)
                    boxes.append([x,y,w,h])
                    classIds.append(classId)
                    confidence_scores.append(float(confidence))

    # Apply Non-Max Suppression
    indices = cv2.dnn.NMSBoxes(boxes, confidence_scores, confThreshold, nmsThreshold)
    # print(classIds)
    # NMSBoxes returns an empty tuple when no box is kept, so normalise before iterating
    for i in np.array(indices).flatten():
        x, y, w, h = boxes[i][0], boxes[i][1], boxes[i][2], boxes[i][3]
        # print(x,y,w,h)

        color = [int(c) for c in colors[classIds[i]]]
        name = classNames[classIds[i]]
        detected_classNames.append(name)
        # Draw classname and confidence score 
        cv2.putText(img,f'{name.upper()} {int(confidence_scores[i]*100)}%',
                  (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)

        # Draw bounding rectangle
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 1)
        detection.append([x, y, w, h, required_class_index.index(classIds[i])])

    # Update the tracker for each object
    boxes_ids = tracker.update(detection)
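
The box arithmetic in `postProcess` converts a normalised centre-format detection into a pixel box anchored at its top-left corner. A small worked example, using a hypothetical helper name `to_pixel_box` and made-up values, may make the conversion easier to check by hand.

```python
def to_pixel_box(det, frame_width, frame_height):
    """Convert normalised (cx, cy, w, h) values to a pixel [x, y, w, h] box."""
    w = int(det[2] * frame_width)
    h = int(det[3] * frame_height)
    x = int(det[0] * frame_width - w / 2)   # left edge = centre x - half width
    y = int(det[1] * frame_height - h / 2)  # top edge = centre y - half height
    return [x, y, w, h]


# A box centred in a 608x608 frame, a quarter as wide and half as tall.
print(to_pixel_box([0.5, 0.5, 0.25, 0.5], 608, 608))  # [228, 152, 152, 304]
```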

In [23]:
# One of the main functions in the project, this function can be called
# to apply detection and counting using a video locally on the machine

def realTime():
    while True:
        success, img = cap.read()
        # Stop when the video ends or a frame cannot be read
        if not success:
            break
        img = cv2.resize(img, (608,608), interpolation = cv2.INTER_AREA)
        #img = cv2.resize(img,(0,0),None,0.5,0.5)
        ih, iw, channels = img.shape
        #blob = cv2.dnn.blobFromImage(img, 1 / 255, (input_size, input_size), [0, 0, 0], 1, crop=False)
        blob = cv2.dnn.blobFromImage(img, 1 / 255, (608 , 608), [0, 0, 0], 1, crop=False)

        # Set the input of the network
        net.setInput(blob)
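        layersNames = net.getLayerNames()
        # Flatten so this works whether OpenCV returns a 1-D or an Nx1 array of layer indices
        outputNames = [layersNames[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]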
        # Feed data to the network
        outputs = net.forward(outputNames)
    
        # Find the objects from the network output
        postProcess(outputs,img)
        frequency = collections.Counter(detected_classNames)
        del detected_classNames[:]
        
        cv2.putText(img, "Car:        "+str(frequency['car']), (20, 40), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Motorbike:  "+str(frequency['motorbike']), (20, 60), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Bus:        "+str(frequency['bus']), (20, 80), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        cv2.putText(img, "Truck:      "+str(frequency['truck']), (20, 100), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
        

        # Show the frames
        cv2.imshow('Output', img)

        if cv2.waitKey(1) == ord('q'):
            break

    # Write the vehicle counting information in a file and save it

    # Finally release the capture object and destroy all active windows
    cap.release()
    cv2.destroyAllWindows()

In [24]:
def from_static_image(image):
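    img = cv2.imread(image)
    # cv2.imread returns None when the file is missing or unreadable
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image}")
    img = cv2.resize(img, (608,608), interpolation = cv2.INTER_AREA)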
    blob = cv2.dnn.blobFromImage(img, 1 / 255, (608 , 608), [0, 0, 0], 1, crop=False)

    # Set the input of the network
    net.setInput(blob)
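    layersNames = net.getLayerNames()
    # Flatten so this works whether OpenCV returns a 1-D or an Nx1 array of layer indices
    outputNames = [layersNames[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]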
    # Feed data to the network
    outputs = net.forward(outputNames)

    # Find the objects from the network output
    postProcess(outputs,img)

    # count the frequency of detected classes
    frequency = collections.Counter(detected_classNames)
    del detected_classNames[:]
    print(frequency)
    # Draw counting texts in the frame
    cv2.putText(img, "Car:        "+str(frequency['car']), (20, 40), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
    cv2.putText(img, "Motorbike:  "+str(frequency['motorbike']), (20, 60), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
    cv2.putText(img, "Bus:        "+str(frequency['bus']), (20, 80), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)
    cv2.putText(img, "Truck:      "+str(frequency['truck']), (20, 100), cv2.FONT_HERSHEY_SIMPLEX, font_size, font_color, font_thickness)


    cv2.imshow("image", img)

    cv2.waitKey(0)
    cv2.destroyAllWindows()
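
Both `realTime` and `from_static_image` rely on `collections.Counter` to turn the list of detected labels into per-class counts. The short sketch below, with a hypothetical label list, shows why looking up a class that was never detected is safe.

```python
import collections

# Hypothetical labels collected for one frame.
detected = ["car", "car", "bus", "car"]
frequency = collections.Counter(detected)

print(frequency["car"])    # 3
print(frequency["truck"])  # 0 (missing keys count as zero, no KeyError)

# The overlay text can be built from the same counter.
overlay_lines = [
    f"{label.capitalize()}: {frequency[label]}"
    for label in ("car", "motorbike", "bus", "truck")
]
print(overlay_lines)  # ['Car: 3', 'Motorbike: 0', 'Bus: 1', 'Truck: 0']
```

Note that these counts are per frame. Counting distinct vehicles over a whole video would also need the IDs returned by the tracker.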

In [25]:
# Initialize Tracker
# The tracker basically uses euclidean distance between two points in current and previous
# frame and if the distance is less than threshold it confirms that it is the same object
tracker = EuclideanDistTracker()

# Initialize the videocapture object
cap = cv2.VideoCapture('Traffic_video.mp4')
input_size = 320

#Initialize The image you want to do detection and counting on
image_file = 'Traffic_snap.jpg'

# Detection confidence threshold
confThreshold = 0.2
nmsThreshold  = 0.2

font_color = (0, 0, 255)
font_size = 0.5
font_thickness = 2


# Store Coco Names in a list
classesFile = "coco.names"
with open(classesFile) as f:
    classNames = f.read().strip().split('\n')
print(classNames)
print(len(classNames))

# class index for our required detection classes
required_class_index = [2, 3, 5, 7]

detected_classNames = []

## Model Files
modelConfiguration = 'yolov3-320.cfg'
modelWeights = 'yolov3-320.weights'

# Configure the network model
net = cv2.dnn.readNetFromDarknet(modelConfiguration, modelWeights)

# Configure the network backend to work on GPU
# (this requires an OpenCV build with CUDA support)

net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

# Define random colour for each class
np.random.seed(42)
colors = np.random.randint(0, 255, size=(len(classNames), 3), dtype='uint8')



['person', 'bicycle', 'car', 'motorbike', 'aeroplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'sofa', 'pottedplant', 'bed', 'diningtable', 'toilet', 'tvmonitor', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
80


In [26]:
if __name__ == '__main__':
    realTime()
    #from_static_image(image_file)
    