# Real-Time Object Detection with YOLO and OpenCV

YOLO (You Only Look Once) is a state-of-the-art deep learning algorithm for real-time object detection, first published in 2015 and presented at CVPR 2016. It is popular because it is much faster than most other deep learning object detection models.

Traditional methods slide a window across an image, score each window, and take the highest-scoring window as the object. YOLO works differently. A neural network predicts bounding boxes (based on anchors) together with a probability for each box, and the box with the highest score is taken to contain the object. The network sees the image just once, which is why the algorithm is called "You Only Look Once".
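To make the "highest score wins" idea concrete, here is a minimal sketch. The boxes and scores are made-up values for illustration only; they do not come from a real network, and `best_box` is a hypothetical helper name.

```python
# Hypothetical candidate boxes as (x, y, width, height) with made-up scores.
candidates = [
    {"box": (40, 60, 120, 200), "score": 0.35},
    {"box": (45, 58, 118, 205), "score": 0.91},
    {"box": (300, 80, 90, 150), "score": 0.12},
]


def best_box(boxes):
    """Return the candidate with the highest score."""
    return max(boxes, key=lambda b: b["score"])


best = best_box(candidates)
print(best["box"], best["score"])
```

In the real detector there can be several objects per image, so a score threshold and non-maximum suppression are used instead of a single maximum.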


Source: https://pjreddie.com/media/files/papers/YOLOv3.pdf

As the figure above shows, YOLOv3 is at least 3 times faster than the other detectors in the comparison.

YOLOv3 can be run on three kinds of input: an image file, a webcam feed, or a video file.
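One possible way to handle the three kinds of input with OpenCV is sketched below. The helper names `classify_source` and `open_source` and the list of image extensions are assumptions made for this example, not part of YOLO or OpenCV; only `cv2.imread` and `cv2.VideoCapture` are real OpenCV calls.

```python
from pathlib import Path

# Assumed list of still-image extensions; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def classify_source(source):
    """Return 'webcam', 'image', or 'video' for a given input source."""
    if isinstance(source, int):
        return "webcam"              # e.g. 0 is usually the default camera
    if Path(source).suffix.lower() in IMAGE_EXTENSIONS:
        return "image"
    return "video"


def open_source(source):
    """Open the source with OpenCV (imported here so classify_source works without it)."""
    import cv2
    if classify_source(source) == "image":
        return cv2.imread(source)        # a single frame as a NumPy array
    return cv2.VideoCapture(source)      # webcam index or path to a video file
```

For an image, the detection code runs once on the returned array; for a webcam or video, it runs inside a loop that reads one frame at a time.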

We will use the pre-trained YOLOv3 model and its weights. It can detect 80 different object classes, such as person, bicycle, and car.
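The network outputs a class index rather than a name, so the index has to be looked up in the list of class names. The sketch below uses a three-line stand-in for `coco.names` (the real file has 80 lines), and the predicted index is a made-up value.

```python
import io

# A three-line stand-in for coco.names; the real file lists 80 class names.
sample_names = io.StringIO("person\nbicycle\ncar\n")

# One class name per line, in the order the network was trained on.
classes = sample_names.read().splitlines()

class_id = 2                      # hypothetical index predicted by the network
print(len(classes), classes[class_id])
```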

To use the pre-trained model we need to:
* download the YOLO weights file
* download the YOLO configuration file
* download the class-name file (`coco.names`)
* install OpenCV 3.4.2 or above




**---------------------------------------------------------**
* download config file: https://github.com/pjreddie/darknet/tree/master/cfg
* download weights: https://pjreddie.com/darknet/yolo/
* download coco.names: https://github.com/pjreddie/darknet/blob/master/data/coco.names
* install OpenCV: `!pip install opencv-python`
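Before running the detector, it can help to confirm that the downloaded files are where the code expects them. This is a small optional check, assuming the files are saved in the working directory under the names used in the code below; `missing_files` is a hypothetical helper.

```python
from pathlib import Path

# File names as used by the detection code in this tutorial.
REQUIRED_FILES = ["yolov3.weights", "yolov3.cfg", "coco.names"]


def missing_files(folder="."):
    """Return the required files that are not present in `folder`."""
    return [name for name in REQUIRED_FILES if not (Path(folder) / name).is_file()]


missing = missing_files()
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All YOLOv3 files found.")
```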

In [None]:
import cv2
import numpy as np



# Load the YOLOv3 network from its weights and configuration files.

net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')





# Extract the object names from "coco.names" and put them in a list called "classes".

classes = []
with open('coco.names', 'r') as f:
    classes = f.read().splitlines()






# For a webcam, use "cap = cv2.VideoCapture(0)" instead.

cap = cv2.VideoCapture('street_video.mp4')

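while True:

    # Read the next frame; stop when the video ends or the frame cannot be read.
    ret, img = cap.read()
    if not ret:
        break

    height, width, _ = img.shape     # frame size, used later to scale the boxes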


    # Convert the frame to a blob: a 4-D array resized to 416x416, with pixel values
    # scaled to [0, 1] and the channel order swapped from BGR to RGB.
    blob = cv2.dnn.blobFromImage(img, 1/255, (416,416), (0,0,0), swapRB=True, crop=False)

    # Pass the blob to the network as its input.
    net.setInput(blob)

    
    # Run a forward pass and collect the results of the network's output layers.
    output_layers_names = net.getUnconnectedOutLayersNames()
    layer_outputs = net.forward(output_layers_names)

    # Boxes, confidences, and class ids of the detections that pass the threshold
    boxes = []
    confidences = []
    class_ids = []

    # Minimum class probability for a detection to be kept
    threshold = 0.5

    # Each detection is a vector: the first 4 values are the box center (x, y), width, and height
    # relative to the image size, the 5th is the object probability, and the rest are the 80 class probabilities.
    for output in layer_outputs:
        for detection in output:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]

            if confidence > threshold:
                center_x = int(detection[0]*width)
                center_y = int(detection[1]*height)
                w = int(detection[2]*width)         # width
                h = int(detection[3]*height)         # height


                # Let's determine the upper left corner of the box
                x = int(center_x - w/2)
                y = int(center_y - h/2)

                boxes.append([x,y,w,h])
                confidences.append(float(confidence))
                class_ids.append(class_id)





    # One object may be marked by several overlapping boxes. Non-maximum suppression keeps
    # the most likely box and drops the others; 0.4 is the overlap threshold.
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, threshold, 0.4)

    font = cv2.FONT_HERSHEY_PLAIN    # determines the font of the text on the boxes.
    colors = np.random.uniform(0,255, size = (len(boxes),3))  # determines the color of the boxes

    if len(indexes)>0:
        for i in indexes.flatten():
            x,y,w,h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i],2))
            color = colors[i]
            cv2.rectangle(img, (x,y), (x+w, y+h), color, 2) 
            cv2.putText(img, label + " "+ confidence, (x, y+20), font, 1.2, (255,255,255), 2) 



    # Show every frame, with or without detections; press Esc to quit.
    cv2.imshow('Image', img)
    key = cv2.waitKey(1)
    if key == 27:
        break


cap.release()
cv2.destroyAllWindows()
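The two less obvious steps in the loop above are converting a detection from center format to a top-left corner box, and non-maximum suppression. The sketch below shows both ideas in plain Python. It is a simplified illustration, not the way `cv2.dnn.NMSBoxes` is implemented; the helper names and the sample boxes and scores are made up for this example.

```python
def to_corner_box(detection, img_width, img_height):
    """Convert a relative (center_x, center_y, w, h) box to pixel [x, y, w, h]."""
    center_x = detection[0] * img_width
    center_y = detection[1] * img_height
    w = detection[2] * img_width
    h = detection[3] * img_height
    return [int(center_x - w / 2), int(center_y - h / 2), int(w), int(h)]


def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    x1 = max(a[0], b[0])
    y1 = max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold, iou_threshold):
    """Keep the best-scoring boxes, dropping any box that overlaps a kept one too much."""
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


# Made-up detections: two near-duplicates of one object and one separate object.
boxes = [[100, 100, 50, 80], [104, 102, 50, 80], [300, 200, 60, 60]]
scores = [0.9, 0.75, 0.6]
print(greedy_nms(boxes, scores, 0.5, 0.4))
```

Here the second box overlaps the first heavily and has a lower score, so it is dropped, while the third box does not overlap anything and is kept.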