# Computer Vision and Internet of Things

## Task 1 : Object Detection / Optical Character Recognition (OCR)

## Description : Implement an object detector which identifies the classes of the objects in an image

## Author : Shivam Deshpande

### Explanation

### Here we use MobileNet SSD and the dnn module in OpenCV to build a fast and efficient deep learning object detector
### We load the pre-trained model and define the class labels it was trained to detect
### The network predicts a class label for each object it finds in the input image
### Then we draw a bounding box around each detected object, with the predicted class label and the confidence value (the model's estimated probability for the detection)

### Importing necessary libraries

In [17]:
import numpy as np
import cv2

### Loading the pre-trained model

In [18]:
# Load the serialized model from disk
# It is a pre-trained Caffe network

print("Loading the model")
net = cv2.dnn.readNetFromCaffe('MobileNetSSD_deploy.prototxt.txt', 'MobileNetSSD_deploy.caffemodel')
print("Model is loaded")

Loading the model
Model is loaded
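
If either model file is missing, the load step fails inside OpenCV, and the resulting error can be hard to read. A minimal sketch of a check that could run before loading, assuming both files sit in the working directory as in the cell above; `missing_model_files` is a hypothetical helper, not part of OpenCV:

```python
from pathlib import Path


def missing_model_files(paths):
    """Return the paths that do not exist as regular files (hypothetical helper)."""
    return [p for p in paths if not Path(p).is_file()]


# File names taken from the loading cell above
model_files = ['MobileNetSSD_deploy.prototxt.txt', 'MobileNetSSD_deploy.caffemodel']

missing = missing_model_files(model_files)
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("Both model files found")
```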


### Initializing the objects

In [19]:
# These are the class labels that MobileNet SSD was trained to detect
# Each class gets a random color, used later for its bounding box and label text

classes = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", 
          "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

colors = np.random.uniform(0, 255, size=(len(classes), 3))
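
Because the colors are drawn at random, they change every time the cell runs. A small sketch of a reproducible variant, assuming a fixed seed is acceptable; the seed value `42` and the helper `color_for` are illustrative choices, not part of the original notebook. OpenCV interprets each triple in BGR order.

```python
import numpy as np

classes = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
           "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
           "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

# A seeded generator gives the same color table on every run (seed is arbitrary)
rng = np.random.default_rng(42)
colors = rng.uniform(0, 255, size=(len(classes), 3))


def color_for(class_name):
    """Look up the color of a class as a tuple of plain ints (hypothetical helper)."""
    idx = classes.index(class_name)
    return tuple(int(c) for c in colors[idx])


print(color_for("dog"))
```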

### Loading the image 

In [46]:
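# Load the image in which we want to detect objects

image = cv2.imread('image2.jpg')
# cv2.imread returns None instead of raising an error when the file cannot be read
if image is None:
    raise FileNotFoundError("Could not read 'image2.jpg'")
image = cv2.resize(image, (500,500))
cv2.imshow('detected',image)
cv2.waitKey(3000)
cv2.destroyAllWindows()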

### Object Detection Process

In [47]:
# Extract the image height and width, then build a 500x500 blob from the image
# This blob is what we feed forward through the network

(h, w) = image.shape[:2]
blob = cv2.dnn.blobFromImage(image, 0.007843, (500,500), 127.5)
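
To see what the scale factor `0.007843` and the mean `127.5` do, the arithmetic can be reproduced with NumPy alone. This is a rough sketch, not OpenCV's implementation: it assumes the image is already at the target size and skips the resize and the optional channel swap. The function name `to_blob` is hypothetical. The mean is subtracted first and the result is then scaled, which maps pixel values from [0, 255] to roughly [-1, 1].

```python
import numpy as np


def to_blob(image_bgr, scale=0.007843, mean=127.5):
    """Mimic the arithmetic of blobFromImage for an already-resized image."""
    x = image_bgr.astype(np.float32)
    x = (x - mean) * scale            # subtract the mean, then scale
    x = np.transpose(x, (2, 0, 1))    # height, width, channels -> channels, height, width
    return x[np.newaxis, ...]         # add the batch dimension -> (1, C, H, W)


# Dummy 500x500 image: first channel fully saturated, the others black
dummy = np.zeros((500, 500, 3), dtype=np.uint8)
dummy[..., 0] = 255

blob = to_blob(dummy)
print(blob.shape)
print(blob.min(), blob.max())
```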

In [48]:
# Here first we will set our blob as input to the network
# Then we compute the forward pass for the input and store the result as detections

print('processing object detection')
net.setInput(blob)
detections = net.forward()

processing object detection
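
The array returned by the forward pass has shape `(1, 1, N, 7)`, one row per candidate detection. As used in the loop that follows, column 1 is the class index, column 2 the confidence, and columns 3 to 6 the box corners as fractions of the image size; column 0 is assumed here to be the image index within the batch. The sketch below uses made-up numbers in place of real network output to show the same filtering done in vectorized form.

```python
import numpy as np

# Synthetic stand-in for net.forward(); the values are invented for illustration
detections = np.array([[[
    [0, 12, 0.91, 0.25, 0.25, 0.75, 1.00],
    [0, 15, 0.50, 0.50, 0.00, 1.00, 0.75],
    [0,  8, 0.12, 0.00, 0.00, 0.25, 0.25],
]]], dtype=np.float32)

(h, w) = (500, 500)  # image size used in this notebook

rows = detections[0, 0]                  # shape (N, 7)
keep = rows[rows[:, 2] > 0.3]            # same threshold as the loop in the next section
class_ids = keep[:, 1].astype(int)
boxes = (keep[:, 3:7] * np.array([w, h, w, h])).astype(int)

print(class_ids)
print(boxes)
```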


### Label predictions, boxing objects and displaying probability

Description

--> Loop through the detections to determine what the objects are and where they are in the image

--> Check the confidence (probability of prediction) of each detection

--> If the confidence is high enough, i.e. above the threshold, then

       Display the prediction on the terminal and

       Draw the prediction on the image with text and a colored bounding box

In [49]:
for i in range(detections.shape[2]):
    # extracting confidence (probability of prediction)
    confidence = detections[0, 0, i, 2]
    # filter out weak detections by ensuring the confidence is greater than minimum confidence
    if confidence > 0.3:
        
        idx = int(detections[0, 0, i, 1]) # extract the index of class label from detections
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h]) # compute the bounding box around the detected object
        
        # extract the (x,y) co-ordinates of the box that we will use for drawing the rectangle and displaying the text
        (startX, startY, endX, endY) = box.astype("int") 
        
        # build a text label containing class label and confidence
        label = "{}:{:.2f}%".format(classes[idx], confidence * 100)
        
        # print the label to the terminal
        print(label)
        
        # draw a colored rectangle around the object using the previously extracted (x,y) coordinates
        cv2.rectangle(image, (startX, startY), (endX, endY), colors[idx], 2)
        
        # Normally we want to display the text above the rectangle but if there is no room
        # display it just below the top of the rectangle
        y = startY - 15 if startY - 15 > 15 else startY + 15
        
        # overlay the colored text onto the image using the calculated y value
        cv2.putText(image, label, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, colors[idx], 2)

dog:91.01%
person:50.00%
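
The label placement rule in the loop can be checked in isolation. The sketch below also adds a clamping step, on the assumption that the predicted corners may fall slightly outside the image; the original loop does not clamp. Both `label_y` and `clamp_box` are hypothetical helper names.

```python
def label_y(start_y, offset=15):
    """Place the label above the box if there is room, otherwise just inside its top edge."""
    return start_y - offset if start_y - offset > offset else start_y + offset


def clamp_box(box, width, height):
    """Clip (startX, startY, endX, endY) so every corner lies inside the image."""
    start_x, start_y, end_x, end_y = box
    return (
        max(0, min(start_x, width - 1)),
        max(0, min(start_y, height - 1)),
        max(0, min(end_x, width - 1)),
        max(0, min(end_y, height - 1)),
    )


print(label_y(100))   # box well below the top edge: label goes above it
print(label_y(20))    # box near the top edge: label goes inside it
print(clamp_box((-10, 5, 520, 499), 500, 500))
```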


In [50]:
# Display the resulting output image
cv2.imshow('output', image)
cv2.waitKey(6000)
cv2.destroyAllWindows()

### Let's now check this with other images

## Conclusion : We have implemented object detection using MobileNet SSD and the dnn module of cv2. For each detected object, it draws a bounding box with the class label and confidence value predicted by the loaded pre-trained network

## Thank You