# TASK 1 : Object Detection 

GRIP    : The Sparks Foundation

Function: IoT and Computer Vision

Batch   : January 2021

Author  : Charith

# Introduction to Object Detection

Object detection, a subfield of computer vision, is an automated method for locating objects of interest in an image and separating them from the background. For example, Figure 1 shows two images with objects in the foreground. There is a bird in the left image, while there is a dog and a person in the right image.

Solving the object detection problem means placing a tight bounding box around each of these objects and associating the correct object category with each bounding box. As with other computer vision tasks, deep learning is the state-of-the-art approach to object detection.
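To make "tight bounding box" concrete, one common way to score a predicted box against a reference box is intersection over union (IoU). The sketch below is only an illustration and is not part of this notebook's pipeline. The function name `iou` and the `(startX, startY, endX, endY)` corner format are assumptions, chosen to match the box layout used in the detection code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (startX, startY, endX, endY) boxes."""
    # corners of the overlapping rectangle (it may be empty)
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # two degenerate (zero-area) boxes have no meaningful overlap
    return inter / union if union > 0 else 0.0


# identical boxes overlap completely, disjoint boxes not at all
print(iou((0, 0, 10, 10), (0, 0, 10, 10)))
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))
```

A value close to 1 means the predicted box fits the reference box tightly, and a value close to 0 means the two barely overlap.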



# MobileNet

MobileNet is an efficient, portable CNN architecture used in real-world applications. MobileNets primarily use depthwise separable convolutions in place of the standard convolutions used in earlier architectures, which yields lighter models. MobileNets also introduce two global hyperparameters (the width multiplier and the resolution multiplier) that let model developers trade accuracy for lower latency and a smaller model size, depending on their requirements.
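The saving from depthwise separable convolutions can be estimated with simple arithmetic. The sketch below counts multiply-add operations for one layer, following the cost formulas in the MobileNet paper. The function names and the example layer size (3x3 kernel, 512 input and output channels, 14x14 feature map) are illustrative assumptions and are not taken from this notebook.

```python
def standard_conv_cost(k, m, n, f):
    """Multiply-adds of a standard k x k convolution with m input channels,
    n output channels and an f x f output feature map."""
    return k * k * m * n * f * f


def separable_conv_cost(k, m, n, f, alpha=1.0, rho=1.0):
    """Multiply-adds of a depthwise separable convolution.

    alpha is the width multiplier (thins the channels) and rho is the
    resolution multiplier (shrinks the feature map).
    """
    m, n, f = int(alpha * m), int(alpha * n), int(rho * f)
    depthwise = k * k * m * f * f   # one k x k filter per input channel
    pointwise = m * n * f * f       # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


std = standard_conv_cost(3, 512, 512, 14)
sep = separable_conv_cost(3, 512, 512, 14)
print("separable / standard = {:.3f}".format(sep / std))

# halving the width multiplier shrinks the cost further
half = separable_conv_cost(3, 512, 512, 14, alpha=0.5)
print("alpha=0.5 / alpha=1.0 = {:.3f}".format(half / sep))
```

For a k x k kernel and n output channels the ratio works out to 1/n + 1/k², so a 3x3 separable layer costs roughly one ninth of the standard layer when n is large.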

For more information : https://iq.opengenus.org/mobilenet-v1-architecture/

# Single Shot Detector


A Single Shot Detector (SSD), like YOLO, takes only one shot, that is, a single forward pass of the network, to detect multiple objects present in an image using MultiBox.

It is a fast, high-accuracy object detection algorithm, as shown by comparisons of speed and accuracy between different object detection models on VOC2007.

For more info: https://towardsdatascience.com/review-ssd-single-shot-detector-object-detection-851a94607d11
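To give a feel for how one pass can cover many objects, the sketch below counts the default boxes that the original SSD300 configuration from the SSD paper evaluates in a single forward pass. The feature-map sizes and boxes per cell are those of the paper's VGG-based SSD300. The MobileNet SSD model used in this notebook may use a different layout, so treat the numbers as an illustration only.

```python
# (feature map side length, default boxes per cell) for the original SSD300
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def count_default_boxes(layers):
    """Total number of default boxes predicted across all feature maps."""
    return sum(side * side * boxes for side, boxes in layers)


total = count_default_boxes(SSD300_LAYERS)
print(total)  # 8732
```

Every one of these boxes gets class scores and box offsets in the same pass, and the low-confidence ones are discarded afterwards.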

# Combining MobileNets and Single Shot Detectors for fast, efficient deep-learning based object detection

We will use the MobileNet SSD model together with the deep neural network (dnn) module in OpenCV to build our object detector.

# IMPORTING NECESSARY PACKAGES

In [1]:
import numpy as np
import argparse
import cv2

In [2]:
# initialize the list of class labels MobileNet SSD was trained to
# detect, then generate a set of bounding box colors for each class
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
"sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

# Loading the Pre-Trained Model

In [3]:
# load our serialized model from disk
# (a pre-trained Caffe network)
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe('MobileNetSSD_deployprototxt.txt','MobileNetSSD_deploy.caffemodel')
print("[INFO] loading done")

[INFO] loading model...
[INFO] loading done
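Loading fails if either model file is missing from the working directory, so it can help to check for the files first. The following is a minimal sketch that uses only the standard library. The helper name `missing_files` is hypothetical, and the file names are copied from the loading cell above.

```python
import os


def missing_files(paths):
    """Return the subset of `paths` that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# file names copied from the loading cell (assumed to sit in the working directory)
model_files = ['MobileNetSSD_deployprototxt.txt', 'MobileNetSSD_deploy.caffemodel']

missing = missing_files(model_files)
if missing:
    print("[WARN] missing model files: {}".format(", ".join(missing)))
else:
    print("[INFO] model files found")
```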


# Loading Image for Detection

In [11]:
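image = cv2.imread('images/doganimal.jpg')
# cv2.imread returns None instead of raising when the file cannot be read
if image is None:
    raise FileNotFoundError('images/doganimal.jpg')
image = cv2.resize(image, (500, 500))
cv2.imshow("Input", image)
cv2.waitKey(0)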

-1

# Object Detection Process

In [12]:
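(h, w) = image.shape[:2]
# scale factor 0.007843 (about 1/127.5) and mean 127.5 map pixel values to roughly [-1, 1];
# the blob is resized to 100x100 here, whereas detect() below uses 300x300
blob = cv2.dnn.blobFromImage(image, 0.007843, (100, 100), 127.5)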
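The two numeric arguments passed to `cv2.dnn.blobFromImage` control pixel normalization: the mean is subtracted from each pixel value and the result is multiplied by the scale factor. The function also resizes the image and rearranges it into a batch in channel-first order. The sketch below reproduces only the arithmetic for a single pixel value in plain Python. The helper name `normalize_pixel` is hypothetical.

```python
SCALE = 0.007843   # approximately 1 / 127.5
MEAN = 127.5       # midpoint of the 8-bit range [0, 255]


def normalize_pixel(value, scale=SCALE, mean=MEAN):
    """Mean-subtract, then scale, one 8-bit pixel value."""
    return (value - mean) * scale


# darkest, middle and brightest values of an 8-bit channel
for v in (0, 127.5, 255):
    print(v, round(normalize_pixel(v), 3))
```

With these settings an 8-bit channel ends up in approximately the range [-1, 1], centered on zero.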

In [13]:
print("[INFO] computing object detections...")
net.setInput(blob)
detections = net.forward()

[INFO] computing object detections...


In [14]:
for i in np.arange(0, detections.shape[2]):
    # extract the confidence (i.e., probability) associated with the
    # prediction
    confidence = detections[0, 0, i, 2]
    # filter out weak detections by ensuring the `confidence` is
    # greater than the minimum confidence
    if confidence > 0.2:
        # extract the index of the class label from the `detections`,
        # then compute the (x, y)-coordinates of the bounding box for
        # the object
        idx = int(detections[0, 0, i, 1])
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")
        # display the prediction
        label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
        print("[INFO] {}".format(label))
        cv2.rectangle(image, (startX, startY), (endX, endY), COLORS[idx], 2)
        y = startY - 15 if startY - 15 > 15 else startY + 15
        cv2.putText(image, label, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)

[INFO] person: 91.48%
[INFO] person: 83.77%
[INFO] person: 73.43%
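The loop above reads each detection as a row of seven numbers: index 1 is the class id, index 2 is the confidence, and indices 3 to 6 are box corners normalized to [0, 1]. The sketch below shows the same filtering and scaling on plain Python lists that stand in for the NumPy array, so it runs without OpenCV. The rows are made-up values, not real network output. The helper name `parse_detections` and the clamping of boxes to the image bounds are assumptions added for illustration and are not part of the notebook's loop.

```python
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]


def parse_detections(rows, width, height, classes, min_confidence=0.2):
    """Turn raw detection rows into (label, confidence, box) tuples in pixels."""
    results = []
    for _, class_id, confidence, x1, y1, x2, y2 in rows:
        # skip weak detections, as the notebook loop does
        if confidence <= min_confidence:
            continue
        # scale normalized corners to pixels, then clamp them to the image
        start_x = min(max(int(x1 * width), 0), width - 1)
        start_y = min(max(int(y1 * height), 0), height - 1)
        end_x = min(max(int(x2 * width), 0), width - 1)
        end_y = min(max(int(y2 * height), 0), height - 1)
        box = (start_x, start_y, end_x, end_y)
        results.append((classes[int(class_id)], confidence, box))
    return results


# hypothetical rows: [image_id, class_id, confidence, x1, y1, x2, y2]
rows = [
    [0, 15, 0.91, 0.25, 0.25, 0.50, 0.75],      # kept
    [0, 12, 0.05, 0.00, 0.00, 0.25, 0.25],      # dropped: below threshold
    [0, 7, 0.60, -0.125, 0.50, 1.125, 0.75],    # kept, spills outside the image
]

for label, confidence, box in parse_detections(rows, 500, 500, CLASSES):
    print("[INFO] {}: {:.2f}% {}".format(label, confidence * 100, box))
```

Clamping keeps the rectangle and label inside the frame when the network predicts corners slightly outside [0, 1].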


In [15]:
# show the output image
cv2.imshow("Output", image)
cv2.waitKey(0)

-1

# Defining a Function for Object Detection

In [44]:
def detect(imageurl):
    # `imageurl` is the name of a file inside the local images/ folder
    image = cv2.imread("images/" + imageurl)
    # cv2.imread returns None instead of raising when the file cannot be read
    if image is None:
        raise FileNotFoundError("images/" + imageurl)
    image = cv2.resize(image, (300, 300))
    cv2.imshow("Input", image)
    cv2.waitKey(0)
    (h, w) = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 0.007843, (300, 300), 127.5)
    print("[INFO] computing object detections...")
    net.setInput(blob)
    detections = net.forward()

    for i in np.arange(0, detections.shape[2]):
        # extract the confidence (i.e., probability) associated with the
        # prediction
        confidence = detections[0, 0, i, 2]
        # filter out weak detections by ensuring the `confidence` is
        # greater than the minimum confidence
        if confidence > 0.2:
            # extract the index of the class label from the `detections`,
            # then compute the (x, y)-coordinates of the bounding box for
            # the object
            idx = int(detections[0, 0, i, 1])
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            # display the prediction
            label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
            print("[INFO] {}".format(label))
            cv2.rectangle(image, (startX, startY), (endX, endY), COLORS[idx], 2)
            y = startY - 15 if startY - 15 > 15 else startY + 15
            cv2.putText(image, label, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)

    # show the output image
    cv2.imshow("Output", image)
    cv2.waitKey(0)

# Result

In [39]:
detect('2cars.jpg')

[INFO] computing object detections...
[INFO] car: 99.72%
[INFO] car: 95.50%


In [45]:
detect('ncars.jpg')

[INFO] computing object detections...
[INFO] car: 99.86%


In [41]:
detect('dogperson.jpg')

[INFO] computing object detections...
[INFO] dog: 77.35%
[INFO] person: 97.00%
[INFO] person: 94.26%


In [46]:
detect('horse.jpg')

[INFO] computing object detections...
[INFO] horse: 99.98%
[INFO] horse: 99.95%
