# **Getting Prerequisites**
Before working on the Object Detection module, install the following components:
> -  Python
> -  Tensorflow
> -  Tensorboard
> -  Protobuf v3.4 or above 

# **Setting Up The Environment**
TensorFlow and TensorFlow GPU can be installed with pip or conda:
>#### For CPU
>  -  pip install tensorflow
> #### For GPU
>  -  pip install tensorflow-gpu
    
All the other libraries can also be installed with pip or conda. The commands are listed below:<br>
>  -  pip install Cython<br>
>  -  pip install contextlib2<br>
>  -  pip install pillow<br>
>  -  pip install lxml<br>
>  -  pip install jupyter<br>
>  -  pip install matplotlib
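
Once the installs finish, a quick check can confirm that the packages are visible to Python. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the list uses the names the packages are normally imported under (for example, `PIL` for the `pillow` package), which is an assumption about a standard install.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the packages installed above (pillow is imported as PIL).
# jupyter is left out because it is normally used as a command-line tool.
REQUIRED = ["tensorflow", "Cython", "contextlib2", "PIL", "lxml", "matplotlib"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing modules:", missing if missing else "none")
```

An empty result only shows that the modules can be located; it does not verify their versions.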

- Next, we need Protobuf. Protocol Buffers (Protobuf) are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data: think of it like XML, but smaller, faster, and simpler. We need to download Protobuf version 3.4 or above and extract it.<br><br>

- Now we need to clone or download the TensorFlow Models repository from GitHub. Once it is downloaded and extracted, rename the “models-master” folder to just “models”.<br><br>

- For simplicity, we are going to keep “models” and “protobuf” under one folder named “Tensorflow”.<br><br>

- Next, we need to go inside the Tensorflow folder, then inside “models” and its “research” folder, and run the Protobuf compiler from there to compile every proto file into a Python file:<br>
> "path_of_protobuf's bin"./bin/protoc object_detection/protos/*.proto --python_out=.
- To check whether this worked, we can go to the protos folder at models>research>object_detection>protos, where every proto file should now have a matching Python file.<br>
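
The manual check described above can also be scripted. The following is a minimal sketch, assuming the generated files follow protoc's usual `<name>_pb2.py` naming and that the script is run from the “research” folder. The helper name `missing_pb2` is hypothetical.

```python
from pathlib import Path


def missing_pb2(protos_dir):
    """Return the names of .proto files in `protos_dir` that have no generated *_pb2.py file."""
    protos_dir = Path(protos_dir)
    return sorted(
        proto.name
        for proto in protos_dir.glob("*.proto")
        if not (protos_dir / (proto.stem + "_pb2.py")).exists()
    )


if __name__ == "__main__":
    # Assumed location, relative to the "research" folder described above.
    print(missing_pb2("object_detection/protos"))
```

An empty list means every proto file has a compiled counterpart. Any names printed point to proto files that were not compiled.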

# Import required libraries

In [1]:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

sys.path.append("..")

from utils import label_map_util

from utils import visualization_utils as vis_util



### The following library is used to handle image and video operations

In [2]:
import cv2

Next, we will download a model trained on the __[<font color=blue>*COCO dataset*</font>](http://cocodataset.org/)__. COCO stands for Common Objects in Context; the dataset contains around **330K images**, more than 200K of which are labeled. Model selection matters because there is a tradeoff between speed and accuracy: the right model depends on our requirements and the available system memory.

The file “models>research>object_detection>g3doc>detection_model_zoo” lists all the models along with their speed and accuracy (mAP).<br>
<img src="images/models.png" alt="Models in the detection model zoo with their speed and accuracy (mAP)" title="Detection model zoo" />

- Next, we specify the model to use and the path to its frozen inference graph, which TensorFlow generated for the trained model.

In [3]:
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')

NUM_CLASSES = 90

- If the model folder does not already exist on the system, this code downloads the model archive from the internet and extracts the frozen inference graph from it.

In [4]:
if not os.path.exists(MODEL_NAME):
    # Download the model archive (urlretrieve replaces the deprecated URLopener).
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
    # Extract only the frozen inference graph from the archive.
    with tarfile.open(MODEL_FILE) as tar_file:
        for member in tar_file.getmembers():
            file_name = os.path.basename(member.name)
            if 'frozen_inference_graph.pb' in file_name:
                tar_file.extract(member, os.getcwd())
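
The extraction step can be tried on its own, without a download. The following is a minimal sketch, not the notebook's code: the helper name `extract_frozen_graph` and its parameters are hypothetical, and it matches the file name exactly rather than by substring, which is a slightly stricter test than the cell above uses.

```python
import os
import tarfile


def extract_frozen_graph(archive_path, dest_dir, target="frozen_inference_graph.pb"):
    """Extract only the members whose base name equals `target`; return their paths."""
    extracted = []
    # The context manager closes the archive even if extraction fails.
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            if member.isfile() and os.path.basename(member.name) == target:
                tar.extract(member, dest_dir)
                extracted.append(os.path.join(dest_dir, member.name))
    return extracted
```

Under these assumptions, calling `extract_frozen_graph(MODEL_FILE, os.getcwd())` would leave the graph at the path stored in `PATH_TO_CKPT` and skip the other files in the archive.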

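- The following code loads the frozen inference graph into memory so that it can be used for detection. It uses TensorFlow 1.x names (tf.GraphDef, tf.gfile); in TensorFlow 2.x these are available under tf.compat.v1.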

In [5]:
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

- Next, we load all the labels, which are written on the boxes drawn around detected objects.

In [6]:
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
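
To make the result of this cell concrete: `category_index` is, in effect, a dictionary keyed by class id, and the visualization code uses it to turn a detected class id into a display name. The following is a minimal sketch of that idea in plain Python, not the library's own code. The helper name `build_category_index` is hypothetical, and the two sample entries are written out by hand from the COCO label map.

```python
def build_category_index(categories):
    """Map each category id to its category dictionary."""
    return {category["id"]: category for category in categories}


# Two entries from the COCO label map, written out by hand for illustration.
sample_categories = [
    {"id": 1, "name": "person"},
    {"id": 2, "name": "bicycle"},
]
category_index_example = build_category_index(sample_categories)
print(category_index_example[1]["name"])  # prints "person"
```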

- Now we open the camera and capture frames for the session.
- For each frame, we scan for objects, draw a box around each one, and write the confidence score of the detection on it.

In [7]:
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Look up the input and output tensors once, before the capture loop.
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represents the level of confidence for the detected object.
        # The score is shown on the result image, together with the class label.
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')

        cap = cv2.VideoCapture(0)
        while True:
            ret, image_np = cap.read()
            if not ret:
                # Stop if the camera did not return a frame.
                break
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)

            cv2.imshow('object detection', cv2.resize(image_np, (800,600)))
            # Wait up to 30 ms for a key press; Esc (key code 27) exits the loop.
            k = cv2.waitKey(30) & 0xff
            if k == 27:
                break

        cap.release()
        cv2.destroyAllWindows()

The code above uses OpenCV and the camera object initialized earlier to open a window named “object detection” that shows each frame resized to 800×600. After each frame, cv2.waitKey(30) waits up to 30 milliseconds for a key press; if the Esc key (key code 27) is pressed, the loop ends, the camera is released, and the window is closed.

> cv2.imshow('object detection', cv2.resize(image_np, (800,600)))<br>
> k = cv2.waitKey(30) & 0xff<br>
> if k == 27: break
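
The boxes returned by the detection step are normalized to the range 0 to 1 in the order [ymin, xmin, ymax, xmax], which is why the visualization call passes use_normalized_coordinates=True. The following is a minimal sketch of how such boxes map to pixel coordinates when handled without the visualization helper. The function name `to_pixel_boxes`, the sample values, and the frame size are hypothetical; the 0.5 threshold mirrors the helper's default minimum score.

```python
def to_pixel_boxes(boxes, scores, width, height, min_score=0.5):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel (left, top, right, bottom).

    Boxes whose score is below `min_score` are dropped.
    """
    result = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue
        result.append((
            int(round(xmin * width)),
            int(round(ymin * height)),
            int(round(xmax * width)),
            int(round(ymax * height)),
        ))
    return result


# Hypothetical detections for a 640x480 frame.
example_boxes = [[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.5, 0.5]]
example_scores = [0.9, 0.3]
print(to_pixel_boxes(example_boxes, example_scores, width=640, height=480))
# prints [(320, 120, 640, 360)]
```

The second box is dropped because its score of 0.3 is below the threshold.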

The following is an example of the output produced by the model we have set up:<br>
<img src="images/example.png" alt="Example frame with boxes and labels drawn around detected objects" title="Example detection output" />