# Object Detection Demo
Welcome to the object detection inference walkthrough! This notebook walks you step by step through using a pre-trained model to detect objects in images. Make sure to follow the [installation instructions](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/installation.md) before you start.

# Imports

In [1]:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

import time

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

## Env setup

In [2]:
# This is needed to display the images.
%matplotlib inline
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")

## Object detection imports
Here are the imports from the object detection module.

In [3]:
from utils import label_map_util
from utils import visualization_utils as vis_util

# Model preparation 

## Variables

Any model exported using the `export_inference_graph.py` tool can be loaded here simply by changing `PATH_TO_CKPT` to point to a new .pb file.  

By default we use an "SSD with Mobilenet" model here. See the [detection model zoo](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md) for a list of other models that can be run out-of-the-box with varying speeds and accuracies.

In [13]:
"""
 Set the following variable to point to the model file and labels file.
"""

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = '/home/shanthans/Documents/Learning/MachineLearning/tensorflow/google-research/models/output_inference_graph_udacity_real_resnet.pb/frozen_inference_graph.pb'

# Path to the label map, which supplies the label string shown for each box.
PATH_TO_LABELS = '/home/shanthans/Documents/Projects/SDC/Term3/Course/Projects/p3_carnd_capstone/team/shanthan/ros/src/tl_detector/light_classification/graphs/object-detection.pbtxt'

# Number of classes.
NUM_CLASSES = 14

"""
 Path to test images.
 Expects a 'test_images' folder containing the input images under PARENT_TO_TEST_IMAGES_DIR.
 Expects an empty 'out_images' folder under PARENT_TO_TEST_IMAGES_DIR to store the output images.
"""
PARENT_TO_TEST_IMAGES_DIR = '/home/shanthans/Documents/Projects/SDC/Term3/Course/Projects/p3_carnd_capstone/team/shanthan/Misc/rosbag_extraction/'
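
Because every path above is an absolute path on one particular machine, it can help to confirm that the files and folders exist before loading anything. The following is a minimal sketch and not part of the original notebook; `missing_paths` is a hypothetical helper name.

```python
import os


def missing_paths(files=(), dirs=()):
    """Return the entries that do not exist as a file or as a directory."""
    # Hypothetical helper: files are checked with isfile, folders with isdir.
    missing = [p for p in files if not os.path.isfile(p)]
    missing += [d for d in dirs if not os.path.isdir(d)]
    return missing
```

For example, `missing_paths(files=[PATH_TO_CKPT, PATH_TO_LABELS], dirs=[PARENT_TO_TEST_IMAGES_DIR])` should return an empty list when the configuration is valid on your machine.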

## Download Model

In [None]:
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Fetch the archive, then extract only the frozen inference graph from it.
urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
with tarfile.open(MODEL_FILE) as tar_file:
  for member in tar_file.getmembers():
    file_name = os.path.basename(member.name)
    if 'frozen_inference_graph.pb' in file_name:
      tar_file.extract(member, os.getcwd())
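
The extraction step can also be wrapped in a small reusable function. The sketch below is an illustration under a few assumptions rather than the notebook's own code: `extract_frozen_graph` is a hypothetical name, it matches the base name exactly instead of by substring, and it returns the extracted path so the result can be assigned to `PATH_TO_CKPT`.

```python
import os
import tarfile


def extract_frozen_graph(tar_path, dest_dir, target='frozen_inference_graph.pb'):
    """Extract the first regular file whose base name equals `target`.

    Returns the path of the extracted file, or None if no member matches.
    """
    # The context manager closes the archive even if extraction fails.
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if member.isfile() and os.path.basename(member.name) == target:
                tar.extract(member, dest_dir)
                return os.path.join(dest_dir, member.name)
    return None
```

A `None` result means the archive holds no frozen graph, which is easier to handle than a missing file discovered later at load time.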

## Load a (frozen) TensorFlow model into memory

In [5]:
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')

## Loading label map
Label maps map indices to category names, so that when our convolutional network predicts `5`, we know that this corresponds to `airplane`. Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine.

In [6]:
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
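
To illustrate the point that any integer-to-label dictionary will do, here is a minimal sketch that builds one from label map text with regular expressions. It is not the utility used above: `category_index_from_text` is a hypothetical name, the sample class names are made up, only the `name` field is read (`display_name` is ignored), and the `{id: {'id': ..., 'name': ...}}` shape is an assumption about what the visualization code expects. A real protobuf text parser is more robust.

```python
import re

# Each entry is assumed to look like: item { id: 1 name: 'some_label' }
ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
ID_RE = re.compile(r"\bid\s*:\s*(\d+)")
NAME_RE = re.compile(r"""\bname\s*:\s*['"]([^'"]*)['"]""")


def category_index_from_text(text):
    """Map each integer id to a dict holding that id and its label name."""
    index = {}
    for body in ITEM_RE.findall(text):
        id_match = ID_RE.search(body)
        name_match = NAME_RE.search(body)
        # Skip entries that lack either field instead of guessing.
        if id_match and name_match:
            class_id = int(id_match.group(1))
            index[class_id] = {'id': class_id, 'name': name_match.group(1)}
    return index


# Hypothetical label map text, for illustration only.
SAMPLE = "item {\n  id: 1\n  name: 'class_a'\n}\nitem {\n  id: 2\n  name: 'class_b'\n}\n"
sample_index = category_index_from_text(SAMPLE)
```

With the sample text, `sample_index[2]['name']` is `'class_b'`.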

## Helper code

In [7]:
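def load_image_into_numpy_array(image):
  # PIL reports size as (width, height); the array is laid out as (height, width, channels).
  (im_width, im_height) = image.size
  # Convert to RGB so that grayscale or RGBA inputs still reshape to three channels.
  return np.array(image.convert('RGB').getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)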

# Detection

In [14]:
"""
 The test images are expected to be named image0001.jpg through image0699.jpg.
 To test the code with your own images, add their paths to TEST_IMAGE_PATHS.
"""

#PATH_TO_TEST_IMAGES_DIR = 'test_images'
#PARENT_TO_TEST_IMAGES_DIR = '/home/shanthans/Documents/Projects/SDC/Term3/Course/Projects/p3_carnd_capstone/team/shanthan/Misc/'
PATH_TO_TEST_IMAGES_DIR = PARENT_TO_TEST_IMAGES_DIR + 'test_images'
#PATH_TO_TEST_IMAGES_DIR = '/home/shanthans/Documents/Learning/MachineLearning/tensorflow/google-research/models/object_detection/test_images'
#PATH_TO_TEST_IMAGES_DIR = '/home/shanthans/Documents/Learning/MachineLearning/tensorflow/TF-API-Blog/raccoon_dataset/test_images/'
#TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image%04d.jpg'%i) for i in range(1, 700) ]
#TEST_IMAGE_PATHS = ['/home/shanthans/Documents/Learning/MachineLearning/tensorflow/TF-API-Blog/shanthan_dataset/example.jpg']

PATH_TO_OUT_IMAGES = PARENT_TO_TEST_IMAGES_DIR + 'out_images/'
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
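
The list above assumes that every file in the numbered range exists, so a single missing frame would make `Image.open` fail partway through the run. As an alternative, the folder contents can be discovered at run time. This is a minimal sketch; `list_test_images` and its default pattern are hypothetical and not part of the original notebook.

```python
import glob
import os


def list_test_images(image_dir, pattern='image*.jpg'):
    """Return the sorted paths in `image_dir` whose file name matches `pattern`."""
    # Zero-padded names such as image0001.jpg sort correctly as plain strings.
    return sorted(glob.glob(os.path.join(image_dir, pattern)))
```

Assigning `TEST_IMAGE_PATHS = list_test_images(PATH_TO_TEST_IMAGES_DIR)` would then process whatever matching images are present, in name order.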

In [16]:


with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    # Define input and output Tensors for detection_graph
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    # Each box represents a part of the image where a particular object was detected.
    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    # Each score represents the level of confidence for the corresponding object.
    # Score is shown on the result image, together with the class label.
    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
    
    count = 1
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      # the array based representation of the image will be used later in order to prepare the
      # result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      # Actual detection.
    
      start_time = time.time()
          
      (boxes, scores, classes, num) = sess.run(
          [detection_boxes, detection_scores, detection_classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
    
      duration = time.time() - start_time
      
      print(duration)
      #print(np.squeeze(scores))
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=4)
      #plt.figure(figsize=IMAGE_SIZE)
      #plt.imshow(image_np) 
      # Save the annotated image under its original file name in the output folder.
      out_img_path = os.path.join(PATH_TO_OUT_IMAGES, os.path.basename(image_path))
      plt.imsave(out_img_path, image_np)
      print("Image {} processed:".format(count))
      count = count + 1
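
The loop above prints the wall-clock time of each `sess.run` call. The first call is typically much slower than the rest because it includes one-time setup, so a summary usually leaves it out. The sketch below assumes the durations were collected into a list (the loop as written only prints them); `summarize_durations` and its `warmup` parameter are hypothetical names.

```python
import statistics


def summarize_durations(durations, warmup=1):
    """Summarize per-image durations in seconds, skipping the first `warmup` entries."""
    steady = list(durations)[warmup:]
    if not steady:
        raise ValueError('need more measurements than warmup entries')
    mean_s = statistics.mean(steady)
    return {
        'count': len(steady),
        'mean_s': mean_s,
        'median_s': statistics.median(steady),
        # Throughput implied by the mean steady-state latency.
        'images_per_s': 1.0 / mean_s,
    }
```

Appending `duration` to a list inside the loop and passing that list to `summarize_durations` after the run gives a steady-state estimate that the warm-up call does not skew.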

2.09177708626
Image 1 processed:
0.146237850189
Image 2 processed:
0.146841049194
Image 3 processed:
0.142335891724
Image 4 processed:
0.145473003387
Image 5 processed:
0.155081987381
Image 6 processed:
0.15018415451
Image 7 processed:
0.143882036209
Image 8 processed:
0.142235994339
Image 9 processed:
0.144668102264
Image 10 processed:
0.144777059555
Image 11 processed:
0.142326116562
Image 12 processed:
0.13703083992
Image 13 processed:
0.140684127808
Image 14 processed:
0.143256902695
Image 15 processed:
0.140347957611
Image 16 processed:
0.142180919647
Image 17 processed:
0.137939214706
Image 18 processed:
0.143456935883
Image 19 processed:
0.143826007843
Image 20 processed:
0.140073060989
Image 21 processed:
0.139058113098
Image 22 processed:
0.137341976166
Image 23 processed:
0.137513160706
Image 24 processed:
0.146947860718
Image 25 processed:
0.135497808456
Image 26 processed:
0.14390707016
Image 27 processed:
0.144288063049
Image 28 processed:
0.137944936752
Image 29 processed:

0.147789955139
Image 233 processed:
0.153594970703
Image 234 processed:
0.138370990753
Image 235 processed:
0.142996072769
Image 236 processed:
0.142155885696
Image 237 processed:
0.143617153168
Image 238 processed:
0.137692928314
Image 239 processed:
0.142858982086
Image 240 processed:
0.147981882095
Image 241 processed:
0.142019987106
Image 242 processed:
0.138028144836
Image 243 processed:
0.136157035828
Image 244 processed:
0.142740011215
Image 245 processed:
0.139249801636
Image 246 processed:
0.144922018051
Image 247 processed:
0.13877415657
Image 248 processed:
0.141891002655
Image 249 processed:
0.149337053299
Image 250 processed:
0.142205953598
Image 251 processed:
0.146657943726
Image 252 processed:
0.139803171158
Image 253 processed:
0.139878988266
Image 254 processed:
0.142516851425
Image 255 processed:
0.145107030869
Image 256 processed:
0.141741991043
Image 257 processed:
0.141345024109
Image 258 processed:
0.149060964584
Image 259 processed:
0.14025592804
Image 260 processed:

0.150480031967
Image 462 processed:
0.1434237957
Image 463 processed:
0.14400601387
Image 464 processed:
0.141554832458
Image 465 processed:
0.139888048172
Image 466 processed:
0.141637086868
Image 467 processed:
0.144236087799
Image 468 processed:
0.142570018768
Image 469 processed:
0.141571044922
Image 470 processed:
0.146381139755
Image 471 processed:
0.140276908875
Image 472 processed:
0.141029119492
Image 473 processed:
0.143597841263
Image 474 processed:
0.138842105865
Image 475 processed:
0.14492893219
Image 476 processed:
0.139313936234
Image 477 processed:
0.143620014191
Image 478 processed:
0.141065120697
Image 479 processed:
0.141385793686
Image 480 processed:
0.142385959625
Image 481 processed:
0.138211965561
Image 482 processed:
0.136963129044
Image 483 processed:
0.141686916351
Image 484 processed:
0.140052080154
Image 485 processed:
0.139921188354
Image 486 processed:
0.138453960419
Image 487 processed:
0.141577005386
Image 488 processed:
0.143760919571
Image 489 processed:

0.139617919922
Image 691 processed:
0.14679813385
Image 692 processed:
0.142104148865
Image 693 processed:
0.139029026031
Image 694 processed:
0.140438079834
Image 695 processed:
0.143049001694
Image 696 processed:
0.144618988037
Image 697 processed:
0.137717008591
Image 698 processed:
0.142174005508
Image 699 processed:
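
The detection loop only draws the results onto each image. To use the detections in downstream code, the normalized boxes can be filtered by score and scaled to pixel coordinates. The following is a minimal sketch: it assumes each box is `[ymin, xmin, ymax, xmax]` normalized to the range 0 to 1, as implied by `use_normalized_coordinates=True`; `boxes_to_pixels` is a hypothetical helper, and the `min_score` default of 0.5 is an arbitrary choice made here.

```python
def boxes_to_pixels(boxes, scores, classes, image_width, image_height, min_score=0.5):
    """Keep detections scoring at least `min_score` and scale their boxes to pixels.

    `boxes` holds [ymin, xmin, ymax, xmax] rows normalized to [0, 1];
    `scores` and `classes` are parallel sequences with one entry per box.
    """
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        results.append({
            'class_id': int(class_id),
            'score': float(score),
            # x values scale with the width, y values with the height.
            'left': int(round(xmin * image_width)),
            'top': int(round(ymin * image_height)),
            'right': int(round(xmax * image_width)),
            'bottom': int(round(ymax * image_height)),
        })
    return results
```

Inside the loop this could be called as `boxes_to_pixels(np.squeeze(boxes), np.squeeze(scores), np.squeeze(classes), image.size[0], image.size[1])`, and each `class_id` can be looked up in `category_index` to recover its label.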
