## Object Detection using TensorFlow and OpenCV
The following code is modified from the TensorFlow Object Detection API example (https://github.com/tensorflow/models/tree/master/research/object_detection).  The main change is that a stream of images is processed in a single session instead of creating a new session for every image.

I am using an older DLink 930L camera, which does not work well when streaming video to a program.  Because of that, I request single images over and over to simulate a stream.  There is some lag, and the object detection slows it down further as expected, but overall it works decently as a simulated video stream.  The config.properties file holds the user name and password for my camera.
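
As a minimal sketch of how the `config.properties` file could be laid out and turned into the still-image URL: the `[Account]` section with `name` and `password` keys matches what the first cell reads, while the credential values and the `build_snapshot_url` helper are hypothetical and only there for illustration.

```python
import configparser

# Hypothetical file contents; the real config.properties holds your own credentials.
EXAMPLE_CONFIG = """
[Account]
name = admin
password = secret
"""

def build_snapshot_url(config_text, host="192.168.1.103:88", path="/Image.jpg"):
    """Build the still-image URL from an [Account] section (illustrative helper)."""
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    user = parser.get("Account", "name")
    password = parser.get("Account", "password")
    return "http://{}:{}@{}{}".format(user, password, host, path)

snapshot_url = build_snapshot_url(EXAMPLE_CONFIG)
print(snapshot_url)
```

Keeping the credentials in a separate file means the notebook itself can be shared without exposing them.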

This notebook is designed to be placed in the object_detection directory noted above.  If you wish to run this separately, you will need to copy the core, data, protos, and utils directories.  Regardless of how you decide to use it, the files in the subdirectories require that this notebook is located in a directory named object_detection.

Finally, you will need to build the files in the protos directory by following the instructions at https://pythonprogramming.net/introduction-use-tensorflow-object-detection-api-tutorial/

In [1]:
# See also:
# https://pythonprogramming.net/video-tensorflow-object-detection-api-tutorial/?completed=/introduction-use-tensorflow-object-detection-api-tutorial/
# https://github.com/tensorflow/models/issues/4355
import cv2
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
import configparser

config = configparser.RawConfigParser()
config.read('config.properties')
url = 'http://' + config.get('Account', 'name') + ':' + config.get('Account', 'password') + '@192.168.1.103:88/Image.jpg'

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from utils import ops as utils_ops
from utils import label_map_util
from utils import visualization_utils as vis_util

if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
  raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')


# Model preparation 

## Variables

Any model exported using the `export_inference_graph.py` tool can be loaded here simply by changing `PATH_TO_FROZEN_GRAPH` to point to a new .pb file.  

By default we use an "SSD with Mobilenet" model here. See the [detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md) for a list of other models that can be run out-of-the-box with varying speeds and accuracies.

In [2]:
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_2017_11_17'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'

# Label map file: the strings used to add the correct label to each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')

## Download Model

In [3]:
exists = os.path.isfile(PATH_TO_FROZEN_GRAPH)
print("The PATH_TO_FROZEN_GRAPH (" + PATH_TO_FROZEN_GRAPH + ") exists: " + str(exists))
if not exists:
    print("The PATH_TO_FROZEN_GRAPH did not exist.  Downloading...")
    opener = urllib.request.URLopener()
    opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
    # Extract only the frozen graph; the context manager closes the archive.
    with tarfile.open(MODEL_FILE) as tar_file:
        for file in tar_file.getmembers():
            file_name = os.path.basename(file.name)
            if 'frozen_inference_graph.pb' in file_name:
                tar_file.extract(file, os.getcwd())

The PATH_TO_FROZEN_GRAPH (ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb) exists: True
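
The download cell pulls a single file out of the model archive. The sketch below shows the same idea as a standalone function, run against a tiny stand-in archive so it works without downloading anything. The `extract_frozen_graph` helper, the `demo_model` names, and the extra check that skips members resolving outside the destination directory are my own additions, not part of the original notebook.

```python
import io
import os
import tarfile
import tempfile

def extract_frozen_graph(tar_path, dest_dir, wanted="frozen_inference_graph.pb"):
    """Extract only the members whose base name equals `wanted`; return their paths."""
    extracted = []
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            if not member.isfile() or os.path.basename(member.name) != wanted:
                continue
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # Skip members that would land outside dest_dir (for example '../' paths).
            if os.path.commonpath([dest_root, target]) != dest_root:
                continue
            archive.extract(member, dest_root)
            extracted.append(target)
    return extracted

# Build a small stand-in archive (hypothetical contents) and extract from it.
work_dir = tempfile.mkdtemp()
demo_tar = os.path.join(work_dir, "demo_model.tar.gz")
with tarfile.open(demo_tar, "w:gz") as archive:
    for name, payload in [("demo_model/frozen_inference_graph.pb", b"graph-bytes"),
                          ("demo_model/checkpoint", b"other")]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

paths = extract_frozen_graph(demo_tar, work_dir)
print(paths)
```

With the real model, the call would presumably look like `extract_frozen_graph(MODEL_FILE, os.getcwd())`.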


## Load a (frozen) TensorFlow model into memory.

In [4]:
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')

## Loading label map
Label maps map indices to category names, so that when our convolutional network predicts `5`, we know that this corresponds to `airplane`.  Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine.

In [5]:
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
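
To illustrate the point that any integer-to-label dictionary would do, here is a hand-written stand-in covering the first five COCO classes. The nested `{'id': ..., 'name': ...}` layout mirrors, as far as I know, what `label_map_util` produces; the `example_category_index` and `label_for` names are hypothetical.

```python
# Hand-written stand-in for the dictionary built from the label map file.
example_category_index = {
    1: {"id": 1, "name": "person"},
    2: {"id": 2, "name": "bicycle"},
    3: {"id": 3, "name": "car"},
    4: {"id": 4, "name": "motorcycle"},
    5: {"id": 5, "name": "airplane"},
}

def label_for(class_id, index, default="unknown"):
    """Return the display name for a predicted class id, or `default` if missing."""
    entry = index.get(int(class_id))
    return entry["name"] if entry else default

print(label_for(5, example_category_index))
print(label_for(99, example_category_index))
```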

# Detection
The previous run_inference_for_single_image routine and the code that called it were replaced with the code below.  This allows one session to be reused instead of creating a new one for every image.  Again, my camera didn't allow streaming, so there is a lot more code here that keeps loading single images for processing to simulate a video stream.
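
The "reload a single image, and retry a few times if it fails" idea can be sketched on its own, separate from the detection code. This is a simplified variant rather than the exact logic of the cell below: `read_frame_with_retries` is a hypothetical helper, and `FlakyCapture` is a stand-in for `cv2.VideoCapture` so the example runs without a camera.

```python
def read_frame_with_retries(open_capture, max_tries=10):
    """Open a fresh capture per attempt; return the first frame read, or None."""
    for _ in range(max_tries):
        cap = open_capture()
        try:
            ok, frame = cap.read()
        finally:
            # Always release the capture, whether or not the read succeeded.
            cap.release()
        if ok:
            return frame
    return None

class FlakyCapture:
    """Stand-in for cv2.VideoCapture that fails a set number of reads (demo only)."""
    failures_left = 2

    def read(self):
        if FlakyCapture.failures_left > 0:
            FlakyCapture.failures_left -= 1
            return False, None
        return True, "frame"

    def release(self):
        pass

frame = read_frame_with_retries(FlakyCapture)
print(frame)
```

With a real camera, the factory argument would be something like `lambda: cv2.VideoCapture(url)`.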

In [6]:
with tf.Session(graph=detection_graph) as sess:
    cap = cv2.VideoCapture(url)
    ret, image_np = cap.read()
    
    # Check if camera opened successfully
    if not cap.isOpened():
        print("Error opening video stream or file")
        
    # Read until video is completed
    while(cap.isOpened()):
        if ret == True:
            # Process the image returned from the camera - image_np
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            
            # Actual detection.
            # Previously: output_dict = run_inference_for_single_image(image_np, detection_graph)
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in [
              'num_detections', 'detection_boxes', 'detection_scores',
              'detection_classes', 'detection_masks'
            ]:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                  tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for single image
                detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
                # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
                real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image_np.shape[0], image_np.shape[1])
                detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension
                tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

            # Run inference
            output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image_np_expanded})

            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]           
            
            
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                output_dict['detection_boxes'],
                output_dict['detection_classes'],
                output_dict['detection_scores'],
                category_index,
                instance_masks=output_dict.get('detection_masks'),
                use_normalized_coordinates=True,
                line_thickness=8)

            cv2.imshow('object detection', cv2.resize(image_np, (800,600)))
            # Press Q on keyboard to  exit
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break
            # Release the capture and request the next still image
            cap.release()
            cap = cv2.VideoCapture(url)
            ret, image_np = cap.read()

        # Break the loop
        else: 
            print('ret != True')
            break

        count = 0
        # Occasionally, you get a bad feed that needs to be reloaded
        # Try a few more times before quitting
        while(not cap.isOpened()):
            count += 1
            #print('cap is not opened...', count)
            if(count > 10):
                print("Couldn't get an image after 10 tries...")
                break
            # Release the capture and request the next still image
            cap.release()
            cap = cv2.VideoCapture(url)
            ret, image_np = cap.read()

    # When everything is done, release the video capture object
    cap.release()

    # Closes all the frames
    cv2.destroyAllWindows()
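
The detection cell passes boxes in normalized coordinates (`use_normalized_coordinates=True`), ordered `[ymin, xmin, ymax, xmax]`. If you want pixel coordinates for your own processing, a conversion along these lines should work. The `detections_to_pixels` helper, the 0.5 score cutoff, and the sample values are assumptions made for this sketch.

```python
def detections_to_pixels(boxes, scores, classes, image_height, image_width, min_score=0.5):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel (left, top, right, bottom)."""
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        # Drop low-confidence detections.
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        results.append({
            "class_id": int(class_id),
            "score": float(score),
            "box": (int(round(xmin * image_width)), int(round(ymin * image_height)),
                    int(round(xmax * image_width)), int(round(ymax * image_height))),
        })
    return results

# Made-up detections for a frame that is 600 pixels high and 800 pixels wide.
demo = detections_to_pixels(
    boxes=[[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.1, 0.1]],
    scores=[0.9, 0.2],
    classes=[1, 3],
    image_height=600,
    image_width=800)
print(demo)
```

In the notebook, the inputs would come from `output_dict['detection_boxes']`, `output_dict['detection_scores']`, and `output_dict['detection_classes']`, with the height and width taken from `image_np.shape`.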
    

## Examples
These are some images that I saved while this program was running.

<img src="Example1.jpg" alt="People" style="width: 500px;"/>
<img src="Example2.jpg" alt="Car" style="width: 500px;"/>