<a href="https://colab.research.google.com/github/rahiakela/building-computer-vision-applications-using-artificial-neural-networks/blob/master/6-deep-learning-in-object-detection/2_detecting_objects_using_trained_models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Detecting Objects Using Trained Models

As we learned earlier, model training is not a frequent activity: once we have a reasonably good model (high accuracy or mAP), we do not need to retrain it for as long as it keeps giving accurate predictions. Training is also compute-intensive and can take several hours or even days, even on GPUs. It is therefore often desirable and economical to train computer vision models on GPUs in the cloud. When the model is ready, download it to your local computer or application server, which will then use it to detect objects in images.

We will follow this high-level plan to develop our predictor:

1. Install the TensorFlow models project
2. Utilize the exported TensorFlow graph (exported model) to predict objects within new images that were not included in the training or test sets.

## Installing TensorFlow’s models Project

The installation process for the TensorFlow models project is the same as the one we followed on Google Colab. The only difference may be in the Protobuf installation, because Protobuf is platform-dependent software.

Here is the full set of steps to install and configure the models project:

1. First, let’s install a few libraries that are needed to build
and install the models project.

In [None]:
%%shell
pip install --user Cython
pip install --user contextlib2
pip install --user pillow
pip install --user lxml
pip install --user matplotlib

2. Install Google’s Protobuf compiler.

In [None]:
%%shell
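# %tensorflow_version is a notebook line magic and is not valid inside a %%shell cell, so it is omitted here;
# the detection code later in this notebook uses TensorFlow 2 APIs.
sudo apt-get install -y protobuf-compiler python-pil python-lxml python-tk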

3. Clone the TensorFlow models project from GitHub

In [3]:
!git clone https://github.com/ansarisam/models.git

Cloning into 'models'...
remote: Enumerating objects: 34141, done.
remote: Total 34141 (delta 0), reused 0 (delta 0), pack-reused 34141
Receiving objects: 100% (34141/34141), 518.53 MiB | 35.77 MiB/s, done.
Resolving deltas: 100% (22322/22322), done.
Checking out files: 100% (3011/3011), done.


4. Compile the models project using the Protobuf compiler.

In [4]:
%%shell
cd models/research
protoc object_detection/protos/*.proto --python_out=.



5. Set the following environment variables.

For background on setting environment variables in Colab, see https://stackoverflow.com/questions/53306150/setting-environment-variables-in-google-colab

In [5]:
import os
import sys
# %env does not expand $PYTHONPATH, and each call overwrites the previous one, so set the paths from Python instead.
research_paths = ["/content/models/research", "/content/models/research/slim", "/content/models/research/object_detection"]
os.environ["PYTHONPATH"] = os.pathsep.join(filter(None, [os.environ.get("PYTHONPATH"), *research_paths]))
# Also extend sys.path so that the already-running kernel can import the packages.
sys.path.extend(research_paths)


6. Build and install the research project that we just compiled with
Protobuf.

In [None]:
%%shell
cd /content/models/research

python setup.py build
python setup.py install

If the command runs successfully, it should print something like this at the end:

`Finished processing dependencies for object-detection==0.1`
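
As a quick sanity check after the install, a sketch like the following can confirm that the package is importable from the current interpreter. The helper name `is_importable` is an illustrative assumption and not part of the models project; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the top-level module can be found on sys.path."""
    return importlib.util.find_spec(module_name) is not None


# Expect True once the install (or the path setup) has succeeded.
print("object_detection importable:", is_importable("object_detection"))
```

If this prints `False`, the install step or the path configuration most likely did not take effect in the running interpreter.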

We are all set with the environment preparation and ready to write code to detect objects in images. We will use the exported model that we downloaded from Colab.

## Code for Object Detection

Now we are ready to write code that does object detection in images and draws bounding boxes around them. To keep the code simple and easy to understand, we have divided it into the following parts:

- **Configuration and initialization**: In this section of the code, we
initialize the model path, image input, and output directories.

In [None]:
%%shell

# Copy object_detection to the working directory so it can be imported even if the PYTHONPATH setting does not take effect
cp -r models/research/object_detection .

# Download the images to run predictions on
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
tar -xvf images.tar.gz

# Unzip the trained model
unzip trained_models.zip

mkdir output_dir

In [9]:
import os
import pathlib
import random
import numpy as np
import tensorflow as tf
import cv2
# Import the object detection module.
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util

In [11]:
# Alias tf.gfile to tf.io.gfile for TensorFlow 2 compatibility. The gfile module provides file I/O functionality in TensorFlow.
tf.gfile = tf.io.gfile

# initializes the directory path where our object detection trained model is located.
model_path = "trained_models/final_model"

# initializes the mapping file path. We use the same label map file (protobuf text format, .pbtxt) containing the class ID and class name mapping that we used for the training.
labels_path = "models/research/object_detection/data/pet_label_map.pbtxt"

# the input directory path containing images in which objects need to be detected.
image_dir = "images"
# defines the pattern of file names in the input image path. If you want to load all files from the directory, use *.*.
image_file_pattern = "*.jpg"
# the output directory path where the images with bounding boxes around the detected objects will be saved.
output_path = "output_dir"

# create iterable path objects that we will iterate through to read images one by one and detect objects in each of them.
PATH_TO_IMAGES_DIR = pathlib.Path(image_dir)
IMAGE_PATHS = sorted(list(PATH_TO_IMAGES_DIR.glob(image_file_pattern)))

# Dictionary mapping each class ID to its name, used to add the correct label to each box.
category_index = label_map_util.create_category_index_from_labelmap(labels_path, use_display_name=True)
# assign the number of classes to the class_num variable.
class_num = len(category_index)
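
For orientation, `create_category_index_from_labelmap` returns a dictionary keyed by class ID, where each value holds the `id` and `name` of a class. The sketch below mimics that structure with two made-up entries (the names are placeholders, not taken from the actual label map) to show how a predicted class ID is turned into a label. The helper `class_name_for` is hypothetical.

```python
# Hand-written stand-in for the dictionary built from the label map.
# The class names are placeholders chosen for illustration.
example_category_index = {
    1: {"id": 1, "name": "cat_breed_a"},
    2: {"id": 2, "name": "dog_breed_b"},
}


def class_name_for(category_index, class_id):
    """Look up the display name for a predicted class ID."""
    return category_index[class_id]["name"]


print(class_name_for(example_category_index, 2))  # dog_breed_b
print(len(example_category_index))                # 2
```

Note that label map IDs in the Object Detection API start at 1, because 0 is reserved for the background class.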

Let's initialize a color table that we will use when drawing bounding boxes.

In [12]:
# Creating a Color Table Based on the Number of Object Classes
def get_color_table(class_num, seed=0):
  random.seed(seed)
  color_table = {}
  # Label map IDs start at 1, so include class_num itself as a key.
  for i in range(class_num + 1):
    color_table[i] = [random.randint(0, 255) for _ in range(3)]
  
  return color_table

colortable = get_color_table(class_num)

- **Loading the Model**: Create a model object by loading the trained model. We will use this model object to predict the objects and bounding boxes.

In [13]:
# Model preparation and loading the model from the disk
def load_model(model_path):
  model_dir = pathlib.Path(model_path) / "saved_model"
  model = tf.saved_model.load(str(model_dir))
  model = model.signatures["serving_default"]

  return model

- **Predicting Objects and Bounding Boxes and Organizing the Output**

We run the prediction and construct the output in a usable form. We have written a function called run_inference_for_single_image() that takes two arguments: the model object and the image as a NumPy array. This function returns a Python dictionary. The output dictionary contains the following keys:

1. detection_boxes, which is a 2D array with one row per bounding box,
holding its normalized [ymin, xmin, ymax, xmax] coordinates.
2. detection_scores, which is a 1D array of scores associated with
each bounding box.
3. detection_classes, which is a 1D array of integer class IDs, one
for each bounding box.
4. num_detections, which is a scalar that indicates the number of
detected objects.
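
The boxes are reported as `[ymin, xmin, ymax, xmax]` in normalized coordinates between 0 and 1, so they have to be scaled by the image height and width before drawing. A minimal sketch of that conversion follows; the helper name `to_pixel_box` is hypothetical and the image size is arbitrary.

```python
def to_pixel_box(box, image_height, image_width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel corners.

    Returns ((left, top), (right, bottom)), the order cv2.rectangle expects.
    """
    ymin, xmin, ymax, xmax = box
    top_left = (int(xmin * image_width), int(ymin * image_height))
    bottom_right = (int(xmax * image_width), int(ymax * image_height))
    return top_left, bottom_right


# A box on a 400x600 (height x width) image.
print(to_pixel_box([0.25, 0.25, 0.75, 0.5], 400, 600))  # ((150, 100), (300, 300))
```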

The TensorFlow model object takes a batch of image tensors and predicts the object classes and the bounding boxes around them. So we convert the image NumPy array into a tensor. Since we are processing one image at a time and the model object takes a batch, we need to convert our image tensor into a batch of one image.

The tf.newaxis expression adds one dimension to an existing array each time it is used in an index. Thus, a 1D array becomes a 2D array, a 2D array becomes a 3D array, and so on.
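
`tf.newaxis` behaves like `np.newaxis` (both are aliases for `None`), so the effect on shapes can be illustrated with NumPy alone. The array below is a stand-in for an image, with arbitrary dimensions chosen for the example.

```python
import numpy as np

# Stand-in for a decoded image: height 4, width 6, 3 color channels.
image = np.zeros((4, 6, 3), dtype=np.uint8)

# Prepending a new axis turns the single image into a batch of one.
batch = image[np.newaxis, ...]

print(image.shape)  # (4, 6, 3)
print(batch.shape)  # (1, 4, 6, 3)
```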

In [23]:
# Predict objects and bounding boxes and format the result
def run_inference_for_single_image(model, image):
  # The input needs to be a tensor, so convert it using `tf.convert_to_tensor`.
  input_tensor = tf.convert_to_tensor(image)
  # The model expects a batch of images, so add an axis with `tf.newaxis`.
  input_tensor = input_tensor[tf.newaxis, ...]

  # Run the model to predict the object classes, bounding boxes, and associated scores.
  output_dict = model(input_tensor)

  # Input to model is a tensor, so the output is also a tensor
  # Convert to numpy arrays, and take index [0] to remove the batch dimension.
  # We're only interested in the first num_detections.
  num_detections = int(output_dict.pop("num_detections"))
  output_dict = {
      key: value[0, :num_detections].numpy() for key, value in output_dict.items()
  }
  output_dict["num_detections"] = num_detections

  # detection_classes should be ints.
  output_dict["detection_classes"] = output_dict["detection_classes"].astype(np.int64)

  # Handle models with masks: this is applicable only for a Mask R-CNN when masks need to be predicted. For all other predictors, these lines may be omitted.
  if "detection_masks" in output_dict:
    # Reframe the bbox mask to the image size.
    detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
        output_dict["detection_masks"],
        output_dict["detection_boxes"],
        image.shape[0], image.shape[1]
    )
    detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
    output_dict["detection_masks_reframed"] = detection_masks_reframed.numpy()

  # Return the output dictionary, which consists of the coordinates of the detected bounding boxes, object classes,
  # scores, and number of detections. In the case of a Mask R-CNN, it also includes object masks.
  return output_dict

- **Drawing Bounding Boxes Around Detected Objects in Input Images**

We will now write code that runs inference on an image, draws a bounding box around each
detected object, labels each object with its class
name and score, and finally saves the result to the output directory
location.
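
The drawing loop in infer_object() keeps only detections whose score exceeds 0.5 and stops at the first one that does not. That early exit is only safe if the scores arrive sorted from highest to lowest, which is assumed here rather than verified. A small sketch of the same logic on a plain list, using a hypothetical helper `count_confident`:

```python
def count_confident(scores, threshold=0.5):
    """Count leading scores above the threshold, stopping at the first miss.

    Assumes the scores are sorted in descending order.
    """
    count = 0
    for score in scores:
        if score <= threshold:
            break
        count += 1
    return count


# Two of the four detections are above the default threshold.
print(count_confident([0.98, 0.71, 0.42, 0.10]))  # 2
```

If the scores were not sorted, a plain filter over all detections would be the safer choice.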

In [24]:
def infer_object(model, image_path):
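  # Read the image with OpenCV into a NumPy array.
  # Boxes and labels are drawn directly onto this array, which becomes the final output image.
  imagename = os.path.basename(image_path)
  image_np = cv2.imread(os.path.abspath(image_path))
  if image_np is None:
    # cv2.imread returns None for files it cannot decode; skip them.
    print("Skipping unreadable image:", image_path)
    return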

  # Actual detection.
  output_dict = run_inference_for_single_image(model, image_np)

  # Visualization of the results of a detection.
  for i in range(output_dict["detection_classes"].size):
    box = output_dict["detection_boxes"][i]
    classes = output_dict["detection_classes"][i]
    scores = output_dict["detection_scores"][i]

    if scores > 0.5:
      h = image_np.shape[0]
      w = image_np.shape[1]
      classname = category_index[classes]["name"]
      classid = category_index[classes]["id"]

      # Draw bounding boxes
      cv2.rectangle(image_np, (int(box[1] * w), int(box[0] * h)), (int(box[3] * w), int(box[2] * h)), colortable[classid], 2)

      # Write the class name on top of the bounding box
      font = cv2.FONT_HERSHEY_COMPLEX_SMALL
      size = cv2.getTextSize(str(classname) + ":" + str(scores), font, 0.75, 1)[0][0]

      cv2.rectangle(image_np,(int(box[1] * w), int(box[0] * h-20)), ((int(box[1] * w)+size+5), int(box[0] * h)), colortable[classid],-1)
      cv2.putText(image_np, str(classname) + ":" + str(scores), (int(box[1] * w), int(box[0] * h)-5), font, 0.75, (0,0,0), 1, 1)
    else:
      break
  # Save the result image with bounding boxes and class labels to the file system
  cv2.imwrite(os.path.join(output_path, imagename), image_np)
  # cv2.imshow(imagename, image_np)

Now that we have all the right settings and functions defined, we need to call them to trigger the detection process.

In [16]:
# Obtain the model object
detection_model = load_model(model_path)

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


The function infer_object() is invoked for each image, and the final output with
bounding boxes around the detected objects is saved in the output directory.

In [25]:
# For each image, call the prediction
for image_path in IMAGE_PATHS:
  #print(image_path)
  infer_object(detection_model, image_path)

ValueError: ignored