##Mask R-CNN Image Segmentation

Use Mask R-CNN from TensorFlow Hub for object detection and instance segmentation. In addition to bounding boxes, the model predicts a segmentation mask for each instance of a class in the image.

NOTE: Use a TPU runtime because of the processing requirements of the model.

##Installation

Use the TensorFlow 2 Object Detection API
* Clone the TensorFlow Model Garden 

In [1]:
#Clone the TensorFlow models repository
!git clone --depth 1 https://github.com/tensorflow/models

fatal: destination path 'models' already exists and is not an empty directory.


In [2]:
#Compile the Object Detection API protocol buffers 
!cd models/research/ && protoc object_detection/protos/*.proto --python_out=.


In [3]:
%%writefile models/research/setup.py

import os
from setuptools import find_packages
from setuptools import setup

REQUIRED_PACKAGES = [
    'tf-models-official==2.7.0',
    'tensorflow_io'
]

setup(
    name='object_detection',
    version='0.1',
    install_requires=REQUIRED_PACKAGES,
    include_package_data=True,
    packages=(
        [p for p in find_packages() if p.startswith('object_detection')] +
        find_packages(where=os.path.join('.', 'slim'))),
    package_dir={
        'datasets': os.path.join('slim', 'datasets'),
        'nets': os.path.join('slim', 'nets'),
        'preprocessing': os.path.join('slim', 'preprocessing'),
        'deployment': os.path.join('slim', 'deployment'),
        'scripts': os.path.join('slim', 'scripts'),
    },
    description='Tensorflow Object Detection Library',
    python_requires='>3.6',
)

Overwriting models/research/setup.py


In [4]:
# Run the setup script you just wrote
!python -m pip install models/research

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing ./models/research
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
Building wheels for collected packages: object-detection
  Building wheel for object-detection (setup.py) ... done
  Created wheel for object-detection: filename=object_detection-0.1-py3-none-any.whl size=1694726 sha256=6e86bda511e60e776e6251e0a3a24b74d353515890cb4b085f895ff97accf61c
  Stored in directory: /tmp/pip-ephem-wheel-cache-25lqmelr/wheels/fa/a4/d2/e9a5057e414fd46c8e543d2706cd836d64e1fcd9eccceb2329
Successfully built object-detection
Installi

##Imports

In [5]:
import tensorflow as tf
import tensorflow_hub as hub

import matplotlib
import matplotlib.pyplot as plt

import numpy as np
from io import BytesIO
from PIL import Image
from urllib.request import urlopen

from object_detection.utils import label_map_util, visualization_utils, ops

tf.get_logger().setLevel('ERROR')

%matplotlib inline 

##Utilities

For convenience, define a function that converts an image to a numpy array. Pass in either a path to a local image or a URL.

The `TEST_IMAGES` dictionary below shows both cases
* Some paths point to test images that come with the API package while others are URLs that point to images online 

In [6]:
def load_image_into_numpy_array(path):
  """Load an image from a file or URL into a numpy array.

  Puts the image into a numpy array to feed into the TensorFlow graph.
  By convention the array has shape (1, height, width, channels),
  where the leading 1 is the batch dimension and channels=3 for RGB.

  Args:
    path: a local file path or an http(s) URL pointing to the image

  Returns:
    uint8 numpy array with shape (1, img_height, img_width, 3)
  """
  if path.startswith('http'):
    # Remote image: download the raw bytes
    with urlopen(path) as response:
      image_data = response.read()
  else:
    # Local image: read the raw bytes from disk
    with tf.io.gfile.GFile(path, 'rb') as f:
      image_data = f.read()

  # Force three channels so grayscale or RGBA images also reshape cleanly
  image = Image.open(BytesIO(image_data)).convert('RGB')

  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (1, im_height, im_width, 3)).astype(np.uint8)


# dictionary with image tags as keys, and image paths as values
TEST_IMAGES = {
  'Beach' : 'models/research/object_detection/test_images/image2.jpg',
  'Dogs' : 'models/research/object_detection/test_images/image1.jpg',
  # By Américo Toledano, Source: https://commons.wikimedia.org/wiki/File:Biblioteca_Maim%C3%B3nides,_Campus_Universitario_de_Rabanales_007.jpg
  'Phones' : 'https://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/Biblioteca_Maim%C3%B3nides%2C_Campus_Universitario_de_Rabanales_007.jpg/1024px-Biblioteca_Maim%C3%B3nides%2C_Campus_Universitario_de_Rabanales_007.jpg',
  # By 663highland, Source: https://commons.wikimedia.org/wiki/File:Kitano_Street_Kobe01s5s4110.jpg
  'Street' : 'https://upload.wikimedia.org/wikipedia/commons/thumb/0/08/Kitano_Street_Kobe01s5s4110.jpg/2560px-Kitano_Street_Kobe01s5s4110.jpg'
}

##Load the Model 

In [7]:
model_display_name = 'Mask R-CNN Inception ResNet V2 1024x1024'
model_handle = 'https://tfhub.dev/tensorflow/mask_rcnn/inception_resnet_v2_1024x1024/1'

print('Selected model: ' + model_display_name)
print('Model Handle at TF Hub: {}'.format(model_handle))

Selected model: Mask R-CNN Inception ResNet V2 1024x1024
Model Handle at TF Hub: https://tfhub.dev/tensorflow/mask_rcnn/inception_resnet_v2_1024x1024/1


In [8]:
hub_model = hub.load(model_handle)

##Inference

Use the model to perform instance segmentation on an image. Choose one of the test images specified earlier and load it into a numpy array.

In [9]:
# Choose a key from the list below for TEST_IMAGES
# ['Beach', 'Street', 'Dogs', 'Phones']

image_path = TEST_IMAGES['Street']

image_np = load_image_into_numpy_array(image_path)

plt.figure(figsize=(24,32))
plt.imshow(image_np[0])
plt.show()


In [10]:
#Run inference 
results = hub_model(image_np)

# Output values are tensors; only their numpy() values are needed
# when visualizing the results

result = {key:value.numpy() for key,value in results.items()}

#print the keys
for key in result.keys():
  print(key)

mask_predictions
image_shape
proposal_boxes
rpn_objectness_predictions_with_background
raw_detection_boxes
num_detections
box_classifier_features
detection_scores
detection_masks
detection_anchor_indices
final_anchors
detection_classes
anchors
raw_detection_scores
rpn_features_to_crop
refined_box_encodings
detection_boxes
num_proposals
rpn_box_predictor_features
rpn_box_encodings
class_predictions_with_background
detection_multiclass_scores
proposal_boxes_normalized


##Visualizing the results 

Plot the results on the original image
* Create the `category_index` dictionary that will contain the class IDs and names
* The model was trained on the COCO2017 dataset and the API package has the labels saved in a different format
* Use the `create_category_index_from_labelmap()` internal utility to convert this to the required dictionary format 

In [11]:
PATH_TO_LABELS = './models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

# sample output
print(category_index[1])
print(category_index[2])
print(category_index[4])

{'id': 1, 'name': 'person'}
{'id': 2, 'name': 'bicycle'}
{'id': 4, 'name': 'motorcycle'}
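
The sample output above shows that `category_index` maps an integer class ID to a small dictionary with `id` and `name` entries. As a minimal sketch of how that lookup is used, the snippet below converts a list of predicted class IDs into readable names. The three-entry dictionary and the `class_ids` list are hand-written stand-ins for illustration, and `ids_to_names` is a hypothetical helper, not part of the Object Detection API.

```python
# Hand-written stand-in mirroring the sample output above (not the full COCO label map)
category_index = {
    1: {'id': 1, 'name': 'person'},
    2: {'id': 2, 'name': 'bicycle'},
    4: {'id': 4, 'name': 'motorcycle'},
}


def ids_to_names(class_ids, index):
  """Hypothetical helper: map class IDs to names, 'unknown' if the ID is absent."""
  return [index.get(int(i), {}).get('name', 'unknown') for i in class_ids]


# Made-up class IDs standing in for result['detection_classes'][0],
# which holds floats and therefore needs the int() cast above
class_ids = [1.0, 4.0, 3.0]
names = ids_to_names(class_ids, category_index)
print(names)
```

Using `.get()` with a default keeps the lookup from raising a `KeyError` when an ID is missing from the dictionary, as ID 3 is in this partial stand-in.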


Next, preprocess the masks and then plot the results
* The result dictionary has a `detection_masks` key containing a segmentation mask for each box; these are first converted to masks that overlay the full image
* Keep only the mask pixel values that are above a certain threshold 
 * Make sure the threshold isn't too low, otherwise the mask will most likely include pixels that are outside the object 
* Use `visualize_boxes_and_labels_on_image_array()` to plot the results on the image 
 * The `instance_masks` parameter takes the reframed detection masks so that the segmentation masks are drawn on the image 

In [14]:
#Handle models with masks 
label_id_offset = 0
image_np_with_mask = image_np.copy()

if 'detection_masks' in result:

  # Convert the numpy arrays in result to tensors
  detection_masks = tf.convert_to_tensor(result['detection_masks'][0])
  detection_boxes = tf.convert_to_tensor(result['detection_boxes'][0])

  # Reframe the box-relative masks to the full image size
  detection_masks_reframed = ops.reframe_box_masks_to_image_masks(
      detection_masks, detection_boxes, image_np.shape[1], image_np.shape[2])

  #Filter mask pixel values that are above a specified threshold 
  detection_masks_reframed = tf.cast(detection_masks_reframed > 0.6, tf.uint8)

  #Get the numpy array
  result['detection_masks_reframed'] = detection_masks_reframed.numpy()


# overlay labeled boxes and segmentation masks on the image
visualization_utils.visualize_boxes_and_labels_on_image_array(
      image_np_with_mask[0],
      result['detection_boxes'][0],
      (result['detection_classes'][0] + label_id_offset).astype(int),
      result['detection_scores'][0],
      category_index,
      use_normalized_coordinates=True,
      max_boxes_to_draw=100,
      min_score_thresh=.70,
      agnostic_mode=False,
      instance_masks=result.get('detection_masks_reframed', None),
      line_thickness=8)

plt.figure(figsize=(24,32))
plt.imshow(image_np_with_mask[0])
plt.show()

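
The `min_score_thresh=.70` argument above hides low-confidence detections from the plot. As a rough sketch of similar filtering done by hand, the snippet below counts detections per class above that threshold. The `toy_result` and `toy_index` values are made up to mimic the batch-first layout of `result` and `category_index`, and `count_confident_detections` is a hypothetical helper.

```python
from collections import Counter

import numpy as np

# Made-up stand-ins with a leading batch dimension, like the real `result`
toy_result = {
    'detection_scores': np.array([[0.98, 0.91, 0.74, 0.40, 0.05]]),
    'detection_classes': np.array([[1., 1., 2., 4., 1.]]),
}
toy_index = {
    1: {'name': 'person'},
    2: {'name': 'bicycle'},
    4: {'name': 'motorcycle'},
}


def count_confident_detections(result, index, min_score=0.70):
  """Count detections per class name whose score exceeds min_score."""
  scores = result['detection_scores'][0]                   # drop the batch dimension
  classes = result['detection_classes'][0].astype(int)     # class IDs arrive as floats
  keep = scores > min_score                                # boolean mask over detections
  return Counter(index[c]['name'] for c in classes[keep].tolist())


counts = count_confident_detections(toy_result, toy_index)
print(dict(counts))
```

Passing the notebook's own `result` and `category_index` in place of the toy values should work the same way, assuming every predicted class ID is present in `category_index`.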