## Instance Detection and Segmentation with Mask R-CNN

[Mask R-CNN](https://arxiv.org/abs/1703.06870) extends the [Faster R-CNN](https://arxiv.org/abs/1506.01497) **object detection** model with a mask prediction branch, adding support for **instance segmentation**.

The following shows how to use a [Keras-based implementation](https://github.com/matterport/Mask_RCNN) provided by Matterport, along with model parameters pretrained on the [COCO object detection dataset](http://cocodataset.org/).

**WARNING**: Run the companion `data_download.ipynb` notebook before executing the cells below.

**WARNING**: This notebook (and only this one) requires a TensorFlow version below 2.0.0. Create a new virtualenv and install the packages listed in `requirements_tensorflow1.txt`.

In [None]:
import tensorflow
import keras

tensorflow.__version__, keras.__version__
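
The version requirement stated in the warning can also be checked programmatically instead of by eye. The following is a minimal sketch: the helper name `is_below_tensorflow_2` is hypothetical (it is not part of TensorFlow or of this notebook), and it only inspects the major component of the version string.

```python
def is_below_tensorflow_2(version):
    """Return True if a version string such as '1.15.2' has a major version below 2."""
    major = int(version.split(".")[0])
    return major < 2


# In the notebook, the argument would be tensorflow.__version__.
for candidate in ["1.15.2", "2.4.0"]:
    print(candidate, is_below_tensorflow_2(candidate))
```

Calling the helper at the top of the notebook and raising an error on `False` would stop execution early, before the model code fails with a less obvious message.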

In [None]:
from maskrcnn import config
from maskrcnn import model as modellib


class InferenceCocoConfig(config.Config):
    # Give the configuration a recognizable name
    NAME = "inference_coco"

    # Number of classes (including background)
    NUM_CLASSES = 1 + 80  # COCO has 80 classes

    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1


# Use a distinct name so the imported `config` module is not shadowed
inference_config = InferenceCocoConfig()
model = modellib.MaskRCNN(mode="inference", model_dir="maskrcnn/logs",
                          config=inference_config)

# Load weights trained on MS-COCO
coco_model_file = "mask_rcnn_coco.h5"
model.load_weights(coco_model_file, by_name=True)

### Class Names

The index of a class in the list is its ID. For example, to get the ID of the teddy bear class, use `class_names.index('teddy bear')`.

`BG` stands for background.

In [None]:
# COCO Class names
class_names = ['BG', 'person', 'bicycle', 'car', 'motorcycle', 'airplane',
               'bus', 'train', 'truck', 'boat', 'traffic light',
               'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird',
               'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear',
               'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
               'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
               'kite', 'baseball bat', 'baseball glove', 'skateboard',
               'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
               'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
               'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
               'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
               'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
               'keyboard', 'cell phone', 'microwave', 'oven', 'toaster',
               'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors',
               'teddy bear', 'hair drier', 'toothbrush']
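
Because a class ID is simply a position in the list, a reverse lookup from name to ID can be built once with `enumerate`. The sketch below uses an abbreviated list named `example_names` (a hypothetical stand-in for the first four entries of the full list) so that it runs on its own.

```python
# Abbreviated stand-in for the full COCO list; the first entries are the same.
example_names = ['BG', 'person', 'bicycle', 'car']

# Map each class name to its integer ID (its position in the list).
class_id_by_name = {name: class_id for class_id, name in enumerate(example_names)}

# Both lookups return the same ID; the dict avoids a linear scan on each call.
print(example_names.index('car'))
print(class_id_by_name['car'])
```

For a handful of lookups `list.index` is sufficient; the dictionary is only worth building when names are resolved repeatedly.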

### Run Object Detection

In [None]:
from skimage.io import imread

image = imread('webcam_shot.jpeg')

In [None]:
image.shape

In [None]:
from maskrcnn import visualize
import time

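# Run detection and time it; perf_counter is a monotonic clock
# better suited to measuring elapsed time than time.time
tic = time.perf_counter()
results = model.detect([image], verbose=1)
toc = time.perf_counter()
print("Analyzed image in {:.3f}s".format(toc - tic))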

# Visualize results
r = results[0]
for class_id, score in zip(r['class_ids'], r['scores']):
    print("{}:\t{:0.3f}".format(class_names[class_id], score))
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'], 
                            class_names, r['scores']);
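
Each entry of `results` is a dictionary of NumPy arrays. As documented in the Matterport implementation, `rois` holds one `(y1, x1, y2, x2)` box per detection, `class_ids` and `scores` hold one value per detection, and `masks` has shape `[H, W, N]` with one boolean mask per detection. The sketch below shows how such a dictionary could be filtered by score and how mask areas could be computed. It uses a small synthetic dictionary, `fake_result`, in place of a real detection, and `filter_detections` is a hypothetical helper, not part of the library.

```python
import numpy as np

# Synthetic stand-in for results[0]: two detections on a 4x6 image.
masks = np.zeros((4, 6, 2), dtype=bool)
masks[0:2, 0:3, 0] = True  # first instance covers 6 pixels
masks[2:4, 4:6, 1] = True  # second instance covers 4 pixels
fake_result = {
    "rois": np.array([[0, 0, 2, 3], [2, 4, 4, 6]]),  # (y1, x1, y2, x2)
    "class_ids": np.array([1, 3]),
    "scores": np.array([0.98, 0.40]),
    "masks": masks,
}


def filter_detections(result, min_score):
    """Keep only the detections whose score is at least min_score."""
    keep = result["scores"] >= min_score
    return {
        "rois": result["rois"][keep],
        "class_ids": result["class_ids"][keep],
        "scores": result["scores"][keep],
        "masks": result["masks"][:, :, keep],  # detections are on the last axis
    }


confident = filter_detections(fake_result, min_score=0.5)
# Number of pixels covered by each remaining instance mask.
areas = confident["masks"].sum(axis=(0, 1))
print(confident["class_ids"], areas)
```

Note that the detection axis is the first one for `rois`, `class_ids` and `scores` but the last one for `masks`, which is why the mask array is indexed differently.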

In [None]:
import os
import random

# Load a random image from the images folder
image_folder = 'maskrcnn/images'
file_names = next(os.walk(image_folder))[2]
image = imread(os.path.join(image_folder, random.choice(file_names)))

# Run detection
results = model.detect([image], verbose=1)

# Visualize results
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'], 
                            class_names, r['scores'])