## **Deep Learning Made Easy**

----

**Important:** This notebook was developed by <a href="https://github.com/tugstugi/dl-colab-notebooks/blob/master/notebooks/TorchvisionMaskRCNN.ipynb">Erdene-Ochir Tuguldur</a>, with a few modifications by <a href="https://www.linkedin.com/in/valdivino-alexandre-de-santiago-j%C3%BAnior-103109206/?locale=en_US">Valdivino Alexandre de Santiago Júnior</a>. It performs instance segmentation with Mask R-CNN.

In [1]:
import os
from os.path import exists, join, basename, splitext

import random
import PIL
import torchvision
import cv2
import numpy as np
import torch
torch.set_grad_enabled(False)
  
import time
import matplotlib
import matplotlib.pylab as plt
from prettytable import PrettyTable
plt.rcParams["axes.grid"] = False

In [2]:
# This function obtains the number of trainable parameters of the 
# model/network.
def count_parameters(model):
    table = PrettyTable(["Modules", "Parameters"])
    total_params = 0
    for name, parameter in model.named_parameters():
        if not parameter.requires_grad:
            continue
        param = parameter.numel()
        table.add_row([name, param])
        total_params += param
    print(table)
    print(f"Total trainable params: {total_params}")
    return total_params

### Pretrained Mask R-CNN
----

We use a Mask R-CNN with a ResNet-50 FPN backbone, pretrained on COCO.

In [3]:
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

# Just checking the number of trainable parameters
print('Checking trainable parameters: {}'.format(count_parameters(model)))

model = model.eval().cuda()

Downloading: "https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth" to /root/.cache/torch/hub/checkpoints/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth


+-------------------------------------------------+------------+
|                     Modules                     | Parameters |
+-------------------------------------------------+------------+
|       backbone.body.layer2.0.conv1.weight       |   32768    |
|       backbone.body.layer2.0.conv2.weight       |   147456   |
|       backbone.body.layer2.0.conv3.weight       |   65536    |
|    backbone.body.layer2.0.downsample.0.weight   |   131072   |
|       backbone.body.layer2.1.conv1.weight       |   65536    |
|       backbone.body.layer2.1.conv2.weight       |   147456   |
|       backbone.body.layer2.1.conv3.weight       |   65536    |
|       backbone.body.layer2.2.conv1.weight       |   65536    |
|       backbone.body.layer2.2.conv2.weight       |   147456   |
|       backbone.body.layer2.2.conv3.weight       |   65536    |
|       backbone.body.layer2.3.conv1.weight       |   65536    |
|       backbone.body.layer2.3.conv2.weight       |   147456   |
|       backbone.body.lay
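
The counts in the table can be checked by hand. For a convolution without bias, the weight tensor has `out_channels * in_channels * kernel_height * kernel_width` entries. The sketch below reproduces the first four rows, assuming the standard ResNet-50 bottleneck widths for the first block of `layer2` (256 input channels, 128 intermediate channels, 512 output channels); the helper `conv_weight_count` is hypothetical and not part of the notebook. The listing starts at `layer2` presumably because `count_parameters` skips parameters with `requires_grad=False` and torchvision freezes the earlier backbone layers of the pretrained model by default.

```python
def conv_weight_count(in_channels, out_channels, kernel_size):
    """Number of weights in a bias-free 2D convolution with a square kernel."""
    return out_channels * in_channels * kernel_size * kernel_size


# First bottleneck block of layer2 (assumed widths: 256 in, 128 mid, 512 out).
layer2_block0 = {
    "conv1": conv_weight_count(256, 128, 1),         # 1x1 reduction
    "conv2": conv_weight_count(128, 128, 3),         # 3x3 convolution
    "conv3": conv_weight_count(128, 512, 1),         # 1x1 expansion
    "downsample.0": conv_weight_count(256, 512, 1),  # 1x1 shortcut projection
}

for name, count in layer2_block0.items():
    print(f"backbone.body.layer2.0.{name}.weight: {count}")
```

The printed values (32768, 147456, 65536, 131072) match the corresponding rows of the table.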

### Sample images
----



We use the following images from the COCO dataset.

In [4]:
IMAGE_URL = [
    'http://images.cocodataset.org/val2017/000000397133.jpg',
    'http://images.cocodataset.org/val2017/000000037777.jpg',
    'http://images.cocodataset.org/val2017/000000252219.jpg',
    'http://images.cocodataset.org/test-stuff2017/000000000880.jpg',
    'http://images.cocodataset.org/test-stuff2017/000000028352.jpg',
    'http://images.cocodataset.org/test-stuff2017/000000000171.jpg',
    'http://images.cocodataset.org/test-stuff2017/000000000533.jpg'
]
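
The notebook later fetches each URL with the `wget` shell command, which relies on the notebook's shell escape. Outside a notebook, the same download could be done with the standard library. The following is a minimal sketch under that assumption; `local_name` and `fetch_image` are hypothetical helpers, not part of the original notebook.

```python
from os.path import basename, join
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(url):
    """Return the file name at the end of the URL path."""
    return basename(urlparse(url).path)


def fetch_image(url, target_dir="."):
    """Download `url` into `target_dir` and return the local file path."""
    path = join(target_dir, local_name(url))
    urlretrieve(url, path)
    return path


print(local_name("http://images.cocodataset.org/val2017/000000397133.jpg"))
# prints: 000000397133.jpg
```

Calling `fetch_image(url)` for each entry of `IMAGE_URL` would then yield the local paths to open with PIL.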

### Mask R-CNN in action
----

We have changed the original tutorial to accept several images. For each image, the function ```run_MaskRCNN``` does the following:


1.   Show the original image;
2.   Show the detected objects (boxes with class label and score);
3.   Show the objects' masks, merged into a single image.
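
The last step combines the per-object masks into one image by keeping, for every pixel, the largest mask value among the detections whose score exceeds 0.5. A minimal NumPy sketch of that idea follows; it uses small hand-made arrays instead of real model output, and `merge_masks` is a hypothetical helper that is not part of the notebook.

```python
import numpy as np


def merge_masks(masks, scores, threshold=0.5):
    """Pixel-wise maximum over the masks whose score exceeds `threshold`.

    masks:  array of shape (N, H, W) with values in [0, 1]
    scores: array of shape (N,)
    Returns an (H, W) array, or None if no detection passes the threshold.
    """
    masks = np.asarray(masks, dtype=float)
    scores = np.asarray(scores, dtype=float)
    keep = scores > threshold
    if not keep.any():
        return None
    return masks[keep].max(axis=0)


# Three toy 2x2 masks; the third one is below the score threshold.
masks = np.array([
    [[0.9, 0.0], [0.0, 0.0]],
    [[0.0, 0.8], [0.0, 0.0]],
    [[0.0, 0.0], [0.7, 0.7]],
])
scores = np.array([0.95, 0.60, 0.30])

print(merge_masks(masks, scores))
```

In this toy example only the first two masks contribute, so the bottom row of the merged image stays at zero. Returning `None` when nothing passes the threshold lets the caller skip the plot instead of failing.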



In [5]:
def run_MaskRCNN(all_images):
  for img in all_images:
    image_file = basename(img)
    !wget -q -O {image_file} {img}
    plt.figure()
    plt.imshow(matplotlib.image.imread(image_file))

    t = time.time()
    image = PIL.Image.open(image_file).convert('RGB')  # force 3 channels (some COCO images are grayscale)
    image_tensor = torchvision.transforms.functional.to_tensor(image).cuda()
    output = model([image_tensor])[0]
    print('Executed in %.3fs' % (time.time() - t))

    coco_names = ['unlabeled', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'street sign', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'hat', 'backpack', 'umbrella', 'shoe', 'eye glasses', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'plate', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'mirror', 'dining table', 'window', 'desk', 'toilet', 'door', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'blender', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
    colors = [[random.randint(0, 255) for _ in range(3)] for _ in coco_names]

    result_image = np.array(image.copy())
    for box, label, score in zip(output['boxes'], output['labels'], output['scores']):
      if score > 0.5:
        color = colors[int(label)]  # one fixed color per class within this image
    
        # draw box
        tl = round(0.002 * max(result_image.shape[0:2])) + 1  # line thickness
        c1, c2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3]))
        cv2.rectangle(result_image, c1, c2, color, thickness=tl)
        # draw text
        display_txt = "%s: %.1f%%" % (coco_names[label], 100*score)
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(display_txt, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(result_image, c1, c2, color, -1)  # filled
        cv2.putText(result_image, display_txt, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
  
    plt.figure(figsize=(10, 8))
    plt.imshow(result_image)

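    # Merge the masks of all confident detections with a pixel-wise maximum.
    masks = None
    for score, mask in zip(output['scores'], output['masks']):
      if score > 0.5:
        if masks is None:
          masks = mask
        else:
          masks = torch.max(masks, mask)

    # Skip the mask plot when no detection passed the score threshold.
    if masks is not None:
      plt.figure()
      plt.imshow(masks.squeeze(0).cpu().numpy())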





### Run the Mask R-CNN
----

In [6]:
run_MaskRCNN(IMAGE_URL)
