# Mask R-CNN - Train on Pantograph Dataset


This notebook shows how to train Mask R-CNN on your own dataset, using a pantograph dataset as the example. You need a GPU because the network backbone is a ResNet-101, which would be too slow to train on a CPU.

In [1]:
import os
import sys
import random
import math
import re
import time
import numpy as np
import cv2
import matplotlib
import matplotlib.pyplot as plt

# Root directory of the project
ROOT_DIR = os.path.abspath("../")

# Add root to path 
if ROOT_DIR not in sys.path:
    sys.path.append(ROOT_DIR)

# Import Mask RCNN
# from mrcnn.config import Config
import mrcnn.model as modellib
from mrcnn import visualize
from mrcnn.model import log

# Import the pantograph module (config and dataset classes)
from dev import pantograph

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "models")

# Set path to root of images. 
DATA_DIR = os.path.join(ROOT_DIR, "datasets/pantograph")


print("Using Root dir:", ROOT_DIR)
print("Using Model dir:", MODEL_DIR)
print("Using Data dir:", DATA_DIR)

Using TensorFlow backend.


I've been imported
Using Root dir: /home/jupyter/GCP_Test
Using Model dir: /home/jupyter/GCP_Test/models
Using Data dir: /home/jupyter/GCP_Test/datasets/pantograph


## Load Configuration

In [2]:
# Load the pantograph config and override values as needed.
# Despite its name, this config is the one passed to the training model below.
class InferenceConfig(pantograph.PantographConfig):
    pass
    # Example overrides (uncomment to use):
    # MINI_MASK_SHAPE = (224, 224)
    # MASK_POOL_SIZE = 14
    # KEYPOINT_MASK_POOL_SIZE = 7
    # RPN_TRAIN_ANCHORS_PER_IMAGE = 50
    # RPN_ANCHOR_SCALES = (64, 128, 256, 512, 1024)
    # IMAGE_RESIZE_MODE = 'pad64'

config = InferenceConfig()

config.display()


Configurations Superlee:
BACKBONE                       resnet101
BACKBONE_SHAPES                [[256 256]
 [128 128]
 [ 64  64]
 [ 32  32]
 [ 16  16]]
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     2
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        50
DETECTION_MIN_CONFIDENCE       0.9
DETECTION_NMS_THRESHOLD        0.3
GPU_COUNT                      1
IMAGES_PER_GPU                 2
IMAGE_MAX_DIM                  1024
IMAGE_MIN_DIM                  512
IMAGE_MIN_SCALE                0.5
IMAGE_PADDING                  True
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024    3]
KEYPOINT_MASK_POOL_SIZE        7
KEYPOINT_MASK_SHAPE            [56, 56]
KEYPOINT_THRESHOLD             0.005
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.002
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES        
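
Some values in the printout are derived from others rather than set directly. The sketch below recomputes two of them from the printed settings, assuming the usual relationships in Mask R-CNN code bases (batch size as images per GPU times GPU count, backbone feature-map shapes as the image size divided by each stride, rounded up). This fork may compute them differently, so treat it as a sanity check rather than a description of its internals.

```python
import math

# Values copied from the configuration printout.
GPU_COUNT = 1
IMAGES_PER_GPU = 2
IMAGE_SHAPE = (1024, 1024, 3)
BACKBONE_STRIDES = [4, 8, 16, 32, 64]

# Assumed relationship: one batch holds IMAGES_PER_GPU images on each GPU.
batch_size = IMAGES_PER_GPU * GPU_COUNT

# Assumed relationship: each feature map is the image size divided by its stride.
backbone_shapes = [
    [math.ceil(IMAGE_SHAPE[0] / stride), math.ceil(IMAGE_SHAPE[1] / stride)]
    for stride in BACKBONE_STRIDES
]

print(batch_size)       # 2
print(backbone_shapes)  # [[256, 256], [128, 128], [64, 64], [32, 32], [16, 16]]
```

Both results agree with the `BATCH_SIZE` and `BACKBONE_SHAPES` rows printed by `config.display()`.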

## Dataset

Load the pantograph dataset

A custom dataset class extends the Dataset class, adds a method to load the dataset (here `load_pantograph()`), and overrides the following methods as needed:

* load_image()
* load_mask()
* image_reference()
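
As a rough illustration of that pattern, here is a minimal sketch that runs without the Mask R-CNN library. `MiniDataset` is a stand-in for the library's Dataset base class, and `ToyPantographDataset`, `load_toy()`, and the box-style annotations are hypothetical names made up for this example; the real `PantographDataset` is not shown in this notebook and may differ.

```python
import numpy as np


class MiniDataset:
    """Stand-in for the Dataset base class; only the bookkeeping is reproduced."""

    def __init__(self):
        self.image_info = []
        self.class_info = [{"source": "", "id": 0, "name": "BG"}]

    def add_class(self, source, class_id, class_name):
        self.class_info.append({"source": source, "id": class_id, "name": class_name})

    def add_image(self, source, image_id, path, **kwargs):
        info = {"id": image_id, "source": source, "path": path}
        info.update(kwargs)
        self.image_info.append(info)


class ToyPantographDataset(MiniDataset):
    """Hypothetical dataset whose annotations are boxes (y1, x1, y2, x2)."""

    def load_toy(self, annotations):
        self.add_class("toy", 1, "pantograph")
        for i, ann in enumerate(annotations):
            self.add_image("toy", image_id=i, path=ann["path"],
                           height=ann["height"], width=ann["width"],
                           boxes=ann["boxes"])

    def load_image(self, image_id):
        # Real code would read the file at info["path"]; a blank image keeps this runnable.
        info = self.image_info[image_id]
        return np.zeros((info["height"], info["width"], 3), dtype=np.uint8)

    def load_mask(self, image_id):
        # One boolean mask per instance, stacked along the last axis.
        info = self.image_info[image_id]
        boxes = info["boxes"]
        masks = np.zeros((info["height"], info["width"], len(boxes)), dtype=bool)
        for n, (y1, x1, y2, x2) in enumerate(boxes):
            masks[y1:y2, x1:x2, n] = True
        class_ids = np.ones(len(boxes), dtype=np.int32)  # every instance is class 1
        return masks, class_ids

    def image_reference(self, image_id):
        return self.image_info[image_id]["path"]


dataset = ToyPantographDataset()
dataset.load_toy([{"path": "img_000.png", "height": 8, "width": 10,
                   "boxes": [(1, 2, 4, 6)]}])
masks, class_ids = dataset.load_mask(0)
print(masks.shape, class_ids)
```

The point of the sketch is the return contract of `load_mask()`: a boolean array of shape `[height, width, instances]` plus an array with one class ID per instance.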

In [4]:
# Load dataset
assert config.NAME == "pantograph"

# Training dataset
train_dataset_keypoints = pantograph.PantographDataset()
train_dataset_keypoints.load_pantograph(DATA_DIR, "train")
train_dataset_keypoints.prepare()

# Validation dataset
val_dataset_keypoints = pantograph.PantographDataset()
val_dataset_keypoints.load_pantograph(DATA_DIR, "val")
val_dataset_keypoints.prepare()

print("Train Keypoints Image Count: {}".format(len(train_dataset_keypoints.image_ids)))
print("Train Keypoints Class Count: {}".format(train_dataset_keypoints.num_classes))
for i, info in enumerate(train_dataset_keypoints.class_info):
    print("{:3}. {:50}".format(i, info['name']))

print("Val Keypoints Image Count: {}".format(len(val_dataset_keypoints.image_ids)))
print("Val Keypoints Class Count: {}".format(val_dataset_keypoints.num_classes))
for i, info in enumerate(val_dataset_keypoints.class_info):
    print("{:3}. {:50}".format(i, info['name']))

loading annotations into memory...
Done (t=1.01s)
creating index...
index created!


KeyError: 0

## Create Model

In [None]:
# Create model in training mode
model = modellib.MaskRCNN(mode="training", 
                          config=config,
                          model_dir=MODEL_DIR)

In [None]:
# Which weights to start with?
init_with = "coco"  # imagenet, coco, or last

if init_with == "imagenet":
    model.load_weights(model.get_imagenet_weights(), by_name=True)
elif init_with == "coco":
    # Load weights trained on MS COCO, but skip the layers that differ
    # because of the different number of classes
    MODEL_PATH = os.path.join(MODEL_DIR, "mask_rcnn_coco.h5")
    model.load_weights(MODEL_PATH, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
elif init_with == "last":
    # Resume from a checkpoint saved by an earlier training run
    LOG_DIR = os.path.join(MODEL_DIR, "pantograph20200412T2157")
    MODEL_PATH = os.path.join(LOG_DIR, "mask_rcnn_pantograph_0010.h5")
    model.load_weights(MODEL_PATH, by_name=True, exclude=None)
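
The `last` branch hard-codes a run directory and an epoch number. A small helper can look up the newest checkpoint instead. This is a minimal sketch: `find_latest_checkpoint` is a hypothetical helper, not part of the library, and it assumes run directories are named `<name><timestamp>` and checkpoints `mask_rcnn_<name>_<epoch>.h5` with zero-padded epochs, as in the paths used in this notebook.

```python
import glob
import os


def find_latest_checkpoint(model_dir, name="pantograph"):
    """Return the newest checkpoint path under model_dir, or None if there is none."""
    # Timestamps such as 20200412T2157 sort chronologically as plain strings.
    run_dirs = sorted(
        d for d in glob.glob(os.path.join(model_dir, name + "*"))
        if os.path.isdir(d)
    )
    # Walk from the newest run backwards and skip runs without checkpoints.
    for run_dir in reversed(run_dirs):
        pattern = os.path.join(run_dir, "mask_rcnn_{}_*.h5".format(name))
        checkpoints = sorted(glob.glob(pattern))  # zero-padded epochs sort correctly
        if checkpoints:
            return checkpoints[-1]
    return None
```

With this in place, the branch could call `find_latest_checkpoint(MODEL_DIR)` and fail with a clear message when it returns `None`, rather than passing a stale path to `load_weights()`.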

In [None]:
# model.keras_model.summary()

## Training

Train in two stages:
1. Only the heads. This freezes all the backbone layers and trains only the randomly initialized layers (i.e. the ones that were not loaded from the pre-trained MS COCO weights). To train only the head layers, pass `layers='heads'` to the `train()` function.

2. Fine-tune all layers. Pass `layers="all"` to train all layers. This notebook runs only this stage.
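
The `layers` argument selects layers by name. The sketch below illustrates the idea, assuming that shorthand values such as `"heads"` and `"all"` expand to regular expressions matched against each layer name, in the style of the Matterport Mask R-CNN implementation. The patterns and layer names here are illustrative and are not read from this fork or from a real model.

```python
import re

# Assumption: shorthand names map to regular expressions over layer names.
LAYER_PATTERNS = {
    "heads": r"(mrcnn_.*)|(rpn_.*)|(fpn_.*)",
    "all": r".*",
}


def select_trainable(layer_names, layers):
    """Return the layer names that a given `layers` value would mark trainable."""
    # Anything that is not a known shorthand is treated as a custom regex.
    pattern = LAYER_PATTERNS.get(layers, layers)
    return [name for name in layer_names if re.fullmatch(pattern, name)]


# Illustrative layer names only.
names = ["conv1", "res2a_branch2a", "fpn_c5p5",
         "rpn_class_raw", "mrcnn_mask", "mrcnn_class_logits"]

print(select_trainable(names, "heads"))
print(select_trainable(names, "all"))
print(select_trainable(names, r"mrcnn_.*"))
```

Under these assumptions, `"heads"` leaves the backbone layers (`conv1`, `res2a_branch2a`) frozen, while `"all"` selects every layer.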

In [None]:
# Fine tune all layers
# Passing layers="all" trains all layers. You can also 
# pass a regular expression to select which layers to
# train by name pattern.
model.train(train_dataset_keypoints, val_dataset_keypoints, 
            learning_rate=config.LEARNING_RATE,
            epochs=10, 
            layers="all")

In [None]:
# Save weights manually
# Typically not needed because callbacks save after every epoch
model_path = os.path.join(MODEL_DIR, "mask_rcnn_shapes_test_1.h5")
model.keras_model.save_weights(model_path)