# Global Wheat Detection

Hi.<br>
This is a baseline Keras implementation of **Mask-RCNN**, built on [Matterport](https://github.com/matterport/Mask_RCNN), for the **Global Wheat Detection** task.

---
**Please Note**

I use the [Matterport](https://github.com/matterport/Mask_RCNN) implementation. I initially planned to run it on `TF 2.1` but ended up on `TF 1.x` because of compatibility errors. While working on `TF 2.1`, I upgraded the necessary [**Mask-RCNN**](https://github.com/matterport/Mask_RCNN) scripts using [tf_upgrade_v2](https://www.tensorflow.org/guide/upgrade). Although I now use `TF 1.x`, the converted scripts are still usable. The upgraded files are available at [MaskRCNN Keras Source Code](https://www.kaggle.com/ipythonx/maskrcnn-keras-source-code). From that copy I removed the example notebooks, sample images, and anything else not needed, to keep the workspace clean.


# Acknowledgement

- [Peter](https://www.kaggle.com/pestipeti/pytorch-starter-fasterrcnn-train)
- [Henrique Mendonça](https://www.kaggle.com/hmendonca/mask-rcnn-and-coco-transfer-learning-lb-0-155/notebook)
- [Alexander Teplyuk](https://www.kaggle.com/ateplyuk/gwd-starter-efficientdet-train)
- [Splash of Color: Instance Segmentation with Mask R-CNN and TensorFlow](https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46)


---

## Content
* [EDA and Model Config](#1)
    * [Simple EDA](#1)
    * [Mask RCNN Model Configuration](#2)
* [Preparing the Training Set](#3)  
    * [Mask-RCNN Dataloader](#3)
    * [Data Split](#4)
* [Training Sample Visualization](#5)
    * [Top Mask Position](#5)
    * [All Mask | Sample with Masked BBox](#6)
* [Augmentation](#7)
* [Model Definition and Training || Inference](#8)
* [Evaluation](#9)
    * [Visual Evaluation](#9)
    * [Numerical Evaluation (Comp. Metrics)](#10)
* [Inference on Test Set](#11)
    * [Visual Prediction](#11)
    * [Submission](#12)

In [None]:
# copy to working directory
!cp -r ../input/maskrcnn-keras-source-code/MaskRCNN/* ./

**Imports**

In [None]:
import numpy as np 
import pandas as pd 
import seaborn as sns
from tqdm import tqdm
import matplotlib.pyplot as plt
import sys, os, random, glob, cv2, math

from mrcnn import utils
from mrcnn.model import log
from mrcnn import visualize
import mrcnn.model as modellib
from mrcnn.config import Config

In [None]:
# for reproducibility
def seed_all(SEED):
    random.seed(SEED)
    np.random.seed(SEED)
    os.environ['PYTHONHASHSEED'] = str(SEED)

seed_all(42)
sns.set(style="darkgrid")
%matplotlib inline

# Simple EDA <a id="1"></a>

In [None]:
ORIG_SIZE     = 1024
epoch         = 100
data_root     = '/kaggle/input'
packages_root = '/kaggle/working'

In [None]:
# load annotation files
df = pd.read_csv(data_root + '/global-wheat-detection/train.csv')
df.head()

In [None]:
# information summary
df.info()

**Check source distribution**

In [None]:
plt.figure(figsize=(9,5))
sns.countplot(x=df.source)  # keyword form also works on newer seaborn releases
plt.show()

The organizers note that `Not all images include wheat heads / bounding boxes.` We can verify this easily: about 49 images have no bounding box.

In [None]:
# image directory
img_root = '../input/global-wheat-detection/train/'
len(os.listdir(img_root)) - len(df.image_id.unique())
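
The same check can be written as a set difference, which also tells you which images are empty rather than only how many. A minimal sketch with made-up file names and ids; both collections are hypothetical stand-ins for `os.listdir(img_root)` and `df.image_id.unique()`:

```python
import os

# Hypothetical stand-ins for os.listdir(img_root) and df.image_id.unique()
files_on_disk = ["a1.jpg", "b2.jpg", "c3.jpg", "d4.jpg"]
annotated_ids = {"a1", "c3"}

# Strip the extension so file names can be compared with image ids
ids_on_disk = {os.path.splitext(name)[0] for name in files_on_disk}

# Images present on disk but absent from the annotation file
empty_ids = sorted(ids_on_disk - annotated_ids)
print(len(empty_ids), empty_ids)
```

Unlike the subtraction of two lengths, this does not silently go wrong if the annotation file ever lists an id that has no image on disk.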

Let's reshape the annotation file to make it easier to use. The `bbox` values are stored as a single string column, so we split them into separate `x`, `y`, `w`, `h` columns.

In [None]:
df['bbox'] = df['bbox'].apply(lambda x: x[1:-1].split(","))

df['x'] = df['bbox'].apply(lambda x: x[0]).astype('float32')
df['y'] = df['bbox'].apply(lambda x: x[1]).astype('float32')
df['w'] = df['bbox'].apply(lambda x: x[2]).astype('float32')
df['h'] = df['bbox'].apply(lambda x: x[3]).astype('float32')

df = df[['image_id','x', 'y', 'w', 'h']]
df.head()
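
For a single raw `bbox` string, the split above amounts to the following. This is only a sketch: `parse_bbox` is a hypothetical helper, the sample string is made up in the `[x, y, w, h]` layout, and `ast.literal_eval` is swapped in for the manual `split(",")`.

```python
import ast

def parse_bbox(bbox_str):
    """Turn a string such as '[834.0, 222.0, 56.0, 36.0]' into four floats."""
    x, y, w, h = ast.literal_eval(bbox_str)
    return float(x), float(y), float(w), float(h)

sample = "[834.0, 222.0, 56.0, 36.0]"   # made-up row in the train.csv layout
x, y, w, h = parse_bbox(sample)

# Convert [x, y, w, h] to corner form (x1, y1, x2, y2)
corners = (x, y, x + w, y + h)
print(corners)
```

The corner form is what the mask and box utilities further down work with, so it helps to keep in mind which of the two layouts a given variable holds.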

# Mask-RCNN Model Configuration <a id="2"></a>

In [None]:
class WheatDetectorConfig(Config):

    # Give the configuration a recognizable name  
    NAME = 'wheat'
    
    # set the number of GPUs to use along with the number of images
    # per GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 8
    BACKBONE = 'resnet50'
    
    # number of classes (we would normally add +1 for the background)
    # BG + Wheat
    NUM_CLASSES = 2
    
    IMAGE_RESIZE_MODE = "square"  
    IMAGE_MIN_DIM = 512
    IMAGE_MAX_DIM = 512
    
    # Number of training steps per epoch
    STEPS_PER_EPOCH = 90
    
    # Use several anchor sizes because wheat heads vary widely in scale
    RPN_ANCHOR_SCALES = (16, 32, 64, 128)  # anchor side in pixels
    
    # Learning rate
    LEARNING_RATE = 0.005
    WEIGHT_DECAY  = 0.0005
    
    # Number of ROIs per image fed to the classifier/mask heads during training
    TRAIN_ROIS_PER_IMAGE = 170
    
    # Skip detections with < 70% confidence
    DETECTION_MIN_CONFIDENCE = 0.70
    
    # Number of validation steps run at the end of each epoch
    VALIDATION_STEPS = 30
    
    # Maximum number of ground-truth instances used per image
    MAX_GT_INSTANCES = 60

config = WheatDetectorConfig()
config.display()
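
One way to sanity-check `RPN_ANCHOR_SCALES` is to see which anchor each ground-truth box falls closest to once the image has been resized from 1024 to 512. The sketch below uses made-up `(w, h)` pairs in place of `df[['w', 'h']]`, and `nearest_anchor` is a hypothetical helper, not part of the library:

```python
import math

ORIG_SIZE, NET_SIZE = 1024, 512           # values used in this notebook
ANCHOR_SCALES = (16, 32, 64, 128)         # RPN_ANCHOR_SCALES from the config

# Made-up (w, h) pairs in original-image pixels, standing in for df[['w', 'h']]
boxes_wh = [(56, 36), (130, 58), (74, 160), (109, 107), (24, 30), (300, 280)]

def nearest_anchor(w, h):
    """Return the anchor scale closest to the box side after resizing."""
    side = math.sqrt(w * h) * NET_SIZE / ORIG_SIZE
    return min(ANCHOR_SCALES, key=lambda s: abs(s - side))

# Count how many boxes land on each anchor scale
counts = {s: 0 for s in ANCHOR_SCALES}
for w, h in boxes_wh:
    counts[nearest_anchor(w, h)] += 1
print(counts)
```

If one scale collects almost no boxes on the real data, that is a hint (not proof) that the scales could be shifted.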

# Data Preparing <a id="3"></a>

In [None]:
def get_jpg(img_root):
    jpg_fps = glob.glob(img_root + '*.jpg')
    return list(set(jpg_fps))

def get_dataset(img_dir, anns): 
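    # Every JPG in img_dir is kept; images with no rows in anns get an empty list
    image_fps = get_jpg(img_dir)
    image_annotations = {fp: [] for fp in image_fps}
    
    for index, row in anns.iterrows(): 
        # Build the path from img_dir (not the global img_root) so it matches the keys above
        fp = os.path.join(img_dir, row['image_id'] + '.jpg')
        image_annotations[fp].append(row)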
    
    return image_fps, image_annotations 

# Data Generator for Mask-RCNN <a id="3"></a>

In [None]:
class DetectorDataset(utils.Dataset):
    def __init__(self, image_fps, image_annotations, orig_height, orig_width):
        super().__init__()
        
        # Add classes
        self.add_class('GlobalWheat', 1 , 'Wheat') # only one class, wheat
        
        # add images 
        for id, fp in enumerate(image_fps):
            annotations = image_annotations[fp]
            self.add_image('GlobalWheat', image_id=id, 
                           path=fp, annotations=annotations, 
                           orig_height=orig_height, orig_width=orig_width)

    # load bbox, most important function so far        
    def load_mask(self, image_id):
        info = self.image_info[image_id]
        annotations = info['annotations']
        count = len(annotations)
    
        if count == 0:
            mask = np.zeros((info['orig_height'], info['orig_width'], 1), 
                            dtype=np.uint8)
            class_ids = np.zeros((1,), dtype=np.int32)
        else:
            mask = np.zeros((info['orig_height'], info['orig_width'], count),
                            dtype=np.uint8)
            class_ids = np.zeros((count,), dtype=np.int32)
            for i, a in enumerate(annotations):
                x = int(a['x'])
                y = int(a['y'])
                w = int(a['w'])
                h = int(a['h'])
                mask_instance = mask[:, :, i].copy()
                cv2.rectangle(mask_instance, (x, y), (x+w, y+h), 255, -1)
                mask[:, :, i] = mask_instance
                class_ids[i] = 1
        # the builtin bool works on every NumPy version; the np.bool alias was removed in newer ones
        return mask.astype(bool), class_ids.astype(np.int32)
    
    # simple image loader 
    def load_image(self, image_id):
        info = self.image_info[image_id]
        fp = info['path']
        image = cv2.imread(fp, cv2.IMREAD_COLOR)
        # If grayscale. Convert to RGB for consistency.
        if len(image.shape) != 3 or image.shape[2] != 3:
            image = np.stack((image,) * 3, -1)
        return image
    
    # simply return the image path
    def image_reference(self, image_id):
        info = self.image_info[image_id]
        return info['path']
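
Stripped of the dataset bookkeeping, `load_mask` turns each box into one filled rectangle in a `(height, width, n)` stack. A minimal NumPy-only sketch of that idea follows; `boxes_to_masks` is a hypothetical helper and uses array slicing instead of `cv2.rectangle`. As far as I know `cv2.rectangle` paints both corner points, so the masks it produces are one pixel wider and taller than the ones below.

```python
import numpy as np

def boxes_to_masks(boxes_xywh, height, width):
    """Build a (height, width, n) boolean stack with one filled rectangle per box."""
    masks = np.zeros((height, width, len(boxes_xywh)), dtype=bool)
    for i, (x, y, w, h) in enumerate(boxes_xywh):
        # Slices are half-open, so the rectangle covers exactly w x h pixels
        masks[y:y + h, x:x + w, i] = True
    return masks

# Two made-up boxes on a tiny 6 x 8 canvas
masks = boxes_to_masks([(2, 1, 3, 2), (0, 0, 1, 1)], height=6, width=8)
print(masks.shape, masks[:, :, 0].sum(), masks[:, :, 1].sum())
```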

# Data Split <a id="4"></a>

In [None]:
image_ids = df['image_id'].unique()

valid_ids = image_ids[-665:]
train_ids = image_ids[:-665]

valid_df = df[df['image_id'].isin(valid_ids)]
train_df = df[df['image_id'].isin(train_ids)]
train_df.shape, valid_df.shape

In [None]:
len(train_df.image_id.unique()), len(valid_df.image_id.unique())
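
The split above simply holds out the last 665 ids. If the CSV happens to be ordered by source, a seeded shuffle before splitting may give a more representative validation set. This is an untested alternative, not what the notebook does; `split_ids` is a hypothetical helper and the ids are made up:

```python
import random

def split_ids(image_ids, n_valid, seed=42):
    """Shuffle ids with a fixed seed and hold out n_valid of them."""
    ids = sorted(image_ids)            # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)
    return ids[n_valid:], ids[:n_valid]

all_ids = ["img{:04d}".format(i) for i in range(20)]   # made-up ids
train_ids, valid_ids = split_ids(all_ids, n_valid=4)

# The two sets must not overlap and must cover every id
print(len(train_ids), len(valid_ids), set(train_ids) & set(valid_ids))
```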

## Build Train Set

In [None]:
# grab all image file paths with their corresponding annotations
train_image_fps, train_image_annotations = get_dataset(img_root,
                                                       anns=train_df)

# make data generator with that
dataset_train = DetectorDataset(train_image_fps, 
                                train_image_annotations,
                                ORIG_SIZE, ORIG_SIZE)
dataset_train.prepare()

print("Class Count: {}".format(dataset_train.num_classes))
for i, info in enumerate(dataset_train.class_info):
    print("{:3}. {:50}".format(i, info['name']))

## Build Validation Set

In [None]:
# grab all image file paths with their corresponding annotations
valid_image_fps, valid_image_annotations = get_dataset(img_root, 
                                           anns=valid_df)

# make data generator with that
dataset_valid = DetectorDataset(valid_image_fps, valid_image_annotations,
                                ORIG_SIZE, ORIG_SIZE)
dataset_valid.prepare()

print("Class Count: {}".format(dataset_valid.num_classes))
for i, info in enumerate(dataset_valid.class_info):
    print("{:3}. {:50}".format(i, info['name']))

# Training Samples <a id="5"></a>

Using `dataset_train`, let's observe some sample data.

In [None]:
class_ids = [0]

while class_ids[0] == 0:  ## look for a mask
    image_id = random.choice(dataset_train.image_ids)
    image_fp = dataset_train.image_reference(image_id)
    image = dataset_train.load_image(image_id)
    mask, class_ids = dataset_train.load_mask(image_id)

print(image.shape)

plt.figure(figsize=(15, 15))
plt.subplot(1, 2, 1)
plt.imshow(image)
plt.axis('off')

plt.subplot(1, 2, 2)
masked = np.zeros(image.shape[:2])
for i in range(mask.shape[2]):
    masked += image[:, :, 0] * mask[:, :, i]
plt.imshow(masked, cmap='gray')
plt.axis('off')

print(class_ids)
plt.show()

# Top Mask Position <a id="5"></a>

Let's display a few samples with their corresponding masks (here each mask is just a filled bounding box).

In [None]:
# Load and display random samples
image_ids = np.random.choice(dataset_train.image_ids,3)
for image_id in image_ids:
    image = dataset_train.load_image(image_id)
    mask, class_ids = dataset_train.load_mask(image_id)
    visualize.display_top_masks(image, mask, class_ids, 
                                dataset_train.class_names, limit=1)

# BBoxes with Masked Sample <a id="6"></a>

`Mask-RCNN` preserves the aspect ratio when resizing. If an image is not square, zero padding is added at the `top/bottom` or `right/left`.

In [None]:
# Load random image and mask.
image_id = np.random.choice(dataset_train.image_ids, 1)[0]
image = dataset_train.load_image(image_id)
mask, class_ids = dataset_train.load_mask(image_id)
original_shape = image.shape

# Resize
image, window, scale, padding, _ = utils.resize_image(image, 
                                                      min_dim=config.IMAGE_MIN_DIM, 
                                                      max_dim=config.IMAGE_MAX_DIM,
                                                      mode=config.IMAGE_RESIZE_MODE)
mask = utils.resize_mask(mask, scale, padding)

# Compute Bounding box
bbox = utils.extract_bboxes(mask)

# Display image and additional stats
print("Original shape: ", original_shape)
log("image", image)
log("mask", mask)
log("class_ids", class_ids)
log("bbox", bbox)

# Display image and instances
visualize.display_instances(image, bbox, mask, class_ids, 
                            dataset_train.class_names)
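
To make the scale and padding values concrete, here is a simplified sketch of the geometry behind a "square" resize, based on my reading of `utils.resize_image`. `square_resize_geometry` is a hypothetical helper, it ignores `min_scale`, and it does not touch any pixels:

```python
def square_resize_geometry(height, width, min_dim=512, max_dim=512):
    """Return (scale, resized (h, w), padding) for a 'square'-style resize."""
    # Scale up so the shorter side reaches min_dim, but never scale down here
    scale = max(1.0, min_dim / min(height, width))
    # Shrink if the longer side would exceed max_dim
    if round(max(height, width) * scale) > max_dim:
        scale = max_dim / max(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    # Centre the resized image on a max_dim x max_dim canvas
    top = (max_dim - new_h) // 2
    left = (max_dim - new_w) // 2
    padding = ((top, max_dim - new_h - top), (left, max_dim - new_w - left))
    return scale, (new_h, new_w), padding

print(square_resize_geometry(1024, 1024))   # the wheat images: halved, no padding
print(square_resize_geometry(600, 800))     # a non-square case: padded top and bottom
```

For the 1024 x 1024 wheat images the padding is always zero, which is why the later cells can rescale boxes with a single factor.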

# Augmentation <a id="7"></a>

Augmentation is a key part of boosting performance. Below are a few candidate augmentations; which of them work best for this task still has to be found by experiment.

In [None]:
from imgaug import augmenters as iaa

# List of augmentations
# http://imgaug.readthedocs.io/en/latest/source/augmenters.html
augmentationA = iaa.Sequential([
    iaa.Affine(rotate=(-10, 10)),
    iaa.AdditiveGaussianNoise(scale=(6, 6)),
    iaa.Fliplr(0.5),
    iaa.Multiply((0.6, 1.3)),
    iaa.CoarseDropout(0.02, size_percent=0.15, per_channel=0.5)
])

In [None]:
# from official repo
def get_ax(rows=1, cols=1, size=7):
    """Return a Matplotlib Axes array to be used in
    all visualizations in the notebook. Provide a
    central point to control graph sizes.
    
    Adjust the size attribute to control how big to render images
    """
    _, ax = plt.subplots(rows, cols, figsize=(size*cols, size*rows))
    return ax


# Load the image multiple times to show augmentations
limit = 4
ax = get_ax(rows=2, cols=limit//2)

for i in range(limit):
    image, image_meta, class_ids,\
    bbox, mask = modellib.load_image_gt(
        dataset_train, config, image_id, use_mini_mask=False, 
        augment=False, augmentation=augmentationA)
    
    visualize.display_instances(image, bbox, mask, class_ids,
                                dataset_train.class_names, ax=ax[i//2, i % 2],
                                show_mask=False, show_bbox=False)

**Augmentation B**

Let's look at a second augmentation pipeline with different parameters.

In [None]:
# Image augmentation (light but constant)
augmentationB = iaa.Sequential([
    iaa.OneOf([ ## rotate
        iaa.Affine(rotate=0),
        iaa.Affine(rotate=90),
        iaa.Affine(rotate=180),
        iaa.Affine(rotate=270),
    ]),
    iaa.Fliplr(0.5),
    iaa.Flipud(0.5),
    iaa.OneOf([ ## brightness or contrast
        iaa.Multiply((0.9, 1.1)),
        iaa.ContrastNormalization((0.9, 1.1)),
    ]),
    iaa.OneOf([ ## blur or sharpen
        iaa.GaussianBlur(sigma=(0.0, 0.1)),
        iaa.Sharpen(alpha=(0.0, 0.1)),
    ]),
])

In [None]:
# Load the image multiple times to show augmentations
limit = 4
ax = get_ax(rows=2, cols=limit//2)

for i in range(limit):
    image, image_meta, class_ids,\
    bbox, mask = modellib.load_image_gt(
        dataset_train, config, image_id, use_mini_mask=False, 
        augment=False, augmentation=augmentationB)
    
    visualize.display_instances(image, bbox, mask, class_ids,
                                dataset_train.class_names, ax=ax[i//2, i % 2],
                                show_mask=False, show_bbox=False)

# Build Model <a id="8"></a>

Time to build the model. I initialize it with the pre-trained [`mask_rcnn_coco.h5`](https://www.kaggle.com/ipythonx/cocowg) weights and train from there.

In [None]:
def model_definition():
    print("loading mask R-CNN model")
    model = modellib.MaskRCNN(mode='training', 
                              config=config, 
                              model_dir=packages_root)
    
    # load the weights for COCO
    model.load_weights(data_root + '/cocowg/mask_rcnn_coco.h5',
                       by_name=True, 
                       exclude=["mrcnn_class_logits",
                                "mrcnn_bbox_fc",  
                                "mrcnn_bbox","mrcnn_mask"])
    return model   

model = model_definition()

In [None]:
from keras.callbacks import (ModelCheckpoint, ReduceLROnPlateau, CSVLogger)

def callback():
    cb = []
    # os.path.join supplies the separator that plain string concatenation was missing
    checkpoint = ModelCheckpoint(os.path.join(packages_root, 'wheat_wg.h5'),
                                 save_best_only=True,
                                 mode='min',
                                 monitor='val_loss',
                                 save_weights_only=True, verbose=1)
    cb.append(checkpoint)
    reduceLROnPlat = ReduceLROnPlateau(monitor='val_loss',
                                   factor=0.3, patience=5,
                                   verbose=1, mode='auto',
                                   epsilon=0.0001, cooldown=1, min_lr=0.00001)
    # named csv_logger so it does not shadow the `log` helper imported from mrcnn.model
    csv_logger = CSVLogger(os.path.join(packages_root, 'wheat_history.csv'))
    cb.append(csv_logger)
    cb.append(reduceLROnPlat)
    return cb

**Inference Configuration**

I've trained the model on-site. I set `epoch` to 100; the model converged within about `50` epochs and then improved slightly over the next few. I didn't intend to train longer. I started the training and went to sleep; the rest is history. :D

In [None]:
%%time
CB = callback()
TRAIN = False

class WheatInferenceConfig(WheatDetectorConfig):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

if TRAIN:
    model.train(dataset_train, dataset_valid, 
                augmentation=augmentationB, 
                learning_rate=config.LEARNING_RATE,
                custom_callbacks = CB,
                epochs=epoch, layers='all') 
else:
    inference_config = WheatInferenceConfig()
    # Recreate the model in inference mode
    model = modellib.MaskRCNN(mode='inference', 
                              config=inference_config,
                              model_dir=packages_root)
    
    model.load_weights(data_root + '/wheatweight/wheat_wg.h5', by_name = True)

**Learning Curves**

In [None]:
history = pd.read_csv(data_root + '/wheatweight/wheat_history.csv') 

# find the lowest validation loss score
print(history.loc[history['val_loss'].idxmin()])
history.head()

In [None]:
plt.figure(figsize=(19,6))

plt.subplot(131)
plt.plot(history.epoch, history.loss, label="Train loss")
plt.plot(history.epoch, history.val_loss, label="Valid loss")
plt.legend()

plt.subplot(132)
plt.plot(history.epoch, history.mrcnn_class_loss, label="Train class ce")
plt.plot(history.epoch, history.val_mrcnn_class_loss, label="Valid class ce")
plt.legend()

plt.subplot(133)
plt.plot(history.epoch, history.mrcnn_bbox_loss, label="Train box loss")
plt.plot(history.epoch, history.val_mrcnn_bbox_loss, label="Valid box loss")
plt.legend()

plt.show()

# Evaluation <a id="9"></a>  

We will evaluate the model in two ways: `visual interpretation` and `numerical`, mainly the competition metric (`mAP(0.5:0.75:0.05)`). In most cases, though, `visual interpretation` doesn't really matter (except in the medical domain).

In [None]:
image_id = np.random.choice(dataset_valid.image_ids, 5)

for img_id in image_id:
    original_image, image_meta, gt_class_id, gt_bbox, gt_mask =\
        modellib.load_image_gt(dataset_valid, inference_config,     
                               img_id, use_mini_mask=False)

    info = dataset_valid.image_info[img_id]
    results = model.detect([original_image], verbose=1)
    r = results[0]

    visualize.display_instances(original_image, r['rois'], r['masks'], r['class_ids'], 
                                dataset_valid.class_names, r['scores'], ax=get_ax(), title="Predictions")
    
    log("image_meta", image_meta)
    log("gt_class_id", gt_class_id)
    log("gt_bbox", gt_bbox)
    log("gt_mask", gt_mask)

# Competition Metrics <a id="10"></a>

The following function takes a good amount of time to compute the average precision scores over the given `IoU` thresholds on the validation set, so consider whether you really want to run it.

In [None]:
%%time

thresh_score = [0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75]

def evaluate_threshold_range(test_set, image_ids, model, 
                             iou_thresholds, inference_config):
    '''Calculate mAP based on iou_threshold range
    inputs:
        test_set        : test samples
        image_ids       : image ids of the test samples
        model           : trained model
        iou_thresholds  : IoU thresholds, e.g. [0.5:0.75:0.05]
        inference_config: test configuration
    return:
        AP : mAP[@0.5:0.75] scores lists of the test samples
    '''
    # placeholder for the AP of each image, averaged over the given IoU thresholds
    AP = []
    np.seterr(divide='ignore', invalid='ignore') 
    
    for image_id in image_ids:
        # Load image and ground truth data
        image, image_meta, gt_class_id, gt_bbox, gt_mask =\
            modellib.load_image_gt(test_set, inference_config,
                                   image_id, use_mini_mask=False)

        # Run object detection
        results = model.detect([image], verbose=0)
        r = results[0]
        AP_range = utils.compute_ap_range(gt_bbox, gt_class_id, gt_mask, 
                                          r["rois"], r["class_ids"], r["scores"], r['masks'],
                                          iou_thresholds=iou_thresholds, verbose=0)
        
        if math.isnan(AP_range):
            continue
            
        # append the scores of each samples
        AP.append(AP_range)   
        
    return AP

AP = evaluate_threshold_range(dataset_valid, dataset_valid.image_ids,
                              model, thresh_score, inference_config)

print("AP[0.5:0.75]: ", np.mean(AP))
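
As I read the competition's evaluation page, the leaderboard score is `TP / (TP + FP + FN)` averaged over the IoU thresholds and then over images, which is not the same quantity as the precision-recall AP that `utils.compute_ap_range` returns. The sketch below shows that per-image score under this assumption; `iou_xyxy` and `image_precision` are hypothetical helpers, the boxes are made up, and the greedy matching is a simplification:

```python
def iou_xyxy(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def image_precision(gt_boxes, pred_boxes, thresholds):
    """Average TP / (TP + FP + FN) over IoU thresholds for one image.

    pred_boxes must already be sorted by descending confidence.
    """
    scores = []
    for t in thresholds:
        unmatched = list(gt_boxes)
        tp = 0
        for p in pred_boxes:
            # Greedily match each prediction to its best remaining ground truth
            best = max(unmatched, key=lambda g: iou_xyxy(p, g), default=None)
            if best is not None and iou_xyxy(p, best) > t:
                unmatched.remove(best)
                tp += 1
        fp = len(pred_boxes) - tp
        fn = len(unmatched)
        total = tp + fp + fn
        # Returning 0.0 for an image with no boxes at all is a choice made for this sketch
        scores.append(tp / total if total else 0.0)
    return sum(scores) / len(scores)

thresholds = [0.5, 0.55, 0.6, 0.65, 0.7, 0.75]
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
pred = [(0, 0, 10, 10), (21, 20, 31, 30), (50, 50, 60, 60)]   # last one is a false positive
score = image_precision(gt, pred, thresholds)
print(round(score, 4))
```

Because false positives appear in the denominator directly, this score reacts more strongly to spurious boxes than a precision-recall AP does, so the two numbers should not be compared one-to-one.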

# Inference on Test Set <a id="11"></a>

In [None]:
# Get filenames of test dataset jpg images
test_img_root  = data_root + '/global-wheat-detection/test/'
test_image_fps = get_jpg(test_img_root)

# Visual Prediction <a id="11"></a>

In [None]:
# show a few test image detection examples
# (named visualize_prediction so it does not shadow the imported mrcnn `visualize` module)
def visualize_prediction(): 
    image_fp = random.choice(test_image_fps)
    image = cv2.imread(image_fp, cv2.IMREAD_COLOR)
    
    # assume square image 
    resize_factor = ORIG_SIZE / config.IMAGE_SHAPE[0]
    
    # If grayscale, stack to three channels for consistency.
    if len(image.shape) != 3 or image.shape[2] != 3:
        image = np.stack((image,) * 3, -1) 
        
    resized_image, window, scale, padding, crop = utils.resize_image(
        image,
        min_dim=config.IMAGE_MIN_DIM,
        min_scale=config.IMAGE_MIN_SCALE,
        max_dim=config.IMAGE_MAX_DIM,
        mode=config.IMAGE_RESIZE_MODE)

    results = model.detect([resized_image])
    r = results[0]
    # rois are (y1, x1, y2, x2) at network resolution; scale them back to the original size
    for bbox in r['rois']: 
        x1 = int(bbox[1] * resize_factor)
        y1 = int(bbox[0] * resize_factor)
        x2 = int(bbox[3] * resize_factor)
        y2 = int(bbox[2] * resize_factor)
        cv2.rectangle(image, (x1,y1), (x2,y2), (77, 255, 9), 3, 1)
    
    plt.figure(figsize=(10,10)) 
    plt.grid(False)
    plt.imshow(image, cmap=plt.cm.gist_gray)


# draw predictions for four random test images
for _ in range(4):
    visualize_prediction()

# Submission <a id="12"></a>

Yes brother! Like you, I also hit the stupid `Submission Scoring Error` around `15` times. And when I finally solved it, it felt like winning the competition. LoL :D

In [None]:
# Make predictions on test images, write out sample submission
def predict(image_fps, filepath='submission.csv', min_conf=0.50):
    # assume square image
    resize_factor = ORIG_SIZE / config.IMAGE_SHAPE[0]

    with open(filepath, 'w') as file:
        file.write("image_id,PredictionString\n")

        for image_id in tqdm(image_fps):
            image = cv2.imread(image_id, cv2.IMREAD_COLOR)
            # If grayscale. Convert to RGB for consistency.
            if len(image.shape) != 3 or image.shape[2] != 3:
                image = np.stack((image,) * 3, -1)
                
            image, window, scale, padding, crop = utils.resize_image(
                image,
                min_dim=config.IMAGE_MIN_DIM,
                min_scale=config.IMAGE_MIN_SCALE,
                max_dim=config.IMAGE_MAX_DIM,
                mode=config.IMAGE_RESIZE_MODE)

            image_id = os.path.splitext(os.path.basename(image_id))[0]

            results = model.detect([image])
            r = results[0]

            out_str = ""
            out_str += image_id
            out_str += ","
            
            assert( len(r['rois']) == len(r['class_ids']) == len(r['scores']) )
            
            if len(r['rois']) == 0:
                pass
            else:
                num_instances = len(r['rois'])
                for i in range(num_instances):
                    if r['scores'][i] > min_conf:
                               
                        out_str += ' '
                        out_str += "{0:.4f}".format(r['scores'][i])
                        out_str += ' '

                        # x1, y1, width, height
                        x1 = r['rois'][i][1]
                        y1 = r['rois'][i][0]
                        width = r['rois'][i][3] - x1
                        height = r['rois'][i][2] - y1
                        bboxes_str = "{} {} {} {}".format( x1*resize_factor, y1*resize_factor, \
                                                           width*resize_factor, height*resize_factor )
                        out_str += bboxes_str

            file.write(out_str+"\n")

In [None]:
submission = os.path.join(packages_root, 'submission.csv')
predict(test_image_fps, filepath=submission)

In [None]:
submit = pd.read_csv(submission)
submit.head(10)
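
Before submitting, it can help to check each `PredictionString` locally. The sketch below parses a row into `conf x y w h` groups and applies a few conditions I would check first; whether any of them is what triggers the scoring error is a guess. `parse_prediction_string` and `check_row` are hypothetical helpers and the sample rows are made up:

```python
def parse_prediction_string(pred_str):
    """Split 'conf x y w h conf x y w h ...' into a list of 5-tuples of floats."""
    values = [float(v) for v in pred_str.split()]
    if len(values) % 5 != 0:
        raise ValueError("expected groups of 5 numbers, got {}".format(len(values)))
    return [tuple(values[i:i + 5]) for i in range(0, len(values), 5)]

def check_row(pred_str, image_size=1024):
    """Return True if every box has a valid score and lies inside the image."""
    for conf, x, y, w, h in parse_prediction_string(pred_str):
        if not 0.0 <= conf <= 1.0:
            return False
        if w <= 0 or h <= 0 or x < 0 or y < 0:
            return False
        if x + w > image_size or y + h > image_size:
            return False
    return True

# A row in the same layout predict() writes, including the leading space
good_row = " 0.9312 100.0 200.0 50.0 60.0 0.8120 1000.0 10.0 20.0 30.0"
bad_row = "0.9 1000.0 10.0 40.0 30.0"      # box runs past the right edge

print(check_row(good_row), check_row(bad_row))
```

An empty string passes the check, which matches the case of an image with no detections.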

## End Note

There are many hyper-parameters to consider, and they must be configured to suit the training samples and the end goal. Here are a few that you need to set carefully in the `configuration` when using this implementation.

- RPN_ANCHOR_SCALES
- TRAIN_ROIS_PER_IMAGE
- MAX_GT_INSTANCES
- LOSS_WEIGHTS 
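
Of these, `LOSS_WEIGHTS` is the least obvious. To my understanding it is a dictionary keyed by loss name, and the training objective is a weighted sum of the per-head losses. The sketch below illustrates that idea only: the loss values are made up, `total_loss` is a hypothetical helper, the down-weighting of the mask loss is an untested example, and any regularization term is left out.

```python
# Default-style weights: every loss term counts equally
LOSS_WEIGHTS = {
    "rpn_class_loss": 1.0,
    "rpn_bbox_loss": 1.0,
    "mrcnn_class_loss": 1.0,
    "mrcnn_bbox_loss": 1.0,
    "mrcnn_mask_loss": 1.0,
}

def total_loss(components, weights):
    """Weighted sum of the per-head losses; missing weights default to 1."""
    return sum(value * weights.get(name, 1.0) for name, value in components.items())

# Made-up loss values for one batch
components = {"rpn_class_loss": 0.02, "rpn_bbox_loss": 0.30,
              "mrcnn_class_loss": 0.20, "mrcnn_bbox_loss": 0.40,
              "mrcnn_mask_loss": 0.25}

# Hypothetical setting that halves the influence of the mask head
box_focused = dict(LOSS_WEIGHTS, mrcnn_mask_loss=0.5)

default_total = total_loss(components, LOSS_WEIGHTS)
focused_total = total_loss(components, box_focused)
print(round(default_total, 3), round(focused_total, 3))
```

Since the masks here are only filled boxes, shifting weight away from the mask head is one experiment to try; whether it helps is an open question.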

Though we don't need to predict masks precisely, I think they can be useful for better generalization of the model. Currently the model's results are far from the top scores; I'll experiment more and update the notebook accordingly.

Running the notebook is somewhat costly, so I've opened a GitHub repository to work on improving the model's generalization. I will add new features as my experiments progress and update the kernel with the finalized ones. If this kernel interests you, please feel free to follow the project on GitHub, but don't fork it until the competition ends. The repository is here: https://github.com/innat/GWD-MaskRCNN