# Instance Segmentation

In this tutorial, you will learn:
- the basic structure of Mask R-CNN.
- how to perform inference with an MMDetection detector.
- how to train a new instance segmentation model on a new dataset.

Let's start!


## Install MMDetection

In [None]:
# Check nvcc version
!nvcc -V
# Check GCC version
!gcc --version

In [None]:
# Install dependencies (use cu111 because Colab has CUDA 11.1)
!pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html

# Install mmcv-full so that we can use CUDA operators
!pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html

# Install mmdetection
!rm -rf mmdetection
!git clone https://github.com/open-mmlab/mmdetection.git
%cd mmdetection

!pip install -e .

In [None]:
# Check PyTorch installation
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())

# Check MMDetection installation
import mmdet
print(mmdet.__version__)

# Check mmcv installation
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print(get_compiling_cuda_version())
print(get_compiler_version())

## Perform Inference with an MMDetection Detector

In [None]:
!mkdir checkpoints
!wget -c https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth \
      -O checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth

In [None]:
import mmcv
from mmcv.runner import load_checkpoint

from mmdet.apis import inference_detector, show_result_pyplot
from mmdet.models import build_detector

## Train a Detector on a Customized Dataset

To train a new detector, there are usually three things to do:
1. Support a new dataset
2. Modify the config
3. Train a new detector

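For the first step, the file names used later in this tutorial (`annotation_coco.json`) suggest that the dataset is already in COCO format, so no custom dataset class should be needed. As a rough sketch of what such an annotation file contains, the snippet below builds a tiny COCO-style dictionary and runs a few consistency checks on it. All values (file name, image size, polygon) are made up for illustration, and `check_coco` is a hypothetical helper, not part of MMDetection.

```python
import json

# A minimal COCO-style annotation structure (illustrative values only).
coco = {
    "images": [
        {"id": 1, "file_name": "example.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # bbox is [x, y, width, height] in pixels
            "bbox": [100.0, 120.0, 50.0, 40.0],
            "area": 2000.0,
            # one polygon as a flat list: [x1, y1, x2, y2, ...]
            "segmentation": [[100, 120, 150, 120, 150, 160, 100, 160]],
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "walrus"}],
}


def check_coco(data):
    """Return a list of problems found in a COCO-style dict.

    Assumes polygon masks; RLE-encoded masks are not handled here.
    """
    problems = []
    image_ids = {img["id"] for img in data["images"]}
    category_ids = {cat["id"] for cat in data["categories"]}
    for ann in data["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        if len(ann["bbox"]) != 4:
            problems.append(f"annotation {ann['id']}: bbox must have 4 values")
        for poly in ann["segmentation"]:
            # A polygon needs at least 3 points (6 coordinates) in x/y pairs.
            if len(poly) < 6 or len(poly) % 2 != 0:
                problems.append(f"annotation {ann['id']}: malformed polygon")
    return problems


problems = check_coco(coco)
print(problems or "annotation structure looks consistent")

# The structure round-trips through JSON, which is how it is stored on disk.
roundtrip = json.loads(json.dumps(coco))
```

Running the same kind of check on the real annotation file (loaded with `json.load`) is a cheap way to catch mismatched IDs before training starts.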


In [None]:
%cd ..

In [None]:
%ls ../input/morzh-coco/morzh_coco/

In [None]:
%mkdir ./mmdetection/morzh_coco/
%cp -av ../input/morzh-coco/morzh_coco/ ./mmdetection/morzh_coco/

In [None]:
%cd mmdetection

In [None]:
%ls ./morzh_coco/

In [None]:
# Let's take a look at one of the dataset images
import mmcv
import matplotlib.pyplot as plt

img = mmcv.imread('./morzh_coco/morzh_coco/train/90.jpg')
plt.figure(figsize=(15, 10))
plt.imshow(mmcv.bgr2rgb(img))
plt.show()

### Modify the config

In the next step, we need to modify the config for training.
To speed up the process, we fine-tune a pre-trained detector instead of training from scratch.

In [None]:
from mmcv import Config
cfg = Config.fromfile('./configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py')

Given a config that trains Mask R-CNN on the COCO dataset, we need to modify some values to use it for training on the walrus dataset.

In [None]:
from mmdet.apis import set_random_seed

# Modify dataset type and path
cfg.dataset_type = 'CocoDataset'

cfg.data.test.ann_file = './morzh_coco/morzh_coco/val/annotation_coco.json'
cfg.data.test.img_prefix = './morzh_coco/morzh_coco/val/'
cfg.data.test.classes = ('walrus',)

cfg.data.train.ann_file = './morzh_coco/morzh_coco/train/annotation_coco.json'
cfg.data.train.img_prefix = './morzh_coco/morzh_coco/train/'
cfg.data.train.classes = ('walrus',)

# cfg.data.workers_per_gpu = 0


cfg.data.val.ann_file = './morzh_coco/morzh_coco/val/annotation_coco.json'
cfg.data.val.img_prefix = './morzh_coco/morzh_coco/val/'
cfg.data.val.classes = ('walrus',)

# modify num classes of the model in box head and mask head
cfg.model.roi_head.bbox_head.num_classes = 1
cfg.model.roi_head.mask_head.num_classes = 1
# for i in range(len(cfg.model.roi_head.bbox_head)):
#     cfg.model.roi_head.bbox_head[i].num_classes = 1

# We can still use the pre-trained Mask R-CNN model to obtain higher performance
cfg.load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
# cfg.load_from = 'exps/epoch_20.pth'

# Set up working dir to save files and logs.
cfg.work_dir = './exps'

# The original learning rate (LR) is set for 8-GPU training.
# With a single GPU it is usually divided by 8; the override below is left
# commented out, so the base LR is used as-is.
# cfg.optimizer.lr = 1e-4
# cfg.lr_config.warmup = None
cfg.log_config.interval = 30

# Evaluate only every 10 epochs to reduce the time spent on evaluation
cfg.evaluation.interval = 10
# We can set the checkpoint saving interval to reduce the storage cost
cfg.checkpoint_config.interval = 10

cfg.auto_scale_lr = dict(enable=False, base_batch_size=4)
cfg.runner = dict(type='EpochBasedRunner', max_epochs=60)

# Set the seed so that the results are more reproducible
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)
cfg.device = 'cuda'

# We can also use TensorBoard to log the training process
cfg.log_config.hooks = [
    dict(type='TextLoggerHook'),
    dict(type='TensorboardLoggerHook')]


# cfg.model.train_cfg.rpn.assigner.pos_iou_thr=0.5
# cfg.model.train_cfg.rpn_proposal.nms.iou_threshold=0.5
# cfg.model.train_cfg.rpn_proposal.nms_pre=3000
# cfg.model.train_cfg.rpn_proposal.max_per_img=2000

# CHANGE ONLY THE TEST SETTINGS
# cfg.model.test_cfg.rpn.nms_pre=2000
# cfg.model.test_cfg.rpn.max_per_img=2000
cfg.model.test_cfg.rcnn.max_per_img=1000

# cfg.model.test_cfg.rpn.nms.iou_threshold=0.5


# Have a look at the final config used for training
print(f'Config:\n{cfg.pretty_text}')

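The comment about dividing the learning rate by 8 refers to the linear scaling rule: the learning rate is scaled in proportion to the total batch size. A minimal sketch of that arithmetic follows. `scale_lr` is a hypothetical helper, and the reference values (a learning rate of 0.02 tuned for 8 GPUs with 2 images each) are assumptions about the base config that should be checked against the printed config.

```python
def scale_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Scale a learning rate linearly with the total batch size."""
    batch_size = num_gpus * samples_per_gpu
    return base_lr * batch_size / base_batch_size


# Assumed reference setup: lr=0.02 tuned for 8 GPUs x 2 images = batch size 16.
single_gpu_lr = scale_lr(base_lr=0.02, base_batch_size=16,
                         num_gpus=1, samples_per_gpu=2)
print(single_gpu_lr)  # one eighth of the base learning rate
```

In the config cell above the override is commented out, so the base learning rate is used unchanged. To apply a scaled value, one would assign it to `cfg.optimizer.lr`.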


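The overrides applied to `cfg` in the notebook could also live in a standalone config file. The sketch below assumes MMDetection 2.x config inheritance through `_base_`. The file location (a new subfolder under `configs/`) is hypothetical, and only a subset of the values set above is repeated.

```python
# Hypothetical file: configs/walrus/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_walrus.py
# Inherit everything from the COCO config, then override selected fields.
_base_ = '../mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py'

# One foreground class in both the box head and the mask head.
model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=1),
        mask_head=dict(num_classes=1)))

# Dataset paths and class names, mirroring the notebook cell.
classes = ('walrus',)
data_root = './morzh_coco/morzh_coco/'
data = dict(
    train=dict(
        classes=classes,
        ann_file=data_root + 'train/annotation_coco.json',
        img_prefix=data_root + 'train/'),
    val=dict(
        classes=classes,
        ann_file=data_root + 'val/annotation_coco.json',
        img_prefix=data_root + 'val/'),
    test=dict(
        classes=classes,
        ann_file=data_root + 'val/annotation_coco.json',
        img_prefix=data_root + 'val/'))

# Schedule, evaluation, and checkpointing, as in the notebook cell.
runner = dict(type='EpochBasedRunner', max_epochs=60)
evaluation = dict(interval=10)
checkpoint_config = dict(interval=10)

# Start from the COCO-pretrained weights and write outputs to ./exps.
load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
work_dir = './exps'
```

Such a file could then be loaded with `Config.fromfile` in the same way as the COCO config, which keeps the experiment settings under version control rather than scattered across notebook cells.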
### Train a new detector

Finally, let's initialize the dataset and detector, then train a new detector!

In [None]:
from mmdet.datasets import build_dataset
from mmdet.models import build_detector
from mmdet.apis import train_detector
import os.path as osp


# Build dataset
datasets = [build_dataset(cfg.data.train)]

# Build the detector
model = build_detector(cfg.model)

# Add an attribute for visualization convenience
model.CLASSES = datasets[0].CLASSES

# Create work_dir
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)

## Test the Trained Detector

After finetuning the detector, let's visualize the prediction results!

In [None]:
# img = mmcv.imread('./morzh_coco/train/11.jpg')
img = mmcv.imread('./morzh_coco/morzh_coco/val/DJI_0105.jpg')

model.cfg = cfg
result = inference_detector(model, img)
show_result_pyplot(model, img, result)
# out_file=None

In [None]:
bbox_result, segm_result = result

In [None]:
segm_result[0][0].shape

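To make the structure of `result` more concrete: as far as we understand, for a Mask R-CNN model in MMDetection 2.x, `bbox_result` is a list with one entry per class, each holding rows of `[x1, y1, x2, y2, score]`, and `segm_result` is a list with one entry per class holding one boolean mask of shape `(H, W)` per detection. The sketch below mimics that layout with plain nested lists, so it runs without MMDetection, and keeps only the confident detections. `filter_detections` and all numbers are made up for illustration.

```python
def filter_detections(bbox_result, segm_result, score_thr=0.5):
    """Keep detections whose score is at least score_thr.

    Returns a list of (class_index, box, score, mask_area_in_pixels).
    """
    kept = []
    for class_idx, (boxes, masks) in enumerate(zip(bbox_result, segm_result)):
        for box, mask in zip(boxes, masks):
            x1, y1, x2, y2, score = box
            if score >= score_thr:
                # The mask area is the number of True pixels.
                area = sum(bool(pixel) for row in mask for pixel in row)
                kept.append((class_idx, (x1, y1, x2, y2), score, area))
    return kept


# Synthetic stand-in for `result` with one class and two detections
# on a 3x4 image (real masks have the full image height and width).
bbox_result = [[
    [0.0, 0.0, 2.0, 2.0, 0.9],
    [1.0, 1.0, 3.0, 2.0, 0.2],
]]
segm_result = [[
    [[True, True, False, False],
     [True, True, False, False],
     [False, False, False, False]],
    [[False, False, False, False],
     [False, True, True, True],
     [False, False, False, False]],
]]

kept = filter_detections(bbox_result, segm_result, score_thr=0.5)
print(kept)
```

With the real NumPy outputs, `mask.sum()` gives the same area more efficiently than the nested loop used here.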
## What to Do Next?

So far, we have learnt how to test and train Mask R-CNN. To further explore the segmentation task, you could do several other things as shown below:

- Try cascade methods, e.g., [Cascade Mask R-CNN](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) and [HTC](https://github.com/open-mmlab/mmdetection/tree/master/configs/htc) in [MMDetection model zoo](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/model_zoo.md). They are powerful detectors that are ranked high in many benchmarks, e.g., COCO dataset.
- Try single-stage methods, e.g., [K-Net](https://github.com/ZwwWayne/K-Net) and [Dense-RepPoints](https://github.com/justimyhxu/Dense-RepPoints). These two algorithms are based on MMDetection. Box-free instance segmentation is a new trend in the instance segmentation community.
- Try semantic segmentation. Semantic segmentation is also a popular task with wide applications. You can explore [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/); we also provide a [colab tutorial](https://github.com/open-mmlab/mmsegmentation/blob/master/demo/MMSegmentation_Tutorial.ipynb) for semantic segmentation using MMSegmentation.
