<a href="https://colab.research.google.com/github/tep00018/Panoptic_Segmentation/blob/main/OpenMMLab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Instance Segmentation**

In this tutorial, you will learn:

- the basic structure of Mask R-CNN.
- to perform inference with an MMDetection detector.
- to train a new instance segmentation model with a new dataset.

Let's start!

In [2]:
!pip install mmdet

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting mmdet
  Downloading mmdet-3.0.0-py3-none-any.whl (1.7 MB)
Collecting terminaltables (from mmdet)
  Downloading terminaltables-3.1.10-py2.py3-none-any.whl (15 kB)
Installing collected packages: terminaltables, mmdet
Successfully installed mmdet-3.0.0 terminaltables-3.1.10


In [5]:
!pip install mmcv

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting mmcv
  Downloading mmcv-2.0.0.tar.gz (473 kB)
  Preparing metadata (setup.py) ... done
Collecting addict (from mmcv)
  Downloading addict-2.4.0-py3-none-any.whl (3.8 kB)
Collecting mmengine>=0.2.0 (from mmcv)
  Downloading mmengine-0.7.4-py3-none-any.whl (374 kB)
Collecting yapf (from mmcv)
  Downloading yapf-0.40.0-py3-none-any.whl (250 kB)
Collecting importlib-metadata>=6.6.0 (from yapf->mmcv)
  Downloading importlib_metadata-6.6.0-py3-none-any.whl (22 kB)
Collecting platformdirs>=3.5.1 (from yapf->mmc

In [6]:
# Check PyTorch installation
import torch, torchvision
print("torch version:",torch.__version__, "cuda:",torch.cuda.is_available())

# Check MMDetection installation
import mmdet
print("mmdetection:",mmdet.__version__)

# Check mmcv installation
import mmcv
print("mmcv:",mmcv.__version__)

# Check mmengine installation
import mmengine
print("mmengine:",mmengine.__version__)

torch version: 2.0.1+cu118 cuda: False
mmdetection: 3.0.0
mmcv: 2.0.0
mmengine: 0.7.4


**Perform Inference with An MMDetection Detector**

A two-stage detector

In this tutorial, we use Mask R-CNN, a simple two-stage detector, as an example.

The high-level architecture of Mask R-CNN is shown in the following picture. More details can be found in the paper.

Mask R-CNN adds a mask branch to the original Faster R-CNN. It also uses RoIAlign, a more precise version of RoIPooling, for RoI feature extraction to improve performance.
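
To make the difference between RoIPooling and RoIAlign concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions: it samples a single point on a tiny 2x2 feature map, whereas the real operators work on whole RoI bins (RoIPooling max-pools quantized bins, RoIAlign averages several bilinearly interpolated samples per bin). The function names `roi_pool_sample` and `roi_align_sample` are hypothetical and are not part of MMDetection.

```python
import math

def roi_pool_sample(feature, y, x):
    """RoIPooling-style lookup: quantize the coordinate (here by flooring)."""
    return feature[int(math.floor(y))][int(math.floor(x))]

def roi_align_sample(feature, y, x):
    """RoIAlign-style lookup: bilinear interpolation at the exact coordinate."""
    h, w = len(feature), len(feature[0])
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    # Clamp the neighbouring cell so samples on the border stay in range.
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

# A toy 2x2 feature map (illustrative values only).
feature = [[0.0, 10.0],
           [20.0, 30.0]]

pooled = roi_pool_sample(feature, 0.5, 0.5)    # snaps to cell (0, 0)
aligned = roi_align_sample(feature, 0.5, 0.5)  # blends all four cells
print(pooled, aligned)
```

The quantized lookup discards the fractional part of the coordinate, while the interpolated lookup keeps it. That sub-pixel misalignment is what RoIAlign removes, and it matters most for pixel-level outputs such as masks.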

In [7]:
!mim download mmdet --config mask-rcnn_r50-caffe_fpn_ms-poly-3x_coco --dest ./checkpoints

/bin/bash: mim: command not found
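
The `mim: command not found` message above means the `mim` executable is not on the `PATH`. The `mim` command is provided by the `openmim` package, which the earlier cells did not install. As a small sketch, the check below looks for the executable before attempting a download; the helper name `find_cli` is hypothetical.

```python
import shutil

def find_cli(name):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)

mim_path = find_cli("mim")
if mim_path is None:
    # Assumption: installing openmim (e.g. `pip install -U openmim`) provides `mim`.
    print("mim not found; install openmim before running `mim download`")
else:
    print("mim found at", mim_path)
```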


In [8]:
import mmcv
import mmengine
from mmdet.apis import init_detector, inference_detector
from mmdet.utils import register_all_modules
# Choose to use a config and initialize the detector
config_file = 'configs/mask_rcnn/mask-rcnn_r50-caffe_fpn_ms-poly-3x_coco.py'
# Setup a checkpoint file to load
checkpoint_file = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'

# register all modules in mmdet into the registries
register_all_modules()

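# build the model from a config file and a checkpoint file;
# fall back to the CPU when no GPU is available (the check above reported cuda: False)
import torch
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
model = init_detector(config_file, checkpoint_file, device=device)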

FileNotFoundError: ignored

<output> local loads checkpoint from path: checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth


Printing the model shows that it consists of the components described earlier. It uses ResNet as its CNN backbone and has an RPN head and an RoI head. The RoI head includes a box head and a mask head. In addition, the model has a module, named neck, directly after the CNN backbone: a feature pyramid network (FPN) that enhances the multi-scale features.
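
The component layout described above can be sketched as a nested config dict. This is an abbreviated illustration, not a copy of the shipped config file: the `type` names and FPN channel values are what a ResNet-50 FPN Mask R-CNN config typically uses and should be treated as assumptions. The helper `describe` is hypothetical.

```python
# Abbreviated, illustrative config (assumed values, most fields omitted).
model_cfg = dict(
    type='MaskRCNN',
    backbone=dict(type='ResNet', depth=50),
    neck=dict(type='FPN', in_channels=[256, 512, 1024, 2048],
              out_channels=256, num_outs=5),
    rpn_head=dict(type='RPNHead'),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_head=dict(type='Shared2FCBBoxHead'),
        mask_head=dict(type='FCNMaskHead'),
    ),
)

def describe(cfg, indent=0):
    """List each sub-module with its type, indenting nested modules."""
    lines = []
    for key, value in cfg.items():
        if isinstance(value, dict):
            lines.append(' ' * indent + f"{key}: {value.get('type', '?')}")
            lines.extend(describe(value, indent + 2))
    return lines

summary = describe(model_cfg)
print('\n'.join(summary))
```

The nesting mirrors the text: backbone, neck, and RPN head sit at the top level, while the box head and mask head live inside the RoI head.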

Inference with the detector

Once the model is created and loaded, let's see how good it is. We use the high-level API inference_detector implemented in MMDetection, which is designed to simplify the inference process. The details of the code can be found here.

In [None]:
# Use the detector to do inference
image = mmcv.imread('demo/demo.jpg',channel_order='rgb')
result = inference_detector(model, image)
print(result)
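
Printing the raw result can be hard to read. As a rough sketch, the hypothetical helper below keeps only predictions above a score threshold and pairs them with class names. It assumes the object passed in exposes `scores` and `labels` sequences, as `result.pred_instances` does in MMDetection 3.x. Stand-in data is used here so the snippet runs without a model.

```python
from types import SimpleNamespace

def summarize_predictions(pred_instances, class_names, score_thr=0.5):
    """Return (class_name, score) pairs for predictions at or above score_thr."""
    kept = []
    for score, label in zip(pred_instances.scores, pred_instances.labels):
        if float(score) >= score_thr:
            kept.append((class_names[int(label)], round(float(score), 3)))
    return kept

# Stand-in predictions and class names (illustrative values only).
fake = SimpleNamespace(scores=[0.92, 0.30, 0.75], labels=[0, 2, 1])
summary = summarize_predictions(fake, ['person', 'car', 'bench'])
print(summary)
```

With a real detector, the call would presumably look like `summarize_predictions(result.pred_instances, model.dataset_meta['classes'])`.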

Plot the result

In [None]:
from mmdet.registry import VISUALIZERS
# init visualizer(run the block only once in jupyter notebook)
visualizer = VISUALIZERS.build(model.cfg.visualizer)
# the dataset_meta is loaded from the checkpoint and
# then passed to the model in init_detector
visualizer.dataset_meta = model.dataset_meta

In [None]:
# show the results
visualizer.add_datasample(
    'result',
    image,
    data_sample=result,
    draw_gt=False,
    wait_time=0,
)
visualizer.show()

**Train a Detector on A Customized Dataset**

To train a new detector, there are usually three things to do:

1. Support a new dataset
2. Modify the config
3. Train a new detector

Support a new dataset

There are three ways to support a new dataset in MMDetection:

1. Reorganize the dataset into a COCO format
2. Reorganize the dataset into a middle format
3. Implement a new dataset

We recommend the first two methods, as they are usually easier than the third.

In this tutorial, we give an example that converts the data into COCO format, because MMDetection currently supports evaluating mask AP only for datasets in COCO format. Other methods and more advanced usage can be found in the documentation.
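
For orientation, a COCO-format annotation file is a JSON object with `images`, `annotations`, and `categories` lists. The sketch below builds one annotation from a polygon given as x and y coordinate lists. The helper name `polygon_to_coco_annotation`, the file name `example.jpg`, and the coordinates are hypothetical. The area is computed from the polygon with the shoelace formula, which is one reasonable choice and not necessarily what a given conversion script uses.

```python
def polygon_to_coco_annotation(ann_id, image_id, category_id, xs, ys):
    """Build one COCO annotation dict from polygon vertex coordinates."""
    # COCO stores a polygon as a flat list: [x1, y1, x2, y2, ...].
    polygon = [coord for point in zip(xs, ys) for coord in point]
    x_min, y_min = min(xs), min(ys)
    width, height = max(xs) - x_min, max(ys) - y_min
    # Shoelace formula for the polygon area.
    n = len(xs)
    area = abs(sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                   for i in range(n))) / 2
    return dict(
        id=ann_id,
        image_id=image_id,
        category_id=category_id,
        bbox=[x_min, y_min, width, height],  # COCO boxes are [x, y, w, h]
        area=area,
        segmentation=[polygon],
        iscrowd=0,
    )

ann = polygon_to_coco_annotation(0, 0, 0, xs=[10, 50, 50, 10], ys=[20, 20, 40, 40])
coco = dict(
    images=[dict(id=0, file_name='example.jpg', height=100, width=200)],
    annotations=[ann],
    categories=[dict(id=0, name='balloon')],
)
print(ann['bbox'], ann['area'])
```

Serializing `coco` with `json.dump` would produce a file in the layout that COCO-style dataset loaders expect.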

First, let's download the balloon dataset.