open-mmlab · ZwwWayne · Nov 1, 2020 · Oct 1, 2020 · Oct 5, 2020 · Oct 6, 2020
diff --git a/docs/1_exist_data_model.md b/docs/1_exist_data_model.md
diff --git a/docs/2_new_data_model.md b/docs/2_new_data_model.md
@@ -0,0 +1,263 @@
+# 2: Train with customized datasets
+
+In this note, you will learn how to run inference with, test, and train predefined models on customized datasets. We use the [balloon dataset](https://github.com/matterport/Mask_RCNN/tree/master/samples/balloon) as an example to describe the whole process.
+
+The basic steps are as below:
+
+1. Prepare the customized dataset
+2. Prepare a config
+3. Train, test, and run inference with models on the customized dataset.
+
+## Prepare the customized dataset
+
+There are three ways to support a new dataset in MMDetection:
+
+1. Reorganize the dataset into COCO format.
+2. Reorganize the dataset into a middle format.
+3. Implement a new dataset.
+
+We usually recommend the first two methods, which are easier than the third.
+
+In this note, we give an example for converting the data into COCO format.
+
+**Note**: For now, MMDetection only supports evaluating mask AP on datasets in COCO format.
+So for instance segmentation tasks, users should convert the data into COCO format.
+
+### COCO annotation format
+
+The necessary keys of the COCO format for instance segmentation are listed below. For complete details, please refer to the [COCO data format documentation](https://cocodataset.org/#format-data).
+
+```json
+{
+    "images": [image],
+    "annotations": [annotation],
+    "categories": [category]
+}
+
+
+image = {
+    "id": int,
+    "width": int,
+    "height": int,
+    "file_name": str,
+}
+
+annotation = {
+    "id": int,
+    "image_id": int,
+    "category_id": int,
+    "segmentation": RLE or [polygon],
+    "area": float,
+    "bbox": [x,y,width,height],
+    "iscrowd": 0 or 1,
+}
+
+category = {
+    "id": int,
+    "name": str,
+    "supercategory": str,
+}
+```
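As a quick sanity check on the format described above, the following is a minimal sketch of a validator for a COCO-style dictionary. The function name `check_coco_dict` and the `REQUIRED_KEYS` table are hypothetical helpers, not part of MMDetection or the COCO API, and the sketch assumes `supercategory` can be treated as optional.

```python
# Required keys per record type, taken from the listing above.
# Assumption: "supercategory" is treated as optional in this sketch.
REQUIRED_KEYS = {
    "images": {"id", "width", "height", "file_name"},
    "annotations": {"id", "image_id", "category_id", "segmentation",
                    "area", "bbox", "iscrowd"},
    "categories": {"id", "name"},
}


def check_coco_dict(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        if section not in coco:
            problems.append(f"missing top-level key: {section}")
            continue
        for i, record in enumerate(coco[section]):
            missing = keys - record.keys()
            if missing:
                problems.append(f"{section}[{i}] lacks {sorted(missing)}")

    # Every annotation must point at an existing image and category.
    image_ids = {img.get("id") for img in coco.get("images", [])}
    category_ids = {cat.get("id") for cat in coco.get("categories", [])}
    for i, ann in enumerate(coco.get("annotations", [])):
        if ann.get("image_id") not in image_ids:
            problems.append(f"annotations[{i}] has unknown image_id")
        if ann.get("category_id") not in category_ids:
            problems.append(f"annotations[{i}] has unknown category_id")
    return problems


# A tiny hand-written example with one image and one polygon annotation.
sample = {
    "images": [{"id": 0, "width": 4, "height": 3, "file_name": "a.jpg"}],
    "annotations": [{
        "id": 0, "image_id": 0, "category_id": 0,
        "segmentation": [[0.5, 0.5, 4.5, 0.5, 0.5, 3.5]],
        "area": 12, "bbox": [0, 0, 4, 3], "iscrowd": 0,
    }],
    "categories": [{"id": 0, "name": "balloon"}],
}
print(check_coco_dict(sample))
```

Running a check like this on a converted annotation file before training can surface missing keys or dangling ids early.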
+
+Assume we use the balloon dataset.
+After downloading the data, we need to implement a function that converts its annotations into COCO format. Then we can use the existing `CocoDataset` to load the data and perform training and evaluation.
+
+If you take a look at the dataset, you will find the dataset format is as below:
+
+```json
+{'base64_img_data': '',
+ 'file_attributes': {},
+ 'filename': '34020010494_e5cb88e1c4_k.jpg',
+ 'fileref': '',
+ 'regions': {'0': {'region_attributes': {},
+   'shape_attributes': {'all_points_x': [1020,
+     1000,
+     994,
+     1003,
+     1023,
+     1050,
+     1089,
+     1134,
+     1190,
+     1265,
+     1321,
+     1361,
+     1403,
+     1428,
+     1442,
+     1445,
+     1441,
+     1427,
+     1400,
+     1361,
+     1316,
+     1269,
+     1228,
+     1198,
+     1207,
+     1210,
+     1190,
+     1177,
+     1172,
+     1174,
+     1170,
+     1153,
+     1127,
+     1104,
+     1061,
+     1032,
+     1020],
+    'all_points_y': [963,
+     899,
+     841,
+     787,
+     738,
+     700,
+     663,
+     638,
+     621,
+     619,
+     643,
+     672,
+     720,
+     765,
+     800,
+     860,
+     896,
+     942,
+     990,
+     1035,
+     1079,
+     1112,
+     1129,
+     1134,
+     1144,
+     1153,
+     1166,
+     1166,
+     1150,
+     1136,
+     1129,
+     1122,
+     1112,
+     1084,
+     1037,
+     989,
+     963],
+    'name': 'polygon'}}},
+ 'size': 1115004}
+```
+
+The annotation is a JSON file in which each key maps to all the annotations of one image.
+The code to convert the balloon dataset into COCO format is as below.
+
+```python
+import os.path as osp
+
+import mmcv
+
+
+def convert_balloon_to_coco(ann_file, out_file, image_prefix):
+    # Each value in the loaded dict holds all annotations of one image.
+    data_infos = mmcv.load(ann_file)
+
+    annotations = []
+    images = []
+    obj_count = 0
+    for idx, v in enumerate(mmcv.track_iter_progress(data_infos.values())):
+        filename = v['filename']
+        img_path = osp.join(image_prefix, filename)
+        # Read the image only to obtain its height and width.
+        height, width = mmcv.imread(img_path).shape[:2]
+
+        images.append(dict(
+            id=idx,
+            file_name=filename,
+            height=height,
+            width=width))
+
+        for obj in v['regions'].values():
+            assert not obj['region_attributes']
+            obj = obj['shape_attributes']
+            px = obj['all_points_x']
+            py = obj['all_points_y']
+            # Shift each vertex by 0.5 and flatten to [x0, y0, x1, y1, ...].
+            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
+            poly = [p for x in poly for p in x]
+
+            x_min, y_min, x_max, y_max = (
+                min(px), min(py), max(px), max(py))
+
+            data_anno = dict(
+                image_id=idx,
+                id=obj_count,
+                category_id=0,
+                # COCO boxes are [x, y, width, height].
+                bbox=[x_min, y_min, x_max - x_min, y_max - y_min],
+                # Note: this is the bounding-box area, not the polygon area.
+                area=(x_max - x_min) * (y_max - y_min),
+                segmentation=[poly],
+                iscrowd=0)
+            annotations.append(data_anno)
+            obj_count += 1
+
+    coco_format_json = dict(
+        images=images,
+        annotations=annotations,
+        categories=[{'id': 0, 'name': 'balloon'}])
+    mmcv.dump(coco_format_json, out_file)
+```
+
+Using the function above, users can convert the annotation file into COCO format, and then use `CocoDataset` to train and evaluate the model.
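The geometry in the converter can be tried in isolation, without `mmcv` or the dataset. The sketch below repeats the polygon flattening and the bounding-box computation on a small triangle, and adds a shoelace-formula polygon area for comparison. The helper names are hypothetical, and the shoelace area is an alternative shown only for illustration: the converter stores the bounding-box area in `area`.

```python
def flatten_polygon(xs, ys, offset=0.5):
    """Interleave x and y into [x0, y0, x1, y1, ...], shifting by `offset`."""
    flat = []
    for x, y in zip(xs, ys):
        flat.extend((x + offset, y + offset))
    return flat


def bbox_xywh(xs, ys):
    """Return the enclosing box as [x, y, width, height] (COCO convention)."""
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def shoelace_area(xs, ys):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    n = len(xs)
    twice_area = 0
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the polygon
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


# A right triangle with legs of length 4 and 3.
xs, ys = [0, 4, 0], [0, 0, 3]
box = bbox_xywh(xs, ys)
print(flatten_polygon(xs, ys))   # flattened, shifted vertices
print(box, box[2] * box[3])      # box and its area
print(shoelace_area(xs, ys))     # polygon area
```

For this triangle the box area is twice the polygon area, which shows how much the two definitions of `area` can differ for non-rectangular objects.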
+
+
+## Prepare a config
+
+The second step is to prepare a config so that the dataset can be loaded successfully. Assume that we want to use Mask R-CNN with FPN to train the detector on the balloon dataset, and that the config is placed under the directory `configs/ballon/` and named `mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py`. The config is as below.
+
+```python
+# The new config inherits a base config to highlight the necessary modification
+_base_ = '../mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py'
+
+# We also need to change the num_classes in head to match the dataset's annotation
+model = dict(
+    roi_head=dict(
+        bbox_head=dict(num_classes=1),
+        mask_head=dict(num_classes=1)))
+
+# Modify dataset related settings
+dataset_type = 'CocoDataset'
+classes = ('balloon',)
+data = dict(
+    train=dict(
+        img_prefix='balloon/train/',
+        classes=classes,
+        ann_file='balloon/train/annotation_coco.json'),
+    val=dict(
+        img_prefix='balloon/val/',
+        classes=classes,
+        ann_file='balloon/val/annotation_coco.json'),
+    test=dict(
+        img_prefix='balloon/val/',
+        classes=classes,
+        ann_file='balloon/val/annotation_coco.json'))
+
+# We can use the pre-trained Mask RCNN model to obtain higher performance
+load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
+```
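The config above only lists the fields that differ from the base config, and the rest is inherited. The sketch below illustrates that idea with a plain recursive dictionary merge. It is a simplified stand-in and not the actual `mmcv` implementation, which also handles features such as `_delete_`. The function name `merge_config` and the trimmed-down `base` dictionary are assumptions made for illustration.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend and merge field by field.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the child's value replaces the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed-down stand-in for the base config (illustrative values only).
base = dict(
    model=dict(
        roi_head=dict(
            bbox_head=dict(type='Shared2FCBBoxHead', num_classes=80),
            mask_head=dict(type='FCNMaskHead', num_classes=80))))

# The child config only states what changes.
child = dict(
    model=dict(
        roi_head=dict(
            bbox_head=dict(num_classes=1),
            mask_head=dict(num_classes=1))))

merged = merge_config(base, child)
print(merged['model']['roi_head']['bbox_head'])
```

Under this model, setting `num_classes=1` in the child keeps every other field of the heads unchanged, which is why the balloon config can stay so short.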
+
+## Train a new model
+
+To train a model with the new config, you can simply run
+
+```shell
+python tools/train.py configs/ballon/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py
+```
+
+For more detailed usage, please refer to [Case 1](1_exist_data_model.md).
+
+## Test and inference
+
+To test the trained model, you can simply run
+
+```shell
+python tools/test.py configs/ballon/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py work_dirs/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon/latest.pth --eval bbox segm
+```
+
+For more detailed usage, please refer to [Case 1](1_exist_data_model.md).