<a href="https://colab.research.google.com/github/ZwwWayne/mmdetection/blob/update-colab/demo/MMDet_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import copy
import os.path as osp

import mmcv
import numpy as np

from mmdet.datasets.builder import DATASETS
from mmdet.datasets.custom import CustomDataset

@DATASETS.register_module()
class VisDroneDataset(CustomDataset):

    CLASSES = ('car', 'truck', 'bus')

    def load_annotations(self, ann_file):
        cat2label = {k: i for i, k in enumerate(self.CLASSES)}
        # load image list from file
        image_list = mmcv.list_from_file(ann_file)
    
        data_infos = []
        # convert annotations to middle format
        for image_id in image_list:
            filename = f'{self.img_prefix}/{image_id}.jpg'
            image = mmcv.imread(filename)
            height, width = image.shape[:2]
    
            data_info = dict(filename=f'{image_id}.jpg', width=width, height=height)
    
            # load annotations
            label_prefix = self.img_prefix.replace('images', 'VisDronetxt')
            lines = mmcv.list_from_file(osp.join(label_prefix, f'{image_id}.txt'))
    
            content = [line.strip().split(' ') for line in lines]
            bbox_names = [x[0] for x in content]
            bboxes = [[float(info) for info in x[4:8]] for x in content]
    
            gt_bboxes = []
            gt_labels = []
            gt_bboxes_ignore = []
            gt_labels_ignore = []
    
            # boxes whose class is not in CLASSES go to the ignore lists
            for bbox_name, bbox in zip(bbox_names, bboxes):
                if bbox_name in cat2label:
                    gt_labels.append(cat2label[bbox_name])
                    gt_bboxes.append(bbox)
                else:
                    gt_labels_ignore.append(-1)
                    gt_bboxes_ignore.append(bbox)

            data_anno = dict(
                bboxes=np.array(gt_bboxes, dtype=np.float32).reshape(-1, 4),
                labels=np.array(gt_labels, dtype=np.int64),
                bboxes_ignore=np.array(gt_bboxes_ignore,
                                       dtype=np.float32).reshape(-1, 4),
                labels_ignore=np.array(gt_labels_ignore, dtype=np.int64))

            data_info.update(ann=data_anno)
            data_infos.append(data_info)

        return data_infos
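
The class-filtering step in `load_annotations` can be tried in isolation, without mmcv or image files. The sketch below is a minimal stand-alone version, assuming the same layout the loader reads (class name in field 0, box coordinates in fields 4 to 7). The helper name `split_annotations` and the sample label lines are hypothetical and invented for illustration only.

```python
def split_annotations(lines, classes):
    """Split raw label lines into kept and ignored boxes.

    Assumes field 0 is the class name and fields 4-7 are the box
    coordinates, mirroring the indices used by the loader above.
    """
    cat2label = {name: i for i, name in enumerate(classes)}
    kept, ignored = [], []
    for line in lines:
        fields = line.strip().split(' ')
        bbox = [float(v) for v in fields[4:8]]
        if fields[0] in cat2label:
            # known class: store its integer label
            kept.append((cat2label[fields[0]], bbox))
        else:
            # unknown class: label -1, box goes to the ignore list
            ignored.append((-1, bbox))
    return kept, ignored


# Hypothetical label lines, invented for illustration only.
sample_lines = [
    'car 0 0 0 10.0 20.0 50.0 60.0',
    'person 0 0 0 5.0 5.0 15.0 25.0',
    'bus 0 0 0 100.0 40.0 180.0 90.0',
]
kept, ignored = split_annotations(sample_lines, ('car', 'truck', 'bus'))
print(kept)
print(ignored)
```

Here `car` and `bus` map to labels 0 and 2, while `person` is not in `CLASSES` and ends up in the ignore list.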

### Modify the config

Next, we modify the config for training.
To speed up training, we fine-tune a pre-trained detector instead of training one from scratch.

In [None]:
from mmcv import Config
cfg = Config.fromfile('./configs/retinanet/retinanet_r50_caffe_fpn_1x_coco.py')

Given a config that trains RetinaNet on the COCO dataset, we need to modify some values to train RetinaNet on the VisDrone dataset.

In [None]:
# Settings are loaded from ./configs/retinanet/retinanet_r50_caffe_fpn_1x_coco.py and
# some of them are modified below, e.g. model.bbox_head.num_classes=3.
# For Faster R-CNN the equivalent setting is model.roi_head.bbox_head.num_classes=3.

from mmdet.apis import set_random_seed

# Modify dataset type and path
cfg.dataset_type = 'VisDroneDataset'
cfg.data_root = 'data/coco/'

cfg.data.train.type = 'VisDroneDataset'
cfg.data.train.data_root ='data/coco/'
cfg.data.train.ann_file = 'ImageSets/Main/train.txt'
cfg.data.train.img_prefix = 'images'

cfg.data.val.type = 'VisDroneDataset'
cfg.data.val.data_root = 'data/coco/'
cfg.data.val.ann_file = 'ImageSets/Main/val.txt'
cfg.data.val.img_prefix = 'images'

cfg.data.test.type = 'VisDroneDataset'
cfg.data.test.data_root = 'data/coco/'
cfg.data.test.ann_file = 'ImageSets/Main/val.txt'
cfg.data.test.img_prefix = 'images'

# modify num classes of the model in box head
cfg.model.bbox_head.num_classes = 3
# No pre-trained detector checkpoint is loaded here (load_from = None).
# To start from an existing checkpoint, point load_from at it instead.

#cfg.load_from = './work_dirs/faster_rcnn/epoch_400.pth'
cfg.load_from = None

# Set up working dir to save files and logs.
cfg.work_dir = './work_dirs/D2Det/'

# The original learning rate (LR) is set for 8-GPU training.
# We divide it by 8 since we only use one GPU.
cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None

# Log loss, loss_cls, loss_bbox and acc every `interval` iterations;
# with 141 iterations per epoch, an interval of 100 logs once per epoch.
cfg.log_config.interval = 100

# Set the evaluation metric: COCO-style datasets use 'bbox', VOC-style datasets use 'mAP'
cfg.evaluation.metric = 'mAP'
# Evaluate the model every 5 epochs
cfg.evaluation.interval = 5
# Save a checkpoint every 10 epochs
cfg.checkpoint_config.interval = 10

# Set the seed so that the results are more reproducible
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)


# We can initialize the logger for training and have a look
# at the final config used for training
print(f'Config:\n{cfg.pretty_text}')
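
The schedule-related values in the cell above come down to simple arithmetic. The sketch below works through them in plain Python. It assumes the linear scaling rule described in the comments (learning rate proportional to the number of GPUs), that a log line is written only at exact multiples of the logging interval, and that evaluation and checkpointing run at epochs divisible by their interval. The helper names and `max_epochs=20` are hypothetical and chosen only for illustration.

```python
def scale_lr(base_lr, base_gpus, gpus):
    """Linear scaling: keep the learning rate proportional to the GPU count."""
    return base_lr * gpus / base_gpus


def logs_per_epoch(iters_per_epoch, interval):
    """Number of log lines per epoch when logging every `interval` iterations."""
    return iters_per_epoch // interval


def epochs_fired(interval, max_epochs):
    """1-based epochs at which an every-`interval`-epochs step runs."""
    return [e for e in range(1, max_epochs + 1) if e % interval == 0]


lr = scale_lr(0.02, base_gpus=8, gpus=1)       # same value as 0.02 / 8
n_logs = logs_per_epoch(141, 100)              # 141 iterations, interval 100
eval_epochs = epochs_fired(5, max_epochs=20)   # max_epochs=20 is hypothetical
ckpt_epochs = epochs_fired(10, max_epochs=20)
print(lr, n_logs, eval_epochs, ckpt_epochs)
```

With these assumed numbers, the model would be evaluated four times and saved twice over the run.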


### Train a new detector

Finally, let's initialize the dataset and detector, then train a new detector!

In [None]:
from mmdet.datasets import build_dataset
from mmdet.models import build_detector
from mmdet.apis import train_detector


# Build dataset
datasets = [build_dataset(cfg.data.train)]

# Build the detector
# cfg.get() returns None when train_cfg/test_cfg live inside cfg.model
model = build_detector(
    cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
# Add an attribute for visualization convenience
model.CLASSES = datasets[0].CLASSES

# Create work_dir
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)

### Understand the log
From the log, we can get a basic understanding of the training process and see how well the detector is trained.

First, the ResNet-50 backbone pre-trained on ImageNet is loaded. This is common practice, since training from scratch is more costly. The log shows that all the weights of the ResNet-50 backbone are loaded except `conv1.bias`, which has been merged into `conv.weights`.

Second, since the dataset we are using is small, we load a Mask R-CNN model and fine-tune it for detection. Because the detector we are actually using is Faster R-CNN, the weights in the mask branch, e.g. `roi_head.mask_head`, are reported as `unexpected key in source state_dict` and are not loaded.
The original Mask R-CNN is trained on the COCO dataset, which contains 80 classes, whereas the KITTI Tiny dataset has only 3 classes. Therefore, the last FC layer of the pre-trained Mask R-CNN for classification has a different weight shape and is not used.

Third, after training, the detector is evaluated with the default VOC-style evaluation. The results show that the detector achieves 54.1 mAP on the val dataset, not bad!
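
The loading behaviour described above (unexpected keys skipped, mismatched shapes not used) can be illustrated with a small stand-alone sketch. This is not the checkpoint loader used by the library, only a simplified model of the same idea that compares parameter names and shapes. The function name `plan_state_dict_load` and all parameter names and shapes in the example are hypothetical.

```python
def plan_state_dict_load(model_shapes, checkpoint_shapes):
    """Classify checkpoint entries against the model's parameters.

    Both arguments map parameter names to shape tuples.
    """
    loaded, unexpected, mismatched = [], [], []
    for name, shape in checkpoint_shapes.items():
        if name not in model_shapes:
            unexpected.append(name)      # e.g. weights of a missing branch
        elif model_shapes[name] != shape:
            mismatched.append(name)      # e.g. a head with fewer classes
        else:
            loaded.append(name)
    missing = [n for n in model_shapes if n not in checkpoint_shapes]
    return loaded, unexpected, mismatched, missing


# Hypothetical names and shapes, for illustration only.
checkpoint = {
    'backbone.conv1.weight': (64, 3, 7, 7),
    'roi_head.bbox_head.fc_cls.weight': (81, 1024),
    'roi_head.mask_head.conv_logits.weight': (80, 256, 1, 1),
}
model_params = {
    'backbone.conv1.weight': (64, 3, 7, 7),
    'roi_head.bbox_head.fc_cls.weight': (4, 1024),
}
loaded, unexpected, mismatched, missing = plan_state_dict_load(
    model_params, checkpoint)
print(loaded, unexpected, mismatched, missing)
```

In this toy example the backbone weight is loaded, the mask-head weight is unexpected, and the classification layer is skipped because its shape differs.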

## Test the trained detector

After finetuning the detector, let's visualize the prediction results!

In [None]:
from mmdet.apis import init_detector, inference_detector, show_result_pyplot

img = mmcv.imread('testImages/test1.jpg')

model.cfg = cfg
result = inference_detector(model, img)
show_result_pyplot(model, img, result)
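
To inspect the predictions as data rather than as a plot, the raw result can be filtered by score. The sketch below assumes the result is a list with one entry per class, each holding rows of `[x1, y1, x2, y2, score]`, which is the layout I understand box-only detectors in MMDetection 2.x to return. The helper `filter_detections`, the threshold of 0.3, and the sample values are assumptions for illustration.

```python
def filter_detections(result, class_names, score_thr=0.3):
    """Keep detections whose score is at least `score_thr`.

    `result` is assumed to be a per-class list of rows
    [x1, y1, x2, y2, score].
    """
    kept = []
    for label, rows in enumerate(result):
        for x1, y1, x2, y2, score in rows:
            if score >= score_thr:
                kept.append((class_names[label], (x1, y1, x2, y2), score))
    return kept


# Hypothetical detections for the classes ('car', 'truck', 'bus').
sample_result = [
    [[10, 20, 50, 60, 0.9], [12, 22, 48, 58, 0.2]],  # car
    [],                                              # truck
    [[100, 40, 180, 90, 0.55]],                      # bus
]
detections = filter_detections(sample_result, ('car', 'truck', 'bus'))
for name, box, score in detections:
    print(name, box, score)
```

The low-scoring second `car` box is dropped, leaving one `car` and one `bus` detection.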


# Compute bbox/mAP

In [None]:
%run tools/test.py ./configs/retinanet/retinanet_r50_caffe_fpn_1x_coco.py ./work_dirs/D2Det/latest.pth  --out out/aver.pkl --eval bbox --show

# Analyze the training log

In [None]:
# Plot the classification loss
%run tools/analyze_logs.py plot_curve ./work_dirs/D2Det/None.log.json --keys loss_cls --legend loss_cls

In [None]:
# Plot the regression (bounding-box) loss
%run tools/analyze_logs.py plot_curve ./work_dirs/D2Det/None.log.json --keys loss_bbox --legend loss_bbox

In [None]:
# Plot the total loss
%run tools/analyze_logs.py plot_curve ./work_dirs/D2Det/None.log.json --keys loss --legend loss

In [None]:
# Plot several curves and save them to a PDF
%run tools/analyze_logs.py plot_curve ./work_dirs/D2Det/None.log.json --keys loss_cls loss_bbox loss --out ./work_dirs/retinanet/losses.pdf
#%run tools/analyze_logs.py plot_curve ./work_dirs/retinanet/None.log.json --keys acc --out ./work_dirs/retinanet/acc.pdf
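
The `.log.json` file can also be read directly when a custom summary is needed. The sketch below assumes the file holds one JSON object per line, with training records carrying `mode`, `epoch`, `iter` and the loss keys used above. It averages a chosen key per epoch. The helper `mean_per_epoch` and the sample records are hypothetical, and the values are invented.

```python
import json
from collections import defaultdict


def mean_per_epoch(log_lines, key):
    """Average `key` over the training records of each epoch."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in log_lines:
        record = json.loads(line)
        # skip non-training records and records without the key
        if record.get('mode') != 'train' or key not in record:
            continue
        sums[record['epoch']] += record[key]
        counts[record['epoch']] += 1
    return {epoch: sums[epoch] / counts[epoch] for epoch in sorted(sums)}


# Hypothetical log records, invented for illustration only.
sample_log = [
    '{"env_info": "placeholder"}',
    '{"mode": "train", "epoch": 1, "iter": 50, "loss_cls": 1.0, "loss": 1.5}',
    '{"mode": "train", "epoch": 1, "iter": 100, "loss_cls": 0.5, "loss": 1.0}',
    '{"mode": "val", "epoch": 1, "iter": 100, "mAP": 0.5}',
    '{"mode": "train", "epoch": 2, "iter": 50, "loss_cls": 0.5, "loss": 0.75}',
    '{"mode": "train", "epoch": 2, "iter": 100, "loss_cls": 0.25, "loss": 0.25}',
]
epoch_loss = mean_per_epoch(sample_log, 'loss')
print(epoch_loss)
```

For a real run, the lines would come from opening the log file and iterating over it instead of `sample_log`.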