# Faster R-CNN Training

(1) Train a detector on a customized dataset.

(2) The ResNet50 backbone of Faster R-CNN is pre-trained on an image classification task on a separate dataset, using MMPretrain.

## 1. Check installation 

In [None]:
# Step 1. Check PyTorch installation

import torch, torchvision
print("torch version: ", torch.__version__, torch.cuda.is_available())

# Check MMDetection installation
import mmdet
print("mmdet version: ", mmdet.__version__)

# Check mmcv installation
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print(get_compiling_cuda_version())
print(get_compiler_version())

In [None]:
# Step 2. Download the config and checkpoint files.

# The download takes several seconds or more, depending on your network.
# When it is done, you will find two files, rtmdet_tiny_8xb32-300e_coco.py
# and rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth, in your current folder.

!mim download mmdet --config rtmdet_tiny_8xb32-300e_coco --dest .

In [None]:
# Step 3 (Case a):

# If you installed MMDetection from source, just run the following command.
# It writes a new image, demo.jpg, to the ./outputs/vis folder,
# with bounding boxes drawn on cars, benches, etc.

!python demo/image_demo.py demo/demo.jpg rtmdet_tiny_8xb32-300e_coco.py --weights rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth --device cpu

In [None]:
# Step 3 (Case b):

# If you installed MMDetection with MIM, run the following code in a Python interpreter.
# You will see a DetDataSample; its predictions are in pred_instances, which holds the detected bounding boxes, labels, and scores.

from mmdet.apis import init_detector, inference_detector

config_file = 'rtmdet_tiny_8xb32-300e_coco.py'
checkpoint_file = 'rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth'
model = init_detector(config_file, checkpoint_file, device='cpu')  # or device='cuda:0'
inference_detector(model, 'demo/demo.jpg')

## 2. Prepare the dataset in COCO format

MMDetection supports multiple public datasets, including COCO, Pascal VOC, Cityscapes, and more.

Here we use only a COCO-format dataset, organized as follows:

```
data_root
├── annotations (folder)
│   ├── train.json
│   └── val.json
├── train (folder)
│   ├── 1.jpg
│   ├── 2.jpg
│   └── n.jpg
└── val (folder)
    ├── 1.jpg
    └── 2.jpg
```
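
For reference, a COCO-format annotation file such as `train.json` is a single JSON object with three top-level lists: `images`, `annotations`, and `categories`. The sketch below builds a minimal example and runs a few sanity checks on it. The class name `plastic`, the image size, the box values, and the helper `check_coco` are illustrative assumptions, not taken from the original dataset.

```python
import json

# Minimal COCO-style annotation dict; all values are made up for illustration.
coco = {
    "images": [
        {"id": 1, "file_name": "1.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100.0, 120.0, 50.0, 40.0],  # [x, y, width, height] in pixels
            "area": 2000.0,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "plastic"}],  # hypothetical class name
}


def check_coco(data):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    image_ids = {img["id"] for img in data["images"]}
    category_ids = {cat["id"] for cat in data["categories"]}
    for ann in data["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        _, _, width, height = ann["bbox"]
        if width <= 0 or height <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size")
    return problems


# Round-trip through JSON to mimic loading the file from disk.
loaded = json.loads(json.dumps(coco))
print(check_coco(loaded))  # an empty list means the checks passed
```

To check a real file, load it with `json.load` and pass the result to `check_coco`.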

## 3. Modify the model config

We create the file **configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco_deep_plastic_mmpretrain.py**

and update the following values as needed:

1. Class names: "metainfo".

2. Dataset path: (1) "data_root"; (2) "ann_file"; (3) "data_prefix".

3. Output path: "work_dir".

4. Model and training settings:

(1) Pretrained model weights: "model.backbone.init_cfg.checkpoint";

(2) Number of classes: "model.roi_head.bbox_head.num_classes";

(3) Epochs: "train_cfg.max_epochs";

(4) Batch size: "train_dataloader.batch_size" and "val_dataloader.batch_size";

(5) Best-model checkpointing: set "default_hooks.checkpoint.save_best" to the evaluation metric used to select the best model.

The metrics include:

(a) For object detection: 'bbox_mAP', 'bbox_mAP_50', 'bbox_mAP_75', 'bbox_mAP_s', 'bbox_mAP_m', 'bbox_mAP_l', 'bbox_mAP_copypaste'

(b) For instance segmentation: 'segm_mAP', 'segm_mAP_50', 'segm_mAP_75', 'segm_mAP_s', 'segm_mAP_m', 'segm_mAP_l', 'segm_mAP_copypaste'

With MMDetection's COCO evaluator, these keys are reported with a "coco/" prefix (for example, 'coco/bbox_mAP_50'), which is why the best checkpoints used in Section 5 are named best_coco_bbox_mAP_50_epoch_*.pth.
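
As a rough illustration, the values listed above could be set as follows in the config file. This is a sketch, not the actual file: the base config name (which assumes MMDetection 3.x naming), the class name, the checkpoint path, the `prefix` setting, and all numeric values are assumptions to be replaced with your own.

```python
# Sketch of a custom Faster R-CNN config; every concrete value is a placeholder.
_base_ = './faster-rcnn_r50_fpn_1x_coco.py'  # assumed base config name

# 1. Class names (hypothetical single class).
metainfo = dict(classes=('plastic',))

# 2. Dataset paths, following the folder layout from Section 2.
data_root = 'data_root/'
train_dataloader = dict(
    batch_size=2,
    dataset=dict(
        data_root=data_root,
        metainfo=metainfo,
        ann_file='annotations/train.json',
        data_prefix=dict(img='train/')))
val_dataloader = dict(
    batch_size=1,
    dataset=dict(
        data_root=data_root,
        metainfo=metainfo,
        ann_file='annotations/val.json',
        data_prefix=dict(img='val/')))
val_evaluator = dict(ann_file=data_root + 'annotations/val.json')

# 3. Output path.
work_dir = './work_dirs/faster_rcnn_custom'

# 4. Model and training settings.
model = dict(
    backbone=dict(
        init_cfg=dict(
            type='Pretrained',
            # Assumed: the weights come from a classifier checkpoint, so only
            # the parameters under the 'backbone.' prefix are loaded.
            prefix='backbone.',
            checkpoint='path/to/pretrained_resnet50.pth')),  # hypothetical path
    # num_classes must equal the number of entries in metainfo['classes'].
    roi_head=dict(bbox_head=dict(num_classes=1)))
train_cfg = dict(max_epochs=100)
default_hooks = dict(checkpoint=dict(save_best='coco/bbox_mAP_50'))
```

Fields that are not mentioned here are inherited from the base config, so only the overrides need to be written out.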


## 4. Train a new detector

Finally, let's initialize the dataset and detector, then train a new detector!
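
The training commands in this section override config values from the command line with `--cfg-options`, where a dotted key such as `model.backbone.frozen_stages=4` addresses a nested field. The sketch below illustrates that idea on plain dictionaries. `apply_override` is a hypothetical helper written for this example, not MMEngine's implementation, and the toy config values are made up.

```python
def apply_override(cfg, dotted_key, value):
    """Set cfg[a][b][c] = value for dotted_key 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return cfg


# Toy config standing in for the real one.
cfg = {
    "data_root": "data/",
    "model": {"backbone": {"depth": 50, "frozen_stages": 1}},
}

apply_override(cfg, "model.backbone.frozen_stages", 4)
apply_override(cfg, "data_root", "/path/to/dataset/")

print(cfg["model"]["backbone"])
```

Sibling fields such as `depth` are left untouched. In MMDetection's ResNet, `frozen_stages=4` freezes the stem and all four residual stages, so only the neck and heads are updated during training.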

### 4.1 ResNet50 pretrained on GV dataset

In [None]:
# ResNet50 pretrained on GV dataset
!python tools/train.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco_deep_plastic_mmpretrain.py \
    --work-dir /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp4/Plan_C/Jak_100per/F4_PreTrain_GV/ \
    --cfg-options data_root='/scratch/tjian/Data/Paper2_Exp5/Plan_C/Jak_100per/' model.backbone.frozen_stages=4

### 4.2 ResNet50 pretrained on ImageNet dataset

In [None]:
# ResNet50 Pretrained on ImageNet dataset
!python tools/train.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco_deep_plastic_Pretrain_ImgaeNet.py \
    --work-dir /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp4/Plan_C/Jak_100per/F4_PreTrain_ImageNet/ \
    --cfg-options data_root='/scratch/tjian/Data/Paper2_Exp5/Plan_C/Jak_100per/' model.backbone.frozen_stages=4

## 5. Test on the test dataset
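
The test commands in this section pass the best checkpoint path by hand. When `save_best` is set, the best checkpoint is written to the work directory with a name like `best_coco_bbox_mAP_50_epoch_100.pth`, so the path can also be looked up programmatically. A minimal sketch, assuming that naming pattern; `find_best_checkpoint` is a hypothetical helper, and the demo runs on empty placeholder files in a temporary folder:

```python
import re
import tempfile
from pathlib import Path


def find_best_checkpoint(work_dir):
    """Return the 'best_*_epoch_N.pth' file with the highest epoch N, or None."""
    pattern = re.compile(r"^best_.*_epoch_(\d+)\.pth$")
    candidates = []
    for path in Path(work_dir).glob("best_*.pth"):
        match = pattern.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Usually only one best checkpoint exists; prefer the latest epoch otherwise.
    return max(candidates, key=lambda item: item[0])[1]


# Demo on a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("epoch_99.pth", "best_coco_bbox_mAP_50_epoch_78.pth"):
        (Path(tmp) / name).touch()
    best_name = find_best_checkpoint(tmp).name
    print(best_name)
```

The returned path can then be passed as the second positional argument of `tools/test.py`.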

In [None]:
# Single-GPU testing

# "--show-dir" saves the predicted images for the test dataset

!python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco_deep_plastic_mmpretrain.py \
    /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp4/Plan_C/Jak_100per/F4_PreTrain_GV/best_coco_bbox_mAP_50_epoch_100.pth \
    --work-dir /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp4/Plan_C/Jak_100per/F4_PreTrain_GV/
    # --show-dir /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp5/Oos_20per/FRCNN_Frozen_4_PreTrain_GV/pred_images_in_test_dataset/

In [None]:
!python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco_deep_plastic_Pretrain_ImgaeNet.py \
    /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp4/Plan_C/Jak_100per/F4_PreTrain_ImageNet/best_coco_bbox_mAP_50_epoch_78.pth \
    --work-dir /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp4/Plan_C/Jak_100per/F4_PreTrain_ImageNet/
    # --show-dir /scratch/tjian/PythonProject/DP_MMDetection/checkpoints/train_weights/Paper2_exp5/Oos_20per/FRCNN_Frozen_4_PreTrain_ImageNet/pred_images_in_test_dataset/

## 6. Load TensorBoard

Understand the log

From the log, we can get a basic understanding of the training process and see how well the detector is trained.

First, since the dataset we are using is small, we loaded a Mask R-CNN model and fine-tuned it for detection. The original Mask R-CNN was trained on the COCO dataset, which contains 80 classes, whereas the KITTI Tiny dataset has only 3 classes. Therefore, the last FC layers of the pre-trained Mask R-CNN for classification and regression have different weight shapes and are not used. For the same reason, the pre-trained weights of the mask prediction layer `mask_head.conv_logits` do not match the current model and are not used.

Second, after training, the detector is evaluated by the default COCO-style evaluation. The results show that the detector achieves 79.6 bbox AP and 81.5 mask AP on the val dataset, not bad!

We can also check TensorBoard to see the curves.
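
COCO-style evaluation decides whether a predicted box matches a ground-truth box by their intersection over union (IoU); `bbox_mAP_50`, for example, uses an IoU threshold of 0.5. The sketch below computes IoU for two boxes in `[x1, y1, x2, y2]` format. The function name and the sample boxes are assumptions for illustration, not part of the evaluation code.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as [x1, y1, x2, y2]."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


prediction = [0, 0, 10, 10]    # made-up predicted box
ground_truth = [5, 0, 15, 10]  # made-up ground-truth box
iou = box_iou(prediction, ground_truth)
print(iou, iou >= 0.5)
```

Here the boxes share half of their width, which gives an IoU of one third, so the prediction would not count as a match at the 0.5 threshold.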
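
The logged values can also be read without TensorBoard. MMEngine-based runs typically write a JSON-lines file of scalars under the work directory (commonly `vis_data/scalars.json` inside the timestamped run folder); that location and the key names used below are assumptions, so check your own log. A minimal sketch that picks the record with the highest value of a metric, using made-up sample lines:

```python
import json


def best_record(lines, metric):
    """Return the logged record with the highest value of `metric`, or None."""
    best = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if metric not in record:
            continue  # e.g. training-loss lines carry no validation metric
        if best is None or record[metric] > best[metric]:
            best = record
    return best


# Made-up sample lines; real values come from your own run.
sample_log = [
    '{"loss": 0.91, "step": 50}',
    '{"coco/bbox_mAP_50": 0.412, "step": 1}',
    '{"coco/bbox_mAP_50": 0.455, "step": 2}',
    '{"coco/bbox_mAP_50": 0.431, "step": 3}',
]
best = best_record(sample_log, "coco/bbox_mAP_50")
print(best)
```

For a real run, pass an open file object instead of `sample_log`, since iterating over a file yields one line at a time.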

In [None]:
# Load TensorBoard in Colab
%load_ext tensorboard

# If the TensorBoard page in VS Code is hard to read,
# open localhost:6006 in a web browser after running this cell.

# See the curves in TensorBoard
%tensorboard --logdir ./tutorial_exps