# Detectron2 Beginner's Tutorial
This notebook only works inside Colab, not as a standalone notebook.

<img src="https://dl.fbaipublicfiles.com/detectron2/Detectron2-Logo-Horz.png" width="500">

Welcome to detectron2! This is the official Colab tutorial of detectron2. Here, we will go through some basic usage of detectron2, including the following:
* Run inference on images or videos, with an existing detectron2 model
* Train a detectron2 model on a new dataset

You can make a copy of this tutorial by "File -> Open in playground mode" and make changes there. __DO NOT__ request access to this tutorial.


# Install detectron2

In [None]:
!pip install pyyaml==5.1
!pip install torch==1.10
!pip install torchvision==0.11.1

import torch
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
print(torch.__version__)
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
# Install detectron2 that matches the above pytorch version
# See https://detectron2.readthedocs.io/tutorials/install.html for instructions
!pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/$CUDA_VERSION/torch$TORCH_VERSION/index.html

# If there is not yet a detectron2 release that matches the given torch + CUDA version, you need to install a different pytorch.

# exit(0)  # After installation, you may need to "restart runtime" in Colab. This line can also restart runtime
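
The install cell derives the wheel index URL from `torch.__version__`. As a minimal sketch of that string handling (the helper name `wheel_index_url` is hypothetical, and the version strings are only examples), the same parsing can be tried without installing anything. Note that a version string without a `+cuXXX` suffix yields a URL with the full version in place of the CUDA tag, which is unlikely to point at a valid index.

```python
def wheel_index_url(torch_version: str) -> str:
    """Build the detectron2 wheel index URL from a full torch version string."""
    # "1.10.0+cu111" -> "1.10"
    torch_short = ".".join(torch_version.split(".")[:2])
    # "1.10.0+cu111" -> "cu111"; a string without "+" is returned unchanged
    cuda_tag = torch_version.split("+")[-1]
    return (
        "https://dl.fbaipublicfiles.com/detectron2/wheels/"
        f"{cuda_tag}/torch{torch_short}/index.html"
    )

print(wheel_index_url("1.10.0+cu111"))
```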

In [None]:
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor, DefaultTrainer
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog, DatasetMapper, build_detection_train_loader
import detectron2.data.transforms as T

# Train on a custom dataset

## Prepare the dataset

In [None]:
# download, decompress the data
# !wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip
# !unzip balloon_dataset.zip > /dev/null

# Mount a Drive to be able to train on images uploaded on Drive
# from google.colab import drive
# drive.mount("/content/drive")

In [None]:
import zipfile
import os

!wget --no-check-certificate \
https://github.com/OscStal/parts_and_ros/archive/refs/heads/main.zip \
-O "/tmp/volvo_parts.zip"

with zipfile.ZipFile('/tmp/volvo_parts.zip', 'r') as zip_ref:  # Opens the zip file in read mode
    zip_ref.extractall('/content')  # Extracts the files into the /content folder

# Rename unzipped github folder to just "github", next cell relies on this
unzipped_repo_path = "/content/parts_and_ros-main/"
repo_folder = "/content/github"
os.rename(unzipped_repo_path, repo_folder)

In [None]:
from detectron2.data.datasets import register_coco_instances
# The dataset is in COCO format, so it can be registered directly with register_coco_instances.
# To use image folders and JSON annotation files stored on Drive instead, specify their paths like this:
#register_coco_instances("my_dataset_train", {}, "/content/drive/MyDrive/datasets/all/a_train.json", "/content/drive/MyDrive/datasets/all")
#register_coco_instances("my_dataset_val", {}, "/content/drive/MyDrive/datasets/validate/val.json", "/content/drive/MyDrive/datasets/validate")

train_data_location = os.path.join(repo_folder, "datasets/sample/imgs/all")
test_data_location = os.path.join(repo_folder, "datasets/sample/imgs/validate")

register_coco_instances("my_dataset_train", {}, os.path.join(train_data_location, "a_train.json"), train_data_location)
register_coco_instances("my_dataset_val", {}, os.path.join(test_data_location, "val.json"), test_data_location)

# Metadata for training dataset
custom_metadata = MetadataCatalog.get("my_dataset_train")

To verify the data loading is correct, let's visualize the annotations of randomly selected samples in the training set:



In [None]:
dataset_dicts = DatasetCatalog.get("my_dataset_train")
print(dataset_dicts)
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=custom_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    cv2_imshow(out.get_image()[:, :, ::-1])

## Train!

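Now, let's fine-tune a COCO-pretrained Mask R-CNN model from the detectron2 model zoo on the custom dataset registered above. The ~2 minutes for 300 iterations on a P100 GPU quoted for the R50-FPN balloon example is only a rough reference point: this notebook selects its backbone via `ACTIVE_MODEL_INDEX` and trains for more iterations, so expect training to take longer.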


Custom Trainer for Data Augmentations

In [None]:
from detectron2.data import build_detection_test_loader

class CustomTrainer(DefaultTrainer):

  @classmethod
  def build_train_loader(cls, cfg):
    # Resize every training image to a fixed 1000x1000 (height, width)
    return build_detection_train_loader(cfg, mapper=DatasetMapper(cfg, is_train=True, augmentations=[
      T.Resize((1000, 1000)),
    ]))

  @classmethod
  def build_test_loader(cls, cfg, dataset_name):
    # DefaultTrainer passes the dataset name; use the test loader here, not the train loader
    return build_detection_test_loader(cfg, dataset_name, mapper=DatasetMapper(cfg, is_train=False, augmentations=[
      # no test-time augmentations
    ]))

Setup the Config

In [None]:
from torch.cuda import empty_cache

MODELS = [
          "COCO-InstanceSegmentation/mask_rcnn_R_50_C4_3x.yaml",
          "COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_3x.yaml",
          "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",
          "COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x.yaml",
          "COCO-InstanceSegmentation/mask_rcnn_R_101_DC5_3x.yaml",
          "COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml",
          "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"
]
ACTIVE_MODEL_INDEX = 4

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(MODELS[ACTIVE_MODEL_INDEX]))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
# cfg.DATASETS.TEST = ("my_dataset_val", )  # If mid-training validation is wanted
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(MODELS[ACTIVE_MODEL_INDEX])  # Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.001  # pick a good LR
cfg.SOLVER.MAX_ITER = 1500    # number of training iterations; the original toy example used 300, a practical dataset may need more
cfg.SOLVER.STEPS = []        # do not decay learning rate
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512   # RoIs sampled per image for the ROI heads (512 is the default; smaller values train faster)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
# cfg.TEST.EVAL_PERIOD = 50
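
`MAX_ITER` counts iterations, not epochs, and each iteration processes `IMS_PER_BATCH` images. A small sketch for translating the two into approximate passes over the training set (the helper name and the dataset size of 100 images are assumptions for illustration only):

```python
def approx_epochs(max_iter, ims_per_batch, num_images):
    """Approximate number of passes over the training set."""
    return max_iter * ims_per_batch / num_images

# MAX_ITER and IMS_PER_BATCH as in the config above; 100 images is a made-up dataset size
print(approx_epochs(max_iter=1500, ims_per_batch=2, num_images=100))  # 30.0
```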

Run the Training normally

In [None]:
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = CustomTrainer(cfg)
# empty_cache()
trainer.resume_or_load(resume=False)
trainer.train()

Test writing JSON-file to Drive

In [None]:
from google.colab import drive, files
import json
drive.mount("/content/drive", force_remount=True)

out_file = "test.json"
drive_file = f"/content/drive/MyDrive/{out_file}"
metric_dict = {"test": "test"}
with open(drive_file, "w+") as out:
  json.dump(metric_dict, out, indent=4)

In [None]:
from google.colab import drive
import shutil
drive.mount("/content/drive", force_remount=True)
# shutil.move("output/model_final.pth", "drive/MyDrive/model_final_final.pth")

Mounted at /content/drive


Run training, automated to test config parameters

In [None]:
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

from google.colab import drive, files
drive.mount("/content/drive")

# In order as listed below
DEFAULTS = [2, 2, 0.00025, 500, 64]

# List of values to test for config parameters NUM_WORKERS, IMS_PER_BATCH, LR, MAX_ITERS and BATCH_SIZE_PER_IMAGE
# Chosen randomly, might manually test more if these are not deemed enough
NWS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
IPBS = [1, 2, 3, 4, 5, 6, 7, 8]
LRS = [0.00001, 0.00005, 0.0001, 0.00015, 0.0002, 0.00025, 0.0003, 0.0005, 0.0006, 0.001, 0.002]
MAXITERS = [500, 1000, 1500, 2000, 2500, 3000]
BSPIS = [32, 64, 128, 256, 512, 1024]

metric_dict = dict()
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
out_file = "output_dict_numworkers_2.json"
dict_title = "NUM_WORKERS"  # no trailing colon: the key is formatted as f"{dict_title}: {val}" below
drive_file = f"/content/drive/MyDrive/{out_file}"

for val in [9, 10]:
  cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(MODELS[ACTIVE_MODEL_INDEX])
  cfg.DATALOADER.NUM_WORKERS = val
  trainer = CustomTrainer(cfg)
  trainer.resume_or_load(resume=False)
  print(val)
  trainer.train()

  cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")  # path to the model we just trained
  cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7   # set a custom testing threshold
  predictor = DefaultPredictor(cfg)

  evaluator = COCOEvaluator("my_dataset_val", output_dir="./output")
  val_loader = build_detection_test_loader(cfg, "my_dataset_val", mapper=DatasetMapper(cfg, is_train=False, augmentations=[T.Resize((1000, 1000)),]))
  res = inference_on_dataset(predictor.model, val_loader, evaluator)

  metric_dict[f"{dict_title}: {val}"] = res # TODO: save only the APs that are required, not entire "res"

  with open(out_file, "w+") as out:
    json.dump(metric_dict, out, indent=4)
  with open(drive_file, "w+") as out:
    json.dump(metric_dict, out, indent=4)

# files.download(out_file)


"""
Config used for the different parameter tests:
NUM_WORKERS:              IMS_PER_BATCH = 2, MAX_ITERS = 500, LR = 0.00025, BATCH_SIZE_PER_IMAGE = 64;    Training augs: Resize(1000,1000), Testing augs: Resize(1000,1000)
IMS_PER_BATCH:            NUM_WORKERS = 2, MAX_ITERS = 500, LR = 0.00025, BATCH_SIZE_PER_IMAGE = 64;      Training augs: Resize(1000,1000), Testing augs: Resize(1000,1000)
LR:                       NUM_WORKERS = 2, IMS_PER_BATCH = 2, MAX_ITERS = 500, BATCH_SIZE_PER_IMAGE = 64; Training augs: Resize(1000,1000), Testing augs: Resize(1000,1000)
MAX_ITERS:                NUM_WORKERS = 2, IMS_PER_BATCH = 2, LR = 0.00025, BATCH_SIZE_PER_IMAGE = 64;    Training augs: Resize(1000,1000), Testing augs: Resize(1000,1000)
BATCH_SIZE_PER_IMAGE:     NUM_WORKERS = 2, IMS_PER_BATCH = 2, LR = 0.00025, MAX_ITERS = 500;              Training augs: Resize(1000,1000), Testing augs: Resize(1000,1000)
"""

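The parameter tests above vary one config value at a time while holding the others at `DEFAULTS`. A minimal sketch of generating those settings, assuming the parameter order given in the comment next to `DEFAULTS` (the names `PARAM_NAMES` and `one_at_a_time` are hypothetical, and mapping each entry onto the matching `cfg` field is left out):

```python
PARAM_NAMES = ["NUM_WORKERS", "IMS_PER_BATCH", "LR", "MAX_ITERS", "BATCH_SIZE_PER_IMAGE"]
DEFAULTS = [2, 2, 0.00025, 500, 64]

def one_at_a_time(param_name, values):
    """Yield one settings dict per value, varying only param_name."""
    base = dict(zip(PARAM_NAMES, DEFAULTS))
    if param_name not in base:
        raise KeyError(f"unknown parameter: {param_name}")
    for value in values:
        settings = dict(base)  # copy so runs do not affect each other
        settings[param_name] = value
        yield settings

for settings in one_at_a_time("LR", [0.0001, 0.001]):
    print(settings)
```

Starting each run from a fresh copy of the defaults avoids carrying a value changed in one run over into the next.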

In [None]:
# Look at training curves in tensorboard:
%load_ext tensorboard
%tensorboard --logdir /content/output

## Inference & evaluation using the trained model
Now, let's run inference with the trained model on the custom validation dataset. First, let's create a predictor using the model we just trained:



In [None]:
# Inference should use the config with parameters that are used in training
# cfg now already contains everything we've set previously. We change it a little for inference:
# cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")  # path to the model we just trained
cfg.MODEL.WEIGHTS = "/content/drive/MyDrive/model_final_final.pth"  # path to a trained model previously saved to Drive
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.3  # set a custom testing threshold
predictor = DefaultPredictor(cfg)
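
Conceptually, `SCORE_THRESH_TEST` discards predictions whose confidence score is not above the threshold, so a lower value shows more (and less certain) instances. A simplified sketch with made-up detections (the helper `keep_confident` and the dict layout are illustrative only, not detectron2's output format):

```python
def keep_confident(detections, score_thresh):
    """Keep detections whose score is strictly above score_thresh."""
    return [d for d in detections if d["score"] > score_thresh]

# Made-up detections for illustration
detections = [
    {"label": "part", "score": 0.92},
    {"label": "part", "score": 0.45},
    {"label": "part", "score": 0.12},
]
print(len(keep_confident(detections, 0.3)))  # 2
print(len(keep_confident(detections, 0.7)))  # 1
```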

Then, we randomly select several samples to visualize the prediction results.

In [None]:
from detectron2.utils.visualizer import ColorMode
dataset_dicts = DatasetCatalog.get("my_dataset_val")
for d in random.sample(dataset_dicts, 3):    
    im = cv2.imread(d["file_name"])
    outputs = predictor(im)  # format is documented at https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
    v = Visualizer(im[:, :, ::-1],
                   metadata=custom_metadata, 
                   scale=0.5, 
                   instance_mode=ColorMode.SEGMENTATION   # give instances of the same category similar colors (taken from the metadata when available)
    )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])

We can also evaluate its performance using the AP metric implemented in the COCO API.

In [None]:
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
evaluator = COCOEvaluator("my_dataset_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "my_dataset_val", mapper=DatasetMapper(cfg, is_train=False, augmentations=[T.Resize((1000, 1000)),]))
res = inference_on_dataset(predictor.model, val_loader, evaluator)
print(res)
# another equivalent way to evaluate the model is to use `trainer.test`
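
The evaluation result is a nested dict that maps each task (for example `bbox` and `segm`) to its metrics. A minimal sketch for keeping only selected AP values before writing them to JSON, assuming that nested layout; the helper `select_metrics` is hypothetical and the numbers are made up. NaN values are replaced with `None` because `json.dump` would otherwise emit `NaN`, which is not valid JSON.

```python
import json
import math

def select_metrics(res, keys=("AP", "AP50", "AP75")):
    """Keep only the chosen metrics per task and replace NaN with None."""
    selected = {}
    for task, metrics in res.items():
        selected[task] = {}
        for key in keys:
            if key in metrics:
                value = metrics[key]
                is_nan = isinstance(value, float) and math.isnan(value)
                selected[task][key] = None if is_nan else value
    return selected

# Made-up numbers in the nested layout described above
res = {
    "bbox": {"AP": 41.5, "AP50": 70.25, "AP75": 44.0, "APs": float("nan")},
    "segm": {"AP": 39.0, "AP50": 68.5, "AP75": 40.0, "APs": float("nan")},
}
print(json.dumps(select_metrics(res), indent=4))
```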