## Introduction

<img src="https://dl.fbaipublicfiles.com/detectron2/Detectron2-Logo-Horz.png" width="500">

**How to Train [Detectron2](https://github.com/facebookresearch/detectron2) Segmentation on a Custom Dataset**

This notebook is based on the official Detectron2 [Colab notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5) and covers:
- Python environment setup
- Inference using pre-trained models
- Downloading, registering, and visualizing a COCO-format dataset
- Configuring, training, and evaluating a model on a custom COCO-format dataset

**Preparing a Custom Dataset**

In this tutorial, we will use one of the 100,000+ open source computer vision datasets available on [Roboflow Universe](https://universe.roboflow.com).

If you already have your own images (and, optionally, annotations), you can convert your dataset using [Roboflow](https://roboflow.com), a set of tools developers use to build better computer vision models quickly and accurately. 150k+ developers use Roboflow for (automatic) annotation, converting dataset formats (for example, to Detectron2), training, deploying, and improving their datasets and models.

Follow [the getting started guide here](https://docs.roboflow.com/quick-start) to create and prepare your own custom dataset. If you create your dataset on Roboflow, make sure to select the **Instance Segmentation** option.

Useful Dataset Links

* [Helmet Instance Segmentation](https://universe.roboflow.com/computer-vision-hx9i9/helmet_polygon_v2/dataset/4)

* [PCB Board Instance Segmentation](https://universe.roboflow.com/chip/pcb_segmentation_yolov7/dataset/17)

* [Fire Instance Segmentation](https://universe.roboflow.com/fire-instance-segmentation/fire-detection-pr6nj/dataset/1)

## Before you start

Let's make sure that we have access to a GPU. We can use the `nvidia-smi` command to do that. In case of any problems, navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator` and set it to `GPU`.

In [None]:
!nvidia-smi
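
Outside Colab, the same check can be scripted. The following is a minimal sketch, assuming `nvidia-smi` is on the `PATH` whenever an NVIDIA driver is installed. The helper name `list_gpus` is hypothetical and is not part of Detectron2.

```python
import shutil
import subprocess


def list_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    # nvidia-smi ships with the NVIDIA driver; if it is missing, assume no GPU.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # One GPU name per non-empty output line
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
print("GPUs found:", gpus if gpus else "none")
```

An empty list means the runtime has no visible GPU, in which case the hardware accelerator setting above is the first thing to check.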

## Install Detectron2 and dependencies

In [None]:
!python3 -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

Now is a good time to confirm that we have the right versions of the libraries at our disposal.

In [None]:
import torch, detectron2
!nvcc --version
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
print("detectron2:", detectron2.__version__)

In [None]:
# COMMON LIBRARIES
import os
import cv2

from datetime import datetime
#from google.colab.patches import cv2_imshow

# DATA SET PREPARATION AND LOADING
from detectron2.data.datasets import register_coco_instances
from detectron2.data import DatasetCatalog, MetadataCatalog

# VISUALIZATION
from detectron2.utils.visualizer import Visualizer
from detectron2.utils.visualizer import ColorMode

# CONFIGURATION
from detectron2 import model_zoo
from detectron2.config import get_cfg

# EVALUATION
from detectron2.engine import DefaultPredictor

# TRAINING
from detectron2.engine import DefaultTrainer

## Run a Pre-trained Detectron2 Model

Before you start training, it's a good idea to check that everything is working properly. The best way to do this is to perform inference using a pre-trained model.

In [None]:
# !wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O input.jpg
# image = cv2.imread("./input.jpg")
# cv2_imshow(image)

In [None]:
# cfg = get_cfg()
# cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
# cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# predictor = DefaultPredictor(cfg)
# outputs = predictor(image)

In [None]:
# print(outputs["instances"].pred_classes)
# print(outputs["instances"].pred_boxes)

In [None]:
# visualizer = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
# out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
# # cv2_imshow(out.get_image()[:, :, ::-1])

## COCO Format Dataset

### Download

We use the `football-pitch-segmentation` dataset as an example. Feel free to visit [Roboflow Universe](https://universe.roboflow.com/) and select any other instance segmentation dataset. Make sure to download the dataset in the correct format: `COCO Segmentation`.

The structure of your dataset should look like this:

```
dataset-directory/
├─ README.dataset.txt
├─ README.roboflow.txt
├─ train
│  ├─ train-image-1.jpg
│  ├─ train-image-2.jpg
│  ├─ ...
│  └─ _annotations.coco.json
├─ test
│  ├─ test-image-1.jpg
│  ├─ test-image-2.jpg
│  ├─ ...
│  └─ _annotations.coco.json
└─ valid
   ├─ valid-image-1.jpg
   ├─ valid-image-2.jpg
   ├─ ...
   └─ _annotations.coco.json
```
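
Before going further, it can help to confirm that a downloaded dataset matches this layout. The sketch below uses only the standard library. The function name `find_layout_problems` and the list of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

EXPECTED_SPLITS = ("train", "test", "valid")
ANNOTATION_FILE = "_annotations.coco.json"
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def find_layout_problems(dataset_dir):
    """Return a list of problems; an empty list means the layout looks right."""
    root = Path(dataset_dir)
    problems = []
    for split in EXPECTED_SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split}/")
            continue
        if not (split_dir / ANNOTATION_FILE).is_file():
            problems.append(f"missing annotation file: {split}/{ANNOTATION_FILE}")
        if not any(p.suffix.lower() in IMAGE_SUFFIXES for p in split_dir.iterdir()):
            problems.append(f"no images found in: {split}/")
    return problems


# "dataset-directory" is the placeholder name from the tree above
for problem in find_layout_problems("dataset-directory"):
    print(problem)
```

If nothing is printed, every split directory exists and contains both an annotation file and at least one image.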

In [None]:
import os
import json
import shutil
from sklearn.model_selection import train_test_split
from collections import defaultdict

# Load the result.json annotation file
with open("/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/COCO_OPMD_SP/result.json", "r", encoding="utf-8") as f:  # Replace with your path
    data = json.load(f)

# Group annotations by their image_id
annotations_grouped_by_image = defaultdict(list)
for annotation in data["annotations"]:
    annotations_grouped_by_image[annotation["image_id"]].append(annotation)

# Split the image data and associated annotations into train, test, and validation sets
train_images, temp_images = train_test_split(data["images"], test_size=0.3, random_state=42)
valid_images, test_images = train_test_split(temp_images, test_size=0.5, random_state=42)

# Extract annotations for each split based on the image IDs
train_annotations = [annotations_grouped_by_image[image["id"]] for image in train_images]
train_annotations = [item for sublist in train_annotations for item in sublist]  # Flatten the list

test_annotations = [annotations_grouped_by_image[image["id"]] for image in test_images]
test_annotations = [item for sublist in test_annotations for item in sublist]  # Flatten the list

valid_annotations = [annotations_grouped_by_image[image["id"]] for image in valid_images]
valid_annotations = [item for sublist in valid_annotations for item in sublist]  # Flatten the list

# Create directories for train, test, and validation images
for dir_name in ["train", "test", "valid"]:
    os.makedirs(dir_name, exist_ok=True)

# Base path to the directory containing the images
base_path = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/COCO_OPMD_SP/"  # Adjust this path to your dataset's image directory

# Move the images to their respective directories
def move_images_to_dir(images, split_name):
    for image_data in images:
        source_path = os.path.join(base_path, image_data["file_name"])
        file_name = os.path.basename(image_data["file_name"])
        shutil.move(source_path, os.path.join(split_name, file_name))
        # Store the bare file name so it resolves against the split's image root
        image_data["file_name"] = file_name

move_images_to_dir(train_images, "train")
move_images_to_dir(test_images, "test")
move_images_to_dir(valid_images, "valid")

# Save split data to JSON files
def save_split_to_json(split_name, images, annotations):
    split_data = {
        "images": images,
        "annotations": annotations,
        "categories": data["categories"],
        "info": data.get("info", {})  # "info" is optional in some exports
    }
    # Write next to the images (e.g. train/train.json), where the Register step expects it
    with open(os.path.join(split_name, f"{split_name}.json"), "w", encoding="utf-8") as f:
        json.dump(split_data, f, ensure_ascii=False, indent=4)

save_split_to_json("train", train_images, train_annotations)
save_split_to_json("test", test_images, test_annotations)
save_split_to_json("valid", valid_images, valid_annotations)
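
After splitting, two properties are worth checking: no image should appear in more than one split, and every annotation should point at an image in its own split. The following is a small sketch of both checks. The helper names and the toy data are made up for illustration; in practice you would pass the `train_images`, `valid_images`, `test_images` and annotation lists built above.

```python
def overlapping_image_ids(*splits):
    """Return the image ids that appear in more than one split."""
    seen, duplicates = set(), set()
    for images in splits:
        for image in images:
            if image["id"] in seen:
                duplicates.add(image["id"])
            else:
                seen.add(image["id"])
    return sorted(duplicates)


def orphan_annotation_ids(images, annotations):
    """Return ids of annotations whose image_id is not part of this split."""
    image_ids = {image["id"] for image in images}
    return [ann["id"] for ann in annotations if ann["image_id"] not in image_ids]


# Made-up toy data standing in for the real split lists
toy_train = [{"id": 1}, {"id": 2}]
toy_valid = [{"id": 3}]
toy_test = [{"id": 4}]
toy_train_annotations = [{"id": 10, "image_id": 1}, {"id": 11, "image_id": 2}]

print(overlapping_image_ids(toy_train, toy_valid, toy_test))    # → []
print(orphan_annotation_ids(toy_train, toy_train_annotations))  # → []
```

Both lists should be empty for a clean split. A non-empty result usually points at duplicate image ids in the source annotation file.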


### Register

When you use Detectron2, you need to [register your dataset](https://detectron2.readthedocs.io/en/latest/tutorials/datasets.html#register-a-coco-format-dataset) before you actually train the model.

In [None]:
# Replace these with your dataset's name and paths
DATA_SET_NAME = "OPMD5"
ANNOTATIONS_FILE_NAME = "result.json"


In [None]:
# Replace the paths with your dataset's paths

# TRAIN SET
TRAIN_DATA_SET_NAME = f"{DATA_SET_NAME}-train"
TRAIN_DATA_SET_IMAGES_DIR_PATH = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/train"
TRAIN_DATA_SET_ANN_FILE_PATH = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/train/train.json"

register_coco_instances(
    name=TRAIN_DATA_SET_NAME,
    metadata={},
    json_file=TRAIN_DATA_SET_ANN_FILE_PATH,
    image_root=TRAIN_DATA_SET_IMAGES_DIR_PATH
)

# TEST SET
TEST_DATA_SET_NAME = f"{DATA_SET_NAME}-test"
TEST_DATA_SET_IMAGES_DIR_PATH = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/test"
TEST_DATA_SET_ANN_FILE_PATH = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/test/test.json"

register_coco_instances(
    name=TEST_DATA_SET_NAME,
    metadata={},
    json_file=TEST_DATA_SET_ANN_FILE_PATH,
    image_root=TEST_DATA_SET_IMAGES_DIR_PATH
)

# VALID SET
VALID_DATA_SET_NAME = f"{DATA_SET_NAME}-valid"
VALID_DATA_SET_IMAGES_DIR_PATH = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/valid"
VALID_DATA_SET_ANN_FILE_PATH = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/valid/valid.json"

register_coco_instances(
    name=VALID_DATA_SET_NAME,
    metadata={},
    json_file=VALID_DATA_SET_ANN_FILE_PATH,
    image_root=VALID_DATA_SET_IMAGES_DIR_PATH
)
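
Conceptually, registering a dataset stores a loader function under a name, and the annotations are only read when the dataset is requested. The toy class below illustrates that idea with the standard library only. It is not Detectron2's implementation, and `ToyDatasetCatalog` and `load_toy_train` are hypothetical names.

```python
class ToyDatasetCatalog:
    """A toy name -> loader registry (illustration only, not Detectron2 code)."""

    def __init__(self):
        self._loaders = {}

    def register(self, name, loader):
        # Refuse duplicates so a repeated name cannot silently replace a dataset
        if name in self._loaders:
            raise ValueError(f"Dataset '{name}' is already registered")
        self._loaders[name] = loader

    def get(self, name):
        # The loader only runs here, so registering itself is cheap
        return self._loaders[name]()

    def list(self):
        return sorted(self._loaders)


catalog = ToyDatasetCatalog()
calls = []


def load_toy_train():
    calls.append("loaded")
    return [{"file_name": "train-image-1.jpg", "annotations": []}]


catalog.register("toy-train", load_toy_train)
print(calls)                          # → []
print(len(catalog.get("toy-train")))  # → 1
print(calls)                          # → ['loaded']
```

The duplicate check mirrors a practical detail: re-running a registration cell with a name that is already taken fails, so restart the kernel or pick a new `DATA_SET_NAME` when you change the paths.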


We can now confirm that our custom dataset was correctly registered using [MetadataCatalog](https://detectron2.readthedocs.io/en/latest/modules/data.html#detectron2.data.MetadataCatalog).

In [None]:
[
    data_set
    for data_set
    in MetadataCatalog.list()
    if data_set.startswith(DATA_SET_NAME)
]

### Visualize

Let's take a look at a single entry from our train dataset.

In [None]:
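import matplotlib.pyplot as plt

# Load the metadata and the first entry of the registered train set
metadata = MetadataCatalog.get(TRAIN_DATA_SET_NAME)
dataset_entry = DatasetCatalog.get(TRAIN_DATA_SET_NAME)[0]
image = cv2.imread(dataset_entry["file_name"])  # OpenCV loads images as BGR

# Visualizer expects RGB input, so reverse the channel order
visualizer = Visualizer(image[:, :, ::-1], metadata=metadata, scale=0.8, instance_mode=ColorMode.IMAGE_BW)
out = visualizer.draw_dataset_dict(dataset_entry)
image_rgb = out.get_image()  # Already RGB, which is what matplotlib expects

plt.figure(figsize=(10, 10))
plt.imshow(image_rgb)
plt.axis('off')  # No axes for this image
plt.show()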

## Train Model Using Custom COCO Format Dataset

### Configuration

In [None]:
import os
from datetime import datetime
from detectron2.config import get_cfg
from detectron2.data import transforms as T
from detectron2 import model_zoo

# SELECT THE MODEL
MODEL_TYPE = "mask_rcnn"  # Choose either 'mask_rcnn' or 'faster_rcnn'

# HYPERPARAMETERS
# Each model type lives in a different model zoo directory
ARCHITECTURES = {
    "mask_rcnn": "COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x",
    "faster_rcnn": "COCO-Detection/faster_rcnn_R_101_FPN_3x"
}
CONFIG_FILE_PATH = f"{ARCHITECTURES[MODEL_TYPE]}.yaml"
ARCHITECTURE = os.path.basename(ARCHITECTURES[MODEL_TYPE])
MAX_ITER = 6000
EVAL_PERIOD = 200
BASE_LR = 0.001
NUM_CLASSES = 6

# OUTPUT DIR
OUTPUT_DIR_PATH = os.path.join(
    DATA_SET_NAME,
    ARCHITECTURE,
    datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
)
os.makedirs(OUTPUT_DIR_PATH, exist_ok=True)

# CONFIGURATION SETUP
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(CONFIG_FILE_PATH))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(CONFIG_FILE_PATH)
cfg.DATASETS.TRAIN = (TRAIN_DATA_SET_NAME,)
cfg.DATASETS.TEST = (TEST_DATA_SET_NAME,)


# AUGMENTATIONS
# Detectron2 has no cfg.DATALOADER.AUGMENTATIONS key, so DefaultTrainer would ignore it.
# The built-in resize and flip options live under cfg.INPUT instead.
cfg.INPUT.MIN_SIZE_TRAIN = (640, 672, 704, 736, 768, 800)
cfg.INPUT.MAX_SIZE_TRAIN = 1333
cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING = "choice"
# "vertical" and "none" are the other options; combining both flips needs a custom DatasetMapper
cfg.INPUT.RANDOM_FLIP = "horizontal"

cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 32
cfg.TEST.EVAL_PERIOD = EVAL_PERIOD
cfg.DATALOADER.NUM_WORKERS = 2
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.INPUT.MASK_FORMAT = 'bitmask'
cfg.SOLVER.BASE_LR = BASE_LR
cfg.SOLVER.MAX_ITER = MAX_ITER
cfg.MODEL.ROI_HEADS.NUM_CLASSES = NUM_CLASSES
cfg.OUTPUT_DIR = OUTPUT_DIR_PATH
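
`NUM_CLASSES` has to match the dataset, so it is safer to derive it from the annotation file than to hard-code it. The sketch below counts the entries under `categories` and lists the ones that no annotation uses. It assumes `NUM_CLASSES` should cover every category entry, including placeholder entries that some export tools add, so treat the result as a starting point. The helper name and the toy data are made up.

```python
from collections import Counter


def summarize_categories(coco):
    """Return (number of category entries, names of categories with no annotations)."""
    used = Counter(ann["category_id"] for ann in coco["annotations"])
    unused = [cat["name"] for cat in coco["categories"] if used[cat["id"]] == 0]
    return len(coco["categories"]), unused


# Made-up toy annotation data; in practice pass the dict returned by
# json.load() on your train annotation file
toy_coco = {
    "categories": [
        {"id": 0, "name": "objects"},
        {"id": 1, "name": "helmet"},
        {"id": 2, "name": "head"},
    ],
    "annotations": [{"category_id": 1}, {"category_id": 2}, {"category_id": 1}],
}

count, unused = summarize_categories(toy_coco)
print(count, unused)  # → 3 ['objects']
```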

In [None]:
import os
from datetime import datetime
from detectron2.config import get_cfg
from detectron2.data import transforms as T
from detectron2 import model_zoo

# HYPERPARAMETERS
MAX_ITER = 6000
EVAL_PERIOD = 200
BASE_LR = 0.001
NUM_CLASSES = 6  # Update this based on your dataset

# OUTPUT DIR
OUTPUT_DIR_PATH = os.path.join(
    "OPMD3_frcnn",  # Replace with your desired output directory name
    datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
)
os.makedirs(OUTPUT_DIR_PATH, exist_ok=True)

# CONFIGURATION SETUP
cfg = get_cfg()

# # Set the path to your custom configuration file
# CONFIG_FILE_PATH = "path/to/your/config/file.yaml"
# cfg.merge_from_file(CONFIG_FILE_PATH)

# Set the path to your downloaded model weights
MODEL_WEIGHTS_PATH = '/home/buddhadev/Buddhadev_Everything/OPMD/Fewshot/Meta_Faster_RCNN_model_final_coco.pth'
cfg.MODEL.WEIGHTS = MODEL_WEIGHTS_PATH

# Assuming TRAIN_DATA_SET_NAME and TEST_DATA_SET_NAME are already defined
cfg.DATASETS.TRAIN = (TRAIN_DATA_SET_NAME,)
cfg.DATASETS.TEST = (TEST_DATA_SET_NAME,)

# Augmentations (adjust as needed for your dataset)
# Detectron2 has no cfg.DATALOADER.AUGMENTATIONS key, so DefaultTrainer would ignore it.
# The built-in resize and flip options live under cfg.INPUT instead.
cfg.INPUT.MIN_SIZE_TRAIN = (640, 672, 704, 736, 768, 800)
cfg.INPUT.MAX_SIZE_TRAIN = 1333
cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING = "choice"
cfg.INPUT.RANDOM_FLIP = "horizontal"  # "vertical" and "none" are the other options

# Other Configurations
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 32
cfg.TEST.EVAL_PERIOD = EVAL_PERIOD
cfg.DATALOADER.NUM_WORKERS = 2
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.INPUT.MASK_FORMAT = 'bitmask'
cfg.SOLVER.BASE_LR = BASE_LR
cfg.SOLVER.MAX_ITER = MAX_ITER
cfg.MODEL.ROI_HEADS.NUM_CLASSES = NUM_CLASSES
cfg.OUTPUT_DIR = OUTPUT_DIR_PATH


In [None]:
# # HYPERPARAMETERS
# ARCHITECTURE = "mask_rcnn_R_101_FPN_3x"
# CONFIG_FILE_PATH = f"COCO-InstanceSegmentation/{ARCHITECTURE}.yaml"
# MAX_ITER = 5000
# EVAL_PERIOD = 200
# BASE_LR = 0.001
# NUM_CLASSES = 3

# # OUTPUT DIR
# OUTPUT_DIR_PATH = os.path.join(
#     DATA_SET_NAME,
#     ARCHITECTURE,
#     datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
# )

# os.makedirs(OUTPUT_DIR_PATH, exist_ok=True)

In [None]:
# cfg = get_cfg()
# cfg.merge_from_file(model_zoo.get_config_file(CONFIG_FILE_PATH))
# cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(CONFIG_FILE_PATH)
# cfg.DATASETS.TRAIN = (TRAIN_DATA_SET_NAME,)
# cfg.DATASETS.TEST = (TEST_DATA_SET_NAME,)
# cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
# cfg.TEST.EVAL_PERIOD = EVAL_PERIOD
# cfg.DATALOADER.NUM_WORKERS = 2
# cfg.SOLVER.IMS_PER_BATCH = 2
# cfg.INPUT.MASK_FORMAT='bitmask'
# cfg.SOLVER.BASE_LR = BASE_LR
# cfg.SOLVER.MAX_ITER = MAX_ITER
# cfg.MODEL.ROI_HEADS.NUM_CLASSES = NUM_CLASSES
# cfg.OUTPUT_DIR = OUTPUT_DIR_PATH

### Training

In [None]:
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
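
Detectron2 schedules training in iterations rather than epochs, and each iteration processes `IMS_PER_BATCH` images. The sketch below converts between the two so that `MAX_ITER` can be chosen with the dataset size in mind. The function names are hypothetical, and the figure of 600 training images is made up for illustration.

```python
import math


def iterations_to_epochs(max_iter, ims_per_batch, num_train_images):
    """Approximate number of passes over the training set."""
    return max_iter * ims_per_batch / num_train_images


def epochs_to_iterations(epochs, ims_per_batch, num_train_images):
    """Iterations needed for the requested number of passes (rounded up)."""
    return math.ceil(epochs * num_train_images / ims_per_batch)


# MAX_ITER = 6000 and IMS_PER_BATCH = 2 come from the configuration above;
# 600 training images is an assumed dataset size
print(iterations_to_epochs(6000, 2, 600))  # → 20.0
print(epochs_to_iterations(20, 2, 600))    # → 6000
```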

In [None]:
# Look at training curves in tensorboard:
%load_ext tensorboard
%tensorboard --logdir $OUTPUT_DIR_PATH

### Evaluation

In [None]:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7
predictor = DefaultPredictor(cfg)
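
`SCORE_THRESH_TEST` controls which predictions survive: anything scoring at or below the threshold is dropped before the results reach you. The toy example below shows the effect on a hand-written list of predictions. The scores and labels are made up, and `keep_confident` is a hypothetical helper rather than a Detectron2 function.

```python
def keep_confident(predictions, score_threshold):
    """Keep only predictions whose score is strictly above the threshold."""
    return [p for p in predictions if p["score"] > score_threshold]


# Made-up predictions for illustration
toy_predictions = [
    {"label": "helmet", "score": 0.95},
    {"label": "helmet", "score": 0.72},
    {"label": "head", "score": 0.55},
    {"label": "head", "score": 0.30},
]

print(len(keep_confident(toy_predictions, 0.7)))  # → 2
print(len(keep_confident(toy_predictions, 0.5)))  # → 3
```

A higher threshold gives cleaner visualizations but hides low-confidence detections, so a lower value is usually the better choice when inspecting what the model misses.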

In [None]:
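from google.colab.patches import cv2_imshow  # Colab only; use the matplotlib cell below when running locally

dataset_valid = DatasetCatalog.get(VALID_DATA_SET_NAME)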

for d in dataset_valid:
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)

    visualizer = Visualizer(
        img[:, :, ::-1],
        metadata=metadata,
        scale=0.8,
        instance_mode=ColorMode.IMAGE_BW
    )
    out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])

In [None]:
import cv2
import matplotlib.pyplot as plt
from detectron2.data import DatasetCatalog
from detectron2.utils.visualizer import Visualizer, ColorMode

# Assuming you've already registered your dataset and set VALID_DATA_SET_NAME
dataset_valid = DatasetCatalog.get(VALID_DATA_SET_NAME)
metadata = MetadataCatalog.get(VALID_DATA_SET_NAME)  # Class names used for the labels

for d in dataset_valid:
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)

    visualizer = Visualizer(
        img[:, :, ::-1],
        metadata=metadata,  # Ensure this is set to your dataset's metadata
        scale=0.8,
        instance_mode=ColorMode.IMAGE_BW  # ColorMode.IMAGE_BW or other modes you prefer
    )
    out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    
    # Display the image in Jupyter Notebook (get_image() already returns RGB)
    plt.figure(figsize=(10, 10))
    plt.imshow(out.get_image())
    plt.axis("off")
    plt.show()


In [None]:
import cv2
import matplotlib.pyplot as plt
from detectron2.data import DatasetCatalog
from detectron2.utils.visualizer import Visualizer, ColorMode

# Assuming you've already registered your dataset and set TEST_DATA_SET_NAME
dataset_test = DatasetCatalog.get(TEST_DATA_SET_NAME)
metadata = MetadataCatalog.get(TEST_DATA_SET_NAME)  # Class names used for the labels

for d in dataset_test:
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)

    visualizer = Visualizer(
        img[:, :, ::-1],
        metadata=metadata,  # Ensure this is set to your dataset's metadata
        scale=0.8,
        instance_mode=ColorMode.IMAGE_BW  # ColorMode.IMAGE_BW or other modes you prefer
    )
    out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    
    # Display the image in Jupyter Notebook (get_image() already returns RGB)
    plt.figure(figsize=(10, 10))
    plt.imshow(out.get_image())
    plt.axis("off")
    plt.show()
 


In [None]:
# Save the visualized test-set predictions to disk

SAVE_DIR = "./visualized_predictions3/"  # Define where you want to save the visualized images

os.makedirs(SAVE_DIR, exist_ok=True)

for idx, d in enumerate(dataset_test):
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)

    visualizer = Visualizer(
        img[:, :, ::-1],
        metadata=metadata,  # Ensure this is set to your dataset's metadata
        scale=0.8,
        instance_mode=ColorMode.IMAGE_BW  # ColorMode.IMAGE_BW or other modes you prefer
    )
    out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))

    # Render the image with matplotlib (get_image() already returns RGB)
    plt.figure(figsize=(10, 10))
    plt.imshow(out.get_image())
    plt.axis("off")
    
    # Save the visualized image using matplotlib
    save_path = os.path.join(SAVE_DIR, f"visualized_{idx}.jpg")
    plt.savefig(save_path, bbox_inches='tight', pad_inches=0)
    plt.close()  # Close the figure after saving

# Note: the visualized images are written to SAVE_DIR (here "./visualized_predictions3/").


In [None]:
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# Absolute path to the trained weights (replace with your own checkpoint)
cfg.MODEL.WEIGHTS = "/home/buddhadev/Buddhadev_Everything/OPMD/detecron2/OPMD3_fSL/2023-11-15-21-11-08/model_final.pth"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.50
predictor = DefaultPredictor(cfg)  # Loads cfg.MODEL.WEIGHTS into a fresh model
evaluator = COCOEvaluator(TEST_DATA_SET_NAME, distributed=False, output_dir="./output/")
test_loader = build_detection_test_loader(cfg, TEST_DATA_SET_NAME)
# Evaluate the model that holds the loaded checkpoint rather than the in-memory trainer
results = inference_on_dataset(predictor.model, test_loader, evaluator)
print(results)
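
The evaluation call returns a nested dictionary with one entry per task (for example `bbox` and `segm`), each mapping metric names to values. The sketch below turns such a dictionary into an aligned text table. The helper name `format_results` is hypothetical, and the zeros are placeholders standing in for a real result, not measured numbers.

```python
def format_results(results):
    """Render a {task: {metric: value}} mapping as aligned text lines."""
    lines = []
    for task, metrics in results.items():
        for name, value in metrics.items():
            lines.append(f"{task:<5} {name:<6} {value:6.2f}")
    return "\n".join(lines)


# Placeholder values only; pass the dictionary returned by the evaluation instead
placeholder_results = {
    "bbox": {"AP": 0.0, "AP50": 0.0, "AP75": 0.0},
    "segm": {"AP": 0.0, "AP50": 0.0, "AP75": 0.0},
}
print(format_results(placeholder_results))
```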