# Training an Object Detection model using Detectron2

This 5-minute quickstart tutorial demonstrates how to train Detectron2 on an object detection dataset. In such a dataset, each example is a bounding box with a class label that surrounds a physical object in an image. Using this labeled data, we train a model to predict the classes of objects in an image and their locations.

This notebook trains a state-of-the-art object detection model with Detectron2 and uses it to produce `pred_probs`, which help detect label errors in object detection datasets.

To identify label errors, we provide an [example](https://github.com/cleanlab/examples/) notebook on our GitHub repository on [Finding Label Errors in Object Detection Datasets](https://github.com/cleanlab/cleanlab/blob/master/docs/source/tutorials/object_detection.ipynb). Here we fit a state-of-the-art neural network initialized from a pretrained [X-101](https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md#imagenet-pretrained-models) network backbone.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cleanlab/examples/blob/master/object_detection/detectron2_training.ipynb)

In [None]:
# Common libraries
import glob
import json
import os
import pickle
import random

import cv2
import numpy as np

# Detectron2 utilities
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultPredictor, DefaultTrainer
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.utils.visualizer import Visualizer

In [None]:
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/test_coco_0_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/test_coco_1_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/test_coco_2_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/test_coco_3_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/test_coco_4_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/train_coco_0_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/train_coco_1_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/train_coco_2_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/train_coco_3_fold.json"
# !wget -nc "https://cleanlab-public.s3.amazonaws.com/ObjectDetectionBenchmarking/DATASET_annotations/train_coco_4_fold.json"
# !wget -nc "http://images.cocodataset.org/zips/val2017.zip" && unzip -q -o val2017.zip
# !wget -nc "http://images.cocodataset.org/zips/train2017.zip" && unzip -q -o train2017.zip

Please download the [COCO dataset](https://cocodataset.org/#download).

Before you begin training on a custom dataset, be sure to review the COCO dataset guidelines for formatting your data, which can be found on their [website](https://cocodataset.org/#format-data).

To use a custom dataset named "my_dataset" with detectron2, you must implement a function that returns the items in the dataset and then register that function with detectron2. This notebook trains on, and detects errors in, a subset of the COCO labels: "car", "chair", "cup", "person", and "traffic light".
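As a rough illustration of restricting a COCO-style annotation file to a subset of labels, the sketch below filters an in-memory annotation dictionary by category name. The helper `filter_coco_categories` and the toy annotations are hypothetical and are not the preprocessing used to build the fold files in this notebook. Dropping images that end up with no annotations is also an assumption.

```python
import copy


def filter_coco_categories(coco, keep_names):
    """Return a copy of a COCO-style dict restricted to the given category names."""
    keep_ids = {c["id"] for c in coco["categories"] if c["name"] in keep_names}
    filtered = copy.deepcopy(coco)
    filtered["categories"] = [c for c in filtered["categories"] if c["id"] in keep_ids]
    filtered["annotations"] = [
        a for a in filtered["annotations"] if a["category_id"] in keep_ids
    ]
    # Assumption: images left without any annotation are dropped as well.
    used_image_ids = {a["image_id"] for a in filtered["annotations"]}
    filtered["images"] = [im for im in filtered["images"] if im["id"] in used_image_ids]
    return filtered


# Toy annotations, made up for illustration only.
toy_coco = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
        {"id": 3, "file_name": "c.jpg"},
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 3, "name": "car"},
        {"id": 18, "name": "dog"},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 40]},
        {"id": 2, "image_id": 1, "category_id": 18, "bbox": [30, 30, 15, 15]},
        {"id": 3, "image_id": 2, "category_id": 18, "bbox": [5, 5, 25, 25]},
        {"id": 4, "image_id": 3, "category_id": 3, "bbox": [0, 0, 50, 30]},
    ],
}

subset = filter_coco_categories(toy_coco, {"person", "car"})
print([a["id"] for a in subset["annotations"]])
print([im["file_name"] for im in subset["images"]])
```

The original dictionary is left untouched, so the same annotations can be filtered again with a different label subset.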

In [None]:
IMAGE_PATH = ""
TRAIN_PATH = os.path.join(IMAGE_PATH,"train2017")
VAL_PATH = os.path.join(IMAGE_PATH,"val2017")
pairs = []
for k in range(0,5):
    train_name = f"train_coco_{k}_fold"
    json_train = train_name+".json"
    test_name = f"test_coco_{k}_fold"
    json_test = test_name+".json"
    register_coco_instances(train_name, {}, json_train,
                        TRAIN_PATH)
    register_coco_instances(test_name, {}, json_test,
                        VAL_PATH)
    pairs.append((train_name,test_name))

We define the configuration settings for training an object detection model using Detectron2. The model architecture used in this example is "faster_rcnn_X_101_32x8d_FPN_3x" from the COCO-Detection model zoo. The training and validation data are specified by the dataset names passed to the training function, here the `train_coco_{k}_fold` and `test_coco_{k}_fold` datasets registered above, which refer to COCO2017 train and val images containing only the subset of labels specified before.

The number of data loader workers is set to 2 and the batch size is set to 2 images per iteration. The learning rate and maximum number of iterations are also specified. The model is initialized from the COCO-Detection model zoo checkpoint and the output directory for the trained model is created. Finally, the configuration is passed to the `DefaultTrainer` class for training the object detection model.

<strong>Note:</strong> The number of iterations was set based on [early stopping.](https://en.wikipedia.org/wiki/Early_stopping#:~:text=In%20machine%20learning%2C%20early%20stopping,training%20data%20with%20each%20iteration.)
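One simple way to pick an iteration count by early stopping is to track a validation metric at each evaluation point and stop once it has not improved for a set number of evaluations. The sketch below shows that idea in plain Python. The helper `pick_stopping_iteration`, the `patience` value, and the metric history are all hypothetical, and the numbers are made up rather than measured results from this notebook.

```python
def pick_stopping_iteration(eval_history, patience=2):
    """Return (iteration, metric) of the best evaluation seen before stopping.

    eval_history: (iteration, validation_metric) pairs in training order,
    where a higher metric is better.
    patience: number of consecutive non-improving evaluations tolerated.
    """
    best_iter, best_metric = None, float("-inf")
    evals_since_best = 0
    for iteration, metric in eval_history:
        if metric > best_metric:
            best_iter, best_metric = iteration, metric
            evals_since_best = 0
        else:
            evals_since_best += 1
            if evals_since_best >= patience:
                break  # stop: no improvement for `patience` evaluations
    return best_iter, best_metric


# Made-up validation scores, one per evaluation point.
toy_history = [
    (500, 20.0),
    (1000, 25.0),
    (1500, 27.0),
    (2000, 26.5),
    (2500, 26.0),
    (3000, 28.0),
]
best_iter, best_metric = pick_stopping_iteration(toy_history, patience=2)
print(best_iter, best_metric)
```

With a patience of 2 the loop stops after the evaluation at iteration 2500, so the later improvement at iteration 3000 is never seen. A larger patience trades extra training time for a lower chance of stopping too soon.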

## Train the model


In [None]:
def train_data(TRAIN,VALIDATION,folder):
    # Train on the registered dataset TRAIN with VALIDATION as the test set; outputs go to folder
    cfg = get_cfg()
    MODEL = 'faster_rcnn_X_101_32x8d_FPN_3x.yaml'
    cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/"+MODEL))
    cfg.DATASETS.TRAIN = (TRAIN,)
    cfg.DATASETS.TEST = (VALIDATION,)
    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/"+MODEL)  # Let training initialize from model zoo
    cfg.SOLVER.IMS_PER_BATCH = 2  # number of images per training batch
    cfg.SOLVER.BASE_LR = 0.00025  # base learning rate
    cfg.SOLVER.MAX_ITER = 6000    # total training iterations, chosen via early stopping
    cfg.SOLVER.STEPS = []        # do not decay learning rate
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128   # number of RoIs sampled per image for the RoI head
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 80  # the labels in this notebook cover only 5 of these classes: ["car", "chair", "cup", "person", "traffic light"]
    cfg.OUTPUT_DIR = folder
    cfg.TEST.EVAL_PERIOD = 500  # evaluation period in iterations
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()
    return cfg  # returned so the caller can reuse this configuration for inference


The following code block defines `format_detectron2_predictions`, which converts the output of Detectron2 into a format that cleanlab can use to identify label errors. The function accepts the predicted instances and the number of classes as inputs. It groups the predicted bounding boxes by class and returns a list with one numpy array per class, where each row holds a bounding box followed by its score.

In [None]:
def format_detectron2_predictions(ins,num_classes):
    fields = ins.get_fields()
    boxes = fields['pred_boxes'].tensor.numpy()
    # One list of [x1, y1, x2, y2, score] rows per class
    res = [[] for _ in range(num_classes)]
    for i in range(len(fields['pred_classes'])):
        pred_class = fields['pred_classes'][i].item()
        prob = fields['scores'][i].item()
        box_cord = list(boxes[i])
        box_cord.append(prob)
        res[pred_class].append(box_cord)
    res2 = []
    for rows in res:
        # Classes without detections become empty (0, 5) arrays, so every array has 5 columns
        res2.append(np.array(rows,dtype=np.float32).reshape((-1,5)))
    return res2
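To make the output layout concrete without needing Detectron2, here is a NumPy-only sketch of the same grouping applied to plain arrays. The helper `group_boxes_by_class` and the toy detections are hypothetical and only illustrate the intended shapes: one array per class, with one row of four box coordinates plus a score per detection.

```python
import numpy as np


def group_boxes_by_class(boxes, scores, classes, num_classes):
    """Group detections into one (M, 5) float32 array per class.

    Each row is [x1, y1, x2, y2, score]; classes with no detections
    get an empty array of shape (0, 5).
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32).reshape(-1, 1)
    classes = np.asarray(classes, dtype=np.int64)
    rows = np.hstack([boxes, scores])  # shape (N, 5)
    return [rows[classes == c] for c in range(num_classes)]


# Toy detections, made up for illustration only.
toy = group_boxes_by_class(
    boxes=[[10, 20, 50, 80], [5, 5, 30, 40], [0, 0, 10, 10]],
    scores=[0.9, 0.8, 0.75],
    classes=[2, 0, 2],
    num_classes=3,
)
print([a.shape for a in toy])
```

Class 1 has no detections in this toy input, so its entry is an empty array that still has five columns.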

In [None]:
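# Inspect the annotation file of one fold (k keeps its value from the last loop above)
with open("../"+pairs[k][1]+'.json','rb') as f:
    dat = json.load(f)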

In [None]:
dat['images'][0]

In [None]:
result_dict = {}
for k in range(0,len(pairs)):
    # Distinct variable names keep the train_data function from being shadowed
    train_name = pairs[k][0]
    val_name = pairs[k][1]
    # Train on this fold; the returned config is reused below for inference
    cfg = train_data(train_name,val_name,"COCO_TRAIN_"+str(k)+"_FOLD")
    evaluator = COCOEvaluator(val_name, output_dir="output")
    val_loader = build_detection_test_loader(cfg, val_name)
    cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")  # path to the model we just trained
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7   # set a custom testing threshold
    predictor = DefaultPredictor(cfg)
    with open("../"+val_name+'.json','rb') as f:
        dataset = json.load(f)
    # Predict on every image of the held-out fold and store the formatted output by image path
    for image in dataset['images']:
        im_name = os.path.join(TRAIN_PATH, image['file_name'])
        im = cv2.imread(im_name)
        outputs = predictor(im)
        result_dict[im_name] = format_detectron2_predictions(outputs["instances"].to("cpu"),cfg.MODEL.ROI_HEADS.NUM_CLASSES)

In [None]:
# Order the predictions to match the label file, one entry per image
with open("TRAIN_COCO_ALL_labels.pkl",'rb') as f:
    dataset = pickle.load(f)
results = []
for i in dataset:
    im_name = os.path.join(TRAIN_PATH, i['seg_map'].replace(".png",'.jpg'))
    results.append(result_dict[im_name])

with open("results_train_ALL.pkl",'wb') as f:
    pickle.dump(results,f)
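Before handing the saved predictions to a downstream tool, it can help to check their structure. The sketch below assumes the layout described earlier in this notebook: a list with one entry per image, where each entry is a list of per-class arrays whose rows are a bounding box followed by a score. The helper `check_prediction_format` and the toy predictions are hypothetical and not part of cleanlab or Detectron2.

```python
import numpy as np


def check_prediction_format(predictions, num_classes):
    """Return the number of images checked; raise ValueError on a malformed entry.

    Expects one entry per image, each holding num_classes arrays of shape (M, 5).
    """
    for image_idx, per_class in enumerate(predictions):
        if len(per_class) != num_classes:
            raise ValueError(
                f"image {image_idx}: expected {num_classes} arrays, got {len(per_class)}"
            )
        for class_idx, arr in enumerate(per_class):
            arr = np.asarray(arr)
            if arr.ndim != 2 or arr.shape[1] != 5:
                raise ValueError(
                    f"image {image_idx}, class {class_idx}: "
                    f"expected shape (M, 5), got {arr.shape}"
                )
    return len(predictions)


# Toy predictions for two images and two classes, made up for illustration only.
toy_predictions = [
    [
        np.array([[10, 20, 50, 80, 0.9]], dtype=np.float32),
        np.zeros((0, 5), dtype=np.float32),
    ],
    [
        np.zeros((0, 5), dtype=np.float32),
        np.array([[0, 0, 10, 10, 0.8], [5, 5, 30, 40, 0.75]], dtype=np.float32),
    ],
]
num_checked = check_prediction_format(toy_predictions, num_classes=2)
print(num_checked)
```

A check like this could be run on `results` right after loading `results_train_ALL.pkl`, with `num_classes` set to the value used during training.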