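1. What is Detectron2 and how does it differ from previous object detection
frameworks?
- Detectron2 is an open-source library from Facebook AI Research (FAIR) for object detection and segmentation. It provides state-of-the-art implementations of popular computer vision models, including Faster R-CNN, Mask R-CNN, Cascade R-CNN, RetinaNet, and Panoptic FPN.

 - It is widely used in research and production for tasks like image understanding, medical imaging, autonomous driving, and surveillance.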

How Detectron2 differs from previous object detection frameworks
1. PyTorch-based (vs Caffe2 in Detectron)

- Detectron2 is built entirely on PyTorch.
- PyTorch's dynamic computation graphs make debugging, customization, and experimentation easier.
- The original Detectron used Caffe2, which was harder to modify and debug.

2. Modular and Flexible Architecture

- Components such as backbones, heads, losses, and datasets are modular and can be swapped independently.
- Custom models and datasets are easy to plug in.
- Earlier frameworks were more monolithic and less flexible.

3. Improved Performance & Scalability

- Optimized training and inference pipelines
- Support for multi-GPU and distributed training
- Faster experimentation and deployment

4. Better Dataset Handling

- Native support for datasets like COCO, LVIS, and Pascal VOC
- Simple APIs for registering custom datasets
- Older frameworks required more manual dataset configuration.

5. Production-Ready Features

- TorchScript support for model deployment
- Better checkpointing and reproducibility
- Clean integration with modern PyTorch tools

6. State-of-the-Art Research Support

- Reference implementations of recent research models
- Strong benchmark results on COCO and other datasets
- Detectron2 replaced the original Detectron, which is no longer maintained.

2. Explain the process and importance of data annotation when working with
Detectron2.
- Step 1: Define the Task
   - Decide what the model needs to learn:
      - Object detection → Bounding boxes + class labels
      - Instance segmentation → Pixel-level masks
      - Keypoint detection → Body/joint points
- Step 2: Choose Annotation Format
   - Detectron2 commonly works with:
      - COCO JSON format
      - Pascal VOC XML
      - Custom datasets (registered via Detectron2 API)
- Step 3: Annotate the Data
   - Use annotation tools such as:
      - LabelImg (bounding boxes)
      - CVAT
      - LabelMe
      - Roboflow
   - Annotations include:
      - Class name (e.g., person, car)
      - Bounding box coordinates
      - Segmentation masks (if required)
- Step 4: Quality Checking
   - Remove incorrect labels
   - Ensure consistency in class names
   - Validate box/mask alignment
   - Poor-quality annotations lead to poor model performance.
- Step 5: Dataset Registration
   - Split the data into training, validation, and testing sets
   - Register each split in Detectron2 under its own dataset name
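
The split in Step 5 can be sketched in plain Python. This is a minimal illustration, assuming the annotations are already in one COCO-style dictionary; the function name `split_coco` and its parameters are hypothetical, not part of Detectron2. It splits by image so that all annotations of an image stay in the same subset.

```python
import random


def split_coco(coco, val_fraction=0.2, seed=0):
    """Split a COCO-style dict into (train, val) dicts, image by image."""
    image_ids = sorted(img["id"] for img in coco["images"])
    random.Random(seed).shuffle(image_ids)  # seeded so the split is reproducible
    n_val = max(1, round(len(image_ids) * val_fraction))
    val_ids = set(image_ids[:n_val])

    def subset(in_val):
        # Keep only images (and their annotations) on the requested side of the split.
        return {
            "images": [i for i in coco["images"] if (i["id"] in val_ids) == in_val],
            "annotations": [a for a in coco["annotations"]
                            if (a["image_id"] in val_ids) == in_val],
            "categories": coco["categories"],
        }

    return subset(False), subset(True)
```

Each returned dictionary can be written to its own JSON file and then registered under its own name. A separate test set could be carved out by calling the same function again on the training part.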

Importance of Data Annotation
1. Enables Supervised Learning
   - Detectron2 models rely on labeled data to learn object features.
   - Without annotations, the model cannot be trained.

2. Directly Impacts Model Accuracy
   - Accurate labels → high precision & recall
   - Incorrect labels → false detections, poor generalization

3. Supports Advanced Tasks
   - Proper annotation enables:
      - Instance segmentation
      - Panoptic segmentation
      - Keypoint detection
   - These tasks need complete mask or keypoint labels; weak or incomplete labels are not enough.

4. Reduces Training Time and Overfitting
   - Clean annotations:
      - Help models converge faster
      - Reduce noise in the loss, so the model is less likely to fit labeling errors

5. Ensures Real-World Reliability
   - For applications like:
      - Medical imaging
      - Autonomous driving
      - Surveillance
   - Incorrect annotations can cause serious real-world failures.
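
Some annotation errors can be caught automatically before training. The following is a minimal sketch, assuming COCO-style annotations with `[x_min, y_min, width, height]` boxes; the function name `find_annotation_problems` and the example data are made up for illustration.

```python
def find_annotation_problems(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}

    for ann in coco["annotations"]:
        label = f"annotation {ann['id']}"
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"{label}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"{label}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]  # COCO boxes are [x_min, y_min, width, height]
        if w <= 0 or h <= 0:
            problems.append(f"{label}: box has non-positive size")
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"{label}: box extends outside the image")
    return problems


# Hypothetical example: the second annotation has two problems.
example = {
    "images": [{"id": 1, "width": 100, "height": 80}],
    "categories": [{"id": 1, "name": "tiger"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 10, 30, 20]},
        {"id": 2, "image_id": 1, "category_id": 2, "bbox": [90, 70, 20, 20]},
    ],
}
for problem in find_annotation_problems(example):
    print(problem)
```

Checks like these only catch structural mistakes. Wrong class labels and loosely drawn boxes still need a visual review.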

3. Describe the steps involved in training a custom object detection model
using Detectron2.
- Install Detectron2
- Prepare and Annotate the Dataset
- Register the Dataset in Detectron2
- Choose a Pretrained Model & Configuration
- Modify Configuration for Custom Training
- Start Training the Model
- Evaluate the Model
- Run Inference on New Images
- Save/Deploy the Model
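
The training and evaluation steps can be sketched as a single function. This is an illustration rather than a definitive recipe: it assumes Detectron2 is installed and that both dataset names were already registered (for example with `register_coco_instances`). The function name `train_and_evaluate` and its defaults are hypothetical; the solver values mirror the ones used later in this notebook.

```python
import os


def train_and_evaluate(train_name, val_name, num_classes,
                       output_dir="./output_model", max_iter=3000):
    """Fine-tune a COCO-pretrained Faster R-CNN, then return COCO metrics."""
    # Imported lazily so the function can be defined before Detectron2 is installed.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data import build_detection_test_loader
    from detectron2.engine import DefaultTrainer
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset

    config_file = "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_file))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_file)
    cfg.DATASETS.TRAIN = (train_name,)
    cfg.DATASETS.TEST = ()  # evaluation is run explicitly below
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = num_classes
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.00025
    cfg.SOLVER.MAX_ITER = max_iter
    cfg.OUTPUT_DIR = output_dir
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)  # start from the pretrained weights
    trainer.train()                       # saves model_final.pth in output_dir

    # Evaluate the trained model on the validation split (AP, AP50, AP75, ...).
    evaluator = COCOEvaluator(val_name, output_dir=cfg.OUTPUT_DIR)
    val_loader = build_detection_test_loader(cfg, val_name)
    return inference_on_dataset(trainer.model, val_loader, evaluator)
```

A call such as `train_and_evaluate("my_dataset_train", "my_dataset_val", num_classes=3)` would cover the "Start Training" and "Evaluate" steps. Inference and deployment then load `model_final.pth` from the output directory.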

4. What are evaluation curves in Detectron2, and how are metrics like mAP
and IoU interpreted?
- Evaluation Curves in Detectron2
   - Evaluation curves are graphical tools used to understand the performance of an object detection model. Detectron2 reports summary metrics through its evaluators (for example COCOEvaluator) and logs training losses; the curves are plotted from these values. They help visualize:
      - How well the model detects objects
      - How accurate the predictions are
      - The trade-off between precision and recall
      - Performance across IoU thresholds
   - Common curves:
      1. Precision–Recall (PR) Curve
      2. IoU–Accuracy Curve
      3. Loss Curves
- 1. IoU (Intersection over Union)
   - IoU measures how well a predicted bounding box overlaps with the ground truth (actual object).
   - IoU = Area of Overlap / Area of Union
   - Interpretation
      - IoU = 1.0 → Perfect match
      - IoU = 0.5 → The overlap is half of the union; this is the usual minimum for counting a detection as correct
      - IoU < 0.5 → Often considered a wrong detection
      - Higher IoU = better localization accuracy
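
IoU is the intersection area divided by the union area. A minimal sketch for axis-aligned boxes follows, assuming the `(x_min, y_min, x_max, y_max)` corner format; the function name `box_iou` is illustrative and not a Detectron2 API.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(box_iou((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes: 1.0
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))  # half of each box overlaps: about 0.33
```

The second call shows why IoU is stricter than it sounds: two equal boxes that each share half their area reach an IoU of only 1/3, because the union grows as the boxes drift apart.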
- 2. mAP (mean Average Precision)
   - mAP measures the overall performance of the model in detecting and classifying objects across multiple IoU thresholds and classes.
   - How mAP Works
      - Compute the Precision–Recall curve for each class
      - Compute Average Precision (AP) for each class
      - Take the mean of AP across all classes and IoU thresholds
   - Interpretation
      - Higher mAP = better overall detection accuracy
      - AP50 (IoU threshold 0.50) is easier to achieve; AP75 (IoU threshold 0.75) is stricter
      - AP@[0.50:0.95] averages AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05; it is the primary COCO metric and the most reliable single number
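
The AP computation for one class at one IoU threshold can be sketched as follows. This is a simplified illustration that takes the area under the precision-recall envelope (all-point interpolation); the COCO evaluator used by Detectron2 samples precision at 101 fixed recall levels instead, so its numbers can differ slightly. The function names are hypothetical, and the true-positive flags are assumed to come from an earlier IoU matching step.

```python
def average_precision(detections, num_ground_truth):
    """AP for one class at one IoU threshold.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision by the best precision at equal or higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the resulting step curve.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """mAP is the plain mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)
```

For example, three detections ranked as hit, miss, hit against two ground-truth objects give an AP of 5/6: the first hit contributes recall 0.5 at precision 1.0, and the last hit contributes the remaining 0.5 at precision 2/3.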

5. Compare Detectron2 and TFOD2 in terms of features, performance, and
ease of use.
- Detectron2
   - 1. Framework & Backend
      - Built on PyTorch
      - Modern, modular architecture
      - Designed by Facebook AI Research (FAIR)
   - 2. Features
      - Object Detection
      - Instance Segmentation
      - Panoptic Segmentation
      - Keypoint Detection
      - DensePose
   - 3. Model Zoo
      - Faster R-CNN
      - Mask R-CNN
      - Cascade R-CNN
      - RetinaNet
      - Panoptic FPN
   - 4. Performance
      - Faster training due to efficient PyTorch pipelines
      - High accuracy for segmentation and complex tasks
      - Excellent multi-GPU support
- TFOD2
   - 1. Framework & Backend
      - Built on TensorFlow
      - Developed by Google
      - Optimized for production and TPU support
   - 2. Features
      - Object Detection
      - Instance Segmentation (limited models)
      - Uses pipeline config files for training
   - 3. Model Zoo
      - Wide range of production models
      - SSD
      - EfficientDet
      - Faster R-CNN
      - CenterNet
      - MobileNet-SSD
   - 4. Performance
      - Optimized for TPU and mobile deployment
      - EfficientDet gives fast inference
      - Very competitive accuracy for detection tasks

6. Write Python code to install Detectron2 and verify the installation.
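
The cells below verify an existing installation. The install step itself could look like the following sketch, which assumes PyTorch, git, and a C++ build toolchain are already available in the environment. The helper names `detectron2_install_command`, `is_installed`, and `ensure_detectron2` are illustrative only.

```python
import importlib.util
import subprocess
import sys

# Source install target: the official Detectron2 GitHub repository.
DETECTRON2_GIT = "git+https://github.com/facebookresearch/detectron2.git"


def detectron2_install_command():
    """Build the pip command, using the pip of the running interpreter."""
    return [sys.executable, "-m", "pip", "install", DETECTRON2_GIT]


def is_installed(module_name):
    """Return True if the module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def ensure_detectron2():
    """Install Detectron2 from source only when it is missing."""
    if not is_installed("detectron2"):
        subprocess.check_call(detectron2_install_command())
```

Calling `ensure_detectron2()` once before the import cell would trigger the build, which can take several minutes when compiling from source.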


In [None]:
# Import Detectron2 to verify installation
import detectron2
from detectron2.utils.logger import setup_logger

setup_logger()

print("Detectron2 successfully installed!")
print("Version:", detectron2.__version__)


Detectron2 successfully installed!
Version: 0.6


In [None]:
# Test if model zoo loads correctly
from detectron2 import model_zoo
print("Model Zoo Import Successful")


Model Zoo Import Successful


Question 7: Annotate a dataset using any tool of your choice and convert the
annotations to COCO format for Detectron2

In [None]:
from labelme import utils


In [None]:
import os

os.makedirs("dataset/images", exist_ok=True)
os.makedirs("dataset/annotations", exist_ok=True)


In [None]:
import json
import os
from PIL import Image

input_dir = "dataset/annotations"  # LabelMe JSON files, one per image
image_dir = "dataset/images"
output_file = "dataset/annotations/coco_train.json"

coco = {
    "images": [],
    "annotations": [],
    "categories": [{"id": 1, "name": "tiger"}, {"id": 2, "name": "deer"}] # example
}
# Label name -> category id
category_ids = {c["name"]: c["id"] for c in coco["categories"]}

image_id = 0
annotation_id = 1
for file in sorted(os.listdir(input_dir)):
    if not file.endswith(".json"):
        continue
    with open(os.path.join(input_dir, file)) as f:
        data = json.load(f)
    # Skip JSON files that are not LabelMe annotations, such as a COCO file
    # written to this folder by an earlier run
    if "shapes" not in data:
        continue

    # image info (assumes every image sits directly inside image_dir)
    image_id += 1
    file_name = os.path.basename(data["imagePath"])
    with Image.open(os.path.join(image_dir, file_name)) as img:
        width, height = img.size

    coco["images"].append({
        "id": image_id,
        "file_name": file_name,
        "width": width,
        "height": height
    })

    # object annotations: bounding box around each labeled shape
    for shape in data["shapes"]:
        xs = [p[0] for p in shape["points"]]
        ys = [p[1] for p in shape["points"]]
        x_min, y_min = min(xs), min(ys)
        width_box = max(xs) - x_min
        height_box = max(ys) - y_min

        # Fail with a clear message if a label is missing from "categories"
        if shape["label"] not in category_ids:
            raise ValueError(f"Unknown label {shape['label']!r} in {file}")

        coco["annotations"].append({
            "id": annotation_id,
            "image_id": image_id,
            "category_id": category_ids[shape["label"]],
            "bbox": [x_min, y_min, width_box, height_box],  # COCO: [x, y, w, h]
            "area": width_box * height_box,
            "iscrowd": 0
        })
        annotation_id += 1

# Save COCO JSON
with open(output_file, "w") as f:
    json.dump(coco, f)
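
The conversion above stores boxes only, which is enough for object detection. For instance segmentation, each annotation would also need a `segmentation` entry. A minimal sketch for polygon shapes follows, assuming LabelMe-style `[[x, y], ...]` point lists with at least three points; the function name `polygon_to_coco_fields` is made up for illustration.

```python
def polygon_to_coco_fields(points):
    """Convert [[x, y], ...] polygon points into COCO annotation fields."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    # Shoelace formula: polygon area from its vertex coordinates.
    twice_area = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        twice_area += x1 * y2 - x2 * y1

    return {
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        # COCO stores each polygon as one flat list [x1, y1, x2, y2, ...].
        "segmentation": [[coord for p in points for coord in p]],
        "area": abs(twice_area) / 2.0,
    }


fields = polygon_to_coco_fields([[0, 0], [4, 0], [0, 3]])
print(fields["bbox"], fields["area"])  # [0, 0, 4, 3] 6.0
```

These fields could be merged into the annotation dictionary in place of the box-only `bbox` and `area` values. Rectangle shapes, which LabelMe stores as two corner points, would need to be expanded to four corners first.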


8. Write a script to download pretrained weights and configure paths for
training in Detectron2.


In [None]:
# Import Detectron2 modules
from detectron2.config import get_cfg
from detectron2 import model_zoo

# -----------------------------
# 1. Load base configuration
# -----------------------------
cfg = get_cfg()

# Use a pretrained Faster R-CNN model from Detectron2 model zoo
config_file = "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
cfg.merge_from_file(model_zoo.get_config_file(config_file))

# -----------------------------
# 2. Point to pretrained weights
# -----------------------------
# get_checkpoint_url only returns the URL; the file itself is downloaded
# and cached when the model is built (e.g. by a trainer or predictor).
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_file)
print("Pretrained weights downloaded from:", cfg.MODEL.WEIGHTS)

# -----------------------------
# 3. Configure dataset paths
# -----------------------------
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)

# Image + annotation paths. These are not Detectron2 config keys: the config
# only stores dataset names, so keep the paths as plain variables and pass
# them to register_coco_instances when registering the datasets.
train_img_dir = "dataset/train/images/"
train_json = "dataset/train/annotations.json"

val_img_dir = "dataset/val/images/"
val_json = "dataset/val/annotations.json"

# -----------------------------
# 4. Configure training settings
# -----------------------------
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 3000

# Number of classes (change according to dataset)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3

# Output directory for saving weights and logs
cfg.OUTPUT_DIR = "./output_model"

print("Training configuration completed!")


Pretrained weights downloaded from: https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl
Training configuration completed!


Question 9: Show the steps and code to run inference using a trained Detectron2
model on a new image.


In [None]:
import cv2
import torch
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog


In [None]:
import os
print(os.listdir())


['.config', 'sample_data']


In [None]:
with open("config.yaml", "w") as f:
    f.write(cfg.dump())


In [None]:

import os
print(os.listdir())


['.config', 'config.yaml', 'sample_data']


In [None]:
cfg.OUTPUT_DIR = "./output_model"
import os
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)


In [None]:
cfg.MODEL.DEVICE = "cpu"  # Force CPU training/inference


In [None]:

from detectron2.data.datasets import register_coco_instances


register_coco_instances(
    "my_dataset_train",
    {},
    "dataset/annotations/coco_train.json",
    "dataset/images/train"
)

register_coco_instances(
    "my_dataset_val",
    {},
    "dataset/annotations/coco_val.json",
    "dataset/images/val"
)


In [None]:
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3  # number of classes in your dataset
cfg.OUTPUT_DIR = "./output_model"

import os
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)


In [None]:
from detectron2.data.datasets import register_coco_instances

register_coco_instances(
    "balloon_train",
    {},
    "dataset/annotations/balloon_train.json",
    "dataset/images/train"
)
register_coco_instances(
    "balloon_val",
    {},
    "dataset/annotations/balloon_val.json",
    "dataset/images/val"
)


In [None]:
cfg.DATASETS.TRAIN = ("balloon_train",)
cfg.DATASETS.TEST = ("balloon_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only 1 class: balloon
cfg.OUTPUT_DIR = "./output_model"
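
The cells above prepare the configuration and datasets but stop before the prediction step. A minimal sketch of inference on one image follows, assuming training has finished and `model_final.pth` exists in `cfg.OUTPUT_DIR`. The function name `run_inference` and its default arguments are hypothetical.

```python
import os


def run_inference(cfg, image_path, dataset_name,
                  score_thresh=0.5, out_path="prediction.jpg"):
    """Predict on one image, save a visualization, and return the instances."""
    # Imported lazily so the function can be defined before Detectron2 is installed.
    import cv2
    from detectron2.data import MetadataCatalog
    from detectron2.engine import DefaultPredictor
    from detectron2.utils.visualizer import Visualizer

    cfg = cfg.clone()  # leave the caller's config untouched
    cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_thresh  # drop low-confidence boxes
    predictor = DefaultPredictor(cfg)

    image = cv2.imread(image_path)  # OpenCV loads images in BGR order
    if image is None:
        raise FileNotFoundError(image_path)
    instances = predictor(image)["instances"].to("cpu")

    # The Visualizer expects RGB, so flip the channel order in and out.
    visualizer = Visualizer(image[:, :, ::-1], MetadataCatalog.get(dataset_name))
    drawn = visualizer.draw_instance_predictions(instances)
    cv2.imwrite(out_path, drawn.get_image()[:, :, ::-1])
    return instances
```

A call such as `run_inference(cfg, "test.jpg", "balloon_train")` (with `test.jpg` standing in for any new image) returns an `Instances` object whose `pred_boxes`, `scores`, and `pred_classes` fields hold the detections.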


Question 10: You are assigned to build a wildlife monitoring system to detect and track
different animal species in a forest using Detectron2. Describe the end-to-end pipeline
from data collection to deploying the model, and how you would handle challenges like
occlusion or nighttime detection.

In [None]:
from detectron2.data.datasets import register_coco_instances

register_coco_instances("wildlife_train", {}, "annotations/coco_train.json", "images/train")
register_coco_instances("wildlife_val", {}, "annotations/coco_val.json", "images/val")


In [None]:
cfg.MODEL.DEVICE = "cpu"  # Force CPU usage


In [None]:
register_coco_instances(
    "wildlife_train_v2", {},
    "dataset/annotations/coco_train.json",
    "dataset/images/train"
)
register_coco_instances(
    "wildlife_val_v2", {},
    "dataset/annotations/coco_val.json",
    "dataset/images/val"
)

cfg.DATASETS.TRAIN = ("wildlife_train_v2",)
cfg.DATASETS.TEST = ("wildlife_val_v2",)
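
The question also raises nighttime detection. One common way to prepare for it is to augment daytime training images so that they resemble low-light captures. The sketch below is only an illustration under that assumption: the function name `simulate_low_light`, the gamma value, and the noise level are arbitrary choices, not tuned settings.

```python
import numpy as np


def simulate_low_light(image, gamma=2.5, noise_std=5.0, seed=None):
    """Darken a uint8 image with a gamma curve and add mild Gaussian noise."""
    rng = np.random.default_rng(seed)
    normalized = image.astype(np.float32) / 255.0
    darkened = np.power(normalized, gamma) * 255.0  # gamma > 1 darkens mid-tones
    noisy = darkened + rng.normal(0.0, noise_std, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


# Example on a synthetic mid-gray image.
daytime = np.full((4, 4, 3), 128, dtype=np.uint8)
night = simulate_low_light(daytime, seed=0)
print(daytime.mean(), night.mean())
```

Because the transform changes only pixel values, the existing box and mask annotations remain valid for the augmented copies.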


In [None]:
from detectron2.data import DatasetCatalog
print(DatasetCatalog.list())


['coco_2014_train', 'coco_2014_val', 'coco_2014_minival', 'coco_2014_valminusminival', 'coco_2017_train', 'coco_2017_val', 'coco_2017_test', 'coco_2017_test-dev', 'coco_2017_val_100', 'keypoints_coco_2014_train', 'keypoints_coco_2014_val', 'keypoints_coco_2014_minival', 'keypoints_coco_2014_valminusminival', 'keypoints_coco_2017_train', 'keypoints_coco_2017_val', 'keypoints_coco_2017_val_100', 'coco_2017_train_panoptic_separated', 'coco_2017_train_panoptic_stuffonly', 'coco_2017_train_panoptic', 'coco_2017_val_panoptic_separated', 'coco_2017_val_panoptic_stuffonly', 'coco_2017_val_panoptic', 'coco_2017_val_100_panoptic_separated', 'coco_2017_val_100_panoptic_stuffonly', 'coco_2017_val_100_panoptic', 'lvis_v1_train', 'lvis_v1_val', 'lvis_v1_test_dev', 'lvis_v1_test_challenge', 'lvis_v0.5_train', 'lvis_v0.5_val', 'lvis_v0.5_val_rand_100', 'lvis_v0.5_test', 'lvis_v0.5_train_cocofied', 'lvis_v0.5_val_cocofied', 'cityscapes_fine_instance_seg_train', 'cityscapes_fine_sem_seg_train', 'citysca