1. What is Detectron2 and how does it differ from previous object detection frameworks?
  - Detectron2 is an open-source library for object detection and segmentation, developed by the Facebook AI Research (FAIR) team at Meta. Released in 2019, it is a ground-up rewrite of the original Detectron and serves as a unified library for state-of-the-art computer vision research and production. It is widely used for "R-CNN style" tasks, such as locating an object with a bounding box (detection), predicting a pixel-level mask for it (instance segmentation), and estimating its body joints (keypoint detection).
  - The jump from the original Detectron (and other early frameworks) to Detectron2 introduced several fundamental shifts in how vision models are built and deployed.
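      - Transition to PyTorch: the original Detectron was built on Caffe2, whereas Detectron2 is implemented in PyTorch.
      - Modular "Plug-and-Play" Architecture: components such as the backbone, proposal generator, and ROI heads can be swapped or subclassed without rewriting the rest of the model.
      - Unified Task Support: one codebase covers object detection, instance segmentation, keypoint detection, and panoptic segmentation.
      - Speed and Efficiency: the rewrite targets faster training than the original Detectron.
      - "Detectron2Go" (D2Go) for Production: a companion toolkit for exporting and deploying Detectron2 models on mobile devices.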

2.  Explain the process and importance of data annotation when working with Detectron2.
  - In computer vision, data annotation is the process of labeling your raw images with ground-truth information (like bounding boxes or masks) so the model can learn what to look for. When working with Detectron2, this step is arguably the most critical part of the entire pipeline, even more so than choosing your model architecture.
  - Since Detectron2 is a supervised learning framework, the quality of your output is directly tied to the quality of your labels (the "garbage in, garbage out" principle).

      - Defining the Search Space: Annotations tell the model exactly which pixels belong to a "cat" versus the "couch." Without precise boundaries, the model will struggle to generalize to new images.

      - Training Signal: During training, Detectron2 compares its predictions against your annotations to compute the loss. If your annotations are loose or inconsistent, the model never receives a clear "signal" to improve, and evaluation metrics computed against those labels become unreliable too.

      - Handling Edge Cases: Annotation is where you teach the model how to handle occlusions (objects behind other objects) or difficult lighting, both of which are common in real-world images.
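
Because label quality matters this much, it can help to run a quick sanity check over the annotation file before training. The sketch below is only an illustration and is not part of Detectron2: the function name `find_annotation_problems` and the sample data are hypothetical, and it assumes the annotations are already in a COCO-style dictionary with `images`, `categories`, and `annotations` keys.

```python
def find_annotation_problems(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    images = {img["id"]: img for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    problems = []
    for ann in coco["annotations"]:
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size")
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"annotation {ann['id']}: box extends outside the image")
    return problems


# Hypothetical sample: one clean box, one box past the right edge,
# and one box with an unknown class and zero width.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 100, "height": 80}],
    "categories": [{"id": 1, "name": "cat"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 10, 30, 20]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [90, 10, 30, 20]},
        {"id": 12, "image_id": 1, "category_id": 2, "bbox": [5, 5, 0, 10]},
    ],
}

problems = find_annotation_problems(sample)
for problem in problems:
    print(problem)
```

A check like this only catches structural mistakes; loose or inconsistent boxes still need a visual review.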

3. Describe the steps involved in training a custom object detection model using Detectron2.
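  - 1. Data Preparation and Registration: Detectron2 does not automatically "know" where your images are. You must register your dataset in the DatasetCatalog (which returns the image and annotation records) and the MetadataCatalog (which holds class names and similar metadata).
  - 2. Configuration Setup (the CfgNode): Instead of writing complex training loops from scratch, Detectron2 uses a config system. You start with a "base" configuration from the Model Zoo (e.g., a Faster R-CNN or Mask R-CNN pre-trained on COCO) and override specific fields such as the dataset names, number of classes, learning rate, and iteration count.
  - 3. Building the Trainer: Detectron2 provides a standard trainer class, DefaultTrainer, which builds the model, data loader, and optimizer from the config.
  - 4. The Training Loop: Once the trainer is initialized, you call trainer.train(), which runs for cfg.SOLVER.MAX_ITER iterations and writes checkpoints to cfg.OUTPUT_DIR.
  - 5. Evaluation and Inference: After training, you need to see how the model performs on unseen data, typically by evaluating on a held-out validation set and running predictions on new images.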
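
One detail of the configuration step that is easy to get wrong is that Detectron2 schedules training in iterations rather than epochs, and `SOLVER.IMS_PER_BATCH` is the number of images consumed per iteration. The helper below is a small sketch of that arithmetic; the function name `epochs_to_iterations` and the dataset size are assumptions for illustration only.

```python
import math


def epochs_to_iterations(num_images, ims_per_batch, epochs):
    """Iterations needed for the model to see every training image `epochs` times."""
    if num_images <= 0 or ims_per_batch <= 0 or epochs <= 0:
        raise ValueError("all arguments must be positive")
    return math.ceil(num_images * epochs / ims_per_batch)


# Hypothetical dataset of 500 images, 2 images per iteration, 4 passes over the data
max_iter = epochs_to_iterations(num_images=500, ims_per_batch=2, epochs=4)
print(max_iter)  # value you would assign to cfg.SOLVER.MAX_ITER
```

Under these assumed numbers the result is 1000 iterations, so a fixed iteration count corresponds to very different numbers of epochs on small and large datasets.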

4. What are evaluation curves in Detectron2, and how are metrics like mAP and IoU interpreted?
  - In Detectron2, evaluation curves and metrics provide a mathematical "report card" for your model. Since object detection involves both finding where an object is and what it is, we use specific metrics to measure spatial accuracy and classification success.
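  - 1. Intersection over Union (IoU): Before understanding complex curves, you must understand IoU, the fundamental building block for all detection metrics. It is the area of overlap between a predicted box (or mask) and the ground-truth box divided by the area of their union, so it ranges from 0 (no overlap) to 1 (perfect match). A prediction counts as correct only if its IoU with a ground-truth object reaches a chosen threshold, such as 0.5.
  - 2. Mean Average Precision (mAP): mAP is the primary metric reported by Detectron2's COCO-style evaluation. Because precision and recall change with the confidence cutoff, Average Precision (AP) summarizes the whole precision-recall curve for one class, and mAP averages AP over classes. The headline COCO "AP" further averages over IoU thresholds from 0.50 to 0.95, while AP50 and AP75 use a single threshold.
  - 3. Evaluation Curves: During training, Detectron2 logs scalars such as the total loss and its components, which can be plotted as curves to diagnose model health. A training loss that keeps falling while validation AP stalls or drops, for example, suggests overfitting.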
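
To make the two metrics concrete, here is a minimal pure-Python sketch of IoU and a single-class AP at one IoU threshold. It is a simplification for illustration, not the COCO evaluator that Detectron2 uses: it averages precision at each true positive instead of using COCO's interpolated precision-recall curve, and the function names and sample boxes are hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truths, iou_threshold=0.5):
    """Simplified AP for one class; predictions are (score, box) pairs."""
    matched = set()
    tp_flags = []
    # Visit predictions from most to least confident
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_idx is not None and best_iou >= iou_threshold
        if is_tp:
            matched.add(best_idx)
        tp_flags.append(is_tp)
    # Sum the precision at every rank where a true positive occurs
    ap, tp = 0.0, 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        if is_tp:
            tp += 1
            ap += tp / rank
    return ap / len(ground_truths) if ground_truths else 0.0


ground_truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [
    (0.9, (0, 0, 10, 10)),    # exact match
    (0.8, (50, 50, 60, 60)),  # false positive
    (0.7, (20, 20, 30, 32)),  # slightly too tall, still a match at IoU 0.5
]
ap50 = average_precision(predictions, ground_truths, iou_threshold=0.5)
print(f"IoU example: {iou((0, 0, 10, 10), (5, 5, 15, 15)):.3f}")
print(f"AP at IoU 0.5: {ap50:.3f}")
```

Raising `iou_threshold` makes the match criterion stricter, which is why AP75 is normally lower than AP50 for the same model.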

5. Compare Detectron2 and TFOD2 in terms of features, performance, and ease of use.  
  - Choosing between Detectron2 and the TensorFlow Object Detection API 2 (TFOD2) often comes down to your preferred ecosystem (PyTorch vs. TensorFlow) and whether your priority is academic flexibility or enterprise scalability.
      - 1. Features and Model Variety
Both frameworks offer a "Model Zoo" with pre-trained weights, but their focus areas differ:

          - Detectron2: Specializes in "R-CNN style" tasks. It is the gold standard for Instance Segmentation (Mask R-CNN) and Panoptic Segmentation. It also includes niche research models like DensePose (mapping image pixels to 3D human surfaces) and PointRend.

          - TFOD2: Offers a broader range of architectures suited for different hardware. While it supports Faster R-CNN, it excels in providing mobile-friendly models like SSD-MobileNet and CenterNet, which are designed to run on low-power devices.

      - 2. Performance and Efficiency: Performance is measured in two ways: training speed and inference (deployment) speed.

          - Training Speed: Detectron2 is generally regarded as fast for training R-CNN-style models. Relative speed still depends on the model, the hardware, and how well the input pipeline keeps the GPU fed, so it is worth benchmarking both frameworks on your own data before committing to one.

          - Inference & Deployment: TFOD2 has a slight edge in production environments. It integrates seamlessly with TFLite (for mobile) and TensorFlow Serving (for cloud). While Detectron2 can export to TorchScript or ONNX, the path to a high-performance mobile app is often smoother with the TensorFlow ecosystem.

      - 3. Ease of Use (The Developer Experience): This is where the two frameworks diverge most sharply in "feel."

          - Detectron2 (The "Pythonic" Way): It feels like writing standard Python. If you want to change how a specific part of the model works, you can simply subclass a Python object. It is highly transparent and easier to debug because you can use standard print statements and debuggers.

          - TFOD2 (The "Config" Way): Much of TFOD2 is driven by .config files (Protobuf). To change a model, you often edit a massive text file rather than writing code. While this is great for "no-code" experimentation, it can be frustrating if you want to implement custom logic that the configuration file doesn't support.

6. Write Python code to install Detectron2 and verify the installation.

In [None]:
import torch

# Detectron2 is built from source against the installed PyTorch, so check it first
print(f"PyTorch version: {torch.__version__}")

# Note: pinning pyyaml==5.1 fails on this environment ("python setup.py egg_info
# did not run successfully", metadata-generation-failed), so PyYAML is left unpinned.
!pip install pyyaml
!pip install 'git+https://github.com/facebookresearch/detectron2.git'

In [None]:
import torch
import detectron2
from detectron2.utils.logger import setup_logger

# Initialize logger
setup_logger()

def verify_detectron2():
    # Check PyTorch and CUDA
    print(f"PyTorch version: {torch.__version__}")
    print(f"CUDA available: {torch.cuda.is_available()}")

    # Check Detectron2
    print(f"Detectron2 version: {detectron2.__version__}")

    # Simple test: Load a model config
    from detectron2.config import get_cfg
    from detectron2 import model_zoo

    try:
        cfg = get_cfg()
        cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
        print("✅ Detectron2 configuration loaded successfully!")
    except Exception as e:
        print(f"❌ Verification failed: {e}")

if __name__ == "__main__":
    verify_detectron2()

PyTorch version: 2.10.0+cpu
CUDA available: False
Detectron2 version: 0.6
✅ Detectron2 configuration loaded successfully!


7. Annotate a dataset using any tool of your choice and convert the annotations to COCO format for Detectron2.

In [None]:
from detectron2.data.datasets import register_coco_instances
from detectron2.data import MetadataCatalog, DatasetCatalog
import os

# Register the training set
register_coco_instances(
    "my_dataset_train",
    {},
    "my_dataset/train/_annotations.coco.json",
    "my_dataset/train"
)

# Register the validation set
register_coco_instances(
    "my_dataset_val",
    {},
    "my_dataset/val/_annotations.coco.json",
    "my_dataset/val"
)

# Verify registration
metadata = MetadataCatalog.get("my_dataset_train")
dataset_dicts = DatasetCatalog.get("my_dataset_train")

print(f"Successfully registered {len(dataset_dicts)} images.")
print(f"Classes found: {metadata.thing_classes}")

FileNotFoundError: [Errno 2] No such file or directory: 'my_dataset/train/_annotations.coco.json'
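
The cell above assumes a COCO JSON file already exists at the given path. If your annotation tool exports another format, the conversion can be sketched in plain Python. This is an illustration under stated assumptions: the input record layout (`file_name`, `width`, `height`, and `objects` with a `label` and an `[x1, y1, x2, y2]` box), the function name `to_coco`, and the class names are all hypothetical. It also writes boxes only; training Mask R-CNN additionally requires a `segmentation` entry per annotation.

```python
import json


def to_coco(records, class_names):
    """Convert simple per-image records into a COCO-style detection dict."""
    category_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": cid, "name": name} for name, cid in category_ids.items()],
    }
    next_ann_id = 1
    for image_id, rec in enumerate(records, start=1):
        coco["images"].append({
            "id": image_id,
            "file_name": rec["file_name"],
            "width": rec["width"],
            "height": rec["height"],
        })
        for obj in rec["objects"]:
            x1, y1, x2, y2 = obj["box"]
            w, h = x2 - x1, y2 - y1
            coco["annotations"].append({
                "id": next_ann_id,
                "image_id": image_id,
                "category_id": category_ids[obj["label"]],
                "bbox": [x1, y1, w, h],  # COCO stores [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            next_ann_id += 1
    return coco


# Hypothetical annotations for two images
records = [
    {"file_name": "img_001.jpg", "width": 640, "height": 480,
     "objects": [{"label": "deer", "box": [10, 20, 40, 60]},
                 {"label": "fox", "box": [100, 100, 150, 180]}]},
    {"file_name": "img_002.jpg", "width": 640, "height": 480, "objects": []},
]
coco = to_coco(records, ["deer", "fox"])
print(json.dumps(coco["annotations"][0]))
```

The resulting dictionary could then be saved with `json.dump` to the path passed to `register_coco_instances`.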

8.  Write a script to download pretrained weights and configure paths for training in Detectron2.

In [None]:
import os
import torch
from detectron2.config import get_cfg
from detectron2 import model_zoo
from detectron2.engine import DefaultTrainer

def setup_training_config(dataset_name, num_classes, output_dir="./output"):
    cfg = get_cfg()

    # Start from a Mask R-CNN baseline in the Model Zoo
    config_path = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
    cfg.merge_from_file(model_zoo.get_config_file(config_path))

    # Datasets must already be registered under these names
    cfg.DATASETS.TRAIN = (f"{dataset_name}_train",)
    cfg.DATASETS.TEST = (f"{dataset_name}_val",)
    cfg.DATALOADER.NUM_WORKERS = 2

    # URL of the COCO-pretrained checkpoint; it is downloaded when the weights are loaded
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_path)

    # The default device is "cuda"; fall back to CPU on a CPU-only PyTorch build,
    # which otherwise fails with "AssertionError: Torch not compiled with CUDA enabled"
    cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

    # Solver settings for a short fine-tuning run
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.00025
    cfg.SOLVER.MAX_ITER = 1000
    cfg.SOLVER.STEPS = []  # no learning-rate decay

    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = num_classes  # number of classes in your dataset

    cfg.OUTPUT_DIR = output_dir
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

    return cfg


In [None]:
from detectron2.data.datasets import register_coco_instances

# Register your local paths first (replace the placeholder paths with real ones)
register_coco_instances("my_data_train", {}, "path/to/train.json", "path/to/train_imgs")
register_coco_instances("my_data_val", {}, "path/to/val.json", "path/to/val_imgs")

# Initialize and train
cfg = setup_training_config("my_data", num_classes=5)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)  # load cfg.MODEL.WEIGHTS instead of resuming a run
trainer.train()

9. Show the steps and code to run inference using a trained Detectron2 model on a new image.

In [None]:
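import cv2
import os
import torch
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from detectron2 import model_zoo

# Step 1: rebuild the same config that was used for training
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))

# Step 2: point to the trained weights and set inference options
cfg.MODEL.WEIGHTS = os.path.join("./output", "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # discard detections below this confidence
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5  # must match the value used during training
# The default device is "cuda"; fall back to CPU on a CPU-only PyTorch build,
# which otherwise fails with "AssertionError: Torch not compiled with CUDA enabled"
cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

predictor = DefaultPredictor(cfg)

# Step 3: load the image (OpenCV reads BGR, the format DefaultPredictor expects by default)
im = cv2.imread("./new_image.jpg")
if im is None:
    raise FileNotFoundError("Could not read ./new_image.jpg")
outputs = predictor(im)

# Step 4: draw the predictions; Visualizer expects RGB, hence the channel flip
metadata = MetadataCatalog.get("my_dataset_train")

v = Visualizer(im[:, :, ::-1], metadata=metadata, scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))

# Step 5: display the result (requires a desktop session; use cv2.imwrite on a headless machine)
cv2.imshow("Inference Result", out.get_image()[:, :, ::-1])
cv2.waitKey(0)
cv2.destroyAllWindows()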

10. You are assigned to build a wildlife monitoring system to detect and track different animal species in a forest using Detectron2. Describe the end-to-end pipeline from data collection to deploying the model, and how you would handle challenges like occlusion or nighttime detection.

In [None]:
import cv2
import torch
import numpy as np
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2 import model_zoo

from deep_sort_realtime.deepsort_tracker import DeepSort

class WildlifeTracker:
    def __init__(self, weights_path):

        self.cfg = get_cfg()
        self.cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
        self.cfg.MODEL.WEIGHTS = weights_path
        self.cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.6
        self.cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

        self.predictor = DefaultPredictor(self.cfg)

        self.tracker = DeepSort(max_age=30, n_init=3)

    def run(self, video_source):
        cap = cv2.VideoCapture(video_source)

        while cap.isOpened():
            success, frame = cap.read()
            if not success: break

            outputs = self.predictor(frame)
            instances = outputs["instances"].to("cpu")

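            # DeepSort expects each detection as ([left, top, width, height], confidence, class)
            detections = []
            boxes = instances.pred_boxes.tensor.numpy()
            scores = instances.scores.numpy()
            classes = instances.pred_classes.numpy()

            for box, score, cls in zip(boxes, scores, classes):
                # Detectron2 boxes are (x1, y1, x2, y2); convert to left, top, width, height
                x1, y1, x2, y2 = (float(v) for v in box)
                detections.append(([x1, y1, x2 - x1, y2 - y1], float(score), str(int(cls))))

            # Update the tracker; the frame is passed so it can compute appearance features
            tracks = self.tracker.update_tracks(detections, frame=frame)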

            for track in tracks:
                if not track.is_confirmed(): continue
                track_id = track.track_id
                ltrb = track.to_ltrb()

                cv2.rectangle(frame, (int(ltrb[0]), int(ltrb[1])), (int(ltrb[2]), int(ltrb[3])), (0, 255, 0), 2)
                cv2.putText(frame, f"Animal ID: {track_id}", (int(ltrb[0]), int(ltrb[1]-10)),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

            cv2.imshow("Forest Monitor", frame)
            if cv2.waitKey(1) & 0xFF == ord('q'): break

        cap.release()
        cv2.destroyAllWindows()


ModuleNotFoundError: No module named 'detectron2'
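
The question also asks about nighttime detection, which the tracker above does not address. One common option, offered here as an assumption rather than as part of the original answer, is to brighten dark frames with gamma correction before passing them to the predictor. The sketch below builds the lookup table in pure Python; the function names and the gamma value of 0.5 are hypothetical choices that would need tuning on real camera-trap footage.

```python
def build_gamma_lut(gamma):
    """256-entry table mapping pixel value v to 255 * (v / 255) ** gamma."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [min(255, round(255 * (v / 255) ** gamma)) for v in range(256)]


def apply_lut(pixels, lut):
    """Map a flat sequence of 8-bit pixel values through the lookup table."""
    return [lut[v] for v in pixels]


# gamma < 1 brightens dark regions while leaving pure black and pure white fixed
lut = build_gamma_lut(0.5)
dark_row = [0, 16, 64, 255]
bright_row = apply_lut(dark_row, lut)
print(bright_row)
```

In the video loop, the same table could be applied to a whole frame with OpenCV's `cv2.LUT` after converting it to a `uint8` NumPy array. Occlusion is partly handled by the tracker's `max_age` setting, which keeps a track alive for a number of frames while the animal is hidden.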