# Car Detection and Fine-tuning with Detectron2

In this Jupyter notebook, I demonstrate how to detect cars in images with a pre-trained object detection model from the Detectron2 library. I first run the pre-trained model on a set of images, then fine-tune it on a custom dataset to improve its performance on the car detection task at hand.

The steps involved in this notebook are as follows:

1. Load a pre-trained model and use it as a predictor for detecting cars in images.
2. Create a custom dataset by splitting the images into training and validation sets and preparing the annotations in a suitable format.
3. Configure the Detectron2 model for fine-tuning on the custom dataset.
4. Fine-tune the model using the custom dataset and configuration.
5. Evaluate the performance of the fine-tuned model on the validation dataset.


In [None]:
from google.colab import drive
drive.mount('/content/gdrive/')


Mounted at /content/gdrive/


In [None]:
cd gdrive/MyDrive/Datrix

/content/gdrive/.shortcut-targets-by-id/1rEf2Kw1t8kBTVPrhIoB1VWHDdH9KQZiz/Datrix


In [None]:
!python -m pip install pyyaml==5.1
import sys, os, distutils.core
# Note: This is a faster way to install detectron2 in Colab, but it does not include all functionalities.
# See https://detectron2.readthedocs.io/tutorials/install.html for full installation instructions
!git clone 'https://github.com/facebookresearch/detectron2'
dist = distutils.core.run_setup("./detectron2/setup.py")
!python -m pip install {' '.join([f"'{x}'" for x in dist.install_requires])}
sys.path.insert(0, os.path.abspath('./detectron2'))

# Properly install detectron2. (Please do not install twice in both ways)
# !python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pyyaml==5.1
  Downloading PyYAML-5.1.tar.gz (274 kB)
     274.2/274.2 KB 7.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: pyyaml
  Building wheel for pyyaml (setup.py) ... done
  Created wheel for pyyaml: filename=PyYAML-5.1-cp39-cp39-linux_x86_64.whl size=44088 sha256=c98ce089d268ee9ee7f8b5bd4ebbf7db180871ee6c50a4ec0ee9aa06f9f10a9a
  Stored in directory: /root/.cache/pip/wheels/68/be/8f/b6c454cd264e0b349b47f8ee00755511f277618af9e5dae20d
Successfully built pyyaml
Installing collected packages: pyyaml
  Attempting uninstall: pyyaml
    Found existing installation: PyYAML 6.0
    Uninstalling PyYAML-6.0:
      Successfully uninstalled PyYAML-6.0
ERROR: pip's dependency resolver does not currently take into account 

## Using a Pre-trained Model for Car Detection

In this step, I use a pre-trained Faster R-CNN model from Detectron2's model zoo to detect cars in a set of images. The model was trained on COCO, which includes a car class, so it can serve as a predictor without any additional training. For each detected car, I also check whether it is red and record the bounding boxes of the red cars in a CSV file.




In [None]:
import os
import json
import cv2
import numpy as np
import pandas as pd
import detectron2
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2 import model_zoo

In [None]:
def is_red_car(cropped_image, threshold=0.5):
    """
      Determine if a cropped image of a car contains a red car by calculating the
      proportion of red pixels in the image. If the proportion exceeds the given
      threshold, the function returns True, indicating that the car is red.

      Args:
          cropped_image (numpy.ndarray): A cropped image containing a car, represented
                                        as a 3-channel (BGR) numpy array.
          threshold (float, optional): The minimum proportion of red pixels to total
                                      pixels required to consider the car as red.
                                      Default is 0.5 (50%).

      Returns:
          bool: True if the proportion of red pixels exceeds the threshold, False otherwise.
    """

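    # An empty crop (zero width or height) cannot contain a red car
    if cropped_image.size == 0:
        return False

    # Convert the BGR image to the HSV color space
    hsv_image = cv2.cvtColor(cropped_image, cv2.COLOR_BGR2HSV)

    # Red wraps around the ends of OpenCV's hue axis, so it needs two ranges
    lower_red1 = np.array([0, 70, 50])
    upper_red1 = np.array([10, 255, 255])
    lower_red2 = np.array([170, 70, 50])
    upper_red2 = np.array([180, 255, 255])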

    # Create masks for the two red color ranges
    mask1 = cv2.inRange(hsv_image, lower_red1, upper_red1)
    mask2 = cv2.inRange(hsv_image, lower_red2, upper_red2)

    # Combine the masks to cover the entire red color range
    mask = cv2.bitwise_or(mask1, mask2)

    # Count the number of non-zero (red) pixels in the mask
    red_pixels = cv2.countNonZero(mask)

    # Calculate the total number of pixels in the cropped_image
    total_pixels = cropped_image.size // 3

    # Calculate the proportion of red pixels to total pixels
    # and compare it to the threshold
    return red_pixels / total_pixels > threshold
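
The two hue ranges exist because OpenCV stores 8-bit hue in the interval [0, 180), and red sits at both ends of that interval. The sketch below illustrates this on single pixels using only the standard library. It approximates OpenCV's conversion (OpenCV rounds to integers, `colorsys` does not), and `to_opencv_hsv` and `is_red_pixel` are hypothetical helpers that are not part of the notebook.

```python
import colorsys

def to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to OpenCV-style HSV: H in [0, 180), S and V in [0, 255]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 180.0, s * 255.0, v * 255.0

def is_red_pixel(r, g, b):
    """Apply the same two hue bands that is_red_car() uses to a single pixel."""
    h, s, v = to_opencv_hsv(r, g, b)
    saturated_enough = s >= 70 and v >= 50
    in_low_band = 0 <= h <= 10       # reds just above hue 0
    in_high_band = 170 <= h <= 180   # reds just below the wrap-around point
    return saturated_enough and (in_low_band or in_high_band)

print(is_red_pixel(255, 0, 0))    # pure red, hue 0
print(is_red_pixel(200, 0, 30))   # crimson, hue near 175
print(is_red_pixel(0, 0, 255))    # blue, hue 120
```

A single range such as 0 to 10 would miss the crimson pixel, which is why `is_red_car()` combines two masks.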

In [None]:
# Configure Detectron2 model
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # Set threshold for this model
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
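
`SCORE_THRESH_TEST` makes the predictor drop low-confidence detections, and the car filter used later in this notebook then keeps only one class. As a rough sketch of both filters on plain Python data (the detections and the `keep_cars` helper are made up for illustration; the real predictor returns an `Instances` object, not dicts):

```python
CAR_CLASS_ID = 2  # "car" in Detectron2's contiguous COCO IDs (0 = person, 1 = bicycle, 2 = car)

def keep_cars(detections, score_threshold=0.5):
    """Return the detections labeled as car whose score exceeds the threshold."""
    return [
        d for d in detections
        if d["class_id"] == CAR_CLASS_ID and d["score"] > score_threshold
    ]

# Hypothetical detections, written as plain dicts for illustration
detections = [
    {"class_id": 2, "score": 0.91, "box": (10, 20, 110, 80)},   # confident car
    {"class_id": 2, "score": 0.32, "box": (0, 0, 40, 30)},      # low-confidence car
    {"class_id": 0, "score": 0.88, "box": (50, 50, 90, 160)},   # person
]

cars = keep_cars(detections)
print(len(cars))
```

Raising the threshold trades recall for precision, so 0.5 is a starting point rather than a tuned value.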

In [None]:
# Read annotations from the JSON file
with open('Data/cars/annotations_sample.json') as json_file:
    annotations = json.load(json_file)

In [None]:
# Prepare the output CSV file
output_csv = pd.DataFrame(columns=['file_name', 'bounding_box'])

In [None]:
# Iterate through each image in the dataset and perform the following steps:
#   - Read the image in BGR order (what both OpenCV and DefaultPredictor use by default)
#   - Detect objects in the image using the pre-trained Detectron2 model
#   - Keep only the detections whose class ID is the car class
#   - Check if the car is red using the is_red_car() function
#   - Save the results (image file name and bounding box) to a CSV file
CAR_CLASS_ID = 2  # "car" in Detectron2's contiguous COCO class IDs (0 = person, 1 = bicycle, 2 = car)

rows = []
for image_info in annotations['annotations']:
    image_path = os.path.join('Data/cars', image_info['file_name'])

    # Read the image; cv2.imread returns BGR, which the predictor and is_red_car() expect
    image_np = cv2.imread(image_path)
    if image_np is None:
        continue  # Skip files that are missing or unreadable

    # Detect objects
    instances = predictor(image_np)['instances'].to('cpu')
    boxes = instances.pred_boxes.tensor.tolist()
    classes = instances.pred_classes.tolist()

    for bbox, class_id in zip(boxes, classes):
        if class_id != CAR_CLASS_ID:
            continue

        x, y, x_max, y_max = [int(coord) for coord in bbox]
        w, h = x_max - x, y_max - y

        # Check if the car is red
        cropped_image = image_np[y:y_max, x:x_max]
        if is_red_car(cropped_image):
            rows.append({'file_name': image_info['file_name'], 'bounding_box': (x, y, w, h)})

# Build the output dataframe (DataFrame.append was removed in pandas 2.0) and save it to a CSV file
output_csv = pd.DataFrame(rows, columns=['file_name', 'bounding_box'])
output_csv.to_csv('output_detectron2.csv', index=False)


In [None]:
output_csv

|     | file_name | bounding_box |
| --- | --- | --- |
| 0 | 000000394964.jpg | (560, 247, 78, 104) |
| 1 | 000000394964.jpg | (488, 237, 57, 38) |
| 2 | 000000394964.jpg | (531, 238, 37, 39) |
| 3 | 000000394964.jpg | (0, 232, 58, 38) |
| 4 | 000000394964.jpg | (59, 230, 67, 63) |
| ... | ... | ... |
| 79 | 000000394964.jpg | (488, 237, 57, 38) |
| 80 | 000000394964.jpg | (531, 238, 37, 39) |
| 81 | 000000394964.jpg | (0, 232, 58, 38) |
| 82 | 000000394964.jpg | (59, 230, 67, 63) |
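
Because `bounding_box` holds Python tuples, `to_csv` writes each one as a quoted string such as `"(560, 247, 78, 104)"`, and reading the file back yields strings rather than tuples. One way to recover the numbers is sketched below with the standard library; `read_boxes` is a hypothetical helper, and the inline text stands in for `output_detectron2.csv`.

```python
import ast
import csv
import io

# Two rows in the same layout as output_detectron2.csv
csv_text = (
    "file_name,bounding_box\n"
    '000000394964.jpg,"(560, 247, 78, 104)"\n'
    '000000394964.jpg,"(488, 237, 57, 38)"\n'
)

def read_boxes(handle):
    """Yield (file_name, (x, y, w, h)) pairs from an open CSV file handle."""
    for row in csv.DictReader(handle):
        # literal_eval safely parses the tuple text without executing code
        yield row["file_name"], ast.literal_eval(row["bounding_box"])

boxes = list(read_boxes(io.StringIO(csv_text)))
print(boxes[0])
```

Storing `x`, `y`, `w` and `h` in four separate columns would avoid the parsing step altogether.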


## Creating a Custom Dataset

To fine-tune the model on a custom dataset, I will first prepare the dataset by splitting the images into training and validation sets. Additionally, I will convert the annotations into a format suitable for use with the Detectron2 library.

The process for creating the custom dataset includes the following sub-steps:

1. Read the annotations and split them into training and validation sets.
2. Collect the image file names and bounding boxes for each split.
3. Convert the bounding box annotations from (x, y, w, h) to the dictionary format that Detectron2 expects, and register both splits with `DatasetCatalog`.


In [None]:
import random

In [None]:
# Shuffle the annotations and split them into training and validation sets
random.shuffle(annotations['annotations'])
split_idx = int(0.8 * len(annotations['annotations']))
train_annotations = annotations['annotations'][:split_idx]
val_annotations = annotations['annotations'][split_idx:]
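
The shuffle above is unseeded, so every run produces a different split. If a reproducible split is wanted, a seeded variant could look like the sketch below; `split_annotations` and the seed value are my own choices for illustration, not something the notebook uses.

```python
import random

def split_annotations(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of items with a fixed seed and split it into two lists."""
    rng = random.Random(seed)   # private generator, leaves the global random state alone
    shuffled = list(items)      # copy, so the caller's list keeps its order
    rng.shuffle(shuffled)
    split_idx = int(train_fraction * len(shuffled))
    return shuffled[:split_idx], shuffled[split_idx:]

train, val = split_annotations(range(10))
print(len(train), len(val))
```

Calling it twice with the same seed returns the same two lists, which makes validation scores comparable across runs.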

In [None]:
train_images = [img['file_name'] for img in train_annotations]
val_images = [img['file_name'] for img in val_annotations]

train_bboxes = [img['bbox'] for img in train_annotations]
val_bboxes = [img['bbox'] for img in val_annotations]

In [None]:
import os
import numpy as np
import cv2
from detectron2.structures import BoxMode

In [None]:
def get_car_data(images, bboxes):

    """
      Create a list of dictionaries containing image and annotation information
      for use with the Detectron2 library.

      Args:
          images (list): A list of image file names from the dataset.
          bboxes (list): One bounding box per entry in images, in the format
                          (x, y, w, h); bboxes[i] belongs to images[i].

      Returns:
          list: A list of dictionaries containing image and annotation information.
    """
    dataset_dicts = []
    # images and bboxes are parallel lists, so pair each image with its own box
    for idx, (image_name, bbox) in enumerate(zip(images, bboxes)):
        record = {}
        image_path = os.path.join("Data", "cars", image_name)

        height, width = cv2.imread(image_path).shape[:2]

        record["file_name"] = image_path
        record["image_id"] = idx
        record["height"] = height
        record["width"] = width

        # Convert the (x, y, w, h) box to absolute (x_min, y_min, x_max, y_max)
        x, y, w, h = bbox
        record["annotations"] = [
            {
                "bbox": [x, y, x + w, y + h],
                "bbox_mode": BoxMode.XYXY_ABS,
                "category_id": 0,  # Only one class: car
            }
        ]
        dataset_dicts.append(record)

    return dataset_dicts
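
Detectron2 expects one record per image, with every object in that image listed under the record's `"annotations"` key. If the annotation file lists an image several times, once per car (an assumption on my part; I have not confirmed the file's layout), the boxes could first be grouped by file name. A minimal sketch with a hypothetical helper, `group_boxes_by_image`, and made-up sample data:

```python
from collections import defaultdict

def group_boxes_by_image(annotations):
    """Map each file name to all of its boxes, converted from (x, y, w, h) to (x0, y0, x1, y1)."""
    grouped = defaultdict(list)
    for ann in annotations:
        x, y, w, h = ann["bbox"]
        grouped[ann["file_name"]].append([x, y, x + w, y + h])
    return dict(grouped)

# Made-up annotations: a.jpg contains two cars, b.jpg contains one
sample = [
    {"file_name": "a.jpg", "bbox": [10, 20, 30, 40]},
    {"file_name": "a.jpg", "bbox": [0, 0, 5, 5]},
    {"file_name": "b.jpg", "bbox": [1, 2, 3, 4]},
]

grouped = group_boxes_by_image(sample)
print(grouped["a.jpg"])
```

Each key would then become one dataset record, so a car is never treated as background just because it was listed on another row.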


In [None]:
from detectron2.data import DatasetCatalog, MetadataCatalog

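DatasetCatalog.register("car_train1", lambda: get_car_data(train_images, train_bboxes))
DatasetCatalog.register("car_val1", lambda: get_car_data(val_images, val_bboxes))

# Record the class names for both splits; COCOEvaluator needs them when it
# converts the validation set to COCO format
for name in ("car_train1", "car_val1"):
    MetadataCatalog.get(name).set(thing_classes=["car"])

car_metadata = MetadataCatalog.get("car_train1")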


In [None]:
from detectron2.config import get_cfg
from detectron2 import model_zoo

# Initialize the Detectron2 configuration object
cfg = get_cfg()

# Merge the Faster R-CNN configuration file from Detectron2's model zoo
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))

# Set the dataset names for training and validation
cfg.DATASETS.TRAIN = ("car_train1",)
cfg.DATASETS.TEST = ("car_val1",)


# Set the number of data loading workers
cfg.DATALOADER.NUM_WORKERS = 2

# Load pre-trained weights from the model zoo
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")

# Set training parameters
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300

# Set the parameters for region proposals and classification
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # Only one class: car
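
To get a feel for how long this schedule is, it helps to translate iterations into passes over the training set: each iteration consumes `IMS_PER_BATCH` images, so 300 iterations at 2 images per batch cover 600 images. The helper below is a hypothetical convenience function, and the dataset size of 100 is a placeholder rather than the size of my dataset.

```python
def epochs_covered(max_iter, ims_per_batch, num_images):
    """Return how many passes over the training set a schedule amounts to."""
    return max_iter * ims_per_batch / num_images

# 300 iterations x 2 images per iteration = 600 images seen in total
print(epochs_covered(max_iter=300, ims_per_batch=2, num_images=100))
```

If the result is well below 1, the model never sees part of the training set and `MAX_ITER` is probably too small.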


In [None]:
from detectron2.engine import DefaultTrainer

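# Make sure the output directory for checkpoints and logs exists
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

# Instantiate a trainer object using the configuration settings
trainer = DefaultTrainer(cfg)

# Load the weights from cfg.MODEL.WEIGHTS and start at iteration 0
# (resume=False ignores any earlier checkpoint in the output directory)
trainer.resume_or_load(resume=False)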

# Start the training process
trainer.train()


[03/14 21:47:23 d2.engine.defaults]: Model:
GeneralizedRCNN(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
      )
      (res


[03/14 21:48:22 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[03/14 21:48:22 d2.data.build]: Using training sampler TrainingSampler
[03/14 21:48:22 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[03/14 21:48:22 d2.data.common]: Serializing 2001 elements to byte tensors and concatenating them all ...
[03/14 21:48:25 d2.data.common]: Serialized dataset takes 205.47 MiB
[03/14 21:48:28 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl ...


roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}


[03/14 21:48:30 d2.engine.train_loop]: Starting training from iteration 0
[03/14 21:48:36 d2.utils.memory]: Attempting to copy inputs of <function pairwise_iou at 0x7f2f641a7160> to CPU due to CUDA OOM


In [None]:
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

# Instantiate a COCOEvaluator object for evaluating the model on the validation dataset
# (the name must match the one registered with DatasetCatalog: "car_val1")
evaluator = COCOEvaluator("car_val1", cfg, False, output_dir="./output/")

# Build a test data loader for the validation dataset
val_loader = build_detection_test_loader(cfg, "car_val1")

# Perform inference on the validation dataset and evaluate the model's performance
inference_on_dataset(trainer.model, val_loader, evaluator)
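
The AP values that `COCOEvaluator` reports are built on intersection over union (IoU): a predicted box counts as a match only if its IoU with a ground-truth box reaches a threshold, and the headline AP averages over thresholds from 0.50 to 0.95. The function below is a plain-Python sketch of IoU for two boxes in (x0, y0, x1, y1) format. It is meant to illustrate the quantity and is not the evaluator's own implementation.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Width and height of the overlap, clamped to zero when the boxes are disjoint
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

# Two 10 x 10 boxes, the second shifted right by half a width
print(iou_xyxy((0, 0, 10, 10), (5, 0, 15, 10)))
```

In this example the boxes share half their area but the IoU is only one third, which shows how strict the higher thresholds are.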
