<a href="https://colab.research.google.com/github/Hamda-Bahri/bfset-experiments/blob/main/notebooks/02_bfset_faster_rcnn_smallscale.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# BFSET – Faster R-CNN Small-Scale Experiment (500 Images)

This notebook trains and evaluates a Faster R-CNN detector on a small-scale subset
(≈500 images) of the BFSET dataset. The goal is to obtain a clean, reproducible
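baseline that can be referenced in the experimental section of the associated article.

## 1. Environment check

Print the Python, PyTorch, and torchvision versions, along with CUDA availability,
so that results can be tied to a specific software stack.
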
In [None]:
import sys, platform
import torch, torchvision

print(f"Python version : {sys.version.split()[0]}")
print(f"PyTorch version: {torch.__version__}")
print(f"Torchvision    : {torchvision.__version__}")
print(f"CUDA available : {torch.cuda.is_available()}")
print(f"Platform       : {platform.platform()}")


## 2. Dataset location (Google Drive)

Mount Google Drive and define the paths to the BFSET small-scale subset
(images and YOLO-format labels).

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
DATA_ROOT = "/content/drive/MyDrive/BFSET_TEST"

TRAIN_IMG   = f"{DATA_ROOT}/images/train"
TRAIN_LABEL = f"{DATA_ROOT}/labels/train"

VAL_IMG     = f"{DATA_ROOT}/images/val"
VAL_LABEL   = f"{DATA_ROOT}/labels/val"

print("Train images :", TRAIN_IMG)
print("Train labels :", TRAIN_LABEL)
print("Val   images :", VAL_IMG)
print("Val   labels :", VAL_LABEL)

## 3. Utilities – YOLO to COCO conversion

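BFSET annotations are assumed to be in YOLO format,
`class x_center y_center width height`, with all coordinates normalized to [0, 1].
torchvision's Faster R-CNN expects bounding boxes in absolute pixel coordinates
`[x_min, y_min, x_max, y_max]`. Despite the helper's name, this corner format is
the Pascal VOC convention; COCO annotation files store boxes as
`[x_min, y_min, width, height]`. The helper below performs the conversion.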

In [None]:
import os, glob
import torch
from PIL import Image

def yolo_to_coco(box, W, H):
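    """Convert a YOLO-normalized box to absolute corner coordinates.

    Note: the output uses [x_min, y_min, x_max, y_max] (Pascal VOC-style)
    order, which is what torchvision's Faster R-CNN expects; the function
    name is kept for consistency with the rest of the notebook.

    Args:
        box: (x_center, y_center, width, height) normalized in [0,1].
        W, H: image width and height in pixels.
    Returns:
        [x_min, y_min, x_max, y_max] in pixels.
    """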
    x_c, y_c, w, h = box
    x_min = (x_c - w / 2.0) * W
    y_min = (y_c - h / 2.0) * H
    x_max = (x_c + w / 2.0) * W
    y_max = (y_c + h / 2.0) * H
    return [x_min, y_min, x_max, y_max]
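
As a quick sanity check of the conversion, the sketch below applies the same arithmetic to one hand-computed box and additionally clamps the result to the image bounds. The helper name `yolo_to_xyxy_clamped` and the sample values are illustrative assumptions, not part of the BFSET code; clamping is an optional safeguard for labels whose boxes slightly exceed the image.

```python
def yolo_to_xyxy_clamped(box, W, H):
    """Hypothetical variant of the helper above that clamps to the image."""
    x_c, y_c, w, h = box
    x_min = max(0.0, (x_c - w / 2.0) * W)
    y_min = max(0.0, (y_c - h / 2.0) * H)
    x_max = min(float(W), (x_c + w / 2.0) * W)
    y_max = min(float(H), (y_c + h / 2.0) * H)
    return [x_min, y_min, x_max, y_max]


# A centred box covering 20% of the width and 40% of the height of a
# 640x480 image: the corners should be close to (256, 144) and (384, 336).
centered = yolo_to_xyxy_clamped((0.5, 0.5, 0.2, 0.4), 640, 480)
print([round(v, 2) for v in centered])

# A box that spills past the left/top edge is cut off at 0.
edge = yolo_to_xyxy_clamped((0.05, 0.05, 0.2, 0.2), 640, 480)
print([round(v, 2) for v in edge])
```

Whether clamping is appropriate depends on how the labels were produced; if out-of-bounds boxes indicate annotation errors, it may be better to report them than to clip them.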

## 4. PyTorch Dataset for BFSET (Faster R-CNN)

The dataset class reads RGB images and their corresponding YOLO label files,
converts the bounding boxes to absolute `[x_min, y_min, x_max, y_max]` pixel
coordinates, and returns tensors suitable for Faster R-CNN training.


In [None]:
import torchvision.transforms as T

class BFSETFasterRCNNDataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir):
        self.img_paths = sorted(glob.glob(os.path.join(img_dir, "*.jpg")))
        if len(self.img_paths) == 0:
            raise RuntimeError(f"No .jpg images found in {img_dir}")
        self.label_dir = label_dir
        self.transforms = T.ToTensor()

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]
        image = Image.open(img_path).convert("RGB")
        W, H = image.size

        label_path = os.path.join(
            self.label_dir,
            os.path.basename(img_path).replace(".jpg", ".txt")
        )

        boxes = []
        if os.path.exists(label_path):
            with open(label_path, "r") as f:
                for line in f:
                    parts = line.strip().split()
                    if len(parts) != 5:
                        continue
                    _, xc, yc, w, h = map(float, parts)
                    boxes.append(yolo_to_coco([xc, yc, w, h], W, H))

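        # reshape(-1, 4) keeps the shape (0, 4) for images without annotations;
        # torchvision's detection models reject an empty target of shape (0,).
        boxes = torch.tensor(boxes, dtype=torch.float32).reshape(-1, 4)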
        labels = torch.ones((boxes.shape[0],), dtype=torch.int64)  # single class: beard

        target = {"boxes": boxes, "labels": labels}

        if self.transforms is not None:
            image = self.transforms(image)

        return image, target

    def __len__(self):
        return len(self.img_paths)

def collate_fn(batch):
    images, targets = list(zip(*batch))
    return list(images), list(targets)
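
Because the dataset class silently skips label lines that do not have exactly five fields, it can help to audit the label files before training. The sketch below is a hypothetical, standard-library-only checker (`audit_yolo_label` is not part of the notebook) that counts valid, malformed, and out-of-range lines in one file. The demo writes a temporary file rather than touching BFSET data.

```python
import os
import tempfile


def audit_yolo_label(path):
    """Count valid, malformed, and out-of-range lines in one YOLO label file."""
    counts = {"valid": 0, "malformed": 0, "out_of_range": 0}
    with open(path, "r") as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # ignore blank lines
            try:
                values = [float(p) for p in parts]
            except ValueError:
                counts["malformed"] += 1
                continue
            if len(values) != 5:
                counts["malformed"] += 1
                continue
            _, xc, yc, w, h = values
            in_range = all(0.0 <= v <= 1.0 for v in (xc, yc, w, h))
            if in_range and w > 0 and h > 0:
                counts["valid"] += 1
            else:
                counts["out_of_range"] += 1
    return counts


# Demo on a temporary file with one good line and two problematic ones.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w") as f:
        f.write("0 0.5 0.5 0.2 0.4\n")   # valid
        f.write("0 0.5 0.5 0.2\n")       # malformed: only four fields
        f.write("0 0.5 0.5 0.0 0.4\n")   # zero width: degenerate box
    report = audit_yolo_label(demo_path)

print(report)
```

Zero-width or zero-height boxes are worth catching early, since torchvision's detection models check that every training box has positive width and height.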

## 5. Data loaders

We now create training and validation loaders with a small batch size,
suitable for Colab GPUs (or CPU if necessary).

In [None]:
train_dataset = BFSETFasterRCNNDataset(TRAIN_IMG, TRAIN_LABEL)
val_dataset   = BFSETFasterRCNNDataset(VAL_IMG, VAL_LABEL)

train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=4,
    shuffle=True,
    collate_fn=collate_fn
)

val_loader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=4,
    shuffle=False,
    collate_fn=collate_fn
)

len(train_dataset), len(val_dataset)
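
The loaders rely on the custom `collate_fn` because detection samples cannot be stacked into one tensor: images may differ in size and each target holds a different number of boxes. The toy example below uses plain strings and lists as stand-ins for tensors (an assumption made purely for illustration) to show the regrouping.

```python
def collate_fn(batch):
    # Same logic as the notebook's collate_fn: regroup (image, target) pairs
    # into a list of images and a list of targets without stacking them.
    images, targets = list(zip(*batch))
    return list(images), list(targets)


# Stand-ins for tensors: "images" of different sizes, targets with
# different numbers of boxes.
batch = [
    ("image_640x480", {"boxes": [[10, 10, 50, 50]], "labels": [1]}),
    ("image_800x600", {"boxes": [], "labels": []}),
]
images, targets = collate_fn(batch)
print(images)
print([len(t["boxes"]) for t in targets])
```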

## 6. Faster R-CNN model

We use a pre-trained Faster R-CNN with a ResNet-50 backbone and an FPN.
The classification head is replaced to predict two classes: background and beard.


In [None]:
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a model pre-trained on COCO
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the classifier with a new one for 2 classes (background + beard)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, 2)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
device

## 7. Training loop

We train for a small number of epochs (e.g. 5) to obtain a first baseline on
the 500-image subset. The goal is not to fully optimize the model, but to
obtain meaningful detection metrics.

In [None]:
import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=3e-4)

num_epochs = 5

for epoch in range(num_epochs):
    model.train()
    total_loss = 0.0

    for images, targets in train_loader:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()

        total_loss += losses.item()
    # Report the loss accumulated over all batches of this epoch.
    print(f"Epoch {epoch+1}/{num_epochs} – Training loss (sum over batches): {total_loss:.4f}")

## 8. Simple evaluation (mAP@IoU=0.5 approximation)

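This section computes a simple precision-like metric based on the IoU between
predicted and ground-truth boxes. For each validation image, it takes the
predictions with a score of at least `score_threshold` and reports the fraction
whose best IoU with any ground-truth box reaches `iou_threshold`, then averages
over images. This is only a rough proxy for mAP@0.5: it uses a single score
threshold, ignores recall, counts duplicate detections of the same object as
correct, and skips images with no predictions or no ground-truth boxes. It is
intended for a quick small-scale comparison with YOLO, not as a substitute for a
full COCO-style evaluation.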

In [None]:
from torchvision.ops import box_iou
import numpy as np

def evaluate_map50(model, data_loader, score_threshold=0.5, iou_threshold=0.5):
    model.eval()
    all_precisions = []

    with torch.no_grad():
        for images, targets in data_loader:
            images = [img.to(device) for img in images]
            outputs = model(images)

            for output, target in zip(outputs, targets):
                # Filter predictions by score
                scores = output["scores"].cpu()
                keep = scores >= score_threshold
                pred_boxes = output["boxes"][keep].cpu()

                gt_boxes = target["boxes"].cpu()

                if len(pred_boxes) == 0 or len(gt_boxes) == 0:
                    continue

                ious = box_iou(pred_boxes, gt_boxes)
                max_iou, _ = ious.max(dim=1)

                precision = (max_iou >= iou_threshold).float().mean().item()
                all_precisions.append(precision)

    if len(all_precisions) == 0:
        return 0.0

    return float(np.mean(all_precisions))

map50 = evaluate_map50(model, val_loader)
print(f"Approximate mAP@0.5 on validation set: {map50:.4f}")
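
To illustrate how duplicate detections and missed objects affect the numbers, the following pure-Python sketch matches predictions to ground-truth boxes one-to-one, in order of decreasing score, and counts true positives, false positives, and false negatives. The function names (`iou_xyxy`, `match_detections`) and the toy boxes are assumptions for illustration. This is one common greedy matching scheme, not the notebook's metric and not a full mAP implementation.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as [x_min, y_min, x_max, y_max]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(preds, gts, iou_threshold=0.5):
    """Greedy one-to-one matching; preds are (score, box) pairs.

    Returns (true_positives, false_positives, false_negatives).
    """
    matched = set()
    tp = fp = 0
    # Higher-scoring predictions get the first chance to claim a box.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j in matched:
                continue  # each ground-truth box can be claimed only once
            iou = iou_xyxy(box, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            tp += 1
        else:
            fp += 1
    fn = len(gts) - len(matched)
    return tp, fp, fn


# Two predictions overlap the same ground-truth box; a second box is missed.
gts = [[10, 10, 50, 50], [100, 100, 140, 140]]
preds = [(0.9, [12, 12, 50, 50]), (0.8, [10, 10, 48, 48])]
tp, fp, fn = match_detections(preds, gts)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
print(tp, fp, fn, precision, recall)
```

In this toy case both predictions overlap the first ground-truth box with IoU above 0.5, so a per-prediction best-IoU check would count both as correct, whereas one-to-one matching treats the second as a duplicate and also registers the missed object.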

## 9. Visualizing predictions

To better understand the qualitative behaviour of the detector, we show a few
validation images with predicted beard bounding boxes overlaid.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import cv2

def show_prediction(image_path, model, score_threshold=0.5):
    model.eval()
    img = Image.open(image_path).convert("RGB")
    t = T.ToTensor()(img).to(device)

    with torch.no_grad():
        output = model([t])[0]

    boxes = output["boxes"].cpu().numpy()
    scores = output["scores"].cpu().numpy()

    img_np = np.array(img)

    for box, score in zip(boxes, scores):
        if score < score_threshold:
            continue
        x1, y1, x2, y2 = box.astype(int)
        cv2.rectangle(img_np, (x1, y1), (x2, y2), (0, 255, 0), 2)

    plt.figure(figsize=(6, 6))
    plt.imshow(img_np)
    plt.axis('off')
    plt.show()

# Example usage on one validation image (change index if needed)
example_idx = 0
example_path = val_dataset.img_paths[example_idx]
print("Example image:", example_path)
show_prediction(example_path, model)

## 10. Saving the trained model

In [None]:
MODEL_PATH = "faster_rcnn_bfset_smallscale.pth"
torch.save(model.state_dict(), MODEL_PATH)
print("Model saved to", MODEL_PATH)

## 11. Conclusion

This notebook provides a complete, reproducible Faster R-CNN baseline on a
500-image subset of the BFSET dataset. The results (training loss, approximate
mAP@0.5, and qualitative visualizations) can be used directly in the experimental
section of the BFSET article, and the notebook can be published on GitHub as:

`notebooks/02_bfset_faster_rcnn_smallscale.ipynb`.