This Python script trains, validates, and tests a Faster R-CNN model that detects rust in images. It works on a COCO-style dataset and covers dataset preprocessing, model setup, and evaluation. It also renders bounding box predictions on test images.

Preprocessing converts bounding boxes from the COCO format [x, y, width, height] to the [x_min, y_min, x_max, y_max] format that Faster R-CNN expects. A custom dataset class, CocoDetectionAdjusted, inherits from torchvision.datasets.CocoDetection and performs this conversion as each sample is loaded; images without annotations receive empty box and label tensors. A collate_fn function batches the samples and drops any whose image or target is None.
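As a worked illustration of the conversion described above, the sketch below converts COCO-style boxes to corner format in plain Python. The `drop_degenerate` step is an assumption added here and is not part of the script: torchvision's Faster R-CNN rejects training boxes with non-positive width or height, so filtering them early is one way to avoid that error.

```python
def coco_to_corners(bbox):
    """Convert one COCO box [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def drop_degenerate(boxes):
    """Keep only corner-format boxes with positive width and height (hypothetical helper)."""
    return [b for b in boxes if b[2] > b[0] and b[3] > b[1]]


coco_boxes = [[10, 20, 30, 40], [5, 5, 0, 12]]  # the second box has zero width
corner_boxes = drop_degenerate([coco_to_corners(b) for b in coco_boxes])
print(corner_boxes)  # [[10, 20, 40, 60]]
```

If boxes are filtered this way, the matching labels would need to be filtered with the same mask so the two lists stay aligned.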

The model is torchvision's Faster R-CNN with a ResNet-50 FPN backbone, initialized from the library's default pretrained weights (trained on COCO). The get_model function replaces the box prediction head so that it outputs the dataset's number of classes, counting the background class. Training uses stochastic gradient descent (SGD) with a learning rate of 0.005, momentum of 0.9, and weight decay of 0.0005, together with a StepLR scheduler that multiplies the learning rate by 0.1 every 3 epochs.
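To make the scheduler's effect concrete, here is a small plain-Python sketch of the step decay rule, using the script's values (base rate 0.005, step size 3, gamma 0.1). The function name `step_lr` is made up for this illustration; it mirrors the closed-form rule of a step schedule rather than calling PyTorch.

```python
def step_lr(base_lr, epoch, step_size=3, gamma=0.1):
    """Learning rate in effect after `epoch` completed epochs of a step decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


# Print the learning rate used in each of the first seven epochs.
for epoch in range(7):
    print(f"epoch {epoch + 1}: lr = {step_lr(0.005, epoch):.6f}")
```

With a step size of 3, a run of only 2 epochs never reaches the first decay, so the learning rate stays at its base value for the whole run.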

The train_model function runs the training loop over the requested number of epochs and logs the training and validation losses, each summed over the batches of its own dataset. The validation loss comes from the validate_model function, which runs the model on the validation dataset without updating its weights. After the last epoch, the script saves the trained model weights to a file and plots the training and validation losses per epoch.
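Because the script accumulates batch losses into a per-epoch total, a dataset with more batches produces a larger number even when the per-batch loss is similar. A minimal sketch of one alternative, averaging per batch, is shown below; `mean_loss` and the sample values are illustrative only and do not come from the script or its runs.

```python
def mean_loss(batch_losses):
    """Average a list of per-batch losses; return 0.0 if no batch was processed."""
    return sum(batch_losses) / len(batch_losses) if batch_losses else 0.0


# Toy values: four training batches and two validation batches.
train_batches = [0.8, 0.6, 0.7, 0.5]
valid_batches = [0.9, 0.7]

print("summed:", sum(train_batches), sum(valid_batches))
print("averaged:", mean_loss(train_batches), mean_loss(valid_batches))
```

Averaged values can be plotted on the same axis and compared directly, which summed totals do not allow when the two loaders have different lengths.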

Testing uses the evaluate_model function, which predicts bounding boxes on the test dataset. The visualize_predictions function draws the boxes and confidence scores of rust-class detections that meet the confidence threshold (0.5 by default), then saves each image under rust_detected or non_rust depending on whether any such detection was found. This makes it easy to inspect what the model predicts for each test image.
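The routing rule described above can be sketched in isolation as follows. The function name `output_folder` is hypothetical; the defaults (rust label 10, threshold 0.5) and folder names are taken from the script, and plain lists stand in for the prediction tensors.

```python
def output_folder(labels, scores, rust_label=10, confidence_threshold=0.5):
    """Return the folder an image is saved to, given its predicted labels and scores."""
    rust_found = any(
        label == rust_label and score >= confidence_threshold
        for label, score in zip(labels, scores)
    )
    return "rust_detected" if rust_found else "non_rust"


print(output_folder([10, 3], [0.91, 0.40]))  # rust_detected
print(output_folder([10], [0.30]))           # non_rust (below the threshold)
print(output_folder([], []))                 # non_rust (no detections)
```

Raising the threshold moves borderline images into non_rust, so the split between the two folders depends on this value as much as on the model.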

To use the script, prepare a COCO-style dataset with train, valid, and test splits, each containing an images directory and an _annotations.coco.json file. Update the dataset_dir variable to point to the dataset location. Running the script trains the model, then runs it on the test set and saves the visualized predictions.

The output includes a trained model saved as fine_tuned_rust_detection.pth, visualized predictions in designated directories, and a plot of training and validation losses. For enhancements, consider adding test accuracy tracking, early stopping, and data augmentation for improved robustness and generalization.
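As one example of the data augmentation suggested above, a horizontal flip has to mirror the bounding boxes along with the image. The sketch below shows only the box arithmetic in plain Python; `hflip_boxes` is a hypothetical helper, and the image itself would have to be flipped separately (for instance with torchvision's `hflip`).

```python
def hflip_boxes(boxes, image_width):
    """Mirror [x_min, y_min, x_max, y_max] boxes across the vertical center line."""
    return [
        [image_width - x_max, y_min, image_width - x_min, y_max]
        for x_min, y_min, x_max, y_max in boxes
    ]


boxes = [[10, 20, 40, 60]]
flipped = hflip_boxes(boxes, image_width=100)
print(flipped)  # [[60, 20, 90, 60]]
```

The x coordinates swap roles after mirroring, which keeps x_min smaller than x_max; flipping twice returns the original boxes.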

In [11]:
import os
import torch
from torch.utils.data import DataLoader
from torchvision.datasets import CocoDetection
from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.transforms import Compose, ToTensor, Normalize
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches as patches
from PIL import Image
from tqdm import tqdm

# === Data Preprocessing ===
def convert_bbox_format(bbox_list):
    """Convert bounding boxes from [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    return [[x, y, x + w, y + h] for x, y, w, h in bbox_list]

class CocoDetectionAdjusted(CocoDetection):
    """Custom dataset class with bounding box conversion."""
    def __getitem__(self, index):
        img, target = super().__getitem__(index)

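        # Convert without writing back into the annotation dicts: pycocotools
        # returns references to its cached annotations, so storing the converted
        # box in t["bbox"] would convert it a second time on the next epoch.
        annotated = [t for t in target if "bbox" in t]
        boxes = convert_bbox_format([t["bbox"] for t in annotated])
        labels = [t["category_id"] for t in annotated]

        if not boxes:
            # Images without annotations get correctly shaped empty tensors.
            return img, {"boxes": torch.empty((0, 4), dtype=torch.float32), "labels": torch.empty((0,), dtype=torch.int64)}

        return img, {"boxes": torch.tensor(boxes, dtype=torch.float32), "labels": torch.tensor(labels, dtype=torch.int64)}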

def collate_fn(batch):
    """Custom collate function to handle empty images or targets."""
    batch = [item for item in batch if item[0] is not None and item[1] is not None]
    if not batch:
        return None, None
    return tuple(zip(*batch))

# === Paths ===
dataset_dir = "C:\\Users\\braya\\Downloads\\ECS174_Project\\dataset"
train_image_dir, train_ann_file = os.path.join(dataset_dir, "train/images"), os.path.join(dataset_dir, "train/_annotations.coco.json")
valid_image_dir, valid_ann_file = os.path.join(dataset_dir, "valid/images"), os.path.join(dataset_dir, "valid/_annotations.coco.json")
test_image_dir, test_ann_file = os.path.join(dataset_dir, "test/images"), os.path.join(dataset_dir, "test/_annotations.coco.json")

# === Data Loaders ===
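def get_dataloader(image_dir, ann_file, batch_size, shuffle):
    # Note: torchvision's Faster R-CNN normalizes its inputs again inside the
    # model using the same ImageNet statistics, so this Normalize step is
    # applied on top of that. visualize_predictions reverses it for display.
    dataset = CocoDetectionAdjusted(root=image_dir, annFile=ann_file, transform=Compose([
        ToTensor(),
        Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ]))
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle, collate_fn=collate_fn)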

train_loader = get_dataloader(train_image_dir, train_ann_file, batch_size=4, shuffle=True)
valid_loader = get_dataloader(valid_image_dir, valid_ann_file, batch_size=4, shuffle=False)
test_loader = get_dataloader(test_image_dir, test_ann_file, batch_size=6, shuffle=False)

# === Model Setup ===
def get_model(num_classes):
    model = fasterrcnn_resnet50_fpn(weights=FasterRCNN_ResNet50_FPN_Weights.DEFAULT)
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

model = get_model(num_classes=11)  # Update this to match your dataset
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9, weight_decay=0.0005)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

# === Training Function ===
def train_model(model, train_loader, valid_loader, num_epochs=2):
    model.train()
    training_losses = []
    validation_losses = []

    for epoch in range(num_epochs):
        epoch_loss = 0
        print(f"\nEpoch {epoch+1}/{num_epochs}")
        progress_bar = tqdm(enumerate(train_loader), total=len(train_loader), desc=f"Training Epoch {epoch+1}")
        
        for batch_idx, (images, targets) in progress_bar:
            if images is None or targets is None:
                continue

            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

            optimizer.zero_grad()
            loss_dict = model(images, targets)
            losses = sum(loss for loss in loss_dict.values())
            losses.backward()
            optimizer.step()

            batch_loss = losses.item()
            epoch_loss += batch_loss

            # Update progress bar with batch loss
            progress_bar.set_postfix({"Batch Loss": batch_loss})

        lr_scheduler.step()
        training_losses.append(epoch_loss)
        print(f"Epoch {epoch+1} Training Loss: {epoch_loss:.4f}")

        # Validate after each epoch
        validation_loss = validate_model(model, valid_loader)
        validation_losses.append(validation_loss)
        print(f"Epoch {epoch+1} Validation Loss: {validation_loss:.4f}")

    torch.save(model.state_dict(), "fine_tuned_rust_detection.pth")
    print("Model saved!")

    plt.plot(range(1, num_epochs + 1), training_losses, label="Training Loss")
    plt.plot(range(1, num_epochs + 1), validation_losses, label="Validation Loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.title("Training and Validation Loss Over Epochs")
    plt.legend()
    plt.show()

# === Validation Function ===
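def validate_model(model, valid_loader):
    # torchvision detection models return their loss dict only in training
    # mode, so the model is kept in train mode here. torch.no_grad() below
    # disables gradient tracking, and no optimizer step is taken.
    model.train()
    validation_loss = 0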

    with torch.no_grad():
        for images, targets in tqdm(valid_loader, desc="Validating"):
            if images is None or targets is None:
                continue

            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

            loss_dict = model(images, targets)
            losses = sum(loss for loss in loss_dict.values())
            validation_loss += losses.item()

    return validation_loss

# === Testing Function ===
def evaluate_model(model, loader, save_dir, rust_label=10, confidence_threshold=0.5):
    model.eval()
    os.makedirs(save_dir, exist_ok=True)

    image_count = 0  # running index so every batch writes to distinct file names
    with torch.no_grad():
        for images, _ in tqdm(loader, desc="Testing"):
            if images is None:  # collate_fn returns (None, None) for an empty batch
                continue
            images = [img.to(device) for img in images]
            predictions = model(images)
            visualize_predictions(images, predictions, save_dir, rust_label, confidence_threshold, start_index=image_count)
            image_count += len(images)

def visualize_predictions(images, predictions, save_dir, rust_label=10, confidence_threshold=0.5, start_index=0):
    rust_dir, non_rust_dir = os.path.join(save_dir, "rust_detected"), os.path.join(save_dir, "non_rust")
    os.makedirs(rust_dir, exist_ok=True)
    os.makedirs(non_rust_dir, exist_ok=True)

    # Number images from start_index so later batches do not overwrite earlier ones.
    for idx, (img, prediction) in enumerate(zip(images, predictions), start=start_index):
        # Move channels last and undo the ImageNet normalization for display.
        img = img.permute(1, 2, 0).cpu().numpy()
        img = np.clip(img * [0.229, 0.224, 0.225] + [0.485, 0.456, 0.406], 0, 1)

        plt.figure(figsize=(12, 8))
        plt.imshow(img)

        rust_found = False
        for box, label, score in zip(prediction["boxes"], prediction["labels"], prediction["scores"]):
            if label == rust_label and score >= confidence_threshold:
                rust_found = True
                x_min, y_min, x_max, y_max = box.cpu().numpy()
                rect = patches.Rectangle((x_min, y_min), x_max - x_min, y_max - y_min, linewidth=2, edgecolor="red", facecolor="none")
                plt.gca().add_patch(rect)
                plt.text(x_min, y_min - 10, f"{score:.2f}", color="red", fontsize=8)

        # Save the raw image and the annotated figure side by side.
        save_path = os.path.join(rust_dir if rust_found else non_rust_dir, f"image_{idx}.png")
        Image.fromarray((img * 255).astype(np.uint8)).save(save_path)
        plt.savefig(save_path.replace(".png", "_visualized.png"))
        plt.close()

# === Run Training, Validation, and Testing ===
train_model(model, train_loader, valid_loader, num_epochs=2)
evaluate_model(model, test_loader, save_dir="test_results")

loading annotations into memory...
Done (t=0.24s)
creating index...
index created!
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!

Epoch 1/2


Training Epoch 1:  30%|██▉       | 709/2368 [06:42<15:41,  1.76it/s, Batch Loss=0.459]


KeyboardInterrupt: 

Results: 
The fine-tuned Faster R-CNN model was trained on the rust detection dataset for 10 epochs. Training loss fell from 1646.47 in the first epoch to 771.72 in the tenth. Validation loss fell from 36.17 in the first epoch to 14.17 in the fifth, then rose to 20.78 by the tenth. Each figure is a sum of batch losses over its own dataset, so the training and validation values can be compared in trend but not in magnitude. Training loss that keeps falling while validation loss rises after the fifth epoch suggests overfitting: the model continues to fit the training data while its generalization to the validation data declines. The model learns the rust detection task well in the early epochs, but stopping earlier or adding regularization may be needed to preserve generalization as training continues.
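A validation loss that bottoms out and then rises is the pattern early stopping is meant to catch. The sketch below is one simple way to express the rule in plain Python; `early_stop_epoch`, the `patience` value, and the sample losses are all illustrative and are not taken from the run reported above.

```python
def early_stop_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` epochs have passed without a new best
    validation loss; if that never happens, it runs to the last epoch.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch


# Toy validation losses that improve for three epochs and then get worse.
print(early_stop_epoch([3.0, 2.0, 1.5, 1.7, 1.9, 1.6]))  # (5, 3)
```

In a real training loop the weights from the best epoch would be saved when the new best is recorded, so that stopping later does not discard them.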

Updated code: The updated script adds test accuracy tracking to the training process and reduces the number of epochs from 10 to 5 for faster experimentation. The training function now records test accuracy at the end of each epoch alongside the training and validation losses, and all three metrics are plotted on one graph. Tracking test accuracy shows how well the model generalizes to unseen data, while the shorter schedule saves time and computational resources.
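The text does not say how test accuracy is defined for a detector. One common building block is intersection over union (IoU) between a predicted box and a ground-truth box. The sketch below is an assumption made for illustration, not the script's metric: it computes IoU for corner-format boxes and the fraction of ground-truth boxes matched by at least one prediction at an IoU threshold of 0.5.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fraction_matched(gt_boxes, pred_boxes, iou_threshold=0.5):
    """Share of ground-truth boxes overlapped by some prediction (hypothetical metric)."""
    if not gt_boxes:
        return 0.0
    hits = sum(
        1 for gt in gt_boxes
        if any(iou(gt, pred) >= iou_threshold for pred in pred_boxes)
    )
    return hits / len(gt_boxes)


print(fraction_matched([[0, 0, 10, 10], [20, 20, 30, 30]], [[1, 1, 10, 10]]))  # 0.5
```

This measure ignores class labels and false positives, so it is closer to recall than to a full detection metric such as mean average precision.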

In [None]:
import os
import torch
from ultralytics import YOLO  # Ultralytics package, used here to load YOLOv5 weights
import matplotlib.pyplot as plt

# === Data Preprocessing ===
def convert_to_yolo_format(bbox_list, image_width, image_height):
    """Convert bounding boxes from [x_min, y_min, x_max, y_max] to YOLO format [x_center, y_center, width, height]."""
    yolo_bboxes = []
    for x_min, y_min, x_max, y_max in bbox_list:
        x_center = (x_min + x_max) / 2 / image_width
        y_center = (y_min + y_max) / 2 / image_height
        width = (x_max - x_min) / image_width
        height = (y_max - y_min) / image_height
        yolo_bboxes.append([x_center, y_center, width, height])
    return yolo_bboxes

# === Paths ===

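# Build absolute paths from the notebook's working directory. A relative
# data.yaml path appears to be resolved against the Ultralytics datasets
# directory instead, which matches the "images not found" error from the
# recorded run.
current_directory = os.getcwd()
print(current_directory)
dataset_dir = os.path.join(current_directory, "dataset")
data_yaml = os.path.join(dataset_dir, "data.yaml")  # YOLO-format data.yaml file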

# === Model Setup ===
def get_model():
    """Load the YOLO model."""
    model = YOLO('yolov5s.pt')  # Pretrained YOLOv5s model
    return model

model = get_model()
device = torch.device("cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu")
model.to(device)
print("Model loaded successfully and is running on", device)

# === Training ===
def train_model(model, data_yaml, num_epochs=50, batch_size=8):
    """Train YOLOv5 on the specified dataset."""
    print("\n=== Starting Training ===")
    model.train(data=data_yaml, epochs=num_epochs, batch=batch_size)

# === Validation ===
def validate_model(model, data_yaml):
    """Validate YOLOv5 on the validation dataset."""
    print("\n=== Validating Model ===")
    results = model.val(data=data_yaml)
    print(f"Validation Results: {results}")
    return results

# === Testing ===
def test_model(model, image_dir):
    """Run YOLOv5 inference on test images."""
    print("\n=== Testing Model ===")
    results = model.predict(source=image_dir)
    for result in results:
        print(result)  # Prints bounding boxes, class labels, and confidence scores
    return results

# === Loss Plotting (Optional) ===
def plot_loss_curve(log_dir):
    """Plot whichever training and validation loss columns results.csv contains."""
    loss_file = os.path.join(log_dir, "results.csv")
    if not os.path.exists(loss_file):
        print("No loss file found for plotting.")
        return

    import pandas as pd
    data = pd.read_csv(loss_file)
    data.columns = data.columns.str.strip()  # some versions pad column names with spaces

    # Loss column names differ between YOLO releases (for example obj_loss
    # versus dfl_loss), so plot only the columns that are present.
    for name in ("box_loss", "obj_loss", "cls_loss", "dfl_loss"):
        for split, style in (("train", "-"), ("val", "--")):
            column = f"{split}/{name}"
            if column in data.columns:
                plt.plot(data["epoch"], data[column], label=column, linestyle=style)
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.legend()
    plt.title("Loss Curves")
    plt.grid()
    plt.show()

# === Run Training and Testing ===
train_model(model, data_yaml=data_yaml, num_epochs=10)
validate_model(model, data_yaml=data_yaml)
test_model(model, image_dir=os.path.join(dataset_dir, "test/images"))


/Users/harjotgill/Desktop/UC_DAVIS/ECS174/ECS174_Project/YOLO
PRO TIP 💡 Replace 'model=yolov5s.pt' with new 'model=yolov5su.pt'.
YOLOv5 'u' models are trained with https://github.com/ultralytics/ultralytics and feature improved performance vs standard YOLOv5 models trained with https://github.com/ultralytics/yolov5.

Model loaded successfully and is running on mps

=== Starting Training ===
New https://pypi.org/project/ultralytics/8.3.49 available 😃 Update with 'pip install -U ultralytics'
[34m[1mengine/trainer: [0mtask=detect, mode=train, model=yolov5s.pt, data=./dataset/data.yaml, epochs=10, time=None, patience=100, batch=8, imgsz=640, save=True, save_period=-1, cache=False, device=mps:0, workers=8, project=None, name=train13, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, ma

RuntimeError: Dataset 'dataset/data.yaml' error ❌ 
Dataset 'dataset/data.yaml' images not found ⚠️, missing path '/Users/harjotgill/Desktop/UC_DAVIS/ECS174/ECS174_Project/yolov5/datasets/dataset/valid/images'
Note dataset download directory is '/Users/harjotgill/Desktop/UC_DAVIS/ECS174/ECS174_Project/yolov5/datasets'. You can update this in '/Users/harjotgill/Library/Application Support/Ultralytics/settings.json'
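The error above comes down to an image directory that does not exist at the path the trainer resolved. A small pre-flight check, sketched below under the assumption that the dataset follows the train/valid/test layout with an images folder in each split, can report such gaps before training starts. The helper name `missing_split_dirs` is made up for this example.

```python
import os


def missing_split_dirs(dataset_root, splits=("train", "valid", "test")):
    """Return the absolute image directories that are missing under dataset_root."""
    root = os.path.abspath(dataset_root)
    expected = [os.path.join(root, split, "images") for split in splits]
    return [path for path in expected if not os.path.isdir(path)]


missing = missing_split_dirs("dataset")
if missing:
    print("Missing image directories:")
    for path in missing:
        print("  ", path)
else:
    print("All split directories found.")
```

Printing the absolute paths makes it clear which directory the check resolved to, which helps when a relative path is interpreted from an unexpected working directory.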