# Road Defect Detection with SVRDD Dataset

This notebook walks through training an object detection model on the SVRDD dataset.

## Project Overview
- **Dataset**: SVRDD (Zenodo)
- **Classes**: 7 defect types (Alligator Crack, Longitudinal Crack, Transverse Crack, Pothole, Longitudinal Patch, Transverse Patch, Manhole Cover)
- **Model**: YOLOv8 for object detection
- **Goal**: Detect road defects in real-world footage

## 1. Setup and Installation

In [None]:
# Install required packages
!pip install ultralytics opencv-python pillow matplotlib pandas seaborn
!pip install roboflow  # Optional: if using Roboflow for dataset management

In [None]:
# Import libraries
import os
import json
import shutil
from pathlib import Path
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from ultralytics import YOLO
import pandas as pd

# Set random seed for reproducibility
np.random.seed(42)

print("Libraries imported successfully!")

## 2. Download SVRDD Dataset

Download the SVRDD dataset from Zenodo: https://zenodo.org/records/4070548

License: CC BY 4.0

In [None]:
# Define paths
DATA_DIR = Path("./data")
SVRDD_DIR = DATA_DIR / "svrdd"
IMAGES_DIR = SVRDD_DIR / "images"
ANNOTATIONS_DIR = SVRDD_DIR / "annotations"

# Create directories if they don't exist
DATA_DIR.mkdir(exist_ok=True)
SVRDD_DIR.mkdir(exist_ok=True)

print(f"Dataset directory: {SVRDD_DIR}")
print("\nPlease download the SVRDD dataset from Zenodo and extract it to the above directory.")
print("Expected structure:")
print("  data/svrdd/")
print("    ├── images/")
print("    └── annotations/")

## 3. Explore the Dataset

In [None]:
# Define class names based on SVRDD
CLASS_NAMES = [
    "alligator_crack",
    "longitudinal_crack",
    "transverse_crack",
    "pothole",
    "longitudinal_patch",
    "transverse_patch",
    "manhole_cover"
]

print(f"Number of classes: {len(CLASS_NAMES)}")
print(f"Classes: {CLASS_NAMES}")

In [None]:
# Check dataset structure
if IMAGES_DIR.exists():
    image_files = list(IMAGES_DIR.glob("**/*.jpg")) + list(IMAGES_DIR.glob("**/*.png"))
    print(f"Total images found: {len(image_files)}")
    
    # Display sample image
    if len(image_files) > 0:
        sample_img = Image.open(image_files[0])
        plt.figure(figsize=(10, 6))
        plt.imshow(sample_img)
        plt.title(f"Sample Image: {image_files[0].name}")
        plt.axis('off')
        plt.show()
else:
    print(f"Images directory not found at {IMAGES_DIR}")
    print("Please download and extract the SVRDD dataset first.")
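
Beyond viewing a sample image, it helps to know how many boxes each class has, since rare classes often end up with weak recall. The following is a minimal sketch that assumes the annotations are available as a COCO-style dictionary with `categories` and `annotations` keys. The helper name `count_boxes_per_class` and the toy data are hypothetical and only illustrate the idea.

```python
from collections import Counter


def count_boxes_per_class(coco_data):
    """Return {category_name: number_of_boxes} for a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco_data["categories"]}
    counts = Counter(ann["category_id"] for ann in coco_data["annotations"])
    # Include classes with zero boxes so that gaps are visible
    return {name: counts.get(cat_id, 0) for cat_id, name in id_to_name.items()}


# Tiny hand-made example (not real SVRDD data)
toy_coco = {
    "categories": [
        {"id": 1, "name": "alligator_crack"},
        {"id": 4, "name": "pothole"},
    ],
    "annotations": [
        {"image_id": 10, "category_id": 1, "bbox": [0, 0, 5, 5]},
        {"image_id": 11, "category_id": 1, "bbox": [2, 2, 8, 8]},
    ],
}

for name, count in count_boxes_per_class(toy_coco).items():
    print(f"{name}: {count}")
```

With a real annotation file, load it with `json.load` first and pass the resulting dictionary to the helper.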

## 4. Convert COCO to YOLO Format

SVRDD uses COCO format annotations. We need to convert them to YOLO format for training.

In [None]:
def coco_to_yolo_bbox(bbox, img_width, img_height):
    """
    Convert COCO bounding box [x_min, y_min, width, height] to YOLO format [x_center, y_center, width, height]
    All values normalized to [0, 1]
    """
    x_min, y_min, width, height = bbox
    
    # Calculate center coordinates
    x_center = (x_min + width / 2) / img_width
    y_center = (y_min + height / 2) / img_height
    
    # Normalize width and height
    norm_width = width / img_width
    norm_height = height / img_height
    
    return [x_center, y_center, norm_width, norm_height]


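def convert_coco_to_yolo(coco_json_path, output_dir, images_dir):
    """
    Convert COCO format annotations to YOLO format (one .txt file per annotated image)

    Args:
        coco_json_path: Path to the COCO annotation JSON file
        output_dir: Directory where the YOLO label files are written
        images_dir: Directory containing the images (currently unused;
            image sizes are read from the COCO JSON)
    """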
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    
    # Load COCO annotations
    with open(coco_json_path, 'r') as f:
        coco_data = json.load(f)
    
    # Create image id to filename mapping
    images_dict = {img['id']: img for img in coco_data['images']}
    
    # Group annotations by image_id
    annotations_by_image = {}
    for ann in coco_data['annotations']:
        image_id = ann['image_id']
        if image_id not in annotations_by_image:
            annotations_by_image[image_id] = []
        annotations_by_image[image_id].append(ann)
    
    # Convert each image's annotations
    for image_id, anns in annotations_by_image.items():
        img_info = images_dict[image_id]
        img_width = img_info['width']
        img_height = img_info['height']
        img_filename = Path(img_info['file_name']).stem
        
        # Create YOLO format txt file
        yolo_txt_path = output_dir / f"{img_filename}.txt"
        
        with open(yolo_txt_path, 'w') as f:
            for ann in anns:
                # Assumes COCO category_id values are contiguous and 1-indexed;
                # YOLO class_id is 0-indexed. Check coco_data['categories'] first.
                class_id = ann['category_id'] - 1
                bbox = ann['bbox']
                
                # Convert to YOLO format
                yolo_bbox = coco_to_yolo_bbox(bbox, img_width, img_height)
                
                # Write to file
                f.write(f"{class_id} {' '.join(map(str, yolo_bbox))}\n")
    
    print(f"Conversion complete! YOLO labels saved to {output_dir}")
    print(f"Total images processed: {len(annotations_by_image)}")


# Example usage (uncomment when you have the COCO annotation file)
# convert_coco_to_yolo(
#     coco_json_path="data/svrdd/annotations/instances_train.json",
#     output_dir="data/svrdd/labels/train",
#     images_dir="data/svrdd/images/train"
# )
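
As a worked example of the conversion: a COCO box `[100, 200, 50, 40]` in a 1000x800 image has its center at `(100 + 25) / 1000 = 0.125` and `(200 + 20) / 800 = 0.275`, with normalized size `50 / 1000 = 0.05` and `40 / 800 = 0.05`. After converting, it can be worth checking that every label line is well formed. The sketch below is one way to do that. `validate_yolo_line` is a hypothetical helper, not part of Ultralytics, and the default of 7 classes is taken from the class list above.

```python
def validate_yolo_line(line, num_classes=7, eps=1e-6):
    """Return True if a YOLO label line has a valid class id and an in-image box."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        class_id = int(parts[0])
        x_c, y_c, w, h = map(float, parts[1:])
    except ValueError:
        return False
    if not 0 <= class_id < num_classes:
        return False
    if w <= 0 or h <= 0:
        return False
    # The box edges must stay inside the normalized image [0, 1]
    return (
        x_c - w / 2 >= -eps
        and x_c + w / 2 <= 1 + eps
        and y_c - h / 2 >= -eps
        and y_c + h / 2 <= 1 + eps
    )


# Values from the worked example above
print(validate_yolo_line("0 0.125 0.275 0.05 0.05"))
# Class id outside the 7 classes
print(validate_yolo_line("7 0.5 0.5 0.1 0.1"))
```

Running this over every line of every generated `.txt` file is a cheap way to catch off-by-one class ids or boxes that extend past the image border.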

## 5. Create Dataset Configuration for YOLO

In [None]:
# Create data.yaml for YOLO training
yaml_content = f"""# SVRDD Dataset Configuration
path: {SVRDD_DIR.absolute()}  # dataset root dir
train: images/train  # train images (relative to 'path')
val: images/val      # val images (relative to 'path')
test: images/test    # test images (optional)

# Classes
names:
  0: alligator_crack
  1: longitudinal_crack
  2: transverse_crack
  3: pothole
  4: longitudinal_patch
  5: transverse_patch
  6: manhole_cover
"""

yaml_path = SVRDD_DIR / "svrdd.yaml"
with open(yaml_path, 'w') as f:
    f.write(yaml_content)

print(f"Dataset configuration saved to {yaml_path}")
print("\nConfiguration:")
print(yaml_content)
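
The configuration above expects `images/train`, `images/val`, and `images/test` folders. If the downloaded data does not already come with a split, one simple option is a seeded random split over image names, sketched below. The helper `split_dataset` and the 70/15/15 proportions are assumptions for illustration. Note that consecutive street-view frames can look very similar, so a purely random split may leak near-duplicates between train and validation.

```python
import random


def split_dataset(stems, val_frac=0.15, test_frac=0.15, seed=42):
    """Split image names into train/val/test lists, reproducibly."""
    stems = sorted(stems)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(stems)
    n_val = int(len(stems) * val_frac)
    n_test = int(len(stems) * test_frac)
    return {
        "val": stems[:n_val],
        "test": stems[n_val:n_val + n_test],
        "train": stems[n_val + n_test:],
    }


# Hypothetical image names, for illustration only
example_stems = [f"img_{i:04d}" for i in range(100)]
splits = split_dataset(example_stems)
for split_name, items in splits.items():
    print(split_name, len(items))
```

Each image and its matching label file then need to be copied or moved into the corresponding `images/<split>` and `labels/<split>` folders.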

## 6. Train YOLOv8 Model

We start with YOLOv8s, which offers a balance between speed and accuracy.

In [None]:
# Initialize YOLOv8 model
# Options: yolov8n.pt (nano), yolov8s.pt (small), yolov8m.pt (medium), yolov8l.pt (large), yolov8x.pt (extra large)
model = YOLO("yolov8s.pt")  # Start with small model

print("Model loaded successfully!")
print("Model type: YOLOv8s")

In [None]:
# Training parameters
TRAINING_CONFIG = {
    'data': str(yaml_path),           # Path to dataset YAML
    'epochs': 80,                     # Number of training epochs
    'imgsz': 1280,                    # Input image size
    'batch': 16,                      # Batch size (adjust based on GPU memory)
    'workers': 8,                     # Number of dataloader workers
    'lr0': 0.01,                      # Initial learning rate
    'cos_lr': True,                   # Use cosine learning rate scheduler
    'patience': 50,                   # Early stopping patience
    'save': True,                     # Save training checkpoints
    'device': 0,                      # GPU device (0 for first GPU, 'cpu' for CPU)
    'project': 'runs/detect',         # Project directory
    'name': 'svrdd_yolov8s',         # Experiment name
    
    # Augmentation settings for domain gap
    'hsv_h': 0.015,                   # HSV-Hue augmentation
    'hsv_s': 0.7,                     # HSV-Saturation augmentation
    'hsv_v': 0.4,                     # HSV-Value augmentation
    'degrees': 0.0,                   # Rotation augmentation
    'translate': 0.1,                 # Translation augmentation
    'scale': 0.5,                     # Scale augmentation
    'shear': 0.0,                     # Shear augmentation
    'perspective': 0.0,               # Perspective augmentation
    'flipud': 0.0,                    # Vertical flip probability
    'fliplr': 0.5,                    # Horizontal flip probability
    'mosaic': 1.0,                    # Mosaic augmentation probability
    'mixup': 0.0,                     # Mixup augmentation probability
}

print("Training configuration:")
for key, value in TRAINING_CONFIG.items():
    print(f"  {key}: {value}")
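
To get a feel for what `cos_lr` does to the learning rate over the 80 epochs, the sketch below evaluates a generic cosine decay from `lr0` down to a small final value. This is an illustration of the general shape, not Ultralytics' exact code: the real schedule also includes warmup, and the final fraction of 0.01 used here is an assumption.

```python
import math


def cosine_lr(epoch, total_epochs, lr0=0.01, final_fraction=0.01):
    """Generic cosine decay from lr0 to lr0 * final_fraction (no warmup)."""
    lr_final = lr0 * final_fraction
    cos_term = 0.5 * (1 + math.cos(math.pi * epoch / total_epochs))
    return lr_final + (lr0 - lr_final) * cos_term


for epoch in (0, 20, 40, 60, 80):
    print(f"epoch {epoch:2d}: lr = {cosine_lr(epoch, 80):.5f}")
```

The rate falls slowly at first, fastest around the midpoint, and flattens out again near the end of training.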

In [None]:
# Start training (uncomment when ready)
# results = model.train(**TRAINING_CONFIG)

print("\nTo start training, uncomment the line above.")
print("Training will take several hours depending on your GPU.")

## 7. Evaluate Model Performance

In [None]:
# Load trained model for evaluation
# trained_model = YOLO("runs/detect/svrdd_yolov8s/weights/best.pt")

# Evaluate on validation set
# metrics = trained_model.val(data=str(yaml_path), imgsz=1280)

# Print metrics
# print(f"\nValidation Results:")
# print(f"mAP50: {metrics.box.map50:.4f}")
# print(f"mAP50-95: {metrics.box.map:.4f}")
# print(f"Precision: {metrics.box.mp:.4f}")
# print(f"Recall: {metrics.box.mr:.4f}")
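
The mAP50 figure counts a prediction as correct when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5, and mAP50-95 averages over IoU thresholds from 0.5 to 0.95. The sketch below computes IoU for two boxes in `(x1, y1, x2, y2)` pixel format. `box_iou` is a hypothetical helper written for illustration and is independent of the Ultralytics implementation.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes do not overlap)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes shifted by half a box width: overlap 50, union 150
iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
print(f"IoU = {iou:.3f}, counts as a match at 0.5: {iou >= 0.5}")
```

Thin cracks produce long, narrow boxes, where a small localization error lowers IoU sharply, so mAP50-95 can be much lower than mAP50 for those classes.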

## 8. Run Inference on Custom Video/Images

Test the model on your own road footage to see how well it handles domain shift.

In [None]:
# Load trained model
# trained_model = YOLO("runs/detect/svrdd_yolov8s/weights/best.pt")

# Run inference on custom images
# test_image_path = "path/to/your/test/image.jpg"
# results = trained_model.predict(source=test_image_path, imgsz=1280, conf=0.25, save=True)

# Display results
# for result in results:
#     boxes = result.boxes
#     print(f"Detected {len(boxes)} objects")
#     
#     # Plot results
#     result.show()

In [None]:
# Extract frames from video for testing
def extract_video_frames(video_path, output_dir, frame_interval=30):
    """
    Extract frames from video at regular intervals
    
    Args:
        video_path: Path to video file
        output_dir: Directory to save frames
        frame_interval: Extract every Nth frame
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    
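    cap = cv2.VideoCapture(str(video_path))  # str() so Path objects also work
    if not cap.isOpened():
        raise IOError(f"Could not open video: {video_path}")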
    frame_count = 0
    saved_count = 0
    
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        
        if frame_count % frame_interval == 0:
            frame_filename = output_dir / f"frame_{saved_count:06d}.jpg"
            cv2.imwrite(str(frame_filename), frame)
            saved_count += 1
        
        frame_count += 1
    
    cap.release()
    print(f"Extracted {saved_count} frames from {frame_count} total frames")
    print(f"Frames saved to {output_dir}")

# Example usage
# extract_video_frames("path/to/your/video.mp4", "data/test_frames", frame_interval=30)
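
A fixed `frame_interval` samples the road more densely at low speed than at high speed. If the vehicle speed is roughly known, the interval can instead be chosen to give an approximate spacing in meters between saved frames. The sketch below shows the arithmetic. `frame_interval_for_spacing` is a hypothetical helper, and it assumes constant speed and a constant frame rate.

```python
def frame_interval_for_spacing(fps, speed_kmh, spacing_m):
    """Frame interval that gives roughly spacing_m meters between saved frames."""
    if fps <= 0 or speed_kmh <= 0 or spacing_m <= 0:
        raise ValueError("fps, speed_kmh and spacing_m must be positive")
    meters_per_second = speed_kmh / 3.6
    meters_per_frame = meters_per_second / fps
    # Never return less than 1, i.e. at most every frame is saved
    return max(1, round(spacing_m / meters_per_frame))


# 30 fps at 50 km/h is about 0.46 m per frame
interval = frame_interval_for_spacing(fps=30, speed_kmh=50, spacing_m=10)
print(f"Save every {interval}th frame for roughly 10 m spacing")
```

The result can be passed as `frame_interval` to `extract_video_frames`.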

## 9. Calculate Operational Metrics

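Track defects per 10 km and false alarms per 10 km, as mentioned in the README. The function below reports detections per 10 km; counting false alarms additionally requires ground-truth labels or manual review of the detections.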

In [None]:
def calculate_ops_metrics(predictions, distance_km=10, fps=30, speed_kmh=50):
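    """
    Calculate operational metrics: detections per 10km, overall and per class

    Every box in every frame is counted, so a defect that is visible in
    several frames is counted several times. False alarms are not computed here.

    Args:
        predictions: List of prediction results
        distance_km: Distance covered in the video (km)
        fps: Video frames per second (currently unused)
        speed_kmh: Vehicle speed (km/h) (currently unused)
    """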
    total_detections = 0
    class_counts = {name: 0 for name in CLASS_NAMES}
    
    for result in predictions:
        boxes = result.boxes
        total_detections += len(boxes)
        
        for box in boxes:
            class_id = int(box.cls[0])
            class_counts[CLASS_NAMES[class_id]] += 1
    
    # Calculate metrics normalized to 10km
    defects_per_10km = (total_detections / distance_km) * 10
    
    print("\nOperational Metrics:")
    print(f"Total defects detected: {total_detections}")
    print(f"Defects per 10km: {defects_per_10km:.2f}")
    print("\nPer-class detections:")
    for class_name, count in class_counts.items():
        per_10km = (count / distance_km) * 10
        print(f"  {class_name}: {count} ({per_10km:.2f} per 10km)")
    
    return {
        'total_detections': total_detections,
        'defects_per_10km': defects_per_10km,
        'class_counts': class_counts
    }

# Example usage after running inference
# metrics = calculate_ops_metrics(results, distance_km=5.0)
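
When the distance driven is not logged, it can be approximated from the video length and an average speed, and any count (detections or manually confirmed false alarms) can then be normalized to 10 km. The sketch below shows that arithmetic. Both helper names are hypothetical, the constant-speed assumption is a rough one, and the false-alarm count in the example is made up for illustration.

```python
def estimate_distance_km(total_frames, fps=30, speed_kmh=50):
    """Approximate distance covered by a video, assuming constant speed."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    hours = total_frames / fps / 3600
    return hours * speed_kmh


def rate_per_10km(count, distance_km):
    """Normalize a count to a rate per 10 km."""
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return count / distance_km * 10


# 18000 frames at 30 fps is 10 minutes of video
distance = estimate_distance_km(total_frames=18000, fps=30, speed_kmh=50)
false_alarms = 12  # illustrative count from a manual review, not a real result
print(f"Estimated distance: {distance:.2f} km")
print(f"False alarms per 10km: {rate_per_10km(false_alarms, distance):.2f}")
```

GPS or odometer data, where available, gives a more reliable distance than this estimate.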

## 10. Fine-tune on Custom Data

If performance drops on your footage, fine-tune with a small labeled set (300-1000 frames).

In [None]:
# Fine-tuning configuration
FINETUNING_CONFIG = {
    'data': 'path/to/custom_data.yaml',  # Your custom dataset
    'epochs': 40,                         # Fewer epochs for fine-tuning
    'imgsz': 1280,
    'batch': 16,
    'lr0': 0.001,                         # Lower learning rate
    'cos_lr': True,
    'patience': 20,
    'device': 0,
    'project': 'runs/detect',
    'name': 'svrdd_finetuned',
    'resume': False,                      # New run from the loaded weights, not a resumed one
}

# Load best model and fine-tune
# best_model = YOLO("runs/detect/svrdd_yolov8s/weights/best.pt")
# results = best_model.train(**FINETUNING_CONFIG)

print("Fine-tuning configuration ready.")
print("Prepare your custom dataset with 300-1000 labeled frames first.")

## 11. Export Model for Deployment

In [None]:
# Export model to different formats for deployment
# trained_model = YOLO("runs/detect/svrdd_yolov8s/weights/best.pt")

# Export to ONNX (good for cross-platform deployment)
# trained_model.export(format='onnx', imgsz=1280)

# Export to TensorRT (for NVIDIA GPUs)
# trained_model.export(format='engine', imgsz=1280)

# Export to CoreML (for Apple devices)
# trained_model.export(format='coreml', imgsz=1280)

print("Available export formats: ONNX, TensorRT, CoreML, TFLite, etc.")
print("Choose based on your deployment target hardware.")

## Next Steps

1. Download SVRDD dataset from Zenodo
2. Convert annotations from COCO to YOLO format
3. Train baseline YOLOv8 model
4. Test on your custom footage
5. Fine-tune if domain gap is significant
6. Track both ML metrics (mAP) and operational metrics (defects/10km)
7. Consider expanding with RDD2022/N-RDD datasets if needed
8. Export model for deployment on target hardware

## Resources

- SVRDD Dataset: https://zenodo.org/records/4070548
- Ultralytics YOLOv8 Docs: https://docs.ultralytics.com/
- RDD2022 Dataset: Additional road damage dataset for multi-country data