<a href="https://colab.research.google.com/github/0Nguyen0Cong0Tuan0/Road-Buddy-Challenge/blob/main/models/yolo_finetune.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **YOLO11 Fine-tuning for Traffic Object Detection**


This notebook **fine-tunes YOLO11n and YOLO11l** on custom traffic datasets to improve detection of **road objects** such as cars, trucks, buses, lanes, traffic lights, and road signs, and to **exclude unrelated classes** such as toothbrush, skis, and wine glass.



**Datasets**

| Dataset | Classes | Focus |
|---------|---------|-------|
| BDD100K | 12 | Vehicles, pedestrians, traffic signs/lights |
| Road Lane v2 | 6 | Lane line types (dotted, solid, divider, etc.) |


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## 1. Setup & Installation

In [None]:
# Install Ultralytics (YOLO)
!pip install ultralytics -q

# Verify installation
import ultralytics
ultralytics.checks()
print(f"\n✅ Ultralytics version: {ultralytics.__version__}")

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

print("\n✅ Google Drive mounted successfully!")

In [None]:
# Imports
import os
import yaml
from pathlib import Path
from ultralytics import YOLO
import shutil

print("✅ All imports ready!")

## 2. Define Paths

In [None]:
# Dataset paths on Google Drive
DRIVE_BASE = "/content/drive/MyDrive/traffic datasets/Kaggle Datasets"

# Dataset 1: Road Lane Segmentation (for lane detection)
LANE_DATASET_PATH = f"{DRIVE_BASE}/road-lane-segmentation"

# Dataset 2: BDD100K (for general traffic object detection)
BDD100K_PATH = f"{DRIVE_BASE}/bdd100k"

# Working directory
WORK_DIR = "/content/yolo_training"
os.makedirs(WORK_DIR, exist_ok=True)

print("📁 Dataset Paths:")
print(f"   Lane Dataset: {LANE_DATASET_PATH}")
print(f"   BDD100K: {BDD100K_PATH}")
print(f"   Working Dir: {WORK_DIR}")

# Verify datasets exist
print("\n🔍 Verifying datasets...")
print(f"   Lane Dataset exists: {os.path.exists(LANE_DATASET_PATH)}")
print(f"   BDD100K exists: {os.path.exists(BDD100K_PATH)}")

In [None]:
# Explore Lane Dataset structure
print("📂 Lane Dataset Structure:")
print("=" * 50)

def show_dir_structure(path, indent=0):
    """Show directory structure."""
    if not os.path.exists(path):
        print(f"{'  ' * indent}❌ Path not found: {path}")
        return

    items = sorted(os.listdir(path))
    for item in items[:15]:  # Limit to 15 items
        item_path = os.path.join(path, item)
        if os.path.isdir(item_path):
            count = len(os.listdir(item_path))
            print(f"{'  ' * indent}📁 {item}/ ({count} items)")
            if indent < 2:  # Only go 2 levels deep
                show_dir_structure(item_path, indent + 1)
        else:
            print(f"{'  ' * indent}📄 {item}")

show_dir_structure(LANE_DATASET_PATH)

## 3. Create/Fix data.yaml for YOLO Training

YOLO requires a `data.yaml` file with correct paths to the training data.
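A quick consistency check on the config can catch a common mistake, a class count that disagrees with the list of names, before any training starts. The sketch below uses a hypothetical helper, `check_data_config`, written only for illustration; it is not part of Ultralytics, and it inspects a plain dictionary rather than the file on disk.

```python
def check_data_config(cfg):
    """Return a list of human-readable problems found in a YOLO data config dict."""
    problems = []
    # Keys this notebook relies on; 'test' is treated as optional in this sketch.
    for key in ("path", "train", "val", "names"):
        if key not in cfg:
            problems.append(f"missing key: {key}")
    names = cfg.get("names", [])
    # If 'nc' is given, it should equal the number of class names.
    if "nc" in cfg and cfg["nc"] != len(names):
        problems.append(f"nc={cfg['nc']} but {len(names)} class name(s) listed")
    return problems


# Illustrative config with a made-up root path.
example = {
    "path": "/data/lanes",
    "train": "train/images",
    "val": "val/images",
    "nc": 1,
    "names": ["Lane"],
}
print(check_data_config(example))  # []
```

An empty list means the dictionary passed these basic checks; it says nothing about whether the directories actually exist.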

In [None]:
# Create data.yaml for Road Lane Segmentation Dataset
# This dataset has 1 class: Lane

lane_data_yaml = {
    'path': f"{LANE_DATASET_PATH}/dataset",  # Dataset root directory
    'train': 'train/images',  # Train images (relative to path)
    'val': 'val/images',      # Validation images (relative to path)
    'test': 'test/images',    # Test images (relative to path)
    'nc': 1,                  # Number of classes
    'names': ['Lane']         # Class names
}

# Save to working directory
lane_yaml_path = f"{WORK_DIR}/lane_data.yaml"
with open(lane_yaml_path, 'w') as f:
    yaml.dump(lane_data_yaml, f, default_flow_style=False)

print("✅ Created lane_data.yaml:")
print("=" * 50)
with open(lane_yaml_path, 'r') as f:
    print(f.read())

# Verify paths exist
print("\n🔍 Verifying paths...")
dataset_root = lane_data_yaml['path']
for split in ['train', 'val', 'test']:
    split_path = os.path.join(dataset_root, lane_data_yaml[split])
    exists = os.path.exists(split_path)
    count = len(os.listdir(split_path)) if exists else 0
    status = "✅" if exists else "❌"
    print(f"   {status} {split}: {split_path} ({count} files)")
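Counting files confirms that the image folders exist, but not that each image has an annotation. Ultralytics looks for labels by swapping `images` for `labels` in the image path, so a rough pairing check is possible. The sketch below assumes each split has sibling `images/` and `labels/` folders; `images_without_labels` is a hypothetical helper, and the demo runs on a temporary folder rather than the real dataset.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_without_labels(split_dir):
    """List image names in split_dir/images that lack split_dir/labels/<stem>.txt."""
    split_dir = Path(split_dir)
    labels_dir = split_dir / "labels"
    missing = []
    for image in sorted((split_dir / "images").iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # ignore non-image files
        if not (labels_dir / f"{image.stem}.txt").exists():
            missing.append(image.name)
    return missing


# Demo on a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    split = Path(tmp) / "train"
    (split / "images").mkdir(parents=True)
    (split / "labels").mkdir()
    for name in ("a.jpg", "b.jpg"):
        (split / "images" / name).touch()
    (split / "labels" / "a.txt").touch()
    missing = images_without_labels(split)

print(missing)  # ['b.jpg']
```

In this notebook the call would look like `images_without_labels(f"{LANE_DATASET_PATH}/dataset/train")`. An image without a label file is not necessarily an error, since it is treated as a background image, but a large number of them usually points to a path problem.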

## 4. Load YOLO11 Model

In [None]:
# Load YOLO11n (nano) model - smallest and fastest
# Options: yolo11n.pt, yolo11s.pt, yolo11m.pt, yolo11l.pt, yolo11x.pt

model = YOLO('yolo11n.pt')  # This will download if not present

print("\n✅ YOLO11n model loaded!")
print(f"   Model type: {model.task}")
model.info()  # Logs a summary of layers, parameters, and GFLOPs

## 5. Train the Model

⚠️ **Important**: Make sure you have a GPU enabled!
- Go to `Runtime` → `Change runtime type` → Select `T4 GPU`

In [None]:
# Check GPU availability
import torch

print("🖥️ Hardware Check:")
print(f"   CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"   GPU: {torch.cuda.get_device_name(0)}")
    print(f"   GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("   ⚠️ No GPU detected! Training will be slow.")
    print("   Go to Runtime → Change runtime type → Select T4 GPU")

In [None]:
# Training Configuration for Lane Detection
TRAIN_CONFIG = {
    'data': lane_yaml_path,      # Path to data.yaml
    'epochs': 50,                # Number of training epochs
    'imgsz': 640,                # Image size
    'batch': 16,                 # Batch size (reduce if OOM error)
    'patience': 10,              # Early stopping patience
    'save': True,                # Save checkpoints
    'project': f'{WORK_DIR}/runs',  # Save results here
    'name': 'lane_detection',    # Experiment name
    'exist_ok': True,            # Overwrite existing experiment
    'pretrained': True,          # Use pretrained weights
    'optimizer': 'auto',         # Optimizer (auto, SGD, Adam, AdamW)
    'verbose': True,             # Verbose output
    'seed': 42,                  # Random seed for reproducibility
}

print("🚀 Training Configuration:")
for key, value in TRAIN_CONFIG.items():
    print(f"   {key}: {value}")
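The `batch` comment above suggests reducing the batch size after an out-of-memory (OOM) error. One way to automate that is a retry loop that halves the batch each time. The sketch below is an assumption-laden illustration: `train_with_batch_backoff` and `fake_train` are hypothetical names, the OOM check relies on the error message containing "out of memory", and GPU memory may not be fully released between attempts on a real run.

```python
def train_with_batch_backoff(train_fn, config, min_batch=2):
    """Call train_fn(**config), halving 'batch' after each out-of-memory error."""
    config = dict(config)  # copy so the caller's dict is left unchanged
    while True:
        try:
            return train_fn(**config)
        except RuntimeError as err:
            # Re-raise anything that is not an OOM error.
            if "out of memory" not in str(err).lower():
                raise
            # Give up once halving would drop below the allowed minimum.
            if config["batch"] // 2 < min_batch:
                raise
            config["batch"] //= 2
            print(f"OOM, retrying with batch={config['batch']}")


def fake_train(batch, **_):
    """Stand-in for model.train: pretends anything above batch 4 runs out of memory."""
    if batch > 4:
        raise RuntimeError("CUDA out of memory")
    return batch


final_batch = train_with_batch_backoff(fake_train, {"batch": 16, "epochs": 50})
print(final_batch)  # 4
```

With the real model the call would be `train_with_batch_backoff(model.train, TRAIN_CONFIG)`. Ultralytics also documents `batch=-1`, which picks a batch size automatically and may be the simpler choice.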

In [None]:
# Start Training!
print("🚀 Starting training...")
print("=" * 60)

results = model.train(**TRAIN_CONFIG)

print("\n" + "=" * 60)
print("✅ Training completed!")
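Training may finish before epoch 50 because `patience` is set to 10: the run stops once the validation score has gone that many epochs without improving. The sketch below is a simplified illustration of the idea, not Ultralytics' internal code; `early_stop_epoch` and the score list are made up for the example.

```python
def early_stop_epoch(fitness_per_epoch, patience):
    """Return the 1-based epoch at which training would stop, or None if it never stops early."""
    best = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch, start=1):
        if fitness > best:
            # New best score: remember it and reset the waiting period.
            best, best_epoch = fitness, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row.
            return epoch
    return None


scores = [0.10, 0.20, 0.30, 0.29, 0.28, 0.30, 0.27]
print(early_stop_epoch(scores, patience=3))  # 6
```

Here the best score arrives at epoch 3, and epochs 4 to 6 fail to beat it, so the run would stop at epoch 6. The weights from the best epoch are the ones saved as `best.pt`.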

## 6. Evaluate the Model

In [None]:
# Validate on test set
print("📊 Validating on test set...")

# Load the best model
best_model_path = f"{WORK_DIR}/runs/lane_detection/weights/best.pt"
best_model = YOLO(best_model_path)

# Run validation
metrics = best_model.val(data=lane_yaml_path, split='test')

print("\n📈 Test Results:")
print(f"   mAP50: {metrics.box.map50:.4f}")
print(f"   mAP50-95: {metrics.box.map:.4f}")
print(f"   Precision: {metrics.box.mp:.4f}")
print(f"   Recall: {metrics.box.mr:.4f}")
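Both mAP figures depend on intersection over union (IoU): mAP50 counts a prediction as a match when its IoU with a ground-truth box is at least 0.5, while mAP50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows the IoU calculation for two axis-aligned boxes; `box_iou` is an illustrative helper, not the function Ultralytics uses internally.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) in pixels."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes that overlap by half their width: 50 / 150.
iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
print(f"{iou:.3f}")  # 0.333
```

An IoU of 0.333 falls below the 0.5 threshold, so this pair would not count as a match for mAP50.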

In [None]:
# Visualize training results
from IPython.display import Image, display

results_dir = f"{WORK_DIR}/runs/lane_detection"

# Display training curves
print("📊 Training Results:")
result_images = ['results.png', 'confusion_matrix.png', 'F1_curve.png', 'PR_curve.png']

for img_name in result_images:
    img_path = f"{results_dir}/{img_name}"
    if os.path.exists(img_path):
        print(f"\n{img_name}:")
        display(Image(filename=img_path, width=800))

## 7. Test Inference

In [None]:
# Test inference on sample images
import matplotlib.pyplot as plt
import cv2

# Get sample test images (skip any non-image files in the folder)
test_images_dir = f"{LANE_DATASET_PATH}/dataset/test/images"
image_exts = ('.jpg', '.jpeg', '.png', '.bmp')
test_images = sorted(
    f for f in os.listdir(test_images_dir) if f.lower().endswith(image_exts)
)[:6]  # First 6 images

print(f"🔍 Running inference on {len(test_images)} sample images...")

fig, axes = plt.subplots(2, 3, figsize=(18, 12))
axes = axes.flatten()
for ax in axes:
    ax.axis('off')  # Hide every panel up front so unused ones stay blank

for idx, img_name in enumerate(test_images):
    img_path = os.path.join(test_images_dir, img_name)

    # Run inference (separate name so the training `results` is not overwritten)
    preds = best_model.predict(img_path, conf=0.25, verbose=False)

    # plot() returns a BGR array; convert to RGB for matplotlib
    annotated = preds[0].plot()
    annotated = cv2.cvtColor(annotated, cv2.COLOR_BGR2RGB)

    axes[idx].imshow(annotated)
    axes[idx].set_title(img_name[:30])

plt.suptitle('Lane Detection Results', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

## 8. Save Model to Google Drive

In [None]:
# Save trained model to Google Drive
SAVE_DIR = "/content/drive/MyDrive/traffic datasets/trained_models"
os.makedirs(SAVE_DIR, exist_ok=True)

# Copy best and last weights
weights_dir = f"{WORK_DIR}/runs/lane_detection/weights"

for weight_file in ['best.pt', 'last.pt']:
    src = f"{weights_dir}/{weight_file}"
    dst = f"{SAVE_DIR}/lane_detection_{weight_file}"
    if os.path.exists(src):
        shutil.copy(src, dst)
        print(f"✅ Saved: {dst}")

# Copy training results
results_dst = f"{SAVE_DIR}/lane_detection_results"
if os.path.exists(results_dir):
    shutil.copytree(results_dir, results_dst, dirs_exist_ok=True)
    print(f"✅ Saved training results to: {results_dst}")

print("\n✅ All models saved to Google Drive!")

## 9. Export Model (Optional)

Export to different formats for deployment.

In [None]:
# Export model to ONNX format (optional)
# Uncomment to export

# print("📦 Exporting model to ONNX...")
# best_model.export(format='onnx')
# print("✅ Model exported to ONNX format!")

# Available export formats:
# - onnx: ONNX format
# - torchscript: TorchScript
# - coreml: CoreML (iOS)
# - tflite: TensorFlow Lite (mobile)
# - engine: TensorRT (NVIDIA GPU)

print("💡 To export, uncomment the export code above.")
print("   Available formats: onnx, torchscript, coreml, tflite, engine")

## 📋 Summary

This notebook trained a YOLO11 model for **lane detection**.

### Files Saved:
- `lane_detection_best.pt` - Best model weights
- `lane_detection_last.pt` - Last checkpoint
- `lane_detection_results/` - Training curves and metrics

### Next Steps:
1. Download the model from Google Drive
2. Use for inference in your application
3. Fine-tune with more data if needed
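
For step 2, the saved weights can be loaded back in a later session. The sketch below is one possible way to do it, assuming Google Drive is mounted and the files sit where section 8 copied them; `saved_weights_path` and `load_lane_model` are hypothetical helpers, and Ultralytics is imported lazily so the path logic can run without it.

```python
from pathlib import Path

# Same folder the notebook saves to in section 8.
SAVE_DIR = Path("/content/drive/MyDrive/traffic datasets/trained_models")


def saved_weights_path(kind="best", save_dir=SAVE_DIR):
    """Build the path of a saved checkpoint; kind is 'best' or 'last'."""
    if kind not in ("best", "last"):
        raise ValueError(f"kind must be 'best' or 'last', got {kind!r}")
    return Path(save_dir) / f"lane_detection_{kind}.pt"


def load_lane_model(kind="best", save_dir=SAVE_DIR):
    """Load the saved weights with Ultralytics (imported only when needed)."""
    from ultralytics import YOLO

    path = saved_weights_path(kind, save_dir)
    if not path.exists():
        raise FileNotFoundError(path)
    return YOLO(str(path))


print(saved_weights_path("best").name)  # lane_detection_best.pt
```

After `model = load_lane_model()`, inference works the same way as in section 7, for example `model.predict("frame.jpg", conf=0.25)`, where `frame.jpg` is a placeholder for your own image.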