# 🌊 YOLO-UDD v2.0 Training on Google Colab

**Turbidity-Adaptive Underwater Debris Detection**

⚠️ **Important**: Colab sessions can time out after 2-3 hours, so this notebook is set up for short training cycles that checkpoint to Google Drive.

---

## 📋 Setup Instructions:

1. **Enable GPU**: Runtime → Change runtime type → GPU (T4)
2. **Connect to Google Drive** to save checkpoints
3. **Run all cells** in order
4. **Save checkpoints** to Drive before session ends!

---

## 1️⃣ Check GPU & Setup

In [None]:
# Check GPU availability
!nvidia-smi

import torch
print(f"\n🎮 GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"🎮 GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"🎮 GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

## 2️⃣ Mount Google Drive (IMPORTANT!)

Mounting Drive lets checkpoints persist after the session ends.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Create working directory in Drive
!mkdir -p '/content/drive/MyDrive/YOLO_UDD_Training'
%cd '/content/drive/MyDrive/YOLO_UDD_Training'

## 3️⃣ Clone Repository

In [None]:
# Clone your repository (only first time)
import os
if not os.path.exists('YOLO-UDD-v2.0'):
    !git clone https://github.com/kshitijkhede/YOLO-UDD-v2.0.git
else:
    print("✅ Repository already exists, pulling latest changes...")
    !cd YOLO-UDD-v2.0 && git pull

%cd YOLO-UDD-v2.0

## 4️⃣ Install Dependencies

In [None]:
!pip install -q -r requirements.txt
!pip install -q tensorboard
print("✅ Dependencies installed!")

## 5️⃣ Upload Dataset

**Option A: Upload annotations (first time only)**

In [None]:
# Upload annotation files
from google.colab import files
import os
import shutil

os.makedirs('data/trashcan/annotations', exist_ok=True)

# Check if both annotation files already exist in Drive
drive_annotations = '/content/drive/MyDrive/YOLO_UDD_Training/annotations'
have_both = all(os.path.exists(f'{drive_annotations}/{name}')
                for name in ('train.json', 'val.json'))

if have_both:
    print("✅ Using annotations from Drive...")
    !cp "{drive_annotations}"/*.json data/trashcan/annotations/
else:
    # One upload prompt per file. Only the first uploaded file is used,
    # so an extra file cannot overwrite the target, and shutil.move
    # handles names with spaces such as "train (1).json".
    for target, size in (('train.json', '22 MB'), ('val.json', '5.6 MB')):
        print(f"📤 Please upload {target} ({size})...")
        uploaded = files.upload()
        if not uploaded:
            raise RuntimeError(f"No file uploaded for {target}")
        filename = next(iter(uploaded))
        shutil.move(filename, f'data/trashcan/annotations/{target}')

    # Save to Drive for future sessions
    os.makedirs(drive_annotations, exist_ok=True)
    !cp data/trashcan/annotations/*.json "{drive_annotations}"/
    print("✅ Annotations saved to Drive!")
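Before training, it can help to sanity-check the annotation files. The sketch below assumes they follow the COCO layout (top-level `images`, `annotations`, and `categories` lists), which this notebook does not state explicitly. `summarize_coco` is a hypothetical helper, not part of the repository.

```python
import json
from collections import Counter


def summarize_coco(path):
    """Return basic counts for a COCO-style annotation file (assumed layout)."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    # These three top-level keys are the COCO convention, assumed here.
    missing = [k for k in ("images", "annotations", "categories") if k not in data]
    if missing:
        raise ValueError(f"{path} is missing keys: {missing}")

    image_ids = {img["id"] for img in data["images"]}
    # Annotations whose image_id does not match any listed image.
    orphans = sum(1 for a in data["annotations"] if a["image_id"] not in image_ids)
    per_category = Counter(a["category_id"] for a in data["annotations"])

    return {
        "num_images": len(image_ids),
        "num_annotations": len(data["annotations"]),
        "num_categories": len(data["categories"]),
        "orphan_annotations": orphans,
        "annotations_per_category": dict(per_category),
    }
```

Calling `summarize_coco('data/trashcan/annotations/train.json')` and comparing `num_categories` with `num_classes` in the training config is one quick way to catch a mismatched file.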

**Option B: Copy images from Drive**

In [None]:
# If you have images stored in Google Drive
import os

drive_images = '/content/drive/MyDrive/trashcan_images'

if os.path.exists(drive_images):
    print("✅ Copying images from Drive...")
    # The target directory must exist before cp can copy into it
    os.makedirs('data/trashcan/images', exist_ok=True)
    !cp -r {drive_images}/* data/trashcan/images/
else:
    print("⚠️ Images not found in Drive.")
    print("Please upload images to: /content/drive/MyDrive/trashcan_images/")
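After copying, a quick count per folder shows whether the images arrived. This is a minimal sketch: the extension list and the one-subfolder-per-split layout are assumptions, and `count_images` is a hypothetical helper rather than something the repository provides.

```python
import os
from collections import Counter

# Extensions treated as images; an assumption, adjust to the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Count image files in each directory under root, keyed by relative path."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)  # "." for root itself
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                counts[rel] += 1
    return dict(counts)
```

For example, `count_images('data/trashcan/images')` returns a dictionary with one entry per folder that contains images, so an empty result means the copy did not work.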

## 6️⃣ Verify Dataset

In [None]:
!python scripts/verify_dataset.py --dataset-dir data/trashcan

## 7️⃣ Configure Training (Optimized for 2-3 hour sessions)

In [None]:
import os
import yaml

# Colab-optimized config (50 epochs per session)
config = {
    'data_dir': './data/trashcan',
    'num_classes': 3,
    'img_size': 640,
    'batch_size': 16,  # T4 GPU can handle this
    'epochs': 50,  # Train 50 epochs per session
    'num_workers': 2,
    'lr': 0.001,
    'weight_decay': 0.0001,
    'lr_scheduler': 'cosine',
    'min_lr': 1e-6,
    'device': 'cuda',
    'save_interval': 10,  # Save every 10 epochs
    'early_stopping_patience': 20,
}

# Create configs/ in case the repository does not already ship it
os.makedirs('configs', exist_ok=True)
with open('configs/colab_config.yaml', 'w') as f:
    yaml.dump(config, f)

print("✅ Config created!")
print(yaml.dump(config, default_flow_style=False))
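Whether 50 epochs fit into one session depends on how long an epoch takes on the assigned GPU, which this notebook does not report. The sketch below does the arithmetic under stated assumptions: you measure `seconds_per_epoch` yourself from the first epoch or two, and `reserve_minutes` is a hypothetical buffer for setup and for copying checkpoints to Drive.

```python
def epochs_that_fit(session_hours, seconds_per_epoch, reserve_minutes=20):
    """Estimate how many full epochs fit in one session.

    session_hours     -- expected session length before Colab disconnects
    seconds_per_epoch -- measured wall-clock time for one epoch
    reserve_minutes   -- time held back for setup and saving (assumed value)
    """
    if seconds_per_epoch <= 0:
        raise ValueError("seconds_per_epoch must be positive")
    usable_seconds = session_hours * 3600 - reserve_minutes * 60
    if usable_seconds <= 0:
        return 0
    return int(usable_seconds // seconds_per_epoch)
```

If the estimate comes out below the configured `epochs`, lowering `epochs` or `save_interval` reduces the chance of losing work to a disconnect.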

## 8️⃣ Start Training 🚀

In [None]:
# Check if there's a checkpoint in Drive to resume from
import glob
drive_checkpoints = glob.glob('/content/drive/MyDrive/YOLO_UDD_Training/checkpoints/last.pth')

if drive_checkpoints:
    print(f"📂 Resuming from saved checkpoint...")
    !python scripts/train.py \
        --config configs/colab_config.yaml \
        --resume {drive_checkpoints[0]}
else:
    print("🆕 Starting fresh training...")
    !python scripts/train.py --config configs/colab_config.yaml
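The cell above looks for a fixed `last.pth`. If a run only leaves periodic checkpoints behind, one alternative is to resume from whichever file was modified most recently. This is a sketch of that idea, assuming checkpoints use the `.pth` extension; `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
import glob
import os


def latest_checkpoint(directory, pattern="*.pth"):
    """Return the most recently modified checkpoint in directory, or None."""
    candidates = glob.glob(os.path.join(directory, pattern))
    if not candidates:
        return None
    # Modification time is a proxy for "latest"; it assumes files were
    # not touched or re-copied out of order.
    return max(candidates, key=os.path.getmtime)
```

The returned path could then be passed to `--resume` in place of `drive_checkpoints[0]`.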

## 9️⃣ Save Checkpoints to Drive (CRITICAL!)

**Run this before session ends or you'll lose progress!**

In [None]:
# Save ALL checkpoints to Google Drive
import glob
import os
import shutil

drive_root = '/content/drive/MyDrive/YOLO_UDD_Training'
saved = 0
for kind in ('checkpoints', 'logs'):
    dest = f'{drive_root}/{kind}'
    os.makedirs(dest, exist_ok=True)
    # Copy every file or folder found under runs/<run>/<kind>/
    for src in glob.glob(f'runs/*/{kind}/*'):
        target = os.path.join(dest, os.path.basename(src))
        if os.path.isdir(src):
            shutil.copytree(src, target, dirs_exist_ok=True)
        else:
            shutil.copy2(src, target)
        saved += 1

# Report success only if something was actually copied
if saved:
    print(f"✅ Saved {saved} item(s) to Google Drive!")
    print(f"📁 Location: {drive_root}/")
else:
    print("⚠️ Nothing found under runs/*/checkpoints or runs/*/logs")

## 🔟 Monitor Training

In [None]:
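# Tip: the training cell blocks until it finishes, so run this cell
# before cell 8️⃣ if you want to watch the curves while training runs.
%load_ext tensorboard
%tensorboard --logdir runs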

## 1️⃣1️⃣ Evaluate Model

In [None]:
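# Evaluate best model
# Note: if runs/ holds more than one run, the glob below expands to
# several paths; pass one explicit checkpoint path in that case.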
!python scripts/evaluate.py \
    --checkpoint runs/*/checkpoints/best.pth \
    --data-dir data/trashcan \
    --split val

## 1️⃣2️⃣ Test Detection

In [None]:
# Run detection on sample images
# Note: as with evaluation, use an explicit checkpoint path if runs/
# contains more than one run.
!python scripts/detect.py \
    --checkpoint runs/*/checkpoints/best.pth \
    --source data/trashcan/images/val/ \
    --output results/ \
    --max-images 10

In [None]:
# Display detection results
import matplotlib.pyplot as plt
from PIL import Image
import glob

result_images = sorted(glob.glob('results/*.jpg'))[:6]

if result_images:
    fig, axes = plt.subplots(2, 3, figsize=(15, 10))
    axes = axes.flatten()

    # Hide every axis first so unused slots stay blank
    # when fewer than 6 results exist
    for ax in axes:
        ax.axis('off')

    for idx, img_path in enumerate(result_images):
        axes[idx].imshow(Image.open(img_path))
        axes[idx].set_title(f'Detection {idx+1}')

    plt.tight_layout()
    plt.show()
else:
    print("⚠️ No detection results found")

## 1️⃣3️⃣ Download Best Model

In [None]:
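# Download best model to your computer
import glob
import os
from google.colab import files

# If several runs exist, sort by modification time and take the newest best.pth
best_model = sorted(glob.glob('runs/*/checkpoints/best.pth'), key=os.path.getmtime)
if best_model:
    print(f"📥 Downloading: {best_model[-1]}")
    files.download(best_model[-1])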
else:
    print("⚠️ No model found to download")

---

## 🔄 Multi-Session Strategy for Colab

Because Colab sessions can time out after 2-3 hours, split the full 300-epoch run across several sessions:

### **Session Plan:**

1. **Session 1** (2-3 hours): Train 0-50 epochs → Save to Drive
2. **Session 2** (2-3 hours): Resume 50-100 epochs → Save to Drive
3. **Session 3** (2-3 hours): Resume 100-150 epochs → Save to Drive
4. **Session 4** (2-3 hours): Resume 150-200 epochs → Save to Drive
5. **Session 5** (2-3 hours): Resume 200-250 epochs → Save to Drive
6. **Session 6** (2-3 hours): Resume 250-300 epochs → Final model!
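The plan above is a fixed-size split of the total epoch budget. The sketch below reproduces it for any total and session size; `session_ranges` is a hypothetical helper for planning only and does not touch the training script.

```python
def session_ranges(total_epochs=300, epochs_per_session=50):
    """Split a total epoch budget into consecutive (start, end) ranges."""
    if total_epochs <= 0 or epochs_per_session <= 0:
        raise ValueError("both arguments must be positive")
    ranges = []
    start = 0
    while start < total_epochs:
        # The last session may be shorter if the total is not a multiple.
        end = min(start + epochs_per_session, total_epochs)
        ranges.append((start, end))
        start = end
    return ranges


for number, (start, end) in enumerate(session_ranges(), start=1):
    print(f"Session {number}: epochs {start}-{end}")
```

With the defaults this yields the six sessions listed above; a smaller `epochs_per_session` gives more, shorter sessions if disconnects come early.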

### **Before Each Session Ends:**
1. ✅ Run cell 9️⃣ to save checkpoints to Drive
2. ✅ Verify files in Drive
3. ✅ Disconnect and start new session when needed
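For the "verify files in Drive" step, comparing checksums is stricter than checking that a file exists. This is a minimal sketch using SHA-256 from the standard library; `sha256_of` and `copies_match` are hypothetical helper names, and the paths you pass in are whatever your run actually produced.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(local_path, drive_path):
    """Return True when both files have identical contents."""
    return sha256_of(local_path) == sha256_of(drive_path)
```

A `False` result suggests the copy was interrupted, in which case rerunning cell 9️⃣ before disconnecting is the safer choice.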

### **To Resume:**
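- Run all cells again in a new session
- Cell 8️⃣ detects `last.pth` in Drive and resumes from it automatically
- If `scripts/train.py` treats `epochs` as a total rather than a per-session count, raise `epochs` in cell 7️⃣ before resuming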

---

## 📊 Expected Progress

- **After 50 epochs**: mAP ~35-40%
- **After 100 epochs**: mAP ~50-55%
- **After 150 epochs**: mAP ~60-65%
- **After 200 epochs**: mAP ~65-70%
- **After 300 epochs**: mAP ~70-75% (Final!)

---