# 🚀 YOLOv8 Building Inspection Training - Google Colab

Train your property damage detection model roughly **10x faster** using Google Colab's free GPU!

## 📋 What this notebook does:
- Sets up YOLOv8 environment
- Downloads your dataset from GitHub
- Trains the model with GPU acceleration
- Saves trained model for download

## ⚡ Expected speedup:
- **Local CPU**: ~8 hours for 100 epochs
- **Colab GPU**: ~30-45 minutes for 100 epochs
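
The two estimates above imply a speedup range rather than a single figure. A minimal sketch of the arithmetic, using only the numbers quoted in the list (the helper name `speedup_range` is made up for illustration):

```python
def speedup_range(cpu_hours, gpu_minutes_low, gpu_minutes_high):
    """Return (smallest, largest) speedup implied by a GPU time range."""
    cpu_minutes = cpu_hours * 60
    # The slowest GPU run gives the smallest speedup, the fastest the largest.
    return cpu_minutes / gpu_minutes_high, cpu_minutes / gpu_minutes_low


# Figures from the list above: ~8 hours on CPU, ~30-45 minutes on GPU.
low, high = speedup_range(8, 30, 45)
print(f"Estimated speedup: {low:.1f}x to {high:.1f}x")
```

Actual timings depend on the GPU Colab assigns and on dataset size, so treat the result as a rough guide.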

---

## Step 1: Setup Environment

In [None]:
# Install required packages
!pip install ultralytics
!pip install roboflow

# Check GPU availability
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

## Step 2: Download Dataset from GitHub

This will download your DSCRevaluator repository and navigate to the ml-service directory.

In [None]:
# Clone your repository
!git clone https://github.com/kyletbuzbee/DSCRevaluator.git
%cd DSCRevaluator/ml-service

# Verify dataset exists
!ls -la dataset/images/train/

## Step 3: Verify Dataset Structure

In [None]:
# Check data.yaml configuration
!cat data.yaml

# Count images in each split
import os
train_images = len([f for f in os.listdir('dataset/images/train/train/images') if f.endswith(('.jpg', '.jpeg', '.png'))])
val_images = len([f for f in os.listdir('dataset/images/train/valid/images') if f.endswith(('.jpg', '.jpeg', '.png'))])
test_images = len([f for f in os.listdir('dataset/images/train/test/images') if f.endswith(('.jpg', '.jpeg', '.png'))])

print("\n📊 Dataset Summary:")
print(f"Training images: {train_images}")
print(f"Validation images: {val_images}")
print(f"Test images: {test_images}")
print(f"Total: {train_images + val_images + test_images}")
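
Beyond counting images, it can help to check that each image has a matching label file before training. The sketch below assumes each split folder holds an `images/` directory next to a `labels/` directory with one same-named `.txt` file per image. That is a common YOLO dataset layout, but it is an assumption here, so confirm it against `data.yaml`. The helper name `summarize_split` is hypothetical.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def summarize_split(split_dir):
    """Count images in <split_dir>/images and list those without a label file."""
    split = Path(split_dir)
    images_dir = split / "images"
    labels_dir = split / "labels"  # assumed location of the label files
    if not images_dir.is_dir():
        return {"images": 0, "missing_labels": []}
    images = sorted(p for p in images_dir.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    missing = [p.name for p in images if not (labels_dir / f"{p.stem}.txt").is_file()]
    return {"images": len(images), "missing_labels": missing}


# Same split folders as the cell above; a missing folder simply reports zero.
for name in ("train", "valid", "test"):
    info = summarize_split(Path("dataset/images/train") / name)
    print(f"{name}: {info['images']} images, {len(info['missing_labels'])} without labels")
```

An image without a label file is not necessarily an error, since unlabeled images can serve as background examples, but a large count usually points to a path problem.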

## Step 4: Train YOLOv8 Model 🚀

This will train much faster with GPU acceleration!

In [None]:
from ultralytics import YOLO
import time

print("🚀 Starting YOLOv8 training with GPU acceleration...")
start_time = time.time()

# Load pretrained model
model = YOLO('yolov8m.pt')  # Medium model for better performance

# Train with optimized settings for Colab
results = model.train(
    data='data.yaml',
    epochs=100,
    imgsz=640,
    batch=16,  # Larger batch size for GPU
    device='cuda',  # Use GPU!
    project='runs/train',
    name='building_inspection_detector_colab',
    patience=50,
    save=True,
    save_period=10,
    cos_lr=True,
    augment=True,
    mixup=0.1,
    degrees=10.0,
    translate=0.1,
    scale=0.5,
    shear=2.0,
    perspective=0.0001,
    flipud=0.5,
    fliplr=0.5,
    mosaic=1.0,
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.4,
)

training_time = time.time() - start_time
print(f"\n🎉 Training completed in {training_time/60:.1f} minutes!")
print(f"📁 Best model saved to: {results.save_dir}/weights/best.pt")
print(f"📊 Final mAP50: {results.results_dict.get('metrics/mAP50(B)', 'N/A')}")
print(f"📈 Final mAP50-95: {results.results_dict.get('metrics/mAP50-95(B)', 'N/A')}")
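
The training call above sets `cos_lr=True`, which requests a cosine-shaped learning-rate decay instead of a linear one. The sketch below only illustrates the general shape of such a schedule. The starting rate `LR0` and the final fraction `FINAL_FRACTION` are assumed values chosen for illustration, and the library's real schedule also includes a warmup phase, so do not read this as its exact implementation.

```python
import math


def cosine_lr_factor(epoch, total_epochs, final_fraction):
    """Multiplier that falls from 1.0 to final_fraction along a half cosine."""
    progress = (1 - math.cos(math.pi * epoch / total_epochs)) / 2
    return 1.0 + progress * (final_fraction - 1.0)


LR0 = 0.01             # assumed initial learning rate (illustrative)
FINAL_FRACTION = 0.01  # assumed final LR as a fraction of LR0 (illustrative)
EPOCHS = 100           # matches epochs=100 in the training call

for epoch in (0, 25, 50, 75, 100):
    lr = LR0 * cosine_lr_factor(epoch, EPOCHS, FINAL_FRACTION)
    print(f"epoch {epoch:3d}: lr = {lr:.6f}")
```

The curve changes slowly at the start and end and fastest in the middle, which is the practical difference from a linear decay.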

## Step 5: Evaluate Model Performance

In [None]:
# Load best model and evaluate
best_model = YOLO('runs/train/building_inspection_detector_colab/weights/best.pt')

# Run validation
validation_results = best_model.val(
    data='data.yaml',
    split='test'
)

print("📊 Test Set Performance:")
print(f"mAP50: {validation_results.results_dict.get('metrics/mAP50(B)', 'N/A')}")
print(f"mAP50-95: {validation_results.results_dict.get('metrics/mAP50-95(B)', 'N/A')}")
print(f"Precision: {validation_results.results_dict.get('metrics/precision(B)', 'N/A')}")
print(f"Recall: {validation_results.results_dict.get('metrics/recall(B)', 'N/A')}")

## Step 6: Download Trained Model

Download the trained model to use in your local application.

In [None]:
# Create zip file with model and training results
!zip -r trained_model.zip runs/train/building_inspection_detector_colab/

# Download the model
from google.colab import files
files.download('trained_model.zip')

print("\n✅ Model training complete!")
print("📥 Download 'trained_model.zip' and extract it in your ml-service/ directory (the archive already contains the runs/train/ path)")
print("🔄 Update your run_model.py to use the new model path")

## Step 7: Test Inference (Optional)

In [None]:
# Test inference on sample images
import glob
import matplotlib.pyplot as plt

# Find some test images
test_image_paths = glob.glob('dataset/images/train/test/images/*.jpg')[:3]

for img_path in test_image_paths:
    print(f"\n🔍 Analyzing: {img_path}")

    # Run inference (one image in, so one result object out)
    predictions = best_model(img_path, conf=0.25)

    for r in predictions:
        # List each detection with its class name and confidence
        print(f"Detected {len(r.boxes)} objects")
        for box in r.boxes:
            class_name = r.names[int(box.cls)]
            confidence = float(box.conf)
            print(f"  - {class_name}: {confidence:.2f}")

        # r.plot() returns a BGR array; reverse the channels to RGB for matplotlib
        plt.figure(figsize=(10, 8))
        plt.imshow(r.plot()[..., ::-1])
        plt.axis('off')
        plt.show()

## 📝 Next Steps

1. **Download** the `trained_model.zip` file
2. **Extract** it in your `ml-service/` directory (the archive already contains the `runs/train/` path, so the files land in `ml-service/runs/train/`)
3. **Update** `run_model.py` to use the new model:
   ```python
   model = YOLO('runs/train/building_inspection_detector_colab/weights/best.pt')
   ```
4. **Test** your application with the improved model!
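
The extraction step can also be scripted. This is a minimal sketch using only the standard library; it assumes you run it from your `ml-service/` directory with `trained_model.zip` placed there, and the helper name `extract_model` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_model(zip_path, dest_dir):
    """Extract the archive into dest_dir and return the best.pt paths it contained."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        # Report only weights that came from this archive, not older runs.
        names = [n for n in zf.namelist() if n.endswith("weights/best.pt")]
    return [dest / n for n in names]


# Runs only if the downloaded archive is present in the current directory.
if Path("trained_model.zip").is_file():
    for weights_path in extract_model("trained_model.zip", "."):
        print(f"Model weights extracted to: {weights_path}")
```

The printed path is the one to pass to `YOLO(...)` in `run_model.py`.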

## 💡 Pro Tips

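- **Runtime**: Keep the Colab tab open and active during training; idle sessions can be disconnected
- **Memory**: The free tier typically provides about 12 GB of system RAM and about 15 GB of GPU memory, depending on the hardware assigned
- **Time limit**: Sessions can run for up to about 12 hours, but that duration is not guaranteed
- **Save frequently**: Download checkpoints periodically (`save_period=10` in Step 4 writes one every 10 epochs)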
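
One way to act on the last tip is to copy the checkpoint files somewhere that outlives the Colab session. The sketch below copies every `.pt` file from a run's `weights/` folder into a backup folder. The helper name `backup_weights` is hypothetical, and the Google Drive path in the commented usage is an assumption that only applies after mounting Drive in your own session.

```python
import shutil
from pathlib import Path


def backup_weights(run_dir, backup_dir):
    """Copy every .pt file from <run_dir>/weights into backup_dir; return their names."""
    src = Path(run_dir) / "weights"
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for ckpt in sorted(src.glob("*.pt")):
        shutil.copy2(ckpt, dst / ckpt.name)
        copied.append(ckpt.name)
    return copied


# Example usage in Colab (assumed paths, uncomment after mounting Drive):
# from google.colab import drive
# drive.mount('/content/drive')
# backup_weights('runs/train/building_inspection_detector_colab',
#                '/content/drive/MyDrive/yolo_backups')
```

Because the training cell blocks until it finishes, this helper is most useful right after training or between resumed sessions.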

---

**Happy training! 🚀** GPU training shortens each run, which leaves more time to iterate toward a more accurate model.