# 🚀 Waste Classification using YOLOv8
## TASK 3: TRAINING MODEL YOLOV8

**Project:** Smart waste classification  
**Classes:** Plastic (0), Metal (1), Paper (2), Glass (3)  
**Model:** YOLOv8n (nano) with transfer learning  
**Training:** 100 epochs, ~2-4 hours on T4 GPU

---

## 📋 Prerequisites (from Task 2):
- ✅ YOLOv8 installed
- ✅ GPU T4 enabled
- ✅ Dataset extracted at `/content/dataset/`
- ✅ data.yaml configured
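
As a quick sanity check on the last prerequisite, the sketch below verifies that a parsed `data.yaml` lists the four classes in the order given above. The helper names (`normalize_names`, `check_classes`, `load_yaml`) are illustrative and not part of Ultralytics, and the code assumes the file uses the usual `names` key, either as a list or as an index-to-name mapping.

```python
EXPECTED_CLASSES = ["Plastic", "Metal", "Paper", "Glass"]  # order from the header above


def normalize_names(names):
    """Return class names as a list ordered by class index.

    `names` may be a list (['Plastic', ...]) or a dict ({0: 'Plastic', ...}).
    """
    if isinstance(names, dict):
        return [names[k] for k in sorted(names, key=int)]
    return list(names)


def check_classes(cfg, expected=EXPECTED_CLASSES):
    """Compare the classes in a parsed data.yaml dict with the expected ones.

    Returns a list of human-readable problems; an empty list means no mismatch.
    The comparison is case-insensitive because label exports differ in casing.
    """
    problems = []
    names = normalize_names(cfg.get("names", []))
    if "nc" in cfg and cfg["nc"] != len(names):
        problems.append(f"nc={cfg['nc']} but {len(names)} names listed")
    got = [str(n).lower() for n in names]
    want = [e.lower() for e in expected]
    if got != want:
        problems.append(f"class order is {names}, expected {expected}")
    return problems


def load_yaml(path):
    """Parse a YAML file (PyYAML is installed as an Ultralytics dependency)."""
    import yaml  # imported lazily so the checks above work without PyYAML
    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)


# Hypothetical usage in Colab:
# print(check_classes(load_yaml("/content/dataset/data.yaml")) or "data.yaml looks consistent")
```

A mismatch in class order is worth catching early, since the label files store only the numeric class index.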

---


## 🔄 CELL 1: Quick Setup Check

**Purpose:**
- Import libraries
- Verify GPU
- Verify dataset exists

**⚠️ Note:** If you have not run Task 2 yet, run the Task 2 notebook first!


In [None]:
# Import libraries
from ultralytics import YOLO
import torch
import os
import glob
from IPython.display import Image, display
import shutil
from datetime import datetime

print("🔍 Checking setup...\n")

# Check GPU
print("=" * 60)
print("GPU STATUS")
print("=" * 60)
if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
    print(f"✅ Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("❌ NO GPU FOUND! Enable GPU: Runtime > Change runtime type > T4 GPU")
    raise RuntimeError("GPU required for training!")

# Check dataset
print("\n" + "=" * 60)
print("DATASET STATUS")
print("=" * 60)

dataset_path = '/content/dataset'
data_yaml = f'{dataset_path}/data.yaml'

if not os.path.exists(dataset_path):
    print("❌ Dataset not found! Run Task 2 notebook first.")
    raise FileNotFoundError(f"Dataset not found at {dataset_path}")

if not os.path.exists(data_yaml):
    print("❌ data.yaml not found!")
    raise FileNotFoundError(f"data.yaml not found at {data_yaml}")

# Count images
train_imgs = len(glob.glob(f'{dataset_path}/train/images/*'))
valid_imgs = len(glob.glob(f'{dataset_path}/valid/images/*'))
test_imgs = len(glob.glob(f'{dataset_path}/test/images/*'))

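# Fail early if the train split is empty, otherwise training would start with no data
if train_imgs == 0:
    raise FileNotFoundError(f"No training images found in {dataset_path}/train/images")

print("✅ Dataset found!")
print(f"   Train: {train_imgs} images")
print(f"   Valid: {valid_imgs} images")
print(f"   Test:  {test_imgs} images")
print(f"   Total: {train_imgs + valid_imgs + test_imgs} images")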

print("\n" + "=" * 60)
print("✅ READY TO TRAIN!")
print("=" * 60)


## 🤖 CELL 2: Load Pretrained YOLOv8n Model

**Purpose:**
- Load YOLOv8n pretrained on the COCO dataset
- Use transfer learning to take advantage of the pretrained weights

**Key Concept:** Transfer learning typically lets the model converge faster and reach higher accuracy than training from scratch.
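
One common variant of transfer learning is to freeze the early layers so that only the later layers are updated. The sketch below builds training arguments with the Ultralytics `freeze` option. The helper name `with_frozen_backbone` and the choice of 10 layers (the YOLOv8 backbone, layers 0-9) are assumptions for illustration; this notebook itself fine-tunes all layers.

```python
def with_frozen_backbone(config, n_layers=10):
    """Return a copy of `config` with the first `n_layers` layers frozen.

    `freeze` is an Ultralytics training argument; 10 is an assumed value
    that corresponds to the YOLOv8 backbone (layers 0-9).
    """
    if n_layers < 0:
        raise ValueError("n_layers must be non-negative")
    updated = dict(config)        # leave the caller's dict untouched
    updated["freeze"] = n_layers
    return updated


# Hypothetical usage (requires ultralytics and a GPU runtime):
# from ultralytics import YOLO
# model = YOLO("yolov8n.pt")
# model.train(**with_frozen_backbone({"data": "/content/dataset/data.yaml", "epochs": 100}))
```

Freezing may help when the dataset is small, at the cost of less adaptation to the new classes, so it is worth comparing against the unfrozen run.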


In [None]:
# Load pretrained YOLOv8n model
print("🤖 Loading YOLOv8n pretrained model...")
model = YOLO('yolov8n.pt')

print("✅ Model loaded successfully!")
print("\nModel Info:")
print("  - Architecture: YOLOv8n (nano)")
print("  - Parameters: ~3.2M")
print("  - Pretrained: COCO dataset (80 classes)")
print("  - Will fine-tune for: 4 waste classes")
print("\n✨ Transfer learning enabled!")


## ⚙️ CELL 3: Configure Training Parameters

**Key parameters:**
- `epochs=100`: Number of epochs (1 epoch = one full pass over the training set)
- `imgsz=640`: Input image size
- `batch=16`: Number of images per batch (reduce it if you hit out-of-memory (OOM) errors)
- `lr0=0.01`: Initial learning rate
- `patience=20`: Early stopping (training stops after 20 epochs without improvement)
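
To make the relationship between `epochs` and `batch` concrete, the sketch below computes how many batches a run processes. The image count of 2,000 is a hypothetical figure, not this dataset's size; substitute the train count printed in Cell 1.

```python
import math


def batches_per_epoch(n_train_images, batch):
    """Number of batches needed for one full pass over the training set."""
    return math.ceil(n_train_images / batch)


n_train_images = 2000   # hypothetical; use the train count from Cell 1
for batch in (16, 8):
    per_epoch = batches_per_epoch(n_train_images, batch)
    print(f"batch={batch}: {per_epoch} batches/epoch, {per_epoch * 100} batches over 100 epochs")
```

Halving the batch size doubles the number of batches per epoch, which is why a smaller batch usually means a longer epoch.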

**💡 Tips:**
- If you run out of GPU memory (OOM), reduce to `batch=8` or `imgsz=416`
- Training takes roughly 2-4 hours on a T4 GPU
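
The OOM tip above can be automated by retrying with a smaller batch. This is a minimal sketch and not part of the notebook's workflow: `train_with_fallback` and `train_fn` are hypothetical names, and it assumes the failure surfaces as a `RuntimeError` whose message contains "out of memory", which is how PyTorch typically reports CUDA OOM.

```python
def train_with_fallback(train_fn, config, min_batch=2):
    """Call `train_fn(**config)`, halving `batch` after each out-of-memory error.

    `train_fn` is any callable that accepts the training kwargs, for example
    `model.train`. Returns a (result, batch_used) tuple.
    """
    batch = config["batch"]
    while True:
        try:
            return train_fn(**{**config, "batch": batch}), batch
        except RuntimeError as err:
            # Re-raise anything that is not an OOM error, or if we cannot shrink further
            if "out of memory" not in str(err).lower() or batch // 2 < min_batch:
                raise
            batch //= 2
            print(f"⚠️ Out of memory, retrying with batch={batch}")


# Hypothetical usage:
# results, used = train_with_fallback(model.train, training_config)
```

Ultralytics also accepts `batch=-1` to pick a batch size automatically, which may be simpler than a retry loop.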


In [None]:
# Configure training parameters
print("⚙️ Configuring training parameters...\n")

training_config = {
    'data': '/content/dataset/data.yaml',
    'epochs': 100,
    'imgsz': 640,
    'batch': 16,
    'lr0': 0.01,
    'patience': 20,
    'save': True,
    'save_period': 10,
    'project': 'runs/waste_detection',
    'name': 'yolov8n_waste',
    'exist_ok': True,
    'pretrained': True,
    'optimizer': 'SGD',
    'verbose': True,
    'plots': True,
    'device': 0
}

print("📋 Training Configuration:")
print("=" * 60)
for key, value in training_config.items():
    print(f"  {key:15s}: {value}")
print("=" * 60)

print("\n⏱️ Estimated training time: 2-4 hours")
print("💾 Weights will be saved every 10 epochs")
print("🎯 Best model: runs/waste_detection/yolov8n_waste/weights/best.pt")
print("\n✅ Ready to start training!")


## 🚀 CELL 4: START TRAINING!

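**⚠️ IMPORTANT:**
- This cell runs for **2-4 hours**
- Keep the browser tab open (otherwise Colab may disconnect)
- Monitor progress in the cell output
- Stopping the cell interrupts training; it does not pause it. To continue later, resume from the `last.pt` checkpoint instead of re-running this cell from scratch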

**📊 The output shows:**
- Epoch progress (1/100, 2/100, ...)
- Loss values (box_loss, cls_loss, dfl_loss)
- Metrics (precision, recall, mAP)
- Training speed (ms/image)

**💾 Auto-save:** A checkpoint is saved automatically every 10 epochs (`save_period=10`), in addition to `best.pt` and `last.pt`.
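
If the run is interrupted, Ultralytics can continue from the last checkpoint rather than starting over. The sketch below looks for `last.pt` in the run directory used by this notebook. `find_last_checkpoint` is a hypothetical helper, and the resume call is left as a comment because it only works while the runtime's files still exist.

```python
from pathlib import Path


def find_last_checkpoint(run_dir="runs/waste_detection/yolov8n_waste"):
    """Return the path to last.pt inside `run_dir`, or None if it is missing."""
    candidate = Path(run_dir) / "weights" / "last.pt"
    return candidate if candidate.is_file() else None


ckpt = find_last_checkpoint()
if ckpt is None:
    print("No checkpoint found; start training from Cell 4.")
else:
    print(f"Checkpoint found: {ckpt}")
    # Resume with the settings stored in the checkpoint:
    # from ultralytics import YOLO
    # results = YOLO(str(ckpt)).train(resume=True)
```

If Colab recycles the runtime, the local `runs/` folder is lost, so resuming then depends on a copy of `last.pt` stored elsewhere, such as Google Drive.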


In [None]:
# START TRAINING!
print("🚀 STARTING TRAINING...")
print("=" * 60)
print("⏱️  Training started at:", datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
print("=" * 60)
print()

# Train the model
results = model.train(**training_config)

print()
print("=" * 60)
print("🎉 TRAINING COMPLETED!")
print("⏱️  Training finished at:", datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
print("=" * 60)
print()
print("📁 Results saved at: runs/waste_detection/yolov8n_waste/")
print("🏆 Best weights: runs/waste_detection/yolov8n_waste/weights/best.pt")
print("📊 Metrics: runs/waste_detection/yolov8n_waste/results.csv")


## 💾 CELL 5: Backup Weights to Google Drive

**Purpose:**
- Back up the model weights to Google Drive
- Make sure the model is not lost if Colab disconnects

**⚠️ Note:** Run this cell only AFTER training has finished!
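
Because a backup made only after training cannot protect against a mid-run disconnect, one option is to copy checkpoints to Drive while training progresses. The sketch below is an illustration under assumptions: `sync_checkpoints` is a hypothetical helper, and the commented registration relies on the Ultralytics `add_callback` hook with the `on_model_save` event and would have to run before Cell 4.

```python
import shutil
from pathlib import Path


def sync_checkpoints(weights_dir, backup_dir):
    """Copy every .pt file from `weights_dir` into `backup_dir` if it is new or updated.

    Returns the names of the files that were copied.
    """
    src, dst = Path(weights_dir), Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for ckpt in sorted(src.glob("*.pt")):
        target = dst / ckpt.name
        # Skip files whose backup copy is already at least as recent
        if target.exists() and target.stat().st_mtime >= ckpt.stat().st_mtime:
            continue
        shutil.copy2(ckpt, target)  # copy2 preserves the modification time
        copied.append(ckpt.name)
    return copied


# Hypothetical registration before training (Drive must already be mounted):
# backup = "/content/drive/MyDrive/Waste_Detection_Project/trained_models"
# model.add_callback(
#     "on_model_save",
#     lambda trainer: sync_checkpoints(Path(trainer.save_dir) / "weights", backup),
# )
```

With this in place, the cell below still works as a final backup, and an interrupted run leaves its most recent checkpoints on Drive.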


In [None]:
# Backup weights to Google Drive
print("💾 Backing up weights to Google Drive...\n")

# Mount Drive if not already mounted
if not os.path.exists('/content/drive/MyDrive'):
    from google.colab import drive
    drive.mount('/content/drive')

# Create backup folder
backup_dir = '/content/drive/MyDrive/Waste_Detection_Project/trained_models'
os.makedirs(backup_dir, exist_ok=True)

# Copy best weights
best_weights = 'runs/waste_detection/yolov8n_waste/weights/best.pt'
last_weights = 'runs/waste_detection/yolov8n_waste/weights/last.pt'

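copied = 0  # number of files actually backed up

if os.path.exists(best_weights):
    shutil.copy(best_weights, f'{backup_dir}/yolov8n_waste_best.pt')
    print(f"✅ Best weights backed up to: {backup_dir}/yolov8n_waste_best.pt")
    copied += 1
else:
    print(f"⚠️ Not found, skipped: {best_weights}")

if os.path.exists(last_weights):
    shutil.copy(last_weights, f'{backup_dir}/yolov8n_waste_last.pt')
    print(f"✅ Last weights backed up to: {backup_dir}/yolov8n_waste_last.pt")
    copied += 1
else:
    print(f"⚠️ Not found, skipped: {last_weights}")

# Also backup results
results_csv = 'runs/waste_detection/yolov8n_waste/results.csv'
if os.path.exists(results_csv):
    shutil.copy(results_csv, f'{backup_dir}/training_results.csv')
    print(f"✅ Training results backed up to: {backup_dir}/training_results.csv")
    copied += 1
else:
    print(f"⚠️ Not found, skipped: {results_csv}")

# Report success only if something was actually copied
if copied:
    print(f"\n🎉 Backup completed! ({copied} file(s) copied)")
else:
    print("\n❌ Nothing was backed up. Check that training finished and the paths are correct.")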


## 📊 CELL 6: Visualize Training Results

**Purpose:**
- View the training curves (loss, mAP, precision, recall)
- View the confusion matrix
- View sample predictions on the validation set

**Key Metrics to Check:**
- **mAP@0.5** > 0.7 (good performance)
- **Loss** decreases steadily over the epochs
- **Precision & Recall** are balanced
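
The three checks above can also be applied programmatically to `results.csv`. The sketch below uses only the standard library. The function name `check_run`, the 0.10 precision/recall gap used as a "balanced" cutoff, and the first-versus-last box loss comparison are illustrative assumptions, not thresholds defined by Ultralytics.

```python
import csv


def load_rows(csv_path):
    """Read results.csv into a list of dicts with stripped column names."""
    with open(csv_path, newline="") as f:
        return [{k.strip(): float(v) for k, v in row.items()} for row in csv.DictReader(f)]


def check_run(rows, map_target=0.7, max_pr_gap=0.10):
    """Apply the three checks to per-epoch metric rows (one dict per epoch)."""
    last = rows[-1]
    p, r = last["metrics/precision(B)"], last["metrics/recall(B)"]
    return {
        "map50_ok": last["metrics/mAP50(B)"] > map_target,
        "loss_decreased": last["train/box_loss"] < rows[0]["train/box_loss"],
        "pr_balanced": abs(p - r) <= max_pr_gap,
        "f1": 2 * p * r / (p + r) if (p + r) else 0.0,
    }


# Hypothetical usage after training:
# print(check_run(load_rows("runs/waste_detection/yolov8n_waste/results.csv")))
```

The F1 value combines precision and recall into one number, which makes an imbalance between the two easier to spot than reading them separately.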


In [None]:
# Visualize training results
print("📊 TRAINING RESULTS VISUALIZATION\n")
print("=" * 60)

results_dir = 'runs/waste_detection/yolov8n_waste'

# 1. Training curves (Loss, mAP, Precision, Recall)
print("\n1️⃣ TRAINING CURVES")
print("-" * 60)
results_plot = f'{results_dir}/results.png'
if os.path.exists(results_plot):
    display(Image(results_plot, width=900))
    print("✅ Training curves loaded")
else:
    print("❌ results.png not found!")

# 2. Confusion Matrix
print("\n2️⃣ CONFUSION MATRIX")
print("-" * 60)
confusion_matrix = f'{results_dir}/confusion_matrix.png'
if os.path.exists(confusion_matrix):
    display(Image(confusion_matrix, width=600))
    print("✅ Confusion matrix loaded")
else:
    print("❌ confusion_matrix.png not found!")

# 3. Validation batch predictions
print("\n3️⃣ SAMPLE PREDICTIONS (Validation Set)")
print("-" * 60)
val_pred = f'{results_dir}/val_batch0_pred.jpg'
if os.path.exists(val_pred):
    display(Image(val_pred, width=900))
    print("✅ Validation predictions loaded")
else:
    print("❌ val_batch0_pred.jpg not found!")

print("\n" + "=" * 60)
print("✅ Visualization complete!")
print("=" * 60)


## 📈 CELL 7: Training Metrics Summary

Summary of the metrics recorded during training.


In [None]:
# Print metrics summary
import pandas as pd

print("📈 TRAINING METRICS SUMMARY\n")
print("=" * 60)

results_csv = 'runs/waste_detection/yolov8n_waste/results.csv'

if os.path.exists(results_csv):
    df = pd.read_csv(results_csv)
    df.columns = df.columns.str.strip()
    
    # Get final epoch metrics
    final_metrics = df.iloc[-1]
    
    print("\n🏆 FINAL METRICS (Last Epoch):")
    print("-" * 60)
    print(f"  mAP@0.5:     {final_metrics['metrics/mAP50(B)']:.4f}")
    print(f"  mAP@0.5:0.95: {final_metrics['metrics/mAP50-95(B)']:.4f}")
    print(f"  Precision:   {final_metrics['metrics/precision(B)']:.4f}")
    print(f"  Recall:      {final_metrics['metrics/recall(B)']:.4f}")
    
    print("\n📊 LOSS VALUES:")
    print("-" * 60)
    print(f"  Box Loss:    {final_metrics['train/box_loss']:.4f}")
    print(f"  Class Loss:  {final_metrics['train/cls_loss']:.4f}")
    print(f"  DFL Loss:    {final_metrics['train/dfl_loss']:.4f}")
    
    # Best epoch
    best_epoch = df['metrics/mAP50(B)'].idxmax() + 1
    best_map = df['metrics/mAP50(B)'].max()
    
    print("\n🎯 BEST PERFORMANCE:")
    print("-" * 60)
    print(f"  Best Epoch:  {best_epoch}/{len(df)}")
    print(f"  Best mAP@0.5: {best_map:.4f}")
    
    print("\n" + "=" * 60)
else:
    print("❌ results.csv not found!")

print("✅ Metrics summary complete!")


---

## ✅ TASK 3 COMPLETED!

### 🎉 Checklist:
- ✅ Model loaded (YOLOv8n pretrained)
- ✅ Training completed (up to 100 epochs; early stopping may end it sooner)
- ✅ Weights saved and backed up
- ✅ Metrics visualized
- ✅ Performance evaluated

### 📁 Important Files:
```
runs/waste_detection/yolov8n_waste/
├── weights/
│   ├── best.pt       ← Best model (highest mAP)
│   └── last.pt       ← Last epoch model
├── results.csv       ← Training metrics
├── results.png       ← Training curves
├── confusion_matrix.png
└── val_batch0_pred.jpg
```

### 📊 Expected Performance:
- **mAP@0.5:** 0.70 - 0.85 (Good)
- **Precision:** 0.75 - 0.90
- **Recall:** 0.70 - 0.85

---

### 🎯 Next: TASK 4 - TESTING & EVALUATION

**Next steps:**
1. Load the best model
2. Evaluate on the test set
3. Generate detailed reports
4. Test on real-world images

**Continue to Task 4 notebook!**

---
