# 🚀 Waste Classification using YOLOv8
## TASK 3: TRAINING MODEL YOLOV8

**Project:** Smart waste classification  
**Classes:** Plastic (0), Metal (1), Paper (2), Glass (3)  
**Model:** YOLOv8n (nano) with transfer learning  
**Training:** 100 epochs, ~2-4 hours on T4 GPU

---

## 📋 Prerequisites (from Task 2):
- ✅ YOLOv8 installed
- ✅ GPU T4 enabled
- ✅ Dataset extracted at `/content/dataset/`
- ✅ data.yaml configured
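
The `data.yaml` from Task 2 is not shown in this notebook, so the following is only a sketch of what a matching file could look like. It assumes the `train/images`, `valid/images` and `test/images` layout that Cell 1 checks and the class order from the project header; `build_data_yaml` is a hypothetical helper, not part of Ultralytics, and it only returns text, it does not overwrite anything.

```python
# Class order taken from the project header: list index == class id.
CLASS_NAMES = ["Plastic", "Metal", "Paper", "Glass"]


def build_data_yaml(root: str = "/content/dataset") -> str:
    """Return the text of a minimal Ultralytics-style dataset YAML.

    The split folders below are an assumption based on the paths
    that Cell 1 counts images in.
    """
    lines = [
        f"path: {root}",
        "train: train/images",
        "val: valid/images",
        "test: test/images",
        "names:",
    ]
    lines += [f"  {idx}: {name}" for idx, name in enumerate(CLASS_NAMES)]
    return "\n".join(lines) + "\n"


# Print the sketch so it can be compared with the real file by eye.
print(build_data_yaml())
```

Comparing this output with the actual `/content/dataset/data.yaml` is a quick way to catch a swapped class order before spending hours on training.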

---


## 🔄 CELL 1: Quick Setup Check

**Purpose:**
- Import libraries
- Verify GPU
- Verify dataset exists

**⚠️ Note:** If you have not run Task 2 yet, run the Task 2 notebook first!


In [None]:
# Import libraries
from ultralytics import YOLO
import torch
import os
import glob
from IPython.display import Image, display
import shutil
from datetime import datetime

print("🔍 Checking setup...\n")

# Check GPU
print("=" * 60)
print("GPU STATUS")
print("=" * 60)
if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
    print(f"✅ Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("❌ NO GPU FOUND! Enable GPU: Runtime > Change runtime type > T4 GPU")
    raise RuntimeError("GPU required for training!")

# Check dataset
print("\n" + "=" * 60)
print("DATASET STATUS")
print("=" * 60)

dataset_path = '/content/dataset'
data_yaml = f'{dataset_path}/data.yaml'

if not os.path.exists(dataset_path):
    print("❌ Dataset not found! Run Task 2 notebook first.")
    raise FileNotFoundError(f"Dataset not found at {dataset_path}")

if not os.path.exists(data_yaml):
    print("❌ data.yaml not found!")
    raise FileNotFoundError(f"data.yaml not found at {data_yaml}")

# Count images
train_imgs = len(glob.glob(f'{dataset_path}/train/images/*'))
valid_imgs = len(glob.glob(f'{dataset_path}/valid/images/*'))
test_imgs = len(glob.glob(f'{dataset_path}/test/images/*'))

print("✅ Dataset found!")
print(f"   Train: {train_imgs} images")
print(f"   Valid: {valid_imgs} images")
print(f"   Test:  {test_imgs} images")
print(f"   Total: {train_imgs + valid_imgs + test_imgs} images")

print("\n" + "=" * 60)
print("✅ READY TO TRAIN!")
print("=" * 60)


## 🤖 CELL 2: Load Pretrained YOLOv8n Model

**Purpose:**
- Load YOLOv8n pretrained on the COCO dataset
- Use transfer learning to take advantage of the pretrained weights

**Key Concept:** Transfer learning typically helps the model learn faster and reach higher accuracy than training from scratch.


In [None]:
# Load pretrained YOLOv8n model
print("🤖 Loading YOLOv8n pretrained model...")
model = YOLO('yolov8n.pt')

print("✅ Model loaded successfully!")
print("\nModel Info:")
print("  - Architecture: YOLOv8n (nano)")
print("  - Parameters: ~3.2M")
print("  - Pretrained: COCO dataset (80 classes)")
print("  - Will fine-tune for: 4 waste classes")
print("\n✨ Transfer learning enabled!")


## ⚙️ CELL 3: Configure Training Parameters

**Important parameters:**
- `epochs=100`: Number of epochs (1 epoch = one full pass over the training set)
- `imgsz=640`: Input image size
- `batch=16`: Number of images per training batch (reduce if you hit OOM)
- `lr0=0.01`: Initial learning rate
- `patience=20`: Early stopping (stop if there is no improvement for 20 epochs)

**💡 Tips:**
- If you run out of GPU memory (OOM) → reduce to `batch=8` or `imgsz=416`
- Training will take ~2-4 hours on a T4 GPU
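
The OOM tip can also be automated. The sketch below retries with progressively smaller batch sizes; `train_with_batch_fallback` is a hypothetical helper, and it assumes the out-of-memory failure surfaces as a `RuntimeError` whose message contains "out of memory" (which is how PyTorch usually reports it). Some Ultralytics versions may already handle part of this internally, so treat it as an illustration rather than a required step.

```python
def train_with_batch_fallback(train_fn, config, batch_sizes=(16, 8, 4)):
    """Call train_fn(**config), retrying with a smaller batch on OOM.

    Returns (batch_used, result). Errors that are not out-of-memory
    errors are re-raised unchanged.
    """
    last_error = None
    for batch in batch_sizes:
        attempt = {**config, "batch": batch}  # do not mutate the caller's dict
        try:
            return batch, train_fn(**attempt)
        except RuntimeError as err:
            if "out of memory" not in str(err).lower():
                raise
            last_error = err
            print(f"OOM at batch={batch}; trying a smaller batch")
    raise RuntimeError("All batch sizes ran out of memory") from last_error
```

In this notebook the call would look like `train_with_batch_fallback(model.train, training_config)`; calling `torch.cuda.empty_cache()` between attempts may also help release memory.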


In [None]:
# Configure training parameters
print("⚙️ Configuring training parameters...\n")

training_config = {
    'data': '/content/dataset/data.yaml',
    'epochs': 100,
    'imgsz': 640,
    'batch': 16,
    'lr0': 0.01,
    'patience': 20,
    'save': True,
    'save_period': 10,
    'project': 'runs/waste_detection',
    'name': 'yolov8n_waste',
    'exist_ok': True,
    'pretrained': True,
    'optimizer': 'SGD',
    'verbose': True,
    'plots': True,
    'device': 0
}

print("📋 Training Configuration:")
print("=" * 60)
for key, value in training_config.items():
    print(f"  {key:15s}: {value}")
print("=" * 60)

print("\n⏱️ Estimated training time: 2-4 hours")
print("💾 Weights will be saved every 10 epochs")
print("🎯 Best model: runs/waste_detection/yolov8n_waste/weights/best.pt")
print("\n✅ Ready to start training!")


## 🚀 CELL 4: START TRAINING!

**⚠️ IMPORTANT:**
- This cell will run for **2-4 hours**
- Do not close the browser (Colab will disconnect)
- Monitor progress in the output
- You can pause by stopping the cell; to continue later, training has to be restarted from the saved `last.pt` checkpoint

**📊 The output will show:**
- Epoch progress (1/100, 2/100, ...)
- Loss values (box_loss, cls_loss, dfl_loss)
- Metrics (precision, recall, mAP)
- Training speed (ms/image)

**💾 Auto-save:** Weights are saved automatically every 10 epochs
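
If the cell is stopped or Colab disconnects, training does not continue on its own. A minimal sketch of resuming from the last checkpoint follows; it assumes the run directory used in this notebook and relies on the Ultralytics `resume=True` training option. `find_resume_checkpoint` and `resume_training` are hypothetical helper names.

```python
from pathlib import Path

RUN_DIR = "runs/waste_detection/yolov8n_waste"  # run directory from Cell 3


def find_resume_checkpoint(run_dir: str = RUN_DIR):
    """Return the path to last.pt if it exists, otherwise None."""
    last = Path(run_dir) / "weights" / "last.pt"
    return last if last.is_file() else None


def resume_training(run_dir: str = RUN_DIR):
    """Continue an interrupted run from its last.pt checkpoint."""
    checkpoint = find_resume_checkpoint(run_dir)
    if checkpoint is None:
        raise FileNotFoundError(f"No last.pt found under {run_dir}/weights")
    # Imported lazily so the path helper above works without ultralytics.
    from ultralytics import YOLO

    return YOLO(str(checkpoint)).train(resume=True)
```

Note that this only works while `runs/` still exists: a fresh Colab VM starts with an empty disk, so the checkpoint would first have to be restored from a backup.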


In [None]:
# START TRAINING!
print("🚀 STARTING TRAINING...")
print("=" * 60)
print("⏱️  Training started at:", datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
print("=" * 60)
print()

# Train the model
results = model.train(**training_config)

print()
print("=" * 60)
print("🎉 TRAINING COMPLETED!")
print("⏱️  Training finished at:", datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
print("=" * 60)
print()
print("📁 Results saved at: runs/waste_detection/yolov8n_waste/")
print("🏆 Best weights: runs/waste_detection/yolov8n_waste/weights/best.pt")
print("📊 Metrics: runs/waste_detection/yolov8n_waste/results.csv")


## 💾 CELL 5: Backup Weights to Google Drive

**Purpose:**
- Back up the model weights to Google Drive
- Make sure the model is not lost if Colab disconnects

**⚠️ Note:** Only run this cell AFTER training has finished!
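
Since the backup is the only copy that survives a Colab reset, it can be worth confirming that each copied file is intact. The sketch below copies a file and compares SHA-256 checksums of source and destination; `sha256_of` and `backup_verified` are hypothetical helpers built only on the standard library, not something this notebook or Google Drive provides.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_verified(src, dst_dir) -> Path:
    """Copy src into dst_dir and raise if the copy's checksum differs."""
    src = Path(src)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / src.name
    shutil.copy2(src, dst)
    if sha256_of(src) != sha256_of(dst):
        raise IOError(f"Checksum mismatch for {dst}")
    return dst
```

A call such as `backup_verified('runs/waste_detection/yolov8n_waste/weights/best.pt', backup_dir)` would keep the original file name instead of the renamed copies used in the cell below.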


In [None]:
# Backup weights to Google Drive
print("💾 Backing up weights to Google Drive...\n")

# Mount Drive if not already mounted
if not os.path.exists('/content/drive/MyDrive'):
    from google.colab import drive
    drive.mount('/content/drive')

# Create backup folder
backup_dir = '/content/drive/MyDrive/Waste_Detection_Project/trained_models'
os.makedirs(backup_dir, exist_ok=True)

# Copy best weights
best_weights = 'runs/waste_detection/yolov8n_waste/weights/best.pt'
last_weights = 'runs/waste_detection/yolov8n_waste/weights/last.pt'

if os.path.exists(best_weights):
    shutil.copy(best_weights, f'{backup_dir}/yolov8n_waste_best.pt')
    print(f"✅ Best weights backed up to: {backup_dir}/yolov8n_waste_best.pt")

if os.path.exists(last_weights):
    shutil.copy(last_weights, f'{backup_dir}/yolov8n_waste_last.pt')
    print(f"✅ Last weights backed up to: {backup_dir}/yolov8n_waste_last.pt")

# Also backup results
results_csv = 'runs/waste_detection/yolov8n_waste/results.csv'
if os.path.exists(results_csv):
    shutil.copy(results_csv, f'{backup_dir}/training_results.csv')
    print(f"✅ Training results backed up to: {backup_dir}/training_results.csv")

print("\n🎉 Backup completed!")


## 📊 CELL 6: Visualize Training Results

**Purpose:**
- View the training curves (loss, mAP, precision, recall)
- View the confusion matrix
- View sample predictions on the validation set

**Key Metrics to Check:**
- **mAP@0.5** > 0.7 (good performance)
- **Loss** decreases steadily over the epochs
- **Precision & Recall** are balanced
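
Two of these checks can be expressed as numbers rather than judged by eye. The sketch below is one possible way to do it, with hypothetical helper names: the F1 score summarizes how balanced precision and recall are, and comparing the average loss of the first and last few epochs gives a rough "is it going down" signal. The window size of 5 is an arbitrary choice.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; low if either one is low."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def loss_is_decreasing(losses, window: int = 5) -> bool:
    """Compare the mean loss of the first and last `window` epochs."""
    if len(losses) < 2:
        return False
    window = max(1, min(window, len(losses) // 2))
    head = sum(losses[:window]) / window
    tail = sum(losses[-window:]) / window
    return tail < head
```

The inputs could come from the `metrics/precision(B)`, `metrics/recall(B)` and `train/box_loss` columns of `results.csv`, for example `loss_is_decreasing(df['train/box_loss'].tolist())`.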


In [None]:
# Visualize training results
print("📊 TRAINING RESULTS VISUALIZATION\n")
print("=" * 60)

results_dir = 'runs/waste_detection/yolov8n_waste'

# 1. Training curves (Loss, mAP, Precision, Recall)
print("\n1️⃣ TRAINING CURVES")
print("-" * 60)
results_plot = f'{results_dir}/results.png'
if os.path.exists(results_plot):
    display(Image(results_plot, width=900))
    print("✅ Training curves loaded")
else:
    print("❌ results.png not found!")

# 2. Confusion Matrix
print("\n2️⃣ CONFUSION MATRIX")
print("-" * 60)
confusion_matrix = f'{results_dir}/confusion_matrix.png'
if os.path.exists(confusion_matrix):
    display(Image(confusion_matrix, width=600))
    print("✅ Confusion matrix loaded")
else:
    print("❌ confusion_matrix.png not found!")

# 3. Validation batch predictions
print("\n3️⃣ SAMPLE PREDICTIONS (Validation Set)")
print("-" * 60)
val_pred = f'{results_dir}/val_batch0_pred.jpg'
if os.path.exists(val_pred):
    display(Image(val_pred, width=900))
    print("✅ Validation predictions loaded")
else:
    print("❌ val_batch0_pred.jpg not found!")

print("\n" + "=" * 60)
print("✅ Visualization complete!")
print("=" * 60)


## 📈 CELL 7: Training Metrics Summary

View a summary of the metrics from training


In [None]:
# Print metrics summary
import pandas as pd

print("📈 TRAINING METRICS SUMMARY\n")
print("=" * 60)

results_csv = 'runs/waste_detection/yolov8n_waste/results.csv'

if os.path.exists(results_csv):
    df = pd.read_csv(results_csv)
    df.columns = df.columns.str.strip()
    
    # Get final epoch metrics
    final_metrics = df.iloc[-1]
    
    print("\n🏆 FINAL METRICS (Last Epoch):")
    print("-" * 60)
    print(f"  mAP@0.5:     {final_metrics['metrics/mAP50(B)']:.4f}")
    print(f"  mAP@0.5:0.95: {final_metrics['metrics/mAP50-95(B)']:.4f}")
    print(f"  Precision:   {final_metrics['metrics/precision(B)']:.4f}")
    print(f"  Recall:      {final_metrics['metrics/recall(B)']:.4f}")
    
    print("\n📊 LOSS VALUES:")
    print("-" * 60)
    print(f"  Box Loss:    {final_metrics['train/box_loss']:.4f}")
    print(f"  Class Loss:  {final_metrics['train/cls_loss']:.4f}")
    print(f"  DFL Loss:    {final_metrics['train/dfl_loss']:.4f}")
    
    # Best epoch
    best_epoch = df['metrics/mAP50(B)'].idxmax() + 1
    best_map = df['metrics/mAP50(B)'].max()
    
    print("\n🎯 BEST PERFORMANCE:")
    print("-" * 60)
    print(f"  Best Epoch:  {best_epoch}/{len(df)}")
    print(f"  Best mAP@0.5: {best_map:.4f}")
    
    print("\n" + "=" * 60)
else:
    print("❌ results.csv not found!")

print("✅ Metrics summary complete!")


---

## ✅ TASK 3 COMPLETED!

### 🎉 Checklist:
- ✅ Model loaded (YOLOv8n pretrained)
- ✅ Training completed (up to 100 epochs; fewer if early stopping triggered)
- ✅ Weights saved and backed up
- ✅ Metrics visualized
- ✅ Performance evaluated

### 📁 Important Files:
```
runs/waste_detection/yolov8n_waste/
├── weights/
│   ├── best.pt       ← Best model (highest mAP)
│   └── last.pt       ← Last epoch model
├── results.csv       ← Training metrics
├── results.png       ← Training curves
├── confusion_matrix.png
└── val_batch0_pred.jpg
```
```
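
One caveat about `best.pt`: as far as I know, Ultralytics does not pick it by mAP@0.5 alone but by a fitness score that weights mAP@0.5:0.95 much more heavily. The sketch below assumes weights of 0.1 and 0.9, which match the detection defaults I am aware of but may differ between versions; the helper names are hypothetical.

```python
def detection_fitness(map50: float, map50_95: float,
                      w50: float = 0.1, w50_95: float = 0.9) -> float:
    """Weighted mix of mAP@0.5 and mAP@0.5:0.95 (weights are assumed)."""
    return w50 * map50 + w50_95 * map50_95


def best_epoch_by_fitness(rows) -> int:
    """Return the 1-based epoch with the highest fitness.

    `rows` is a sequence of (mAP@0.5, mAP@0.5:0.95) pairs, one per epoch.
    """
    scores = [detection_fitness(m50, m5095) for m50, m5095 in rows]
    return scores.index(max(scores)) + 1
```

This means the "Best Epoch" reported in Cell 7, which uses mAP@0.5 only, can differ from the epoch that produced `best.pt`. The pairs could be built from `results.csv` with `zip(df['metrics/mAP50(B)'], df['metrics/mAP50-95(B)'])`.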

### 📊 Expected Performance:
- **mAP@0.5:** 0.70 - 0.85 (Good)
- **Precision:** 0.75 - 0.90
- **Recall:** 0.70 - 0.85

---

### 🎯 Next: TASK 4 - TESTING & EVALUATION

**Next steps:**
1. Load best model
2. Test on the test set
3. Generate detailed reports
4. Test on real-world images

**Continue to Task 4 notebook!**

---
