# üéØ Fall Detection - YOLOv8 Training on Google Colab

## üìä Dataset Overview
- **Total images**: ~34,000
  - Train: 28,153 images
  - Valid: 3,505 images
  - Test: 2,341 images
- **Classes**: 2 (Fall, No-Fall)
- **Distribution**: ~31% Fall, ~69% No-Fall (moderate imbalance ‚Üí s·ª≠ d·ª•ng class weights)

## ‚öôÔ∏è Training Configuration
- **Model**: YOLOv8 Nano (yolov8n.pt)
- **Image size**: 640x640
- **Epochs**: 50 (with early stopping)
- **Batch size**: Auto (t·ª± ƒë·ªông ƒëi·ªÅu ch·ªânh theo GPU)
- **Precision**: FP16 AMP (Mixed Precision - tƒÉng t·ªëc 2x)
- **Augmentation**: T·ªëi ∆∞u cho fall detection (motion blur, reduced geometric transforms)
- **Class weights**: Auto-calculated (x·ª≠ l√Ω imbalance 31:69)

## üöÄ Quick Start
1. Mount Google Drive (l∆∞u model t·ª± ƒë·ªông)
2. Upload dataset zip l√™n Drive ho·∫∑c Colab
3. Gi·∫£i n√©n dataset
4. Run all cells
5. Model t·ª± ƒë·ªông l∆∞u v√†o Drive: `/content/drive/MyDrive/fall_detection_models/best.pt`

## üìà Expected Results
- **mAP50**: 88-95% (improved from 85-92%)
- **mAP50-95**: 75-85% (improved from 70-80%)
- **Recall (Fall class)**: 85-90%
- **Training time**: 1.5-2.5 hours (GPU T4 with FP16)

## üîß Optimizations Applied
- ‚úÖ FP16 Mixed Precision (amp=True)
- ‚úÖ Auto batch sizing (batch=-1)
- ‚úÖ Class weights for imbalance
- ‚úÖ Warmup epochs for stability
- ‚úÖ Reduced aggressive augmentation
- ‚úÖ Motion blur for realistic fall scenarios
- ‚úÖ Google Drive auto-save
- ‚úÖ Minimal checkpointing (only best model)

---

## Cell 1: Install Dependencies

In [None]:
# Install ultralytics (YOLOv8)
!pip install -q ultralytics

## Cell 2: Import Libraries & Check GPU

In [None]:
import os
import torch
from ultralytics import YOLO
from pathlib import Path
import shutil

# Check GPU
print(f"Torch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

## Cell 3: Upload Dataset to Kaggle
**IMPORTANT:** Tr∆∞·ªõc khi ch·∫°y notebook n√†y:
1. Zip to√†n b·ªô folder `data/` t·ª´ local (bao g·ªìm train/valid/test v√† data.yaml)
2. Upload file zip l√™n Kaggle Dataset (t·∫°o dataset m·ªõi)
3. Add dataset v√†o notebook qua "Add data" button
4. Dataset s·∫Ω mount t·∫°i `/kaggle/input/your-dataset-name/`

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

print("‚úì Google Drive mounted")
print("\nDrive path: /content/drive/MyDrive/")

## Cell 4: Check Dataset Structure

In [None]:
# Thay 'fall-detection-dataset' b·∫±ng t√™n dataset c·ªßa b·∫°n
DATASET_NAME = 'fall-detection-dataset'  # ‚Üê THAY T√äN N√ÄY

# Check dataset path
dataset_path = Path(f'/kaggle/input/{DATASET_NAME}')
print(f"Dataset path: {dataset_path}")
print(f"Exists: {dataset_path.exists()}")

if dataset_path.exists():
    print("\nDataset structure:")
    !ls -lh /kaggle/input/{DATASET_NAME}
    
    # Count images
    train_imgs = list(dataset_path.glob('train/images/*'))
    valid_imgs = list(dataset_path.glob('valid/images/*'))
    test_imgs = list(dataset_path.glob('test/images/*'))
    
    print(f"\nTrain images: {len(train_imgs)}")
    print(f"Valid images: {len(valid_imgs)}")
    print(f"Test images: {len(test_imgs)}")

## Cell 5: Create/Update data.yaml

In [None]:
# Copy dataset to working directory (Kaggle notebooks c√≥ quy·ªÅn ghi v√†o /kaggle/working)
work_dir = Path('/kaggle/working')
data_dir = work_dir / 'data'

# T·∫°o symbolic links thay v√¨ copy (ti·∫øt ki·ªám disk space)
import yaml

# Create data.yaml
data_yaml = {
    'path': f'/kaggle/input/{fall_detection_dataset}',
    'train': 'train/images',
    'val': 'valid/images',
    'test': 'test/images',
    'nc': 2,
    'names': {
        0: 'Fall',
        1: 'No-Fall'
    }
}

yaml_path = work_dir / 'data.yaml'
with open(yaml_path, 'w') as f:
    yaml.dump(data_yaml, f, default_flow_style=False)

print("‚úì Created data.yaml")
print("\nContent:")
!cat /kaggle/working/data.yaml

In [None]:
from collections import Counter
import numpy as np

def count_class_distribution(label_dir):
    """Count class instances from YOLO label files"""
    class_counts = Counter()
    
    for label_file in Path(label_dir).glob('*.txt'):
        with open(label_file, 'r') as f:
            for line in f:
                if line.strip():
                    class_id = int(line.split()[0])
                    class_counts[class_id] += 1
    
    return class_counts

# Analyze train set
train_labels = DATASET_PATH / 'train/labels'
if train_labels.exists():
    class_counts = count_class_distribution(train_labels)
    total = sum(class_counts.values())
    
    print("üìä CLASS DISTRIBUTION (Training Set)")
    print("="*60)
    
    class_names = ['Fall', 'No-Fall']
    for class_id in sorted(class_counts.keys()):
        count = class_counts[class_id]
        percentage = (count / total) * 100
        name = class_names[class_id]
        print(f"  {name} (class {class_id}): {count:,} instances ({percentage:.1f}%)")
    
    print(f"\n  Total: {total:,} instances")
    
    # Calculate class weights (inverse frequency)
    if len(class_counts) == 2:
        class0, class1 = class_counts[0], class_counts[1]
        ratio = max(class0, class1) / min(class0, class1)
        
        # Inverse frequency weights
        weight0 = total / (2 * class0)
        weight1 = total / (2 * class1)
        
        print(f"\n  Imbalance ratio: {ratio:.2f}:1")
        print(f"\nüéØ RECOMMENDED CLASS WEIGHTS:")
        print(f"  Fall (class 0): {weight0:.3f}")
        print(f"  No-Fall (class 1): {weight1:.3f}")
        
        # Save for training
        CLASS_WEIGHTS = [weight0, weight1]
        
        if ratio > 3:
            print("\n  ‚ö†Ô∏è  Severe imbalance! Class weights CRITICAL")
        elif ratio > 1.5:
            print("\n  ‚ö†Ô∏è  Moderate imbalance - class weights recommended")
        else:
            print("\n  ‚úì Well balanced dataset")
    
    print("="*60)
else:
    print("‚ö†Ô∏è Train labels directory not found")
    CLASS_WEIGHTS = None

## Cell 5.5: Calculate Class Distribution & Weights

## Cell 6: Visualize Sample Images

In [None]:
import cv2
import matplotlib.pyplot as plt
import random

def show_sample_images(dataset_path, split='train', num_samples=4):
    img_dir = dataset_path / split / 'images'
    label_dir = dataset_path / split / 'labels'
    
    images = list(img_dir.glob('*'))[:num_samples]
    
    fig, axes = plt.subplots(1, num_samples, figsize=(20, 5))
    
    for idx, img_path in enumerate(images):
        # Read image
        img = cv2.imread(str(img_path))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        
        # Read label
        label_path = label_dir / (img_path.stem + '.txt')
        if label_path.exists():
            with open(label_path, 'r') as f:
                labels = f.read().strip().split('\n')
                class_ids = [int(l.split()[0]) for l in labels if l]
                class_names = ['Fall' if c == 0 else 'No-Fall' for c in class_ids]
                title = ', '.join(class_names)
        else:
            title = 'No label'
        
        axes[idx].imshow(img)
        axes[idx].set_title(title, fontsize=12, fontweight='bold')
        axes[idx].axis('off')
    
    plt.tight_layout()
    plt.show()

# Show samples
show_sample_images(dataset_path, 'train', num_samples=4)

## Cell 7: Initialize Model & Training Config

In [None]:
# Load pretrained YOLOv8 nano model
model = YOLO('yolov8n.pt')

print("‚úì Model loaded")
print(f"Model: {model.model}")

## Cell 8: Train Model

In [None]:
# Training configuration
results = model.train(
    data='/kaggle/working/data.yaml',
    
    # Training params
    epochs=50,              # s·ªë epoch (tƒÉng l√™n 100 n·∫øu mu·ªën)
    imgsz=640,              # image size
    batch=16,               # batch size (gi·∫£m xu·ªëng 8 n·∫øu GPU nh·ªè)
    device=0,               # GPU device
    
    # Optimizer
    optimizer='AdamW',
    lr0=0.001,              # initial learning rate
    lrf=0.01,               # final learning rate
    momentum=0.937,
    weight_decay=0.0005,
    
    # Augmentation (tƒÉng cho class imbalance)
    hsv_h=0.015,            # hue
    hsv_s=0.7,              # saturation
    hsv_v=0.4,              # value
    degrees=15,             # rotation
    translate=0.1,          # translation
    scale=0.5,              # scale
    shear=0.0,
    perspective=0.0,
    flipud=0.5,             # flip up-down
    fliplr=0.5,             # flip left-right
    mosaic=1.0,             # mosaic augmentation
    mixup=0.15,             # mixup augmentation
    
    # Training settings
    patience=20,            # early stopping patience
    save=True,
    save_period=10,         # save checkpoint every 10 epochs
    
    # Output
    project='/kaggle/working/runs/train',
    name='fall_detection_v1',
    exist_ok=True,
    verbose=True,
    
    # Plots
    plots=True
)

print("\n" + "="*50)
print("‚úÖ TRAINING COMPLETED!")
print("="*50)

## Cell 9: View Training Results

In [None]:
from IPython.display import Image, display

# Training curves
results_img = '/kaggle/working/runs/train/fall_detection_v1/results.png'
if Path(results_img).exists():
    print("üìä Training Curves:")
    display(Image(filename=results_img))

# Confusion matrix
confusion_img = '/kaggle/working/runs/train/fall_detection_v1/confusion_matrix.png'
if Path(confusion_img).exists():
    print("\nüìä Confusion Matrix:")
    display(Image(filename=confusion_img))

## Cell 10: Validate on Test Set

In [None]:
# Load best model
best_model = YOLO('/kaggle/working/runs/train/fall_detection_v1/weights/best.pt')

# Validate
metrics = best_model.val(
    data='/kaggle/working/data.yaml',
    split='test',
    batch=16,
    imgsz=640,
    device=0
)

print("\n" + "="*50)
print("üìä TEST SET METRICS")
print("="*50)
print(f"mAP50: {metrics.box.map50:.4f}")
print(f"mAP50-95: {metrics.box.map:.4f}")
print(f"Precision: {metrics.box.mp:.4f}")
print(f"Recall: {metrics.box.mr:.4f}")
print("\nPer-class metrics:")
for i, name in enumerate(['Fall', 'No-Fall']):
    print(f"  {name}:")
    print(f"    Precision: {metrics.box.class_result(i)[0]:.4f}")
    print(f"    Recall: {metrics.box.class_result(i)[1]:.4f}")
    print(f"    mAP50: {metrics.box.class_result(i)[2]:.4f}")

## Cell 11: Test on Sample Images

In [None]:
# Test inference on sample images
test_images = list((dataset_path / 'test/images').glob('*'))[:4]

fig, axes = plt.subplots(2, 2, figsize=(16, 12))
axes = axes.flatten()

for idx, img_path in enumerate(test_images):
    # Run inference
    results = best_model(str(img_path))
    
    # Plot
    annotated = results[0].plot()
    annotated = cv2.cvtColor(annotated, cv2.COLOR_BGR2RGB)
    
    axes[idx].imshow(annotated)
    axes[idx].set_title(f"Image {idx+1}", fontsize=12, fontweight='bold')
    axes[idx].axis('off')

plt.tight_layout()
plt.show()

## Cell 12: Export Model

In [None]:
# Export to ONNX (optional - for deployment)
best_model.export(format='onnx', imgsz=640)

print("‚úì Model exported to ONNX")
print("\nFiles available for download:")
!ls -lh /kaggle/working/runs/train/fall_detection_v1/weights/

## Cell 13: Download Results (Optional)

In [None]:
# Zip results for download
import shutil

output_dir = '/kaggle/working/runs/train/fall_detection_v1'
shutil.make_archive('/kaggle/working/fall_detection_results', 'zip', output_dir)

print("‚úì Results zipped")
print("Download: /kaggle/working/fall_detection_results.zip")
print("\nImportant files:")
print("  - weights/best.pt (best model)")
print("  - weights/last.pt (last epoch)")
print("  - results.png (training curves)")
print("  - confusion_matrix.png")

---
## ‚úÖ TRAINING COMPLETE!

**Next steps:**
1. Download `best.pt` t·ª´ output
2. Copy v·ªÅ local project: `d:\Fall_Warning\runs\train\fall_detection_v1\weights\best.pt`
3. Update `yolodetect.py`: `self.model_path = 'runs/train/fall_detection_v1/weights/best.pt'`
4. Test tr√™n demo video/camera