{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "header"
   },
   "source": [
    "# 🤖 AI CCTV Surveillance - PPE Detection Training\n",
    "## Google Colab GPU Training Notebook\n",
    "\n",
    "This notebook trains a YOLOv8 model to detect Personal Protective Equipment (PPE) violations on construction sites.\n",
    "\n",
    "**Classes:** Hardhat, Mask, NO-Hardhat, NO-Mask, NO-Safety Vest, Person, Safety Cone, Safety Vest, machinery, vehicle\n",
    "\n",
    "---\n",
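    "**GPU Training Benefits:**\n",
    "- ⚡ Typically 10-20x faster than CPU, depending on model size and batch size\n",
    "- 🆓 Free NVIDIA T4 GPU on Colab (subject to availability)\n",
    "- 📊 Real-time training metrics\n",
    "- 🎯 Faster iterations, so larger models and more epochs become practical"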
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "setup"
   },
   "source": [
    "## 🚀 Step 1: Setup and Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "install_packages"
   },
   "outputs": [],
   "source": [
    "# Install required packages\n",
    "!pip install ultralytics torch torchvision opencv-python matplotlib pandas plotly\n",
    "!pip install -q roboflow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "check_gpu"
   },
   "outputs": [],
   "source": [
    "# Check GPU availability\n",
    "import torch\n",
    "print(f\"PyTorch version: {torch.__version__}\")\n",
    "print(f\"CUDA available: {torch.cuda.is_available()}\")\n",
    "if torch.cuda.is_available():\n",
    "    print(f\"GPU Device: {torch.cuda.get_device_name(0)}\")\n",
    "    print(f\"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB\")\n",
    "else:\n",
    "    print(\"❌ No GPU available. Please enable GPU in Runtime > Change runtime type\")\n",
    "\n",
    "# Import required libraries\n",
    "from ultralytics import YOLO\n",
    "import cv2\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd\n",
    "import os\n",
    "from pathlib import Path\n",
    "import yaml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "dataset"
   },
   "source": [
    "## 📁 Step 2: Dataset Setup\n",
    "\n",
    "We'll use the Construction Site Safety Dataset from Roboflow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "download_dataset"
   },
   "outputs": [],
   "source": [
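    "# Download the Construction Site Safety Dataset\n",
    "from roboflow import Roboflow\n",
    "\n",
    "ROBOFLOW_API_KEY = \"YOUR_API_KEY\"  # Get a free API key from roboflow.com\n",
    "\n",
    "# Create dataset directory\n",
    "!mkdir -p /content/dataset\n",
    "\n",
    "# Option A: Roboflow client (only created once a real key is set).\n",
    "# The project/version download call depends on your workspace and is not shown here.\n",
    "if ROBOFLOW_API_KEY != \"YOUR_API_KEY\":\n",
    "    rf = Roboflow(api_key=ROBOFLOW_API_KEY)\n",
    "    print(\"✅ Roboflow client created\")\n",
    "else:\n",
    "    print(\"⚠️ No Roboflow API key set; use Option B or the manual upload cell below\")\n",
    "\n",
    "# Option B: direct download (uncomment and replace with your actual dataset URL)\n",
    "# !wget -O /content/dataset/construction_safety.zip \"YOUR_DATASET_URL\"\n",
    "# !unzip /content/dataset/construction_safety.zip -d /content/dataset/\n",
    "\n",
    "# Show what is actually on disk instead of assuming the download succeeded\n",
    "print(\"📁 Dataset directory contents:\")\n",
    "!ls -la /content/dataset/"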
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "upload_dataset"
   },
   "outputs": [],
   "source": [
    "# Alternative: Upload your dataset manually\n",
    "from google.colab import files\n",
    "import zipfile\n",
    "\n",
    "print(\"📤 Please upload your dataset ZIP file:\")\n",
    "uploaded = files.upload()\n",
    "\n",
    "# Extract the uploaded file\n",
    "for filename in uploaded.keys():\n",
    "    if filename.endswith('.zip'):\n",
    "        print(f\"📦 Extracting {filename}...\")\n",
    "        with zipfile.ZipFile(filename, 'r') as zip_ref:\n",
    "            zip_ref.extractall('/content/dataset/')\n",
    "        print(f\"✅ {filename} extracted successfully!\")\n",
    "\n",
    "# List contents\n",
    "!ls -la /content/dataset/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "create_yaml"
   },
   "outputs": [],
   "source": [
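    "# Create YAML configuration file for the dataset.\n",
    "# Paths assume the archive extracts to /content/dataset/css-data; adjust them if your layout differs.\n",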
    "yaml_content = \"\"\"\n",
    "train: /content/dataset/css-data/train/images\n",
    "val: /content/dataset/css-data/valid/images\n",
    "test: /content/dataset/css-data/test/images\n",
    "\n",
    "nc: 10\n",
    "names: ['Hardhat', 'Mask', 'NO-Hardhat', 'NO-Mask', 'NO-Safety Vest', 'Person', 'Safety Cone', 'Safety Vest', 'machinery', 'vehicle']\n",
    "\"\"\"\n",
    "\n",
    "# Write YAML file\n",
    "with open('/content/dataset/ppe_data.yaml', 'w') as f:\n",
    "    f.write(yaml_content)\n",
    "\n",
    "print(\"📋 YAML configuration created:\")\n",
    "print(yaml_content)\n",
    "\n",
    "# Verify dataset structure\n",
    "print(\"\\n🔍 Verifying dataset structure...\")\n",
    "train_path = \"/content/dataset/css-data/train/images\"\n",
    "val_path = \"/content/dataset/css-data/valid/images\"\n",
    "test_path = \"/content/dataset/css-data/test/images\"\n",
    "\n",
    "for path, name in [(train_path, \"Train\"), (val_path, \"Validation\"), (test_path, \"Test\")]:\n",
    "    if os.path.exists(path):\n",
    "        # Compare lowercased names so files such as IMG_001.JPG are counted too\n",
    "        count = len([f for f in os.listdir(path) if f.lower().endswith(('.jpg', '.jpeg', '.png'))])\n",
    "        print(f\"✅ {name}: {count} images found\")\n",
    "    else:\n",
    "        print(f\"❌ {name}: Path not found - {path}\")"
   ]
  },
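  {
   "cell_type": "markdown",
   "metadata": {
    "id": "label_check_note"
   },
   "source": [
    "Before training, it can help to confirm that each image has a matching label file and that no label uses a class ID outside `0..nc-1`. The cell below is a minimal sketch rather than part of the original pipeline. It assumes the standard YOLO layout in which every `images/` folder has a sibling `labels/` folder of `.txt` files, and the helper name `check_labels` is hypothetical. Images without a label file are not necessarily errors, since YOLO treats them as background images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "label_check"
   },
   "outputs": [],
   "source": [
    "# Sketch: sanity-check YOLO labels (assumes images/ and labels/ are sibling folders)\n",
    "import os\n",
    "\n",
    "NUM_CLASSES = 10  # Must match `nc` in ppe_data.yaml\n",
    "IMAGE_EXTS = ('.jpg', '.jpeg', '.png')\n",
    "\n",
    "def check_labels(images_dir, num_classes=NUM_CLASSES):\n",
    "    \"\"\"Return (images_without_label_file, invalid_class_lines) for one split.\"\"\"\n",
    "    labels_dir = os.path.join(os.path.dirname(images_dir.rstrip('/')), 'labels')\n",
    "    missing, bad = [], []\n",
    "    for name in sorted(os.listdir(images_dir)):\n",
    "        if not name.lower().endswith(IMAGE_EXTS):\n",
    "            continue\n",
    "        label_path = os.path.join(labels_dir, os.path.splitext(name)[0] + '.txt')\n",
    "        if not os.path.exists(label_path):\n",
    "            missing.append(name)\n",
    "            continue\n",
    "        with open(label_path) as f:\n",
    "            for line_no, line in enumerate(f, start=1):\n",
    "                parts = line.split()\n",
    "                if not parts:\n",
    "                    continue  # Skip blank lines\n",
    "                # The first field of a YOLO label line is the integer class ID\n",
    "                if not parts[0].isdigit() or int(parts[0]) >= num_classes:\n",
    "                    bad.append((label_path, line_no))\n",
    "    return missing, bad\n",
    "\n",
    "for split in ('train', 'valid', 'test'):\n",
    "    images_dir = f'/content/dataset/css-data/{split}/images'\n",
    "    if not os.path.isdir(images_dir):\n",
    "        print(f\"❌ {split}: path not found - {images_dir}\")\n",
    "        continue\n",
    "    missing, bad = check_labels(images_dir)\n",
    "    print(f\"{split}: {len(missing)} images without a label file, {len(bad)} lines with invalid class IDs\")"
   ]
  },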
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "training"
   },
   "source": [
    "## 🎯 Step 3: Model Training\n",
    "\n",
    "Now we'll train the YOLOv8 model using GPU acceleration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "training_params"
   },
   "outputs": [],
   "source": [
    "# Training parameters\n",
    "EPOCHS = 100  # You can adjust this\n",
    "BATCH_SIZE = 16  # Adjust based on GPU memory\n",
    "IMG_SIZE = 640\n",
    "MODEL_SIZE = 'n'  # 'n' for nano, 's' for small, 'm' for medium, 'l' for large, 'x' for xlarge\n",
    "\n",
    "print(\"🎯 Training Configuration:\")\n",
    "print(f\"   Model: YOLOv8{MODEL_SIZE}\")\n",
    "print(f\"   Epochs: {EPOCHS}\")\n",
    "print(f\"   Batch Size: {BATCH_SIZE}\")\n",
    "print(f\"   Image Size: {IMG_SIZE}x{IMG_SIZE}\")\n",
    "print(f\"   Device: {'GPU' if torch.cuda.is_available() else 'CPU'}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "start_training"
   },
   "outputs": [],
   "source": [
    "# Load YOLOv8 model\n",
    "model = YOLO(f'yolov8{MODEL_SIZE}.pt')\n",
    "\n",
    "print(\"🚀 Starting training...\")\n",
    "print(\"⏱️  Training time depends on dataset size, model size, and epochs; expect roughly 30-60 minutes on a T4 GPU\")\n",
    "\n",
    "# Start training\n",
    "results = model.train(\n",
    "    data='/content/dataset/ppe_data.yaml',\n",
    "    epochs=EPOCHS,\n",
    "    batch=BATCH_SIZE,\n",
    "    imgsz=IMG_SIZE,\n",
    "    device=0 if torch.cuda.is_available() else 'cpu',  # Use GPU if available\n",
    "    project='/content/results',\n",
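    "    name='ppe_detection_model',  # If this run already exists, Ultralytics appends a suffix (e.g. ppe_detection_model2); update later paths to match\n",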
    "    patience=20,  # Early stopping\n",
    "    save=True,\n",
    "    save_period=10,  # Save every 10 epochs\n",
    "    verbose=True,\n",
    "    plots=True,  # Generate training plots\n",
    "    val=True  # Run validation\n",
    ")\n",
    "\n",
    "print(\"✅ Training completed successfully!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "results"
   },
   "source": [
    "## 📊 Step 4: Training Results and Analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "view_results"
   },
   "outputs": [],
   "source": [
    "# Display training results\n",
    "print(\"📊 Training Results:\")\n",
    "!ls -la /content/results/ppe_detection_model/\n",
    "\n",
    "# Show training plots\n",
    "from IPython.display import Image, display\n",
    "\n",
    "result_files = [\n",
    "    '/content/results/ppe_detection_model/results.png',\n",
    "    '/content/results/ppe_detection_model/confusion_matrix.png',\n",
    "    '/content/results/ppe_detection_model/labels.jpg',\n",
    "    '/content/results/ppe_detection_model/train_batch0.jpg'\n",
    "]\n",
    "\n",
    "for file_path in result_files:\n",
    "    if os.path.exists(file_path):\n",
    "        print(f\"\\n📈 {os.path.basename(file_path)}:\")\n",
    "        display(Image(filename=file_path))\n",
    "    else:\n",
    "        print(f\"❌ {file_path} not found\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "model_metrics"
   },
   "outputs": [],
   "source": [
    "# Load and display model metrics\n",
    "import pandas as pd\n",
    "\n",
    "results_csv = '/content/results/ppe_detection_model/results.csv'\n",
    "if os.path.exists(results_csv):\n",
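    "    df = pd.read_csv(results_csv)\n",
    "    # Some Ultralytics versions pad column names with spaces; strip them so the lookups below work\n",
    "    df.columns = df.columns.str.strip()\n",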
    "    print(\"📈 Training Metrics:\")\n",
    "    print(df.tail())\n",
    "    \n",
    "    # Plot training curves\n",
    "    plt.figure(figsize=(15, 5))\n",
    "    \n",
    "    plt.subplot(1, 3, 1)\n",
    "    plt.plot(df['epoch'], df['train/box_loss'], label='Train Box Loss')\n",
    "    plt.plot(df['epoch'], df['val/box_loss'], label='Val Box Loss')\n",
    "    plt.title('Box Loss')\n",
    "    plt.xlabel('Epoch')\n",
    "    plt.ylabel('Loss')\n",
    "    plt.legend()\n",
    "    \n",
    "    plt.subplot(1, 3, 2)\n",
    "    plt.plot(df['epoch'], df['train/cls_loss'], label='Train Cls Loss')\n",
    "    plt.plot(df['epoch'], df['val/cls_loss'], label='Val Cls Loss')\n",
    "    plt.title('Classification Loss')\n",
    "    plt.xlabel('Epoch')\n",
    "    plt.ylabel('Loss')\n",
    "    plt.legend()\n",
    "    \n",
    "    plt.subplot(1, 3, 3)\n",
    "    plt.plot(df['epoch'], df['metrics/mAP50(B)'], label='mAP50')\n",
    "    plt.plot(df['epoch'], df['metrics/mAP50-95(B)'], label='mAP50-95')\n",
    "    plt.title('mAP Metrics')\n",
    "    plt.xlabel('Epoch')\n",
    "    plt.ylabel('mAP')\n",
    "    plt.legend()\n",
    "    \n",
    "    plt.tight_layout()\n",
    "    plt.show()\n",
    "else:\n",
    "    print(\"❌ Results CSV not found\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "testing"
   },
   "source": [
    "## 🧪 Step 5: Model Testing and Validation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "validate_model"
   },
   "outputs": [],
   "source": [
    "# Validate the trained model\n",
    "print(\"🔍 Validating model on test set...\")\n",
    "\n",
    "best_model_path = '/content/results/ppe_detection_model/weights/best.pt'\n",
    "if os.path.exists(best_model_path):\n",
    "    model = YOLO(best_model_path)\n",
    "    \n",
    "    # Run validation\n",
    "    results = model.val(\n",
    "        data='/content/dataset/ppe_data.yaml',\n",
    "        split='test',\n",
    "        device=0 if torch.cuda.is_available() else 'cpu'\n",
    "    )\n",
    "    \n",
    "    print(\"✅ Test-set evaluation completed!\")\n",
    "    print(f\"📊 Test mAP50: {results.box.map50:.3f}\")\n",
    "    print(f\"📊 Test mAP50-95: {results.box.map:.3f}\")\n",
    "else:\n",
    "    print(\"❌ Best model not found\")"
   ]
  },
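  {
   "cell_type": "markdown",
   "metadata": {
    "id": "per_class_note"
   },
   "source": [
    "The overall mAP can hide weak classes, which matters here because the violation classes (`NO-Hardhat`, `NO-Mask`, `NO-Safety Vest`) are the ones the system is meant to catch. The cell below is a hedged sketch that lists per-class mAP50-95 from the evaluation above. It assumes `results` is the object returned by `model.val()` in the previous cell and that it exposes `results.box.maps` as in current Ultralytics releases. Classes with no instances in the test split may show the overall mean rather than a measured value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "per_class_metrics"
   },
   "outputs": [],
   "source": [
    "# Sketch: per-class mAP50-95 from the test-set evaluation in the previous cell\n",
    "if os.path.exists(best_model_path):\n",
    "    per_class = pd.DataFrame({\n",
    "        'class': [model.names[i] for i in range(len(results.box.maps))],\n",
    "        'mAP50-95': results.box.maps,\n",
    "    }).sort_values('mAP50-95')  # Weakest classes first\n",
    "    print(per_class.to_string(index=False))\n",
    "else:\n",
    "    print(\"❌ Best model not found; run the evaluation cell first\")"
   ]
  },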
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "test_inference"
   },
   "outputs": [],
   "source": [
    "# Test inference on a sample image\n",
    "import cv2  # PIL is not needed here; importing PIL.Image would shadow IPython.display.Image\n",
    "\n",
    "# Find a test image\n",
    "test_images_dir = '/content/dataset/css-data/test/images'\n",
    "if os.path.exists(test_images_dir):\n",
    "    test_images = [f for f in os.listdir(test_images_dir) if f.lower().endswith(('.jpg', '.jpeg', '.png'))]\n",
    "    if test_images:\n",
    "        test_image_path = os.path.join(test_images_dir, test_images[0])\n",
    "        \n",
    "        print(f\"🧪 Testing inference on: {test_images[0]}\")\n",
    "        \n",
    "        # Run inference\n",
    "        results = model(test_image_path)\n",
    "        \n",
    "        # Display results\n",
    "        for result in results:\n",
    "            # Plot with bounding boxes\n",
    "            result_plot = result.plot()\n",
    "            plt.figure(figsize=(12, 8))\n",
    "            plt.imshow(cv2.cvtColor(result_plot, cv2.COLOR_BGR2RGB))\n",
    "            plt.title(f'Detection Results - {test_images[0]}')\n",
    "            plt.axis('off')\n",
    "            plt.show()\n",
    "            \n",
    "            # Print detections\n",
    "            if result.boxes is not None:\n",
    "                boxes = result.boxes.xyxy.cpu().numpy()\n",
    "                confs = result.boxes.conf.cpu().numpy()\n",
    "                clss = result.boxes.cls.cpu().numpy().astype(int)\n",
    "                \n",
    "                print(\"\\n🎯 Detections:\")\n",
    "                for box, conf, cls in zip(boxes, confs, clss):\n",
    "                    label = model.names[cls]\n",
    "                    print(f\"   {label}: {conf:.3f}\")\n",
    "    else:\n",
    "        print(\"❌ No test images found\")\n",
    "else:\n",
    "    print(\"❌ Test images directory not found\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "download"
   },
   "source": [
    "## 💾 Step 6: Download Trained Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "download_model"
   },
   "outputs": [],
   "source": [
    "# Download the trained model\n",
    "from google.colab import files\n",
    "\n",
    "best_model_path = '/content/results/ppe_detection_model/weights/best.pt'\n",
    "if os.path.exists(best_model_path):\n",
    "    print(\"📥 Downloading best model...\")\n",
    "    files.download(best_model_path)\n",
    "    print(\"✅ Model downloaded successfully!\")\n",
    "    \n",
    "    # Also download the last model\n",
    "    last_model_path = '/content/results/ppe_detection_model/weights/last.pt'\n",
    "    if os.path.exists(last_model_path):\n",
    "        print(\"📥 Downloading last model...\")\n",
    "        files.download(last_model_path)\n",
    "        print(\"✅ Last model downloaded successfully!\")\n",
    "else:\n",
    "    print(\"❌ Best model not found\")\n",
    "\n",
    "# Download training results\n",
    "print(\"\\n📊 Downloading training results...\")\n",
    "!zip -r /content/training_results.zip /content/results/ppe_detection_model/\n",
    "files.download('/content/training_results.zip')\n",
    "print(\"✅ Training results downloaded successfully!\")"
   ]
  },
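  {
   "cell_type": "markdown",
   "metadata": {
    "id": "export_note"
   },
   "source": [
    "For deployment targets that do not run PyTorch, the trained weights can also be exported to another format. The cell below is an optional sketch, not part of the original workflow: it uses the Ultralytics `export` method with `format='onnx'` and reuses `IMG_SIZE` from the training configuration. The variable name `export_model` is only illustrative, and the export may install extra packages the first time it runs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "export_onnx"
   },
   "outputs": [],
   "source": [
    "# Sketch: export the best weights to ONNX (optional)\n",
    "best_model_path = '/content/results/ppe_detection_model/weights/best.pt'\n",
    "if os.path.exists(best_model_path):\n",
    "    export_model = YOLO(best_model_path)\n",
    "    # export() returns the path of the exported file\n",
    "    onnx_path = export_model.export(format='onnx', imgsz=IMG_SIZE)\n",
    "    print(f\"✅ ONNX model written to: {onnx_path}\")\n",
    "else:\n",
    "    print(\"❌ Best model not found\")"
   ]
  },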
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "usage"
   },
   "source": [
    "## 🚀 Step 7: How to Use Your Trained Model\n",
    "\n",
    "After downloading the model, you can use it in your local environment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "usage_example"
   },
   "outputs": [],
   "source": [
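    "# Example code for using the trained model locally.\n",
    "# It is printed rather than executed because cv2.VideoCapture(0) cannot reach your webcam from Colab.\n",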
    "usage_code = \"\"\"\n",
    "# Local usage example\n",
    "from ultralytics import YOLO\n",
    "import cv2\n",
    "\n",
    "# Load your trained model\n",
    "model = YOLO('best.pt')  # Use the downloaded model\n",
    "\n",
    "# For webcam detection\n",
    "cap = cv2.VideoCapture(0)\n",
    "while True:\n",
    "    ret, frame = cap.read()\n",
    "    if not ret:\n",
    "        break\n",
    "    \n",
    "    # Run inference\n",
    "    results = model(frame)\n",
    "    \n",
    "    # Draw results\n",
    "    annotated_frame = results[0].plot()\n",
    "    cv2.imshow('PPE Detection', annotated_frame)\n",
    "    \n",
    "    if cv2.waitKey(1) & 0xFF == ord('q'):\n",
    "        break\n",
    "\n",
    "cap.release()\n",
    "cv2.destroyAllWindows()\n",
    "\"\"\"\n",
    "\n",
    "print(\"📝 Local Usage Example:\")\n",
    "print(usage_code)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "tips"
   },
   "source": [
    "## 💡 Tips and Best Practices\n",
    "\n",
    "### **Training Tips:**\n",
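    "- 🎯 **Start with 50-100 epochs** and extend only if the validation metrics are still improving\n",
    "- 📊 **Monitor mAP50** - values above roughly 0.8 are a common rule-of-thumb target, though what counts as good depends on the dataset\n",
    "- 🔄 **Use early stopping** (`patience`) to limit overfitting and wasted epochs\n",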
    "- 📈 **Check training plots** for convergence\n",
    "\n",
    "### **Model Performance:**\n",
    "- ⚡ **GPU training** is typically 10-20x faster than CPU, depending on model and batch size\n",
    "- 🎯 **YOLOv8n** is fast and good for real-time detection\n",
    "- 📱 **YOLOv8s/m** for better accuracy if speed allows\n",
    "\n",
    "### **Dataset Tips:**\n",
    "- 📸 **More diverse images** = better generalization\n",
    "- 🏷️ **Proper labeling** is crucial for accuracy\n",
    "- ⚖️ **Balanced classes** reduce bias toward the most frequent labels\n",
    "\n",
    "### **Deployment:**\n",
    "- 💻 **Local deployment** for privacy-sensitive applications\n",
    "- ☁️ **Cloud deployment** for scalability\n",
    "- 📱 **Edge deployment** for real-time applications\n",
    "\n",
    "---\n",
    "**🎉 Congratulations!** You've successfully trained a PPE detection model using GPU acceleration!\n",
    "\n",
    "**Next Steps:**\n",
    "1. Download your trained model\n",
    "2. Test it on your local environment\n",
    "3. Integrate it into your CCTV surveillance system\n",
    "4. Monitor and improve performance over time"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "gpuType": "T4",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}