


# YOLOv8 Prawn Keypoint Detection Training Pipeline

**Author:** Gil Ben-Or  
**Purpose:** Train a YOLOv8 pose estimation model for prawn keypoint detection  
**Environment:** Google Colab with GPU acceleration  
**Last Updated:** December 2024  

## 🎯 Objective
Train a YOLOv8 pose estimation model to detect anatomical keypoints on prawns for:
- **Morphometric analysis** - Measuring prawn body lengths and carapace dimensions
- **Population studies** - Automated analysis of prawn populations
- **Research applications** - Supporting marine biology research

## 🔄 Training Pipeline Overview

1. **🛠️ Environment Setup** - Install dependencies and set up the Colab environment
2. **📊 W&B Integration** - Configure experiment tracking and model versioning  
3. **📁 Dataset Management** - Download the dataset from the Roboflow workspace
4. **🏗️ Model Configuration** - Set up the YOLOv8 pose model with custom callbacks
5. **🚀 Training Execution** - Train the model with optimized hyperparameters
6. **📈 Results Logging** - Track metrics and save best checkpoints to W&B

## 🔧 Key Features
- **Custom W&B Callback**: Automatically saves best checkpoints during training
- **Comprehensive Logging**: Tracks all training metrics, validation results, and model artifacts
- **Error Analysis**: Computes counting errors (RMSE, MAE, MRD) for validation
- **Visualization**: Logs prediction visualizations and comparison plots
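
The counting errors named above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `counting_errors` is hypothetical, and MRD is assumed here to mean the mean relative difference |predicted - true| / true, averaged over images with a non-zero true count.

```python
import math


def counting_errors(true_counts, pred_counts):
    """Return MAE, RMSE and MRD for paired per-image counts (illustrative helper)."""
    if len(true_counts) != len(pred_counts) or not true_counts:
        raise ValueError("count lists must be non-empty and of equal length")

    diffs = [pred - true for true, pred in zip(true_counts, pred_counts)]

    # Mean absolute error: average size of the miss, in prawns per image
    mae = sum(abs(d) for d in diffs) / len(diffs)

    # Root mean squared error: penalises large misses more strongly than MAE
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # Mean relative difference (assumed definition); images with a true
    # count of zero are skipped to avoid division by zero
    relative = [abs(d) / t for d, t in zip(diffs, true_counts) if t > 0]
    mrd = sum(relative) / len(relative) if relative else float("nan")

    return {"mae": mae, "rmse": rmse, "mrd": mrd}


errors = counting_errors(true_counts=[10, 20], pred_counts=[12, 16])
print(errors)
```

With the two example images the misses are +2 and -4, so MAE is 3.0, RMSE is the square root of 10, and MRD is 0.2.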

---

In [None]:
"""
Install core dependencies for YOLOv8 keypoint detection training.

This cell installs:
- ultralytics: YOLOv8 implementation with pose estimation capabilities
- wandb: Weights & Biases for experiment tracking and model versioning
- roboflow: Dataset management and downloading
"""

# Install YOLOv8 framework for pose estimation
%pip install ultralytics

# Install Weights & Biases for experiment tracking
%pip install wandb  

# Install Roboflow for dataset management
%pip install roboflow

print("✅ All dependencies installed successfully!")



## 1. 🔑 Secure API Key Configuration
Configure the Weights & Biases and Roboflow API keys without exposing them in the notebook.

### Key Points:
- **Security Best Practices**: Avoid hardcoding API keys directly in the notebook.
- **Use Colab Secrets**: Leverage Google Colab's secret management for secure storage.
- **Regular Rotation**: Regularly update and rotate API keys to maintain security.

### Setup Instructions:
1. Click the key icon (🔑) in the Colab sidebar to manage secrets.
2. Add the following secrets:
   - `WANDB_API_KEY`: Your Weights & Biases API key from wandb.ai/authorize.
   - `ROBOFLOW_API_KEY`: Your Roboflow API key from app.roboflow.com.
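
Outside Colab, the same two keys can be read from environment variables instead of Colab secrets. The sketch below assumes that setup; `resolve_api_key` and `mask_key` are hypothetical helper names, not part of any library. It shows one way to fail early on a missing key and to avoid printing a full key in notebook output.

```python
import os


def resolve_api_key(name, env=None):
    """Return the key stored under `name`, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise KeyError(f"{name} is not set; add it to Colab secrets or the environment")
    return value


def mask_key(value, visible=4):
    """Show only the last few characters so logs never contain the full key."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Demo with a fake environment mapping; the key value is made up
fake_env = {"WANDB_API_KEY": "  abcdef1234  "}
key = resolve_api_key("WANDB_API_KEY", fake_env)
print(mask_key(key))
```

Passing the environment as an argument keeps the helper easy to test without touching real credentials.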

The following cell will attempt to load these keys securely and configure the environment for training.



In [None]:
"""
Secure API key configuration using Google Colab secrets.

IMPORTANT SECURITY NOTES:
- Never hardcode API keys in notebooks
- Use Colab secrets or environment variables
- Rotate API keys regularly for security

Setup Instructions:
1. In Colab, click the key icon (🔑) in the left sidebar
2. Add secrets:
   - WANDB_API_KEY: Your W&B API key from wandb.ai/authorize
   - ROBOFLOW_API_KEY: Your Roboflow API key from app.roboflow.com
"""

import os
from google.colab import userdata

try:
    # Get API keys from Colab secrets
    WANDB_API_KEY = userdata.get('WANDB_API_KEY')
    ROBOFLOW_API_KEY = userdata.get('ROBOFLOW_API_KEY')
    
    # Set W&B API key as environment variable
    os.environ['WANDB_API_KEY'] = WANDB_API_KEY
    
    print("✅ API keys loaded successfully from Colab secrets")
    print("🔐 W&B API key configured")
    print("🔐 Roboflow API key configured")
    
except Exception as e:
    print("❌ Error loading API keys from Colab secrets")
    print("Please ensure you've added WANDB_API_KEY and ROBOFLOW_API_KEY to Colab secrets")
    print(f"Error: {e}")
    
    # Fallback: manual input (less secure, only for testing).
    # getpass hides the typed key so it is not stored in the cell output.
    from getpass import getpass

    print("\n🚨 Fallback: Manual API key input (NOT RECOMMENDED for production)")
    WANDB_API_KEY = getpass("Enter your W&B API key: ")
    ROBOFLOW_API_KEY = getpass("Enter your Roboflow API key: ")
    os.environ['WANDB_API_KEY'] = WANDB_API_KEY


## 2. 🔧 Import Core Libraries & Setup Custom Callback

Import required libraries and load the custom W&B callback for advanced checkpoint management.
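
The callback mechanism can be illustrated with a small stand-in. The class below is hypothetical and is not the repository's `YOLOv8WeightsBiasesIntegrationCallback`; it only shows the general shape such a callback might take: a mapping from Ultralytics event names to functions that receive the trainer. It assumes the trainer exposes `epoch` and `fitness` attributes, and simulates the trainer with `SimpleNamespace` so the sketch runs without a GPU or dataset.

```python
from types import SimpleNamespace


class BestCheckpointTracker:
    """Hypothetical stand-in for a logging callback; tracks the best epoch so far."""

    def __init__(self):
        self.best_fitness = float("-inf")
        self.best_epoch = None
        self.history = []

    def on_fit_epoch_end(self, trainer):
        # Called once per epoch after validation; record and compare fitness
        fitness = trainer.fitness
        self.history.append((trainer.epoch, fitness))
        if fitness is not None and fitness > self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = trainer.epoch

    @property
    def callbacks(self):
        # Event name -> handler, the shape consumed by model.add_callback(event, fn)
        return {"on_fit_epoch_end": self.on_fit_epoch_end}


# Simulate three epochs with made-up fitness values
tracker = BestCheckpointTracker()
for epoch, fitness in enumerate([0.30, 0.45, 0.41]):
    fake_trainer = SimpleNamespace(epoch=epoch, fitness=fitness)
    tracker.callbacks["on_fit_epoch_end"](fake_trainer)

print(tracker.best_epoch, tracker.best_fitness)  # 1 0.45
```

A real callback would upload the checkpoint at the point where this sketch only updates `best_epoch`.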


In [None]:
"""
Import libraries and setup custom W&B callback.

This cell:
1. Downloads the custom W&B callback script from this repository
2. Imports all required libraries for training
3. Sets up the custom callback for advanced checkpoint management

The yolo_wandb_callback module provides:
- Automatic saving of best checkpoints to W&B artifacts
- Comprehensive logging of training metrics  
- Validation result visualization and error analysis
- Custom callback integration with YOLOv8 training loop
"""

import wandb
from pathlib import Path
from ultralytics import YOLO

# Download the custom W&B callback script from this repository
print("📥 Setting up custom W&B callback script...")
callback_file = Path("yolo_wandb_callback.py")


def callback_available():
    """Return True if the script exists and is non-empty (a failed wget can leave an empty file)."""
    return callback_file.exists() and callback_file.stat().st_size > 0


# Method 1: Download from this GitHub repository.
# Shell commands (!) do not raise Python exceptions, so success is checked via the file itself.
!wget -q -O yolo_wandb_callback.py https://raw.githubusercontent.com/gbo999/counting_research_algorithms/main/notebooks/training/yolo_wandb_callback.py

if callback_available():
    print("✅ Downloaded yolo_wandb_callback.py from repository")
else:
    print("⚠️ Could not download from GitHub, trying alternative methods...")

    # Method 2: Copy from Google Drive (requires Drive to be mounted and the repo uploaded)
    drive_path = "/content/drive/MyDrive/counting_research_algorithms/notebooks/training/yolo_wandb_callback.py"
    if Path(drive_path).exists():
        !cp "{drive_path}" .
        print("✅ Copied yolo_wandb_callback.py from Google Drive")
    else:
        # Method 3: Clone the entire repository
        print("📦 Cloning repository to access callback script...")
        !git clone https://github.com/gbo999/counting_research_algorithms.git temp_repo
        !cp temp_repo/notebooks/training/yolo_wandb_callback.py .
        !rm -rf temp_repo
        print("✅ Extracted yolo_wandb_callback.py from cloned repository")

# Verify the file exists and import
if callback_available():
    print("✅ yolo_wandb_callback.py is available in current directory")

    # Import the custom W&B callback (using new class name)
    from yolo_wandb_callback import YOLOv8WeightsBiasesIntegrationCallback
    print("✅ Custom W&B callback imported successfully!")
    print("🔧 Callback features loaded: automatic checkpointing, error analysis, visualization")
    print("📝 Libraries imported and callback ready for training")
else:
    print("❌ yolo_wandb_callback.py not found!")
    print("Please manually upload the file from:")
    print("https://github.com/gbo999/counting_research_algorithms/blob/main/notebooks/training/yolo_wandb_callback.py")


## 3. 🔗 Mount Google Drive & Access Dataset

Connect to Google Drive to access training data and configuration files.
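
Checking that the folder exists does not confirm that its contents are usable. The sketch below goes one step further under an assumed layout: a `data.yaml` at the dataset root plus `images` and `labels` folders for each split, with splits named `train` and `valid`. Both the layout and the helper name `missing_dataset_entries` are assumptions for illustration; adjust them to the actual dataset.

```python
import tempfile
from pathlib import Path


def missing_dataset_entries(root, splits=("train", "valid")):
    """List expected files/folders that are absent under `root` (assumed YOLO-style layout)."""
    root = Path(root)
    expected = [root / "data.yaml"]
    for split in splits:
        expected += [root / split / "images", root / split / "labels"]
    return [path.relative_to(root).as_posix() for path in expected if not path.exists()]


# Demo on a throwaway directory that only contains the training split
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "data.yaml").write_text("names: [prawn]\n")
    (demo_root / "train" / "images").mkdir(parents=True)
    (demo_root / "train" / "labels").mkdir(parents=True)
    missing = missing_dataset_entries(demo_root)

print(missing)  # ['valid/images', 'valid/labels']
```

An empty list means every expected entry was found.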


In [None]:
# Mount Google Drive to access the dataset
from google.colab import drive
drive.mount('/content/drive')

# Define the path to the dataset in Google Drive
dataset_path = '/content/drive/MyDrive/your-dataset-path'

# Verify the dataset path exists
if Path(dataset_path).exists():
    print(f"✅ Dataset found at: {dataset_path}")
else:
    print("❌ Dataset path not found! Please check the path and try again.")


## 4. 📊 Initialize Weights & Biases Experiment

Set up W&B for experiment tracking and download a pre-trained model checkpoint from a previous run.
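
A downloaded artifact is a directory, so the checkpoint file inside it still has to be located before it can be loaded. The sketch below shows one way to do that; `find_checkpoint` is a hypothetical helper, and the file names `best.pt` and `last.pt` are assumed based on the weights names Ultralytics writes during training. The artifact directory is simulated with a temporary folder.

```python
import tempfile
from pathlib import Path


def find_checkpoint(artifact_dir, preferred=("best.pt", "last.pt")):
    """Return the most suitable .pt file under `artifact_dir` (illustrative helper)."""
    candidates = sorted(Path(artifact_dir).rglob("*.pt"))
    if not candidates:
        raise FileNotFoundError(f"No .pt checkpoint found under {artifact_dir}")

    # Prefer well-known names in order; otherwise fall back to the first match
    for name in preferred:
        for path in candidates:
            if path.name == name:
                return path
    return candidates[0]


# Demo with a simulated artifact directory
with tempfile.TemporaryDirectory() as tmp:
    weights_dir = Path(tmp) / "weights"
    weights_dir.mkdir()
    for name in ("epoch10.pt", "last.pt", "best.pt"):
        (weights_dir / name).touch()
    chosen = find_checkpoint(tmp).name

print(chosen)  # best.pt
```

In the notebook, the returned path could be passed to `YOLO(...)` in place of a stock weights file.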


In [None]:
"""
Initialize W&B experiment and download pre-trained segmentation model.

This cell:
1. Initializes a new W&B run for experiment tracking
2. Downloads a previous segmentation model checkpoint for transfer learning
3. Sets up the experiment environment for keypoint detection training
"""

# Initialize Weights & Biases run
run = wandb.init(
    project="your_project_name",
    name="your_experiment_name",
    tags=["your", "tags", "here"],
    notes="Your experiment notes here"
)

# Download pre-trained segmentation model artifact for transfer learning
print("📥 Downloading pre-trained model checkpoint...")
artifact = run.use_artifact(
    'your_artifact_name:version', 
    type='model'
)
artifact_dir = artifact.download()

print(f"✅ Model artifact downloaded to: {artifact_dir}")
print("🔗 W&B experiment initialized successfully!")


## 5. 📁 Dataset Management with Roboflow

Download the prawn keypoint detection dataset from Roboflow workspace.
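
For orientation, each line of a YOLO pose label file describes one object: a class index, a normalized bounding box (center x, center y, width, height), then the keypoints. The sketch below parses one such line, assuming three values per keypoint (x, y, visibility); datasets exported with only x and y would use `kpt_dims=2`. The function name and the sample line are illustrative, not taken from the prawn dataset.

```python
def parse_pose_label(line, kpt_dims=3):
    """Split one YOLO pose label line into class id, box and keypoints."""
    values = line.split()
    if len(values) < 5:
        raise ValueError("expected at least a class id and four box values")

    class_id = int(values[0])
    box = tuple(float(v) for v in values[1:5])  # cx, cy, w, h (normalized)

    flat = [float(v) for v in values[5:]]
    if len(flat) % kpt_dims:
        raise ValueError(f"keypoint values are not a multiple of {kpt_dims}")

    # Group the flat list into (x, y[, visibility]) tuples
    keypoints = [tuple(flat[i:i + kpt_dims]) for i in range(0, len(flat), kpt_dims)]
    return class_id, box, keypoints


# Made-up label line with two keypoints
sample = "0 0.5 0.5 0.2 0.4 0.45 0.30 2 0.55 0.70 1"
class_id, box, keypoints = parse_pose_label(sample)
print(class_id, box, keypoints)
```

The number of keypoints per object should match the `kpt_shape` entry in the dataset's `data.yaml`.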


In [None]:
"""
Download prawn keypoint detection dataset from Roboflow.

This cell:
1. Authenticates with Roboflow using API key
2. Accesses the prawn morphotypes project
3. Downloads the dataset in YOLOv5 format (compatible with YOLOv8)
4. Provides access to training/validation splits and annotations
"""

from roboflow import Roboflow

# Initialize Roboflow client with API key
# Note: In production, store API key as environment variable
rf = Roboflow(api_key=ROBOFLOW_API_KEY)

# Access the prawn morphotypes project
print("🔗 Connecting to Roboflow workspace...")
project = rf.workspace("workspace").project("project")

# Download dataset version 4 in YOLOv5 format
print("📥 Downloading dataset...")
dataset = project.version(4).download("yolov5")

print("✅ Dataset downloaded successfully!")
print(f"📂 Dataset location: {dataset.location}")
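# The Roboflow dataset object exposes its location; class names and sample
# counts are read from the exported files (assumes the <split>/labels layout).
import yaml

with open(Path(dataset.location) / "data.yaml") as f:
    data_cfg = yaml.safe_load(f)
num_labels = len(list(Path(dataset.location).glob("*/labels/*.txt")))

print(f"🏷️ Classes: {data_cfg.get('names')}")
print(f"📊 Dataset stats: {num_labels} total label files")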


## 6. 🏗️ Model Configuration & W&B Callback Setup

Load YOLOv8 pose estimation model and configure custom W&B logging.


In [None]:
"""
Initialize YOLOv8 pose estimation model and setup custom W&B callbacks.

This cell:
1. Loads pre-trained YOLOv8-large pose estimation model
2. Configures custom W&B callback for advanced logging
3. Sets up automatic checkpoint saving and artifact management
4. Enables comprehensive training metrics tracking
"""

# Load YOLOv8-large pose estimation model
print("🤖 Loading YOLOv8-large pose estimation model...")
model = YOLO("yolov8l-pose.pt")  # Pre-trained on COCO keypoints

# Configure custom W&B callback with project settings
print("⚙️ Setting up W&B callback for advanced logging...")
wandb_logger = YOLOv8WeightsBiasesIntegrationCallback(
    model,
    project='YOUR_PROJECT_NAME',  # Placeholder for project name
    run_name='YOUR_RUN_NAME',  # Placeholder for run name
    tags=['YOUR_TAG1', 'YOUR_TAG2', 'YOUR_TAG3']  # Placeholder for tags
)

# Attach callback events to model
print("🔗 Attaching W&B callbacks to model...")
for event, callback_fn in wandb_logger.callbacks.items():
    model.add_callback(event, callback_fn)

print("✅ Model and callbacks configured successfully!")
print(f"📊 Model: {model.model_name}")
print(f"🎯 Task: Pose estimation (keypoint detection)")
print(f"📈 W&B Project: {wandb_logger.run.project}")
print(f"🏷️ Run Name: {wandb_logger.run.name}")


## 7. 🚀 Training Execution

Execute the training process with optimized hyperparameters for prawn keypoint detection.
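
The effect of the early stopping patience can be illustrated with a small simulation. This is a simplified sketch of the idea, not Ultralytics' own code: training stops once the number of epochs since the best fitness value reaches `patience`. The class and function names are hypothetical, and the fitness values are made up.

```python
class PatienceStopper:
    """Stop when fitness has not improved for `patience` consecutive epochs."""

    def __init__(self, patience=50):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def step(self, epoch, fitness):
        # Ties count as an improvement so an early plateau does not stop training
        if fitness >= self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        return (epoch - self.best_epoch) >= self.patience


def stopping_epoch(fitness_per_epoch, patience):
    """Return the epoch at which training would stop, or None if it runs to the end."""
    stopper = PatienceStopper(patience)
    for epoch, fitness in enumerate(fitness_per_epoch):
        if stopper.step(epoch, fitness):
            return epoch
    return None


# Best fitness is reached at epoch 1; with patience=3 training stops at epoch 4
stopped_at = stopping_epoch([0.1, 0.5, 0.4, 0.4, 0.3, 0.2], patience=3)
print(stopped_at)  # 4
```

With `epochs=300` and `patience=50`, a run that peaks early can therefore finish well before the maximum epoch count.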


In [None]:
"""
Execute YOLOv8 keypoint detection training with optimized hyperparameters.

Training Configuration:
- Dataset: Google Drive mounted dataset with prawn keypoint annotations  
- Epochs: 300 (with early stopping patience=50)
- Image Size: 640x640 pixels
- Batch Size: 8 (optimized for Colab GPU memory)
- Seed: 42 (for reproducible results)

The custom W&B callback will automatically:
- Log training metrics and validation results
- Save best model checkpoints as W&B artifacts
- Generate prediction visualizations and error analysis
- Track model performance throughout training
"""

# Define dataset configuration path
dataset_config = "/content/drive/MyDrive/colab experiments/to colab only 31-12 segment with 76/data.yaml"

# Verify dataset config exists; stop here rather than letting training fail later
if os.path.exists(dataset_config):
    print(f"✅ Dataset configuration found: {dataset_config}")
else:
    raise FileNotFoundError(
        f"Dataset configuration not found: {dataset_config}. "
        "Please verify the Google Drive path is correct."
    )

# Execute training with optimized hyperparameters
print("🚀 Starting YOLOv8 keypoint detection training...")
print("⏱️ This will take several hours depending on GPU availability")

results = model.train(
    data=dataset_config,
    epochs=300,           # Maximum training epochs
    imgsz=640,            # Input image size
    seed=42,              # Random seed for reproducibility
    batch=8,              # Batch size optimized for Colab
    patience=50,          # Early stopping patience
    save=True,            # Save checkpoints
    save_period=10,       # Save checkpoint every 10 epochs
    verbose=True,         # Verbose logging
    # device is left unset: 'auto' is not a documented value, and the default
    # selects the GPU when one is available
)

print("🎉 Training completed successfully!")
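# Metrics are exposed through results.results_dict; (B) = bounding box, (P) = pose/keypoints
print(f"📊 Box mAP50: {results.results_dict['metrics/mAP50(B)']:.3f}")
print(f"📊 Pose mAP50: {results.results_dict['metrics/mAP50(P)']:.3f}")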
print(f"💾 Model saved to: {results.save_dir}")
print(f"🔗 W&B Run: {wandb_logger.run.url}")


## 8. 📈 Training Results & Model Artifacts

Review training results and ensure all model artifacts are properly saved to W&B.
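
For a pose model, the metrics dictionary mixes bounding-box and keypoint scores, distinguished by a `(B)` or `(P)` suffix on the key. The sketch below separates the two groups so the keypoint metrics can be read on their own. The helper name `split_metrics` is hypothetical and the numbers are made up for illustration; they are not results from this training run.

```python
def split_metrics(results_dict):
    """Group metric entries into box, pose and other, based on the key suffix."""
    groups = {"box": {}, "pose": {}, "other": {}}
    for key, value in results_dict.items():
        if key.endswith("(B)"):
            groups["box"][key] = float(value)
        elif key.endswith("(P)"):
            groups["pose"][key] = float(value)
        else:
            groups["other"][key] = float(value)
    return groups


# Made-up example values, not real training results
example = {
    "metrics/mAP50(B)": 0.91,
    "metrics/mAP50-95(B)": 0.70,
    "metrics/mAP50(P)": 0.85,
    "metrics/mAP50-95(P)": 0.62,
    "fitness": 0.66,
}
groups = split_metrics(example)
for name, metrics in groups.items():
    print(name, metrics)
```

For keypoint quality, the `pose` group is the one to watch; the `box` group only describes how well whole prawns are localized.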


In [None]:
"""
Post-training analysis and artifact management.

This cell:
1. Displays final training metrics and model performance
2. Verifies W&B artifacts were saved correctly
3. Provides links to training visualizations and logs
4. Summarizes the training session results
"""

# Display training summary
print("🎯 TRAINING SUMMARY")
print("=" * 50)
print(f"📊 Final Metrics:")
# Ultralytics exposes the final metrics as results.results_dict
if hasattr(results, 'results_dict'):
    for metric_name, value in results.results_dict.items():
        if isinstance(value, (int, float)):
            print(f"   {metric_name}: {value:.4f}")

print(f"\n💾 Model Artifacts:")
print(f"   Best weights: {results.save_dir}/weights/best.pt")
print(f"   Last weights: {results.save_dir}/weights/last.pt")

# Check W&B artifacts
print(f"\n🔗 Weights & Biases:")
print(f"   Project: {wandb_logger.run.project}")
print(f"   Run ID: {wandb_logger.run.id}")
print(f"   Run URL: {wandb_logger.run.url}")

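# List artifacts logged by this run. logged_artifacts() is a method of the
# W&B public API run object, so the run is looked up by its path first
# (assumes wandb_logger.run is a standard wandb run).
api_run = wandb.Api().run(wandb_logger.run.path)
artifacts = list(api_run.logged_artifacts())
if artifacts:
    print(f"   Logged Artifacts: {len(artifacts)}")
    for artifact in artifacts:
        print(f"     - {artifact.name} ({artifact.type})")
else:
    print("   No artifacts found (may still be uploading)")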

print(f"\n✅ Training completed successfully!")
print(f"🚀 Best model checkpoint saved and uploaded to W&B")
print(f"📈 View detailed training logs and visualizations at: {wandb_logger.run.url}")

# Optional: Finish W&B run
# wandb.finish()
