# DSA RL Training in Google Colab

This notebook sets up and runs the DSA RL training pipeline.

## ⚠️ IMPORTANT: GPU Required

**You MUST enable GPU before running this notebook:**
1. Go to: **Runtime → Change runtime type**
2. Set **Hardware accelerator: GPU**
3. Click **Save**
4. The notebook will verify GPU is available before proceeding

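**Training will not run without a GPU.** The GPU check in Step 4 raises an error if none is found, and CPU training would be too slow to finish before the Colab session times out.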


## Step 0: Check CUDA Version (Optional)

Run this first to verify your CUDA version before installing PyTorch.


In [None]:
# Check CUDA version and GPU availability
import subprocess
import sys

print("Checking GPU and CUDA availability...")
print("=" * 50)

try:
    # Try to get GPU info (using valid nvidia-smi fields)
    result = subprocess.run(['nvidia-smi', '--query-gpu=name,driver_version,memory.total', 
                            '--format=csv,noheader'], 
                           capture_output=True, text=True, timeout=5)
    
    if result.returncode == 0 and result.stdout.strip():
        print("✅ GPU Information:")
        lines = result.stdout.strip().split('\n')
        for i, line in enumerate(lines, 1):
            parts = [p.strip() for p in line.split(',')]
            if len(parts) >= 3:
                print(f"   GPU {i}: {parts[0]}")
                print(f"   Driver Version: {parts[1]}")
                print(f"   Memory: {parts[2]}")
                print()
        
        # Get CUDA version from nvidia-smi header (it's shown there)
        cuda_result = subprocess.run(['nvidia-smi'], capture_output=True, text=True, timeout=5)
        if cuda_result.returncode == 0:
            for line in cuda_result.stdout.split('\n'):
                if 'CUDA Version' in line:
                    print(f"   {line.strip()}")
                    break
    else:
        print("⚠️  nvidia-smi returned no output")
        print("   This might mean GPU runtime is not enabled")
        print("   Please enable GPU: Runtime → Change runtime type → GPU")
        
except FileNotFoundError:
    print("❌ nvidia-smi not found")
    print("   GPU runtime may not be enabled")
    print("   Please enable GPU: Runtime → Change runtime type → GPU")
except subprocess.TimeoutExpired:
    print("⚠️  nvidia-smi timed out")
except Exception as e:
    print(f"⚠️  Could not check GPU: {e}")
    print("   Will verify after PyTorch installation")

print("=" * 50)
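
The cell above prints the whole header line that contains `CUDA Version`. If only the version number is needed, for example when deciding which PyTorch wheel index to use, a small parser can extract it. This is a minimal sketch: `parse_cuda_version` is a hypothetical helper, and the sample string is a made-up stand-in for one line of `nvidia-smi` output, not text captured from a real run.

```python
import re
from typing import Optional


def parse_cuda_version(smi_output: str) -> Optional[str]:
    """Return the value that follows 'CUDA Version:' in the text, or None."""
    match = re.search(r"CUDA Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_output)
    return match.group(1) if match else None


# Made-up sample line for illustration only
sample_line = "| Driver Version: 000.00    CUDA Version: 12.4    |"
print(parse_cuda_version(sample_line))
print(parse_cuda_version("nvidia-smi: command not found"))
```

In the notebook, the input would be `cuda_result.stdout` from the cell above.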


## Step 1: Install Dependencies

**Note:** We use `pip` (not conda) because:
- Colab doesn't have conda pre-installed
- pip needs no extra setup in Colab
- PyTorch wheels from pip bundle their own CUDA runtime libraries, so the cu121 wheels run on Colab's CUDA 12.4 driver (Step 0 shows the version your runtime actually reports)


In [None]:
# Install PyTorch with CUDA support (Colab uses CUDA 12.4)
# Using cu121 which is compatible with CUDA 12.4
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Install other dependencies
# Quote the version specifiers so the shell does not treat ">" as output redirection
!pip install "numpy>=1.21.0" "scipy>=1.7.0" "opencv-python>=4.5.0" "matplotlib>=3.4.0"
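
After the install cell finishes, it can be worth confirming that each package is actually present in the environment. The sketch below uses the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper name, and the package list simply mirrors the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


packages = ["torch", "numpy", "scipy", "opencv-python", "matplotlib"]
for name, ver in installed_versions(packages).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

The names passed in are distribution names as used with `pip install` (for example `opencv-python`), not import names (`cv2`).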


## Step 2: Clone Repository

In [None]:
# Clone the repository
# To use your own fork, replace MahsaAbadian with your GitHub username
!git clone https://github.com/MahsaAbadian/DSA-RL-Tracker.git

# If the repository is already cloned, skip this cell (git clone fails when the target directory already exists)


## Step 3: Navigate to Experiment1 Directory


In [None]:
import os
import sys

# Navigate to Experiment1 directory
os.chdir('/content/DSA-RL-Tracker/Experiment1')
print(f"Current directory: {os.getcwd()}")

# Add to Python path (both Experiment1 and src directories)
sys.path.insert(0, '/content/DSA-RL-Tracker/Experiment1')
sys.path.insert(0, '/content/DSA-RL-Tracker/Experiment1/src')

# List files to verify
print("\nFiles in Experiment1:")
!ls -la
print("\nFiles in src:")
!ls -la src/


## Step 4: Verify GPU Setup


In [None]:
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
    print(f"   CUDA version: {torch.version.cuda}")
    print(f"   GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    print("\n✅ GPU is ready for training!")
else:
    print("\n❌ ERROR: GPU is not available!")
    print("\n⚠️  Training requires GPU acceleration.")
    print("   Please enable GPU:")
    print("   1. Go to: Runtime → Change runtime type")
    print("   2. Set Hardware accelerator: GPU")
    print("   3. Click Save")
    print("   4. Re-run this cell")
    print("\n   Training will be extremely slow on CPU and may time out.")
    raise RuntimeError("GPU not available. Please enable GPU runtime before continuing.")


## Step 5: Verify Required Files


In [None]:
# Check if all required files exist
required_files = [
    'src/train.py',
    'src/models.py',
    'src/curve_generator.py',
    'curve_config.json'
]

print("Checking required files...")
all_exist = True
for file in required_files:
    if os.path.exists(file):
        print(f"✅ {file}")
    else:
        print(f"❌ {file} - MISSING!")
        all_exist = False

if all_exist:
    print("\n✅ All required files found!")
else:
    print("\n❌ Some files are missing. Please check your repository.")
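
The check above only prints a message, so a later cell can still run with files missing. One option is to fail early instead. The following is a minimal sketch, not part of the repository: `find_missing` and `require_files` are hypothetical helpers, and the demonstration uses a throwaway directory rather than the real `Experiment1` folder.

```python
import os
import tempfile
from typing import Iterable, List


def find_missing(paths: Iterable[str], root: str = ".") -> List[str]:
    """Return the entries of `paths` that do not exist under `root`."""
    return [p for p in paths if not os.path.exists(os.path.join(root, p))]


def require_files(paths: Iterable[str], root: str = ".") -> None:
    """Raise FileNotFoundError naming every missing path, if there are any."""
    missing = find_missing(paths, root)
    if missing:
        raise FileNotFoundError("Missing required files: " + ", ".join(missing))


# Demonstration in a temporary directory that contains only one of two files
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "curve_config.json"), "w").close()
    print(find_missing(["curve_config.json", "src/train.py"], root=tmp))
```

In the notebook, `require_files(required_files)` would stop execution before the training cell if anything is absent.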


## Step 6: Run Training

**Note:** Training will take several hours. The script saves checkpoints every 2000 episodes, so you can resume if interrupted.
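
Checkpoints written under `runs/` live on the Colab VM's local disk, which is discarded when the runtime is recycled. Resuming after a disconnect therefore only works if the run directory was copied somewhere persistent first. A minimal sketch of such a copy follows; `backup_run` is a hypothetical helper, and the destination is assumed to be any persistent folder. The demonstration uses temporary directories and a placeholder file so it runs anywhere.

```python
import os
import shutil
import tempfile


def backup_run(run_dir: str, dest_root: str) -> str:
    """Copy run_dir into dest_root under the same folder name; return the copy's path."""
    name = os.path.basename(os.path.normpath(run_dir))  # normpath drops a trailing slash
    dest = os.path.join(dest_root, name)
    # dirs_exist_ok=True lets a repeated backup update an earlier copy (Python 3.8+)
    shutil.copytree(run_dir, dest, dirs_exist_ok=True)
    return dest


with tempfile.TemporaryDirectory() as src_root, tempfile.TemporaryDirectory() as dst_root:
    ckpt_dir = os.path.join(src_root, "example_run", "checkpoints")
    os.makedirs(ckpt_dir)
    with open(os.path.join(ckpt_dir, "ckpt_2000.pth"), "wb") as f:
        f.write(b"placeholder")  # stands in for a real checkpoint
    copied = backup_run(os.path.join(src_root, "example_run"), dst_root)
    print(sorted(os.listdir(os.path.join(copied, "checkpoints"))))
```

In Colab, a typical destination would be a folder under `/content/drive/MyDrive` after calling `drive.mount('/content/drive')` from `google.colab`; the exact folder is up to you.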


In [None]:
# Import training function
from src.train import run_unified_training

# Start training
# This will:
# - Create a timestamped run directory in runs/
# - Train through 3 curriculum stages
# - Save checkpoints every 2000 episodes
# - Save final weights after each stage

print("🚀 Starting training...")
print("=" * 50)

run_unified_training(
    run_dir=None,  # Auto-create timestamped directory
    base_seed=42,  # For reproducibility
    clean_previous=False,  # Keep previous runs
    experiment_name="colab_training",  # Name for this experiment
    resume_from=None,  # Start fresh (or provide checkpoint path to resume)
    curve_config_path="curve_config.json"  # Config file
)

print("\n" + "=" * 50)
print("✅ Training complete!")


## Step 7: Monitor Training Progress (Optional)

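Run this cell to inspect the latest run's log and checkpoints. Colab executes one cell at a time, so while the Step 6 cell is still running this cell waits in the queue; it runs once training finishes or is interrupted.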


In [None]:
import glob

# Find the latest run directory
run_dirs = glob.glob('runs/*/')
if run_dirs:
    latest_run = max(run_dirs, key=os.path.getctime)
    log_file = os.path.join(latest_run, 'logs', 'training.log')
    
    print(f"📊 Latest run: {latest_run}")
    
    if os.path.exists(log_file):
        print("\n=== Last 30 lines of training log ===")
        with open(log_file, 'r') as f:
            lines = f.readlines()
            for line in lines[-30:]:
                print(line.rstrip())
    else:
        print(f"Log file not found: {log_file}")
        
    # Show checkpoints
    checkpoint_dir = os.path.join(latest_run, 'checkpoints')
    if os.path.exists(checkpoint_dir):
        checkpoints = glob.glob(os.path.join(checkpoint_dir, '*.pth'))
        if checkpoints:
            print(f"\n📦 Checkpoints ({len(checkpoints)} total):")
            for ckpt in sorted(checkpoints)[-5:]:  # Show last 5
                size_mb = os.path.getsize(ckpt) / (1024 * 1024)
                print(f"   {os.path.basename(ckpt)} ({size_mb:.1f} MB)")
else:
    print("No runs found yet")
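
The cell above reads the entire log with `readlines()` before slicing the last 30 lines, which is fine for small logs. For a log that grows over several hours, a bounded buffer avoids holding every line in memory. This is a minimal sketch; `tail` is a hypothetical helper and the demonstration writes its own temporary log file.

```python
import os
import tempfile
from collections import deque
from typing import List


def tail(path: str, n: int = 30) -> List[str]:
    """Return the last n lines of a text file using a fixed-size buffer."""
    with open(path, "r", errors="replace") as f:
        # A deque with maxlen drops the oldest line each time a new one arrives
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "training.log")
    with open(log_path, "w") as f:
        f.writelines(f"line {i}\n" for i in range(100))
    print(tail(log_path, n=3))  # ['line 97', 'line 98', 'line 99']
```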


## Step 8: Download Results (Optional)

Download checkpoints and training results to your local machine.


In [None]:
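# glob and os are imported here so this cell also works if Step 7 was skipped
import glob
import os
import zipfile

from google.colab import files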

# Find latest run
run_dirs = glob.glob('runs/*/')
if run_dirs:
    latest_run = max(run_dirs, key=os.path.getctime)
    
    # Create zip file
    zip_path = f'{latest_run.rstrip("/")}.zip'
    print(f"📦 Creating zip file: {zip_path}")
    
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
        for root, dirs, filenames in os.walk(latest_run):
            for filename in filenames:
                file_path = os.path.join(root, filename)
                arcname = os.path.relpath(file_path, os.path.dirname(latest_run))
                zipf.write(file_path, arcname)
    
    # Download
    print(f"⬇️  Downloading: {zip_path}")
    files.download(zip_path)
    print("✅ Download complete!")
else:
    print("No runs found to download")
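
The manual `os.walk` loop above can also be expressed with the standard library's `shutil.make_archive`. The sketch below is one possible alternative, not the notebook's method: `zip_run` is a hypothetical helper, and unlike the loop above it keeps the run's folder name as the top-level entry inside the archive.

```python
import os
import shutil
import tempfile
import zipfile


def zip_run(run_dir: str) -> str:
    """Create <run_dir>.zip next to run_dir and return the archive path."""
    parent, name = os.path.split(os.path.abspath(os.path.normpath(run_dir)))
    # base_name is the archive path without ".zip"; base_dir makes the run's
    # folder the top-level entry inside the archive
    return shutil.make_archive(os.path.join(parent, name), "zip",
                               root_dir=parent, base_dir=name)


with tempfile.TemporaryDirectory() as tmp:
    run = os.path.join(tmp, "example_run")
    os.makedirs(os.path.join(run, "logs"))
    with open(os.path.join(run, "logs", "training.log"), "w") as f:
        f.write("episode 1\n")
    archive = zip_run(run + os.sep)  # a trailing separator is tolerated
    with zipfile.ZipFile(archive) as zf:
        print([n for n in zf.namelist() if not n.endswith("/")])
```

The returned path can be passed to `files.download` in the same way as `zip_path` above.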


## Resume Training (If Interrupted)

If your Colab session times out or training is interrupted, you can resume from the last checkpoint:


In [None]:
# Resume from the most recent checkpoint of the latest run
# (to resume from a specific file instead, set latest_checkpoint to its path)
import glob
import os

# Find latest checkpoint
run_dirs = glob.glob('runs/*/')
if run_dirs:
    latest_run = max(run_dirs, key=os.path.getctime)
    checkpoint_dir = os.path.join(latest_run, 'checkpoints')
    
    if os.path.exists(checkpoint_dir):
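        # Sort by modification time: a plain name sort can misorder numbered
        # files (for example, 10000 sorts before 2000 as text)
        checkpoints = sorted(glob.glob(os.path.join(checkpoint_dir, 'ckpt_*.pth')),
                             key=os.path.getmtime)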
        if checkpoints:
            latest_checkpoint = checkpoints[-1]
            print(f"Found checkpoint: {latest_checkpoint}")
            
            # Resume training
            from src.train import run_unified_training
            
            run_unified_training(
                resume_from=latest_checkpoint,
                curve_config_path="curve_config.json"
            )
        else:
            print("No checkpoints found")
    else:
        print("Checkpoint directory not found")
else:
    print("No runs found")
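
Which checkpoint counts as "latest" depends on how the list is ordered. If checkpoint file names end in an episode number (an assumption; the notebook only shows the `ckpt_*.pth` pattern), ordering by that number does not depend on file timestamps, which can change when files are copied between machines. A minimal sketch, with `latest_by_episode` as a hypothetical helper and made-up file names:

```python
import os
import re
from typing import Iterable, Optional


def latest_by_episode(paths: Iterable[str]) -> Optional[str]:
    """Return the path whose file name ends in the largest integer before '.pth'."""
    def episode(path: str) -> int:
        match = re.search(r"(\d+)\.pth$", os.path.basename(path))
        return int(match.group(1)) if match else -1  # unnumbered files rank last

    candidates = list(paths)
    return max(candidates, key=episode) if candidates else None


# Made-up names for illustration
names = ["ckpt_2000.pth", "ckpt_10000.pth", "ckpt_4000.pth"]
print(sorted(names)[-1])         # ckpt_4000.pth (text order)
print(latest_by_episode(names))  # ckpt_10000.pth (numeric order)
```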
