# Chess RL Agent - Training

**Runtime:** roughly 3-30 hours depending on the configuration chosen in cell 5 (GPU required)

**Requirements:** Colab Pro for background execution

Run cells 1-6 in order, then monitor progress with cell 7. Cell 8 is only needed to resume an interrupted run.

## 1. Verify GPU

In [None]:
import torch

print(f"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None'}")
print(f"CUDA: {torch.cuda.is_available()}")

if not torch.cuda.is_available():
    print("\n⚠ Go to Runtime → Change runtime type → Select GPU (L4 recommended)")
else:
    print("\n✓ GPU ready")

## 2. Mount Google Drive

In [None]:
from google.colab import drive

drive.mount('/content/drive')
!mkdir -p /content/drive/MyDrive/chess_checkpoints

print("✓ Drive mounted")

## 3. Clone Repository

In [None]:
# Remove existing repo
!rm -rf rl_chess_agent

# Clone latest version
!git clone https://github.com/Capacap/rl_chess_agent.git
%cd rl_chess_agent

# Verify latest commit
!git log --oneline -1
print("\n✓ Repository ready")

## 4. Install Dependencies

In [None]:
# Install chess library (Colab has torch, numpy, etc.)
!pip install -q -r requirements-colab.txt

# Verify imports (torch is re-imported so this cell also works on its own)
import chess
import torch
from model.network import ChessNet

print("✓ Dependencies installed")
print(f"  chess: {chess.__version__}")
print(f"  torch: {torch.__version__}")

## 5. Configure Training

In [None]:
import datetime

# Training parameters (modify as needed)
# Quick test: ITERATIONS=5, GAMES=25, SIMS=20 (~3 hours)
# Development: ITERATIONS=10, GAMES=50, SIMS=20 (~12 hours)
# Production: ITERATIONS=15, GAMES=100, SIMS=40 (~30 hours)

ITERATIONS = 10
GAMES_PER_ITER = 50
SIMULATIONS = 20
ARENA_GAMES = 20

# Advanced (usually don't need to change)
BATCH_SIZE = 256
EPOCHS = 5
LEARNING_RATE = 1e-3

# Auto-backup to Drive
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
CHECKPOINT_DIR = f"checkpoints/{timestamp}"
GDRIVE_BACKUP = "/content/drive/MyDrive/chess_checkpoints"

print("Configuration:")
print(f"  {ITERATIONS} iterations × {GAMES_PER_ITER} games")
print(f"  {SIMULATIONS} MCTS simulations/move")
print(f"  Checkpoints: {CHECKPOINT_DIR}")
print(f"  Drive backup: {GDRIVE_BACKUP}")
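
To compare the three presets listed in the comments above, the following sketch computes the number of self-play games and a rough count of MCTS simulations for each. The `PRESETS` dictionary and the `selfplay_workload` helper are hypothetical names, and the average of 80 moves per game is an assumed figure for illustration, not a measurement from this project. Arena games and network training are not included.

```python
# Presets copied from the comments in the configuration cell.
PRESETS = {
    "quick_test":  {"iterations": 5,  "games_per_iter": 25,  "simulations": 20},
    "development": {"iterations": 10, "games_per_iter": 50,  "simulations": 20},
    "production":  {"iterations": 15, "games_per_iter": 100, "simulations": 40},
}


def selfplay_workload(preset, avg_moves_per_game=80):
    """Return (total self-play games, approximate total MCTS simulations).

    avg_moves_per_game is an assumption used only to scale the estimate.
    """
    p = PRESETS[preset]
    games = p["iterations"] * p["games_per_iter"]
    simulations = games * avg_moves_per_game * p["simulations"]
    return games, simulations


for name in PRESETS:
    games, sims = selfplay_workload(name)
    print(f"{name:12s} {games:5d} games  ~{sims:,} simulations")
```

The absolute numbers depend on the assumed game length; the ratios between presets are the more useful part when deciding how much to scale a run.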

## 6. Launch Training

**Enable background execution:** Runtime → Background execution

In [None]:
# Start training (will run for hours)
!python train.py \
  --iterations {ITERATIONS} \
  --games-per-iter {GAMES_PER_ITER} \
  --simulations {SIMULATIONS} \
  --arena-games {ARENA_GAMES} \
  --batch-size {BATCH_SIZE} \
  --epochs {EPOCHS} \
  --lr {LEARNING_RATE} \
  --checkpoint-dir {CHECKPOINT_DIR} \
  --gdrive-backup-dir {GDRIVE_BACKUP}
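
To inspect the exact command before committing to a multi-hour run, the argument list can be assembled in Python first. This is a minimal sketch: `build_train_command` is a hypothetical helper, and it uses only the flags that appear in the launch cell above. It does not start training.

```python
import shlex


def build_train_command(iterations, games_per_iter, simulations, arena_games,
                        batch_size, epochs, lr, checkpoint_dir, gdrive_backup_dir):
    """Assemble the train.py argument list from the notebook's settings."""
    return [
        "python", "train.py",
        "--iterations", str(iterations),
        "--games-per-iter", str(games_per_iter),
        "--simulations", str(simulations),
        "--arena-games", str(arena_games),
        "--batch-size", str(batch_size),
        "--epochs", str(epochs),
        "--lr", str(lr),
        "--checkpoint-dir", checkpoint_dir,
        "--gdrive-backup-dir", gdrive_backup_dir,
    ]


# Example values mirror the defaults in the configuration cell;
# the timestamped directory name is made up for illustration.
cmd = build_train_command(10, 50, 20, 20, 256, 5, 1e-3,
                          "checkpoints/20240101_120000",
                          "/content/drive/MyDrive/chess_checkpoints")
print(shlex.join(cmd))
```

The resulting list could also be passed to `subprocess.run(cmd)` in place of the `!python` line, if a Python-level launch is preferred.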

## 7. Monitor Progress

Run this cell periodically to check training status.

In [None]:
import os
import glob

# Count completed iterations
checkpoints = glob.glob(f"{CHECKPOINT_DIR}/iteration_*.pt")
pkl_checkpoints = glob.glob(f"{CHECKPOINT_DIR}/iteration_*.pkl")

print(f"Progress: {len(checkpoints)}/{ITERATIONS} iterations")
print(f"Tournament files (.pkl): {len(pkl_checkpoints)}")

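# Show latest checkpoint (compare iteration numbers, not strings,
# so that iteration_10 ranks above iteration_9)
if checkpoints:
    latest = max(
        checkpoints,
        key=lambda p: int(os.path.basename(p).split('_')[-1].split('.')[0]),
    )
    print(f"\nLatest: {os.path.basename(latest)}")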

# View recent log entries
print(f"\n--- Recent log ---")
!tail -15 {CHECKPOINT_DIR}/training.log 2>/dev/null || echo "Log not yet created"

# GPU status
print(f"\n--- GPU utilization ---")
!nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader
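
Picking the latest checkpoint by file name needs some care: plain string sorting places `iteration_10.pt` before `iteration_2.pt`. The sketch below shows the difference and a numeric alternative. The helper names are hypothetical, and it assumes checkpoint files follow the `iteration_<number>.pt` pattern used by the glob above without zero padding.

```python
import re
from pathlib import Path


def iteration_number(path):
    """Extract N from a file name like iteration_N.pt; return -1 if it does not match."""
    match = re.search(r"iteration_(\d+)\.pt$", Path(path).name)
    return int(match.group(1)) if match else -1


def latest_checkpoint(paths):
    """Return the path with the highest iteration number, or None for an empty list."""
    return max(paths, key=iteration_number, default=None)


# Made-up paths for illustration.
example = ["run/iteration_2.pt", "run/iteration_10.pt", "run/iteration_9.pt"]
print(sorted(example)[-1])         # string order picks run/iteration_9.pt
print(latest_checkpoint(example))  # numeric order picks run/iteration_10.pt
```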

## 8. Resume Training (if interrupted)

In [None]:
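import glob
import os


def iteration_number(path):
    """Parse N from .../iteration_N.pt."""
    return int(os.path.basename(path).split('_')[-1].split('.')[0])


# Find latest checkpoint (numeric sort so iteration_10 ranks above iteration_9)
checkpoints = sorted(glob.glob(f"{CHECKPOINT_DIR}/iteration_*.pt"), key=iteration_number)
if checkpoints:
    latest = checkpoints[-1]
    remaining = ITERATIONS - iteration_number(latest)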
    
    print(f"Resuming from: {os.path.basename(latest)}")
    print(f"Remaining iterations: {remaining}")
    
    # Resume training
    !python train.py \
      --resume {latest} \
      --iterations {remaining} \
      --games-per-iter {GAMES_PER_ITER} \
      --simulations {SIMULATIONS} \
      --arena-games {ARENA_GAMES} \
      --batch-size {BATCH_SIZE} \
      --epochs {EPOCHS} \
      --checkpoint-dir {CHECKPOINT_DIR} \
      --gdrive-backup-dir {GDRIVE_BACKUP}
else:
    print("No checkpoint found to resume from")
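
Re-running cell 5 generates a new timestamp, so `CHECKPOINT_DIR` then points at an empty directory rather than the interrupted run. The sketch below locates the most recent run directory instead. `latest_run_dir` is a hypothetical helper, and it assumes run directories sit under `checkpoints/` and are named with the `%Y%m%d_%H%M%S` format from cell 5. It only helps while the local files still exist; the layout of the Drive backup is not shown in this notebook, so restoring from Drive is not covered.

```python
import os
import re

# Matches directory names produced by strftime("%Y%m%d_%H%M%S").
RUN_DIR_PATTERN = re.compile(r"^\d{8}_\d{6}$")


def latest_run_dir(root="checkpoints"):
    """Return the newest timestamped run directory under root, or None."""
    if not os.path.isdir(root):
        return None
    runs = [
        name for name in os.listdir(root)
        if RUN_DIR_PATTERN.match(name) and os.path.isdir(os.path.join(root, name))
    ]
    # Zero-padded timestamps sort chronologically as plain strings.
    return os.path.join(root, max(runs)) if runs else None


run_dir = latest_run_dir()
print(f"Most recent run: {run_dir}")
```

If a directory is found, assigning it to `CHECKPOINT_DIR` before running the resume cell points the glob at the interrupted run.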

## Troubleshooting

**Out of memory:** Lower `BATCH_SIZE` to 128 or `GAMES_PER_ITER` to 25.

**Too slow:** Lower `SIMULATIONS` to 15 or `ARENA_GAMES` to 10.

**Download checkpoints:** Already in Google Drive at `/MyDrive/chess_checkpoints/{timestamp}/`