# Chess RL Agent - Training on Colab

**Purpose:** Train a chess agent using MCTS + a neural network on a GPU

**Runtime:** ~2-30 hours depending on configuration (see Step 5)

**Requirements:** Colab Pro for background execution and longer runtimes

## Step 1: Environment Setup

In [None]:
# Verify GPU availability
!nvidia-smi

import torch
print(f"\nPyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")

## Step 2: Mount Google Drive (for checkpoint persistence)

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Create checkpoint directory in Drive
!mkdir -p /content/drive/MyDrive/chess_checkpoints
print("✓ Google Drive mounted")

## Step 3: Clone Repository

In [None]:
# Remove existing repo if present
!rm -rf rl_chess_agent

# Try cloning (works if repo is public)
!git clone https://github.com/Capacap/rl_chess_agent.git

# If clone failed, try with authentication
import os
if not os.path.exists('rl_chess_agent'):
    print("\n⚠ Clone failed - repository appears to be private")
    print("\nQuick fix: Make repository public")
    print("  1. Visit: https://github.com/Capacap/rl_chess_agent/settings")
    print("  2. Scroll to 'Danger Zone' → 'Change visibility'")
    print("  3. Click 'Make public' and confirm")
    print("  4. Re-run this cell")
    
    # Alternative: use a token
    print("\nOR use a GitHub Personal Access Token:")
    import getpass
    use_token = input("Do you have a token ready? (y/n): ").strip().lower()
    if use_token == 'y':
        token = getpass.getpass("Enter token: ")
        !git clone https://{token}@github.com/Capacap/rl_chess_agent.git
    # Stop here if the repo is still missing (no token given, or the token clone failed)
    if not os.path.exists('rl_chess_agent'):
        raise RuntimeError("Cannot proceed - repository access required")

%cd rl_chess_agent

# Verify we're on the right branch
!git status
!git log --oneline -3

## Step 4: Install Dependencies

In [None]:
!pip install -r requirements.txt

# Verify key imports
import chess
import torch
import numpy as np
from model.network import ChessNet

print("✓ All dependencies installed")

## Step 5: Configure Training Run

**Recommended configurations:**

### Quick Test (2-3 hours)
```python
ITERATIONS = 5
GAMES_PER_ITER = 50
SIMULATIONS = 20
ARENA_GAMES = 20
```

### Development Run (12-15 hours)
```python
ITERATIONS = 10
GAMES_PER_ITER = 50
SIMULATIONS = 20
ARENA_GAMES = 20
```

### Production Run (24-30 hours)
```python
ITERATIONS = 15
GAMES_PER_ITER = 100
SIMULATIONS = 40
ARENA_GAMES = 30
```
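
To compare the presets at a glance, the sketch below tallies how many self-play and arena games each one plays over a full run. `PRESETS` and `total_games` are illustrative names, not part of the repository; the values are copied from the configurations above.

```python
# Illustrative only: PRESETS and total_games are hypothetical helpers,
# not part of rl_chess_agent. Values mirror the presets listed above.
PRESETS = {
    "quick": {"ITERATIONS": 5, "GAMES_PER_ITER": 50, "SIMULATIONS": 20, "ARENA_GAMES": 20},
    "development": {"ITERATIONS": 10, "GAMES_PER_ITER": 50, "SIMULATIONS": 20, "ARENA_GAMES": 20},
    "production": {"ITERATIONS": 15, "GAMES_PER_ITER": 100, "SIMULATIONS": 40, "ARENA_GAMES": 30},
}


def total_games(preset):
    """Return (self-play games, arena games) played over the whole run."""
    self_play = preset["ITERATIONS"] * preset["GAMES_PER_ITER"]
    arena = preset["ITERATIONS"] * preset["ARENA_GAMES"]
    return self_play, arena


for name, preset in PRESETS.items():
    self_play, arena = total_games(preset)
    print(f"{name}: {self_play} self-play games, {arena} arena games")
```

Game counts are only a rough proxy for runtime. Assuming `SIMULATIONS` MCTS simulations are run for each move, the production preset is also more expensive per game, not just longer.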

In [None]:
# Configuration (modify as needed)
ITERATIONS = 10
GAMES_PER_ITER = 50
SIMULATIONS = 20
ARENA_GAMES = 20
BATCH_SIZE = 256
EPOCHS = 5
LEARNING_RATE = 1e-3

# Checkpoint directory (local; copied to Drive in Step 7)
import datetime
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
CHECKPOINT_DIR = f"checkpoints/{timestamp}"

print("Training Configuration:")
print(f"  Iterations: {ITERATIONS}")
print(f"  Games per iteration: {GAMES_PER_ITER}")
print(f"  MCTS simulations: {SIMULATIONS}")
print(f"  Arena games: {ARENA_GAMES}")
print(f"  Checkpoint dir: {CHECKPOINT_DIR}")

## Step 6: Launch Training

**IMPORTANT:** Enable background execution in Colab Pro so training keeps running if the browser session disconnects

Training will:
1. Generate self-play games using MCTS + neural network
2. Train challenger network on game data
3. Evaluate challenger vs champion in arena
4. Promote challenger if win rate > 55%
5. Repeat for N iterations

Checkpoints are saved after every iteration to the timestamped directory under `checkpoints/` (the `CHECKPOINT_DIR` set in Step 5)
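
The promotion rule in step 4 can be sketched as follows. This is a minimal illustration, not the code in `train.py`: `challenger_score` and `should_promote` are hypothetical names, and scoring a draw as half a win is an assumption the source does not confirm.

```python
# Hypothetical sketch of the promotion rule in step 4; the real logic
# lives in train.py and may differ (for example in how draws are scored).
PROMOTION_THRESHOLD = 0.55  # from "win rate > 55%" above


def challenger_score(wins, draws, losses):
    """Challenger's arena score, counting a draw as half a win (assumption)."""
    total = wins + draws + losses
    if total == 0:
        raise ValueError("no arena games played")
    return (wins + 0.5 * draws) / total


def should_promote(wins, draws, losses, threshold=PROMOTION_THRESHOLD):
    """True if the challenger's score is strictly above the threshold."""
    return challenger_score(wins, draws, losses) > threshold


# With ARENA_GAMES = 20: 12 wins and 8 losses is a 60% score.
print(should_promote(12, 0, 8))
# An even 10-10 split is a 50% score, so the champion stays.
print(should_promote(10, 0, 10))
```

With only 20 arena games, a single result moves the score by 5 percentage points, so raising `ARENA_GAMES` makes the promotion decision less noisy at the cost of longer evaluation.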

In [None]:
# Launch training
!python train.py \
  --iterations {ITERATIONS} \
  --games-per-iter {GAMES_PER_ITER} \
  --simulations {SIMULATIONS} \
  --arena-games {ARENA_GAMES} \
  --batch-size {BATCH_SIZE} \
  --epochs {EPOCHS} \
  --lr {LEARNING_RATE} \
  --checkpoint-dir {CHECKPOINT_DIR}

## Step 7: Backup Checkpoints to Google Drive

**Run this cell after training completes (or during training to back up progress)**

In [None]:
# Sync checkpoints to Google Drive
!cp -r {CHECKPOINT_DIR} /content/drive/MyDrive/chess_checkpoints/

print(f"✓ Checkpoints backed up to: /content/drive/MyDrive/chess_checkpoints/{timestamp}")
print("\nDownload from Google Drive to local machine:")
print("  1. Open Google Drive in browser")
print("  2. Navigate to 'chess_checkpoints' folder")
print(f"  3. Download {timestamp} folder")
print(f"  4. Extract to local project: checkpoints/{timestamp}")
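
If the backup is repeated during training, a Python equivalent of the `cp -r` step may be convenient because it can be called from other cells. The helper below is a sketch under that assumption; `backup_checkpoints` is a hypothetical name, and it relies on `shutil.copytree(..., dirs_exist_ok=True)`, which requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def backup_checkpoints(src_dir, drive_root):
    """Copy src_dir into drive_root/<name of src_dir>, merging with earlier backups.

    Hypothetical helper, not part of rl_chess_agent.
    """
    src = Path(src_dir)
    dest = Path(drive_root) / src.name
    # dirs_exist_ok=True lets repeated calls add new files and overwrite
    # existing ones instead of failing because dest already exists.
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest
```

In this notebook it would be called as `backup_checkpoints(CHECKPOINT_DIR, "/content/drive/MyDrive/chess_checkpoints")`.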

## Step 8: View Training Log

In [None]:
# View last 50 lines of training log
!tail -50 {CHECKPOINT_DIR}/training.log

## Step 9: Monitor Training Progress (Optional)

Run this cell periodically to check progress without viewing full logs

In [None]:
import os
import glob

# Count checkpoints
checkpoints = glob.glob(f"{CHECKPOINT_DIR}/iteration_*.pt")
print(f"Checkpoints saved: {len(checkpoints)}")
print(f"Progress: {len(checkpoints)}/{ITERATIONS} iterations")

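# Show latest checkpoint (by modification time; a plain string sort
# would rank iteration_10.pt before iteration_2.pt)
if checkpoints:
    latest = max(checkpoints, key=os.path.getmtime)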
    size_mb = os.path.getsize(latest) / (1024 * 1024)
    print(f"\nLatest checkpoint: {latest}")
    print(f"Size: {size_mb:.1f} MB")

# GPU utilization
!nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv

## Troubleshooting

**Session disconnected:**
- With Colab Pro background execution, training continues
- Reconnect and check progress with Step 9
- Checkpoints auto-saved every iteration

**Out of memory:**
- Reduce `BATCH_SIZE` (try 128)
- Reduce `GAMES_PER_ITER` (try 25)

**Training too slow:**
- Reduce `SIMULATIONS` (try 10-15)
- Reduce `ARENA_GAMES` (try 10-15)

**Resume from checkpoint:**
```bash
!python train.py --resume {CHECKPOINT_DIR}/iteration_5.pt --iterations 10
```
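
When resuming after a disconnect, the checkpoint to resume from is usually the one with the highest iteration number. The sketch below picks it by parsing the number out of the filename, assuming the `iteration_<N>.pt` naming used in Step 9 and in the resume command above; `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path

# Assumes the iteration_<N>.pt naming seen elsewhere in this notebook.
_CHECKPOINT_RE = re.compile(r"iteration_(\d+)\.pt$")


def latest_checkpoint(checkpoint_dir):
    """Return the checkpoint Path with the highest iteration number, or None."""
    best, best_iter = None, -1
    for path in Path(checkpoint_dir).glob("iteration_*.pt"):
        match = _CHECKPOINT_RE.search(path.name)
        # Compare numerically so iteration_10 ranks above iteration_2.
        if match and int(match.group(1)) > best_iter:
            best, best_iter = path, int(match.group(1))
    return best
```

The result of `latest_checkpoint(CHECKPOINT_DIR)` can then be passed to `--resume` in place of the hard-coded `iteration_5.pt`.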