# Bettafish - AI Catan Player

AlphaBeta search, MCTS, and AlphaZero on a fast bitboard engine.

**Runtime**: Use GPU (T4) for neural net training, or high-RAM CPU for multi-core search benchmarks.

Go to **Runtime > Change runtime type** and select your preferred hardware.

## 1. Setup

In [None]:
# Install uv (fast Python package manager)
!curl -LsSf https://astral.sh/uv/install.sh | sh
import os
os.environ["PATH"] = f"{os.path.expanduser('~')}/.local/bin:{os.environ['PATH']}"

In [None]:
# Clone the repo (or pull latest if already cloned)
import os
REPO_DIR = '/content/bettafish'  # absolute path, so re-running this cell never clones a nested copy
if os.path.exists(REPO_DIR):
    os.chdir(REPO_DIR)
    !git pull
else:
    !git clone https://github.com/Samffprice/bettafish.git {REPO_DIR}
    os.chdir(REPO_DIR)
!pwd
!git log --oneline -3

In [None]:
# Install all dependencies (including Cython for the fast bitboard engine)
# Uses the system Python (Colab's Python 3.11+)
!uv pip install --system -e ".[colab]" -e "./catanatron[gym]" 2>&1 | tail -5

In [None]:
# Build the Cython extension for the fast bitboard engine
!python robottler/bitboard/setup_cython.py build_ext --inplace

# Verify it built (refresh import caches so the freshly compiled module is found)
import importlib
importlib.invalidate_caches()
from robottler.bitboard import _fast
print(f"Cython module loaded: {_fast.__file__}")

In [None]:
# Check hardware
import torch
import multiprocessing

NCPU = multiprocessing.cpu_count()
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
print(f"CPU cores: {NCPU}")
print(f"\nRecommended --workers: {max(1, NCPU - 1)}")

## 2. Benchmark (Gauntlet)

Run the bitboard search player against baseline opponents.

| Flag | Description |
|------|-------------|
| `--bb-search` | Use the fast bitboard search player |
| `--search-depth N` | Search depth (2 = fast, 3 = strong) |
| `--blend-weight W` | Neural/heuristic blend (1e8 optimal) |
| `--dice-sample N` | Sample top-N dice rolls (5 = 3x speedup) |
| `--games N` | Games per opponent |
| `--workers N` | Parallel processes (use all cores!) |
| `--baselines` | Run against all baseline opponents |
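
To give a feel for what `--dice-sample 5` trades away, here is a small sketch that assumes "top-N" means the N most probable two-dice totals. The helper names (`dice_sum_probabilities`, `top_n_rolls`) are illustrative and not part of the repo, and how the engine weights or renormalises the kept rolls is not shown in this notebook.

```python
from fractions import Fraction
from itertools import product


def dice_sum_probabilities():
    """Exact probability of each two-dice total (2..12)."""
    counts = {}
    for a, b in product(range(1, 7), repeat=2):
        counts[a + b] = counts.get(a + b, 0) + 1
    return {total: Fraction(n, 36) for total, n in counts.items()}


def top_n_rolls(n):
    """Return the n most likely totals and the probability mass they cover."""
    probs = dice_sum_probabilities()
    # Sort by descending probability; break ties by total for determinism.
    ranked = sorted(probs, key=lambda t: (-probs[t], t))
    kept = ranked[:n]
    return kept, sum(probs[t] for t in kept)


kept, mass = top_n_rolls(5)
print(sorted(kept), mass)  # [5, 6, 7, 8, 9] 2/3
```

Under that assumption, the five kept totals account for two thirds of all rolls, so each chance node expands 5 outcomes instead of 11.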

In [None]:
import multiprocessing; W = max(1, multiprocessing.cpu_count() - 1)

# Quick benchmark: bitboard search depth 2 vs all baselines (50 games each)
!python -m robottler.benchmark \
    --bb-search \
    --search-depth 2 \
    --blend-weight 1e8 \
    --dice-sample 5 \
    --baselines \
    --games 50 \
    --workers {W}

In [None]:
# Strong benchmark: depth 3 (slower, but ~72% win rate vs AlphaBeta)
!python -m robottler.benchmark \
    --bb-search \
    --search-depth 3 \
    --blend-weight 1e8 \
    --dice-sample 5 \
    --baselines \
    --games 20 \
    --workers {W}

## 3. Train on Local Data (GPU)

Upload your training data from your local machine and train on Colab's GPU.
Training is fast on T4 (~10-15 min per experiment). Benchmarking is slow on
Colab, so do that locally on your Mac instead.

**Step 1 (local machine):** Zip your data and checkpoint:
```bash
cd /Volumes/BackupFiles/MasterCoding/Random-Projects/AICatan
zip -r training_data.zip datasets/az_v2_325k_200ep.pt datasets/az_selfplay_v2 datasets/exit_v1/iter1 datasets/exit_v1/iter2 datasets/exit_v1/iter3 datasets/exit_v1/iter4 datasets/exit_v1/iter5 datasets/expert_data_10k datasets/expert_depth3 datasets/expert_ranking
```

**Step 2:** Upload `training_data.zip` to your Google Drive root folder.

**Step 3:** Run the cells below.

In [None]:
# Mount Google Drive and unzip training data
from google.colab import drive
drive.mount('/content/drive')

import os

ZIP_PATH = '/content/drive/MyDrive/training_data.zip'
if os.path.exists(ZIP_PATH):
    !unzip -qo {ZIP_PATH} -d /content/bettafish/
    print("Data extracted!")
    # Show what we got
    !du -sh /content/bettafish/datasets/*/
else:
    print(f"ERROR: {ZIP_PATH} not found. Upload training_data.zip to your Google Drive root.")

In [None]:
# Exp #24a: Ranking loss, margin=0.1 (conservative)

DATA_DIRS = " ".join([
    "datasets/az_selfplay_v2",
    "datasets/exit_v1/iter1", "datasets/exit_v1/iter2", "datasets/exit_v1/iter3",
    "datasets/exit_v1/iter4", "datasets/exit_v1/iter5",
    "datasets/expert_data_10k", "datasets/expert_depth3",
])

!python -m robottler.az_selfplay train \
    --checkpoint datasets/az_v2_325k_200ep.pt \
    --data-dir {DATA_DIRS} \
    --output datasets/az_ranking_m01.pt \
    --epochs 200 --batch-size 16384 --lr 1e-3 \
    --scheduler cosine \
    --body-dims 512,256 --dropout 0.1 \
    --ranking-data datasets/expert_ranking \
    --ranking-weight 1.0 --ranking-margin 0.1
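
For reference, the sketch below shows what a cosine schedule typically does to the learning rate over the 200 epochs used above. It assumes plain cosine annealing down to zero with no warm-up or restarts; what `--scheduler cosine` actually does inside `robottler.az_selfplay` is not shown in this notebook, and `cosine_lr` is a hypothetical helper.

```python
import math


def cosine_lr(epoch, total_epochs, base_lr=1e-3, min_lr=0.0):
    """Cosine-annealed learning rate for a 0-indexed epoch (assumed schedule)."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Print the rate at a few points of a 200-epoch run starting at lr=1e-3.
for epoch in (0, 50, 100, 150, 200):
    print(epoch, f"{cosine_lr(epoch, 200):.6f}")
```

The rate starts at the base value, passes half of it at the midpoint, and decays smoothly toward the minimum by the final epoch.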

In [None]:
# Exp #24b: Ranking loss, margin=0.3 (moderate); reuses DATA_DIRS from the Exp #24a cell

!python -m robottler.az_selfplay train \
    --checkpoint datasets/az_v2_325k_200ep.pt \
    --data-dir {DATA_DIRS} \
    --output datasets/az_ranking_m03.pt \
    --epochs 200 --batch-size 16384 --lr 1e-3 \
    --scheduler cosine \
    --body-dims 512,256 --dropout 0.1 \
    --ranking-data datasets/expert_ranking \
    --ranking-weight 1.0 --ranking-margin 0.3

In [None]:
# Exp #24c: Ranking loss, margin=0.5 (aggressive); reuses DATA_DIRS from the Exp #24a cell

!python -m robottler.az_selfplay train \
    --checkpoint datasets/az_v2_325k_200ep.pt \
    --data-dir {DATA_DIRS} \
    --output datasets/az_ranking_m05.pt \
    --epochs 200 --batch-size 16384 --lr 1e-3 \
    --scheduler cosine \
    --body-dims 512,256 --dropout 0.1 \
    --ranking-data datasets/expert_ranking \
    --ranking-weight 1.0 --ranking-margin 0.5
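
The three experiments differ only in the margin. As a rough illustration of why a larger margin is "more aggressive", the sketch below assumes the ranking term is a standard pairwise hinge loss; the loss actually implemented behind `--ranking-margin` is not shown here, and `pairwise_margin_loss` is a hypothetical helper.

```python
def pairwise_margin_loss(score_better, score_worse, margin):
    """Hinge loss that is zero once the better position leads by at least `margin`."""
    return max(0.0, margin - (score_better - score_worse))


# The same pair of predicted values (a gap of 0.2) under the three margins above.
for margin in (0.1, 0.3, 0.5):
    loss = pairwise_margin_loss(0.6, 0.4, margin)
    print(f"margin={margin}: loss={loss:.2f}")
```

With a gap of 0.2 between the two scores, the pair is already satisfied at margin 0.1 but is still penalised at 0.3 and 0.5, so larger margins keep pushing pairs apart for longer.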

In [None]:
# Copy trained models back to Google Drive for download
import shutil, os

models = [
    'datasets/az_ranking_m01.pt',
    'datasets/az_ranking_m03.pt',
    'datasets/az_ranking_m05.pt',
]

for src in models:
    name = os.path.basename(src)
    dst = f'/content/drive/MyDrive/{name}'
    if os.path.exists(src):
        shutil.copy2(src, dst)
        size_mb = os.path.getsize(src) / 1e6
        print(f"Saved {dst} ({size_mb:.1f} MB)")
    else:
        print(f"MISSING: {src}")

print("\nDownload from Google Drive, place in local datasets/ folder, then benchmark:")
print("  python /tmp/bench_az_vs_ab.py datasets/az_ranking_m01.pt 100 6 400")
print("  python /tmp/bench_az_vs_ab.py datasets/az_ranking_m03.pt 100 6 400")
print("  python /tmp/bench_az_vs_ab.py datasets/az_ranking_m05.pt 100 6 400")
print("\nMeasure Kendall tau improvement:")
print("  python /tmp/mcts_deep_diagnostics.py datasets/az_ranking_m03.pt 5")
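
The diagnostics script itself is not part of this notebook. For readers unfamiliar with the metric, here is a minimal sketch of Kendall tau (the tau-a variant, which ignores tie corrections), comparing two rankings of the same items. The function name and example values are illustrative only.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values: network scores vs. reference scores for three moves.
print(kendall_tau([0.9, 0.5, 0.1], [0.8, 0.2, 0.4]))
```

A value of 1.0 means the two orderings agree on every pair, -1.0 means they are fully reversed, and values near 0 mean the network's ordering carries little information about the reference ordering.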

## 3b. GNN Training (Exp #25)

Train a Graph Neural Network that processes per-node features on the board graph.
The GNN sees spatial layout (which nodes have buildings, adjacency) that the MLP cannot.
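
To make the "sees adjacency" point concrete, the sketch below runs one round of mean aggregation over a toy graph. It is a generic message-passing illustration, not the architecture behind `--gnn`; the 3-node graph, the single feature, and `mean_aggregate` are all hypothetical.

```python
def mean_aggregate(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = {node: [node] for node in features}  # include a self-loop
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(next(iter(features.values())))
    return {
        node: [sum(features[m][k] for m in group) / len(group) for k in range(dim)]
        for node, group in neighbours.items()
    }


# Hypothetical 3-node path 0 - 1 - 2; the single feature is "has a settlement".
features = {0: [1.0], 1: [0.0], 2: [0.0]}
edges = [(0, 1), (1, 2)]
print(mean_aggregate(features, edges))
```

After one round, node 1 carries a nonzero value because it is adjacent to the settlement, while node 2 (two steps away) is still zero. A flat feature vector fed to an MLP has no built-in notion of which entries are neighbours.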

**Step 1 (local machine):** Extract GNN features from the 44K human games:
```bash
cd /Volumes/BackupFiles/MasterCoding/Random-Projects/AICatan
python3 -m datasets.extract_gnn_features \
    --games-dir datasets/games \
    --output-dir datasets/human_gnn_44k \
    --workers 0
```

**Step 2:** Zip and upload:
```bash
zip -r human_gnn_44k.zip datasets/human_gnn_44k/node_features.npy \
    datasets/human_gnn_44k/global_features.npy \
    datasets/human_gnn_44k/values.npy
```

**Step 3:** Upload `human_gnn_44k.zip` to Google Drive root, then run cells below.

In [None]:
# Unzip GNN training data from Google Drive
import os
GNN_ZIP = '/content/drive/MyDrive/human_gnn_44k.zip'
if os.path.exists(GNN_ZIP):
    !unzip -qo {GNN_ZIP} -d /content/bettafish/
    print("GNN data extracted!")
    !du -sh /content/bettafish/datasets/human_gnn_44k/
    !ls -lh /content/bettafish/datasets/human_gnn_44k/*.npy
else:
    print(f"ERROR: {GNN_ZIP} not found. Upload human_gnn_44k.zip to Google Drive.")

In [None]:
# Exp #25a: Small GNN (32,64,96) ~45K params on human game data
!python -m robottler.az_selfplay train \
    --data-dir datasets/human_gnn_44k \
    --output datasets/gnn_human_small.pt \
    --gnn --gnn-dims 32,64,96 \
    --epochs 200 --batch-size 2048 --lr 1e-3 \
    --scheduler cosine --dropout 0.2 --edge-dropout 0.1

## 4. AlphaZero Self-Play + Training Loop

Automated generate -> train -> evaluate cycle. Use this for running full
self-play iterations on Colab hardware.

In [None]:
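# Full loop: 5 iterations of generate/train/evaluate
import multiprocessing
W = max(1, multiprocessing.cpu_count() - 1)  # redefined here so this cell runs on its own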
!python -m robottler.az_selfplay loop \
    --start-checkpoint robottler/models/az_iter0.pt \
    --iterations 5 \
    --games-per-iter 200 \
    --sims 200 \
    --output-dir datasets/az_selfplay/colab_loop \
    --epochs 20 \
    --eval-games 100 \
    --workers {W}

## 5. RL Training (MaskablePPO)

Train a policy network with reinforcement learning. Benefits from **multi-core** for
parallel environment rollouts.
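
The "Maskable" part refers to action masking: the policy only samples among moves that are legal in the current state. Below is a minimal, dependency-free sketch of that idea, assuming masking is applied by dropping illegal actions before the softmax. It is not the code used by `robottler.train_rl`, and `masked_softmax` is a hypothetical helper.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over legal actions only; illegal actions get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be legal")
    legal = [l for l, ok in zip(logits, mask) if ok]
    peak = max(legal)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# The third action has the highest logit but is illegal, so it is never sampled.
print(masked_softmax([1.0, 1.0, 5.0], [True, True, False]))  # [0.5, 0.5, 0.0]
```

In a game like Catan, where most actions are illegal in any given state, masking keeps the policy from wasting probability mass (and gradient signal) on moves it could never play.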

In [None]:
!python -m robottler.train_rl \
    --opponent alphabeta \
    --total-steps 200000 \
    --n-envs 8 \
    --bc-model robottler/models/value_net_v2.pt \
    --vps 10

## 6. Save Results

Download trained models back to your local machine.

In [None]:
# List all model checkpoints
!ls -lh robottler/models/*.pt

In [None]:
# Zip models for download
!zip -j colab_models.zip robottler/models/az_colab_*.pt robottler/models/az_iter*.pt 2>/dev/null || echo "No new models yet"

from google.colab import files
try:
    files.download("colab_models.zip")
except Exception:
    print("Download manually from the file browser (left panel)")