# NanoZero: Minimal AlphaZero in Colab

Train an AI to master TicTacToe, Connect4, or Go through pure self-play.

**First:** Go to `Runtime > Change runtime type` and select `T4 GPU`.

In [None]:
# Check GPU is available
import torch
print(f"GPU available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

In [None]:
# Clone the repository
!git clone https://github.com/caldred/nanozero.git
%cd nanozero

In [None]:
# Install dependencies (torch is pre-installed in Colab)
!pip install -q numpy

## Train TicTacToe (~5 minutes on T4)

In [None]:
!python -m scripts.train \
    --game=tictactoe \
    --n_layer=2 \
    --num_iterations=50 \
    --games_per_iteration=20 \
    --training_steps=250 \
    --mcts_simulations=25 \
    --batch_size=64 \
    --buffer_size=10000 \
    --eval_interval=10
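
To get a feel for how much work the flags above request, the totals can be computed by hand. This is only a rough sketch: it assumes that `games_per_iteration` and `training_steps` are both counted per iteration, which the flag names suggest but which this notebook does not confirm. The `training_budget` helper is hypothetical and not part of the repository.

```python
# Hypothetical back-of-the-envelope budget for the TicTacToe run above.
# Assumption: games_per_iteration and training_steps are per-iteration counts.
config = {
    "num_iterations": 50,
    "games_per_iteration": 20,
    "training_steps": 250,
    "batch_size": 64,
}


def training_budget(cfg):
    """Return (total self-play games, total gradient steps, total samples drawn)."""
    total_games = cfg["num_iterations"] * cfg["games_per_iteration"]
    total_steps = cfg["num_iterations"] * cfg["training_steps"]
    samples_drawn = total_steps * cfg["batch_size"]
    return total_games, total_steps, samples_drawn


games, steps, samples = training_budget(config)
print(f"self-play games: {games}")    # 1000
print(f"gradient steps:  {steps}")    # 12500
print(f"samples drawn:   {samples}")  # 800000
```

Under that assumption, doubling `num_iterations` doubles both the self-play and the training work, while raising `training_steps` alone only increases how often each stored position is reused.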

## Evaluate the trained model

In [None]:
!python -m scripts.eval \
    --game=tictactoe \
    --checkpoint=checkpoints/tictactoe_final.pt \
    --n_layer=2 \
    --num_games=100

## Play against the AI!

This cell is interactive - enter moves when prompted.

In [None]:
# Interactive play (enter moves 0-8 for positions)
# Board positions:
# 0 | 1 | 2
# ---------
# 3 | 4 | 5
# ---------
# 6 | 7 | 8

!python -m scripts.play \
    --game=tictactoe \
    --checkpoint=checkpoints/tictactoe_final.pt \
    --n_layer=2
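
The position numbers in the board diagram follow a row-major layout, so converting between a move index and a (row, column) pair is simple arithmetic. The two helpers below are illustrative only and are not taken from the repository.

```python
def index_to_row_col(index, size=3):
    """Convert a row-major move index (0-8 for TicTacToe) to (row, col)."""
    if not 0 <= index < size * size:
        raise ValueError(f"index must be in 0..{size * size - 1}, got {index}")
    return divmod(index, size)


def row_col_to_index(row, col, size=3):
    """Convert (row, col) back to the row-major move index."""
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError(f"row and col must be in 0..{size - 1}")
    return row * size + col


print(index_to_row_col(4))     # (1, 1), the center square
print(row_col_to_index(2, 0))  # 6, the bottom-left square
```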

---
## Train Connect4 (~15-20 minutes on T4)

Connect4 is more complex than TicTacToe, so it needs a larger network and more training.

In [None]:
# Connect4 - larger board, needs more compute
!python -m scripts.train \
    --game=connect4 \
    --n_layer=4 \
    --num_iterations=100 \
    --games_per_iteration=30 \
    --training_steps=500 \
    --mcts_simulations=50 \
    --batch_size=128 \
    --buffer_size=50000 \
    --eval_interval=20

In [None]:
# Play Connect4 against the AI
# Enter column 0-6 to drop a piece

!python -m scripts.play \
    --game=connect4 \
    --checkpoint=checkpoints/connect4_final.pt \
    --n_layer=4
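
A Connect4 move is just a column number because the piece falls to the lowest empty cell in that column. The sketch below shows that rule on a standard 6-row by 7-column board. The board representation (a list of lists with `0` for empty cells) and the `drop_piece` helper are assumptions made for illustration and may not match how the repository stores the game state.

```python
ROWS, COLS = 6, 7  # standard Connect4 board dimensions


def drop_piece(board, col, player):
    """Place player's piece in the lowest empty cell of col and return its row."""
    if not 0 <= col < COLS:
        raise ValueError(f"column must be in 0..{COLS - 1}, got {col}")
    # Row ROWS - 1 is the bottom of the board, so scan upward from there.
    for row in range(ROWS - 1, -1, -1):
        if board[row][col] == 0:
            board[row][col] = player
            return row
    raise ValueError(f"column {col} is full")


board = [[0] * COLS for _ in range(ROWS)]
first_row = drop_piece(board, 3, 1)    # lands on the bottom row
second_row = drop_piece(board, 3, -1)  # stacks on top of the first piece
print(first_row, second_row)           # 5 4
```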

---
## Train Go 9x9 (~30-45 minutes on T4)

Go is much more complex: the 9x9 board has 81 points, and games last 100-200 moves. The implementation uses Chinese rules with a komi of 7.5.
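
Under Chinese (area) scoring, each side counts its stones on the board plus the empty points it surrounds, and White adds the komi. The sketch below works through that arithmetic with made-up counts. The `area_score` function is hypothetical and is not the repository's scoring code.

```python
KOMI = 7.5  # compensation added to White's score, as stated above


def area_score(black_stones, black_territory, white_stones, white_territory, komi=KOMI):
    """Return Black's margin under area scoring (positive means Black wins)."""
    black = black_stones + black_territory
    white = white_stones + white_territory + komi
    return black - white


# Made-up final position that accounts for all 81 points of a 9x9 board.
margin = area_score(black_stones=30, black_territory=14,
                    white_stones=25, white_territory=12)
winner = "Black" if margin > 0 else "White"
print(margin, winner)  # -0.5 White
```

Because the komi is a half-integer, the margin can never be zero, so every game has a winner.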

In [None]:
# Go 9x9 - much longer games, fewer sims to keep training tractable
!python -m scripts.train \
    --game=go9x9 \
    --n_layer=4 \
    --num_iterations=100 \
    --games_per_iteration=10 \
    --training_steps=500 \
    --mcts_simulations=25 \
    --batch_size=128 \
    --buffer_size=50000 \
    --parallel_games=8 \
    --eval_interval=20

In [None]:
# Play Go 9x9 against the AI
# Enter position as row-major index (0-80) or 81 to pass
# Board uses standard Go notation: A-J columns (skipping I), 1-9 rows

!python -m scripts.play \
    --game=go9x9 \
    --checkpoint=checkpoints/go9x9_final.pt \
    --n_layer=4
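
Since moves are entered as row-major indices but the board is labeled with Go coordinates, a small converter can help during play. The sketch below assumes that index 0 is the top-left point and that the top line is labeled 9, a common display convention. The notebook does not state the orientation, so check it against the board the script prints. Both helpers are hypothetical and are not part of the repository.

```python
SIZE = 9
COLUMNS = "ABCDEFGHJ"    # Go column letters skip "I"
PASS_MOVE = SIZE * SIZE  # 81, the pass action


def index_to_coord(index):
    """Convert a row-major index (0-80) or the pass index (81) to a Go coordinate."""
    if index == PASS_MOVE:
        return "pass"
    if not 0 <= index < PASS_MOVE:
        raise ValueError(f"index must be in 0..{PASS_MOVE}, got {index}")
    row, col = divmod(index, SIZE)
    # Assumption: row 0 is the top line, which is labeled 9.
    return f"{COLUMNS[col]}{SIZE - row}"


def coord_to_index(coord):
    """Convert a coordinate such as 'E5' (or 'pass') back to a row-major index."""
    coord = coord.strip().upper()
    if coord == "PASS":
        return PASS_MOVE
    col = COLUMNS.index(coord[0])  # raises ValueError for an invalid letter
    row = SIZE - int(coord[1:])
    if not 0 <= row < SIZE:
        raise ValueError(f"row out of range in {coord!r}")
    return row * SIZE + col


print(index_to_coord(40))    # E5, the center point
print(coord_to_index("J1"))  # 80
```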