# NanoZero: Minimal AlphaZero in Colab

Train an AI to master TicTacToe, Connect4, or Go through pure self-play.

**GPU Setup:**
- For quick experiments: `Runtime > Change runtime type > T4 GPU`
- For serious training: `Runtime > Change runtime type > A100 GPU`

The notebook includes optimized settings for both.
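
The training cells in this notebook choose between the two setting profiles by checking whether the device name contains `A100`. The sketch below pulls that check into one place. The helper names `detect_gpu_name` and `select_profile` are hypothetical and are not part of the nanozero repository.

```python
def detect_gpu_name() -> str:
    """Return the CUDA device name, or "" when no GPU (or no torch) is present."""
    try:
        import torch
    except ImportError:
        return ""
    return torch.cuda.get_device_name(0) if torch.cuda.is_available() else ""


def select_profile(gpu_name: str) -> str:
    """Map a device name to a settings profile (hypothetical labels)."""
    # Same substring test the notebook cells use: anything that is not an
    # A100 (including "no GPU") falls back to the T4/default settings.
    return "a100" if "A100" in gpu_name else "default"


print(f"Selected profile: {select_profile(detect_gpu_name())}")
```

Because any non-A100 device falls through to the default branch, a CPU-only runtime also gets the T4 settings and will train much more slowly than the times quoted in this notebook.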

In [1]:
# Check GPU is available
import torch
print(f"GPU available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

GPU available: False


In [None]:
# Clone the repository
!git clone https://github.com/caldred/nanozero.git
%cd nanozero

In [None]:
# Install dependencies (torch is pre-installed in Colab)
!pip install -q numpy

## Train TicTacToe

**T4 GPU:** ~5 minutes | **A100 GPU:** ~2 minutes

In [None]:
# Detect GPU type and use optimized settings
import torch
gpu_name = torch.cuda.get_device_name(0) if torch.cuda.is_available() else ""
is_a100 = "A100" in gpu_name

if is_a100:
    print("🚀 A100 detected - using optimized settings")
    !python -m scripts.train \
        --game=tictactoe \
        --n_layer=2 \
        --num_iterations=50 \
        --games_per_iteration=64 \
        --training_steps=100 \
        --mcts_simulations=50 \
        --batch_size=256 \
        --buffer_size=20000 \
        --parallel_games=128 \
        --eval_interval=10
else:
    print("Using T4/default settings")
    !python -m scripts.train \
        --game=tictactoe \
        --n_layer=2 \
        --num_iterations=50 \
        --games_per_iteration=20 \
        --training_steps=250 \
        --mcts_simulations=25 \
        --batch_size=64 \
        --buffer_size=10000 \
        --eval_interval=10
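
To compare the two profiles, it can help to work out the total training budget each one implies. The sketch below assumes the flags mean what their names suggest: `games_per_iteration` self-play games and `training_steps` gradient steps per iteration, with each step drawing `batch_size` positions from the replay buffer. These semantics are an assumption and have not been checked against `scripts.train`; the `Budget` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """Hypothetical container for the flag values used in the cell above."""
    num_iterations: int
    games_per_iteration: int
    training_steps: int
    batch_size: int

    @property
    def total_games(self) -> int:
        # Self-play games generated over the whole run (assumed semantics).
        return self.num_iterations * self.games_per_iteration

    @property
    def total_steps(self) -> int:
        # Gradient steps taken over the whole run (assumed semantics).
        return self.num_iterations * self.training_steps

    @property
    def samples_seen(self) -> int:
        # Positions drawn from the replay buffer, counting repeats.
        return self.total_steps * self.batch_size


a100 = Budget(num_iterations=50, games_per_iteration=64,
              training_steps=100, batch_size=256)
t4 = Budget(num_iterations=50, games_per_iteration=20,
            training_steps=250, batch_size=64)

for name, budget in (("A100", a100), ("T4", t4)):
    print(name, budget.total_games, budget.total_steps, budget.samples_seen)
```

Under these assumptions, the A100 profile generates more self-play games while the T4 profile takes more, smaller gradient steps on fewer games.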

## Evaluate the trained model

In [None]:
!python -m scripts.eval \
    --game=tictactoe \
    --checkpoint=checkpoints/tictactoe_final.pt \
    --n_layer=2 \
    --num_games=100

## Play against the AI!

This cell is interactive - enter moves when prompted.

In [None]:
# Interactive play (enter moves 0-8 for positions)
# Board positions:
# 0 | 1 | 2
# ---------
# 3 | 4 | 5
# ---------
# 6 | 7 | 8

!python -m scripts.play \
    --game=tictactoe \
    --checkpoint=checkpoints/tictactoe_final.pt \
    --n_layer=2
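
The board diagram in the cell above numbers the squares row by row, so a move index converts to a (row, column) pair with integer division. This is a minimal sketch of that mapping. The function names are hypothetical and are not taken from `scripts.play`.

```python
def index_to_cell(index: int, size: int = 3) -> tuple[int, int]:
    """Convert a row-major move index (0-8 on a 3x3 board) to (row, col)."""
    if not 0 <= index < size * size:
        raise ValueError(f"move must be in 0-{size * size - 1}, got {index}")
    return divmod(index, size)


def cell_to_index(row: int, col: int, size: int = 3) -> int:
    """Inverse mapping: (row, col) back to the row-major index."""
    return row * size + col


# The centre square is index 4, i.e. row 1, column 1.
print(index_to_cell(4))
```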

---
## Train Connect4

Connect4 is a more complex game (6x7 board, 7 actions) and needs more training than TicTacToe.

**T4 GPU:** ~15-20 minutes | **A100 GPU:** ~5 minutes

In [None]:
# Detect GPU type and use optimized settings
import torch
gpu_name = torch.cuda.get_device_name(0) if torch.cuda.is_available() else ""
is_a100 = "A100" in gpu_name

if is_a100:
    print("🚀 A100 detected - using optimized settings")
    !python -m scripts.train \
        --game=connect4 \
        --n_layer=6 \
        --num_iterations=100 \
        --games_per_iteration=128 \
        --training_steps=200 \
        --mcts_simulations=100 \
        --batch_size=512 \
        --buffer_size=100000 \
        --parallel_games=256 \
        --eval_interval=10
else:
    print("Using T4/default settings")
    !python -m scripts.train \
        --game=connect4 \
        --n_layer=4 \
        --num_iterations=100 \
        --games_per_iteration=30 \
        --training_steps=500 \
        --mcts_simulations=50 \
        --batch_size=128 \
        --buffer_size=50000 \
        --eval_interval=20

In [None]:
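# Play Connect4 against the AI
# Enter column 0-6 to drop a piece
# n_layer must match the value used for training (6 on A100, 4 otherwise),
# so run this cell on the same GPU type that produced the checkpoint.
import torch
gpu_name = torch.cuda.get_device_name(0) if torch.cuda.is_available() else ""
n_layer = 6 if "A100" in gpu_name else 4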

!python -m scripts.play \
    --game=connect4 \
    --checkpoint=checkpoints/connect4_final.pt \
    --n_layer={n_layer}
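
A Connect4 action is just a column number: the piece falls to the lowest empty cell in that column. The sketch below illustrates that rule on a plain list-of-lists board. It is an illustration only, not the repository's game implementation, and `drop_piece` and the `0`/`1`/`-1` cell encoding are assumptions.

```python
ROWS, COLS = 6, 7  # standard Connect4 board: 6 rows, 7 columns


def new_board() -> list[list[int]]:
    """Create an empty board; 0 marks an empty cell (assumed encoding)."""
    return [[0] * COLS for _ in range(ROWS)]


def drop_piece(board: list[list[int]], col: int, player: int) -> int:
    """Place `player` in the lowest empty cell of `col`; return the row used."""
    if not 0 <= col < COLS:
        raise ValueError(f"column must be in 0-{COLS - 1}, got {col}")
    # Scan from the bottom row (index ROWS - 1) upwards.
    for row in range(ROWS - 1, -1, -1):
        if board[row][col] == 0:
            board[row][col] = player
            return row
    raise ValueError(f"column {col} is full")


board = new_board()
drop_piece(board, 3, 1)    # first piece lands on the bottom row
drop_piece(board, 3, -1)   # second piece stacks on top of it
```

Raising on a full column mirrors why the action space has 7 entries but the set of legal moves can shrink as the game goes on.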

---
## Train Go 9x9

Go is much more complex: the 9x9 board has 81 points, and games last 100-200 moves. The game uses Chinese rules with 7.5 komi.

**T4 GPU:** ~30-45 minutes | **A100 GPU:** ~10-15 minutes

Note: Strong Go play requires much longer training (hours to days). This is just a demo.

In [None]:
# Detect GPU type and use optimized settings
import torch
gpu_name = torch.cuda.get_device_name(0) if torch.cuda.is_available() else ""
is_a100 = "A100" in gpu_name

if is_a100:
    print("🚀 A100 detected - using optimized settings")
    !python -m scripts.train \
        --game=go9x9 \
        --n_layer=8 \
        --num_iterations=100 \
        --games_per_iteration=64 \
        --training_steps=200 \
        --mcts_simulations=100 \
        --batch_size=256 \
        --buffer_size=100000 \
        --parallel_games=128 \
        --eval_interval=10
else:
    print("Using T4/default settings")
    !python -m scripts.train \
        --game=go9x9 \
        --n_layer=4 \
        --num_iterations=100 \
        --games_per_iteration=10 \
        --training_steps=500 \
        --mcts_simulations=25 \
        --batch_size=128 \
        --buffer_size=50000 \
        --parallel_games=8 \
        --eval_interval=20

In [None]:
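# Play Go 9x9 against the AI
# Enter position as row-major index (0-80) or 81 to pass
# n_layer must match the value used for training (8 on A100, 4 otherwise),
# so run this cell on the same GPU type that produced the checkpoint.
import torch
gpu_name = torch.cuda.get_device_name(0) if torch.cuda.is_available() else ""
n_layer = 8 if "A100" in gpu_name else 4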

!python -m scripts.play \
    --game=go9x9 \
    --checkpoint=checkpoints/go9x9_final.pt \
    --n_layer={n_layer}