
BetaOne Chess Engine

A chess engine implementation inspired by AlphaZero, designed for Apple Silicon with MLX neural network acceleration.

Features

  • Apple Silicon Support: Uses MLX for neural network acceleration on Apple Silicon
  • Neural Network: 22M-parameter ResNet + multi-head attention stack with squeeze-and-excitation
  • MCTS Search: Monte Carlo Tree Search with PUCT algorithm
  • Self-Play Training: Generates training data through self-play games
  • UCI Protocol: Compatible with chess GUIs
  • Multi-Threading: Parallel MCTS search and self-play workers
  • Expanded Features: 22 AlphaZero-style planes including clock, material, and check indicators
  • Training Pipeline: Cosine LR schedule, gradient accumulation, EMA evaluation, and checkpoint management

Architecture

BetaOne/
├── network/          # MLX neural network (Python)
├── engine/           # C++ MCTS + UCI engine
├── orchestrator/     # Swift orchestrator
├── data/             # Training data storage
├── checkpoints/      # Model weights
└── logs/             # Logging output

Components

  1. Neural Network (MLX): Residual tower with stacked attention blocks and gated channel mixing for policy/value heads
  2. MCTS Engine (C++): Tree search with virtual loss parallelization
  3. UCI Interface (C++): Standard chess engine protocol
  4. Orchestrator (Swift): Coordinates self-play, training, and evaluation

System Architecture

The system consists of three main components:

  1. Swift Orchestrator: Manages the training loop, coordinates self-play games, training, and evaluation
  2. C++ Engine: Implements MCTS search, UCI protocol, and position handling
  3. Python Network: MLX-based neural network for policy and value prediction

Training Pipeline

  1. Self-Play: Generate games using current model with visit-count-aware policy exports
  2. Training: Train with AlphaZero loss, cosine warmup/decay, gradient accumulation, and mixed precision
  3. Evaluation: Compare both raw and EMA checkpoints against the previous best on held-out shards
  4. Promotion: Update best model if validation/evaluation metrics improve beyond thresholds
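The cosine warmup/decay schedule in step 2 can be sketched as follows (a minimal illustration; the hyper-parameter names here are assumptions, not the actual names used in train.py):

```python
import math

def cosine_lr(step, base_lr=1e-3, warmup_steps=500, total_steps=10_000, min_lr=1e-5):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

The learning rate climbs linearly during warmup, peaks at base_lr, and follows a half-cosine down to min_lr by the final step.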

Quick Start

Prerequisites

  • macOS with Apple Silicon (M1/M2/M3)
  • Xcode Command Line Tools
  • Python 3.8+

Installation

  1. Install dependencies:

    brew install cmake
  2. Build the system:

    ./build.sh
  3. Run the orchestrator: (example with three workers and 24 total games)

    swift run --package-path orchestrator Orchestrator "$(pwd)/config.json" --workers 3 --games 24 --iterations 1 --verbose

Usage

Testing Individual Components

1. Test Neural Network

cd network && source venv/bin/activate && python -c "from model import ensure_device, create_model; from config import get_input_planes; import mlx.core as mx; ensure_device(); planes=get_input_planes(); m=create_model(); x=mx.random.normal((1,planes,8,8)); p,v=m(x); mx.eval(p,v); print('policy',p.shape,'value',v.shape)"

Expected output: policy (1, 4672) value (1, 1)

2. Test MLX Inference Server

cd .. && PYTHONPATH=network network/venv/bin/python -m network.inference --batch-size 2

Expected output: JSON response with policy array and value

3. Test UCI Engine

cd engine/build/bin && printf "uci\nisready\nquit\n" | ./betaone

Expected output: UCI protocol responses

Running the Full System

Option 1: Swift Orchestrator (Recommended)

cd orchestrator && swift run Orchestrator

This runs the complete self-play, training, and evaluation loop.

Option 2: Direct UCI Engine

./engine/build/bin/betaone

This starts the UCI engine for use with chess GUIs.

Option 3: Direct Training

cd network && source venv/bin/activate && python train.py

This runs training on existing self-play data. If MLX reports a Metal compiler failure (e.g. Unable to load kernel rbitsc), the process exits cleanly with guidance: restart the machine, or rerun the training step with MLX_DISABLE_METAL=1 to force CPU execution.

Orchestrator CLI Highlights

The Swift orchestrator accepts a number of knobs so you can tailor short smoke tests or longer training cycles without editing config.json:

  • --workers N: Override the number of concurrent self-play workers (defaults to self_play.num_workers).
  • --games N: Total self-play games per iteration (work is split evenly among workers).
  • --simulations N: Override MCTS simulations per move.
  • --batch-size N, --learning-rate X, --steps-per-checkpoint N: Adjust training hyper-parameters on the fly.
  • --eval-games N, --promotion-threshold X: Tune evaluation workload and promotion rule.
  • --max-cpu-cores N: Limit Python/MLX CPU threads if you want predictable resource usage.
  • --config-overrides '{...}': Apply ad-hoc JSON overrides for advanced scenarios.

When you launch the orchestrator the CLI writes a persistent JSONL event log under logs/ (e.g. logs/orchestrator_run_2025-10-21T19-15-43.995Z.jsonl). Tail it while a run is in progress to monitor phase transitions, per-iteration stats, and anomalous events:

tail -f "$(ls -t logs/orchestrator_run_*.jsonl | head -n1)"

Each iteration also updates logs/selfplay_manifest.json with a catalog of shards (size, timestamp) so you can inspect what training will consume.
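A small script can summarize the manifest instead of reading it by hand. The schema below (a top-level "shards" list with "path" and "size_bytes" fields) is an assumption for illustration; check the actual file for the real field names:

```python
import json

def summarize_manifest(path="logs/selfplay_manifest.json"):
    """Return (shard count, total bytes) from the self-play manifest.

    Assumes a {"shards": [{"path": ..., "size_bytes": ...}, ...]} layout,
    which is illustrative and may differ from the orchestrator's output.
    """
    with open(path) as f:
        manifest = json.load(f)
    shards = manifest.get("shards", [])
    total_bytes = sum(s.get("size_bytes", 0) for s in shards)
    return len(shards), total_bytes
```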

Tip: The training loop now validates the corpus size, applies cosine LR scheduling, and evaluates both raw and EMA weights each epoch. Ensure you regenerate self-play shards after upgrading—existing 18-plane datasets are rejected because the feature encoder now emits 22 planes.

UCI Engine Commands

Once the UCI engine is running, use these commands:

uci
isready
position startpos
go depth 10
quit

Self-Play Data Generation

The orchestrator automatically generates self-play data, trains the model, and evaluates performance in a continuous loop. Each move now captures both normalized policy logits and raw visit counts from MCTS, which are encoded into the training shards to preserve search strength.
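Visit counts are typically turned into policy targets by temperature-scaled normalization; a minimal sketch (the exact scheme used by the shard encoder is an assumption):

```python
def visits_to_policy(visits, temperature=1.0):
    """Normalize MCTS visit counts into a probability distribution.

    temperature < 1 sharpens toward the most-visited move; temperature -> 0
    approaches greedy argmax selection (typical after the opening phase).
    """
    if temperature <= 1e-6:  # greedy: all probability mass on the top move
        best = max(range(len(visits)), key=lambda i: visits[i])
        return [1.0 if i == best else 0.0 for i in range(len(visits))]
    scaled = [v ** (1.0 / temperature) for v in visits]
    total = sum(scaled)
    return [s / total for s in scaled]
```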

Device Selection

Device selection can be controlled through:

  1. Command-line --device flag (auto, gpu, or cpu) on training and inference tools
  2. Environment variable BETAONE_DEVICE
  3. runtime.device in config.json (defaults to auto)

When auto is selected, the system attempts to use GPU acceleration first and falls back to CPU if unavailable.
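The precedence above (CLI flag, then BETAONE_DEVICE, then runtime.device) can be sketched as a small resolver. This is a hypothetical helper, not the project's actual code:

```python
import os

def resolve_device(cli_flag=None, config=None):
    """Resolve the device with precedence: --device flag > BETAONE_DEVICE > config."""
    choice = (
        cli_flag
        or os.environ.get("BETAONE_DEVICE")
        or (config or {}).get("runtime", {}).get("device", "auto")
    )
    if choice not in ("auto", "gpu", "cpu"):
        raise ValueError(f"unknown device: {choice}")
    return choice
```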

Configuration

Edit config.json to adjust parameters:

{
  "self_play": {
    "num_games": 100,
    "num_workers": 2,
    "mcts_simulations": 800,
    "temperature_moves": 30
  },
  "training": {
    "batch_size": 256,
    "learning_rate": 0.001,
    "steps_per_checkpoint": 1000,
    "l2_reg": 0.0001
  },
  "runtime": {
    "device": "auto",
    "network_batch_size": 32,
    "network_timeout_ms": 5000,
    "max_concurrent_engines": 2
  },
  "evaluation": {
    "num_games": 100,
    "promotion_threshold": 0.55
  }
}
  • runtime.network_batch_size controls how many FENs the C++ engine batches per inference RPC
  • runtime.network_timeout_ms is the client-side deadline (in milliseconds) for inference responses
  • runtime.max_concurrent_engines bounds how many self-play/evaluation engines (and therefore MLX inference servers) may run at once; lower this to avoid exhausting Metal resources on smaller GPUs
  • training.min_positions defines the minimum number of positions required before a training epoch will run (the orchestrator now skips training gracefully when the corpus is too small)
  • training.max_cpu_cores, training.ema_decay, and training.grad_clip expose additional safeguards for MLX training stability
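The --config-overrides flag applies ad-hoc JSON on top of config.json. A recursive merge of that kind can be sketched as follows (illustrative, not the orchestrator's actual implementation):

```python
import json

def merge_config(base, overrides):
    """Recursively merge overrides into a copy of base: dicts merge, scalars replace."""
    out = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge_config(out[key], value)
        else:
            out[key] = value
    return out

base = json.loads('{"training": {"batch_size": 256, "learning_rate": 0.001}}')
merged = merge_config(base, json.loads('{"training": {"batch_size": 64}}'))
```

Untouched keys (learning_rate above) survive the merge, so an override only needs to name what it changes.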

Logs & Diagnostics

  • Event log: every orchestrator run creates a JSON Lines file in logs/ with phase boundaries, per-iteration summaries, and warnings (engine restarts, policy fallbacks, training skips, etc.).
  • Self-play manifest: logs/selfplay_manifest.json tracks all generated shards, their size, and timestamps—useful for auditing what training will consume.
  • Console stream: --verbose keeps the classic human-readable log (still the best way to watch MCTS statistics and evaluation outcomes in real time).

Performance

  • Model Size: Approximately 22M parameters
  • Inference: Varies by device and batch size
  • Self-Play: Depends on MCTS simulations and hardware
  • Training: Depends on batch size and hardware
  • Memory Usage: Varies during training

Benchmarking Tools

  • tools/benchmark/latency_tracker.py runs scripted matches at fixed simulation counts and reports per-move latency, nodes per second, and basic engine telemetry.
  • tools/benchmark/resource_monitor.py wraps the latency tracker, samples CPU/GPU/memory usage while varying worker counts, and writes JSON reports under logs/benchmark/ for later analysis.

Development

Building Individual Components

C++ Engine:

cd engine
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(sysctl -n hw.ncpu)

Swift Orchestrator:

cd orchestrator
swift build -c release

Python Environment:

cd network
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Checkpoint artifacts are not committed. Once MLX is installed, create fresh base weights with:

cd ..
./network/venv/bin/python network/make_base_checkpoint.py --name base --config config.json

The training pipeline now auto-initializes latest.safetensors / EMA snapshots from the regenerated base checkpoint, so no manual copies are required.

Testing

Run the environment doctor followed by the integration smoke test after the first build:

python tools/doctor.py
./tools/smoke_test.sh

If you prefer manual spot checks:

# Test neural network
cd network && source venv/bin/activate
python -c "from model import create_model; model = create_model()"

# Test UCI engine
printf "uci\nisready\nquit\n" | ./engine/build/bin/betaone

# Test orchestrator
cd orchestrator && swift run Orchestrator --help

Technical Details

Neural Network Architecture

  • Input: 8x8x22 planes (piece positions, side to move, castling rights, en-passant target, plus clock, material, and check indicators)
  • Backbone: 10 ResNet blocks with 256 channels
  • Attention: Multi-head self-attention (8 heads)
  • Policy Head: 4672 output classes (AlphaZero-style move encoding)
  • Value Head: Scalar output (-1 to 1)
  • Parameters: Approximately 22M total

MCTS Algorithm

  • PUCT Formula: Q + c_puct * P * sqrt(N_parent) / (1 + N_child)
  • Virtual Loss: Parallelization with virtual loss penalty
  • Dirichlet Noise: Root node exploration
  • Temperature: Move selection temperature scheduling
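The PUCT rule above, applied at each node to choose which child to descend into, can be sketched as (the dict-based node structure is illustrative, not the engine's C++ layout):

```python
import math

def select_child(children, c_puct=1.5):
    """Pick the child maximizing Q + c_puct * P * sqrt(N_parent) / (1 + N_child).

    Each child is a dict with mean value "q", prior probability "p", and
    visit count "n" (illustrative fields only).
    """
    n_parent = sum(c["n"] for c in children)

    def puct(c):
        u = c_puct * c["p"] * math.sqrt(n_parent) / (1 + c["n"])
        return c["q"] + u

    return max(children, key=puct)
```

Note how the exploration term shrinks as a child accumulates visits, so under-explored moves with decent priors eventually win the argmax.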

Training Process

  1. Self-Play: Generate games using current model
  2. Data Collection: Store positions, policies, and values
  3. Training: AlphaZero loss (policy + value + L2)
  4. Evaluation: Play against previous best model
  5. Promotion: Update best model if win rate > 55%
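The AlphaZero loss in step 3 combines policy cross-entropy, value MSE, and L2 regularization. A scalar sketch in pure Python (for clarity only; the real training code operates on MLX tensors):

```python
import math

def alphazero_loss(policy_logits, policy_target, value_pred, value_target,
                   weights, l2_reg=1e-4):
    """L = (z - v)^2 - pi . log softmax(logits) + c * ||theta||^2."""
    # Numerically stable log-softmax over the policy logits.
    m = max(policy_logits)
    exps = [math.exp(x - m) for x in policy_logits]
    log_total = math.log(sum(exps))
    log_probs = [(x - m) - log_total for x in policy_logits]
    policy_loss = -sum(p * lp for p, lp in zip(policy_target, log_probs))
    value_loss = (value_target - value_pred) ** 2
    l2 = l2_reg * sum(w * w for w in weights)
    return policy_loss + value_loss + l2
```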

Troubleshooting

Common Issues

  1. Build failures: Ensure all dependencies are installed
  2. Python errors: Activate virtual environment before running
  3. UCI connection: Check that the engine binary is built correctly
  4. Memory issues: Reduce batch size or number of workers

Detailed Troubleshooting

Build Issues

  • Ensure Xcode Command Line Tools are installed: xcode-select --install
  • Check CMake version: cmake --version (requires 3.20+)
  • Verify Swift installation: swift --version
  • Check if jsoncpp is installed: brew list jsoncpp

Runtime Issues

  • Check Python virtual environment: cd network && source venv/bin/activate
  • Verify MLX installation: python -c "import mlx.core as mx; print('MLX OK')"
  • Check engine binary: ./engine/build/bin/betaone --help
  • Verify inference server: cd network && source venv/bin/activate && echo '{"type": "single", "fen": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"}' | python inference.py

Performance Issues

  • Monitor memory usage during training: top -pid $(pgrep -f "python.*train.py")
  • Adjust num_workers in config for your CPU cores
  • Reduce mcts_simulations for faster games
  • Use smaller batch sizes if memory constrained
  • Check Metal GPU usage: system_profiler SPDisplaysDataType

Logs

Check the logs/ directory for detailed execution logs:

  • training.log: Training progress and loss curves
  • orchestrator.log: Self-play and evaluation metrics
  • engine.log: UCI engine debug output

Debug Commands

# Check system resources
top -l 1 | grep -E "(CPU|PhysMem)"

# Monitor Metal GPU usage
system_profiler SPDisplaysDataType | grep -A 5 "Metal"

# Check Python environment
cd network && source venv/bin/activate && python -c "import mlx.core as mx; print(f'MLX version: {mx.__version__}')"

# Test UCI engine connectivity
printf "uci\nisready\nquit\n" | ./engine/build/bin/betaone

# Check build artifacts
ls -la engine/build/bin/
ls -la orchestrator/.build/release/

Utilities

  • tools/chess_utils.py: Shared helpers for applying moves, querying board status, and parsing policy strings
  • tools/fen_tools.py: CLI wrapper that calls the shared helpers and packages self-play shards
  • logs/selfplay_manifest.json: Automatically maintained summary of generated training shards

Contributing

This is a personal project, but suggestions and improvements are welcome.

License

MIT License - see LICENSE file for details.

Acknowledgments

  • Inspired by AlphaZero and LC0
  • Built with MLX for Apple Silicon optimization
  • Uses standard UCI protocol for compatibility
