Learning Attractors Enables Scalable Reasoning
Benhao Huang · Zhengyang Geng · Zico Kolter
CMU
Code for reproducing EqR experiments on Sudoku-Extreme and Maze-Unique.
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
adam-atan2 must install its CUDA backend (adam_atan2_backend); a
Python-only install is not sufficient for training. Build it in an environment
where CUDA and Python headers are available:
# If CUDA is not already configured, point CUDA_HOME at the active CUDA
# toolkit. For example:
# export CUDA_HOME="$(dirname "$(dirname "$(which nvcc)")")"
# export PATH="$CUDA_HOME/bin:$PATH"
# Optional: set this when building without visible GPUs or when cross-compiling.
# Examples: H100=9.0, A100=8.0, RTX 4090/L40=8.9.
# export TORCH_CUDA_ARCH_LIST=9.0
python -m pip install --no-build-isolation --no-cache-dir --force-reinstall \
adam-atan2==0.0.3
python - <<'PY'
from adam_atan2 import AdamATan2
import adam_atan2_backend
print(AdamATan2, adam_atan2_backend.__file__)
PY
Maze training also requires FlashAttention on CUDA.
FlashAttention install notes
EqR follows the HRM attention import pattern: prefer FlashAttention-3 via
flash_attn_interface when available, and fall back to FlashAttention-2 via
flash_attn. On NVIDIA Hopper GPUs, FlashAttention-3 is recommended. If a
local wheel is available:
python -m pip install --no-deps <path-to-flash-attn-3-wheel.whl>
python - <<'PY'
import flash_attn_interface
print(flash_attn_interface.__file__)
PY
If FlashAttention-3 is unavailable, install FlashAttention-2:
python -m pip install flash-attn --no-build-isolation
python - <<'PY'
from flash_attn import flash_attn_func
print(flash_attn_func)
PY
For W&B defaults, copy config/secrets.example.yaml to config/secrets.yaml
and fill in your entity/project. config/secrets.yaml is ignored by git.
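The FlashAttention import preference from the install notes (FlashAttention-3 through flash_attn_interface first, then FlashAttention-2 through flash_attn) can be checked without importing either package. The following is a minimal sketch, not the repository's own import code; the function name `pick_attention_backend` and its `is_available` hook are hypothetical and exist only for illustration.

```python
import importlib.util


def pick_attention_backend(is_available=None):
    """Return the preferred importable FlashAttention module name, or None.

    Preference order mirrors the install notes: flash_attn_interface
    (FlashAttention-3) first, then flash_attn (FlashAttention-2).
    `is_available` is a hypothetical hook so the order can be checked
    on machines without either package installed.
    """
    if is_available is None:
        # find_spec locates a top-level module without importing it.
        def is_available(name):
            return importlib.util.find_spec(name) is not None

    for name in ("flash_attn_interface", "flash_attn"):
        if is_available(name):
            return name
    return None


if __name__ == "__main__":
    print("attention backend:", pick_attention_backend())
```

A result of `None` suggests that neither package is installed in the active environment, which would matter before starting Maze training.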
Default dataset paths:
| Dataset | Path |
|---|---|
| Sudoku-Extreme | data/sudoku-extreme-1k-aug-1000 |
| Maze-Unique | data/maze-30x30-unique-1k |
Download datasets:
bash scripts/download_artifacts.sh
Download datasets and pretrained checkpoints:
bash scripts/download_artifacts.sh --with-ckpts
The helper downloads Hugging Face snapshots under downloads/eqr-artifacts,
copies datasets into data/, and copies checkpoints into
downloaded_checkpoints/.
Checkpoint paths:
downloaded_checkpoints/sudoku-extreme/eqr.pth
downloaded_checkpoints/maze-unique/eqr.pth
Dataset rebuild commands
Use these commands only if you want to regenerate the datasets instead of
downloading locuslab/EqR-data.
Sudoku-Extreme follows the
sapientinc/HRM setup. The builder downloads
train.csv and test.csv from sapientinc/sudoku-extreme, converts them to
HRM/EqR-compatible NumPy arrays, and augments only the training split with
Sudoku-preserving digit, row, column, and transpose transformations.
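To illustrate what "Sudoku-preserving" means here, the sketch below applies one random digit relabeling, a row and column shuffle that respects the 3x3 bands, and an optional transpose to a puzzle and its solution together. It is a pure-Python illustration under stated assumptions, not the code in dataset.build_sudoku_dataset; the names `augment_sudoku` and `shuffled_lines`, and the convention that 0 marks a blank cell, are assumptions.

```python
import random


def shuffled_lines(rng):
    """Permute the three bands, then the three lines inside each band."""
    bands = [0, 1, 2]
    rng.shuffle(bands)
    order = []
    for band in bands:
        inner = [0, 1, 2]
        rng.shuffle(inner)
        order.extend(3 * band + i for i in inner)
    return order


def augment_sudoku(puzzle, solution, rng):
    """Apply one validity-preserving transform to a 9x9 puzzle and solution.

    Grids are lists of 9 lists of ints; 0 is assumed to mark a blank cell.
    """
    digits = list(range(1, 10))
    rng.shuffle(digits)
    digit_map = [0] + digits  # blanks stay blank, digits 1..9 are relabeled
    rows = shuffled_lines(rng)
    cols = shuffled_lines(rng)
    transpose = rng.random() < 0.5

    def apply(grid):
        if transpose:
            grid = [list(col) for col in zip(*grid)]
        return [[digit_map[grid[r][c]] for c in cols] for r in rows]

    # The same transform is applied to both grids so they stay consistent.
    return apply(puzzle), apply(solution)
```

Rows and columns are only shuffled within a band, and bands are shuffled as whole units, because an arbitrary row permutation would mix cells between 3x3 boxes and could break the box constraint.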
Default EqR/HRM small-sample dataset:
python -m dataset.build_sudoku_dataset \
--output-dir data/sudoku-extreme-1k-aug-1000 \
--subsample-size 1000 \
--num-aug 1000 \
--seed 42
Full Sudoku-Extreme dataset:
python -m dataset.build_sudoku_dataset \
--output-dir data/sudoku-extreme-full \
--seed 42
The generated directory should contain train/, test/, and
identifiers.json. Each split contains dataset.json plus
all__inputs.npy, all__labels.npy, all__puzzle_identifiers.npy,
all__puzzle_indices.npy, and all__group_indices.npy.
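The layout described above can be checked before launching a long training run. This is a small sketch that only tests for the presence of the listed files; the helper name `missing_dataset_files` is hypothetical, and the sketch does not inspect array shapes or the contents of dataset.json.

```python
from pathlib import Path

SPLITS = ("train", "test")
SPLIT_FILES = (
    "dataset.json",
    "all__inputs.npy",
    "all__labels.npy",
    "all__puzzle_identifiers.npy",
    "all__puzzle_indices.npy",
    "all__group_indices.npy",
)


def missing_dataset_files(root):
    """Return the expected files that are absent under `root`, as relative paths."""
    root = Path(root)
    expected = [root / "identifiers.json"]
    expected += [root / split / name for split in SPLITS for name in SPLIT_FILES]
    return [str(path.relative_to(root)) for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files("data/sudoku-extreme-1k-aug-1000")
    print("missing files:", missing if missing else "none")
```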
Additional Sudoku-Extreme options:
python -m dataset.build_sudoku_dataset --help
Build Maze-Unique:
python dataset/build_maze_unique_dataset.py \
--output-dir data/maze-30x30-unique-1k \
--grid-size 30 \
--train-samples 1000 \
--test-samples 1000 \
--maze-mode perfect \
--length-distribution uniform \
--min-path-length 100 \
--max-path-length 140 \
--require-unique \
--dedupe
scripts/train.sh launches torchrun --standalone with one local process by
default. Override NPROC_PER_NODE only when you want multi-GPU training:
bash scripts/train.sh eqr_sudoku
bash scripts/train.sh trm_sudoku
bash scripts/train.sh eqr_maze_unique
bash scripts/train.sh trm_maze_unique
NPROC_PER_NODE=2 bash scripts/train.sh eqr_maze_unique

| Dataset | Config | Steps |
|---|---|---|
| Sudoku-Extreme | config/train/eqr_sudoku.yaml | 50k |
| Sudoku-Extreme TRM baseline | config/train/trm_sudoku.yaml | 50k |
| Maze-Unique | config/train/eqr_maze_unique.yaml | 100k |
| Maze-Unique TRM baseline | config/train/trm_maze_unique.yaml | 100k |
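The launch behaviour described for scripts/train.sh (one local process unless NPROC_PER_NODE is set, and a config name that maps to a file under config/train/) can be pictured as follows. This is a sketch of the described behaviour, not a transcription of the script: the function name `launch_prefix` is hypothetical, the use of torchrun's `--nproc_per_node` flag is an assumption about how the script forwards the variable, and the training entry point is deliberately left out because the text does not name it.

```python
import os


def launch_prefix(config_name, env=None):
    """Return (torchrun argument prefix, config path) for a training config name.

    Assumes NPROC_PER_NODE defaults to 1 and that config names resolve to
    config/train/<name>.yaml, as in the table above.
    """
    env = os.environ if env is None else env
    nproc = int(env.get("NPROC_PER_NODE", "1"))
    if nproc < 1:
        raise ValueError("NPROC_PER_NODE must be a positive integer")
    prefix = ["torchrun", "--standalone", f"--nproc_per_node={nproc}"]
    config_path = f"config/train/{config_name}.yaml"
    return prefix, config_path


if __name__ == "__main__":
    print(launch_prefix("eqr_sudoku"))
```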
scripts/eval.sh uses config/eval/depth_breadth.yaml. By default it runs
depth 16 with breadth 1.
bash scripts/eval.sh /path/to/checkpoint.pth
bash scripts/eval.sh /path/to/checkpoint.pth halt_max_steps=64
bash scripts/eval.sh /path/to/checkpoint.pth \
halt_max_steps=64 different_init=128 convergence_top_k=1 global_batch_size=16
After downloading checkpoints, evaluate the pretrained EqR models:
bash scripts/eval.sh downloaded_checkpoints/sudoku-extreme/eqr.pth
bash scripts/eval.sh downloaded_checkpoints/maze-unique/eqr.pth
Evaluation config:
config/eval/depth_breadth.yaml
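The evaluation commands above pass settings as trailing `key=value` arguments after the checkpoint path. As a rough picture of how such overrides are typically turned into a settings mapping, here is a minimal sketch; the function name `parse_overrides` is hypothetical, and the repository's actual evaluation script may rely on a configuration library with different parsing rules.

```python
def parse_overrides(args):
    """Parse key=value strings into a dict, converting integer values to int."""
    overrides = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        try:
            overrides[key] = int(value)
        except ValueError:
            overrides[key] = value  # keep non-integer values as strings
    return overrides


if __name__ == "__main__":
    print(parse_overrides(["halt_max_steps=64", "different_init=128"]))
```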
TODO
- Release XLA code for large-scale inference.

This codebase builds on the HRM and TRM repositories.
This project is released under the Apache License 2.0. See LICENSE for details.
