A collection of community game environments for the ARC-AGI-3 benchmark.
Preview gallery (game previews not shown): mm01 · sk01 · tb01 · rs01 · sy01 · tt01 · ms01 · ff01 · sl01 · sv03 · lo02 · ph01 · or01 · tt03 · zq03 · ml01
> The intelligence of a system is a measure of its skill-acquisition efficiency over a scope of tasks, with respect to priors, experience, and generalization difficulty.
>
> — François Chollet, *On the Measure of Intelligence* (2019)
These games are designed to be easy for humans to solve, but very hard for modern AI systems—including frontier large language models. Together they stress reasoning, planning, and interactive control rather than memorized puzzle templates.
- Massive testing ground — 200+ community games in `GAMES.md` beside the official ARC-AGI-3 list; train, test, and evaluate agents on varied unseen tasks for generalization.
- Fast prototyping — Local offline `environment_files/`, agent step-through, and `--mode human` to learn win conditions by playing.
- New joiner friendly — Tutorial stems ez01–ez04 in `GAMES.md` teach ARC-AGI-3 movement, goals, and the shared human/agent action interface before the challenging-difficulty and official ARC-AGI-3 games.
- LLM-friendly authoring — `AGENTS.md` plus skills under `skills/`: create-arc-game, play-arc-game.
- Human–AI discoverable tasks — Goals and rules are intended to be learnable through play (observation, actions, consequences), not from docs alone, under the same interface for humans and agents; see check-arc-game-discoverable.
- Mechanical solvability checks — Per-level winnability under the shipped rules via `devtools/verify_level_solvability.py`; see check-arc-game-solvable.
- Competition mode — `uv run python run_game.py --competition` matches the real toolkit (competition rules) before you submit.
- Official leaderboard — With `ARC_API_KEY` and `--online`, runs can count on three.arcprize.org (see API / leaderboard / competition below).
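To make the agent/environment loop concrete, here is a minimal illustrative sketch of a random agent stepping through an environment until it wins. The `ToyEnv` class and its `reset`/`step` methods are invented for illustration; the repo's real environments live under `environment_files/` and their API may differ.

```python
import random

ACTIONS = ["ACTION1", "ACTION2", "ACTION3", "ACTION4", "ACTION5"]

class ToyEnv:
    """Stand-in environment: reach position 3 on a 1-D track."""
    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # ACTION1 moves right, ACTION2 moves left; other actions are no-ops here.
        if action == "ACTION1":
            self.pos += 1
        elif action == "ACTION2":
            self.pos = max(0, self.pos - 1)
        won = self.pos >= 3
        return self.pos, won

def random_rollout(env, steps=100, seed=0):
    """Play random actions for up to `steps`; return the step count on a win, else None."""
    rng = random.Random(seed)
    env.reset()
    for t in range(steps):
        _obs, won = env.step(rng.choice(ACTIONS))
        if won:
            return t + 1
    return None
```

This mirrors the default random-agent mode conceptually: uniform choices over ACTION1–ACTION5 for a bounded number of steps.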
Requires Python 3.12+ and uv. From the repo root, install dependencies once:
```shell
uv sync
```

Run a game (tutorial ez01):

```shell
# Omitting --mode defaults to random-agent (random ACTION1–ACTION5 for --steps); set explicitly below.
# Local play uses environment_files/ by default unless --online / --competition.
uv run python run_game.py \
  --offline \
  --version auto \
  --mode random-agent \
  --game ez01
```

Discover local environments:
```shell
# List local environments; add --offline to pin the listing to local environment_files/.
# For a chosen stem, --version auto picks the sole package under it.
uv run python run_game.py --list
# uv run python run_game.py --offline --list
```

With `--mode human`, local keys map to abstract actions as in Actions (WASD + Space, arrows + F; digits 1–5 also send ACTION1–5):
| Input | Action |
|---|---|
| W / ↑, S / ↓, A / ←, D / → (or 1–4) | ACTION1–ACTION4 |
| Space / F / 5 | ACTION5 |
| Click grid | ACTION6 |
| U or Ctrl/Cmd+Z | ACTION7 |
| R | Restart (calls `environment.reset()`) |
| Q / Esc | Quit window |
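The key bindings above amount to a simple lookup from input name to abstract action. This mapping is illustrative only; the repo's actual human-mode input handling (pygame event processing) is surely structured differently:

```python
# Illustrative key → abstract-action mapping, mirroring the bindings table.
# Key names here are invented labels, not pygame constants.
KEY_TO_ACTION = {
    **{k: "ACTION1" for k in ("w", "up", "1")},
    **{k: "ACTION2" for k in ("s", "down", "2")},
    **{k: "ACTION3" for k in ("a", "left", "3")},
    **{k: "ACTION4" for k in ("d", "right", "4")},
    **{k: "ACTION5" for k in ("space", "f", "5")},
    **{k: "ACTION7" for k in ("u", "ctrl+z")},
}
```

ACTION6 is omitted because it is a grid click (a position, not a key press), and R/Q map to restart/quit rather than abstract actions.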
```shell
uv run python run_game.py \
  --offline \
  --version auto \
  --mode human \
  --game ez01
```

Swap `ez01` for any stem from `uv run python run_game.py --list` (use `--offline` to list only local `environment_files/` packages).
three.arcprize.org — leaderboard, online play, and ARC_API_KEY. Only online runs count there; play against local environments does not.
- Copy `.env.example` → `.env` and set `ARC_API_KEY` (nothing else required in the template).
- For API play, pass `--online` (registry) or `--competition` (competition rules, Kaggle-style) — pick one; they are mutually exclusive. Omit both to use local environments (same as Quickstart). More flags: `uv run python run_game.py --help`.
Online (registry):

```shell
uv run python run_game.py --online \
  --version auto \
  --game ls20
```

Competition toolkit mode:

```shell
uv run python run_game.py --competition \
  --version auto \
  --game ls20
```

Example prompt (replace `{game_id}` and the bracketed design):
```
Implement a new ARC-AGI-3 game {game_id} at environment_files/{game_id}/v1/. Follow AGENTS.md and skills/create-arc-game/SKILL.md: static levels only, ARCBaseGame + metadata.json, register a row in GAMES.md. Game design: [grid size, entities, win/lose, which actions 1–7 do].
```
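To show the shape of game such a prompt describes, here is a toy sketch of a static-level grid game with a win condition. This does NOT use the repo's real `ARCBaseGame` API; the class and method names below are invented purely for illustration:

```python
# Toy sketch: static levels, four movement actions, a reach-the-goal win condition.
# Not the repo's ARCBaseGame API — names here are illustrative only.
class ToyGridGame:
    LEVELS = [  # static levels only: each level is a fixed (start, goal, size) spec
        {"start": (0, 0), "goal": (2, 2), "size": 3},
    ]

    MOVES = {  # ACTION1–ACTION4 as the four grid directions (up/down/left/right)
        "ACTION1": (-1, 0), "ACTION2": (1, 0),
        "ACTION3": (0, -1), "ACTION4": (0, 1),
    }

    def __init__(self, level: int = 0):
        spec = self.LEVELS[level]
        self.size = spec["size"]
        self.pos = spec["start"]
        self.goal = spec["goal"]

    def step(self, action: str) -> bool:
        """Apply one action; return True once the win condition is met."""
        dr, dc = self.MOVES.get(action, (0, 0))  # unknown actions are no-ops
        r = min(max(self.pos[0] + dr, 0), self.size - 1)  # clamp to the grid
        c = min(max(self.pos[1] + dc, 0), self.size - 1)
        self.pos = (r, c)
        return self.pos == self.goal
```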
Available skills (under skills/):
- create-arc-game — End-to-end game design and implementation (`environment_files/`, `ARCBaseGame`, `metadata.json`, `GAMES.md`) aligned with `AGENTS.md`.
- play-arc-game — Run and smoke-test local environments with `run_game.py` (list, random agent, human pygame, offline/online flags).
- generate-arc-game-gif — GIF-ready `RenderableUserDisplay` and registry previews via `scripts/render_arc_game_gif.py`.
- check-arc-game-discoverable — Review whether goals and mechanics are inferable through play under the shared human/agent interface (not repo prose as the teacher).
- check-arc-game-solvable — Mechanical per-level solvability under the shipped rules via `devtools/verify_level_solvability.py`.
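Conceptually, a mechanical solvability check searches a level's state space for a reachable win state. Here is a minimal BFS sketch over a toy grid level; it is illustrative only, as `devtools/verify_level_solvability.py` operates on the repo's real game objects and rules:

```python
from collections import deque

def level_is_solvable(size, start, goal, walls):
    """BFS over grid positions: is `goal` reachable from `start` avoiding `walls`?"""
    frontier, seen = deque([start]), {start}
    while frontier:
        r, c = frontier.popleft()
        if (r, c) == goal:
            return True
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in walls and nxt not in seen):
                seen.add(nxt)
                frontier.append(nxt)
    return False  # exhausted the reachable states without hitting the goal
```

BFS is a natural fit here because game state spaces are finite and the check only needs reachability, not an optimal solution.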
Full checklist: CONTRIBUTING.md.