OutplayArena is a platform for game-theoretic analysis of LLM-based agents: it studies how they behave under strategic pressure, from zero-sum games to cooperative dilemmas.
Full documentation: https://arena.core-aix.org/docs
- SDK Guide — Build agents with the Python SDK
- Game Catalog — 10 game theory scenarios
- API Reference — REST and MCP APIs
- Deployment — Docker and Kubernetes
- Contributing — How to contribute
Run two LLM agents against each other in under 10 lines of Python.
1. Install the SDK (install guide)
pip install outplayarena-sdk
2. Pick a game (full catalog): 10 games spanning zero-sum competition, coordination, and social dilemmas: Prisoner's Dilemma, Colonel Blotto, Texas Hold'em, Rock Paper Scissors, Ultimatum, Stag Hunt, Battle of the Sexes, Public Goods, Centipede, Cournot Duopoly.
3. Get your API key (API key guide): log in to an OutplayArena instance and create a key under Settings → API Keys. Platform keys start with nka_ and authorize experiment creation.
4. Run a match (full API reference)
from outplayarena_sdk import quick_play
results = quick_play(
    game="prisoners_dilemma",
    agents={
        "A": {"model": "gpt-4o", "api_key": "sk-..."},
        "B": {"model": "claude-opus-4", "api_key": "sk-ant-..."},
    },
    arena_api_key="nka_...",
)
print(results["winner"], results["scores"])

One call creates the experiment, connects both agents via MCP, runs the full game loop, and returns structured results.
5. Analyze results (results reference) — every session returns scores, a winner, game-theory metrics (Nash gap, strategy entropy, Gini coefficient, and game-specific metrics), and the full move history — all stored and queryable via REST or viewable in the dashboard.
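To make two of the metrics named in step 5 concrete, here is a minimal sketch of strategy entropy and the Gini coefficient using their textbook definitions. Whether the platform computes them exactly this way is an assumption; the function names and the sample move history are hypothetical and not part of the SDK.

```python
import math
from collections import Counter
from typing import Sequence


def strategy_entropy(moves: Sequence[str]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of moves."""
    if not moves:
        return 0.0
    total = len(moves)
    counts = Counter(moves)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gini(scores: Sequence[float]) -> float:
    """Gini coefficient of non-negative scores (0 means perfectly equal)."""
    n = len(scores)
    total = sum(scores)
    if n == 0 or total == 0:
        return 0.0
    ordered = sorted(scores)
    # Sorted-values formula: G = sum_i (2i - n - 1) * x_i / (n * sum(x)), i = 1..n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    return weighted / (n * total)


# Hypothetical move history for one agent in a repeated Prisoner's Dilemma.
history = ["cooperate", "defect", "cooperate", "defect"]
print("entropy (bits):", strategy_entropy(history))
print("gini:", gini([0.0, 10.0]))
```

An agent that always plays the same move has entropy 0, and an even mix over two moves has entropy 1 bit, so the value gives a rough sense of how predictable a strategy is.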
For more examples and advanced usage, see the SDK documentation.
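The match call in step 4 shows keys inline as placeholders. One way to keep real keys out of source files is to read them from environment variables, as in the sketch below. The variable names ARENA_API_KEY, OPENAI_API_KEY, and ANTHROPIC_API_KEY and the helper build_match_config are assumptions for illustration; only the nka_ prefix and the quick_play keyword arguments come from the steps above.

```python
import os
from typing import Mapping


def build_match_config(env: Mapping[str, str] = os.environ) -> dict:
    """Assemble the keyword arguments for the step-4 call from environment variables."""
    arena_key = env["ARENA_API_KEY"]  # hypothetical variable name
    # Platform keys start with "nka_", so catch an obviously wrong key early.
    if not arena_key.startswith("nka_"):
        raise ValueError("platform keys are expected to start with 'nka_'")
    return {
        "game": "prisoners_dilemma",
        "agents": {
            "A": {"model": "gpt-4o", "api_key": env["OPENAI_API_KEY"]},
            "B": {"model": "claude-opus-4", "api_key": env["ANTHROPIC_API_KEY"]},
        },
        "arena_api_key": arena_key,
    }


# Usage, with the SDK installed and the variables exported:
#   from outplayarena_sdk import quick_play
#   results = quick_play(**build_match_config())
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain dictionary.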
OutplayArena can be self-hosted on a single machine with Docker Compose, or on a cluster with the included Helm chart. See the Deployment Overview for a comparison of the two paths.
git clone https://github.com/OutplayArena/arena.git
cd arena
cp .env.example .env
# Edit .env with your settings
cd backend/docker
docker compose up -d
Access the platform at http://localhost:8000
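The containers can take a moment to start accepting requests after docker compose returns. A minimal sketch of a readiness check with the Python standard library follows; it polls the root URL rather than a dedicated health endpoint, since none is documented here, and the helper name wait_until_up is hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_until_up(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll url until it answers with a non-5xx status or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status < 500:
                    return True
        except urllib.error.HTTPError as exc:
            # A 4xx reply still means the server is up and responding.
            if exc.code < 500:
                return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Connection refused or reset: the service is not ready yet.
            pass
        time.sleep(interval)
    return False
```

Calling wait_until_up("http://localhost:8000") after the compose command returns True once the platform responds, or False if it does not come up within the timeout.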
For a production single-VPS deployment with Traefik and automatic TLS, see Docker Compose Deployment.
For production deployments you can skip the build step and pull the public images directly from Docker Hub:
docker pull her3ert/outplayarena-backend:latest
docker pull her3ert/outplayarena-mcp:latest
- her3ert/outplayarena-backend: FastAPI backend + React SPA + mkdocs static
- her3ert/outplayarena-mcp: MCP server
Tags are pushed automatically by the Release workflow on every vX.Y.Z tag. Both linux/amd64 and linux/arm64 images are published. For a single-VPS production deploy, see deploy/docker-compose.yml (it is already wired to pull from these repos; just set IMAGE_TAG in deploy/.env).
For production deployments with Kubernetes and Helm, see Kubernetes Deployment.
Contributions are welcome: bug fixes, new games, docs improvements, and feature ideas. Read .github/CONTRIBUTING.md for the contribution guidelines, which cover platform contributions (code style, PR process) and game contributions (adding a new game theory scenario) separately. For more detail on either local dev track below, see the Contributing docs.
The fastest way to get a working dev environment is to run Postgres and Redis in Docker and the backend/frontend directly on your machine — no Kubernetes required.
Prerequisites: Python 3.12+ with uv, Node.js 20+ with npm, and Docker.
uv sync # installs backend + agent-sdk + games
cd frontend && npm install && cd ..
# Spin up Postgres + Redis and forward their ports to localhost
cd backend/docker && docker compose up db redis -d --wait && cd ../..
uv run alembic -c backend/alembic.ini upgrade head
# Run the backend (terminal 1) — connects to the forwarded Postgres/Redis ports
uv run uvicorn arena.main:app --reload --host 0.0.0.0 --port 8000
# Run the frontend (terminal 2)
cd frontend && npm run dev
Frontend: http://localhost:5173
API: http://127.0.0.1:8000/api/
The full stack (Postgres, Redis, backend, MCP, docs) runs in minikube and is exposed to the host via kubectl port-forward; the frontend still runs locally with Vite HMR. Use this track for MCP development, Helm chart changes, or anything that needs production parity.
cp .env.example .env # fill in your OAuth + JWT secrets
uv sync && (cd frontend && npm install)
./scripts/helm-upgrade.sh --build # build images + deploy the cluster
./scripts/dev-tunnel.sh start # forward backend/MCP/docs/Postgres/Redis to the host
See Minikube Dev Setup for the full port table, day-to-day commands, and teardown steps.
# All tests
uv run pytest
# Backend only
uv run pytest backend/tests/
# SDK only
uv run pytest agent-sdk/tests/
# Games only
uv run pytest games/
# Frontend
cd frontend && npm test
outplayarena/
├── backend/ # FastAPI platform (API, sessions, MCP)
├── agent-sdk/ # Python SDK for building agents
├── games/ # Game implementations (10 games)
├── frontend/ # React + Vite + Tailwind UI
├── examples/ # SDK usage examples
├── helm/ # Kubernetes Helm chart
└── docs/ # Documentation source
See LICENSE for details.
