feat(arena): SAP-style autobattler — contracts + MCP + Tournament Hall UI#29
Merged
…pike
AbilityLib is a pure library implementing a trigger × effect × target dispatcher backed by a FIFO event queue capped at 64 steps. Modeled on Autochessia's EventType/Attribute/ApplyTo three-enum pattern but stripped of MUD/ECS: pure Solidity memory structs only.
UnitCatalog encodes the 12 spike units (4 tiers × 3 units) as a pure function table; names lean into the Gravity Town theme (Mineworker, Stoneguard, Pyromancer, Wraith, etc.). The ability matrix covers all 6 triggers and all 5 effect types.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
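For readers unfamiliar with the pattern, the dispatcher can be sketched off-chain in TypeScript. This is an illustration only, not the Solidity in AbilityLib: the type names, the `dispatchAbilities` function, the reduced enum values, and the choice that ALL_ALLIES excludes the caster are all assumptions. Only the FIFO queue and the 64-step cap come from the commit message.

```typescript
// Illustrative sketch only. Enum values and helper names are hypothetical.
type AbilityTrigger = "ON_BUY" | "ON_SELL" | "ON_DEATH";
type AbilityEffect = "ADD_ATK" | "ADD_HP";
type AbilityTarget = "SELF" | "RIGHT_NEIGHBOR" | "ALL_ALLIES";

interface Ability {
  trigger: AbilityTrigger;
  effect: AbilityEffect;
  target: AbilityTarget;
  amount: number;
}

interface BenchUnit {
  atk: number;
  hp: number;
  ability?: Ability;
}

interface QueuedEvent {
  trigger: AbilityTrigger;
  sourceSlot: number;
}

// The cap from the commit message: at most 64 queue steps per dispatch.
const MAX_QUEUE_STEPS = 64;

// Map a target selector to concrete slot indices (assumption: ALL_ALLIES skips the caster).
function resolveTargets(target: AbilityTarget, slot: number, size: number): number[] {
  if (target === "SELF") return [slot];
  if (target === "RIGHT_NEIGHBOR") return slot + 1 < size ? [slot + 1] : [];
  const allies: number[] = [];
  for (let i = 0; i < size; i++) if (i !== slot) allies.push(i);
  return allies;
}

// Drain events in FIFO order, stopping at the step cap. Returns steps consumed.
function dispatchAbilities(units: BenchUnit[], initial: QueuedEvent[]): number {
  const queue: QueuedEvent[] = [...initial];
  let steps = 0;
  while (queue.length > 0 && steps < MAX_QUEUE_STEPS) {
    const ev = queue.shift();
    if (ev === undefined) break;
    steps++;
    const ability = units[ev.sourceSlot]?.ability;
    if (!ability || ability.trigger !== ev.trigger) continue;
    for (const idx of resolveTargets(ability.target, ev.sourceSlot, units.length)) {
      const unit = units[idx];
      if (!unit) continue;
      if (ability.effect === "ADD_ATK") unit.atk += ability.amount;
      else unit.hp += ability.amount;
    }
  }
  return steps;
}
```

A fuller version would push follow-up events (for example deaths caused by an effect) onto the same queue; the step cap is what keeps such chains bounded.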
…Engine slot
ArenaEngine is the async ghost autobattler. Highlights:
- 5 player verbs (buy/sell/move/freeze/roll) gated by canControlAgent
- ELO-bucketed matchmaking with 30min rate-limit per bucket + Fisher-Yates pairing
- View-only deterministic simulateMatch(matchId) replays combat from seed + ghost snapshots; battle never writes storage
- settleMatch updates ELO and writes an "arena defeat" evaluation entry on the loser via EvaluationLedger
GameEngine.spendOre is a single ~10-line operator hook so ArenaEngine can deduct ore without owning a separate balance system. Auto-harvests first to ensure a stale pool doesn't block a valid spend.
Router gains a storage-appended `arenaEngine` slot + `getAddressesV2()` returning 7 addresses. The original `getAddresses()` 6-tuple is preserved verbatim so chain.ts and Upgrade.s.sol's length-sniff decoder don't break.
Design judgments:
- sell() emits the refund amount but skips the credit-back path for the spike: GameEngine has no public refund hook today; documented as TODO with rationale.
- Combat draw → defender wins (need a tiebreaker for ELO; spike judgment).
- ELO uses a linear approximation of the standard logistic with K=32; pure on-chain fixed-point logistic is overkill for a spike.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
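The Fisher-Yates pairing step can be illustrated with a small off-chain sketch. Everything here is an assumption made for illustration: `pairBucket` and `mulberry32` are hypothetical names, the seeded PRNG merely stands in for whatever seed-derived randomness the contract uses, and leaving the last shuffled agent unpaired for odd bucket sizes is one plausible policy rather than the confirmed on-chain behavior.

```typescript
// Small seeded PRNG (mulberry32), used here only as a stand-in randomness source.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface BucketPairing {
  pairs: [number, number][];
  sitsOut: number | null; // agent left unpaired when the bucket size is odd
}

// Shuffle a copy of the bucket with Fisher-Yates, then pair adjacent entries.
function pairBucket(agentIds: number[], seed: number): BucketPairing {
  const ids = [...agentIds];
  const rand = mulberry32(seed);
  for (let i = ids.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = ids[i]!;
    ids[i] = ids[j]!;
    ids[j] = tmp;
  }
  const pairs: [number, number][] = [];
  for (let i = 0; i + 1 < ids.length; i += 2) {
    pairs.push([ids[i]!, ids[i + 1]!]);
  }
  const sitsOut = ids.length % 2 === 1 ? ids[ids.length - 1]! : null;
  return { pairs, sitsOut };
}
```

Because the shuffle depends only on the bucket contents and the seed, the same inputs always produce the same pairing, which is the property a replayable matchmaker needs.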
ArenaEngine.t.sol: 12 tests covering the 5 player verbs, ELO bucketing, matchmaking rate-limit + pairing, deterministic simulation, ELO settlement, ability-chain triggers, and the queue-depth safeguard.
Deploy.s.sol: deploys ArenaEngine proxy, registers it as operator on the registry so spendOre / EvaluationLedger.write succeed, and wires it into Router via setArenaEngine.
Upgrade.s.sol: mirrors the EvaluationLedger backfill pattern: if the router slot is unset, deploys a fresh ArenaEngine proxy and registers it; otherwise upgrades the existing impl. Idempotent.
Also: renamed UnitCatalog.getUnit's `cost` return name to `unitCost` to silence the shadowing warning vs the sibling `cost()` helper. Pure cosmetic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ON_BUY (Mineworker +1 ATK self, Battlemage +2 ATK right) and ON_SELL (Ravenscout +1 ATK all allies) used to fire in catalog but never reached the battle: bench only stored uint8 unitType and _buildBattleState re-pulled clean base stats from UnitCatalog on every simulate.
Add a parallel int16 atkOverride / hpOverride pair to Ghost, snapshot it onto each Match at creation, and stack the overlay on top of base stats in _materialize. AbilityLib gains applyBenchAbility, a shop-phase processor that honors EFF_ADD_ATK / EFF_ADD_HP with TGT_SELF, LEFT/RIGHT_NEIGHBOR and ALL_ALLIES. move() now swaps overlays alongside slots; sell() zeroes the seller's overlay slot after firing its own ON_SELL.
Also adds GameEngine.refundOre, an operator-only credit that caps at MAX_ORE_POOL, and rewires ArenaEngine.sell to actually refund cost/2 ore (previously the refund was only emitted, never credited).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
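The overlay idea (base stats from the catalog plus a per-slot override that travels with the unit) can be sketched as follows. This is a hedged TypeScript illustration, not the contract code: the `GhostBench` shape, the `HYPOTHETICAL_BASE_STATS` values, the "0 means empty slot" convention, and the helper names are all assumptions.

```typescript
interface UnitStats {
  atk: number;
  hp: number;
}

// Parallel arrays indexed by bench slot, mirroring the atkOverride / hpOverride pair.
interface GhostBench {
  unitTypes: number[]; // assumption: 0 marks an empty slot
  atkOverride: number[];
  hpOverride: number[];
}

// Made-up catalog values, for illustration only.
const HYPOTHETICAL_BASE_STATS: Record<number, UnitStats> = {
  1: { atk: 1, hp: 2 },
  2: { atk: 2, hp: 3 },
};

// Combat stats = clean base stats + whatever shop-phase buffs were recorded for the slot.
function materializeSlot(ghost: GhostBench, slot: number): UnitStats | null {
  const unitType = ghost.unitTypes[slot];
  if (unitType === undefined || unitType === 0) return null;
  const base = HYPOTHETICAL_BASE_STATS[unitType];
  if (!base) return null;
  return {
    atk: base.atk + (ghost.atkOverride[slot] ?? 0),
    hp: base.hp + (ghost.hpOverride[slot] ?? 0),
  };
}

function swapAt(arr: number[], a: number, b: number): void {
  const tmp = arr[a] ?? 0;
  arr[a] = arr[b] ?? 0;
  arr[b] = tmp;
}

// Moving a unit must move its overlay too, otherwise buffs stay behind on the old slot.
function swapSlots(ghost: GhostBench, a: number, b: number): void {
  swapAt(ghost.unitTypes, a, b);
  swapAt(ghost.atkOverride, a, b);
  swapAt(ghost.hpOverride, a, b);
}
```

Keeping the overlay separate from the catalog means the catalog stays a pure lookup, while a snapshot of the three arrays is enough to replay a match later.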
…field
Tighten ability dispatch and matchmaking:
- AbilityLib._applyEffect now requires BUFF_NEIGHBOR effects to be
registered with TGT_SELF; the buff is always relative to the caster's
slot and a target-relative variant is incoherent. Surfaces catalog
mistakes in tests instead of mis-buffing the wrong unit at runtime.
- ArenaEngine adds MAX_BUCKET_SIZE = 256 so a single bucket can't grow
unbounded and brick Fisher-Yates gas during runMatchmaking. Excess
submitters revert with "bucket full" and naturally rebalance into
other ELO bands.
- Draws (both sides alive after 200-turn cap, or simultaneous wipes)
now pick a winner from keccak(seed, "draw") instead of always
awarding defender — removes a systematic attacker-vs-defender bias.
- ShopFrozen event gains a nowFrozen bool so clients can mirror the
toggle without reading the mask back.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
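The seed-derived draw tiebreak can be illustrated off-chain. This sketch deliberately swaps keccak for FNV-1a, a weak non-cryptographic hash, purely so the example stays dependency-free; the function names and the mapping of the low bit to a side are also assumptions, not the contract's actual encoding.

```typescript
type ArenaSide = "attacker" | "defender";

// FNV-1a (32-bit). Stand-in for keccak256 only; not suitable for anything adversarial.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Derive the draw winner from the match seed plus a fixed "draw" domain tag,
// so the outcome is reproducible on replay and not tied to attacker/defender role.
function resolveDrawWinner(seed: string): ArenaSide {
  const digest = fnv1a32(seed + ":draw");
  return (digest & 1) === 0 ? "attacker" : "defender";
}
```

The domain tag matters: hashing the seed together with a fixed label yields a value independent of any other randomness drawn from the same seed during combat.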
Adds 17 new tests pinning the spike contracts that double-review flagged:
- ELO symmetric K=32 (low-beats-high and high-beats-low) via a new
previewEloUpdate external getter — frontends can reuse for "+X/-X"
pre-settle hints, tests use it to lock the symmetric Elo behavior.
- settle writes a properly-shaped evaluation entry on the loser
(rating=4, category="arena", related=[winner]).
- bucket boundary crossing after a win moves the ghost between buckets.
- matchmaking: odd-N sits one out; uses Match snapshot (not current
bench); respects MAX_BUCKET_SIZE via the new "bucket full" require
(storage-poked to dodge 256-real-agent gas blowup).
- buy: rejects out-of-range unit types and double-bought slots.
- freeze: toggle emits nowFrozen on both true and false transitions.
- roll: changes seed and net-spends ROLL_COST despite auto-harvest.
- ON_BUY (Mineworker self) and ON_SELL (Ravenscout all-allies) buffs
persist all the way into combat damage numbers.
- BUFF_NEIGHBOR with non-SELF target reverts via the dispatch require.
- simulateMatch is view-only — calling it twice doesn't mutate ELO.
- Wraith ON_DEATH summon is suppressed when no empty slot exists.
- Draws resolve by keccak(seed,"draw") tiebreak (no attacker/defender
bias).
Refactor: split _simulateInternal out of simulateMatch into a shared
_runCombat loop so settleMatch can skip the 128-slot trace alloc — saves
~5-10k gas per settle.
Doc: pin _eloUpdate's symmetric behavior in NatSpec as intentional spike
simplification + TODO PR #2 to swap in fixed-point logistic. Also adds
TODO comments at the prevrandao / seedMix XOR / canControlAgent /
Ghost.season / Router.getAddressesV2 sites called out by review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
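One possible reading of "linear approximation with K=32" and "symmetric" is sketched below. The formula is an assumption for illustration and is not taken from `_eloUpdate`: the expected score is approximated by a straight line clamped at ±400 rating points, and the winner gains exactly what the loser drops. `sketchEloDelta`, `applyEloResult`, and `ELO_LINEAR_SPAN` are hypothetical names.

```typescript
const ELO_K = 32;
const ELO_LINEAR_SPAN = 400; // hypothetical clamp for the linear approximation

// Points transferred from loser to winner, using integer (floor) arithmetic
// the way fixed-point on-chain math would.
function sketchEloDelta(winnerElo: number, loserElo: number): number {
  const rawDiff = winnerElo - loserElo;
  const diff = Math.max(-ELO_LINEAR_SPAN, Math.min(ELO_LINEAR_SPAN, rawDiff));
  return Math.floor((ELO_K * (ELO_LINEAR_SPAN - diff)) / (2 * ELO_LINEAR_SPAN));
}

// Symmetric update: the two ratings move by the same amount in opposite directions.
function applyEloResult(winnerElo: number, loserElo: number): { winner: number; loser: number; delta: number } {
  const delta = sketchEloDelta(winnerElo, loserElo);
  return { winner: winnerElo + delta, loser: loserElo - delta, delta };
}
```

Under these assumptions an even match moves 16 points, a 200-point underdog win moves 24, and a 200-point favorite win moves 8, so a frontend could render a "+X/-X" hint from the delta alone.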
Adds ArenaEngine ABI + 7 chain client methods + decodes Router.getAddressesV2 (7-tuple) with backwards-compatible fallback for routers without arena slot.
5 new MCP tools wired in tools.ts:
- arena_list_units: list the 12-unit catalog (id, name, atk, hp, cost, ability)
- arena_get_state: per-agent bench + ELO + bucket + ore
- arena_buy: spend ore, fill slot, trigger ON_BUY + persist overlay
- arena_submit: push ghost into ELO bucket for matchmaking
- arena_get_recent_matches: read EvaluationLedger entries tagged category="arena"
agent_id auto-injected via selfTools convention; arena calls degrade gracefully when router lacks arenaEngine() (returns descriptive error, no crash).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
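The backwards-compatible decode can be sketched as a length check on the returned tuple. This is a minimal illustration, not the code in chain.ts: `decodeRouterAddresses` and its return shape are hypothetical, and it assumes the arena address is the seventh element and that an unset slot reads back as the zero address.

```typescript
const ZERO_ADDRESS = "0x0000000000000000000000000000000000000000";

interface DecodedRouterAddresses {
  core: string[]; // the original 6 addresses, order unchanged
  arenaEngine: string | null; // null when the router predates the arena slot or the slot is unset
}

// Accept either the legacy 6-tuple or the V2 7-tuple and normalize both to one shape.
function decodeRouterAddresses(raw: readonly string[]): DecodedRouterAddresses {
  if (raw.length !== 6 && raw.length !== 7) {
    throw new Error(`unexpected address tuple length: ${raw.length}`);
  }
  const core = raw.slice(0, 6);
  const seventh = raw.length === 7 ? raw[6] : undefined;
  const arenaEngine =
    seventh !== undefined && seventh.toLowerCase() !== ZERO_ADDRESS ? seventh : null;
  return { core, arenaEngine };
}
```

Callers can then treat `arenaEngine === null` as "arena not available" and return a descriptive error instead of crashing.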
mcp.ts: arena_buy/sell/submit/get_state/get_recent_matches added to selfTools
agent_id auto-fill; collectContext now pulls arena_get_state into
AgentContext.arenaState (graceful fallback on missing arena)
llm.ts: appends ARENA section to system prompt (12-unit catalog summary +
5 verbs + bench rules); buildUserPrompt renders arenaState as a
PHASE: ARENA hint block when ghost exists or ore is high enough
to participate
types.ts: AgentContext gets optional arenaState field
Agents pick up Arena play organically — no phase mode switch needed. In
end-to-end demo runs with arena-focused personalities, 4 agents played 3
full matches and produced legitimate post-match reasoning (incl. Lila's
ON_BUY ordering self-correction captured on-chain in AgentLedger).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spectator UI for the Arena. 5-block layout per the codex design proposal (chess.com Events × LangSmith trace × Smallville spectator-view hybrid).
New files (12, ~1.4k LOC):
- app/arena/page.tsx: main layout + selection state
- components/arena/TopBar.tsx: LIVE clock + bucket activity
- components/arena/LeaderboardPanel.tsx: top ELO + recent matches rail
- components/arena/StagePanel.tsx: focused-match replay theater
- components/arena/AgentMindPanel.tsx: selected agent's reasoning timeline
- components/arena/ReplayCanvas.tsx: turn-by-turn battle renderer
- components/arena/UnitCard.tsx: single-unit card primitive
- components/arena/EvalBar.tsx: ELO-delta linearized winner bar
- components/arena/HighlightTicker.tsx: upset / streak-break feed
- hooks/useArenaEngine.ts: chain polling (mirrors useGameEngine)
- store/useArenaStore.ts: zustand slice
- lib/arenaUnits.ts: 12-unit name + ability mapping
Reuses Phaser sprites where possible but renders Arena in pure React DOM + CSS keyframes; SAP-style 5-slot bench doesn't need camera/zoom.
Bonus fix (useGameEngine + useArenaEngine): treat any RPC URL on a private RFC1918 range (10/8, 172.16/12, 192.168/16), as well as 127.0.0.1 and localhost, as a localhost build for network-picker fallback purposes. Previously only literal 127.0.0.1/localhost matched, which broke LAN access to the dev server.
contracts/foundry.lock: commit forge dependency lockfile (forge-std rev).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
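The private-range check from the bonus fix can be sketched like this. It is a hedged illustration rather than the hook's actual code: `isLocalOrPrivateRpc` is a hypothetical name, the host is extracted with a simple regex that ignores userinfo and IPv6 literals, and only the ranges listed in the commit message are covered.

```typescript
// Returns true when the RPC URL points at localhost, 127.0.0.1, or an RFC1918 private IPv4 address.
function isLocalOrPrivateRpc(rpcUrl: string): boolean {
  // Pull out the host between "scheme://" and the first ":", "/", "?" or "#".
  const hostMatch = /^[a-z][a-z0-9+.-]*:\/\/([^/:?#]+)/i.exec(rpcUrl);
  if (!hostMatch || hostMatch[1] === undefined) return false;
  const host = hostMatch[1].toLowerCase();
  if (host === "localhost" || host === "127.0.0.1") return true;

  const ipMatch = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!ipMatch) return false;
  const octets = ipMatch.slice(1).map(Number);
  if (octets.some((o) => o > 255)) return false;

  const first = octets[0] ?? -1;
  const second = octets[1] ?? -1;
  if (first === 10) return true; // 10.0.0.0/8
  if (first === 172 && second >= 16 && second <= 31) return true; // 172.16.0.0/12
  if (first === 192 && second === 168) return true; // 192.168.0.0/16
  return false;
}
```

For example, `isLocalOrPrivateRpc("http://192.168.1.20:8545")` is true while a public hostname such as `https://rpc.example.org` is not, which is the distinction the network picker needs for LAN dev access.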
- ReplayCanvas: drop unused useState import and dead isAttacker bindings
- useArenaEngine: replace `any` in simulate decode with `readonly unknown[]`
Summary
Implements the full Arena vertical slice described in #27 — from contracts up to spectator UI — and validated end-to-end with LLM agents playing real matches against each other.
Three logical layers in 9 commits:
🔨 Contracts (6 commits, fully reviewed + fixed)
- `ArenaEngine.sol`: UUPS proxy, 5 verbs (buy/sell/move/freeze/roll) + submit + matchmaking + simulate + settle + ELO update
- `AbilityLib.sol`: trigger × effect × target dispatcher with BFS event queue (cap 64); pattern borrowed from Autochessia
- `UnitCatalog.sol`: 12-unit spike catalog as immutable function table
- `GameEngine.sol`: adds `spendOre` / `refundOre` operator hooks (only main-world delta)
- `Router.sol`: adds `arenaEngine` slot + `getAddressesV2()` (preserves existing `getAddresses()` 6-tuple)
- `Upgrade.s.sol`: handles backfill deploy when router lacks arena slot
- `ArenaEngine.t.sol`: 29 tests covering core flows + review-driven gap fills
🤖 MCP + agent-runner integration (2 commits)
- `arena_list_units`, `arena_get_state`, `arena_buy`, `arena_submit`, `arena_get_recent_matches`
- `agent-runner/mcp.ts` pulls arena_get_state into context each cycle
- `agent-runner/llm.ts` appends Arena prompt block + renders arenaState into user prompt
🎨 Frontend: AI Tournament Hall (1 commit)
- `/arena` React route, 5-block layout:
- `useGameEngine` + `useArenaEngine` now treat any RFC1918 private IP as a localhost build (was breaking LAN access)
Verification
- … `0xFb2aF6D5cFF7A04Bcfd043236884B9e7137050D4` + local anvil
- … `test_simulate_deterministic_same_seed_same_winner`)
- `cast send` flow reproduces all 5 verbs + matchmaking + settle on anvil
What's NOT in this PR
- … `runMatchmaking` (manual call works; cron later)
Known TODOs (carried in #27, not blocking merge)
- P0: Design remaining 48 units · `MATCHMAKING_PERIOD` → setter · double-sided EvalLedger writes
- P1: `chain.ts` migrate to `router.arenaEngine()` direct getter · EOA-only players · `Ghost.season` field
- P2 (before prize pool): VRF for matchmaking · `keccak(seed, k)` not XOR · logistic ELO
Test plan
- `cd contracts && forge test`: 57/57 pass
- `just anvil-deploy` + `cast send` flow per `ArenaEngine.t.sol` reproduces on local chain
- `cd frontend && APP_CONFIG=localhost npm run dev`: `/arena` renders with empty state on fresh anvil
- … `/tmp/arena-demo/` from PR validation)
Refs
🤖 Generated with Claude Code