diff --git a/docs/audit-2026-05-08.md b/docs/audit-2026-05-08.md new file mode 100644 index 0000000..0472a42 --- /dev/null +++ b/docs/audit-2026-05-08.md @@ -0,0 +1,249 @@ +# Claude Kingdoms — CTO-Style Audit Against PRD v1.5 + +**Date:** 2026-05-08 +**Reviewer:** claude-code +**Repo HEAD:** `4b4b119` (PR #21 merged) +**Test count on `main`:** 149 passing + +This is a hard-eyed alignment review against `ClaudeKingdomsPRD.md` v1.5. +It calls out what's truly shipped, where the implementation diverges +from the spec, and what's outstanding for V1 launch. + +## Executive summary + +The bridge is in genuinely good shape. The Unciv mod side is +substantially shipped at the JSON level and a Lua runtime hook is now +on `main`. Three milestones (M0, M1 core, M2 core) are landed. + +But there are two **P0 alignment gaps** between the implementation and +the PRD's binding contracts that need to be closed before V1 launch: + +1. **`SessionStatus` enum is missing `Inactive` and `Redirected` as + distinct states** despite the PRD requiring all five (BR-1, TU-4). +2. **The Bridge–TUI JSON schema as written on disk does not match the + "minimum required fields" defined in the PRD** §"Bridge–TUI + Interface Contract" (line 231). Eight per-session fields are + missing from the on-disk payload. + +There are also several **P1 gaps**: the README GIF (acceptance #11), +the working tutorial flow at game-engine level (acceptance #9, #20), +parchment tileset and illuminated UI skin as image assets (M3 gap), +and most of M4 (full Operators civ, tutorial map, campaign map, +fun-factor check). + +## A. Functional requirement scorecard + +Legend: ✅ done · 🟡 partial · ❌ missing · n/a + +### Bridge Layer (BR-1 → BR-11) + +| ID | Requirement | Status | Evidence / gap | +|----|---|---|---| +| BR-1 | Classify sessions: Inactive, Active, WaitingHuman, Completed, Redirected | ❌ | `SessionStatus` enum has only ACTIVE/WAITING/COMPLETED/SUSPENDED. **Missing Inactive AND Redirected as states.** | +| BR-2 | Accumulate turn-value score between End Turn events | ✅ | `bridge_loop.process_turn` + `scoring.calculate_turn_rewards` | +| BR-3 | Per-session payload fields | 🟡 | Internal scoring is correct, but on-disk per-session payload omits the breakdown (see §B Schema audit) | +| BR-4 | Waiting/Inactive → zero contribution | ✅ | Verified by test (`test_waiting_session_yields_zero_contribution`) | +| BR-5 | Redirect resets/reduces momentum, no negative balance | ✅ | `_handle_permission_denied` + `test_permission_denied_resets_momentum_only` | +| BR-6 | Plan creation bonus (+15 default) | ✅ | `SimpleScoringStrategy` 15 prod/plan, capped at 2 | +| BR-7 | Goal completion bonus (+40 default) | ✅ | 40 gold + 20 sci per goal in `SimpleScoringStrategy` | +| BR-8 | Token bonus capped (+10/turn) | ✅ | `min(session.tokens, 10)` in scoring | +| BR-9 | Audit log of events and turn outputs | 🟡 | TUI `EventLogPanel` shows recent changes; no persisted audit log file | +| BR-10 | Local API or file exchange polled at End Turn | ✅ | `SaveExchange` writes `kingdom_save.json` per turn; mod.lua polls | +| BR-11 | Manual event injection in TUI | ✅ | g/p/r/a/t/s bindings, Pilot-tested | + +### Bridge & Save-State (BS-1 → BS-7) + +| ID | Requirement | Status | Evidence / gap | +|----|---|---|---| +| BS-1 | Versioned exchange payload (per-session/city totals) | 🟡 | Versioned at top level; per-session totals NOT in the payload (see §B) | +| BS-2 | Synchronizable with Unciv save-compatible state | ✅ | mod.lua reads kingdom_save.json | +| BS-3 | No Kotlin fork in V1 | ✅ | Mod is pure JSON + Lua | +| BS-4 | Future engine-integration ready | ✅ | Architecture is decoupled | +| BS-5 | Reward application turn-boundary based, not real-time | ✅ | `process_turn` is the only write path | +| BS-6 | All state preserved on close, no decay | ✅ | E2E test `test_no_decay_on_close` | +| BS-7 | Suspended (closed) vs Idle (open, no work) distinct | ❌ | Status enum has SUSPENDED but no IDLE/INACTIVE state. The distinction PRD requires is not implementable today. | + +### Claude Integration (CL-1 → CL-6) + +| ID | Requirement | Status | Evidence / gap | +|----|---|---|---| +| CL-1 | Detect Active vs WaitingHuman from hooks | ✅ | `_handle_user_prompt_submit` + `tick_idle_check` | +| CL-2 | Detect redirect via PermissionDenied | ✅ | `_handle_permission_denied` | +| CL-3 | Detect Stop → Completed | ❌ | Stop currently only sets the idle timer; never transitions to COMPLETED | +| CL-4 | Hooks scoped, opt-in, reviewable | 🟡 | `docs/hooks-setup.md` documents but no per-hook opt-in toggle | +| CL-5 | Hook installation reversible | 🟡 | Capture+replay path is documented; no one-command install/uninstall script | +| CL-6 | Plan mode + ultraplan as plan signals | 🟡 | SubagentStart with type=Plan handled; ultraplan-specific signal not differentiated | + +### Game Layer (GM-1 → GM-10) + +| ID | Requirement | Status | Evidence / gap | +|----|---|---|---| +| GM-1 | Installable via Unciv mod manager, no fork | ✅ | mod/ClaudeKingdoms/ + mod README | +| GM-2 | The Operators civ in JSON | ✅ | `mod/ClaudeKingdoms/jsons/Nations.json` | +| GM-3 | Session Forge building | ✅ | `Buildings.json` | +| GM-4 | Turn-boundary reward resolution from bridge | ✅ | `mod.lua` reads + applies + deletes file | +| GM-5 | Plan/goal events trigger visible bonuses | 🟡 | Bonus is in scoring; mod.lua applies the per-city total but does NOT show a discrete "plan +15" UI feedback | +| GM-6 | Custom parchment tileset | ❌ | Not present (image assets) | +| GM-7 | Custom illuminated UI skin | ❌ | Not present (image assets) | +| GM-8 | Per-tile session status indicator | ❌ | mod.lua does not emit tile annotations | +| GM-9 | Standard Unciv save + multiplayer compatibility | 🟡 | Save format unmodified; multiplayer untested | +| GM-10 | Long-running campaign, no forced end | 🟡 | No forced end is fine; campaign map itself not yet authored | + +### TUI Sidecar (TU-1 → TU-9) + +| ID | Requirement | Status | Evidence / gap | +|----|---|---|---| +| TU-1 | Terminal-first, Textual | ✅ | | +| TU-2 | Slot panel per session: state, streak, estimate, recent events | 🟡 | Single combined view; no per-slot panel decomposition (works for 1 session, weak for many) | +| TU-3 | Launch/attach Claude Code from a slot | ❌ | No "attach session" UI affordance | +| TU-4 | State badges in plain English | 🟡 | KingdomView prints status, but as raw enum `[active]`/`[waiting]` not full English; missing Inactive/Redirected | +| TU-5 | Next-turn estimate per slot | 🟡 | Single estimate panel for all sessions, not per-slot | +| TU-6 | Manual controls labeled as dev instrumentation | 🟡 | Footer shows binding labels but no "DEV" disclaimer | +| TU-7 | Standard terminal usable, degrades over SSH | ✅ | Textual handles this | +| TU-8 | Keyboard-first, attractive, legible | ✅ | Parchment palette, footer hints | +| TU-9 | TUI consumes only the versioned schema | ❌ | TUI reads `SessionState` directly, not via the JSON schema. Couples TUI to bridge internals. | + +## B. Schema audit — the binding contract drift + +PRD §"Bridge–TUI Interface Contract" line 231 specifies the **minimum +required fields per session slot**: + +| PRD field | Present? | Notes | +|---|---|---| +| `schema_version` | ❌ | We pin a top-level `version`; not embedded per session | +| `session_id` | ✅ | | +| `state` | ❌ | We have `status` (different name) AND wrong enum (see §A BR-1) | +| `momentum_streak` | ❌ | We store `momentum` as a single int — PRD splits streak count from multiplier | +| `momentum_multiplier` | ❌ | Not stored or computed as a separate float | +| `base_value_this_turn` | ❌ | Computed inside scoring, not embedded in the per-session payload | +| `token_bonus` | ❌ | Same | +| `plan_bonus` | ❌ | Same | +| `goal_bonus` | ❌ | Same | +| `net_payout` | ❌ | Same | +| `last_event` | ❌ | Not tracked | +| `last_event_ts` | ❌ | Not tracked | +| `is_manual_injection` | ❌ | Not tracked | + +**This is the largest single gap in the build.** The PRD is explicit +that this schema is "the binding contract between bridge contributors +and TUI contributors, enabling parallel work without blocking" — and +it's drifted enough that a strict TUI-side contributor reading +`docs/bridge-tui-schema.md` would not find these fields. + +`docs/bridge-tui-schema.md` itself appears to be the M0 placeholder +schema, not the PRD-mandated one. Worth re-deriving from the PRD. + +## C. Acceptance criteria scorecard (V1 launch gates) + +| # | Criterion | Status | +|---|---|---| +| 1 | Player installs in 10 min | 🟡 README has install steps; not timed/measured | +| 2 | Session activity drives next-turn rewards end-to-end | ✅ HookListener + ScoringEngine + SaveExchange + mod.lua | +| 3 | WaitingHuman → zero turn value | ✅ | +| 4 | Redirect resets momentum + log entry | ✅ scoring + listener; log entry in TUI EventLogPanel | +| 5 | Plan creation triggers visible in-game bonus | 🟡 Bonus computed and persisted; **not visually distinguished in-game** (GM-5 partial) | +| 6 | Goal completion triggers larger visible in-game bonus | 🟡 Same — bonus persisted, not visually distinguished | +| 7 | Map and UI look medieval, parchment-like | 🟡 TUI: yes; Unciv mod: NOT YET — no tileset/skin | +| 8 | Onboarding delivers emotional contract first | ✅ `OnboardingScreen` + Unciv tutorials | +| 9 | Tutorial completes <10 min, transitions to campaign | ❌ Tutorial flow doesn't exist in-engine yet | +| 10 | Bridge tests pass in CI on every PR with ownership matrix | ✅ 149 tests, CI green, ownership documented in PRD | +| 11 | README has working GIF of session→reward loop | ❌ No GIF | +| 12 | Documentation covers four contribution paths | 🟡 README + hooks-setup.md + ADRs; "four paths" not enumerated explicitly | +| 13 | New contributor completes a `good first issue` unaided | ❌ No `good first issue` exists yet | +| 14 | V1 architecture works without Kotlin fork | ✅ | +| 15 | Reward application turn-boundary based | ✅ | +| 16 | Empire state preserved on close, restored on reopen | ✅ | +| 17 | Bridge distinguishes Suspended from Idle | ❌ Enum gap (BS-7 above) | +| 18 | Long-running campaign, no forced end | 🟡 architecturally yes; no campaign map yet | +| 19 | Every session ends with a visible unresolved frontier | ❌ Not designed/implemented in TUI or mod | +| 20 | Tutorial→campaign handoff works in same session | ❌ Tutorial doesn't exist (see #9) | +| 21 | Spike 1 + 2 ADRs published before M0 | ✅ ADR-001, ADR-002 on main | + +**Summary: 9/21 ✅, 6/21 🟡, 6/21 ❌.** The gap-to-launch is real but +mostly concentrated in three buckets: schema/enum corrections (P0), +in-engine visual feedback for events (P1), and content authoring for +M4 (tutorial map, campaign map, fun-factor check, GIF, four +contribution paths). + +## D. Critical issues to fix before V1 launch + +### P0 — must close before any further milestone work + +1. **Add `INACTIVE` and `REDIRECTED` to `SessionStatus`** (BR-1, TU-4). + Today's PR-merge cadence has been moving so fast that this enum + gap has propagated through the listener, scoring, TUI, mod.lua, + and tests. Each callsite assumes 4 states; PRD requires 5+. + +2. **Realign the Bridge–TUI JSON schema with the PRD-mandated minimum + fields** (Schema audit above). Either the PRD's schema needs an + update, or the bridge's payload does — pick one and converge. + Currently the binding contract is broken on paper. + +3. **Implement `Stop → COMPLETED` transition** (CL-3). Today Stop only + starts the idle timer; the spec requires it can also flip the + session to COMPLETED in the right context. + +### P1 — needed for V1 launch but not blocking M2/M3 + +4. **Visual feedback for plan/goal events in the Unciv mod** (GM-5, + acceptance #5/#6). The bonus lands as gold/production but the + player can't see "+15 from plan" — just a higher resource total. + Without this the emotional contract from M3 onboarding rings + hollow. + +5. **Per-tile session status indicator** (GM-8). The mod.lua hook + doesn't emit per-tile annotations. + +6. **Author the tutorial flow + tutorial map in-engine** (acceptance + #9, #20). Right now there's an OnboardingScreen in the TUI but no + tutorial map / mode in Unciv. + +7. **README GIF** (acceptance #11). Mechanical. + +8. **Four contribution paths documented** (acceptance #12). Pick + four (bridge / TUI / mod-content / docs) and add stubs. + +### P2 — quality issues and tech debt + +9. The `default` profile keeps showing up as the assignee on tasks + that should be claude-code. Hermes's dispatcher behavior here is + inconsistent. + +10. The TUI `KingdomView` reads `SessionState` directly, violating + TU-9 ("TUI must consume only via the schema"). + +11. `audit log` (BR-9) — only an in-memory rolling log exists. PRD + expects a persisted, structured, human-readable log file. + +12. `gemini-cli` MCP keeps re-adding itself via `hermes setup` and + failing to load. Either fix the upstream `mcp-gemini-cli` package + or pin it as `enabled: false` in the Hermes config. + +## E. Recommended next-priority order + +1. P0 #1 (enum) — small change, blast radius is contained, unblocks + every PRD-spec audit downstream. +2. P0 #2 (schema) — write the proper schema doc + adapter that emits + the PRD's minimum fields. Probably 1–2 PRs. +3. P0 #3 (Stop → COMPLETED). +4. P1 #4 (per-event visual feedback in mod). +5. M4 content (Operators civ flesh-out, tutorial + campaign maps, + fun-factor check) — substantial work. +6. The remaining acceptance-criterion shores (GIF, contributor + docs, `good first issue`). + +## F. What I would not stress about + +- Test coverage is genuinely strong (149 tests, CI green, Pilot + integration tests for the TUI bindings). +- The bridge architecture is clean and decoupled — no Kotlin fork, + schema-versioned save exchange, no-decay invariant verified end to + end. +- The kanban board is now a faithful record after today's + reconciliation work; ghost-done patterns have been called out and + Hermes's memory has the discipline rules. +- Mod installs cleanly per `mod/ClaudeKingdoms/README.md`. JSON + schema test coverage will catch breakage in CI. + +The build is V1-ready in **bones**; the V1 contract requires the P0 +fixes plus M4 content work to ship.