solatis · Apr 19, 2026 · Mar 26, 2026
diff --git a/.config/wt.toml b/.config/wt.toml
@@ -0,0 +1,12 @@
+# Koan project worktree hooks
+# Docs: https://worktrunk.dev/hook/
+
+[post-create]
+deps = "uv sync --dev"
+
+[post-start]
+copy = "wt step copy-ignored"
+
+[pre-merge]
+check = "uv run ruff check ."
+test = "uv run pytest"
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -0,0 +1,28 @@
+name: CI
+
+on:
+  push:
+    branches: ["main"]
+  pull_request:
+  workflow_dispatch:
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+
+      - name: Install dependencies
+        run: uv sync --dev
+
+      - name: Run tests
+        run: uv run pytest
diff --git a/.gitignore b/.gitignore
@@ -1,4 +1,14 @@
-node_modules/
-dist/
 .pi/
 .DS_Store
+
+.claude/
+plans/
+.env
+.env.*
+*.log
+
+# Frontend build output (committed source lives in frontend/src/)
+koan/web/static/app/
+frontend/node_modules/
+frontend/dist/
+__pycache__/
diff --git a/.koan/memory/.gitignore b/.koan/memory/.gitignore
@@ -0,0 +1,2 @@
+.index/
+summary.md
diff --git a/.koan/memory/0001-persistent-orchestrator-over-per-phase-cli.md b/.koan/memory/0001-persistent-orchestrator-over-per-phase-cli.md
@@ -0,0 +1,8 @@
+---
+title: Persistent orchestrator over per-phase CLI spawning
+type: decision
+created: '2026-04-16T07:13:41Z'
+modified: '2026-04-16T07:13:41Z'
+---
+
+This entry documents the orchestrator spawn architecture decision in koan's workflow engine (`koan/driver.py`). On 2026-04-02, Leon redesigned the system to replace per-phase CLI process spawning with a single long-lived orchestrator process running the entire workflow in one continuous session. Previously, each planning phase spawned a fresh `claude`, `codex`, or `gemini` CLI process; a separate `workflow-orchestrator` subagent was then spawned to present the user with a phase-selection decision after each phase completed. Leon's rationale: per-phase spawning caused compounding context loss (each new process re-derived what the previous had explored), and the workflow-orchestrator role was architecturally wasteful -- "a process-boot just to ask a question." Two alternatives were explicitly rejected: (1) API-based conversation (driver calling the LLM API directly) -- would have bypassed the runner abstraction handling model selection, MCP config, output streaming, and thinking mode; (2) context injection into fresh processes per phase -- cheaper but fails to provide a persistent reasoning chain and does not eliminate the workflow-orchestrator overhead. The redesign landed in `koan/driver.py` as a single `spawn_subagent()` call awaiting the orchestrator's exit, and added `koan_set_phase` as the new phase-transition tool replacing the two-tool `koan_propose_workflow` / `koan_set_next_phase` dance.
diff --git a/.koan/memory/0002-step-first-workflow-pattern-boot-prompt-is.md b/.koan/memory/0002-step-first-workflow-pattern-boot-prompt-is.md
@@ -0,0 +1,8 @@
+---
+title: Step-first workflow pattern -- boot prompt is exactly one sentence
+type: decision
+created: '2026-04-16T07:13:50Z'
+modified: '2026-04-16T07:13:50Z'
+---
+
+The step-first workflow pattern governs how all LLM subagent CLI processes in koan receive task instructions. On 2026-02-10, Leon established this as a load-bearing architectural invariant in the koan initial design (documented in `docs/architecture.md` as Invariant 2 and enforced in `koan/web/mcp_endpoint.py`). The rule: every subagent's boot prompt is exactly one sentence -- role identity plus "Call koan_complete_step to receive your instructions." Task details, phase guidance, and tool lists arrive exclusively as the return value of the first `koan_complete_step` MCP call. The pattern was motivated by a failure mode observed with haiku-class (weaker) models: complex task instructions in the boot prompt caused these models to produce text output on the first turn and exit without ever entering the tool-calling loop. Three reinforcement mechanisms make the pattern robust across model capability levels: primacy (boot prompt is the LLM's very first message), recency (`format_step()` in `koan/phases/format_step.py` always appends "WHEN DONE: Call koan_complete_step..." last), and muscle memory (by step 2 the LLM has called the tool multiple times, locking in the pattern).
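The step-first pattern in entry 0002 can be sketched as follows. This is a minimal illustration, not koan's code: `boot_prompt` and this simplified `format_step` signature are hypothetical, the prompt wording is an assumption, and the real `format_step()` in `koan/phases/format_step.py` is not shown in this diff. The trailing reminder string is kept abbreviated exactly as the entry quotes it.

```python
# Hypothetical sketch of the step-first pattern; names and wording are illustrative.

# Abbreviated exactly as quoted in the memory entry (the full text is not shown there).
WHEN_DONE = "WHEN DONE: Call koan_complete_step..."


def boot_prompt(role: str) -> str:
    """Build a one-sentence boot prompt: role identity plus the tool-call instruction."""
    return f"You are the {role}; call koan_complete_step to receive your instructions."


def format_step(title: str, instructions: list[str]) -> str:
    """Render a step so that the tool-call reminder is always the last line (recency)."""
    lines = [f"## {title}", *instructions, "", WHEN_DONE]
    return "\n".join(lines)


if __name__ == "__main__":
    print(boot_prompt("intake orchestrator"))
    print(format_step("Gather", ["Read the task description.", "List open questions."]))
```

Task detail lives only in the value returned by the step call, so the boot prompt never grows with the phase.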
diff --git a/.koan/memory/0003-server-authoritative-projection-via-json-patch.md b/.koan/memory/0003-server-authoritative-projection-via-json-patch.md
@@ -0,0 +1,8 @@
+---
+title: Server-authoritative projection via JSON Patch over symmetric dual fold
+type: decision
+created: '2026-04-16T07:13:57Z'
+modified: '2026-04-16T07:13:57Z'
+---
+
+The koan projection system maintains frontend-visible workflow state for the browser dashboard, served via Server-Sent Events from `koan/projections.py`. On 2026-03-29, Leon decided to replace a dual fold architecture with a server-authoritative JSON Patch model. The prior design maintained two independent fold implementations -- one in Python (`koan/projections.py`) and one in TypeScript (`frontend/src/sse/connect.ts`) -- required to produce identical projections from the same event sequence. Two production bugs traced directly to these folds diverging: fragmented thinking cards in the activity feed, and scout events appearing incorrectly in the primary agent's conversation feed. Leon's decision: Python computes the fold and the RFC 6902 JSON Patch diff after each event; the browser applies patches mechanically via `fast-json-patch` with no fold logic, no event interpretation, and no business rules. Simultaneously, Leon adopted camelCase for all wire-format keys so patches apply directly to the Zustand store without a field-renaming layer. The correctness guarantee is now structural: one fold in one place.
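The server-authoritative model in entry 0003 can be illustrated with a dependency-free sketch. Everything here is an assumption made for illustration: the event types, the `fold`, `diff_top_level`, and `apply_ops` helpers, and the restriction to top-level keys without `/` or `~` in their names. Per the entries in this diff, the real system computes patches in Python and applies them in the browser with `fast-json-patch`; the client side is written in Python here only to keep the example in one language.

```python
import copy


def fold(state: dict, event: dict) -> dict:
    """Hypothetical server-side fold: the only place events are interpreted."""
    new_state = copy.deepcopy(state)
    if event["type"] == "phase_started":
        new_state["phase"] = event["phase"]
    elif event["type"] == "agent_added":
        new_state["agentCount"] = new_state.get("agentCount", 0) + 1
    return new_state


def diff_top_level(old: dict, new: dict) -> list[dict]:
    """Emit RFC 6902-style add/replace/remove operations for top-level keys only."""
    ops = []
    for key in old:
        if key not in new:
            ops.append({"op": "remove", "path": f"/{key}"})
        elif old[key] != new[key]:
            ops.append({"op": "replace", "path": f"/{key}", "value": new[key]})
    for key in new:
        if key not in old:
            ops.append({"op": "add", "path": f"/{key}", "value": new[key]})
    return ops


def apply_ops(doc: dict, ops: list[dict]) -> dict:
    """Client-side apply: purely mechanical, with no knowledge of event semantics."""
    result = copy.deepcopy(doc)
    for op in ops:
        key = op["path"][1:]  # strip the leading "/" of the JSON Pointer
        if op["op"] == "remove":
            del result[key]
        else:  # "add" and "replace" both set the key
            result[key] = op["value"]
    return result


if __name__ == "__main__":
    server, client = {}, {}
    events = [
        {"type": "phase_started", "phase": "intake"},
        {"type": "agent_added"},
        {"type": "phase_started", "phase": "plan-spec"},
    ]
    for event in events:
        updated = fold(server, event)
        client = apply_ops(client, diff_top_level(server, updated))
        server = updated
    print(client == server)
```

Because the client never sees events, there is no second fold that could drift from the first.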
diff --git a/.koan/memory/0004-file-boundary-invariant-llms-write-markdown.md b/.koan/memory/0004-file-boundary-invariant-llms-write-markdown.md
@@ -0,0 +1,8 @@
+---
+title: File boundary invariant -- LLMs write markdown, driver writes JSON
+type: decision
+created: '2026-04-16T07:14:03Z'
+modified: '2026-04-16T07:14:03Z'
+---
+
+The file boundary invariant is a load-bearing architectural constraint in koan governing file ownership across the system's actors. On 2026-02-10, Leon established this rule in the koan initial design (documented in `docs/architecture.md` as Invariant 1). The rule: LLM subagents write markdown files only; the koan driver (`koan/driver.py`) reads and writes JSON state files exclusively; tool code in `koan/web/mcp_endpoint.py` bridges both worlds by writing JSON state (for the driver) and templated markdown status files (for LLMs) in the same operation. Leon's stated rationale: if an LLM writes a JSON file, schema drift and parse errors in the payload become runtime failures in the deterministic driver, while markdown is forgiving. The invariant is enforced structurally -- planning-role subagents have write access scoped to the run directory (`~/.koan/runs/<id>/`) but no mechanism to produce JSON state files, and the driver reads JSON state files and exit codes only, never parsing markdown.
diff --git a/.koan/memory/0005-phase-trust-model-plan-review-as-designated.md b/.koan/memory/0005-phase-trust-model-plan-review-as-designated.md
@@ -0,0 +1,10 @@
+---
+title: Phase trust model -- plan-review as designated adversarial verifier
+type: decision
+created: '2026-04-16T07:35:13Z'
+modified: '2026-04-16T07:35:13Z'
+related:
+- 0001-persistent-orchestrator-over-per-phase-cli.md
+---
+
+The plan workflow's phase trust architecture in koan (`docs/phase-trust.md`, `koan/lib/workflows.py`) was designed around an asymmetric verification model. On 2026-02-10, Leon formalized this as part of the initial koan design: phases in the plan pipeline (intake, plan-spec, execute) were built to trust each other's outputs without re-verification; only plan-review was designated as the adversarial verifier. Leon documented the rationale in `docs/phase-trust.md`: cross-phase re-verification is the "intrinsic self-correction" anti-pattern -- research shows the same LLM re-checking its own prior work is more likely to change correct conclusions to incorrect ones than the reverse. Leon gave plan-review the CRITIC role: it uses the actual codebase as an external tool to check every file path, function name, signature, and type claim in `plan.md` against reality. Leon also decided that plan-review would be advisory only -- it reports findings with severity classification and may suggest looping back to plan-spec for critical or major issues, but it does not modify `plan.md` itself.
diff --git a/.koan/memory/0006-directory-as-contract-taskjson-over-cli-flags-for.md b/.koan/memory/0006-directory-as-contract-taskjson-over-cli-flags-for.md
@@ -0,0 +1,10 @@
+---
+title: Directory-as-contract -- task.json over CLI flags for subagent configuration
+type: decision
+created: '2026-04-16T07:35:24Z'
+modified: '2026-04-16T07:35:24Z'
+related:
+- 0004-file-boundary-invariant-llms-write-markdown.md
+---
+
+The subagent configuration mechanism in koan (`koan/subagent.py`, `docs/subagents.md`) was redesigned on 2026-02-10 when Leon replaced a 9-CLI-flag approach with a task.json file convention, later documented as Invariant 6 (Directory-as-contract) in `docs/architecture.md`. The previous design passed task configuration as 9 CLI arguments; Leon replaced it after identifying four problems: (1) the flat flag namespace caused naming collisions (`--koan-role` vs `--koan-scout-role`); (2) role-specific fields mixed with common fields without structure; (3) `--koan-retry-context` needed to carry multi-paragraph summaries exceeding practical CLI limits; (4) after a crash, reconstructing what a subagent had been asked required parsing process arguments from system logs. Leon adopted the convention that the driver would write `task.json` atomically (tmp + `os.rename()`) to the subagent directory before spawn. The subagent discovers its MCP endpoint by reading `mcp_url` from that file. No structured configuration flows through CLI flags, environment variables, or other process-level channels. Leon designated `task.json` as write-once by the parent before spawn and read-once by the parent at agent registration, never modified afterward.
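The atomic write described in entry 0006 (tmp file plus `os.rename()`) might look like the sketch below. The function name `write_task_json` and the `role` field are hypothetical; only `task.json` and `mcp_url` come from the entry, and the URL value is a placeholder. It assumes a POSIX filesystem, where renaming within one directory is atomic.

```python
import json
import os
import tempfile
from pathlib import Path


def write_task_json(subagent_dir: Path, task: dict) -> Path:
    """Write task.json so a reader sees either no file or the complete file."""
    subagent_dir.mkdir(parents=True, exist_ok=True)
    final_path = subagent_dir / "task.json"
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=subagent_dir, prefix="task.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(task, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before publishing
        os.rename(tmp_name, final_path)
    except BaseException:
        # Do not leave a stray temp file behind if anything fails.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # "role" is a hypothetical field; the URL is a placeholder value.
        path = write_task_json(
            Path(root) / "subagent-1",
            {"role": "scout", "mcp_url": "http://127.0.0.1:8000/mcp"},
        )
        print(json.loads(path.read_text(encoding="utf-8"))["mcp_url"])
```

Since the entry describes `task.json` as write-once, the destination should not already exist; where overwriting is possible on Windows, `os.replace()` is the portable choice.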
diff --git a/.koan/memory/0007-dual-fold-system-audit-fold-per-subagent-disk-vs.md b/.koan/memory/0007-dual-fold-system-audit-fold-per-subagent-disk-vs.md
@@ -0,0 +1,11 @@
+---
+title: Dual fold system -- audit fold (per-subagent disk) vs projection fold (workflow
+  SSE)
+type: decision
+created: '2026-04-16T07:35:36Z'
+modified: '2026-04-16T07:35:36Z'
+related:
+- 0003-server-authoritative-projection-via-json-patch.md
+---
+
+The state-management layer of koan (`koan/audit/fold.py`, `koan/projections.py`) was designed around two independent fold systems. On 2026-03-29, Leon documented the distinction in `docs/architecture.md` (section "Two Fold Systems"). Leon designed the audit fold to process per-subagent audit events from each subagent's `events.jsonl`, materialize a per-subagent `Projection` object written to `state.json` on disk after every event, and serve debugging and post-mortem consumers. Leon designed the projection fold to process workflow-level projection events emitted by `ProjectionStore.push_event()`, maintain a single in-memory `Projection` covering all agents and run state for the entire workflow, and serve the browser frontend via SSE. Leon chose to keep the two systems independent rather than merging them: the audit fold needed per-event disk writes for durability, while the projection fold needed to stay in-memory for SSE streaming throughput. Leon established the rule that state visible only in logs belongs to the audit fold, while state visible in the browser UI belongs to the projection fold.
diff --git a/.koan/memory/0008-three-tier-model-system-strongstandardcheap-over.md b/.koan/memory/0008-three-tier-model-system-strongstandardcheap-over.md
@@ -0,0 +1,10 @@
+---
+title: Three-tier model system (strong/standard/cheap) over per-role model configuration
+type: decision
+created: '2026-04-16T07:35:45Z'
+modified: '2026-04-16T07:35:45Z'
+related:
+- 0001-persistent-orchestrator-over-per-phase-cli.md
+---
+
+The model selection system in koan (`koan/config.py`, `docs/subagents.md` -- Model Tiers section) was designed on 2026-02-10 when Leon grouped the 6+ agent roles into three capability tiers rather than mapping each role to an individual model. Leon defined the tiers as: `strong` (orchestrator -- complex multi-step reasoning), `standard` (executor -- reliable tool use for code implementation), and `cheap` (scout -- narrow codebase investigation). Leon encoded the role-to-tier mapping in `koan/config.py`. Leon adopted a profile-based configuration system persisted to `~/.koan/config.json` that binds each tier to a specific runner type and model name; switching profiles changes all three tier bindings at once without touching role definitions. Leon rejected per-role model configuration because, with 6+ roles, each model change would require updating 6+ bindings; the tier system reduces that to 3 bindings per profile switch.
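The tier indirection in entry 0008 can be sketched as two lookups. The three role-to-tier pairs come from the entry; the profile names, runner names, and model names below are placeholders, and the data layout is an assumption rather than the actual structure of `koan/config.py` or `~/.koan/config.json`.

```python
# Role -> tier pairs as stated in the memory entry.
ROLE_TIER = {
    "orchestrator": "strong",
    "executor": "standard",
    "scout": "cheap",
}

# Hypothetical profiles: each binds a tier to a (runner, model) pair.
# Every runner and model name here is a placeholder.
PROFILES = {
    "default": {
        "strong": ("runner-a", "model-large"),
        "standard": ("runner-a", "model-medium"),
        "cheap": ("runner-a", "model-small"),
    },
    "alternate": {
        "strong": ("runner-b", "model-x"),
        "standard": ("runner-b", "model-y"),
        "cheap": ("runner-b", "model-z"),
    },
}


def resolve(role: str, profile: str) -> tuple[str, str]:
    """Resolve a role to its (runner, model) binding through its tier."""
    tier = ROLE_TIER[role]
    return PROFILES[profile][tier]


if __name__ == "__main__":
    for role in ROLE_TIER:
        print(role, resolve(role, "default"), resolve(role, "alternate"))
```

Adding a role touches only `ROLE_TIER`, and switching profiles touches only the three tier bindings, which is the reduction the entry describes.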
diff --git a/.koan/memory/0009-permission-fence-impractical-across-llm-backends.md b/.koan/memory/0009-permission-fence-impractical-across-llm-backends.md
@@ -0,0 +1,10 @@
+---
+title: Permission fence impractical across LLM backends; planned for removal
+type: lesson
+created: '2026-04-16T08:34:06Z'
+modified: '2026-04-16T08:34:06Z'
+related:
+- 0001-persistent-orchestrator-over-per-phase-cli.md
+---
+
+The permission fence in koan (`koan/lib/permissions.py`) was initially designed as a default-deny gate enforced on every MCP tool call. On 2026-02-10, Leon established it as Invariant 4 in `docs/architecture.md`, describing it as a load-bearing rule that blocked unknown roles and tools. By approximately 2026-04-08, Leon reversed this assessment, stating in a Claude Code project memory note that the fence is "probably not worth maintaining": many coding agents do not support accurately disabling tool features, so the gate cannot be enforced reliably across LLM backends, and the maintenance cost outweighs the benefit. Leon directed that no effort be invested in extending or hardening the permission fence and noted that it may be removed entirely in a future update. The fence still exists in the codebase as of 2026-04-16 but is deprioritized; the architecture documentation has not been updated to reflect this change of direction and still describes the fence as load-bearing.
diff --git a/.koan/memory/0010-curation-phase-3-step-layout-collapsed-to-2-to.md b/.koan/memory/0010-curation-phase-3-step-layout-collapsed-to-2-to.md
@@ -0,0 +1,10 @@
+---
+title: 'Curation phase: 3-step layout collapsed to 2 to prevent meaty-step skip failure'
+type: lesson
+created: '2026-04-16T08:34:15Z'
+modified: '2026-04-16T08:34:15Z'
+related:
+- 0002-step-first-workflow-pattern-boot-prompt-is.md
+---
+
+The curation phase module in koan (`koan/phases/curation.py`) was originally implemented as a 3-step workflow with step names "Survey", "Curate", and "Finalize/Reporting". During a curation run whose output Leon reviewed in screenshots, the orchestrator was observed to confuse "Survey" with intake-style exploration and then reach "phase complete" without ever calling `koan_memorize` -- a failure mode where the curation phase ended with zero memory writes. Leon identified two root causes: (1) the name "Survey" triggered intake-like behavior; (2) there was no per-step structural framing (no workflow_shape, goal, or tools list) visible at the moment the LLM decided whether to advance. On 2026-04-16, Leon approved a redesign that collapsed the 3 steps to 2 (Inventory and Memorize), named after their primary tool effects (`koan_memory_status` and `koan_memorize`) to make step-skipping visible, and added `<workflow_shape>`, `<goal>`, and `<tools_this_step>` XML blocks to every step, re-injected at each `koan_complete_step` call so the phase structure is visible at the moment of use rather than only at step 1.
diff --git a/.koan/memory/0011-intake-confidence-loop-removed-unnecessary-scout.md b/.koan/memory/0011-intake-confidence-loop-removed-unnecessary-scout.md
@@ -0,0 +1,15 @@
+---
+title: 'Intake confidence loop removed: unnecessary scout batches and intrinsic self-correction
+  risk'
+type: lesson
+created: '2026-04-16T08:34:26Z'
+modified: '2026-04-18T16:21:49Z'
+related:
+- 0002-step-first-workflow-pattern-boot-prompt-is.md
+- 0005-phase-trust-model-plan-review-as-designated.md
+- 0013-single-cognitive-goal-per-step-prevents-simulated.md
+---
+
+The intake phase in koan (`koan/phases/intake.py`) previously included a confidence-gated loop where steps 2-4 would repeat based on a structured confidence value. On 2026-04-12, Leon collapsed intake to a focused 2-step design (Gather + Deepen), removing the loop for three reasons: (a) it produced unnecessary second scout batches; (b) the Reflect step risked intrinsic self-correction -- the same LLM verifying its own prior reasoning rather than checking against actual codebase files; (c) a single thorough Deepen pass was sufficient when that step was well-scoped. Phase completion was redefined by depth of understanding, not iteration count.
+
+On 2026-04-17, Leon extracted a dedicated Summarize step from Deepen's conclusion, bringing intake to 3 steps total: Gather, Deepen, Summarize. The split applies the single-cognitive-goal-per-step principle (entry 0013): Deepen stays focused on dialogue and codebase verification; Summarize is a distinct step for synthesizing findings into a planning handoff. The confidence-loop removal rationale is unchanged -- the step count change only separates concerns that were already happening at the end of step 2. Note: `docs/intake-loop.md` still describes the older 2-step design as of 2026-04-18 and requires a separate update.
diff --git a/.koan/memory/0012-koan-is-dog-fooded-on-its-own-development-meta.md b/.koan/memory/0012-koan-is-dog-fooded-on-its-own-development-meta.md
@@ -0,0 +1,8 @@
+---
+title: Koan is dog-fooded on its own development -- meta-context for agents
+type: context
+created: '2026-04-16T08:34:35Z'
+modified: '2026-04-16T08:34:35Z'
+---
+
+Koan is a solo project maintained by Leon Mergen, as confirmed by Leon in a curation run on 2026-04-16. Since the initial koan design on 2026-02-10, Leon adopted a practice of using koan's own plan workflow to develop koan itself -- dog-fooding the system as its own first user. This creates a meta-context constraint for any agent working on the koan codebase: workflow instructions and phase prompts in `koan/phases/*.py` and `koan/lib/workflows.py` are runtime instructions for koan's orchestrator subagents to execute, not instructions for the agent currently editing the source files. For example, the `SYSTEM_PROMPT` strings in `koan/phases/intake.py` are the intake orchestrator's role instructions; `koan/phases/curation.py` contains the step guidance that koan's curation orchestrator follows. An agent must not conflate "a prompt being analyzed as source material" with "a prompt being given as a direct instruction." Leon named this the "meta use of koan" and stated it explicitly in the task prompt for the 2026-04-16 curation run.
diff --git a/.koan/memory/0013-single-cognitive-goal-per-step-prevents-simulated.md b/.koan/memory/0013-single-cognitive-goal-per-step-prevents-simulated.md
@@ -0,0 +1,11 @@
+---
+title: Single cognitive goal per step -- prevents simulated refinement
+type: decision
+created: '2026-04-16T08:37:25Z'
+modified: '2026-04-16T08:37:25Z'
+related:
+- 0002-step-first-workflow-pattern-boot-prompt-is.md
+- 0010-curation-phase-3-step-layout-collapsed-to-2-to.md
+---
+
+The step design constraint for koan phases (`docs/architecture.md` -- Pitfalls section, "Don't give a step multiple cognitive goals") was established on 2026-02-10 when Leon set a rule: each `koan_complete_step` call must correspond to exactly one cognitive goal. Leon identified the failure mode that motivated this rule: when a single step combines multiple goals ("do A, then B, then C"), the LLM can engage in "simulated refinement" -- artificially downgrading its output for A in order to manufacture visible improvement in C, without genuinely improving anything. Leon documented this as a design constraint: when adding a new phase, each step must answer "what is the single thing this step accomplishes?" and if the answer requires "and then," the step must be split. Leon's reference designs in `koan/phases/plan_spec.py` (Analyze + Write), `koan/phases/intake.py` (Gather + Deepen), and `koan/phases/curation.py` (Inventory + Memorize) each place cognitively distinct operations into separate `koan_complete_step` calls.
diff --git a/.koan/memory/0014-camelcase-wire-format-eliminates-renaming-layer.md b/.koan/memory/0014-camelcase-wire-format-eliminates-renaming-layer.md
@@ -0,0 +1,12 @@
+---
+title: 'CamelCase wire format: eliminates renaming layer between projection and Zustand
+  store'
+type: decision
+created: '2026-04-16T08:37:35Z'
+modified: '2026-04-16T08:37:35Z'
+related:
+- 0003-server-authoritative-projection-via-json-patch.md
+- 0007-dual-fold-system-audit-fold-per-subagent-disk-vs.md
+---
+
+The SSE wire format for koan's projection system (`koan/projections.py`, `frontend/src/sse/connect.ts`) was designed to use camelCase keys for all serialized projection fields. On 2026-03-29, Leon documented this decision in `docs/projections.md` (Design Decisions -- "Why camelCase on the wire"). Leon's rationale: emitting snake_case from the server would require a `mapProjectionToStore()` renaming function in the frontend TypeScript plus a `projectionState` shadow object for patch application (patches must apply to the pre-renamed dict, not the renamed Zustand store); every new projection field would require a rename entry in that mapping. Leon identified this mapping layer as frontend business logic, contradicting his "frontend has zero business logic" principle. By adopting camelCase -- via Pydantic's `alias_generator=to_camel` in `KoanBaseModel` (`koan/projections.py`) -- patches produced by `jsonpatch.make_patch()` apply directly to the Zustand store in `frontend/src/store/`, and snapshot state spreads directly into the store at reconnect with no field renaming.
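The effect of the camelCase decision in entry 0014 can be shown with a dependency-free stand-in. The entry says the real conversion uses Pydantic's `alias_generator=to_camel` in `KoanBaseModel`; the `to_camel` and `camelize` helpers below are hypothetical reimplementations for illustration, the field names are invented, and the sketch assumes plain snake_case keys with no leading underscores.

```python
def to_camel(name: str) -> str:
    """Convert a plain snake_case name to camelCase, e.g. run_state -> runState."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camelize(value):
    """Recursively rename dict keys so the payload matches the store's field names."""
    if isinstance(value, dict):
        return {to_camel(key): camelize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [camelize(item) for item in value]
    return value


if __name__ == "__main__":
    # Invented field names, used only to show the shape of the conversion.
    projection = {
        "run_state": {"active_phase": "intake"},
        "agents": [{"agent_id": "a1", "step_count": 2}],
    }
    print(camelize(projection))
```

One caveat of this naive version: it renames every dict key, including keys that are data (such as IDs used as map keys), which a model-level alias generator avoids by renaming declared fields only.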