Merged
412 commits
ea81628
T1 Scaffold (koan/, pyproject.toml)
solatis Mar 26, 2026
2fac79b
T2 Runners (12 files)
solatis Mar 26, 2026
da02414
T3 MCP Fence (8 files)
solatis Mar 26, 2026
5d3f46f
T4 Phases (25 files)
solatis Mar 26, 2026
b508329
T3 Fixup (5 files)
solatis Mar 26, 2026
8cae06b
T5 Driver FSM (10 files)
solatis Mar 26, 2026
29338c8
T6 Subagent Audit (22 files)
solatis Mar 26, 2026
ff60bb7
T7 Blocking Flow (13 files)
solatis Mar 26, 2026
19700b4
T8 Web UI (12 files)
solatis Mar 26, 2026
919621d
T6-T8 Fixups (14 files)
solatis Mar 26, 2026
0d414da
T9 TS Deletion (113 files)
solatis Mar 26, 2026
ced9a96
Validation Fixups (driver.py, subagent.py, app.py)
solatis Mar 27, 2026
6068741
Cosmetic Cleanup (driver.py, pyproject.toml)
solatis Mar 27, 2026
e2ee51b
T10+T11 Types & Probe (35 files)
solatis Mar 27, 2026
3dcef68
T12 Runner Interface (13 files)
solatis Mar 27, 2026
e21318c
T13 Config & Registry (12 files)
solatis Mar 27, 2026
108e2f8
T14 Web API Profiles Agents (6 files)
solatis Mar 27, 2026
60261b4
T15 Settings Overlay UI (10 files)
solatis Mar 27, 2026
7c7d7e2
Fixup Probe Refresh & Strict Install (9 files)
solatis Mar 27, 2026
4a95797
chore: update .gitignore for frontend build output and pycache
solatis Mar 28, 2026
c2c6cff
chore: remove tracked __pycache__ files
solatis Mar 28, 2026
41e6877
fix: MCP path routing — path="/" for Mount prefix strip, trailing sla…
solatis Mar 28, 2026
1b65969
fix: step-0 routing bug in brief_writer and workflow_orchestrator
solatis Mar 28, 2026
0e0195b
fix: naive utcnow() caused ~420min timer offset; add is_primary to Ag…
solatis Mar 28, 2026
0d92cb5
fix: codex stream parsing — handle item.completed events for tool cal…
solatis Mar 28, 2026
f79ace9
fix: add --verbose flag to claude runner for diagnostic output
solatis Mar 28, 2026
8e30205
fix: subagent — is_primary flag, dedupe thinking label, cancelled int…
solatis Mar 28, 2026
5dc26ac
feat: server-side MCP tool call logging in activity feed, normalise w…
solatis Mar 28, 2026
cd837c9
feat: runner registry strict install validation, probe refresh, open_…
solatis Mar 28, 2026
230ac8c
feat: React + Zustand + Vite frontend SPA replacing Jinja2 + vanilla JS
solatis Mar 28, 2026
f4a605a
feat: backend emits JSON-only SSE; delete Jinja2 templates, koan.js, …
solatis Mar 28, 2026
c02810f
docs: add frontend.md spoke doc, update architecture/ipc/token-stream…
solatis Mar 28, 2026
59ee8b4
add projection engine and event payload builders
solatis Mar 30, 2026
088cd30
add KOAN_MCP_TOOLS constant and per-runner tool name normalization
solatis Mar 30, 2026
1613021
replace push_sse with versioned projection events across backend
solatis Mar 30, 2026
6451c1a
rewrite frontend store and SSE bridge for versioned event protocol
solatis Mar 30, 2026
92ad4f9
update tests for projection system
solatis Mar 30, 2026
2da1938
update documentation for event-sourced projection system
solatis Mar 30, 2026
fa1be6f
auto-rebuild frontend on startup when sources are newer than build
solatis Mar 30, 2026
b166653
fix codex MCP tool approval: pass --dangerously-bypass-approvals-and-…
solatis Mar 30, 2026
c5ea9d2
fix activity feed rendering: wrap stream text, filter thinking noise,…
solatis Mar 30, 2026
e24e0c3
prefer claude/opus over codex/gpt-5 for strong tier in balanced profile
solatis Mar 30, 2026
f46bb7b
enable source maps and skip minification in vite build
solatis Mar 30, 2026
f4b66b4
use random free port when --port is not specified
solatis Mar 31, 2026
0af9a2b
Add -p/--prompt CLI option to pre-fill task description
solatis Mar 31, 2026
8ce1dd5
add resolve_installation with binary validation and PATH fallback
solatis Mar 31, 2026
2aa9ea3
refresh default installation paths from probe and remove redundant bi…
solatis Mar 31, 2026
0b15a82
add resolve_installation tests and update binary-not-found test expec…
solatis Mar 31, 2026
da270bd
make resolve_installation fail-fast instead of silently falling back
solatis Mar 31, 2026
476f9e8
add preflight endpoint and installation validation to start-run
solatis Mar 31, 2026
93ea2c0
add installation selector to landing page driven by profile selection
solatis Mar 31, 2026
fc7dba0
add granular config event types to projection system and event builders
solatis Mar 31, 2026
605eebe
emit config events from all settings endpoints and lifespan
solatis Mar 31, 2026
7121428
add config event fold cases and snapshot extraction to frontend store
solatis Mar 31, 2026
68a159c
rewrite settings and landing page to read from projection store inste…
solatis Mar 31, 2026
0c28d8d
remove dead API functions no longer used by components
solatis Mar 31, 2026
0591a9c
use consistent row layout for installations matching profile design
solatis Mar 31, 2026
37e5a61
tabbed runner-type layout for agent installations in settings
solatis Mar 31, 2026
2bb9b24
fix claude thinking: use --effort flag instead of --thinking-budget-t…
solatis Mar 31, 2026
00dfdb6
haiku supports low/medium/high effort (not max)
solatis Mar 31, 2026
3706693
shutdown: kill active agents immediately, don't wait for HTTP clients
solatis Mar 31, 2026
c1f96d1
remove global activeInstallations; model installation selection as pe…
solatis Mar 31, 2026
e966c54
rich activity feed: thinking cards, step headers, tool detail
solatis Mar 31, 2026
da7e582
fix: claude stream parsing — content nested under message envelope
solatis Mar 31, 2026
5ea4419
fix: interleave text output chronologically, filter scout activity fr…
solatis Mar 31, 2026
0c7da8b
add --yolo flag: skip all agent permission prompts
solatis Mar 31, 2026
bee4268
agent monitor: numeric counter bar + grouped sections by status
solatis Mar 31, 2026
5cd7454
fix: close in-flight tool when thinking/text starts
solatis Mar 31, 2026
9eac474
typed tool events: read/write/edit/bash/grep/ls with metadata
solatis Mar 31, 2026
9fe96ad
fix: align tool detail text — fixed-width tool name column
solatis Mar 31, 2026
4bde5aa
agent monitor: scout names, queue tracking, last tool display
solatis Mar 31, 2026
72f5491
fix: track scout lastTool for all typed tool events
solatis Mar 31, 2026
12a8415
fix: missing thinking after scouts, empty question options, monitor c…
solatis Mar 31, 2026
19a1cfa
fix: agent name truncates instead of wrapping to multiple lines
solatis Mar 31, 2026
7bca7fc
fix: snapshot reconstruction — filter scouts, merge thinking, skip bo…
solatis Mar 31, 2026
7e15745
plan: symmetric projection folds
solatis Mar 31, 2026
650b267
plan: symmetric folds — add motivation, rationale, migration, doc upd…
solatis Mar 31, 2026
e6c22bd
fix: agent-row-name fixed width 200px (was min/max range)
solatis Mar 31, 2026
d0266fb
plan: split ConversationEntry into discriminated union of 10 types
solatis Apr 1, 2026
e7c0b7e
plan: BaseToolEntry base class for shared call_id + in_flight
solatis Apr 1, 2026
877e818
plan: server-authoritative state with JSON Patch
solatis Apr 1, 2026
346f4ec
plan: technical writer review — fix event count, add missing details
solatis Apr 1, 2026
e292d94
refactor: extract normalizeAskQuestions helper, update plan per review
solatis Apr 1, 2026
25eb35c
plan: resolve review feedback — 7 clarifications
solatis Apr 1, 2026
e2eb930
plan: server emits camelCase via Pydantic alias_generator
solatis Apr 1, 2026
04920b3
plan: uniform JSON Patch, drop delta bypass, rename pending fields
solatis Apr 1, 2026
6cb77d5
plan: complete rewrite — unified agents dict, focus, settings/run split
solatis Apr 1, 2026
9f36ce1
plan: technical writer pass — code comments, clarity, consistency
solatis Apr 1, 2026
11e57a8
plan: comprehensive quality pass — code sketches, type definitions, n…
solatis Apr 1, 2026
b600ebe
add jsonpatch and fast-json-patch dependencies
solatis Apr 1, 2026
543471b
rewrite projection model with KoanBaseModel, per-agent conversations,…
solatis Apr 1, 2026
73067c0
rewrite SSE stream, emit run_started, fix startup event ordering
solatis Apr 1, 2026
73c0b75
rewrite projection tests for new model (92 fold tests, patch paths, s…
solatis Apr 1, 2026
59d7a8a
rewrite frontend store and SSE as dumb JSON Patch renderer
solatis Apr 1, 2026
79b794c
update all components for new store shape
solatis Apr 1, 2026
0cf2541
update docs for JSON Patch architecture
solatis Apr 1, 2026
9137f48
constrain intake scouts to 3-5 with broad multi-part prompts
solatis Apr 1, 2026
811ee38
invert activity feed visual hierarchy: muted thinking, elevated text
solatis Apr 1, 2026
b865859
fix scout queued-to-running transition when IDs differ
solatis Apr 1, 2026
345470d
compress scout report format for signal density
solatis Apr 1, 2026
8774d48
show connecting state before first SSE snapshot arrives
solatis Apr 1, 2026
1430476
make feed container the white surface, not individual entries
solatis Apr 1, 2026
fdfe71b
render all LLM content as markdown via react-markdown
solatis Apr 2, 2026
d465d97
rewrite auto-scroll with ResizeObserver for reliable sticky-scroll
solatis Apr 2, 2026
b68ed56
improve layout spacing: wider app, narrower sidebars, more breathing …
solatis Apr 2, 2026
07df4ba
hide agent monitor entirely when no agents are running or queued
solatis Apr 2, 2026
354cded
chore: remove stale __pycache__ bytecode files
solatis Apr 2, 2026
869f593
feat: add assistant_text stream event type to runners
solatis Apr 2, 2026
b4f42e8
feat: add read_only parameter to runner build_command
solatis Apr 2, 2026
af0897e
refactor: scout returns findings via SubagentResult instead of file
solatis Apr 2, 2026
de1a58f
refactor: remove read_only runner parameter
solatis Apr 2, 2026
485fe58
chore: add CLAUDE.md config pointing to AGENTS.md
solatis Apr 2, 2026
a9b70ed
docs: update intake phase — rename steps, simplify confidence gate
solatis Apr 2, 2026
285f116
docs: update subagent roles, permissions, and configuration
solatis Apr 2, 2026
47ee53d
docs: rewrite IPC documentation for interaction queue model
solatis Apr 2, 2026
4d6444d
docs: update frontend and artifact-review for projection model
solatis Apr 2, 2026
c3e9050
feat: show tool call count instead of token counts in agent monitor
solatis Apr 2, 2026
fb5cd91
fix: guard normalizeOptions against undefined question.options
solatis Apr 2, 2026
b06d4b0
refactor: simplify intake phase from 5 steps to 3
solatis Apr 2, 2026
11996c1
feat: --debug flag to show step guidance prompts in UI
solatis Apr 2, 2026
d016549
fix: scope scouts to the project directory
solatis Apr 2, 2026
83de05e
refactor: inline task description into intake prompt, remove conversa…
solatis Apr 2, 2026
b04fac2
fix: deliver SYSTEM_PROMPT to subagent processes
solatis Apr 2, 2026
db1dac2
feat: support free-form text input in question UI
solatis Apr 2, 2026
b222e7e
fix: always show 'Other' text input on every question
solatis Apr 2, 2026
52f07fd
fix: spawn all agents with cwd=project_dir
solatis Apr 2, 2026
f780c83
fix: show 'Starting agent…' indicator while activity feed is empty
solatis Apr 2, 2026
53053c6
fix: prevent LLM from sending letter-prefixed and 'Other' options
solatis Apr 2, 2026
7eca0cc
chore: set log level to debug when --debug flag is provided
solatis Apr 3, 2026
a4f5b3b
fix: freeze agent timer on completion
solatis Apr 3, 2026
a1f6f23
feat: replace workflow orchestrator with persistent orchestrator
solatis Apr 3, 2026
036abfb
docs: update architecture docs for persistent orchestrator
solatis Apr 3, 2026
9f790e4
fix: restyle chat input as bordered field inside the activity feed card
solatis Apr 3, 2026
6f9e7fa
refactor: rename intake step 2 to Deepen, rewrite guidance for iterat…
solatis Apr 3, 2026
ab12920
refactor: remove client-side option sanitization from AskWizard
solatis Apr 3, 2026
1cc24a6
style: redesign question option cards with left-border accent pattern
solatis Apr 3, 2026
f2414be
refactor: rename chat placeholder from 'Message the orchestrator' to …
solatis Apr 3, 2026
ca76fdf
feat: add steering queue infrastructure
solatis Apr 3, 2026
c2a0d1b
feat: wire steering queue into tool handlers and message routing
solatis Apr 3, 2026
b965c2c
feat: add steering indicator component above chat input
solatis Apr 3, 2026
d9e1716
rename epic → run/workflow throughout codebase
solatis Apr 4, 2026
fd0f3b6
refactor: remove temporal contamination from code comments
solatis Apr 4, 2026
e696c86
refactor: fix stale epic references in code and tests
solatis Apr 4, 2026
bb16d3b
refactor: add SCOPE field to PhaseModule protocol and reorganize regi…
solatis Apr 4, 2026
e3006d8
refactor: update legacy phase prompts from epic to run terminology
solatis Apr 4, 2026
75737ea
docs: update documentation for workflow system
solatis Apr 4, 2026
1e8fbef
feat: redesign launch page with stacked-card layout
solatis Apr 4, 2026
4ca4d09
fix: scan artifacts at phase boundaries and step transitions
solatis Apr 4, 2026
1005298
fix: emit phase_boundary_reached event for user visibility
solatis Apr 4, 2026
b6d478c
fix: strengthen plan-mode question guidance to prevent skipping
solatis Apr 4, 2026
bfd4c26
refactor: remove landscape.md identity from intake SYSTEM_PROMPT
solatis Apr 4, 2026
0776f66
refactor: clean intake step 2 — remove landscape.md refs, strengthen …
solatis Apr 4, 2026
fb5abd2
refactor: rewrite intake step 3 from 'write landscape.md' to 'summarize'
solatis Apr 4, 2026
06ab65f
refactor: clean workflow phase_guidance — remove landscape.md, hedgin…
solatis Apr 4, 2026
b87fd7f
refactor: remove landscape.md from plan-spec prompts
solatis Apr 4, 2026
161fbdf
refactor: remove landscape.md from plan-review prompts
solatis Apr 4, 2026
0e2f7f5
refactor: remove stale tool-list claim from orchestrator system prompt
solatis Apr 4, 2026
258068b
test: update phase tests for landscape.md removal
solatis Apr 4, 2026
91387d8
refactor: update koan_request_executor docstring example
solatis Apr 4, 2026
5095d18
refactor: describe split-panel layout in koan_ask_question tool and i…
solatis Apr 4, 2026
56c0983
feat: redesign question cards as split reference panels
solatis Apr 4, 2026
f51a041
fix: restore landing page card styling broken by question card redesign
solatis Apr 4, 2026
8a4ec4e
fix: use stone surface background for question context panel
solatis Apr 4, 2026
02f7394
fix: improve context panel readability — larger font, visible code, s…
solatis Apr 4, 2026
b61caee
feat: white-on-white context panel with copper left rule
solatis Apr 4, 2026
e7f1239
fix: quiet inline code in context panel — mono font only, no bordered…
solatis Apr 4, 2026
8e6636a
fix: textarea padding was zero due to undefined --space-3 variable
solatis Apr 4, 2026
d937ffd
fix: enforce minimum scout dispatches in intake gather
solatis Apr 5, 2026
0364659
docs: replace design system with new visual language
solatis Apr 5, 2026
582b60f
refactor: replace CSS tokens and migrate to new design system
solatis Apr 5, 2026
ede5077
feat: add atom components for new design system
solatis Apr 5, 2026
d2bc3e2
feat: add molecule components for new design system
solatis Apr 5, 2026
c8cc929
docs: add derived tokens and fix hardcoded values in design system
solatis Apr 6, 2026
cbdabd9
docs: rewrite layout section for centered container approach
solatis Apr 6, 2026
b139b61
feat: add organism components for new design system
solatis Apr 6, 2026
a2ec397
feat: add remaining content stream molecules
solatis Apr 6, 2026
e0438cb
feat: extend ElicitationPanel and NewRunForm to replace legacy compon…
solatis Apr 6, 2026
646d8d3
refactor: extract normalizeOptions utility from AskWizard
solatis Apr 6, 2026
0d46d53
refactor: rewrite view layer with new organisms, delete legacy compon…
solatis Apr 6, 2026
020a2b9
fix: align header and scout bar content with centered container
solatis Apr 6, 2026
3589aad
docs: sync design system with implemented components and content stre…
solatis Apr 6, 2026
8535a39
fix: handle empty phase string during run initialization
solatis Apr 6, 2026
4e1cd86
fix: show last tool call and completion status in scout rows
solatis Apr 6, 2026
76b6cb3
fix: hide scout bar when all scouts have finished
solatis Apr 6, 2026
a71d714
refactor: improve scout bar column distribution and summary readability
solatis Apr 6, 2026
4827e50
fix: prevent code blocks from expanding elicitation grid columns
solatis Apr 6, 2026
0c7fdc0
refactor: rename phase_complete_future to yield_future and add workfl…
solatis Apr 6, 2026
908c7f8
feat: add yield projection types, events, and permissions infrastructure
solatis Apr 6, 2026
da469fb
feat: replace phase boundary with koan_yield tool and add done tombstone
solatis Apr 6, 2026
73c004e
feat: add YieldCard molecule and wire frontend yield support
solatis Apr 6, 2026
11f7e95
docs: add frontend component development guidelines
solatis Apr 6, 2026
8232b7f
docs: update frontend.md for atomic design system and new component h…
solatis Apr 6, 2026
8d78683
docs: document koan_yield conversation primitive and done tombstone
solatis Apr 6, 2026
3f46397
feat: support multiple built-in runner profiles
solatis Apr 9, 2026
6e88f5c
fix: relax intake scout dispatch mandate
solatis Apr 9, 2026
11f45f4
feat: redesign settings and yield command UX
solatis Apr 9, 2026
13ecabb
refactor: restyle recommended elicitation options
solatis Apr 9, 2026
ce18fea
docs: update design system for new frontend patterns
solatis Apr 9, 2026
24ef627
fix: wire yield command palette to projection state
solatis Apr 9, 2026
03f345b
fix: honor yield suggestion command text
solatis Apr 12, 2026
2763476
feat: add selectable artifacts sidebar state
solatis Apr 12, 2026
8808e47
feat: define koan_yield review feedback contract
solatis Apr 12, 2026
879e1bc
feat: add inline artifact review panel workflow
solatis Apr 12, 2026
e3eb1a4
docs: specify review panel and review event design
solatis Apr 12, 2026
9ceb1de
feat: add sessions list/delete API endpoints
solatis Apr 12, 2026
e460a08
feat: add sessions page backed by sessions API
solatis Apr 12, 2026
df9c276
feat: collapse intake phase into two-step flow
solatis Apr 12, 2026
2c753fe
feat: enforce phase trust boundaries in plan workflow
solatis Apr 12, 2026
4f98927
docs: refresh architecture links for intake and phase trust
solatis Apr 12, 2026
7642e7a
fix: support approval and summary-only review responses
solatis Apr 12, 2026
d698443
fix: display yield suggestion command text
solatis Apr 12, 2026
be974da
style: switch guidance text to ascii dash separators
solatis Apr 12, 2026
27de3f7
chore: add pyyaml dependency for memory frontmatter
solatis Apr 12, 2026
a24c697
feat: add file-based memory parsing and storage APIs
solatis Apr 12, 2026
d64a0a2
test: add coverage for memory parser, writer, store, and validation
solatis Apr 12, 2026
89b9442
plan: add memory system specification v3
solatis Apr 12, 2026
f2d7c99
feat: redesign memory storage and summary generation
solatis Apr 14, 2026
048d1ec
feat: add MCP tools for memory curation
solatis Apr 14, 2026
41178de
feat: add standalone curation workflow and phase
solatis Apr 14, 2026
30deec7
docs: update memory system specification to v4
solatis Apr 14, 2026
4758240
fix: improve memory entry slug and frontmatter rendering
solatis Apr 16, 2026
f47ff2b
refactor: bind workflows directly to phase modules
solatis Apr 16, 2026
d0e9b3a
feat: strengthen curation guidance with draft-quality loop
solatis Apr 16, 2026
8950ca3
feat: stream Claude tool-use input deltas
solatis Apr 16, 2026
7a4ca7f
feat: surface tool lifecycle events in activity stream
solatis Apr 16, 2026
a1d5212
chore: update default memory model alias
solatis Apr 16, 2026
5d5b209
chore: add structured logging to memory operations
solatis Apr 16, 2026
b6fbf9e
fix: surface memory summary regeneration failures
solatis Apr 16, 2026
a329c8e
fix: tighten curation memory-entry guidance
solatis Apr 16, 2026
7fd7c51
refactor: separate agent prompts from phase role context
solatis Apr 17, 2026
c5de1e1
feat: add memory CLI and semantic retrieval search
solatis Apr 17, 2026
8cd7552
fix: hide orchestrator model label when model is unknown
solatis Apr 17, 2026
51a962e
docs: add curated koan memory entries
solatis Apr 17, 2026
71b0b15
fix: coerce read range args to ints in Claude summaries
solatis Apr 17, 2026
b973f6c
fix: recover subagent stream after parse errors
solatis Apr 17, 2026
822bb06
feat: add per-phase memory injection at phase handshakes
solatis Apr 18, 2026
d27ecb9
fix: surface subagent failure details in phase logs
solatis Apr 18, 2026
fd2a79b
fix: use opus[1m] alias in built-in Claude profiles
solatis Apr 18, 2026
0e4bd71
test: remove brittle prompt-structure test suite
solatis Apr 18, 2026
1fb7bf3
feat: add inspect-ai eval harness for full koan runs
solatis Apr 18, 2026
79e0834
fix: allow all roles to call read-only memory tools
solatis Apr 19, 2026
3da8d08
docs: update memory notes for permission and retrieval behavior
solatis Apr 19, 2026
8f0542f
docs: document intake step-3 summarize flow
solatis Apr 19, 2026
4f2e7cc
feat: aggregate exploration tool calls with parsed metrics
solatis Apr 19, 2026
5017ecf
feat: render aggregated exploration tool cards in UI
solatis Apr 19, 2026
ffd2038
docs: specify tool aggregation UI in design system
solatis Apr 19, 2026
61b63aa
chore: align eval fixtures with LFS snapshot archives
solatis Apr 19, 2026
67471d3
fix: restore thinking display for opus claude models
solatis Apr 19, 2026
f1d6de7
fix: compose claude permission and dir flags at spawn time
solatis Apr 19, 2026
4e4415e
docs: rewrite README around workflows and memory
solatis Apr 19, 2026
12 changes: 12 additions & 0 deletions .config/wt.toml
@@ -0,0 +1,12 @@
# Koan project worktree hooks
# Docs: https://worktrunk.dev/hook/

[post-create]
deps = "uv sync --dev"

[post-start]
copy = "wt step copy-ignored"

[pre-merge]
check = "uv run ruff check ."
test = "uv run pytest"
28 changes: 28 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,28 @@
name: CI

on:
  push:
    branches: ["main"]
  pull_request:
  workflow_dispatch:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Install dependencies
        run: uv sync --dev

      - name: Run tests
        run: uv run pytest
14 changes: 12 additions & 2 deletions .gitignore
@@ -1,4 +1,14 @@
node_modules/
dist/
.pi/
.DS_Store

.claude/
plans/
.env
.env.*
*.log

# Frontend build output (committed source lives in frontend/src/)
koan/web/static/app/
frontend/node_modules/
frontend/dist/
__pycache__/
2 changes: 2 additions & 0 deletions .koan/memory/.gitignore
@@ -0,0 +1,2 @@
.index/
summary.md
@@ -0,0 +1,8 @@
---
title: Persistent orchestrator over per-phase CLI spawning
type: decision
created: '2026-04-16T07:13:41Z'
modified: '2026-04-16T07:13:41Z'
---

This entry documents the orchestrator spawn architecture decision in koan's workflow engine (`koan/driver.py`). On 2026-04-02, Leon redesigned the system to replace per-phase CLI process spawning with a single long-lived orchestrator process running the entire workflow in one continuous session. Previously, each planning phase spawned a fresh `claude`, `codex`, or `gemini` CLI process; a separate `workflow-orchestrator` subagent was then spawned to present the user with a phase-selection decision after each phase completed. Leon's rationale: per-phase spawning caused compounding context loss (each new process re-derived what the previous had explored), and the workflow-orchestrator role was architecturally wasteful -- "a process-boot just to ask a question." Two alternatives were explicitly rejected: (1) API-based conversation (driver calling the LLM API directly) -- would have bypassed the runner abstraction handling model selection, MCP config, output streaming, and thinking mode; (2) context injection into fresh processes per phase -- cheaper but fails to provide a persistent reasoning chain and does not eliminate the workflow-orchestrator overhead. The redesign landed in `koan/driver.py` as a single `spawn_subagent()` call awaiting the orchestrator's exit, and added `koan_set_phase` as the new phase-transition tool replacing the two-tool `koan_propose_workflow` / `koan_set_next_phase` dance.
@@ -0,0 +1,8 @@
---
title: Step-first workflow pattern -- boot prompt is exactly one sentence
type: decision
created: '2026-04-16T07:13:50Z'
modified: '2026-04-16T07:13:50Z'
---

The step-first workflow pattern governs how all LLM subagent CLI processes in koan receive task instructions. On 2026-02-10, Leon established this as a load-bearing architectural invariant in the koan initial design (documented in `docs/architecture.md` as Invariant 2 and enforced in `koan/web/mcp_endpoint.py`). The rule: every subagent's boot prompt is exactly one sentence -- role identity plus "Call koan_complete_step to receive your instructions." Task details, phase guidance, and tool lists arrive exclusively as the return value of the first `koan_complete_step` MCP call. The pattern was motivated by a failure mode observed with haiku-class (weaker) models: complex task instructions in the boot prompt caused these models to produce text output on the first turn and exit without ever entering the tool-calling loop. Three reinforcement mechanisms make the pattern robust across model capability levels: primacy (boot prompt is the LLM's very first message), recency (`format_step()` in `koan/phases/format_step.py` always appends "WHEN DONE: Call koan_complete_step..." last), and muscle memory (by step 2 the LLM has called the tool multiple times, locking in the pattern).
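The step-first pattern above can be sketched roughly as follows. This is a minimal illustration, not the code in `koan/phases/format_step.py`: the helper `boot_prompt`, the simplified `format_step` signature, and the exact wording of the `WHEN_DONE` reminder are assumptions made for the example.

```python
# Illustrative sketch only: names, signatures, and prompt wording are assumptions.

# Assumed wording; the entry only quotes the start of the real reminder.
WHEN_DONE = "WHEN DONE: Call koan_complete_step to receive your next instructions."


def boot_prompt(role: str) -> str:
    """Primacy: the boot prompt is one sentence, role identity plus the tool call."""
    return f"You are the {role}; call koan_complete_step to receive your instructions."


def format_step(title: str, instructions: list[str]) -> str:
    """Recency: whatever the step contains, the reminder is always the last line."""
    lines = [f"# {title}", ""]
    lines.extend(instructions)
    lines.extend(["", WHEN_DONE])
    return "\n".join(lines)


if __name__ == "__main__":
    print(boot_prompt("scout"))
    print(format_step("Gather", ["Read the task description.", "List open questions."]))
```

Task details never appear in `boot_prompt`; in this sketch they would only ever be returned from the tool handler as the output of `format_step`.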
@@ -0,0 +1,8 @@
---
title: Server-authoritative projection via JSON Patch over symmetric dual fold
type: decision
created: '2026-04-16T07:13:57Z'
modified: '2026-04-16T07:13:57Z'
---

The koan projection system maintains frontend-visible workflow state for the browser dashboard, served via Server-Sent Events from `koan/projections.py`. On 2026-03-29, Leon decided to replace a dual fold architecture with a server-authoritative JSON Patch model. The prior design maintained two independent fold implementations -- one in Python (`koan/projections.py`) and one in TypeScript (`frontend/src/sse/connect.ts`) -- required to produce identical projections from the same event sequence. Two production bugs traced directly to these folds diverging: fragmented thinking cards in the activity feed, and scout events appearing incorrectly in the primary agent's conversation feed. Leon's decision: Python computes the fold and the RFC 6902 JSON Patch diff after each event; the browser applies patches mechanically via `fast-json-patch` with no fold logic, no event interpretation, and no business rules. Simultaneously, Leon adopted camelCase for all wire-format keys so patches apply directly to the Zustand store without a field-renaming layer. The correctness guarantee is now structural: one fold in one place.
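To make the "one fold, mechanical patches" idea concrete, here is a small stdlib-only sketch. It is not koan's implementation: the commit list indicates the project depends on `jsonpatch` (Python) and `fast-json-patch` (browser), whereas `diff_state` and `apply_ops` below are hypothetical helpers that handle only nested dicts with string keys and treat lists and scalars as atomic values.

```python
import copy


def _escape(key: str) -> str:
    """Escape a key for use in a JSON Pointer (RFC 6901)."""
    return key.replace("~", "~0").replace("/", "~1")


def _unescape(token: str) -> str:
    """Reverse of _escape; '~1' must be decoded before '~0'."""
    return token.replace("~1", "/").replace("~0", "~")


def diff_state(old: dict, new: dict, path: str = "") -> list[dict]:
    """Server side: compute RFC 6902-style operations that turn `old` into `new`."""
    ops: list[dict] = []
    for key in old:
        if key not in new:
            ops.append({"op": "remove", "path": f"{path}/{_escape(key)}"})
    for key, value in new.items():
        child = f"{path}/{_escape(key)}"
        if key not in old:
            ops.append({"op": "add", "path": child, "value": value})
        elif isinstance(old[key], dict) and isinstance(value, dict):
            ops.extend(diff_state(old[key], value, child))
        elif old[key] != value:
            ops.append({"op": "replace", "path": child, "value": value})
    return ops


def apply_ops(doc: dict, ops: list[dict]) -> dict:
    """Client side: apply operations mechanically, with no knowledge of events."""
    result = copy.deepcopy(doc)
    for op in ops:
        *parents, leaf = [_unescape(t) for t in op["path"].split("/")[1:]]
        target = result
        for part in parents:
            target = target[part]
        if op["op"] == "remove":
            del target[leaf]
        else:  # "add" and "replace" both assign for dict members
            target[leaf] = copy.deepcopy(op["value"])
    return result


if __name__ == "__main__":
    before = {"run": {"phase": "intake"}, "agents": {}}
    after = {"run": {"phase": "plan-spec"}, "agents": {"a1": {"role": "scout"}}}
    patch = diff_state(before, after)
    print(patch)
    print(apply_ops(before, patch) == after)
```

The browser-side half (`apply_ops` here) contains no business rules, which is the property the entry describes: only the Python fold decides what the state should be.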
@@ -0,0 +1,8 @@
---
title: File boundary invariant -- LLMs write markdown, driver writes JSON
type: decision
created: '2026-04-16T07:14:03Z'
modified: '2026-04-16T07:14:03Z'
---

The file boundary invariant is a load-bearing architectural constraint in koan governing file ownership across the system's actors. On 2026-02-10, Leon established this rule in the koan initial design (documented in `docs/architecture.md` as Invariant 1). The rule: LLM subagents write markdown files only; the koan driver (`koan/driver.py`) reads and writes JSON state files exclusively; tool code in `koan/web/mcp_endpoint.py` bridges both worlds by writing JSON state (for the driver) and templated markdown status files (for LLMs) in the same operation. Leon's stated rationale: if an LLM writes a JSON file, schema drift and parse errors in the payload become runtime failures in the deterministic driver, while markdown is forgiving. The invariant is enforced structurally -- planning-role subagents have write access scoped to the run directory (`~/.koan/runs/<id>/`) but no mechanism to produce JSON state files, and the driver reads JSON state files and exit codes only, never parsing markdown.
10 changes: 10 additions & 0 deletions .koan/memory/0005-phase-trust-model-plan-review-as-designated.md
@@ -0,0 +1,10 @@
---
title: Phase trust model -- plan-review as designated adversarial verifier
type: decision
created: '2026-04-16T07:35:13Z'
modified: '2026-04-16T07:35:13Z'
related:
- 0001-persistent-orchestrator-over-per-phase-cli.md
---

The plan workflow's phase trust architecture in koan (`docs/phase-trust.md`, `koan/lib/workflows.py`) was designed around an asymmetric verification model. On 2026-02-10, Leon formalized this as part of the initial koan design: phases in the plan pipeline (intake, plan-spec, execute) were built to trust each other's outputs without re-verification; only plan-review was designated as the adversarial verifier. Leon documented the rationale in `docs/phase-trust.md`: cross-phase re-verification is the "intrinsic self-correction" anti-pattern -- research shows the same LLM re-checking its own prior work is more likely to change correct conclusions to incorrect ones than the reverse. Leon gave plan-review the CRITIC role: it uses the actual codebase as an external tool to check every file path, function name, signature, and type claim in `plan.md` against reality. Leon also decided that plan-review would be advisory only -- it reports findings with severity classification and may suggest looping back to plan-spec for critical or major issues, but it does not modify `plan.md` itself.
@@ -0,0 +1,10 @@
---
title: Directory-as-contract -- task.json over CLI flags for subagent configuration
type: decision
created: '2026-04-16T07:35:24Z'
modified: '2026-04-16T07:35:24Z'
related:
- 0004-file-boundary-invariant-llms-write-markdown.md
---

The subagent configuration mechanism in koan (`koan/subagent.py`, `docs/subagents.md`) was redesigned on 2026-02-10 when Leon replaced a 9-CLI-flag approach with a task.json file convention, later documented as Invariant 6 (Directory-as-contract) in `docs/architecture.md`. The previous design passed task configuration as 9 CLI arguments; Leon replaced it after identifying four problems: (1) the flat flag namespace caused naming collisions (`--koan-role` vs `--koan-scout-role`); (2) role-specific fields mixed with common fields without structure; (3) `--koan-retry-context` needed to carry multi-paragraph summaries exceeding practical CLI limits; (4) after a crash, reconstructing what a subagent had been asked required parsing process arguments from system logs. Leon adopted the convention that the driver would write `task.json` atomically (tmp + `os.rename()`) to the subagent directory before spawn. The subagent discovers its MCP endpoint by reading `mcp_url` from that file. No structured configuration flows through CLI flags, environment variables, or other process-level channels. Leon designated `task.json` as write-once by the parent before spawn and read-once by the parent at agent registration, never modified afterward.
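The atomic write described above (tmp file plus `os.rename()`) can be sketched as follows. This is an assumption-laden illustration rather than the code in `koan/subagent.py`: the function name `write_task_json` and the `role` field are hypothetical, and only `mcp_url` is taken from the entry.

```python
import json
import os
import tempfile
from pathlib import Path


def write_task_json(subagent_dir: Path, task: dict) -> Path:
    """Write task.json so a reader sees either no file or the complete file."""
    subagent_dir.mkdir(parents=True, exist_ok=True)
    final_path = subagent_dir / "task.json"
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=subagent_dir, prefix=".task-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(task, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        # Atomic on POSIX; task.json is write-once, so no existing file is expected.
        os.rename(tmp_name, final_path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = write_task_json(
            Path(root) / "scout-1",
            {"role": "scout", "mcp_url": "http://127.0.0.1:8000/mcp"},  # example values
        )
        print(json.loads(path.read_text(encoding="utf-8"))["mcp_url"])
```

Because the file is complete before it becomes visible under its final name, a subagent that starts immediately after spawn cannot read a half-written configuration.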
@@ -0,0 +1,11 @@
---
title: Dual fold system -- audit fold (per-subagent disk) vs projection fold (workflow
  SSE)
type: decision
created: '2026-04-16T07:35:36Z'
modified: '2026-04-16T07:35:36Z'
related:
- 0003-server-authoritative-projection-via-json-patch.md
---

The state-management layer of koan (`koan/audit/fold.py`, `koan/projections.py`) was designed around two independent fold systems. On 2026-03-29, Leon documented the distinction in `docs/architecture.md` (section "Two Fold Systems"). Leon designed the audit fold to process per-subagent audit events from each subagent's `events.jsonl`, materialize a per-subagent `Projection` object written to `state.json` on disk after every event, and serve debugging and post-mortem consumers. Leon designed the projection fold to process workflow-level projection events emitted by `ProjectionStore.push_event()`, maintain a single in-memory `Projection` covering all agents and run state for the entire workflow, and serve the browser frontend via SSE. Leon chose to keep the two systems independent rather than merging them: the audit fold needed per-event disk writes for durability, while the projection fold needed to stay in-memory for SSE streaming throughput. Leon established the rule that state visible only in logs belongs to the audit fold, while state visible in the browser UI belongs to the projection fold.
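A rough sketch of the two folds side by side may help. Only `events.jsonl`, `state.json`, and the `push_event` method name come from the text; the event shapes, the state fields, and the `AuditState` and `WorkflowStore` classes are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class AuditState:
    """Per-subagent audit state (hypothetical fields)."""
    event_count: int = 0
    last_tool: Optional[str] = None


def fold_audit_event(state: AuditState, event: dict) -> AuditState:
    """Pure fold step: previous state plus one event gives the next state."""
    return AuditState(
        event_count=state.event_count + 1,
        last_tool=event.get("tool", state.last_tool),
    )


def append_audit_event(subagent_dir: Path, state: AuditState, event: dict) -> AuditState:
    """Audit fold: log the event, fold it, and persist state.json after every event."""
    with open(subagent_dir / "events.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
    new_state = fold_audit_event(state, event)
    (subagent_dir / "state.json").write_text(json.dumps(asdict(new_state)), encoding="utf-8")
    return new_state


@dataclass
class WorkflowStore:
    """Projection fold: one in-memory projection covering the whole workflow."""
    projection: dict = field(default_factory=lambda: {"agents": {}})
    subscribers: list = field(default_factory=list)

    def push_event(self, event: dict) -> None:
        agents = dict(self.projection["agents"])
        agents[event["agent_id"]] = event["status"]
        self.projection = {**self.projection, "agents": agents}
        for notify in self.subscribers:
            notify(self.projection)  # stand-in for an SSE push; nothing touches disk
```

The asymmetry is the point of the design: the audit path pays for a disk write on every event, while the projection path never touches disk and only notifies subscribers.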
@@ -0,0 +1,10 @@
---
title: Three-tier model system (strong/standard/cheap) over per-role model configuration
type: decision
created: '2026-04-16T07:35:45Z'
modified: '2026-04-16T07:35:45Z'
related:
- 0001-persistent-orchestrator-over-per-phase-cli.md
---

The model selection system in koan (`koan/config.py`, `docs/subagents.md` -- Model Tiers section) was designed on 2026-02-10 when Leon grouped the 6+ agent roles into three capability tiers rather than mapping each role to an individual model. Leon defined the tiers as: `strong` (orchestrator -- complex multi-step reasoning), `standard` (executor -- reliable tool use for code implementation), and `cheap` (scout -- narrow codebase investigation). Leon encoded the role-to-tier mapping in `koan/config.py`. Leon adopted a profile-based configuration system persisted to `~/.koan/config.json` that binds each tier to a specific runner type and model name; switching profiles changes all three tier bindings at once without touching role definitions. Leon rejected per-role model configuration because, with 6+ roles, each model change would require updating 6+ bindings; the tier system reduces that to 3 bindings per profile switch.
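The indirection can be illustrated with a small sketch. The three tier names and the orchestrator/executor/scout assignments come from the text; the `TierBinding` type, the `resolve_model` helper, and all runner, model, and profile names are placeholders rather than koan's actual configuration schema.

```python
from dataclasses import dataclass

# Role-to-tier mapping: defined once, untouched by profile switches.
ROLE_TO_TIER = {
    "orchestrator": "strong",
    "executor": "standard",
    "scout": "cheap",
}


@dataclass(frozen=True)
class TierBinding:
    runner: str  # hypothetical runner type name
    model: str   # hypothetical model name


# Each profile binds exactly three tiers (placeholder values).
PROFILES = {
    "default": {
        "strong": TierBinding("runner-a", "model-large"),
        "standard": TierBinding("runner-a", "model-medium"),
        "cheap": TierBinding("runner-a", "model-small"),
    },
    "budget": {
        "strong": TierBinding("runner-b", "model-medium"),
        "standard": TierBinding("runner-b", "model-small"),
        "cheap": TierBinding("runner-b", "model-small"),
    },
}


def resolve_model(role: str, profile: str) -> TierBinding:
    """Resolve role -> tier -> binding; unknown roles or profiles raise KeyError."""
    return PROFILES[profile][ROLE_TO_TIER[role]]
```

Adding a seventh role means adding one entry to `ROLE_TO_TIER`; no profile has to change, which is the maintenance saving the tier system is meant to buy.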
@@ -0,0 +1,10 @@
---
title: Permission fence impractical across LLM backends; planned for removal
type: lesson
created: '2026-04-16T08:34:06Z'
modified: '2026-04-16T08:34:06Z'
related:
- 0001-persistent-orchestrator-over-per-phase-cli.md
---

The permission fence in koan (`koan/lib/permissions.py`) was initially designed as a default-deny gate enforced on every MCP tool call. On 2026-02-10, Leon established it as Invariant 4 in `docs/architecture.md`, describing it as a load-bearing rule that blocked unknown roles and tools. By approximately 2026-04-08, Leon reversed this assessment, stating in a Claude Code project memory note that the fence is "probably not worth maintaining". The root cause he identified: many coding agents do not support accurately disabling tool features, so the gate cannot be enforced reliably across LLM backends, and the maintenance cost outweighs the benefit. Leon directed that no effort be invested in extending or hardening the permission fence and noted that it may be removed entirely in a future update. As of 2026-04-16 the fence still exists in the codebase but is deprioritized; the architecture documentation has not been updated to reflect this change of direction and still describes the fence as load-bearing.
@@ -0,0 +1,10 @@
---
title: 'Curation phase: 3-step layout collapsed to 2 to prevent meaty-step skip failure'
type: lesson
created: '2026-04-16T08:34:15Z'
modified: '2026-04-16T08:34:15Z'
related:
- 0002-step-first-workflow-pattern-boot-prompt-is.md
---

The curation phase module in koan (`koan/phases/curation.py`) was originally implemented as a 3-step workflow with step names "Survey", "Curate", and "Finalize/Reporting". During a curation run whose output Leon reviewed in screenshots, the orchestrator was observed to confuse "Survey" with intake-style exploration and then reach "phase complete" without ever calling `koan_memorize` -- a failure mode where the curation phase ended with zero memory writes. Leon identified two root causes: (1) the name "Survey" triggered intake-like behavior; (2) there was no per-step structural framing (no workflow_shape, goal, or tools list) visible at the moment the LLM decided whether to advance. On 2026-04-16, Leon approved a redesign that collapsed the 3 steps to 2 (Inventory and Memorize), named after their primary tool effects (`koan_memory_status` and `koan_memorize`) to make step-skipping visible, and added `<workflow_shape>`, `<goal>`, and `<tools_this_step>` XML blocks to every step, re-injected at each `koan_complete_step` call so the phase structure is visible at the moment of use rather than only at step 1.
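One way to picture the re-injected framing is a helper that renders the three blocks for whichever step is current. The tag names, step names, and tool names come from the text; the `Step` type, the `render_step_guidance` function, the goal wording, and the bracket notation for the current step are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    goal: str
    tools: tuple


# Step names and tools follow the text; the goal wording is invented for illustration.
CURATION_STEPS = (
    Step("Inventory", "List what memory already holds.", ("koan_memory_status",)),
    Step("Memorize", "Write the curated entries.", ("koan_memorize",)),
)


def render_step_guidance(steps, index: int) -> str:
    """Render the framing blocks for steps[index], marking the current step."""
    current = steps[index]
    shape = " -> ".join(
        f"[{s.name}]" if i == index else s.name for i, s in enumerate(steps)
    )
    return "\n".join([
        f"<workflow_shape>{shape}</workflow_shape>",
        f"<goal>{current.goal}</goal>",
        f"<tools_this_step>{', '.join(current.tools)}</tools_this_step>",
    ])
```

Calling the helper at every step transition, rather than once at step 1, is what keeps the phase structure in view when the model decides whether to advance.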
@@ -0,0 +1,15 @@
---
title: 'Intake confidence loop removed: unnecessary scout batches and intrinsic self-correction
  risk'
type: lesson
created: '2026-04-16T08:34:26Z'
modified: '2026-04-18T16:21:49Z'
related:
- 0002-step-first-workflow-pattern-boot-prompt-is.md
- 0005-phase-trust-model-plan-review-as-designated.md
- 0013-single-cognitive-goal-per-step-prevents-simulated.md
---

The intake phase in koan (`koan/phases/intake.py`) previously included a confidence-gated loop where steps 2-4 would repeat based on a structured confidence value. On 2026-04-12, Leon collapsed intake to a focused 2-step design (Gather + Deepen), removing the loop for three reasons: (a) it produced unnecessary second scout batches; (b) the Reflect step risked intrinsic self-correction -- the same LLM verifying its own prior reasoning rather than checking against actual codebase files; (c) a single thorough Deepen pass was sufficient when that step was well-scoped. Phase completion was redefined by depth of understanding, not iteration count.

On 2026-04-17, Leon extracted a dedicated Summarize step from Deepen's conclusion, bringing intake to 3 steps total: Gather, Deepen, Summarize. The split applies the single-cognitive-goal-per-step principle (entry 0013): Deepen stays focused on dialogue and codebase verification; Summarize is a distinct step for synthesizing findings into a planning handoff. The confidence-loop removal rationale is unchanged -- the step count change only separates concerns that were already happening at the end of step 2. Note: `docs/intake-loop.md` still describes the older 2-step design as of 2026-04-18 and requires a separate update.
@@ -0,0 +1,8 @@
---
title: Koan is dog-fooded on its own development -- meta-context for agents
type: context
created: '2026-04-16T08:34:35Z'
modified: '2026-04-16T08:34:35Z'
---

Koan is a solo project maintained by Leon Mergen, as Leon confirmed in a curation run on 2026-04-16. Since the initial koan design on 2026-02-10, Leon has used koan's own plan workflow to develop koan itself, dog-fooding the system as its own first user. This creates a meta-context constraint for any agent working on the koan codebase: workflow instructions and phase prompts in `koan/phases/*.py` and `koan/lib/workflows.py` are runtime instructions for koan's orchestrator subagents to execute, not instructions for the agent currently editing the source files. For example, the `SYSTEM_PROMPT` strings in `koan/phases/intake.py` are the intake orchestrator's role instructions; `koan/phases/curation.py` contains the step guidance that koan's curation orchestrator follows. An agent must not conflate "a prompt being analyzed as source material" with "a prompt being given as a direct instruction." Leon named this the "meta use of koan" and stated it explicitly in the task prompt for the 2026-04-16 curation run.
@@ -0,0 +1,11 @@
---
title: Single cognitive goal per step -- prevents simulated refinement
type: decision
created: '2026-04-16T08:37:25Z'
modified: '2026-04-16T08:37:25Z'
related:
- 0002-step-first-workflow-pattern-boot-prompt-is.md
- 0010-curation-phase-3-step-layout-collapsed-to-2-to.md
---

The step design constraint for koan phases (`docs/architecture.md` -- Pitfalls section, "Don't give a step multiple cognitive goals") was established on 2026-02-10 when Leon set a rule: each `koan_complete_step` call must correspond to exactly one cognitive goal. Leon identified the failure mode that motivated this rule: when a single step combines multiple goals ("do A, then B, then C"), the LLM can engage in "simulated refinement" -- artificially downgrading its output for A in order to manufacture visible improvement in C, without genuinely improving anything. Leon documented this as a design constraint: when adding a new phase, each step must answer "what is the single thing this step accomplishes?" and if the answer requires "and then," the step must be split. Leon's reference designs in `koan/phases/plan_spec.py` (Analyze + Write), `koan/phases/intake.py` (Gather + Deepen), and `koan/phases/curation.py` (Inventory + Memorize) each place cognitively distinct operations into separate `koan_complete_step` calls.
@@ -0,0 +1,12 @@
---
title: 'CamelCase wire format: eliminates renaming layer between projection and Zustand
  store'
type: decision
created: '2026-04-16T08:37:35Z'
modified: '2026-04-16T08:37:35Z'
related:
- 0003-server-authoritative-projection-via-json-patch.md
- 0007-dual-fold-system-audit-fold-per-subagent-disk-vs.md
---

The SSE wire format for koan's projection system (`koan/projections.py`, `frontend/src/sse/connect.ts`) was designed to use camelCase keys for all serialized projection fields. On 2026-03-29, Leon documented this decision in `docs/projections.md` (Design Decisions -- "Why camelCase on the wire"). Leon's rationale: emitting snake_case from the server would require a `mapProjectionToStore()` renaming function in the frontend TypeScript plus a `projectionState` shadow object for patch application (patches must apply to the pre-renamed dict, not the renamed Zustand store); every new projection field would require a rename entry in that mapping. Leon identified this mapping layer as frontend business logic, contradicting his "frontend has zero business logic" principle. By adopting camelCase -- via Pydantic's `alias_generator=to_camel` in `KoanBaseModel` (`koan/projections.py`) -- patches produced by `jsonpatch.make_patch()` apply directly to the Zustand store in `frontend/src/store/`, and snapshot state spreads directly into the store at reconnect with no field renaming.
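The effect of emitting camelCase at the serialization boundary can be shown with a stdlib-only stand-in. According to the text, koan gets this from Pydantic's `alias_generator=to_camel`; the `to_camel` and `camelize_keys` functions below are hypothetical re-implementations written only to illustrate the idea, and the sample field names are assumed.

```python
def to_camel(name: str) -> str:
    """Convert a snake_case identifier to camelCase (is_primary -> isPrimary)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camelize_keys(value):
    """Recursively rename dict keys to camelCase.

    Note: this sketch renames every key it meets. Model-based aliasing, as in
    the Pydantic approach the text describes, would rename declared field
    names only and leave data keys such as agent IDs untouched.
    """
    if isinstance(value, dict):
        return {to_camel(k): camelize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camelize_keys(v) for v in value]
    return value


# Server-side projection in Python naming (hypothetical fields).
projection = {
    "run_state": {"is_primary": True},
    "agents": [{"agent_id": "a1", "step_name": "Gather"}],
}

wire = camelize_keys(projection)
```

Since the renaming happens once on the server, a patch path such as `/runState/isPrimary` already matches the store's key names, so the frontend needs no mapping table.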