diff --git a/.agents/skills/ln-build/SKILL.md b/.agents/skills/ln-build/SKILL.md
index b0deab81..9055e9a7 100644
--- a/.agents/skills/ln-build/SKILL.md
+++ b/.agents/skills/ln-build/SKILL.md
@@ -40,15 +40,63 @@ Run the project's verification harness. All checks must pass. Commit: `feat: [ta
## Traceability (mandatory — do before routing)
-After the slice lands and verification passes, do all of these before presenting routing options:
+After the slice lands and verification passes, update only the traceability items touched by this slice. For each candidate artifact, choose exactly one action: **add**, **update**, **merge**, **archive**, or **no-op**.
-1. If working from `memory/PLAN.md`, mark the slice `done`. Check `## Dependencies` — if this slice unblocked multiple downstream slices, note them as newly available (some may be parallelizable). If working from `memory/REFACTOR.md`, mark the commit step complete there instead
-2. Update `memory/SPEC.md` §Assumptions — set `Status` to `validated` or `invalidated` as evidence warrants, update `Confidence` if the evidence changed it, and flag implicated slices in PLAN.md
-3. Add new invariants to `memory/SPEC.md` §Invariants — each structural property now protected by tests. If working from `memory/PLAN.md`, update the `Invariants established` field on the corresponding slice
-4. Add any new decisions to `memory/SPEC.md` §Decisions, new assumptions to §Assumptions
-5. Update `memory/SPEC.md` §Verification Design → Current Coverage with new test files and counts
+### Local comparison set
-These are bookkeeping steps, not optional. Routing comes after.
+Compare new facts only against items the current slice already references:
+
+- the current slice block in `memory/PLAN.md` (and its tracer bullets)
+- rows in `memory/SPEC.md` named by the slice (§Assumptions, §Decisions, §Invariants to respect/established)
+- assumption/decision IDs from the scope card
+- test files added or changed in this slice
+
+Do **not** scan the whole spec looking for the perfect merge target. If nothing in this local set clearly matches, **add** and let `ln-sync` consolidate later.
+
+### Same-item tests
+
+Use these to decide whether a candidate fact is already covered by an existing local row:
+
+- **Same assumption** = same boundary/component + same unresolved claim. Differences in wording, confidence, evidence, or validation method → same assumption.
+- **Same decision** = same seam/boundary + same chosen alternative. Narrower helpers, file layout, implementation mechanics, or first concrete use of an already-chosen pattern → same decision.
+- **Same invariant** = same seam/boundary + same rule template + same proved decision(s). Approve/reject, confirm/force-close, reload/refresh/resume, or kind/phase/state variants of one shared rule → same invariant.
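The three same-item tests above can be read as key comparisons. The following is a minimal sketch only: the row shapes and field names (`boundary`, `claim`, `seam`, `chosenAlternative`, `ruleTemplate`, `provedDecisions`) are hypothetical and do not come from `memory/SPEC.md`, and it assumes the claim or rule has already been normalized to a comparable string, which in practice is a judgment call.

```typescript
// Hypothetical row shapes for illustration; not the real SPEC.md schema.
interface AssumptionRow { id: string; boundary: string; claim: string; confidence?: string }
interface DecisionRow { id: string; seam: string; chosenAlternative: string }
interface InvariantRow { id: string; seam: string; ruleTemplate: string; provedDecisions: string[] }

// Order-insensitive comparison of two ID lists.
function sameSet(xs: string[], ys: string[]): boolean {
  const a = new Set(xs);
  const b = new Set(ys);
  return a.size === b.size && Array.from(a).every((x) => b.has(x));
}

// Same boundary + same unresolved claim; confidence and wording are ignored.
function sameAssumption(a: AssumptionRow, b: AssumptionRow): boolean {
  return a.boundary === b.boundary && a.claim === b.claim;
}

// Same seam + same chosen alternative.
function sameDecision(a: DecisionRow, b: DecisionRow): boolean {
  return a.seam === b.seam && a.chosenAlternative === b.chosenAlternative;
}

// Same seam + same rule template + same proved decision(s).
function sameInvariant(a: InvariantRow, b: InvariantRow): boolean {
  return (
    a.seam === b.seam &&
    a.ruleTemplate === b.ruleTemplate &&
    sameSet(a.provedDecisions, b.provedDecisions)
  );
}
```

Under this sketch, two assumption rows that differ only in `confidence` compare as the same item, so the slice would update or merge rather than add.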
+
+### Steps
+
+1. **Mark completion.** Mark the slice or tracer bullet `done` in `memory/PLAN.md`. Note newly unblocked downstream slices.
+
+2. **Assumptions** — for each assumption the slice touched or relied on:
+ - Evidence answered it → **update** status to `validated` or `invalidated`; flag implicated slices
+ - Evidence changed certainty only → **update** confidence
+ - Same assumption exists locally → **merge** into it
+ - New unresolved belief that the slice depended on, that code/tests do not already guarantee, and that would change future work if false → **add**
+ - Otherwise → **no-op**
+
+3. **Decisions** — a decision records a committed choice at a seam, not an execution diary entry:
+ - Slice only implemented an existing decision without changing the choice → **no-op**
+ - Same decision exists locally and choice stayed the same → **update** (clearer rationale/scope) or **merge** (narrower instance of same pattern)
+ - Slice chose one of ≥2 plausible alternatives, the choice is non-trivial to reverse, and future work could revisit it → **add**
+ - Slice changed the answer at the same seam → **add** new row with `Supersedes: Dn`
+ - Otherwise → **no-op**
+
+4. **Invariants** — prefer one seam-level invariant over many branch-level invariants:
+ - No new/changed test protects the property → **no-op**
+ - Property is temporary migration state or one example of a broader rule → **merge** or **no-op**
+ - Same invariant exists locally and only `Protected by` grew → **update**
+ - Candidate is another branch/state/kind/phase/action variant of the same rule → **merge** (keep surviving ID, union `Protected by`, append to `Established by` only if the statement widened)
+ - Property can regress independently of all local invariants (different seam, rule, proved decision, or test family) → **add**
+ - Otherwise → **merge**
+
+5. **Completed-slice note in PLAN.md** — max 4 bullets / 6 lines:
+ - shipped outcome
+ - seam changed (optional)
+ - evidence (tests/manual)
+ - remaining debt or follow-up (optional)
+ - If a note already exists, **update** it; do not append another paragraph. If marking `done` plus invariant/decision updates already captures everything → **no-op**
+
+6. **Verification coverage** — update `memory/SPEC.md` §Current Coverage. If the test file already appears, **update** counts; do not add a duplicate entry.
+
+When uncertain between merge and add → add. When uncertain between update and no-op → update.
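As an illustration, the assumption branch of the steps and the closing tie-break rule could be sketched as a small decision function. This is a sketch under stated assumptions: the `AssumptionEvidence` flags are hypothetical names, and evaluating the bullets top to bottom with the first match winning is an assumed reading, not something the skill text states.

```typescript
type Action = "add" | "update" | "merge" | "archive" | "no-op";

// Hypothetical flags summarizing what the slice learned about one assumption.
interface AssumptionEvidence {
  answered: boolean;            // evidence validated or invalidated the claim
  certaintyChanged: boolean;    // evidence moved confidence only
  sameExistsLocally: boolean;   // a row in the local comparison set passes the same-assumption test
  newUnresolvedBelief: boolean; // depended on, not guaranteed by code/tests, would change future work if false
}

// Assumed precedence: bullets are checked in the order they are listed.
function chooseAssumptionAction(e: AssumptionEvidence): Action {
  if (e.answered) return "update";
  if (e.certaintyChanged) return "update";
  if (e.sameExistsLocally) return "merge";
  if (e.newUnresolvedBelief) return "add";
  return "no-op";
}

// Tie-break rule: merge vs add resolves to add; update vs no-op resolves to update.
// Returns undefined for pairs the rule does not cover.
function breakTie(first: Action, second: Action): Action | undefined {
  const pair = [first, second];
  const has = (x: Action) => pair.indexOf(x) !== -1;
  if (has("merge") && has("add")) return "add";
  if (has("update") && has("no-op")) return "update";
  return undefined;
}
```

The tie-break leans toward keeping information: an extra row or an extra update can be consolidated later by `ln-sync`, whereas a wrong merge or a skipped update loses detail.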
## Routing
diff --git a/.agents/skills/ln-plan/SKILL.md b/.agents/skills/ln-plan/SKILL.md
index 69d50d79..86545f60 100644
--- a/.agents/skills/ln-plan/SKILL.md
+++ b/.agents/skills/ln-plan/SKILL.md
@@ -8,6 +8,10 @@ argument-hint: "[feature or project area to plan]"
Break a feature into tracer-bullet slices and spikes (Hunt & Thomas), grouped into temporal phases. Slices are thin end-to-end paths through all integration layers. Order by uncertainty first, dependency second (Reinertsen: retire risk early, not just finish tasks early).
+**Anti-fragmentation heuristic.** Create a new slice only when it introduces at least one of: (1) a new lifecycle seam, (2) a new cross-boundary transport/persistence seam, (3) a new workflow-mode entry/exit behavior, or (4) a new unblocker for reaching the end-to-end working app state. Do **not** create separate slices for additional action/status permutations or rarer branches on the same seam unless they materially unblock progress.
+
+**Refinement sink.** When deferred variants accumulate, collect them into a later cross-cutting refinement slice instead of fragmenting the active major slice. Plan for the dominant path first; explicitly defer edge-case polish and rarer lifecycle variants.
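One way to picture the heuristic and the refinement sink together is as a partition over candidate slices. The sketch below is illustrative only; the `SliceCandidate` shape and its flag names are hypothetical, and deciding whether a candidate really introduces a new seam remains a planning judgment.

```typescript
// Hypothetical candidate shape; each flag mirrors one of the four criteria.
interface SliceCandidate {
  name: string;
  newLifecycleSeam: boolean;
  newTransportOrPersistenceSeam: boolean;
  newWorkflowModeBehavior: boolean;
  unblocksEndToEnd: boolean;
}

// A candidate earns its own slice when at least one criterion holds.
function warrantsOwnSlice(c: SliceCandidate): boolean {
  return (
    c.newLifecycleSeam ||
    c.newTransportOrPersistenceSeam ||
    c.newWorkflowModeBehavior ||
    c.unblocksEndToEnd
  );
}

// Everything else is deferred by name into one later refinement slice.
function partitionCandidates(candidates: SliceCandidate[]): { slices: string[]; deferred: string[] } {
  const slices: string[] = [];
  const deferred: string[] = [];
  for (const c of candidates) {
    (warrantsOwnSlice(c) ? slices : deferred).push(c.name);
  }
  return { slices, deferred };
}
```

Keeping the deferred names in an explicit list matches the intent that rarer variants are deferred openly rather than dropped.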
+
**Epistemic horizon.** Plan depth must match confidence depth. If `memory/SPEC.md` §Assumptions contains low-confidence items that downstream slices depend on, the plan's horizon stops there — plan spikes that retire the uncertainty, not slices that assume it away.
**Spike economics.** For each low-confidence assumption, evaluate: how many slices depend on it, how cheaply it can be falsified, what decisions it unlocks. High fan-out + low falsification cost → spike early. When uncertainty is broad, the first slices should be invariant-establishing (walking skeleton), not feature-delivering.
@@ -24,7 +28,7 @@ If context is thin, run a brief interview (not a full `ln-grill`) to fill gaps.
1. If `memory/PLAN.md` exists, read it first. Retire completed slices (mark `done`). Assess what remains and what's changed.
2. Explore the codebase. Identify architectural constraints the slices must respect (routes, schema, auth, third-party boundaries).
-3. Draft or revise phases and slices. Each slice must be independently demoable and independently grabbable where possible. Group into temporal phases. For each, name dependent requirements and assumptions from `memory/SPEC.md`, plus any candidate invariant goals to establish or existing invariants to respect.
+3. Draft or revise phases and slices. Each slice must be independently demoable and independently grabbable where possible. Group into temporal phases. Bias toward the minimum slice set that covers the main user story and keeps the app moving end to end; if a major slice can be sub-sliced many ways, keep only the variations that unblock forward progress and defer the rest explicitly. For each slice, name dependent requirements and assumptions from `memory/SPEC.md`, plus any candidate invariant goals to establish or existing invariants to respect.
4. Observe and respect local project protocols for mapping slices/spikes to issues or tickets, associated codes, and branch naming conventions, if any. Capture project-specific tracking metadata as optional execution detail — not as the core identity of the slice.
5. Confirm with user — adjust granularity, reorder, split or merge.
6. **Post-edit checklist** — after any addition, removal, or reordering:
diff --git a/.agents/skills/ln-scope/SKILL.md b/.agents/skills/ln-scope/SKILL.md
index bd4c32b0..e32a6762 100644
--- a/.agents/skills/ln-scope/SKILL.md
+++ b/.agents/skills/ln-scope/SKILL.md
@@ -8,6 +8,10 @@ argument-hint: "[behavior to deliver in this slice]"
Define **one** tracer-bullet slice (Hunt & Thomas) — a thin end-to-end path, not a horizontal layer. If the target behavior needs "and", split it.
+**Sub-slicing restraint.** Create a new sub-slice only when it introduces at least one of: (1) a new lifecycle seam, (2) a new cross-boundary transport/persistence seam, (3) a new workflow-mode entry/exit behavior, or (4) a new unblocker for reaching the end-to-end working app state. Do **not** split off a separate slice just for another action/status permutation or a rarer branch on the same seam. If refinements accumulate, prefer one later cross-cutting refinement slice over fragmenting the current major slice.
+
+**Main-path bias.** Scope the smallest slice set that covers the dominant user story and unblocks forward progress. Rare variants, polish, and refinement work should be named explicitly as deferred rather than silently folded into the current slice.
+
## Input
The behavior to deliver: $ARGUMENTS
@@ -65,7 +69,7 @@ A slice without a verification approach is not fully scoped. At minimum, inner-l
After the scope card is complete, do these before presenting routing options:
-1. New assumptions surfaced during scoping → add to `memory/SPEC.md` §Assumptions with links to this slice
+1. New assumptions surfaced during scoping → apply `ln-build` §Same-item tests first. If the same assumption already exists in `memory/SPEC.md`, **update** or **merge** into it. Only **add** if no existing row covers the same boundary + claim.
## Routing
diff --git a/.agents/skills/ln-spec/SKILL.md b/.agents/skills/ln-spec/SKILL.md
index 4820b974..87a2c54c 100644
--- a/.agents/skills/ln-spec/SKILL.md
+++ b/.agents/skills/ln-spec/SKILL.md
@@ -38,7 +38,27 @@ If `memory/PLAN.md` exists, verify that changed assumptions and decisions still
### Weight management
-Spec documents accumulate. Each ln-sync pass may prune items that are embedded, moot, or superseded (see ln-sync §Pruning check). When *adding* items, consider whether an existing item should be retired to make room. A spec with 30 assumptions is not more rigorous than one with 10 — it's harder to read and more likely to mislead a new session.
+Use the same unit-of-record rules as `ln-build` §Same-item tests. Before adding a row, compare against nearby items in the same feature area. Prefer **update** or **merge** over **add** when the seam is the same.
+
+**Units of record:**
+
+- **Assumption** = one unresolved question at one seam
+- **Decision** = one committed choice between alternatives at one seam
+- **Invariant** = one seam-level structural property protected by tests
+
+**These are not new rows** — they are updates or merges to existing rows:
+- confidence changes, validation narratives, added evidence
+- helper names, file layout, or implementation mechanics
+- one more branch/state/kind/phase/action example of an existing rule
+- one implementation step under an already-recorded decision
+
+**Smell checks before adding:**
+- The sentence starts with "for this slice" or names a temporary cutover step → probably an update, not a new item
+- The difference is only approve/reject, confirm/force-close, or kind/phase/state variants of one shared rule → merge into the seam-level row
+- The item would stop making sense once the code ships and no alternative remains live → probably a decision that should not be tracked
+- The item is an implementation mechanic inside an already-chosen boundary → no-op
+
+Large cleanup is `ln-sync` work. When writing or patching, keep the touched area coherent; do not attempt a risky whole-document consolidation.
### Cross-reference integrity
diff --git a/.agents/skills/ln-sync/SKILL.md b/.agents/skills/ln-sync/SKILL.md
index cfa215b5..f5cf8126 100644
--- a/.agents/skills/ln-sync/SKILL.md
+++ b/.agents/skills/ln-sync/SKILL.md
@@ -32,21 +32,69 @@ For each assumption whose `Status` is `invalidated`:
- Preserve `Status: invalidated` in §Assumptions
- Flag all implicated slices in `memory/PLAN.md`
-### 2b. Pruning check
+### 2b. Consolidation pass
-Tracked items accumulate. Large assumption and decision tables become a confusion surface — new sessions inherit stale context and make wrong inferences. After graduation, assess each remaining item for removal:
+`ln-build` is local and conservative — it only compares against items the current slice references. `ln-sync` owns whole-document consolidation: merging equivalent rows, generalizing micro-variants, and absorbing implementation-detail decisions.
+
+Read the full affected sections and merge overly granular items before pruning.
+
+#### Same-item tests (from ln-build — apply globally here)
+
+- **Same assumption** = same boundary/component + same unresolved claim
+- **Same decision** = same seam/boundary + same chosen alternative
+- **Same invariant** = same seam/boundary + same rule template + same proved decision(s)
+
+#### Global consolidation rules
+
+- Keep the **oldest surviving ID** among equivalent rows
+- Rewrite that survivor to the **generalized statement**
+- Union metadata (`Protected by`, `Established by`, dependencies, implicated slices, validation evidence)
+- Remove absorbed rows and leave an HTML comment naming absorbed IDs and why
+- Do **not** renumber surviving items
+- Rewrite references in both `SPEC.md` and `PLAN.md` from absorbed IDs to the surviving ID
+
+Comment format: ``
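The consolidation rules above can be sketched as a merge over a group of equivalent rows. This is a minimal sketch with assumptions: the `Row` shape is hypothetical, IDs are assumed to be a letter prefix plus a number (as in `I85`), and "oldest" is assumed to mean the lowest number.

```typescript
// Hypothetical row shape; only two metadata fields are modeled here.
interface Row { id: string; statement: string; protectedBy: string[]; establishedBy: string[] }

// Assumes IDs look like "I85": non-digit prefix, then a number.
function idNumber(id: string): number {
  return parseInt(id.replace(/^\D+/, ""), 10);
}

// Keep the oldest ID, rewrite it to the generalized statement, union metadata.
function consolidate(rows: Row[], generalized: string): { survivor: Row; absorbed: string[] } {
  if (rows.length === 0) throw new Error("nothing to consolidate");
  const sorted = rows.slice().sort((a, b) => idNumber(a.id) - idNumber(b.id));
  const union = (pick: (r: Row) => string[]): string[] =>
    Array.from(new Set(sorted.reduce<string[]>((acc, r) => acc.concat(pick(r)), [])));
  const survivor: Row = {
    id: sorted[0].id, // surviving IDs are never renumbered
    statement: generalized,
    protectedBy: union((r) => r.protectedBy),
    establishedBy: union((r) => r.establishedBy),
  };
  return { survivor, absorbed: sorted.slice(1).map((r) => r.id) };
}

// Rewrite whole-token references from absorbed IDs to the surviving ID.
function rewriteReferences(text: string, absorbed: string[], survivorId: string): string {
  return absorbed.reduce(
    (acc, id) => acc.replace(new RegExp("\\b" + id + "\\b", "g"), survivorId),
    text,
  );
}
```

The word-boundary match in `rewriteReferences` is there so that rewriting `I86` does not touch a longer ID such as `I860`. The same function would be applied to the text of both `SPEC.md` and `PLAN.md`.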
+
+#### Assumptions — merge when:
+
+- same boundary/component + same unresolved claim + differences are only wording, confidence, evidence, or validation method
+- After merge: keep one row, preserve strongest status/evidence, union dependent decisions and implicated slices
+
+#### Decisions — merge when:
+
+- same seam/boundary + same chosen alternative + newer rows only add implementation detail, narrower examples, or first use cases of the same pattern
+- Keep separate when the rows record different alternatives at the same seam, or when either choice could still be revisited independently
+
+#### Invariants — merge when:
+
+- same seam/boundary + same rule template + same proved decision(s), or one row is a strict example/branch of the other
+- Prefer the generalized seam-level wording; union protecting test files and establishing slices
+- Keep separate only when they can regress independently because seam, rule, proof, or test family differs
+
+#### Completed-slice notes in PLAN.md
+
+For every `done` slice:
+- keep at most one compact completion block (max 4 bullets / 6 lines): shipped outcome, seam changed, evidence, remaining debt
+- replace verbose `Observed current state` / `Observed code seam` narratives with the compact form
+- if the parent slice is `done`, fold tracer-bullet prose into the parent note
+- delete completion notes that only repeat acceptance text, invariant IDs, or commit history
+
+Git is the history. PLAN.md keeps only routing-relevant summaries.
+
+### 2c. Pruning check
+
+After consolidation, assess each remaining item for removal:
| State | Criterion | Action |
| --- | --- | --- |
-| **Embedded** | Status `validated`, and now a structural property of the code — restating it as a tracked question adds noise, not clarity | Remove — the code is the proof |
-| **Moot** | Status `invalidated`, and the concern no longer applies (e.g. the technology it worried about was replaced entirely) | Remove |
-| **Superseded** | Replaced by a newer decision or assumption | Remove, note in the replacement |
-
-When pruning, leave a comment noting which IDs were removed and why (e.g. ``). Do not renumber surviving items — IDs are stable. Dereference removed items from PLAN.md slice cross-references.
+| **Embedded** | Now a structural property of code/tests/decisions/invariants; restating it as a live tracked item adds noise | Remove |
+| **Moot** | The concern no longer applies in the current architecture | Remove |
+| **Superseded** | Replaced by a newer decision/assumption/invariant and all references can point to the replacement | Remove, note replacement |
+| **Redundant** | Equivalent to another surviving row after consolidation | Remove |
-After pruning, repair or replace any dangling cross-references in `memory/SPEC.md` and `memory/PLAN.md` that pointed at removed assumptions, decisions, invariants, or verification notes.
+When pruning, leave a comment noting which IDs were removed and why (e.g. ``). Do not renumber surviving items.
-The same logic applies to §Decisions: a decision that is now simply how the code works, with no live alternative being weighed, can be removed. Keep decisions that record a *choice between alternatives* that future work might revisit.
+After pruning, repair or replace any dangling cross-references in `memory/SPEC.md` and `memory/PLAN.md` that pointed at removed or absorbed items.
### 3. Staleness check
diff --git a/HANDOFF.md b/HANDOFF.md
deleted file mode 100644
index 981adb2c..00000000
--- a/HANDOFF.md
+++ /dev/null
@@ -1,129 +0,0 @@
-# Handoff
-
-> Generated by `ln-handoff` at 2026-04-09T19:39Z. Read this file to resume work.
-
-## Goal
-
-Finish hardening the Phase 5 design-mode / shared phase-close seam, keep project memory current, and resume the next high-signal planning or review step without re-discovering the just-landed refactor state.
-
-## Session State
-
-- **Last completed skill**: `ln-sync` — refreshed `memory/SPEC.md` and `memory/PLAN.md`, pruned temporary refactor assumptions, removed the resolved `memory/REFACTOR.md`, and cleaned stale spike artifacts.
-- **Current skill**: `ln-handoff`
-- **Flow position**: `grill → spec → plan → [design] → [oracles] → scope → [spike] → build → review → [refactor] → [sync]`
-- **Handoff trigger**: user requested wrap-up after the completed phase-close refactor, repo cleanup, and sync pass.
-
-## In-flight work
-
-> CRITICAL: These artifacts exist only in the prior conversation, not on disk.
-> Reproduce them here with full fidelity.
-
-### Artifact 1/2 — Sync report
-
-## Sync Report
-
-### Graduations
-- No new assumption-to-decision promotions were needed during sync.
-- The recent phase-close refactor truths were already captured in persisted decisions/invariants (`D75`, `D76`, `I85`, `I86`) before sync began.
-
-### Pruned
-- `A42` removed from `memory/SPEC.md` as **embedded** — it was a temporary cutover assumption for deleting transcript-driven closure-basis recovery, and its truth now lives in `D75` + `I85`.
-- `A43` removed from `memory/SPEC.md` as **embedded** — it was a temporary transport-cutover assumption for the discriminated phase-close command union, and its truth now lives in `D73` + `I83`.
-- `memory/REFACTOR.md` removed — all six commit steps were complete, so keeping the file would preserve an already-resolved transition artifact.
-
-### Stale items
-- `memory/PLAN.md` slice 8 observed-state paragraph still referenced `memory/REFACTOR.md` by path after the refactor completed. This was updated to describe the completed phase-close refactor directly instead of pointing at a file slated for deletion.
-
-### Drift
-- No material code ↔ memory drift remained after cleanup.
-- `slice 8` is still correctly `in-progress`; downstream slices remain `9`, `10`, `11a`, `12`, `13` as documented.
-
-### Actions taken
-- Updated `memory/SPEC.md` prune comment to record the removal of `A42` and `A43`.
-- Removed `A42` and `A43` from the assumptions table.
-- Updated `memory/PLAN.md` slice 8 observed-state prose to remove the `memory/REFACTOR.md` path reference and describe the completed refactor as durable project state.
-- Deleted `memory/REFACTOR.md`.
-
-### Artifact 2/2 — Repo hygiene verdict for `spike/`
-
-- `spike/filesystem-tools.ts` was **kept**.
- - Reason: `memory/SPEC.md` assumption `A29` still cites it explicitly as the evidence artifact for the validated core-tools spike.
-- `spike/observer-fidelity.ts` was **deleted**.
- - Reason: unreferenced in repo; relies on the old `@anthropic-ai/claude-agent-sdk` spike path and the pre-canonical ontology; looked like a stale investigative relic rather than a current evidence artifact.
-- `spike/raw-sdk-tool-use.ts` was **deleted**.
- - Reason: unreferenced in repo; documents a superseded raw Anthropic SDK exploration from before the current AI SDK direction.
-
-## Review findings
-
-> ALL findings from ln-review, not just the one being acted on.
-
-| # | Finding | Status | Implications |
-| --- | ------- | ------ | ------------ |
-| 1 | No `ln-review` was run in this session after the phase-close refactor sequence | `deferred` | The next thread should strongly consider `ln-review` before widening into slice 9 or 11a |
-
-## Diagnostic evidence
-
-> Concrete proof points that informed diagnoses or shifted direction.
-> Without these, a new thread inherits conclusions but not reasoning.
-
-- `rg -n "spike/filesystem-tools|spike/" .` found exactly one live canonical reference to `spike/filesystem-tools.ts`, in `memory/SPEC.md` assumption `A29`: this justified keeping that file.
-- `rg` found no live repo references to `spike/observer-fidelity.ts` or `spike/raw-sdk-tool-use.ts`: this justified deleting them as relics.
-- `find spike -maxdepth 3 -print` originally showed three files: `filesystem-tools.ts`, `observer-fidelity.ts`, `raw-sdk-tool-use.ts`; after cleanup it showed only `spike/filesystem-tools.ts`.
-- `rg -n "memory/REFACTOR\.md" .` after cleanup found only one remaining reference in `docs/design/ln-skills-review-after-alignment.md`; no canonical `memory/*` doc still depended on the file.
-- Last full verification before this cleanup was green on commit `d2189fd` via `npm run verify` (184 tests passed, build green). The subsequent cleanup touched only memory docs and deleted unreferenced spike artifacts.
-
-## Decisions and assumptions
-
-| Item | Type | Status | Source |
-| ---- | ---- | ------ | ------ |
-| D75 | `decision` | `persisted` | `memory/SPEC.md` §Decisions |
-| D76 | `decision` | `persisted` | `memory/SPEC.md` §Decisions |
-| A15 | `assumption` | `persisted` | `memory/SPEC.md` §Assumptions |
-| A40 | `assumption` | `persisted` | `memory/SPEC.md` §Assumptions |
-| A29 evidence file should remain `spike/filesystem-tools.ts` | `decision` | `volatile` | conversation + sync reasoning |
-
-## Repo state
-
-- **Branch**: `ln/fe-540-design-mode-commitment-exploration`
-- **Recent commits**:
- - `d2189fd feat: deepen the shared phase-close module`
- - `eb117ce feat: ship explicit phase-close command union`
- - `4f1c3eb feat: cut workflow projection over to durable closure basis`
- - `f49e884 feat: persist phase-outcome closure basis`
- - `fb2854e feat: project shared force-close availability`
-- **Dirty files**:
- - `M memory/PLAN.md`
- - `M memory/SPEC.md`
- - `D memory/REFACTOR.md`
- - `D spike/observer-fidelity.ts`
- - `D spike/raw-sdk-tool-use.ts`
- - `?? HANDOFF.md`
-- **Test status**: last known full gate `npm run verify` passed on commit `d2189fd` before the docs/spike cleanup; no code-bearing runtime paths changed afterward.
-
-## Artifact status
-
-| Artifact | Exists | Current vs conversation |
-| ---------- | ------ | ----------------------- |
-| memory/SPEC.md | yes | current |
-| memory/PLAN.md | yes | current |
-| memory/REFACTOR.md | no | n/a |
-
-## Next steps
-
-1. Run `ln-review` on the phase-close / design-mode seam now that the full refactor sequence is complete.
-2. If review is clean, use `ln-scope` to pick the next planned slice — most likely `9` (requirements-review mode), unless priority shifts toward `11a` (dashboard workflow state).
-3. Commit the current doc/spike cleanup if the user wants this hygiene pass preserved as its own checkpoint.
-
-## Open questions
-
-- Should the historical note in `docs/design/ln-skills-review-after-alignment.md` that mentions `memory/REFACTOR.md` be updated, or is it acceptable as archival design commentary?
-- Is the next best move `ln-review` first, or should the team jump directly to scoping slice `9`?
-- Does the team want to keep `spike/filesystem-tools.ts` indefinitely as evidence for `A29`, or should that evidence eventually migrate into a more durable docs location?
-
-## Resume prompt
-
-Paste this into a new session:
-
-> Read `HANDOFF.md` in the workspace root for this work area. It contains the full state of in-progress work.
-> The immediate next step is: run `ln-review` on the phase-close / design-mode seam.
-> Start by reviewing the sync report and repo hygiene verdict in the In-flight section, then inspect the dirty files before deciding whether to commit the cleanup or continue into review.
diff --git a/memory/CARDS.md b/memory/CARDS.md
new file mode 100644
index 00000000..4cb1f785
--- /dev/null
+++ b/memory/CARDS.md
@@ -0,0 +1,161 @@
+
+
+# Cards
+
+## Scope Card — `10.1` Criteria grounding + first synthesis/review loop
+
+### Target Behavior
+After requirements closes, the first prepared criteria turn is grounded in the approved requirement set and can round-trip one initial criterion through the existing seams without dropping out of `criteria` mode.
+
+### Boundary Crossings
+```text
+→ requirements phase is confirmed closed from slice 9.5, with criteria now the active phase
+→ next prepared turn / /chat invocation selects `phase: 'criteria'`
+→ interviewer context injects the approved requirement inventory (and any existing criterion inventory)
+→ criteria-mode interviewer emits a criteria-shaped question rather than a generic follow-up
+→ user replies through the existing turn-response seam
+→ observer persists one criterion through the generic knowledge seam
+→ workflow remains `criteria` / `in_progress` and entities projection exposes the new criterion
+```
+
+### Risks and Assumptions
+- **RISK:** Criteria mode may technically activate but still receive weak or stale grounding, producing a generic follow-up instead of a verification-oriented question.
+ - **MITIGATION:** Lock both artifacts independently: approved-requirement inventory reaching the interviewer seam and the emitted turn remaining criteria-shaped.
+
+- **RISK:** The first criterion round-trip could accidentally widen into explicit criterion review-state or closeability work.
+ - **MITIGATION:** Keep this slice synthesis-thin: prove one first criterion can emerge and persist; defer approve/reject/closeability to `10.2`.
+
+- **RISK:** Criteria grounding may include rejected or otherwise out-of-scope requirements rather than the reviewed active set.
+ - **MITIGATION:** Assert approved-requirement grounding specifically, not just generic requirement inventory presence.
+
+- **ASSUMPTION:** The criteria interviewer can be usefully grounded by the approved requirement inventory plus any earlier criterion-like signals without first introducing a dedicated criteria workspace.
+ - **VALIDATE:** criteria context/prompt tests and one criterion persistence round-trip stay coherent.
+ - → existing `SPEC.md` assumptions/decisions already cover this (`A28`, `A40`, `D25`, `D55`, `D56`, `D71`)
+
+### Acceptance Criteria
+```text
+✓ src/server/context.test.ts or src/server/interview.test.ts — criteria-mode interviewer context includes the approved requirement inventory (not just prior chat history)
+✓ src/server/app.test.ts — the first post-9.5 /chat turn runs the interviewer and observer with `phase: 'criteria'` and emits a criteria-shaped question rather than a generic follow-up
+✓ src/server/app.test.ts or src/server/db.test.ts — one initial criterion can round-trip through criteria-mode reply → observer persistence → entities projection while criteria remains `in_progress`
+```
+
+### Verification Approach
+- **Inner:** fast unit tests — proves criteria context/prompt grounding and criteria-mode phase selection.
+- **Middle:** round-trip oracle (criteria synthesis) — approved requirements → criteria turn → observer-created criterion → entities projection with no drift.
+- **Outer:** manual walkthrough — finish requirements, enter criteria, verify the first criteria question feels tied to the reviewed requirement set.
+
+### Traceability
+- **PLAN:** Slice `10.1`
+- **SPEC requirements:** `#6`, `#8`, `#12`
+- **SPEC assumptions:** `A28`, `A40`
+- **SPEC decisions:** `D25`, `D55`, `D56`, `D71`
+- **Invariants to respect:** `I18`, `I19`, `I21`, `I24`, `I95`, `I96`
+- **Deferred to later cards:** explicit per-criterion review state, criteria closeability, final criteria closure
+
+---
+
+## Scope Card — `10.2` Explicit criterion review state + minimal closeability
+
+### Target Behavior
+Criteria review can persist explicit per-criterion positive/non-positive review state, project `approved` / `rejected` / `pending` status from the active path, and mark criteria closeable only when no current criterion remains pending.
+
+### Boundary Crossings
+```text
+→ criteria mode has a current criterion set on the active path
+→ interviewer emits a targeted criteria-review turn carrying explicit criterion review metadata
+→ user responds through the existing turn-response seam
+→ response handling records durable active-path criterion review linkage
+→ entities / sidebar read model projects `approved` / `rejected` / `pending` criterion state from the latest explicit review action
+→ workflow projection computes criteria `closeability: true` only when every current criterion has non-pending review state
+```
+
+### Risks and Assumptions
+- **RISK:** Repeating the slice-9 pattern too literally could split criterion approval, rejection, and closeability into separate tracer bullets that do not unblock the app.
+ - **MITIGATION:** Bundle the minimum useful criterion lifecycle into one slice: explicit positive action, explicit non-positive action, latest-action projection, and deterministic closeability.
+
+- **RISK:** Criterion review semantics may diverge from requirement review enough that reusing the same metadata/linkage seam becomes awkward.
+ - **MITIGATION:** Keep the first seam narrow and structural; defer edit/merge/split/stale semantics to `13a`.
+
+- **RISK:** Criteria closeability could accidentally widen into richer coverage logic such as "every approved requirement must have N criteria" before the thin end-to-end path is working.
+ - **MITIGATION:** Start with the same deterministic minimum-bar approach as earlier phases: current criterion set has no pending explicit review state.
+
+- **ASSUMPTION:** The first explicit criterion review seam can reuse the same targeted-review + active-path link pattern already proven for requirements.
+ - **VALIDATE:** targeted criterion review round-trip and latest-action projection pass without introducing a separate mutation path.
+ - → likely SPEC follow-on from existing seams (`A15`, `A28`, `D24`, `D61`, `D65`, `D66`, `D70`)
+
+### Acceptance Criteria
+```text
+✓ src/shared/chat.ts or src/server/interview.test.ts — criteria review metadata validates targeted criterion review actions through the existing structured question seam
+✓ src/server/app.test.ts — one explicit positive criterion review action and one explicit non-positive criterion review action persist through the response seam and project correctly in entities/sidebar state
+✓ src/server/db.test.ts — criterion review projection resolves `approved` / `rejected` / `pending` from the latest explicit active-path action
+✓ src/server/db.test.ts — criteria becomes closeable only when every current criterion has explicit non-pending review state
+```
+
+### Verification Approach
+- **Inner:** fast unit tests — proves criterion review metadata, read-model projection, and closeability rule.
+- **Middle:** round-trip oracle (criteria review) — targeted review metadata → response submit → durable review linkage → projected criterion state with no drift.
+- **Middle:** model-based lifecycle oracle (criteria review) — criteria remains `in_progress` until all current criteria are explicitly reviewed.
+- **Outer:** manual walkthrough — confirm the thin approve/reject semantics are legible enough to keep moving toward completed workflow.
+
+### Traceability
+- **PLAN:** Slice `10.2`
+- **SPEC requirements:** `#7`, `#8`, `#12`, `#13`
+- **SPEC assumptions:** `A15`, `A28`
+- **SPEC decisions:** `D24`, `D61`, `D65`, `D66`, `D70`
+- **Invariants to respect:** `I18`, `I21`, `I24`, `I62`, `I63`, `I96`
+- **Deferred to later cards:** criterion edit/add/merge/split flows, richer requirement↔criterion coverage logic, stale review invalidation
+
+---
+
+## Scope Card — `10.3` Criteria closure + completed workflow state
+
+### Target Behavior
+Once criteria is closeable, the shared phase-close seam can propose and confirm final criteria closure, leaving the workflow with all phases closed and no stale active interviewer phase before export.
+
+### Boundary Crossings
+```text
+→ criteria phase is active and closeable after 10.2
+→ interviewer emits `propose_phase_closure` for `criteria`
+→ `/chat` streams `data-phase-summary` and persists a proposed criteria `phase_outcome`
+→ user confirms the proposed criteria closure through `data-confirmation`
+→ `phase_outcome` persistence confirms the final criteria outcome with durable closure basis
+→ workflow projection marks criteria `closed` and projects no remaining active phase
+→ next app/project state reads as interview-complete rather than reopening stale criteria mode
+```
+
+### Risks and Assumptions
+- **RISK:** The current server-side "first unclosed phase or `criteria` fallback" behavior may fabricate an active criteria phase even after final closure.
+ - **MITIGATION:** Lock completed-workflow projection explicitly and treat "all phases closed" as its own tested structural state.
+
+- **RISK:** Final-phase closure could widen into export gating, dashboard polish, or force-close variants.
+ - **MITIGATION:** Keep this slice about closure mechanics only: proposed final close, confirmation, and completed workflow projection. Leave export/dashboard concerns to slices `11a` and `13`.
+
+- **RISK:** Criteria closure may be implemented correctly in persistence but fail to suppress stale interviewer activity on the next app interaction.
+ - **MITIGATION:** Add an app/core assertion that post-confirmation state has no remaining active interview phase before export.
+
+- **ASSUMPTION:** The shared phase-close transport used for scope, design, and requirements can close the terminal `criteria` phase without a criteria-specific mutation path.
+ - **VALIDATE:** proposal/confirmation round-trip persists and projects a completed workflow state through the same seam.
+ - → existing `SPEC.md` assumptions/decisions already cover this (`A15`, `A28`, `D65`, `D66`, `D71`)
+
+### Acceptance Criteria
+```text
+✓ src/server/app.test.ts — criteria can emit a shared `data-phase-summary` proposal and persist a proposed final `phase_outcome`
+✓ src/server/db.test.ts or src/server/app.test.ts — confirming the proposed criteria outcome closes criteria with durable closure provenance and leaves all workflow phases closed
+✓ src/server/core.test.ts or src/server/app.test.ts — post-confirmation app state projects no stale active interviewer phase before export rather than reopening `criteria`
+```
+
+### Verification Approach
+- **Inner:** fast unit/integration tests — proves final-phase proposal/confirmation persistence and completed-workflow projection.
+- **Middle:** round-trip oracle (phase outcome) — criteria proposal → confirmation → confirmed final `phase_outcome` → completed workflow state with no drift.
+- **Outer:** manual walkthrough — complete the interview, confirm final criteria closure, and verify the app feels done before export-specific polish lands.
+
+### Traceability
+- **PLAN:** Slice `10.3`
+- **SPEC requirements:** `#7`, `#8`, `#13`
+- **SPEC assumptions:** `A15`, `A28`
+- **SPEC decisions:** `D65`, `D66`, `D71`
+- **Invariants to respect:** `I18`, `I24`, `I96`
+- **Deferred to later cards:** export predicate details, dashboard summarization, low-readiness/forced-close variants if they prove necessary later
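
Review note: the projection rules in scope cards `10.2` and `10.3` can be sketched as three small pure functions: latest-action review state, criteria closeability, and active-phase projection with no `criteria` fallback. This is only a sketch under stated assumptions, not the repository's implementation. Every name here (`ReviewAction`, `turnIndex`, `onActivePath`, `projectReviewStates`, `isCriteriaCloseable`, `projectActivePhase`) is hypothetical, and treating an empty criterion set as not closeable is an assumption the cards do not state.

```typescript
// Hypothetical types for illustration; none of these names come from the repo.
type Phase = "scope" | "design" | "requirements" | "criteria";
type ReviewState = "approved" | "rejected" | "pending";

interface ReviewAction {
  criterionId: string;
  verdict: "approve" | "reject";
  turnIndex: number; // position of the review turn; higher means later
  onActivePath: boolean; // false once the turn has left the active path
}

const PHASE_ORDER: readonly Phase[] = ["scope", "design", "requirements", "criteria"];

// The latest explicit active-path action wins; no such action means `pending`.
function projectReviewStates(
  criterionIds: readonly string[],
  actions: readonly ReviewAction[],
): Record<string, ReviewState> {
  const latest: Record<string, ReviewAction> = Object.create(null);
  for (const action of actions) {
    if (!action.onActivePath) continue;
    const seen = latest[action.criterionId];
    if (seen === undefined || action.turnIndex > seen.turnIndex) {
      latest[action.criterionId] = action;
    }
  }
  const states: Record<string, ReviewState> = Object.create(null);
  for (const id of criterionIds) {
    const action = latest[id];
    if (action === undefined) states[id] = "pending";
    else states[id] = action.verdict === "approve" ? "approved" : "rejected";
  }
  return states;
}

// Closeable only when every current criterion has a non-pending state.
function isCriteriaCloseable(states: Record<string, ReviewState>): boolean {
  const ids = Object.keys(states);
  // Assumption: an empty criterion set is not closeable (the cards do not say).
  if (ids.length === 0) return false;
  return ids.every((id) => states[id] !== "pending");
}

// First unclosed phase, or null when every phase is closed. Returning null
// instead of falling back to `criteria` models the completed-workflow state.
function projectActivePhase(closed: readonly Phase[]): Phase | null {
  for (const phase of PHASE_ORDER) {
    if (closed.indexOf(phase) === -1) return phase;
  }
  return null;
}

// Usage: c1 was rejected, then approved later; c2's only action is off-path.
const actions: ReviewAction[] = [
  { criterionId: "c1", verdict: "reject", turnIndex: 3, onActivePath: true },
  { criterionId: "c1", verdict: "approve", turnIndex: 5, onActivePath: true },
  { criterionId: "c2", verdict: "approve", turnIndex: 4, onActivePath: false },
];
const states = projectReviewStates(["c1", "c2"], actions);
const closeable = isCriteriaCloseable(states); // false while c2 is pending
const active = projectActivePhase(["scope", "design", "requirements", "criteria"]);
```

Keeping the three rules as pure functions over already-loaded rows would let the `db.test.ts` and `core.test.ts` acceptance checks exercise them without a live interviewer, though where the real seam lives is for the slice to decide.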
diff --git a/memory/PLAN.md b/memory/PLAN.md
index 8c4932d4..afcc37e2 100644
--- a/memory/PLAN.md
+++ b/memory/PLAN.md
@@ -41,80 +41,54 @@
4c. **UI foundation: shadcn/ui + Tailwind 4 + AI Elements** `FE-558` `done`
5. **Observer agent + entity persistence** `FE-537` `done` — I20, I21, I22
6. **Entity sidebar (read-only)** `FE-538` `done` — I23
-6b. **AI SDK-native chat pivot** `FE-559` `done` — I21↑, I22↑, I23↑; core tools spike proven
-6b1. **Workspace seam characterization oracle** `done` — I24, I25
+6b. **AI SDK-native chat pivot** `FE-559` `done` — I21↑, I22↑, I23↑
+6b1. **Workspace seam characterization oracle** `done` — I24
- Purpose: add a client integration harness around the interview workspace before the state-ownership refactor
- Coverage: initial hydration from persisted turns, same-project refresh stability, observer-result sidebar reactivity, option-selection follow-through
- Unblocks: 6c live streaming fix, workspace state-ownership refactor commits
## Phase 4: Interaction + Knowledge Foundations `done`
-
-
### Slices
-6c. **Live streaming fix** — Fix the turn-card rendering regression: during live SSE streaming, the structured turn card does not render until page refresh. Thinking streams live; server persists correctly; hydration from DB works. Root cause is in the interaction between `toUIMessageStream()`, `useChat` part accumulation, and the tool-part lifecycle. `done`
+6c. **Live streaming fix** `done`
- Requirements: → SPEC.md §Requirements #2, #3, #4
- Assumptions: → SPEC.md §Assumptions A16, A28
- - Candidate invariant goals: live tool-part rendering matches persisted state after refresh
- Invariants to respect: → SPEC.md §Invariants I16, I17, I18, I22
- - Invariants established: → SPEC.md §Invariants I43
- - Acceptance: send a message in dev, see the structured turn card appear live without refresh; `npm run verify` passes
- - **Observed current state (2026-04-07, post-build):** the workspace controller now projects the latest streamed `tool-ask_question` input into the visible `TurnCard` before `onFinish` route invalidation, targeted regression tests (`InterviewWorkspace`, `workspace-controller`, `workspace-data`, `app`) are green, the branch's latest full `npm run verify` passed before the docs-only SPEC commit, and manual browser verification confirmed the live turn card now appears without refresh.
- - **Observed code seam:** `InterviewWorkspace.renderParts()` still drops `tool-ask_question` transcript parts, but `workspace-controller-core.ts` now projects the latest streamed tool input into a temporary visible turn card while loading; durable loader state still owns the post-finish turn card after router invalidation.
- - **Verification approach**: inner — unit/integration tests for tool-part state transitions or alternate live render path. Outer — manual interview: turn card renders live, matches post-refresh state.
+ - Invariants established: → SPEC.md §Invariants I24
+ - Acceptance: streamed turn card appears live without refresh; `npm run verify` passes
+ - Result: workspace controller projects streamed `tool-ask_question` into visible turn card before durable route refresh
+ - Evidence: InterviewWorkspace.test.tsx, workspace-controller.test.tsx, workspace-data.test.ts, app.test.ts
-6d. **Flexible turn-response model** — Replace the single-select answer assumption with typed turn responses that support zero/one/many selections plus unified free-text response content. Keep structured interviewer guidance, recommendation, and strategic grounding, but stop assuming every turn maps to one categorical choice or one scalar answer string. `done`
+6d. **Flexible turn-response model** `done`
- Requirements: → SPEC.md §Requirements #3, #6
- Assumptions: → SPEC.md §Assumptions A16, A28, A33
- - Decisions: → SPEC.md §Decisions D23, D24, D25, D45, D46, D47, D48
- - Candidate invariant goals: turn-response payload round-trip fidelity; multi-select/custom-answer state hydrates and replays correctly
+ - Decisions: → SPEC.md §Decisions D23, D24, D25
- Invariants to respect: → SPEC.md §Invariants I17, I18, I19, I22
- - Invariants established: → SPEC.md §Invariants I44, I45, I46, I47
- - Acceptance: a turn can be answered with one-or-more selections plus optional free-text response or with zero selections plus required free-text response; transcript, persistence, interviewer context, and resume stay aligned
- - **Observed current state (2026-04-07, tracer bullets 1–3):** zero/one/many selected options plus optional free-text now persist as `data-turn-response`, store a user-visible summary seam, rehydrate through the workspace path, and project into interviewer context coherently. The client turn card now stages many selections locally and submits them through the same turn-response seam as the other remodeled paths.
- - **Verification approach**: inner — response-schema + projection characterization tests (`SPEC.md` §Verification Design, inner loop) prove cardinality and response-shaped context projection; middle — round-trip integration from submit → persistence → hydration → interviewer-context composition (`SPEC.md` §Verification Design, middle loop) validates A33 while protecting I17, I18, I19, and I22; outer — manual interview with zero/one/many option responses plus free-text-only replies confirms coherent follow-through (`SPEC.md` §Verification Design, outer loop).
- - Tracer bullets:
- - `6d.1` Single selected option + optional free-text response. `done`
- - `6d.2` Zero selections + required free-text-only response. `done`
- - `6d.3` True many-selection UX + persistence/hydration. `done`
-
-6e. **Generic knowledge layer schema + sidebar projection** — Introduce the broader semantic layer (`framing`, `constraint`, `decision`, `assumption`, `requirement`, `criterion`) with generic provenance and graph edges, then project it cleanly into the sidebar without regressing existing reads. `done`
+ - Invariants established: → SPEC.md §Invariants I44
+ - Acceptance: zero/one/many selections plus free-text round-trip through persistence, hydration, and interviewer context
+ - Result: `data-turn-response` parts carry structured replies; workspace stages multi-select locally and submits one response
+ - Tracer bullets: 6d.1 single + free-text `done`, 6d.2 free-text-only `done`, 6d.3 many-selection `done`
+
+6e. **Generic knowledge layer schema + sidebar projection** `done`
- Requirements: → SPEC.md §Requirements #5, #6, #14
- Assumptions: → SPEC.md §Assumptions A14
- Decisions: → SPEC.md §Decisions D5, D13, D25, D49, D50, D51
- - Candidate invariant goals: generic knowledge-item persistence with turn linkage; graph-edge fidelity across item kinds
- - Invariants to respect: → SPEC.md §Invariants I20, I21, I23, I34
- - Invariants established: → SPEC.md §Invariants I48, I49, I50, I51, I52, I53
- - Acceptance: project state can load and display generic knowledge items and edges from the active path without losing current resume behavior
- - **Observed current state (2026-04-07, tracer bullets 1–2b):** generic `knowledge_item` + `turn_knowledge_item` persistence now carries `framing`, `constraint`, `requirement`, and `criterion` items with subtype/rationale metadata, `/api/projects/:id/entities` returns those kind-specific collections plus a typed `relationships[]` projection alongside legacy decisions/assumptions, and the workspace sidebar renders Framing, Constraints, Requirements, Criteria, Decisions, and Assumptions tabs without regressing existing dependency affordances.
- - **Verification approach**: inner — DB/core tests for generic item persistence and relationship projection. Middle — workspace integration tests for sidebar hydration and dependency rendering.
- - Tracer bullets:
- - `6e.1` Framing items through the generic knowledge seam. `done`
- - `6e.2a` Legacy dependency edges through the generic entity seam. `done`
- - `6e.2b` Remaining kind widening through the sidebar seam. `done`
-
-6f. **Phase-aware observer extraction** — Teach the observer to bias extraction by mode: scope prefers framing/constraints, design prefers decisions/assumptions, later modes can surface requirements/criteria and revisions. Keep the observer as a single structured extraction pass, but give it richer context and a broader ontology. `done`
+ - Invariants to respect: → SPEC.md §Invariants I20, I21, I23
+ - Invariants established: → SPEC.md §Invariants I48
+ - Acceptance: generic knowledge items and edges load and display from the active path without losing resume behavior
+ - Result: `knowledge_item` + `turn_knowledge_item` + `knowledge_edge` persistence; entities API projects kind-specific collections plus typed relationships
+ - Tracer bullets: 6e.1 framing items `done`, 6e.2a legacy edges `done`, 6e.2b remaining kinds `done`
+
+6f. **Phase-aware observer extraction** `done`
- Requirements: → SPEC.md §Requirements #5, #6, #11, #12
- Assumptions: → SPEC.md §Assumptions A14, A20
- - Decisions: → SPEC.md §Decisions D4, D5, D13, D25, D52, D53, D54, D55, D56
- - Candidate invariant goals: observer extracts framing without assumption inflation; phase-aware extraction deltas stay attributable to source turns
+ - Decisions: → SPEC.md §Decisions D4, D5, D13, D25
- Invariants to respect: → SPEC.md §Invariants I20, I21, I23
- - **Observed current state (2026-04-08, tracer bullets 6f.1–6f.4b):** scope-mode observer output now widens to generic `framing` and `constraint`, design-mode observer prompting biases toward legacy `decision`/`assumption` extraction while allowing framing corrections and constraint spillover, requirements-mode observer prompting can now surface generic `requirement` items while deferring premature criteria, and criteria-mode observer prompting can now surface generic `criterion` items without collapsing them back into requirements. The observer context includes existing framing/constraints/requirements/criteria alongside legacy decisions/assumptions, persisted assistant parts and SSE `data-observer-result` payloads carry mixed framing/constraint/requirement/criterion/decision/assumption IDs, and sidebar invalidation can refetch and render those observer-created entities through the shared entity seam, including the `Criteria` tab.
- - Acceptance: scope turns primarily yield framing/constraints; design turns primarily yield decisions/assumptions; later review turns can surface requirements/criteria without breaking observer sync; observer results still stream in-band to the sidebar
- - **Verification approach**: inner — schema + DB/parts tests prove widened observer contracts and generic persistence; middle — mocked observer-sync round-trip proves observer result → entities API → sidebar refresh coherence without gating on live-model quality; outer — manual scope/design/requirements/criteria walkthroughs judge ontology fit and seed future observer fixtures. See SPEC.md §Verification Design.
- - Tracer bullets:
- - `6f.1` Scope-mode framing extraction through the generic observer seam. `done`
- - Invariants established: → SPEC.md §Invariants I54, I55
- - `6f.2` Scope-mode constraint extraction through the generic observer seam. `done`
- - Invariants established: → SPEC.md §Invariants I56, I57
- - `6f.3` Design-mode observer bias over decisions/assumptions with generic spillover. `done`
- - Invariants established: → SPEC.md §Invariants I58, I59
- - `6f.4a` Requirements-mode requirement emergence through the generic observer seam. `done`
- - Invariants established: → SPEC.md §Invariants I60, I61
- - `6f.4b` Criteria-mode criterion emergence through the generic observer seam. `done`
- - Invariants established: → SPEC.md §Invariants I62, I63
+ - Invariants established: → SPEC.md §Invariants I54
+ - Acceptance: observer biases extraction by phase; results stream in-band to sidebar without breaking sync
+ - Result: scope yields goals/terms/contexts/constraints, design yields decisions/assumptions with scope spillover, requirements yields requirements, criteria yields criteria
+ - Tracer bullets: 6f.1 scope framing `done`, 6f.2 scope constraints `done`, 6f.3 design bias `done`, 6f.4a requirements `done`, 6f.4b criteria `done`
## Phase 5: Mode Closure + Full Interview
@@ -125,74 +99,77 @@
### Slices
-7. **Explicit phase outcomes + scope closure** — Replace pure `is_resolution` semantics with explicit phase outcomes and user-confirmed scope closure. Scope mode closes when sufficient shared understanding of goals, terms, context, and constraints is reached, not just when the model feels done. `done`
+7. **Explicit phase outcomes + scope closure** `done`
- Requirements: → SPEC.md §Requirements #7, #8
- Assumptions: → SPEC.md §Assumptions A15, A28
- Decisions: → SPEC.md §Decisions D2, D3, D6, D62, D65, D66
- - Candidate invariant goals: confirmed scope outcome survives refresh and invalidates correctly when upstream turns change
- - Invariants to respect: → SPEC.md §Invariants I18, I24, I25
- - Invariants established: → SPEC.md §Invariants I72, I73
- - Acceptance: scope mode proposes closure with a summary over the current scope knowledge family, user confirms, explicit phase outcome persists, and the project shows updated workflow state
- - **Observed current state (2026-04-08, slice 7):** scope-mode interviewer turns can now persist explicit `phase_outcome` proposal records in a dedicated readiness table, stream/persist typed `data-phase-summary` artifacts, confirm those proposals through typed `data-confirmation` chat parts, project workflow state from the active path, and supersede outcomes when their proposal turn leaves the active path. The workspace header now shows scope status, suppresses the normal prompt while a closure proposal is pending, and renders a dedicated confirmation card wired to the chat seam. This establishes the durable proposal/confirmation substrate only; shared closeability/readiness/closure-basis semantics are folded into slices 8, 11a, and 13 rather than reopening slice 7.
- - **Verification approach**: inner — schema + DB/core/parts tests for explicit phase-outcome proposal/confirmation contracts and lifecycle. Middle — round-trip + model-based lifecycle oracles prove submit → persistence → reload → workspace projection and supersession on upstream branch changes. Outer — manual closure/confirmation walkthrough deferred until after 7a.
-
-7a. **Knowledge-layer redesign spike (ontology + graph + workspace direction)** — Retire the current `framing` umbrella and mixed legacy/generic storage risk by specifying the canonical knowledge ontology, cross-kind graph model, and non-sidebar-first review surface before design/review modes harden today's transitional semantics. `done`
- - Requirements: → SPEC.md §Requirements #5, #6, #11, #12, #13
- - Assumptions: → SPEC.md §Assumptions A14, A15
+ - Invariants to respect: → SPEC.md §Invariants I18, I24
+ - Invariants established: → SPEC.md §Invariants I72
+ - Acceptance: scope proposes closure, user confirms, explicit phase outcome persists, workflow state updates
+ - Result: durable `phase_outcome` proposal/confirmation records; `data-phase-summary` + `data-confirmation` chat seams; workspace header shows scope status and confirmation card
+ - Debt: shared closeability/readiness/closure-basis generalization folded into slice 8
+
+7a. **Knowledge-layer redesign spike** `done`
- Decisions: → SPEC.md §Decisions D5, D17, D59, D61, D62, D63, D64, D67, D68, D69
- - Candidate invariant goals: later Phase 5/6 slices can treat the knowledge layer as one coherent semantic model rather than a provisional migration seam
- - Invariants to respect: → SPEC.md §Invariants I20, I21, I23, I68
- - Acceptance: produce an approved target model for canonical kinds, cross-kind edges, storage direction, and knowledge-workspace boundaries, plus a migration path that keeps slice 7 valid while gating slices 8–10 and 12 on the redesign
- - **Observed current state (2026-04-09, spike verdict):** the canonical durable ontology should be `goal`, `term`, `context`, `constraint`, `assumption`, `decision`, `requirement`, and `criterion`; `framing` is demoted to a migration-only intake alias rather than a final stored kind. The long-term storage direction is one generic knowledge-item/readiness model plus one generic cross-kind edge model, with a compatibility projection that preserves slice 7's `phase_outcome` closure mechanics by summarizing a scope bundle over canonical kinds and any unmigrated legacy `framing` rows. The primary review UX should be a dedicated knowledge workspace with phase-oriented list/detail review; the sidebar remains summary/navigation, not the main review surface.
- - **Recommendation:** land a canonical knowledge foundation slice before design/review mode work so the migration seam is explicit rather than hidden inside slice 8.
- - **Verification approach**: inner — concrete model examples and seam inventory reviewed against SPEC lexicon/decisions. Outer — design review over representative knowledge items and graph relationships to prove the ontology is discriminable and useful.
-
-7b. **Canonical knowledge model foundation + cutover seam** — Introduce canonical `goal` / `term` / `context` kinds, unify durable knowledge storage and cross-kind edges behind the generic seam, and preserve slice 7 coherence through the smallest necessary compatibility projection rather than migration-hardening legacy scratch data. `done`
- - Requirements: → SPEC.md §Requirements #5, #6, #7, #11, #12, #13
+ - Invariants to respect: → SPEC.md §Invariants I20, I21, I23, I48
+ - Acceptance: approved target model for canonical kinds, cross-kind edges, storage direction, and knowledge-workspace boundaries
+   - Result: canonical ontology is eight kinds (`goal`, `term`, `context`, `constraint`, `assumption`, `decision`, `requirement`, `criterion`); `framing` demoted to migration alias; primary review UX is dedicated knowledge workspace, not sidebar
+
+7b. **Canonical knowledge model foundation + cutover seam** `done`
- Assumptions: → SPEC.md §Assumptions A14, A40
- - Decisions: → SPEC.md §Decisions D5, D17, D49, D51, D52, D53, D54, D59, D61, D62, D63, D67, D68, D69
- - Candidate invariant goals: canonical knowledge writes/readiness coexist with scope closure during cutover; no new Phase 5/6 slice depends on durable `framing`
- - Invariants to respect: → SPEC.md §Invariants I20, I21, I23, I68, I72
- - Invariants established: → SPEC.md §Invariants I74, I75, I76, I77, I78
- - Acceptance: schema/registry/context/API can represent all eight canonical kinds plus generic cross-kind edges; scope closure still reads a coherent scope bundle; no new writes rely on durable `framing` or decision/assumption-only edge semantics; destructive schema reset remains acceptable
- - **Observed current state (2026-04-09, tracer bullets 7b.1 + 7b.2):** the shared knowledge registry, observer-result payload schema, scope-mode observer output, entities API, workspace entity state, and sidebar tabs now use canonical `goal` / `term` / `context` / `constraint` collections on a clean DB rather than durable `framing` rows. Design-mode observer prompting still biases toward `decision` / `assumption` extraction, but those commitments now persist through `knowledge_item` / `turn_knowledge_item` / `knowledge_edge` instead of legacy decision/assumption tables and edge joins. The shared entities API preserves dedicated `decisions` / `assumptions` collections as compatibility projections, and an explicit canonical scope-bundle projection remains available so the slice-7 `phase_outcome` readiness seam stays intact during the cutover.
- - **Verification approach**: inner — schema/registry/core/API tests for canonical kinds, generic edges, and the minimal scope-closure compatibility projection. Middle — workspace/entity projection tests for canonical scope kinds on a clean DB. Outer — manual inspection of a representative project's scope bundle and carry-forward into the next mode.
- - Tracer bullets:
- - `7b.1` Canonical scope kinds through the generic seam. `done`
- - `7b.2` Generic edge/storage cutover + scope-readiness compatibility projection beyond legacy decision/assumption tables. `done`
-
-8. **Design mode (commitment / exploration)** — Implement the second workflow mode on the new turn and canonical knowledge model after 7b lands, while generalizing the current scope-only proposal/confirmation seam into a shared phase-closing model with deterministic closeability, coarse readiness bands, and explicit closure basis. The interviewer walks design forks; the observer captures decisions, assumptions, new constraints, and emerging requirements against the unified knowledge seam. `in-progress`
+ - Decisions: → SPEC.md §Decisions D5, D13, D17, D49, D51, D59, D61, D62, D63, D67, D68, D69
+ - Invariants to respect: → SPEC.md §Invariants I20, I21, I23, I48, I72
+ - Invariants established: → SPEC.md §Invariants I48, I54
+ - Acceptance: all eight canonical kinds plus generic edges work; scope closure reads coherent scope bundle; no new writes rely on `framing`
+ - Result: registry/observer/entities/sidebar use canonical kinds on clean DB; decisions/assumptions persist through generic seam; compatibility projections preserve slice-7 readiness
+ - Tracer bullets: 7b.1 canonical scope kinds `done`, 7b.2 generic edge/storage cutover `done`
+
+8. **Design mode (commitment / exploration)** `done`
- Requirements: → SPEC.md §Requirements #2, #3, #5, #6, #7, #8
- Assumptions: → SPEC.md §Assumptions A14, A15, A28, A40
- Decisions: → SPEC.md §Decisions D2, D5, D6, D61, D62, D65, D66, D67, D68, D70, D71, D72, D73, D74, D75
- - Candidate invariant goals: mode transition preserves interview continuity; design-mode turns produce coherent decision/assumption graph growth on the canonical knowledge seam; phase-closing state separates status, closeability, readiness, and closure basis instead of hidden interviewer authority
- - Invariants to respect: → SPEC.md §Invariants I18, I19, I21, I22, I72, I73
- - Invariants established: → SPEC.md §Invariants I79, I80, I81, I82, I83, I84, I85, I86
- - Acceptance: after scope closes and slice 7b lands, the interview enters design mode; design turns yield coherent commitments and assumptions on the canonical knowledge layer; the UI projects design status/closeability/readiness; and once the minimum bar is met the user can either accept an interviewer-recommended close or force-close design with persisted closure basis/readiness snapshot
- - **Observed current state (2026-04-09, tracer bullets 8.1–8.3 + completed phase-close refactor):** confirmed scope closure now projects through a shared workflow state carrying `status`, `closeability`, `readiness`, `closureBasis`, and pending-proposal visibility instead of the old scope-only `open/proposed/confirmed` seam. The next prepared turn after confirmed scope closure now enters `design` automatically, the observer runs against that design turn phase, and the workspace header renders shared workflow summaries for closed scope plus the newly active design phase rather than hardcoding scope-only status copy. Design now also uses the same typed `data-phase-summary` closure seam as scope: the design interviewer can recommend closure, the workflow projects a pending design summary through the shared phase state, confirmation persists design closure, and the next prepared turn enters `requirements`. That same typed confirmation seam now also carries a user-forced design close with visible `closureBasis: user_forced`, so forced-close debt survives refresh/resume and still hands the next prepared turn into `requirements`. The completed phase-close refactor also hardened the seam end to end: close intent moved into explicit shared phase-close commands, force-close availability now projects from one shared workflow-policy seam consumed by both UI and server validation, confirmed `phase_outcome` rows persist durable `closure_basis`, read-side workflow projection trusts that durable phase-outcome field instead of reconstructing provenance from confirmation-turn payloads, `data-confirmation` is now an explicit discriminated command union consumed consistently by the workspace controller and `/chat` request handling, and the remaining user-visible close-command labels, rejection copy, and forced-close summary text now also project from the shared phase-close module instead of being rebuilt inline across layers.
- - **Verification approach**: inner — mode-transition/controller/workflow-state projection tests. Outer — manual design walkthrough covering interviewer-recommended close, user-forced close, and visible carried-debt caveats.
- - Tracer bullets:
- - `8.1` Design-mode entry + shared workflow-state projection. `done`
- - `8.2` Design-phase closure proposal + requirements handoff. `done`
- - `8.3` User-forced design close + carried-debt projection. `done`
-
-9. **Requirements-review mode** — Synthesize the requirement set from the full canonical knowledge layer, then let the user approve, edit, merge, reject, and add requirements through review turns using the shared phase-closing seam rather than a requirements-only completion bit. This slice assumes the redesigned ontology/graph from 7a + 7b, not the current transitional `framing` seam. `not-started`
+ - Invariants to respect: → SPEC.md §Invariants I18, I19, I21, I22, I72
+ - Invariants established: → SPEC.md §Invariants I72
+ - Acceptance: design mode enters after scope close; design turns yield commitments on canonical knowledge seam; user can accept recommended close or force-close with persisted closure basis
+ - Result: shared workflow projection (status/closeability/readiness/closureBasis) replaces scope-only seam; explicit discriminated phase-close commands; force-close availability from shared policy; durable closure basis on `phase_outcome`
+ - Tracer bullets: 8.1 design entry + shared workflow `done`, 8.2 design closure + requirements handoff `done`, 8.3 user-forced close + carried debt `done`
+
+9. **Requirements-review mode** `done`
- Requirements: → SPEC.md §Requirements #6, #7, #8, #11, #13
- - Assumptions: → SPEC.md §Assumptions A15, A28, A40
- - Decisions: → SPEC.md §Decisions D2, D5, D6, D61, D62, D65, D66, D67, D68, D69, D70
- - Candidate invariant goals: requirements are capture-anytime but review-complete only through explicit review state; requirements workflow state stays legible as status + closeability + readiness + closure basis
+ - Assumptions: → SPEC.md §Assumptions A15, A28, A40, A44, A45, A46
+ - Decisions: → SPEC.md §Decisions D2, D5, D6, D61, D62, D65, D66, D67, D68, D69, D70, D71, D77, D78, D79
- Invariants to respect: → SPEC.md §Invariants I18, I19, I21, I24
- - Acceptance: requirements-review mode presents a synthesized requirement set from canonical knowledge items, records explicit approval/edit state, projects requirements status/closeability/readiness, and lets the user close once the minimum bar is met while carrying unresolved debt forward visibly when readiness is low
- - **Verification approach**: inner — review-state + workflow-state lifecycle tests. Outer — manual requirement review with approvals, edits, missing-item additions, and a low-readiness forced-close walkthrough.
-
-10. **Criteria-review mode** — Synthesize verification conditions from approved requirements plus any earlier criteria-like signals, then drive review turns until coverage is complete using the shared phase-closing seam rather than a criteria-only completion bit. This slice assumes the redesigned ontology/graph from 7a + 7b, not the current transitional `framing` seam. `not-started`
- - Requirements: → SPEC.md §Requirements #6, #7, #8, #12, #13
- - Assumptions: → SPEC.md §Assumptions A15, A28, A40
- - Decisions: → SPEC.md §Decisions D2, D5, D6, D17, D61, D62, D65, D66, D67, D68, D69, D70
- - Candidate invariant goals: criteria verify requirements explicitly and track review completeness separately from requirement state; criteria workflow state stays legible as status + closeability + readiness + closure basis
- - Invariants to respect: → SPEC.md §Invariants I18, I19, I21, I24
- - Acceptance: criteria-review mode presents synthesized criteria from the canonical knowledge layer, records explicit review state, projects criteria status/closeability/readiness, and lets the user close once the minimum bar is met while preserving caveats when verification coverage remains thin
- - **Verification approach**: inner — criterion/review edge + workflow-state tests. Outer — manual criteria review with edits, coverage checks, and a low-readiness forced-close walkthrough.
+ - Invariants established: → SPEC.md §Invariants I87
+ - Acceptance: requirement set synthesized from canonical knowledge; explicit approve/reject state; requirements closeability + closure proposal; criteria handoff on confirmation
+ - Result: interviewer grounded in requirement inventory; targeted approve/reject via review metadata + `turn_knowledge_item` links; closeability from full review coverage; shared phase-close seam reused for requirements → criteria handoff
+ - Tracer bullets: 9.1 inventory grounding `done`, 9.2 targeted approval `done`, 9.3 targeted rejection `done`, 9.4 closeability + proposal `done`, 9.5 closure + criteria handoff `done`
+
+10.1 **Criteria grounding + first synthesis/review loop** — Prove the first post-requirements criteria turn is grounded in the approved requirement set and can round-trip one first criterion through the existing seams without widening into the full criteria lifecycle. `not-started`
+ - Requirements: → SPEC.md §Requirements #6, #8, #12
+ - Assumptions: → SPEC.md §Assumptions A28, A40
+ - Decisions: → SPEC.md §Decisions D25, D55, D56, D71
+ - Candidate invariant goals: the first criteria turn is grounded in approved requirements; criteria-mode interviewer/observer behavior stays criteria-shaped and can persist one initial criterion through the existing seam
+ - Invariants to respect: → SPEC.md §Invariants I18, I19, I21, I24, I95, I96
+ - Acceptance: after requirements closes, the first criteria turn includes the approved requirement inventory, asks a criteria-shaped question rather than a generic follow-up, and one initial criterion can round-trip through observer/entity persistence without dropping out of criteria mode
+ - **Verification approach**: inner — criteria context/prompt seam tests plus criterion projection tests. Middle — round-trip oracle proving approved requirement inventory → criteria interviewer turn → criterion persistence/entities refresh. Outer — manual walkthrough judges whether the first criteria turn feels grounded in the reviewed requirement set.
+
+10.2 **Explicit criterion review state + minimal closeability** — Establish the first explicit per-criterion review seam and deterministic closeability rule in one slice rather than splitting approval, rejection, and closeability into separate tracer bullets. `not-started`
+ - Requirements: → SPEC.md §Requirements #7, #8, #12, #13
+ - Assumptions: → SPEC.md §Assumptions A15, A28
+ - Decisions: → SPEC.md §Decisions D24, D61, D65, D66, D70
+ - Candidate invariant goals: criteria project explicit `approved` / `rejected` / `pending` review state; criteria becomes closeable only when every current criterion has explicit non-pending review state
+ - Invariants to respect: → SPEC.md §Invariants I18, I21, I24, I62, I63, I96
+ - Acceptance: a targeted criteria-review turn can persist one explicit positive review action and one explicit non-positive review action, read-side projection resolves latest review state per criterion, and workflow marks criteria closeable only when no criterion remains `pending`
+ - **Verification approach**: inner — criterion review metadata/read-model/workflow-state tests. Middle — round-trip oracle proving explicit criterion review actions persist and project without drift, plus lifecycle oracle proving criteria stays `in_progress` until review coverage is complete. Outer — manual criteria review walkthrough judges whether the thin approve/reject semantics are legible enough to keep moving.
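
The closeability rule in slice 10.2 (latest review state per criterion; closeable only when nothing is `pending`) can be sketched as a pure projection. This is a minimal illustration, not the project's implementation: `ReviewAction`, `turnIndex`, `latestReviewState`, and `isCloseable` are hypothetical names, and treating an empty criterion set as not closeable is an assumption the slice text does not state.

```typescript
// Hypothetical types for illustration; names are not taken from the codebase.
type ReviewState = "approved" | "rejected" | "pending";

interface ReviewAction {
  criterionId: string;
  state: "approved" | "rejected";
  turnIndex: number; // assumed ordering key: a higher value means a later review action
}

// Resolve the latest explicit review state per current criterion.
// Criteria with no review action stay "pending".
function latestReviewState(
  criterionIds: string[],
  actions: ReviewAction[],
): Map<string, ReviewState> {
  const latest = new Map<string, ReviewState>();
  for (const id of criterionIds) {
    latest.set(id, "pending");
  }
  const ordered = [...actions].sort((a, b) => a.turnIndex - b.turnIndex);
  for (const action of ordered) {
    // Actions that target a criterion outside the current set are ignored.
    if (latest.has(action.criterionId)) {
      latest.set(action.criterionId, action.state);
    }
  }
  return latest;
}

// Closeable only when every current criterion has a non-pending state.
// Assumption: an empty criterion set is treated as not closeable.
function isCloseable(states: Map<string, ReviewState>): boolean {
  const values = Array.from(states.values());
  return values.length > 0 && values.every((state) => state !== "pending");
}
```

Keeping the rule a pure function of the current criteria and the review actions keeps closeability deterministic and separate from the readiness signal, in line with the candidate invariant goals listed for this slice.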
+
+10.3 **Criteria closure + completed workflow state** — Reuse the shared phase-close seam to close the final workflow phase and project a completed interview state once criteria review reaches the minimum bar. `not-started`
+ - Requirements: → SPEC.md §Requirements #7, #8, #13
+ - Assumptions: → SPEC.md §Assumptions A15, A28
+ - Decisions: → SPEC.md §Decisions D65, D66, D71
+ - Candidate invariant goals: the terminal phase can propose and confirm closure through the shared seam; workflow can project all phases closed with no stale active interviewer phase
+ - Invariants to respect: → SPEC.md §Invariants I18, I24, I96
+ - Acceptance: once criteria is closeable, the interviewer can propose criteria closure, user confirmation persists the final `phase_outcome`, and workflow projects all phases closed with no remaining active phase before export
+ - **Verification approach**: inner — phase-summary/confirmation/workflow-state tests. Middle — round-trip oracle proving criteria proposal → confirmation → confirmed final outcome → completed workflow projection. Outer — manual walkthrough judges whether final closure feels coherent before export/polish work.
## Phase 6: Readiness Surfaces + Export
@@ -208,7 +185,7 @@
- Assumptions: → SPEC.md §Assumptions A15, A28
- Decisions: → SPEC.md §Decisions D3, D17, D65, D66, D70
- Candidate invariant goals: project-list workflow state derives from durable phase outcomes, closeability/readiness projection, and review records, not ad hoc turn heuristics
- - Invariants to respect: → SPEC.md §Invariants I24, I25, I36, I41, I42
+ - Invariants to respect: → SPEC.md §Invariants I24
- Acceptance: the project list shows each project's per-phase status/readiness/closure-basis summary from persisted readiness artifacts plus live workflow projection, distinguishes forced-close or low-readiness debt from ordinary closed state, and updates correctly after refresh/resume
- **Verification approach**: inner — workflow-summary projection tests plus project-list route/component tests. Outer — manual multi-project walkthrough covering in-progress, forced-close debt, invalidated, and export-ready states.
@@ -217,7 +194,7 @@
- Assumptions: → SPEC.md §Assumptions A14, A40
- Decisions: → SPEC.md §Decisions D5, D17, D61, D63, D67, D68, D69
- Candidate invariant goals: review/edit actions are reflected in both knowledge state and readiness state; the knowledge workspace can present graph relationships and review actions without lossy sidebar compression
- - Invariants to respect: → SPEC.md §Invariants I23, I36, I41, I42
+ - Invariants to respect: → SPEC.md §Invariants I23, I24
- Acceptance: inspect and review/edit canonical knowledge items from a dedicated phase-oriented workspace surface; affected readiness updates visibly and persist across refresh/resume; dependency/provenance context remains legible during those actions
- **Verification approach**: inner — mutation + projection tests. Outer — manual knowledge-workspace review/edit walkthrough.
@@ -230,6 +207,15 @@
- Acceptance: complete all modes, satisfy review completeness, navigate to export, see markdown preview from the reviewed knowledge layer plus relevant phase-outcome caveats, download `.md` file
- **Verification approach**: inner — export projection tests. Outer — manual export after a full walkthrough, after a low-readiness/forced-close path surfaces caveats, and after a readiness-incomplete state blocks export.
+13a. **Review lifecycle refinement across requirements + criteria** — Revisit the first-cut review model only after the thin end-to-end path is working, and add the deferred variants that were intentionally excluded from slices 9 and 10 to keep the app moving toward completion. `not-started`
+ - Requirements: → SPEC.md §Requirements #11, #12, #13
+ - Assumptions: → SPEC.md §Assumptions A15, A40
+ - Decisions: → SPEC.md §Decisions D17, D61, D65, D66, D69
+ - Candidate invariant goals: richer review actions and invalidation semantics can evolve without regressing the thin end-to-end workflow; deferred edge-case variants are collected in one explicit refinement slice rather than fragmenting earlier mode slices
+ - Invariants to respect: → SPEC.md §Invariants I18, I21, I24
+ - Acceptance: deferred review refinements such as edit/add/merge/stale semantics across requirements and criteria can land behind one cross-cutting slice without regressing completion, export, or workflow-state coherence
+ - **Verification approach**: inner — mutation/read-model/invalidation tests per refinement added. Outer — manual cross-phase review lifecycle walkthrough after the dedicated knowledge workspace exists.
+
## Phase 7: Distribution
@@ -289,23 +275,28 @@ Phase 4: 6b ──→ 6b1 (workspace oracle) ──→ 6c (live streaming fix)
Phase 5: 6f ──┬──→ 7 (explicit phase outcomes + scope closure)
└──→ 7a (knowledge-layer redesign spike) ──→ 7b (canonical knowledge foundation)
7 ────┐
- 7b ───┴──→ 8 (design mode) ──→ 9 (requirements-review) ──→ 10 (criteria-review)
+ 7b ───┴──→ 8 (design mode) ──→ 9 (requirements-review) ──→ 10.1 (criteria grounding)
+ 10.1 ──→ 10.2 (criterion review + closeability)
+ 10.2 ──→ 10.3 (criteria closure)
Phase 6: 7 ──┐
8 ──┼──→ 11a (project dashboard workflow state)
9 ──┤
- 10 ─┘
+ 10.3 ─┘
7b ──→ 12 (knowledge workspace review surface + lifecycle API)
- 9 ──→ 12
- 10 ──→ 13 (export)
+ 10.3 ──→ 12
+ 10.3 ──→ 13 (export)
+ 12 ──┬──→ 13a (review lifecycle refinement)
+ 13 ──┘
Phase 7: 13 ──→ 14 (npx + CLI)
Phase 8: 14 ──→ 15 (drizzle-kit audit remediation)
```
### Parallelism opportunities
-- 7 (explicit phase outcomes + scope closure) and 7a (knowledge-layer redesign spike) can proceed in parallel: 7 establishes workflow closure mechanics, while 7a retires the ontology/graph/workspace risk that would otherwise leak into later mode and review slices.
-- With 7b landed, 8 (design mode + shared phase-closing model) is now the next primary unblocked slice. 12 still waits on the later reviewed-artifact path in 9/10 even though the canonical knowledge foundation is now in place.
-- 11a (project dashboard workflow state) can begin once the workflow-state artifacts from 7/8/9/10 exist; it does not need the broader knowledge workspace to surface durable project status, readiness bands, and carried-debt caveats.
-- 12 (knowledge workspace review surface + lifecycle API) and 13 (export) can proceed in parallel once 7b and the requirements/criteria review artifacts stabilize, because the first reviewed export path does not require the full knowledge workspace to land first.
-- 14 (npx) can start early with a basic launcher, completing after slice 13 when the new export predicate stabilizes.
-- 15 (drizzle-kit audit remediation) should wait until 14 lands, so packaging/distribution regressions can be judged on the real shipped path instead of the current dev-only setup.
+- With 7, 7a, 7b, 8, and 9 all done, the next primary slice is 10.1 (criteria grounding + first synthesis/review loop).
+- 10.2 and 10.3 should follow linearly; they are intentionally the minimum slices needed to unblock completed interview flow rather than separate variants of the same review seam.
+- 11a (project dashboard workflow state) can begin once 10.3 lands; it does not need the broader knowledge workspace.
+- 12 (knowledge workspace) and 13 (export) can proceed in parallel once 10.3 stabilizes the criteria artifacts and completed-workflow state.
+- 13a (review lifecycle refinement) is explicitly deferred; it should collect rarer review variants after 12 and 13 stabilize rather than fragmenting slices 9 and 10.
+- 14 (npx) can start early with a basic launcher, completing after slice 13 when the export predicate stabilizes.
+- 15 (drizzle-kit audit remediation) should wait until 14 lands.
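
The Phase 5 to Phase 8 edges in the revised dependency graph can be checked mechanically. The sketch below transcribes those edges and computes which slices are unblocked for a given set of finished slices. `SliceDeps`, `sliceGraph`, and `unblockedSlices` are hypothetical names for illustration only. It treats every edge as a hard prerequisite, so it does not model the note that 14 can start early with a basic launcher.

```typescript
// Hypothetical helper for illustration; edges are transcribed from the graph above.
interface SliceDeps {
  slice: string;
  dependsOn: string[];
}

const sliceGraph: SliceDeps[] = [
  { slice: "10.1", dependsOn: ["9"] },
  { slice: "10.2", dependsOn: ["10.1"] },
  { slice: "10.3", dependsOn: ["10.2"] },
  { slice: "11a", dependsOn: ["7", "8", "9", "10.3"] },
  { slice: "12", dependsOn: ["7b", "10.3"] },
  { slice: "13", dependsOn: ["10.3"] },
  { slice: "13a", dependsOn: ["12", "13"] },
  { slice: "14", dependsOn: ["13"] },
  { slice: "15", dependsOn: ["14"] },
];

// A slice is unblocked when it is not done and all of its prerequisites are done.
function unblockedSlices(done: Set<string>): string[] {
  return sliceGraph
    .filter((entry) => !done.has(entry.slice))
    .filter((entry) => entry.dependsOn.every((dep) => done.has(dep)))
    .map((entry) => entry.slice);
}

// Slices the parallelism notes list as already done.
const finished = new Set(["7", "7a", "7b", "8", "9"]);
const nextSlices = unblockedSlices(finished);
```

With 7, 7a, 7b, 8, and 9 finished, `nextSlices` contains only `10.1`, matching the first parallelism note. Once 10.1 through 10.3 are added to the set, 11a, 12, and 13 become available together.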
diff --git a/memory/SPEC.md b/memory/SPEC.md
index 4aa0404c..3afaedb7 100644
--- a/memory/SPEC.md
+++ b/memory/SPEC.md
@@ -94,81 +94,49 @@ Detailed schema and mode-model rationale: `docs/design/INTERVIEW_MODE_MODEL.md`.
| A4 | Observer extraction completes in 1-3s during user read/think time (10-60s), adding zero perceived latency | medium | D1 | Observer agent | Spike measured 14-17s with Sonnet. Haiku expected 2-5s — validate with `generateObject` model switch. |
| A6 | Turn-tree branching in SQLite is sufficient for decision revisit and undo in a single-user tool | high | D7 | Turn tree, Branching | Validate with realistic branch/merge scenarios |
| A7 | Users arriving at the tool have a reasonably defined goal | medium | — | Scope phase | User testing; characterization kickoff mode mitigates if false |
-| A14 | A second-thread observer agent can reliably extract typed knowledge items and graph edges from a turn plus accumulated context | **validated** | D4, D5, D13, D52, D53, D54, D55, D56, D61, D62 | Observer agent, Knowledge layer | Validated structurally for the current widened ontology — canonical scope-mode `goal` / `term` / `context` / `constraint`, design-mode mixed scope-kind + generic `decision` / `assumption` deltas, requirements-mode `requirement` extraction, and criteria-mode `criterion` extraction — through the observer seam: `observer.test.ts`, `context.test.ts`, `parts.test.ts`, `app.test.ts`, and `InterviewWorkspace.test.tsx` prove widened observer contracts, generic persistence, in-band sync, and workspace refresh. Live-model discriminability for the sharper canonical scope kinds remains an outer-loop concern tracked separately in A40. |
-| A15 | The LLM can produce a useful coarse readiness estimate and closure recommendation, but phase closure authority must not depend solely on that judgment | medium | D3, D65, D66, D70 | Phase resolution, readiness projection | Partially validated structurally: slices 8.2 and 8.3 now prove the shared phase-closing seam supports both interviewer-recommended design closure and user-forced design closure, with persisted closure basis surviving reload and handoff into requirements (`db.test.ts`, `core.test.ts`, `app.test.ts`, `InterviewWorkspace.test.tsx`). Remaining validation still depends on outer-loop comparison of model recommendations vs user overrides across varied project types. |
+| A14 | A second-thread observer agent can reliably extract typed knowledge items and graph edges from a turn plus accumulated context | **validated** | D4, D5, D13, D61, D62 | Observer agent, Knowledge layer | Validated: observer seam tests prove widened extraction for all eight canonical kinds through generic persistence, in-band sync, and workspace refresh. |
+| A15 | The LLM can produce a useful coarse readiness estimate and closure recommendation, but phase closure authority must not depend solely on that judgment | medium | D3, D65, D66, D70 | Phase resolution, readiness projection | Partially validated structurally: slices 8.2, 8.3, and 9.5 now prove the shared phase-closing seam supports interviewer-recommended closure across multiple phases plus user-forced design closure, with persisted closure basis surviving reload and handoff through requirements into criteria (`db.test.ts`, `core.test.ts`, `app.test.ts`, `InterviewWorkspace.test.tsx`). Remaining validation still depends on outer-loop comparison of model recommendations vs user overrides across varied project types. |
| A16 | AI SDK `useChat` hook's `ToolUIPart` state machine models all permutations of pending, error, and success for tool calls | high | D14, D58 | Rich chat UI, pending-question projection | Partially validated: typed `tool-ask_question` parts render with correct state labels, streamed ask-question output now projects into a dedicated pending-question turn-card state without fabricating persisted turns in `workspace-data.test.ts`, `workspace-controller.test.tsx`, and `InterviewWorkspace.test.tsx`, and manual browser verification confirmed the pending-question card now appears without refresh. |
| A20 | Observer results can be delivered as typed data parts on the existing chat stream without holding the connection open unacceptably long | high | D22 | Observer agent, Entity sidebar | Measure observer latency with `generateObject`; if >5s, fall back to out-of-band SSE |
-| A21 | `useChat` `onData` callback reliably bridges to `queryClient.invalidateQueries` without stale-closure issues | **validated** | D22 | Entity sidebar | Validated: `InterviewWorkspace.test.tsx` covers `data-observer-result` → query invalidation → sidebar refresh, plus manual outer-loop verification remains for live browser/runtime behavior. |
-| A28 | AI SDK `ToolLoopAgent` with `stopWhen: stepCountIs(N)` is sufficient for brunch's multi-step interviewing, review, and phase-transition needs — no custom agent loop required | high | D30 | Agent loop, Phase transitions | Partially validated structurally: slices 8.1–8.3 now prove confirmed scope closure can hand off into a design-phase interviewer/observer turn, that design can recommend closure and hand off into requirements, and that a user-forced design close can bypass a recommendation without a handwritten loop (`interview.test.ts`, `db.test.ts`, `core.test.ts`, `app.test.ts`). Remaining proof for later review/closure slices still depends on the downstream phase-transition work. |
-| A33 | Structured turn responses can replace today's single-select flow while keeping persisted response parts, transcript hydration, and downstream context projection aligned for the first thin slice | **validated** | D23, D24, D25, D45, D46, D47, D48, D57, D60 | 6d flexible turn-response model; Phase 4 response-seam refactor | Validated: `parts.test.ts`, `app.test.ts`, `context.test.ts`, `observer.test.ts`, `turn-response.test.ts`, `workspace-data.test.ts`, and `InterviewWorkspace.test.tsx` now prove zero/one/many selected options plus optional free-text persist, rehydrate, and reach interviewer history, observer context, and workspace turn-card state coherently through the shared turn-response seam rather than selected-option flags. |
+| A21 | `useChat` `onData` callback reliably bridges to `queryClient.invalidateQueries` without stale-closure issues | **validated** | D22 | Entity sidebar | Validated: `InterviewWorkspace.test.tsx` proves observer-result → query invalidation → sidebar refresh. |
+| A28 | AI SDK `ToolLoopAgent` with `stopWhen: stepCountIs(N)` is sufficient for brunch's multi-step interviewing, review, and phase-transition needs — no custom agent loop required | high | D30 | Agent loop, Phase transitions | Partially validated structurally: slices 8.1–8.3 plus 9.5 now prove confirmed scope closure can hand off into a design-phase interviewer/observer turn, that design can recommend closure and hand off into requirements, that a user-forced design close can bypass a recommendation without a handwritten loop, and that confirmed requirements closure hands the next `/chat` turn into `criteria` through the same shared seam (`interview.test.ts`, `db.test.ts`, `core.test.ts`, `app.test.ts`). Remaining proof for later review/closure slices still depends on the downstream criteria/export work. |
+| A33 | Structured turn responses can replace today's single-select flow while keeping persisted response parts, transcript hydration, and downstream context projection aligned for the first thin slice | **validated** | D23, D24, D25, D57, D60 | 6d flexible turn-response model; Phase 4 response-seam refactor | Validated: zero/one/many selections plus free-text round-trip through persistence, hydration, and all downstream projections coherently. |
| A40 | The observer and review workspace can discriminate `goal`, `term`, and `context` well enough for a first canonical-scope implementation if low-confidence cases stay reviewable instead of collapsing back into `framing` | medium | D5, D67, D68, D69 | 7b canonical knowledge model foundation; 8 design mode; 9 requirements-review mode; 12 knowledge workspace | Validate with fixture probes and the first canonical-scope review flow: measure confusion between `goal` / `term` / `context`, then confirm that workspace normalization/editing can correct low-confidence captures without losing provenance or blocking downstream review. |
+| A44 | The existing choice-turn response seam (multi-select plus free text) is sufficient for a first requirements-review interaction that asks whether the synthesized requirement set is complete, needs correction, or is missing items before dedicated per-item review actions exist | medium | D24, D25 | 9 requirements-review mode | Partially validated structurally: slice 9.1 now proves requirements-review grounding can include the current requirement inventory, that a missing-requirement reply can round-trip through the existing turn-response seam into observer/entity refresh, and that requirements remains `in_progress` / not yet closeable after the first review interaction (`context.test.ts`, `interview.test.ts`, `db.test.ts`, `app.test.ts`). Usability of the reused choice-turn UI remains an outer-loop judgment before richer per-item review actions land. |
+| A45 | A first requirement-level review slice can establish durable explicit approval state for one requirement without yet changing requirements closeability or introducing the full edit/reject/stale lifecycle | **validated** | D61, D65, D70, D77 | 9 requirements-review mode | Validated: targeted approval persists durable `reviewed` turn/item links and projects `approved` vs `pending` state. |
+| A46 | A first deterministic requirements closeability rule can treat every active-path requirement having explicit non-pending review state (`approved` or `rejected`) as sufficient for a closure proposal, while readiness remains a separate descriptive signal | **validated** | D65, D66, D70, D79 | 9 requirements-review mode; 10 criteria-review mode | Validated: full review coverage triggers closeability, and the shared phase-close seam is reused for the requirements → criteria handoff. |
## Decisions
-
+
30. **Vercel AI SDK replaces both Claude Agent SDK and raw Anthropic SDK** — `@ai-sdk/anthropic` provider with AI SDK primitives: `ToolLoopAgent` powers the interviewer (typed tools via `tool()` with Zod schemas, multi-step loop via `stopWhen`), `generateObject` powers the observer (structured extraction with Zod schema, no JSON parsing), `createUIMessageStream` + `pipeUIMessageStreamToResponse` handle server-side streaming, `validateUIMessages` validates incoming chat payloads. No hand-written stream translator, no DomainEvent layer on the web path. The `@anthropic-ai/sdk` package remains as a transitive dependency only. Depends on: —. Supersedes: Claude Agent SDK, raw Anthropic SDK approach, D27 (generator composition), D28 (outputFormat), D29 (ResultMessage metrics), custom agent loop plan (old D31).
32. **Core filesystem tools following pi-mono pattern** — 7 generic tools (read, write, edit, bash, grep, find, ls) in `src/server/tools/`, each a factory function returning an AI SDK `tool()` bound to a working directory. Tools are thin wrappers around Node.js fs APIs and shell commands (rg, fd), with truncation limits (500 lines / 64KB) following pi-mono's defaults. Composed via `createCoreTools(cwd)`. First use case: project characterization kickoff mode. Depends on: D30. Supersedes: —.
-33. **Component-level workspace oracle before state refactors** — The interview workspace has a client integration harness (`InterviewWorkspace.test.tsx`) that uses the real React Query cache and component tree while mocking `useChat` transport boundaries. It locks four seam behaviors before state-ownership refactors: initial hydration from persisted turns, same-project refresh preserving local chat state, `data-observer-result` invalidating entities into the sidebar, and option selection flowing through route refresh and chat submission. Depends on: D19, D22. Supersedes: manual-only workspace seam verification.
-
-34. **Heavy client capabilities live behind named boundaries before perf changes** — Streamed markdown rendering, reasoning rendering, code highlighting, and the developer debug route are each imported through dedicated client boundary modules (`src/client/capabilities/*`, `src/client/routes/debug-surface.tsx`) rather than directly from feature components. This keeps runtime behavior unchanged now while giving later refactor commits one place to introduce lazy loading, deferred enhancement, or alternative adapters without another cross-cutting import rewrite. Depends on: D9, D14. Supersedes: direct heavy-dependency imports from message, reasoning, code-block, and router modules.
-
-35. **Developer debug surface is route-lazy, not startup-eager** — The `/debug` route remains declared in the main router, but its UI loads through a lazy client boundary so the default interview entrypoint does not inline developer-only debug content into the initial application chunk. This keeps the route available without charging normal startup for the debug surface. Depends on: D9, D34. Supersedes: eager debug-route component loading from the main router.
-
-36. **Assistant rich rendering is progressive enhancement, not the baseline path** — Message and reasoning text render immediately through a plain text-safe boundary. The shipped app currently enhances only general markdown structure (plus lightweight rich rendering plugins such as math/CJK) after content proves enhancement is needed; mermaid graph rendering and syntax-highlighted code fences are intentionally out of scope for now because their bundle cost is not justified by current product needs. Depends on: D14, D34. Supersedes: startup-eager `streamdown` + highlighting on the default transcript path.
-
-37. **Workspace state ownership lives behind a data adapter before semantics change** — The client reads workspace data through an explicit adapter that separates durable project snapshots, durable entity snapshots, and ephemeral chat seed state. This commit preserves current behavior, including the current project-scoped chat hydration boundary, while giving later commits one place to change fetch concurrency and hydration policy without another cross-cutting rewrite. Depends on: D19, D22. Supersedes: inline workspace ownership logic spread across `InterviewWorkspace` and `EntitySidebar`.
-
-38. **Workspace route loading is the project-scoped durable-data entrypoint** — The interview route loader now starts project and entity snapshot fetches together, then seeds the entity query cache from that loader result so the sidebar can render from the same project entry boundary without a post-mount waterfall. Later observer-result invalidations still refetch through the entity query key, while same-project loader refreshes can update durable snapshots without implicitly rewriting the visible transcript. Depends on: D9, D22, D37. Supersedes: project-only route loading plus post-mount entity fetch from the sidebar path.
-
-39. **Chat hydration is an explicit workspace boundary policy** — Persisted turns seed `useChat` only on initial project entry or when navigation changes the active project. Same-project route invalidations may refresh durable project/entity snapshots and derived affordances, but they do not rewrite the current in-flight transcript. The policy lives in a dedicated client boundary instead of being inferred indirectly from adapter memoization. Depends on: D19, D37, D38. Supersedes: implicit project-id-keyed hydration behavior hidden inside workspace adapter wiring.
-
-40. **Client writes use a shared typed mutation boundary with visible failure states** — Project creation, option selection, and similar client-triggered writes go through one shared POST-mutation helper plus React Query mutation state. Server `error` payloads are surfaced as visible UI feedback instead of being swallowed by silent early returns, while successful writes keep their existing navigation or route-refresh follow-through. Depends on: D22, D37, D39. Supersedes: ad hoc `fetch` calls in route components with inconsistent error handling.
-
-41. **Render-sensitive client primitives use explicit lifecycle boundaries** — Code highlighting now uses an effect-owned async path with cache reads kept synchronous and side-effect-free, message-branch bookkeeping re-synchronizes when branch identity changes and clamps stale indices when branch sets shrink, and transient copy-feedback timers are cleared explicitly on replacement or unmount. Depends on: D34, D39, D40. Supersedes: render-time state resets, callback-style async highlighting orchestration, and branch bookkeeping that only tracked collection length.
-
-42. **Advanced rendering boundaries expose explicit preload surfaces without contaminating first paint** — Markdown and code-highlighting capabilities now export preload hooks so pointer, focus, or touch intent can warm rich rendering before full use, while the transcript keeps the plain path during active animation and the build oracle enforces both chunk separation and a default-entry size ceiling. Depends on: D34, D35, D36, D41. Supersedes: lazy-only enhancement with no intent-preload or budget guardrail.
-
-43. **Workspace orchestration reads through one controller boundary backed by a pure core and imperative shells** — Route components now consume a single workspace controller interface, while durable-state shaping, transcript seeding, and view projection live in pure functions and React Query/chat side effects live in dedicated shells. Depends on: D37, D38, D39. Supersedes: workspace ownership spread across route components plus loosely coordinated helper modules.
-
-44. **Domain-shaped client mutations own success choreography above the shared transport seam** — `client-mutation.ts` remains the shared POST/error boundary, but project creation and turn-option selection now flow through domain hooks that own navigation, invalidation, and chat follow-through so route/controller callsites do not repeat workflow logic. Depends on: D40, D43. Supersedes: route- or controller-local success choreography on top of the generic mutation helper.
-
-45. **Choice-turn responses persist as structured data plus a human-readable summary seam** — The single-option response path now writes a `data-turn-response` user part (`selectedOptionIds[]` + optional `freeText`) while `turn.answer` and the persisted text part carry a human-readable summary for transcript, observer, and transport seams. This keeps the response model structured without requiring a migration-hardening bridge for the old scalar-only worldview. Depends on: D23, D24. Supersedes: `data-option-selection` + scalar selected-option persistence.
-
-46. **Interviewer history prefers response-shaped projection when structured turn-response data exists** — `buildInterviewerContext(...)` should project prior choice-turn replies as chosen options plus free-text response when structured response data exists, falling back to scalar `Answer:` text only for older/non-structured turns. This gives the interviewer coherent access to remodeled response semantics without locking exact prompt prose too early. Depends on: D25, D45. Supersedes: scalar-only `Answer:` projection for choice-turn replies.
-
-47. **Zero-selection free-text responses reuse the same turn-response seam as option picks** — The existing choice-turn submit path now accepts either an option position or free-text-only content, but it always persists the same `data-turn-response` shape and emits the same chat-follow-through summary seam. Client naming should reflect “submit turn response,” not only “select option,” so the no-selection path is first-class rather than an exception case. Depends on: D24, D45. Supersedes: selection-only client/transport seam.
-
-48. **Choice-turn cards stage many selections locally and submit one array-shaped response seam** — Turn-card interaction now toggles zero/one/many selected options locally, then submits a single turn response carrying `positions[]` plus optional free-text through the same mutation/server boundary used by the other response paths. This keeps transcript hydration, persistence, and interviewer-context projection aligned without preserving the old immediate single-click selection behavior. Depends on: D45, D47. Supersedes: immediate single-option submit UI.
-
49. **Generic knowledge reads now surface canonical scope kinds and design commitments through the shared entity seam** — `knowledge_item` plus `turn_knowledge_item` are now the active persistence seam for all currently shipped knowledge writes, including canonical scope kinds plus `decision` / `assumption` commitments. The shared `/api/projects/:id/entities` projection and workspace surfaces still expose kind-specific `goals`, `terms`, `contexts`, `constraints`, `requirements`, `criteria`, `decisions`, and `assumptions` collections so the cutover can proceed without durable `framing` or a flat mixed list. Depends on: D5, D13, D22. Supersedes: decision/assumption-only entity reads in the sidebar and mixed storage assumptions at the read seam.
50. **Generic dependency edges read through one typed entity-graph seam** — `knowledge_edge` now owns persisted dependency edges for the active knowledge graph, and `/api/projects/:id/entities` projects them as one typed `relationships[]` payload with explicit source/target identity (`collection`, `kind`, `id`) so workspace surfaces can render dependency affordances without consulting legacy decision/assumption edge tables. Depends on: D5, D13, D22, D49. Supersedes: flat entity reads with no graph relationship projection and legacy-table-owned dependency edges.
51. **Generic knowledge reads stay kind-specific at the workspace seam** — While generic knowledge items share one persistence table, the shared entities API and workspace surfaces should project `goal`, `term`, `context`, `constraint`, `requirement`, `criterion`, `decision`, and `assumption` as distinct collections rather than a flat mixed list. This keeps tab-level affordances simple while preserving one generic storage model underneath. Depends on: D22, D49, D50. Supersedes: flat mixed generic projection at the workspace seam.
-52. **Phase-aware observer widening now starts with canonical scope kinds through the generic seam** — On `scope` turns, the observer should emit generic `goal`, `term`, and `context` items before later-mode kinds. The observer prompt/context should bias scope turns toward those canonical scope distinctions rather than collapsing everything into one umbrella kind, and the in-band `data-observer-result` seam should carry created scope-kind IDs alongside `decision` / `assumption` IDs so workspace refresh stays coherent while later slices land. Depends on: D13, D22, D25, D49, D51. Supersedes: decision/assumption-only observer-result payloads.
-
-53. **Scope-mode observer widening adds constraints as typed generic boundaries alongside the canonical scope bundle** — After the canonical scope bundle is established, scope-mode observer widening should add generic `constraint` items with optional subtype/rationale through the same generic persistence seam. The observer prompt/context should bias scope turns toward solution-space boundaries and non-goals without collapsing them into context or requirements, and the in-band `data-observer-result` seam should carry created constraint IDs so the `Constraints` tab refresh path stays coherent. Depends on: D22, D25, D49, D51, D52. Supersedes: scope-mode observer widening that still relied on a `framing` umbrella.
-
-54. **Design-mode observer widening keeps design commitments primary while allowing canonical scope-kind spillover** — On `design` turns, the observer should bias extraction toward `decision` and `assumption` entities while still allowing `goal` / `term` / `context` corrections and `constraint` spillover through the same generic seam when the turn adds real scope context or boundaries. The in-band `data-observer-result` seam continues to carry those ID buckets together so workspace refresh remains coherent across all collections. Depends on: D22, D25, D49, D50, D52, D53. Supersedes: phase-blind non-scope observer bias.
+57. **Turn-response projection is one shared semantic boundary for downstream consumers** — Interviewer history and observer context should both read structured replies through one projection module that resolves selected option content from persisted `data-turn-response` parts. When a turn has no persisted `data-turn-response` part, downstream consumers should treat `turn.answer` as display/compatibility copy rather than reconstructing structured reply semantics from `option.is_selected`. Depends on: D24, D25. Supersedes: ad hoc context-local response parsing, observer-only scalar answer projection, and selected-option fallback as a semantic compatibility seam.
-55. **Requirements-mode observer widening starts with requirement emergence before criteria extraction** — On `requirements` turns, the observer should first widen toward generic `requirement` items that capture must-do capabilities or obligations, while explicitly deferring `criterion` extraction to a later criteria-focused slice unless a turn truly cannot be represented otherwise. The observer context should include existing requirements, and the in-band `data-observer-result` seam should widen again to carry created requirement IDs so the `Requirements` tab refresh path stays coherent without regressing existing legacy or generic tabs. Depends on: D22, D25, D49, D51, D54. Supersedes: later-mode observer bias that lacked requirement-specific widening.
+58. **Pending questions are a distinct workspace view model, not invented persisted turns** — Streamed `tool-ask_question` output now projects into a dedicated `pending-question` controller/view-state branch with its own ephemeral identity and option list, while persisted turn cards continue to use the durable `ProjectStateTurn` shape and submission flow. Route and UI layers render the union through one turn-card surface without inventing sentinel turn IDs, negative option IDs, or borrowed ancestry metadata. Depends on: D24, A16. Supersedes: fabricated persisted-turn projection for streamed interviewer questions.
-56. **Criteria-mode observer widening promotes verifiable checks through the same generic seam** — On `criteria` turns, the observer should widen toward generic `criterion` items that capture observable success conditions and verification evidence rather than restating product obligations as more `requirement` entities. The observer context should include existing criteria alongside requirements, and the in-band `data-observer-result` seam should widen once more to carry created criterion IDs so the `Criteria` tab refresh path stays coherent without regressing existing legacy or generic tabs. Depends on: D22, D25, D49, D51, D55. Supersedes: criteria turns treated as generic later-mode extraction with no criterion-specific bias.
+59. **Knowledge-kind metadata lives behind one shared registry seam** — The active knowledge ontology should declare ordering, collection keys, labels, context headings, and empty-state copy in one shared registry module. Observer-result payload schemas, observer-created ID maps, entities projection, observer context sections, and user-facing knowledge surfaces should read from that registry instead of re-declaring parallel arrays or object shapes. Depends on: D13, D22, D25, D49, D51. Supersedes: duplicated knowledge-kind metadata across shared, server, and client seams.
-57. **Turn-response projection is one shared semantic boundary for downstream consumers** — Interviewer history and observer context should both read structured replies through one projection module that resolves selected option content from persisted `data-turn-response` parts. When that seam is absent, downstream consumers should treat `turn.answer` as display/compatibility copy rather than reconstructing structured reply semantics from `option.is_selected`. Depends on: D25, D45, D46, D47, D48. Supersedes: ad hoc context-local response parsing, observer-only scalar answer projection, and selected-option fallback as a semantic compatibility seam.
-
-58. **Pending questions are a distinct workspace view model, not invented persisted turns** — Streamed `tool-ask_question` output now projects into a dedicated `pending-question` controller/view-state branch with its own ephemeral identity and option list, while persisted turn cards continue to use the durable `ProjectStateTurn` shape and submission flow. Route and UI layers render the union through one turn-card surface without inventing sentinel turn IDs, negative option IDs, or borrowed ancestry metadata. Depends on: D43, A16. Supersedes: fabricated persisted-turn projection for streamed interviewer questions.
-
-59. **Knowledge-kind metadata lives behind one shared registry seam** — The active knowledge ontology should declare ordering, collection keys, labels, context headings, and empty-state copy in one shared registry module. Observer-result payload schemas, observer-created ID maps, entities projection, observer context sections, and user-facing knowledge surfaces should read from that registry instead of re-declaring parallel arrays or object shapes. Depends on: D22, D25, D49, D51, D56. Supersedes: duplicated knowledge-kind metadata across shared, server, and client seams.
-
-60. **Workspace response affordances derive from persisted turn responses, not selected-option flags** — Choice-turn cards still render durable option state, but prompt visibility and “answered” affordances should derive from the presence of persisted `data-turn-response` user parts so selected-option replies and free-text-only replies share one semantic seam after reload. Depends on: D39, D45, D47, D48. Supersedes: `is_selected`-driven answered-state logic in workspace view-state.
+60. **Workspace response affordances derive from persisted turn responses, not selected-option flags** — Choice-turn cards still render durable option state, but prompt visibility and “answered” affordances should derive from the presence of persisted `data-turn-response` user parts so selected-option replies and free-text-only replies share one semantic seam after reload. Depends on: D24. Supersedes: `is_selected`-driven answered-state logic in workspace view-state.
61. **The current mixed knowledge storage model is transitional, not architectural target state** — The current split between generic `knowledge_item` rows for some kinds and dedicated legacy tables for `decision` / `assumption` is a migration seam, not the intended end state. The target is one coherent generic knowledge model — `knowledge_item` + `turn_knowledge_item` + `knowledge_edge` + `knowledge_review`, all keyed by canonical `kind + subtype` metadata across the full ontology — and not a permanent hybrid or a return to one-table-per-kind storage. Legacy `decision` / `assumption` tables and their specialized edge tables are temporary compatibility seams to retire. Depends on: D5, D17, D50, D59. Supersedes: treating mixed legacy/generic storage as an acceptable steady state.
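
Decisions 57 and 60 above both hinge on reading reply semantics from persisted `data-turn-response` parts rather than from `is_selected` flags. The following is a minimal sketch of that rule, not the project's implementation: the type names, the `data` nesting, and the function names `projectTurnResponse` and `isAnswered` are assumptions made for illustration. Only the part type and its `selectedOptionIds` / `freeText` fields come from the spec text.

```typescript
// Hypothetical shapes for illustration; the real part and option types live in the codebase.
type TurnResponsePart = {
  type: "data-turn-response";
  data: { selectedOptionIds: number[]; freeText?: string };
};
type TextPart = { type: "text"; text: string };
type UserPart = TurnResponsePart | TextPart;
type TurnOption = { id: number; content: string; is_selected: boolean };
type ProjectedResponse = { chosen: string[]; freeText: string | null };

function isTurnResponse(part: UserPart): part is TurnResponsePart {
  return part.type === "data-turn-response";
}

// D57 sketch: resolve the structured reply from persisted parts only.
// Returns null when no structured part exists, so callers can fall back to
// turn.answer as display copy instead of rebuilding semantics from is_selected.
function projectTurnResponse(
  parts: UserPart[],
  options: TurnOption[],
): ProjectedResponse | null {
  const part = parts.filter(isTurnResponse)[0];
  if (part === undefined) return null;
  // Chosen content is listed in option order, not in selection order.
  const chosen = options
    .filter((option) => part.data.selectedOptionIds.indexOf(option.id) !== -1)
    .map((option) => option.content);
  return { chosen, freeText: part.data.freeText ?? null };
}

// D60 sketch: "answered" means a persisted response part exists,
// which also covers free-text-only replies with zero selected options.
function isAnswered(parts: UserPart[]): boolean {
  return parts.some(isTurnResponse);
}
```

Under these assumptions, a free-text-only reply (empty `selectedOptionIds`) still counts as answered, while a turn whose options carry `is_selected: true` but which has no response part does not.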
@@ -200,6 +168,12 @@ Detailed schema and mode-model rationale: `docs/design/INTERVIEW_MODE_MODEL.md`.
75. **Workflow projection reads closure provenance only from durable phase outcomes** — Confirmed `phase_outcome` rows should store durable `closure_basis` directly at write time for both interviewer-recommended and user-forced closes, and workflow projection should trust that durable field as the sole provenance source. If a confirmed outcome lacks `closure_basis`, projection should surface `closureBasis: null` rather than reconstructing provenance from confirmation-turn payloads. Depends on: D65, D72, D73, D74. Supersedes: transcript-driven closure-basis recovery during workflow projection.
+77. **The first explicit requirement approval seam reuses targeted review metadata plus active-path `turn_knowledge_item(relation='reviewed')` links** — Slice 9.2 should not wait for the full `knowledge_review` lifecycle before proving per-item review state. A requirements-review question may carry explicit approval metadata naming the target requirement and its approval option; if the user chooses that option, the response seam records a durable `reviewed` turn/item link for that requirement on the active path. Read-side projection then surfaces targeted requirements as `approved` and untouched requirements as `pending` without yet introducing edit/reject/stale semantics or changing requirements closeability. Depends on: D13, D24, D61, D65, D70. Supersedes: inferring requirement approval only from free-form response text or generic turn presence.
+
+78. **The first explicit requirement rejection seam reuses targeted review metadata plus active-path `turn_knowledge_item(relation='rejected')` links, with latest explicit active-path review action winning projection** — Slice 9.3 should extend the same narrow per-item review seam rather than jumping to the full `knowledge_review` lifecycle. A requirements-review question may carry explicit rejection metadata naming the target requirement and its rejection option; if the user chooses that option, the response seam records a durable `rejected` turn/item link for that requirement on the active path. Read-side requirement projection resolves `approved` / `rejected` / `pending` from the latest explicit active-path review action so a later rejection can supersede an earlier approval without changing requirements closeability or overloading generic invalidation semantics. Depends on: D13, D24, D61, D65, D70, D77. Supersedes: treating the first approval as irrevocable on the active path, or reusing generic `invalidated` semantics for explicit user rejection.
+
+79. **The first requirements closeability seam is deterministic review coverage over explicit per-item requirement state, and it reuses the shared phase-close proposal transport** — Slice 9.4 should not introduce a requirements-only completion bit or readiness gate. Requirements become closeable once no requirement in the current set is in `pending` review state — i.e. every requirement is explicitly `approved` or `rejected` — while readiness remains a separate descriptive signal. Once closeable, requirements review reuses the shared `propose_phase_closure` / `data-phase-summary` / `phase_outcome` seam already established in earlier phases; the criteria phase remains unopened until the proposal is explicitly confirmed. Depends on: D65, D66, D70, D77, D78. Supersedes: keeping requirements permanently non-closeable until a later bespoke workflow seam exists.
+
26. **`md-pen` for programmatic markdown rendering** — Structured data (entity tables, dependency graphs, checklists) rendered to markdown via `md-pen` rather than hand-rolled string concatenation. Pure string-return functions (`table()`, `taskList()`, `mermaid()`, `heading()`, `alert()`, `details()`) compose by nesting — no AST, no intermediate representation. Escaping is context-aware per function (table cells, URLs, code fences), eliminating a class of bugs when rendering user-supplied text from interviews. Primary use cases: (1) observer context builders presenting growing entity graphs to agents (`table()` for decisions/assumptions with metadata, `taskList()` for reviewed/unreviewed items), (2) spec export rendering active-path entities into downloadable markdown (slice 13), (3) any future agent-facing or user-facing projection of structured data. Zero dependencies, ESM-only, TypeScript-first. Depends on: —. Supersedes: hand-rolled string assembly in context builders.
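
Decisions 77 through 79 above describe a small deterministic rule: the latest explicit review action on the active path decides each requirement's state, and requirements become closeable when nothing is `pending`. A rough sketch of that rule follows. The `ReviewLink` shape, the `turnOrder` field, and both function names are hypothetical; only the `reviewed` / `rejected` relations and the `approved` / `rejected` / `pending` states come from the spec. Treating an empty requirement set as closeable is an assumption the spec does not settle.

```typescript
// Hypothetical shapes: illustrative names, not the project's schema.
type ReviewRelation = "reviewed" | "rejected";
type ReviewLink = {
  requirementId: number;
  relation: ReviewRelation;
  turnOrder: number; // assumed: position of the linking turn on the active path
};
type ReviewState = "approved" | "rejected" | "pending";

// D77/D78 sketch: every requirement starts as pending, then the latest
// explicit active-path review action overwrites any earlier one.
function projectReviewStates(
  requirementIds: number[],
  activePathLinks: ReviewLink[],
): Record<number, ReviewState> {
  const states: Record<number, ReviewState> = {};
  for (const id of requirementIds) states[id] = "pending";
  const ordered = activePathLinks.slice().sort((a, b) => a.turnOrder - b.turnOrder);
  for (const link of ordered) {
    if (!(link.requirementId in states)) continue; // ignore links to unknown requirements
    states[link.requirementId] = link.relation === "reviewed" ? "approved" : "rejected";
  }
  return states;
}

// D79 sketch: closeable when no requirement is still pending.
// Assumption: an empty requirement set is vacuously closeable here.
function isRequirementsCloseable(states: Record<number, ReviewState>): boolean {
  return Object.keys(states).every((key) => states[Number(key)] !== "pending");
}
```

Because closeability is computed from the projected states alone, a later rejection that supersedes an earlier approval changes the requirement's state without changing whether the phase is closeable.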
### Domain model
@@ -238,94 +212,107 @@ Detailed schema and mode-model rationale: `docs/design/INTERVIEW_MODE_MODEL.md`.
Established by ln-build/ln-spike traceability.
Referenced by PLAN.md slices (to establish / to respect). -->
-| # | Invariant | Established by | Protected by | Proves |
-| --- | ----------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- | ------------------------------------------------------------------------------- | ------------------ |
-| I1 | SSE protocol conformance | Slice 1 (skeleton) | app.test.ts | D8 |
-| I2 | Stream lifecycle correctness | Slice 1 (skeleton) | app.test.ts | D8 |
-| I3 | Thinking/text separation | Slice 1 (skeleton) | app.test.ts | D8 |
-| I4 | Vite proxy routing | Slice 1 (skeleton) | vite.config.ts (manual) | D10 |
-| I5 | DB lifecycle correctness | Slice 2 (SQLite) | db.test.ts | D7 |
-| I6 | Turn persistence | Slice 3 (turn tree) | db.test.ts, app.test.ts | D1, D7 |
-| I7 | Tool call SSE conformance | Slice 3b (rich UI) | app.test.ts, manual (outer loop) | D8, D14 |
-| I8 | Tool part state rendering | Slice 3b (rich UI) | manual (outer loop) | D14 |
-| I9 | Turn tree parent chain | Slice 3 (turn tree) | db.test.ts | D1 |
-| I10 | Active path resolution | Slice 3 (turn tree) | db.test.ts | D1 |
-| I11 | Drizzle migration auto-apply | Slice 3c (Drizzle) | db.test.ts | D18 |
-| I12 | Typed server chat boundary | Slice 3c (Drizzle) | core.test.ts, app.test.ts | D19 |
-| I13 | Core/adapter separation | Slice 3c (Drizzle) | core.test.ts, app.test.ts | D19 |
-| I14 | Project-scoped API routes | Slice 3d (routing) | app.test.ts | D9 |
-| I15 | Route loader hydration | Slice 3d (routing) | manual (outer loop) | D9 |
-| I16 | Schema validation on agent tool output | Slice 4 (scope interview) | interview.test.ts | D2, A13 |
-| I17 | Data Part schema validation | Slice 4a (parts persistence) | parts.test.ts (7 tests) | D24 |
-| I18 | Parts round-trip fidelity | Slice 4a (parts persistence) | parts.test.ts (7 tests), core.test.ts | D23 |
-| I19 | Context builder equivalence | Slice 4a (parts persistence) | context.test.ts (9 tests) | D25 |
-| I20 | Entity persistence with turn linkage | Slice 5 (observer) | db.test.ts (7 tests), observer.test.ts | D4, D5 |
-| I21 | Observer-result in-band sync | Slice 5 (observer) | observer.test.ts, app.test.ts | D22 |
-| I22 | AI SDK-native interviewer path | Slice 6b (AI SDK pivot) | app.test.ts, interview.test.ts | D30 |
-| I23 | Entity sidebar reactive update | Slice 6 (sidebar) | app.test.ts, manual (outer loop) | D22 |
-| I24 | Workspace hydration boundary stability | Slice 6b1 (workspace oracle) | InterviewWorkspace.test.tsx | D19, D22 |
-| I25 | Workspace event bridge correctness | Slice 6b1 (workspace oracle) | InterviewWorkspace.test.tsx | D9, D22 |
-| I26 | Progressive code-render fallback | Refactor commit 1 (client characterization coverage) | code-block.test.tsx | D14 |
-| I27 | Equal-length branch replacement stability | Refactor commit 1 (client characterization coverage) | message.test.tsx | D14 |
-| I28 | Client build boundary observability | Refactor commit 1 (client characterization coverage) | build-boundary.test.ts | — |
-| I29 | Heavy client dependency indirection | Refactor commit 2 (client capability boundaries) | capability-boundaries.test.ts | D34 |
-| I30 | Default entry excludes debug surface code | Refactor commit 3 (lazy debug route boundary) | build-boundary.test.ts | D35 |
-| I31 | Assistant transcript rendering stays text-first until enhancement is needed | Refactor commit 4 (progressive rich rendering split) | markdown-rendering.test.tsx | D36 |
-| I32 | Default entry excludes rich rendering and eager highlighting implementation | Refactor commit 4 (progressive rich rendering split) | build-boundary.test.ts | D36 |
-| I33 | Workspace state ownership is explicit even while current hydration semantics are preserved | Refactor commit 5 (workspace data adapter) | workspace-data.test.ts, InterviewWorkspace.test.tsx | D37 |
-| I34 | Workspace project and entity snapshots enter together through one project-scoped loader boundary | Refactor commit 6 (workspace loading concurrency) | InterviewWorkspace.test.tsx | D38 |
-| I35 | Persisted chat state hydrates only on initial project entry or explicit project navigation | Refactor commit 7 (explicit chat hydration policy) | InterviewWorkspace.test.tsx, chat-hydration.test.ts | D39 |
-| I36 | Client-triggered writes surface consistent visible failure states instead of silent no-ops | Refactor commit 8 (shared client mutations) | InterviewWorkspace.test.tsx, ProjectList.test.tsx | D40 |
-| I37 | Code highlighting upgrades from lifecycle-owned async work and ignores stale completions during prop churn | Refactor commit 9 (render-sensitive primitive purity) | code-block.test.tsx | D41 |
-| I38 | Message branch navigation stays aligned with the current branch set after replacement or shrink | Refactor commit 9 (render-sensitive primitive purity) | message.test.tsx | D41 |
-| I39 | Advanced rendering boundaries expose intent-preload seams while keeping animated transcript content on the plain first-paint path | Refactor commit 10 (intent preloading + performance guardrails) | markdown-rendering.test.tsx, code-block.test.tsx, capability-boundaries.test.ts | D42 |
-| I40 | The default client entry remains under an explicit size budget while excluding debug and rich-rendering payloads | Refactor commit 10 (intent preloading + performance guardrails) | build-boundary.test.ts | D42 |
-| I41 | Workspace controller behavior is protected below the route boundary for loader seeding and same-project refresh stability | Refactor commit 14 (controller seam oracles) | workspace-controller.test.tsx | D43 |
-| I42 | Shared client mutation transport reports network, non-JSON, and malformed-success failures consistently | Refactor commit 14 (mutation seam oracles) | client-mutation.test.ts | D44 |
-| I43 | Streamed `tool-ask_question` parts project into the visible turn-card surface before durable route refresh | Slice 6c (live streaming fix) | InterviewWorkspace.test.tsx, workspace-controller.test.tsx | D14, D43 |
-| I44 | Choice-turn replies persist as structured turn-response parts plus a user-visible summary on the first two remodeled response paths | Slice 6d.1 / 6d.2 (single-option + free-text; free-text-only) | parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D24, D45 |
-| I45 | Interviewer history projects chosen options and/or free-text from structured turn responses when available | Slice 6d.1 / 6d.2 (single-option + free-text; free-text-only) | context.test.ts | D25, D46 |
-| I46 | Free-text-only choice-turn replies require non-empty text and submit through the same turn-response seam as option picks | Slice 6d.2 (zero-selection + required free-text) | parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D24, D47 |
-| I47 | Choice-turn replies can stage and persist many selected options as one structured turn response without collapsing back to scalar selection semantics | Slice 6d.3 (many-selection UX + persistence/hydration) | app.test.ts, parts.test.ts, context.test.ts, InterviewWorkspace.test.tsx | D24, D45, D46, D48 |
-| I48 | Canonical scope knowledge items (`goal`, `term`, `context`) persist with project linkage and turn provenance through the generic seam without regressing other knowledge collections | Slice 7b.1 (canonical scope kinds through the generic seam) | db.test.ts | D49, D52 |
-| I49 | Workspace entity projections can surface canonical scope kinds alongside dedicated `decision` / `assumption` collections through the shared entity seam | Slice 7b.1 (canonical scope kinds through the generic seam) | app.test.ts, workspace-data.test.ts, InterviewWorkspace.test.tsx | D22, D49 |
-| I50 | Generic dependency edges project into one typed relationship read model with stable source/target identity | Slice 7b.2 (generic commitment + edge cutover) | db.test.ts | D50 |
-| I51 | Workspace entity projections hydrate and render dependency relationships through the shared entity seam without regressing existing tabs | Slice 6e.2a (legacy dependency edges through the entity seam) | app.test.ts, workspace-data.test.ts, InterviewWorkspace.test.tsx | D22, D50 |
-| I52 | Generic `constraint`, `requirement`, and `criterion` items project through kind-specific sidebar collections with preserved subtype/rationale metadata | Slice 6e.2b (remaining generic kinds through the sidebar seam) | db.test.ts | D49, D51 |
-| I53 | Workspace entity projections hydrate and render remaining generic knowledge kinds through the shared entity seam without regressing existing tabs | Slice 6e.2b (remaining generic kinds through the shared entity seam) | app.test.ts, workspace-data.test.ts, InterviewWorkspace.test.tsx | D22, D51 |
-| I54 | Scope-mode observer context and output schema can widen to canonical scope kinds (`goal`, `term`, `context`) while preserving turn provenance and the shared decision/assumption ID seam | Slice 7b.1 (canonical scope kinds through the generic observer seam) | context.test.ts, observer.test.ts | D25, D52 |
-| I55 | Widened `data-observer-result` parts can carry created goal/term/context IDs through persistence, SSE emission, and workspace refresh without regressing existing observer sync | Slice 7b.1 (canonical scope kinds through the generic observer seam) | parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D22, D24, D52 |
-| I56 | Scope-mode observer context and output schema can widen to `constraint` items with subtype/rationale while preserving generic turn provenance | Slice 6f.2 (scope-mode constraints through the generic observer seam) | context.test.ts, observer.test.ts | D25, D53 |
-| I57 | Widened `data-observer-result` parts can carry created constraint IDs through persistence, SSE emission, and `Constraints` tab refresh without regressing observer sync | Slice 6f.2 (scope-mode constraints through the generic observer seam) | parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D22, D24, D53 |
-| I58 | Design-mode observer extraction can bias toward `decision` / `assumption` entities while still persisting goal/term/context/constraint spillover and generic dependency edges from one turn | Slice 7b.1 (design-mode observer bias with canonical scope-kind spillover) | observer.test.ts | D25, D54 |
-| I59 | Mixed observer-result payloads can carry goal/term/context/constraint plus decision/assumption IDs through assistant-part persistence, SSE emission, entities projection, and workspace refresh without regressing observer sync | Slice 7b.1 (design-mode observer bias with canonical scope-kind spillover) | parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D22, D24, D54 |
-| I74 | Canonical scope kinds (`goal`, `term`, `context`, `constraint`) round-trip through observer result → generic persistence → entities API on a clean DB without durable `framing` rows | Slice 7b.1 (canonical scope-kind cutover) | db.test.ts, observer.test.ts, app.test.ts | D49, D52, D53, D68 |
-| I75 | Shared knowledge registry and workspace entity state project `goals`, `terms`, and `contexts` as first-class collections without ontology drift or refresh regression | Slice 7b.1 (canonical scope-kind cutover) | knowledge.test.ts, workspace-data.test.ts, InterviewWorkspace.test.tsx, workspace-controller.test.tsx | D49, D51, D59, D68 |
-| I76 | Design-mode observer prompting still biases toward design commitments while allowing canonical scope-kind spillover after the cutover | Slice 7b.1 (canonical scope-kind cutover) | observer.test.ts | D54, D68 |
-| I77 | Decisions and assumptions round-trip through `knowledge_item` / `turn_knowledge_item` / `knowledge_edge` while `/api/projects/:id/entities` preserves the dedicated `decisions` / `assumptions` compatibility collections | Slice 7b.2 (generic commitment + edge cutover) | db.test.ts, observer.test.ts, app.test.ts | D49, D50, D54, D61 |
-| I78 | An explicit scope-bundle projection over canonical `goal` / `term` / `context` / `constraint` items remains coherent after the generic commitment/edge cutover | Slice 7b.2 (scope-bundle compatibility projection) | db.test.ts | D67, D68 |
-| I60 | Requirements-mode observer context and output schema can widen to `requirement` items while preserving generic turn provenance and existing entity context | Slice 6f.4a (requirements-mode requirement emergence through the generic observer seam) | context.test.ts, observer.test.ts | D25, D55 |
-| I61 | Widened `data-observer-result` parts can carry created requirement IDs through persistence, SSE emission, entities projection, and `Requirements` tab refresh without regressing observer sync | Slice 6f.4a (requirements-mode requirement emergence through the generic observer seam) | parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D22, D24, D55 |
-| I62 | Criteria-mode observer context and output schema can widen to `criterion` items while preserving generic turn provenance and existing requirement/criterion context | Slice 6f.4b (criteria-mode criterion emergence through the generic observer seam) | context.test.ts, observer.test.ts | D25, D56 |
-| I63 | Widened `data-observer-result` parts can carry created criterion IDs through persistence, SSE emission, entities projection, and `Criteria` tab refresh without regressing observer sync | Slice 6f.4b (criteria-mode criterion emergence through the generic observer seam) | parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D22, D24, D56 |
-| I64 | Structured turn responses survive submit → project reload → transcript hydration → interviewer-history projection without collapsing back to scalar-only semantics | Refactor commit 1 (Phase 4 characterization coverage) | app.test.ts | D23, D24, D25 |
-| I65 | Workspace view-state can project a streamed `ask_question` turn card even when the durable path is still empty | Refactor commit 1 (Phase 4 characterization coverage) | workspace-data.test.ts | D43 |
-| I66 | Interviewer history and observer context project structured turn responses through one shared seam instead of diverging between structured and scalar answer views | Refactor commit 3 (shared turn-response projection seam) | turn-response.test.ts, context.test.ts, observer.test.ts | D25, D57 |
-| I67 | Workspace turn-card state distinguishes persisted turns from pending questions, so streamed interviewer output never fabricates persisted turn ids or ancestry | Refactor commit 4 (pending-question view model) | workspace-data.test.ts, workspace-controller.test.tsx, InterviewWorkspace.test.tsx | D43, D58 |
-| I68 | Knowledge-kind ordering and collection metadata flow from one shared registry through observer-result payloads, observer context sections, entities projection, and sidebar tabs without ontology drift | Refactor commit 5 (shared knowledge-kind registry) | knowledge.test.ts, context.test.ts, InterviewWorkspace.test.tsx | D22, D25, D49, D51, D56, D59 |
-| I69 | Structured turn-response projection reads semantic reply state only from persisted `data-turn-response` parts, so downstream consumers never reconstruct a structured reply from selected-option flags alone | Refactor commit 6 (remove turn-response compatibility shims) | turn-response.test.ts, context.test.ts, app.test.ts | D25, D45, D46, D47, D48, D57 |
-| I70 | Workspace durable/view state derives answered choice-turn affordances from persisted turn responses, so free-text-only replies resume without reopening the pending-response prompt | Refactor commit 6 (remove turn-response compatibility shims) | workspace-data.test.ts | D39, D45, D47, D48, D60 |
-| I71 | Workspace turn-card selected-option rendering rehydrates from persisted `data-turn-response` option IDs, so saved replies stay visibly selected even when durable option flags are false | Post-refactor cleanup slice (workspace durable turn-response seam) | workspace-data.test.ts, InterviewWorkspace.test.tsx | D39, D45, D48, D60 |
-| I72 | Explicit scope phase outcomes persist proposal/confirmation state in a dedicated readiness seam, project current workflow status from the active path, and supersede when the proposal turn leaves that path | Slice 7 (explicit phase outcomes + scope closure) | db.test.ts, app.test.ts | D3, D17, D65 |
-| I73 | Workspace view-state can project a typed `data-phase-summary` closure proposal into a confirmable card and submit typed `data-confirmation` parts through the chat seam without reopening the normal prompt | Slice 7 (explicit phase outcomes + scope closure) | workspace-data.test.ts, InterviewWorkspace.test.tsx | D24, D66 |
-| I79 | Workflow projection now exposes shared `status`, `closeability`, `readiness`, and `closureBasis` fields across a closed scope phase and the newly active design phase instead of the old scope-only `open/proposed/confirmed` seam | Slice 8.1 (design-mode entry + shared workflow projection) | db.test.ts, app.test.ts | D65, D70 |
-| I80 | After confirmed scope closure, the next prepared turn and workspace both enter design mode by reading the shared workflow projection rather than defaulting new turns and UI state back to scope | Slice 8.1 (design-mode entry + shared workflow projection) | core.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D66, D71 |
-| I81 | Scope and design now share one typed phase-summary closure seam, so a design closure proposal can persist, project `proposalPending`, confirm through chat, and advance the next active turn into requirements | Slice 8.2 (design-phase closure proposal + requirements handoff) | interview.test.ts, core.test.ts, app.test.ts | D66, D71 |
-| I82 | Typed `data-confirmation` parts now carry either interviewer-recommended proposal confirmation or a user-forced design close, and workflow projection recovers the forced-close basis from persisted confirmation turns so requirements handoff survives reload through the same chat seam | Slice 8.3 (user-forced design close + carried-debt projection) | parts.test.ts, db.test.ts, core.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D66, D72 |
-| I83 | `data-confirmation` now carries an explicit discriminated phase-close command union, and workspace/controller/request seams consume that command shape end to end for both interviewer-recommended and user-forced closes | Refactor commit 5 (explicit close-command transport) | phase-close.test.ts, parts.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D73 |
-| I84 | Force-close availability now projects once from shared workflow policy, and both workspace affordance rendering and server-side validation consume that projection while preserving the current rejection semantics | Refactor commit 2 (shared force-close action projection) | phase-close.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D74 |
-| I85 | Workflow projection reads closure provenance only from durable `phase_outcome.closure_basis`, so confirmed outcomes with missing durable provenance project `closureBasis: null` instead of re-reading confirmation-turn payloads | Refactor commit 4 (workflow projection closure-basis cutover) | db.test.ts, app.test.ts | D75 |
-| I86 | Phase-close command labels, force-close rejection messages, and user-forced close summaries derive from shared `phase-close` helpers, so UI and server layers no longer reconstruct those semantics inline | Refactor commit 6 (phase-close module deepening) | phase-close.test.ts, app.test.ts, InterviewWorkspace.test.tsx | D73, D74 |
+
+
+### Foundation
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I1 | SSE protocol conformance | Slice 1 (skeleton) | app.test.ts | D8 |
+| I2 | Stream lifecycle correctness | Slice 1 (skeleton) | app.test.ts | D8 |
+| I3 | Thinking/text separation | Slice 1 (skeleton) | app.test.ts | D8 |
+| I4 | Vite proxy routing | Slice 1 (skeleton) | vite.config.ts (manual) | D10 |
+| I5 | DB lifecycle correctness | Slice 2 (SQLite) | db.test.ts | D7 |
+| I6 | Turn persistence | Slice 3 (turn tree) | db.test.ts, app.test.ts | D1, D7 |
+| I7 | Tool call SSE conformance | Slice 3b (rich UI) | app.test.ts, manual (outer loop) | D8, D14 |
+| I8 | Tool part state rendering | Slice 3b (rich UI) | manual (outer loop) | D14 |
+| I9 | Turn tree parent chain | Slice 3 (turn tree) | db.test.ts | D1 |
+| I10 | Active path resolution | Slice 3 (turn tree) | db.test.ts | D1 |
+| I11 | Drizzle migration auto-apply | Slice 3c (Drizzle) | db.test.ts | D18 |
+| I12 | Typed server chat boundary | Slice 3c (Drizzle) | core.test.ts, app.test.ts | D19 |
+| I13 | Core/adapter separation | Slice 3c (Drizzle) | core.test.ts, app.test.ts | D19 |
+| I14 | Project-scoped API routes | Slice 3d (routing) | app.test.ts | D9 |
+| I15 | Route loader hydration | Slice 3d (routing) | manual (outer loop) | D9 |
+| I16 | Schema validation on agent tool output | Slice 4 (scope interview) | interview.test.ts | D2, A13 |
+| I17 | Data Part schema validation | Slice 4a (parts) | parts.test.ts | D24 |
+| I18 | Parts round-trip fidelity | Slice 4a (parts) | parts.test.ts, core.test.ts | D23 |
+| I19 | Context builder equivalence | Slice 4a (parts) | context.test.ts | D25 |
+| I20 | Entity persistence with turn linkage | Slice 5 (observer) | db.test.ts, observer.test.ts | D4, D5 |
+| I21 | Observer-result in-band sync | Slice 5 (observer) | observer.test.ts, app.test.ts | D22 |
+| I22 | AI SDK-native interviewer path | Slice 6b (AI SDK pivot) | app.test.ts, interview.test.ts | D30 |
+| I23 | Entity sidebar reactive update | Slice 6 (sidebar) | app.test.ts, manual (outer loop) | D22 |
+
+### Client characterization
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I26 | Progressive code-render fallback | Client characterization | code-block.test.tsx | D14 |
+| I27 | Equal-length branch replacement stability | Client characterization | message.test.tsx | D14 |
+| I28 | Client build boundary observability | Client characterization | build-boundary.test.ts | — |
+| I29 | Heavy client dependency indirection | Client capability boundaries | capability-boundaries.test.ts | D34 |
+| I30 | Default entry excludes debug surface code | Lazy debug route boundary | build-boundary.test.ts | D35 |
+| I31 | Assistant transcript rendering stays text-first | Progressive rich rendering | markdown-rendering.test.tsx | D36 |
+| I32 | Default entry excludes rich rendering and eager highlighting | Progressive rich rendering | build-boundary.test.ts | D36 |
+
+### Workspace seam
+
+
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I24 | Workspace hydration, streaming projection, controller orchestration, mutation transport, and render-lifecycle boundaries remain stable across project entry, same-project refresh, observer-result invalidation, streamed pending-question cards, and chat submission | Slices 6b1, 6c, 6d, 6e, 6f; refactors 1–14 | InterviewWorkspace.test.tsx, workspace-data.test.ts, workspace-controller.test.tsx, chat-hydration.test.ts, client-mutation.test.ts, ProjectList.test.tsx, code-block.test.tsx, message.test.tsx, markdown-rendering.test.tsx, capability-boundaries.test.ts, build-boundary.test.ts | D9, D19, D22, D14, D34, D35, D36, D37, D38, D39, D40, D41, D42, D43, D44, D58 |
+
+### Turn response seam
+
+
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I44 | Structured turn responses (zero/one/many selected options plus optional free-text) round-trip through persistence, transcript hydration, interviewer-history projection, observer-context projection, and workspace affordance state without collapsing back to scalar-only semantics | Slices 6d.1–6d.3; refactors 1, 3, 6; post-refactor cleanup | parts.test.ts, app.test.ts, context.test.ts, turn-response.test.ts, observer.test.ts, workspace-data.test.ts, InterviewWorkspace.test.tsx | D23, D24, D25, D39, D45, D46, D47, D48, D57, D60 |
+
+### Generic knowledge seam
+
+
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I48 | Canonical knowledge kinds (`goal`, `term`, `context`, `constraint`, `requirement`, `criterion`, `decision`, `assumption`) persist with turn provenance, project through kind-specific entity collections and typed dependency edges, and surface through the shared knowledge registry without ontology drift or refresh regression — including scope-bundle coherence and compatibility collections | Slices 6e, 7b.1, 7b.2; refactor 5 (knowledge registry) | db.test.ts, app.test.ts, knowledge.test.ts, workspace-data.test.ts, workspace-controller.test.tsx, InterviewWorkspace.test.tsx | D5, D22, D49, D50, D51, D52, D53, D59, D61, D67, D68 |
+
+### Observer widening seam
+
+
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I54 | Phase-aware observer extraction widens to all canonical knowledge kinds through the generic seam: scope biases toward goals/terms/contexts/constraints, design biases toward decisions/assumptions with scope-kind spillover, requirements widens to requirements, criteria widens to criteria — and widened `data-observer-result` parts carry created IDs through persistence, SSE emission, entities projection, and workspace refresh without regressing observer sync | Slices 6f.1–6f.4b, 7b.1, 7b.2 | context.test.ts, observer.test.ts, parts.test.ts, app.test.ts, db.test.ts, InterviewWorkspace.test.tsx | D22, D24, D25, D49, D52, D53, D54, D55, D56, D68 |
+
+### Phase-close seam
+
+
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I72 | Explicit phase outcomes persist proposal/confirmation state with durable closure provenance, project shared workflow status/closeability/readiness/closureBasis from the active path, supersede when the proposal turn leaves that path, and advance the active phase on confirmation — through one discriminated phase-close command union and one shared policy seam for both interviewer-recommended and user-forced closes | Slices 7, 8.1–8.3; refactors 2, 4, 5, 6 (phase-close module) | db.test.ts, app.test.ts, core.test.ts, interview.test.ts, parts.test.ts, phase-close.test.ts, workspace-data.test.ts, InterviewWorkspace.test.tsx | D3, D17, D24, D65, D66, D70, D71, D72, D73, D74, D75 |
+
+### Requirements-review seam
+
+
+
+| # | Invariant | Established by | Protected by | Proves |
+| --- | ---------------------------------------------------------- | ------------------------- | ------------------------------------ | ----------- |
+| I87 | Requirements-review mode grounds the interviewer in the current requirement inventory, targeted approve/reject actions persist durable active-path review links with latest-action-wins projection, closeability derives from full review coverage, and the shared phase-close proposal/confirmation seam is reused for the requirements → criteria handoff with correct mode advancement | Slices 9.1–9.5 | context.test.ts, interview.test.ts, db.test.ts, app.test.ts, core.test.ts, EntitySidebar.test.tsx | D24, D25, D51, D61, D65, D66, D70, D71, D77, D78, D79, A28, A44, A45, A46 |
## Lexicon
@@ -353,7 +340,7 @@ Detailed schema and mode-model rationale: `docs/design/INTERVIEW_MODE_MODEL.md`.
| **active path** | The branch from HEAD to root in the turn tree. Determines which turns, knowledge items, phase outcomes, and review state are currently trusted. |
| **phase** / **mode** | A workflow stage of the interview: `scope`, `design`, `requirements`, `criteria`. Modes change interviewer behavior, observer extraction bias, and closure logic. They are not exclusive capture windows. |
| **choice turn** | An exploratory interaction turn where the interviewer proposes structured options and strategic grounding. Supports zero/one/many selections plus a unified free-text response field that is optional when options are chosen and required when none are chosen. |
-| **review turn** | A review interaction turn where the interviewer asks the user to approve, edit, reject, merge, or add to a synthesized item set. |
+| **review turn** | A review interaction turn where the interviewer asks the user to approve, edit, reject, merge, or add to a synthesized item set. Early tracer bullets may approximate this with existing choice-turn responses before dedicated review-action payloads exist. |
| **turn response** | The full structured user reply to a turn: chosen options plus optional/required free-text content. This is the conceptual answer shape even when compatibility scalars exist in storage or transport seams. |
| **free-text response** | User-authored text attached to a turn response. It can supplement chosen options or stand alone when no option fits. |
| **goal** | A desired project outcome or value to optimize for: what the work is trying to achieve. |
@@ -420,9 +407,9 @@ Scored per the arc-oracle diagnostic framework (high / partial / low):
| Dimension | Score | Notes |
| ------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **Observability** | partial | Inner/middle: high for persisted response structure, hydration state, interviewer-context projection, observer-created entity persistence, and explicit phase-outcome lifecycle because those seams are text-native and testable. Outer: still partial for ontology fit, closure-summary quality, and closure timing, which remain visible only through runtime interaction and human judgment. |
-| **Reproducibility** | partial | Deterministic systems (turn tree, DB, schema validation, context projection, readiness records): high. LLM boundary (interviewer output, observer extraction): low — non-deterministic. For slice 6d, the first 6f widening step, and slice 7 phase closure, we therefore prove structural coherence in the middle loop and defer qualitative ontology/closure judgment to the outer loop while the knowledge-item typology and closure prompt shape are still settling. |
-| **Controllability** | high | Single-user, local SQLite, mocked `generateText`, and explicit DB/API/sidebar seams make structural observer-widening and phase-outcome lifecycle work highly controllable. Human review remains reserved for outer-loop judgment on live-model extraction quality and closure quality. |
+| **Observability** | partial | Inner/middle: high for persisted response structure, hydration state, interviewer-context projection, observer-created entity persistence, and explicit phase-outcome lifecycle because those seams are text-native and testable. For slice 9.1, the remaining partial is requirements-review groundedness: tests can prove the current requirement inventory reaches the interviewer seam and survives the review loop, but whether the emitted review prompt is genuinely useful still requires manual judgment. |
+| **Reproducibility** | partial | Deterministic systems (turn tree, DB, schema validation, context projection, readiness records): high. LLM boundary (interviewer output, observer extraction): low — non-deterministic. For slice 6d, the first 6f widening step, slice 7 phase closure, and slice 9.1 requirements review, we therefore prove structural coherence in the middle loop and defer qualitative ontology/closure/review judgment to the outer loop while prompt shape and review semantics are still settling. |
+| **Controllability** | partial | Single-user, local SQLite, mocked `generateText`, and explicit DB/API/sidebar seams keep the structural parts of observer widening, phase outcomes, and the first requirements-review loop agent-controllable. The remaining partial is qualitative review adequacy: groundedness and usefulness of the requirements-review prompt still need a human walkthrough, so the agent cannot fully close the loop autonomously yet. |
### Verification Commands
@@ -455,9 +442,9 @@ End-to-end slices must be **user-testable**, not just programmatically tested. E
| Fast unit tests — DB | Turn persistence with phase provenance, entity writes with dependency edges | I5, I6, I9, I10, I11 | ms |
| Fast unit tests — core | DomainEvent streaming, core/adapter separation, structured turn creation | I12, I13 | ms |
| Fast unit tests — parts | Parts round-trip (DomainEvents → assemble → persist JSON → load → hydrate); Data Part schema validation (Zod parse on structured user input); context builder output shape | I17, I18, I19 | ms |
-| Fast unit tests — turn response | Structured turn-response schema and submit seams establish zero/one/many selected-option arrays plus the required-free-text rule; interviewer context projection stays response-shaped, not scalar-only | I17, I18, I19, I44, I45, I46, I47, A33 | ms |
+| Fast unit tests — turn response | Structured turn-response schema and submit seams establish zero/one/many selected-option arrays plus the required-free-text rule; interviewer context projection stays response-shaped, not scalar-only | I17, I18, I19, I44, A33 | ms |
| Fast unit tests — observer sync | `observer-complete` emitted post-commit with entity IDs matching DB state; SSE adapter encodes as typed data part | D22, A20 | ms |
-| Fast unit tests — phase outcome lifecycle | Explicit phase-outcome records persist proposal/confirmation state, derive current readiness from the active path, and supersede correctly when upstream turns change | I18, I24, I25, I72 | ms |
+| Fast unit tests — phase outcome lifecycle | Explicit phase-outcome records persist proposal/confirmation state, derive current readiness from the active path, and supersede correctly when upstream turns change | I18, I24, I72 | ms |
| Type-aware linting | Semantic static checks (oxlint + tsgolint) | All | ms |
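
The turn-response row above leans on the required-free-text rule from the Lexicon: free text is optional when options are chosen and required when none are. The following is a minimal sketch of that rule as a plain validator. The names `TurnResponse`, `selectedOptionIds`, `freeText`, and `validateTurnResponse` are hypothetical, and the project's own validation is described as a Zod parse, which this sketch does not reproduce.

```typescript
// Hypothetical shape of a structured turn response; field names are assumptions.
interface TurnResponse {
  selectedOptionIds: string[]; // zero, one, or many chosen options
  freeText?: string; // optional when at least one option is chosen
}

// Returns a list of validation errors; an empty list means the response is valid.
function validateTurnResponse(response: TurnResponse): string[] {
  const errors: string[] = [];
  const hasOptions = response.selectedOptionIds.length > 0;
  // Whitespace-only text is treated as missing (an assumption of this sketch).
  const hasFreeText = (response.freeText ?? '').trim().length > 0;
  if (!hasOptions && !hasFreeText) {
    errors.push('free text is required when no option is selected');
  }
  return errors;
}
```

A fast unit test over this rule only needs three cases: no options and no text (invalid), options without text (valid), and text without options (valid).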
**Middle loop** (seconds–minutes): regression gates
@@ -465,12 +452,15 @@ End-to-end slices must be **user-testable**, not just programmatically tested. E
| Oracle family | What it proves | Protects | Cost |
| --------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------- | ---------------------------------------- |
| Differential testing (observer) | Observer extraction meets ≥80% entity capture rate against golden master fixtures | A14 | seconds per fixture; requires Claude API |
-| Round-trip oracle (turn response) | Structured turn response survives submit → persistence → hydration → interviewer-context composition with no drift | I17, I18, I19, I22, I44, I45, I46, I47, A33 | seconds |
+| Round-trip oracle (turn response) | Structured turn response survives submit → persistence → hydration → interviewer-context composition with no drift | I17, I18, I19, I22, I44, A33 | seconds |
| Round-trip oracle (turn tree) | Structured turns → active path → entity resolution intact | I6, I9, I10 | ms |
| Integration tests | SSE stream contains expected event types in order; DB lifecycle survives close/reopen | I2, I5, I13, I14 | seconds |
| Round-trip oracle (observer sync) | Mocked observer widening survives result schema → persistence → entities API → sidebar refresh with no drift | I20, I21, I23, A20 | seconds |
-| Round-trip oracle (phase outcome) | Mocked scope-closure proposal plus user confirmation survives submit → persistence → project reload → workspace workflow-state projection with no drift | I18, I24, I25, I72, I73 | seconds |
-| Model-based lifecycle oracle (phase outcome) | A tiny reference state machine (`none → proposed → confirmed → superseded/stale`) matches real readiness behavior across refresh, revisit, and active-path changes | I24, I25, I72, A15 | seconds |
+| Round-trip oracle (phase outcome) | Mocked phase-closure proposals plus user confirmation survive submit → persistence → project reload → workspace workflow-state projection with no drift, including the first requirements closeability/proposal path | I18, I24, I72, I87 | seconds |
+| Model-based lifecycle oracle (phase outcome) | A tiny reference state machine (`none → proposed → confirmed → superseded/stale`) matches real readiness behavior across refresh, revisit, and active-path changes, including requirements staying `in_progress` while a proposal is pending | I24, I72, I87, A15 | seconds |
+| Contract testing (requirements-review grounding) | The first requirements-review tracer bullet proves two contracts independently: the current requirement inventory reaches the requirements-mode interviewer context, and the emitted review turn stays requirements-shaped rather than falling back to generic design follow-up behavior | I19, A44 | seconds |
+| Round-trip oracle (requirements review) | A requirements-review turn plus a user reply about a missing requirement, a targeted approval, or a targeted rejection survives submit → persistence → requirement review linkage / observer result → entities API → workspace/sidebar refresh with no drift | I18, I21, I23, I87, A44, A45 | seconds |
+| Model-based lifecycle oracle (requirements review) | A tiny reference rule proves the first requirements-review interactions leave `requirements` active, `in_progress`, and not yet closeable rather than accidentally reusing scope/design closure semantics | I24, I87, A15, A44 | seconds |
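
The model-based lifecycle oracles above compare real readiness behavior against a tiny reference state machine (`none → proposed → confirmed → superseded/stale`). The following is a minimal sketch of what such a reference model could look like. The event names (`propose`, `confirm`, `proposalLeftActivePath`) and the choice to allow a fresh proposal after supersession are assumptions for illustration, not taken from the codebase.

```typescript
// Reference states named in the oracle description; `superseded` stands in for superseded/stale.
type OutcomeState = 'none' | 'proposed' | 'confirmed' | 'superseded';

// Hypothetical events; the real system derives these from turns on the active path.
type OutcomeEvent = 'propose' | 'confirm' | 'proposalLeftActivePath';

// Pure transition function: events that do not apply leave the state unchanged.
function step(state: OutcomeState, event: OutcomeEvent): OutcomeState {
  switch (event) {
    case 'propose':
      // Assumption: a new proposal is allowed from `none` or after supersession.
      return state === 'none' || state === 'superseded' ? 'proposed' : state;
    case 'confirm':
      return state === 'proposed' ? 'confirmed' : state;
    case 'proposalLeftActivePath':
      return state === 'proposed' || state === 'confirmed' ? 'superseded' : state;
  }
}

// Replays a sequence of events from the initial `none` state.
function runLifecycle(events: OutcomeEvent[]): OutcomeState {
  return events.reduce<OutcomeState>((state, event) => step(state, event), 'none');
}
```

An oracle built this way would replay the same event sequence against the real persistence layer and assert that the projected readiness matches `runLifecycle`.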
**Outer loop** (minutes–hours): human observer
@@ -484,6 +474,7 @@ End-to-end slices must be **user-testable**, not just programmatically tested. E
| Resume test | Close/reopen browser, verify state intact | Human + browser |
| Observer → sidebar reactivity | `onData` → query invalidation updates sidebar after observer extraction; validates A21 | Human + `/cli-cdp` (slice 6) |
| Manual phase-closure walkthrough | The interviewer proposes scope closure at sensible moments, the summary is understandable, and confirmation/transition UX is coherent | Human + browser (after 7a) |
+| Manual requirements-review walkthrough | The requirements-review turn feels grounded in the current requirement set, the reused choice-turn UI is acceptable for completeness review, the first explicit approve/reject actions remain legible as review state rather than deletion or invalidation, and the first close proposal feels timely once review coverage is complete | Human + browser |
### Design notes
@@ -496,6 +487,8 @@ End-to-end slices must be **user-testable**, not just programmatically tested. E
- **No migration-hardening oracle for observer widening** — Existing data is not valuable enough yet to justify bridge or compatibility oracles across these observer-schema changes, so verification budget should go to current-state structural coherence and semantic quality instead.
- **Slice 7 phase-closure oracle boundary** — Before 7a, slice 7 should prove the explicit readiness seam, not closure taste. The recommended pair is (1) a round-trip oracle over mocked `phase-summary` + `confirmation` artifacts and (2) a tiny model-based lifecycle oracle over `none → proposed → confirmed → superseded/stale`. This assumes a durable explicit phase-outcome record rather than inferred `turn.is_resolution` semantics.
- **Deferred qualitative closure judgment is deliberate, not accidental** — For slice 7, summary quality and closure timing are acknowledged blind spots until 7a sharpens the ontology and knowledge-workspace direction. The middle loop should therefore lock proposal shape, confirmation persistence, active-path projection, and supersession semantics without pretending the model's closure judgment is already trustworthy.
+- **Slice 9.1 requirements-review oracle boundary** — The first requirements-review tracer bullet proves set-level completeness review, not full per-item approval semantics. The middle loop should lock two independent structural artifacts — (1) the current requirement inventory reaches the requirements-review interviewer seam and (2) a requirements-review reply about missing requirements survives the existing turn-response + observer + workspace refresh loop — while the outer loop judges whether the emitted review turn actually feels grounded and useful.
+- **Slice 9.2–9.5 requirements-review oracle boundary** — The current tracer bullets prove explicit per-item approval/rejection actions plus the first deterministic closeability/proposal/confirmation seam, not the full review lifecycle. The middle loop should lock (1) targeted approval/rejection turns carrying explicit review metadata, (2) matching responses producing durable active-path `reviewed` / `rejected` links for the named requirement, (3) read-side projection of `approved` / `rejected` / `pending` requirement state in the entities/sidebar seam, (4) a closeability/proposal round-trip in which fully reviewed requirements can emit a shared `data-phase-summary` proposal without yet closing the phase, and (5) a confirmation/handoff round-trip in which that proposed close advances the next interviewer turn into `criteria` without stale requirements-mode instructions.
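
The read-side projection in points (2) to (4) above can be sketched as a latest-action-wins fold over review links on the active path, with closeability derived from full review coverage. This is an illustrative sketch only: `ReviewLink`, `pathIndex`, `projectReviewStatus`, and `isCloseable` are hypothetical names, and treating an empty requirement inventory as not closeable is an assumption.

```typescript
type ReviewAction = 'approve' | 'reject';
type ReviewStatus = 'approved' | 'rejected' | 'pending';

// Hypothetical review link; `pathIndex` is the position of the review turn on the active path.
interface ReviewLink {
  requirementId: string;
  action: ReviewAction;
  pathIndex: number;
}

// Latest-action-wins: later links on the active path overwrite earlier ones.
function projectReviewStatus(
  requirementIds: string[],
  links: ReviewLink[],
): Map<string, ReviewStatus> {
  const statuses = new Map<string, ReviewStatus>();
  for (const id of requirementIds) statuses.set(id, 'pending');
  const ordered = [...links].sort((a, b) => a.pathIndex - b.pathIndex);
  for (const link of ordered) {
    // Links for requirements outside the current inventory are ignored.
    if (!statuses.has(link.requirementId)) continue;
    statuses.set(link.requirementId, link.action === 'approve' ? 'approved' : 'rejected');
  }
  return statuses;
}

// Closeable only when every requirement has been reviewed.
// Assumption: an empty inventory is not closeable.
function isCloseable(statuses: Map<string, ReviewStatus>): boolean {
  const values = [...statuses.values()];
  return values.length > 0 && values.every((status) => status !== 'pending');
}
```

Keeping the projection pure makes it easy to pair with the round-trip oracle: persist the links, reload, and compare the reloaded projection against this function's output.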
### Acknowledged Blind Spots
@@ -526,6 +519,9 @@ This projection difference is a deliberate design choice, not an implementation
| Cumulative entity graph integrity | Individual extractions may be correct but compose into an incoherent graph over 15-20 turns. No programmatic check for drift. | Debug mode (human eyeballs the growing graph). Future: structural property tests (no orphaned edges, no DAG cycles, monotonic entity count). | After observer slice lands and manual testing reveals graph-level issues. |
| Ontology sharpness / kind discrimination | The generic knowledge-item typology is still being pressure-tested for semantic separation and naming quality, so a schema-valid observer result may still be conceptually wrong. | Treat 6f.1 as structural-only in the middle loop; use manual walkthroughs to judge whether extracted `framing` is truly framing and capture confirmed-good sessions as future fixtures. | Revisit once several good/bad observer examples have been captured and the kind vocabulary feels stable. |
| Phase transition UX | Summary quality, resolution timing, confirmation flow. Fully visual. Slice 7 intentionally defers this outer-loop judgment until after 7a so the ontology/workspace redesign can sharpen what “ready to close scope” should mean. | Middle loop proves only proposal contract, persistence, active-path projection, and supersession semantics for slice 7; manual phase-closure walkthrough lands after 7a. | Revisit immediately after 7a, or sooner if the structural build reveals the proposal/confirmation shape is itself unstable. |
+| Requirements-review prompt grounding | Slice 9.1 proves the requirement inventory reaches the interviewer seam and the review loop stays coherent, but it does not force the emitted review question to quote or deeply reason over specific requirement text. | Manual requirements-review walkthrough judges whether the first review turn feels grounded enough; promote to a stronger contract or fixture-backed oracle if it feels generic. | Revisit if manual runs show requirements-review prompts drifting into generic follow-up instead of requirement-set review. |
+| Per-item requirement review semantics | Slices 9.2 and 9.3 now prove explicit requirement-level approval and rejection actions with durable `approved` / `rejected` / `pending` projection, but they still do not cover edit/add-missing payloads, merge semantics, or stale-state invalidation. | Keep the first per-item slices narrow; add focused review-action seams and oracles for edit/add/merge/stale behavior in later tracer bullets. | Revisit when scoping the next tracer bullet inside slice 9. |
+| Criteria synthesis after requirements handoff | Slice 9.5 now proves the structural confirmation/handoff seam into `criteria`, but it does not yet prove that the first criteria-mode question is well grounded in the approved requirement set or that criteria closeability should depend on richer coverage than simple mode entry. | Use slice 10 to add criteria-specific grounding and lifecycle oracles; keep manual walkthroughs focused on whether the first criteria turn feels connected to the reviewed requirement set. | Revisit when building slice 10 criteria-review mode. |
| Performance under realistic load | 20+ turns, growing history summaries, observer latency. No budget oracle. | Acceptable for single-user tool. | If latency becomes noticeable during manual testing. |
| `onData` stale-closure correctness | The workspace seam now has a component-level integration oracle, but it still mocks `useChat` and does not prove the exact live browser/runtime behavior of the AI SDK hook. Known `onFinish` stale-closure bug (ai-sdk#550) may still affect production wiring. | `InterviewWorkspace.test.tsx` protects the app-side invalidation logic; manual outer-loop validation remains required for live browser/runtime confirmation. If broken, fall back to parallel `EventSource` (D22 Option 2). | If sidebar fails to update after observer extraction during manual testing. |
| Parts/scalar consistency | Persisted `assistant_parts` and scalar fields (`question`, `why`, `impact`, options) are two representations of the same turn content. No programmatic check that they agree. | Acceptable for initial delivery — scalars are written by MCP tool handler, parts assembled from stream. Both derive from the same `query()` call. Future: metamorphic oracle (text in parts matches scalars). | If turns appear correct in one view (parts-based UI) but wrong in another (scalar-based entity queries or export). |
@@ -536,27 +532,28 @@ This projection difference is a deliberate design choice, not an implementation
| File | Tests | Protects |
| ----------------------------- | ----- | ----------------------------------------------------- |
-| db.test.ts | 37 | I5, I6, I9, I10, I11, I20, I48, I50, I52, I72, I74, I77, I78, I79, I82, I85 |
-| knowledge.test.ts | 1 | I68, I75 |
-| app.test.ts | 24 | I1, I2, I3, I7, I14, I21, I23, I44, I46, I47, I49, I51, I53, I55, I57, I59, I61, I63, I64, I69, I72, I74, I77, I79, I80, I81, I82, I83, I84, I85, I86 |
-| core.test.ts | 9 | I12, I13, I18, I80, I81, I82 |
-| interview.test.ts | 8 | I16, I81 |
-| parts.test.ts | 15 | I17, I18, I44, I46, I47, I55, I57, I59, I61, I63, I82, I83 |
-| context.test.ts | 13 | I19, I45, I47, I54, I56, I60, I62, I66, I68, I69 |
-| observer.test.ts | 9 | I20, I21, I54, I56, I58, I60, I62, I66, I74, I76, I77 |
-| phase-close.test.ts | 13 | I83, I84, I86 |
-| turn-response.test.ts | 4 | I66, I69 |
-| InterviewWorkspace.test.tsx | 21 | I24, I25, I23, I33, I34, I35, I36, I43, I44, I46, I47, I49, I51, I53, I55, I57, I59, I61, I63, I67, I68, I71, I73, I75, I80, I82, I83, I84, I86 |
-| ProjectList.test.tsx | 2 | I36 |
-| workspace-data.test.ts | 7 | I33, I49, I51, I53, I65, I67, I70, I71, I73, I75 |
-| chat-hydration.test.ts | 3 | I35 |
-| workspace-controller.test.tsx | 3 | I41, I43, I67, I75 |
-| client-mutation.test.ts | 3 | I42 |
-| code-block.test.tsx | 4 | I26, I37, I39 |
-| markdown-rendering.test.tsx | 3 | I31, I39 |
-| message.test.tsx | 2 | I27, I38 |
-| build-boundary.test.ts | 1 | I28, I30, I32, I40 |
-| capability-boundaries.test.ts | 2 | I29, I39 |
+| db.test.ts | 41 | I5, I6, I9, I10, I11, I20, I48, I54, I72, I87 |
+| knowledge.test.ts | 1 | I48 |
+| app.test.ts | 26 | I1, I2, I3, I7, I14, I21, I23, I44, I48, I54, I72, I87 |
+| core.test.ts | 10 | I12, I13, I18, I72, I87 |
+| interview.test.ts | 9 | I16, I72, I87 |
+| parts.test.ts | 15 | I17, I18, I44, I54, I72 |
+| context.test.ts | 14 | I19, I44, I48, I54, I87 |
+| observer.test.ts | 9 | I20, I21, I44, I48, I54 |
+| phase-close.test.ts | 13 | I72 |
+| turn-response.test.ts | 4 | I44 |
+| InterviewWorkspace.test.tsx | 21 | I23, I24, I44, I48, I54, I72 |
+| ProjectList.test.tsx | 2 | I24 |
+| workspace-data.test.ts | 7 | I24, I48, I72 |
+| chat-hydration.test.ts | 3 | I24 |
+| workspace-controller.test.tsx | 3 | I24, I48 |
+| client-mutation.test.ts | 3 | I24 |
+| EntitySidebar.test.tsx | 1 | I87 |
+| code-block.test.tsx | 4 | I24, I26 |
+| markdown-rendering.test.tsx | 3 | I24, I31 |
+| message.test.tsx | 2 | I24, I27 |
+| build-boundary.test.ts | 1 | I24, I28, I30, I32 |
+| capability-boundaries.test.ts | 2 | I24, I29 |
## Acceptance Criteria (exit conditions)
diff --git a/src/client/components/EntitySidebar.test.tsx b/src/client/components/EntitySidebar.test.tsx
new file mode 100644
index 00000000..d25897d6
--- /dev/null
+++ b/src/client/components/EntitySidebar.test.tsx
@@ -0,0 +1,68 @@
+// @vitest-environment happy-dom
+
+import { cleanup, fireEvent, render, screen } from '@testing-library/react';
+import { afterEach, describe, expect, it } from 'vitest';
+
+import { EntitySidebar } from './EntitySidebar.js';
+
+afterEach(() => {
+ cleanup();
+});
+
+describe('EntitySidebar', () => {
+ it('renders explicit approved, rejected, and pending badges for requirements', () => {
+ render(
+
{item.content}
+{item.content}
+ {item.reviewStatus && ( +{item.subtype}
} {item.rationale &&{item.rationale}
}