cn8: workflow steps emit structured beads records as canonical output (graph as source of truth) by richardkiene · Pull Request #4 · Liquescent-Development/millworks

richardkiene · 2026-06-07T06:03:41Z

Epic millworks-cn8 (ADR-0009 D44). A workflow step's canonical output becomes first-class structured beads records (decision, risk, requirement, intent, task, healing — each carrying its prose in its description), not a single prose blob in STEP notes. The beads graph is the source of truth for "what was decided / what happened / what needs doing". Cross-surface (pi + Claude), lockstep. Builds on c30 (#3) and inc5 run-tracking/settle.

What's in this PR (11 beads)

Shared core

tools/millworks-emit — the sole least-privilege beads write-path. emit auto-stamps step:/wfrun: labels + a discovered-from link from MILLWORKS_STEP_ID/WFRUN_ID; complete --summary sets STEP notes + the self-report:complete marker. (millworks-thz)
persona-picker parses an emits: frontmatter field and surfaces it in JSON (millworks-40a); context-pack-assembler expands a scoped STEP → its emitted records, notes-only when none — a graceful c30 superset (millworks-2qe); requirement registered as a custom type (millworks-6q0); millworks:beads skill documents the emit mechanics (millworks-clb); personas declare conservative emits contracts + the 5 emitting ones rewritten to records (millworks-kma).

Both surfaces (pi extensions/workflow-runner, Claude surfaces/claude/mcp-server), lockstep

Dispatch injects step/wfrun env + emit access + a universal completion instruction (every step must millworks-emit complete to settle; emit-types requirement only when emits is non-empty). (millworks-ypd / millworks-d8q)
Settle flipped to beads authority: the self-report:complete marker is the settle trigger (transcript/pane demoted to a health signal); the runtime validates the emits contract and is the sole writer of the outcome:success close (validate-then-commit; a contract violation kills the pane and re-dispatches; the inc5 transcript→notes write is removed — notes come from the agent). (millworks-q2h / millworks-kaa)
Recovery re-resolves each recovered step's emits and re-validates a marker-seen step (no false auto-pass; a persona that can't be re-resolved fails the run). (millworks-1i7)

Why this matters

Solves the fragile-settle problem: a user interrupting a subagent (ending its transcript turn) no longer reads as "settled" — settlement is now a durable, content-addressed outcome in beads, definitive across crashes and interruptions.

Verification

Claude 328 tests, pi 186 tests, Rust crates green; unit + gated real-bd smokes on both surfaces.
A cross-surface reconciliation review caught and fixed a Claude miss (marker loop built but unwired) and several lockstep divergences; the two surfaces are byte-lockstep on the completion instruction and behaviorally lockstep on the settle/recovery state machines.

Deferred (tracked beads, not blocking)

millworks-26e — live end-to-end + parity verification (owner-driven plugin rebuild + driven run with kill/recover). Pending.
millworks-5wz — pi emit-scoping hardening (pi --tools has no per-command scoping, so emitting personas get full bash for now — Decision A; structural scoping tracked).
millworks-qaq — direct-persona: steps skip the emits contract (a persona:-pinned step bypasses the picker → emits:[]).

Pre-existing (NOT cn8 regressions, will show in CI)

millworks-rrp — 4 context-pack-assembler unit tests failing on main.
millworks-7s4 — pi vitest picks up ambient.d.ts as a test file (false failure).

Design in bd show millworks-cn8 --design + ADR-0009 D44.

Record the resolved design for millworks-cn8 (steps emit structured beads records as canonical output; graph as source of truth) as ADR-0009 D44, and the 11 child planning beads (millworks-thz/40a/clb/2qe/kma/ypd/d8q/q2h/kaa/1i7/26e) exported to .beads/issues.jsonl. Design canonical in 'bd show millworks-cn8 --design'.

…llworks-clb)

Adds "Emitting structured output (workflow steps)" to the shared millworks:beads skill (content/skills/beads/SKILL.md). DRY mechanics live once here, referenced by every emitting persona (ADR-0009 D44 M-4).

Covers:
- prose-in-description principle (D-c)
- millworks-emit emit and complete subcommand interfaces verbatim
- auto-stamping of step:/wfrun:/discovered-from by the CLI (agents must not hand-stamp)
- optional --link for domain links between emitted records
- the self-report:complete terminal marker as the final act (D-g)
- emits contract concept (D-a/D-b)
- worked requirements-analyst example
- a "What NOT to do" guard list

Add `emits: [<type>...]` frontmatter support to the shared persona loader so both runtimes receive a persona's output contract (ADR-0009 D44 D-a).

Changes:
- `RawFrontmatter`: add `emits: Option<serde_yaml::Value>` (mirrors tools)
- `Persona`: add `emits: Vec<String>` (normalized; absent → empty vec)
- `PickResult`: add `emits: Vec<String>` (surfaced in picker JSON output)
- `PickerError::MalformedEmits`: fail-fast for non-string/non-list emits
- `normalize_string_or_list()`: shared helper (DRY); string or list → Vec<String>; absent → []; malformed → MalformedEmits error
- All PickResult construction sites in picker.rs carry emits through
- 6 new unit tests (list, string, absent, integer, mapping, PickResult)

…orks-thz)

New Rust crate `tools/millworks-emit`: the sole beads write-path granted to Millworks workflow subagents (least-privilege; no arbitrary shell). Realizes ADR-0009 D44 decisions M-2, M-3, M-5, D-d, D-g.

CLI surface (canonical):

  millworks-emit emit --type <T> --title <S> --description <S> [--link <type>:<id>…]
    Creates a bd record, stamps step:<id>/wfrun:<id> labels and a discovered-from link (FROM new record TO STEP). Prints new id to stdout.

  millworks-emit complete --summary <S>
    Sets STEP notes to <S> then adds self-report:complete label (in that order).

Both subcommands fail fast (non-zero, clear stderr) if MILLWORKS_STEP_ID or MILLWORKS_WFRUN_ID is unset/empty.

Design:
- bd I/O isolated in `runner::BdRunner` trait + `RealBdRunner` impl so argv construction in `commands.rs` is unit-testable without bd (mirrors assembler's run_bd_show seam pattern).
- `parse_created_id` handles mixed warning+JSON stdout from bd create --json.
- `tools/millworks/src/lib.rs`: added "millworks-emit" to MILLWORKS_BINARIES so millworks setup and build-claude both provision it (install.sh + bin/ symlink in the Claude plugin, the same wiring as the other shared-core CLIs).

Tests: 33 unit tests (argv construction, env fail-fast, id parsing) + 4 real-bd smoke tests (gated MILLWORKS_SMOKE=1): emit attribution round-trip verifies step:/wfrun: labels and discovered-from link; complete verifies notes + label.
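The env fail-fast and auto-stamping rules can be illustrated with a minimal sketch. It is written in TypeScript for readability and is not the Rust crate's code: `requireEnv`, `provenanceFor`, and the `Provenance` shape are hypothetical names, and the trimming follows the review fix described later in this PR.

```typescript
// Hypothetical sketch of the provenance rule; names are illustrative only.
type Env = Record<string, string | undefined>;

// Fail fast when a required variable is unset, empty, or only whitespace.
function requireEnv(env: Env, name: string): string {
  const value = (env[name] ?? "").trim();
  if (value === "") {
    throw new Error(`${name} is unset or empty`);
  }
  return value;
}

interface Provenance {
  labels: string[];        // step:<id> and wfrun:<id>
  discoveredFrom: string;  // link FROM the new record TO the STEP
}

// Derive everything the CLI stamps from the environment, so the agent
// never has to know (or hand-stamp) its own ids.
function provenanceFor(env: Env): Provenance {
  const stepId = requireEnv(env, "MILLWORKS_STEP_ID");
  const wfrunId = requireEnv(env, "MILLWORKS_WFRUN_ID");
  return {
    labels: [`step:${stepId}`, `wfrun:${wfrunId}`],
    discoveredFrom: stepId,
  };
}
```

Deriving both labels and the link target from one validated read keeps attribution consistent: a record either carries full provenance or is never created.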

…ote)

Review polish for millworks-clb:
- --link synopsis metavar uses TYPE:TARGET to match millworks-emit's clap value_name.
- Note that the requirement record type is registered by the cn8 rollout (millworks-6q0 adds the table row), so a reader isn't confused that it's missing from the 9-types table.

- Make normalize_string_or_list genuinely reusable (DRY): MalformedEmits now carries `field: String`, included in the Display message; call site passes "emits". Removed the false comment and the `let _ = field;` no-op.
- Malformed-emits unit tests now assert the error names the `emits` field.
- New unit test: explicit `emits: []` (YAML empty sequence) -> empty Vec.
- New unit test: list with a non-string element (emits: [requirement, 42]) -> fail-fast MalformedEmits.
- New integration test: run the binary against a fixture persona with a non-empty emits and assert the JSON output's `emits` array values.
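The normalization rules above (absent → empty, string or list accepted, anything else fails fast and names the field) can be sketched as follows. This is a TypeScript illustration rather than the Rust helper itself; `MalformedFieldError` is a hypothetical stand-in for `PickerError::MalformedEmits`, and treating a bare string as a one-element list is an assumption.

```typescript
// Hypothetical stand-in for PickerError::MalformedEmits; carries the field name.
class MalformedFieldError extends Error {
  readonly field: string;
  constructor(field: string) {
    super(`malformed frontmatter field '${field}': expected a string or a list of strings`);
    this.name = "MalformedFieldError";
    this.field = field;
  }
}

// Normalize a parsed frontmatter value into a list of strings.
function normalizeStringOrList(value: unknown, field: string): string[] {
  // Absent field: an empty contract, not an error.
  if (value === undefined || value === null) return [];
  // ASSUMPTION: a bare string is treated as a one-element list.
  if (typeof value === "string") return [value];
  // A list is valid only if every element is a string (covers `[]` too).
  if (Array.isArray(value) && value.every((v) => typeof v === "string")) {
    return value.slice() as string[];
  }
  // Integers, mappings, and mixed lists all fail fast.
  throw new MalformedFieldError(field);
}
```

Passing the field name in, instead of hard-coding "emits", is what lets the same helper serve other string-or-list fields while still producing an error that says which field was wrong.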

- parse_created_id: replace hand-rolled brace counting (which corrupted on a literal `}` inside a title/description, e.g. `closes {issue}`) with a serde_json streaming parse from the first `{`, which respects string contents. Adds two unit tests (brace-in-title, warning-prefix + brace-in-string).
- env::require_env: return the TRIMMED value so a padded env can't leak whitespace into a `step:`/`wfrun:` label. Adds a trim test.
- EmitArgs: drop the unused `extra_links` field (it's applied post-create via emit_argv, not a create input); this removes an unnecessary clone in main.rs and makes the struct cohesive.
- Move the gated real-bd smokes from `src/smoke_tests.rs` (a pub module that compiled into the release lib) to `tests/smoke.rs` (an integration-test crate, test-only), the idiom used by context-pack-assembler. Resolves the binary via CARGO_BIN_EXE_millworks-emit instead of path-guessing, and asserts `bd dep list` exit success so a broken dep-list can't be silently ignored.
- Fix two clippy doc-list warnings in lib.rs.

All green: 27 lib + 5 bin unit tests, 4 real-bd smokes (MILLWORKS_SMOKE=1), clippy clean.
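The brace-in-string bug is easy to reproduce, so a small sketch may help. The crate uses a serde_json streaming parse; the TypeScript below swaps in a different technique with the same property, a string-aware scan for the end of the first object, because `JSON.parse` cannot stop mid-input. The function name and the assumption that the created record is a JSON object with a string `id` field are illustrative.

```typescript
// Sketch: pull the created id out of stdout that may carry warning lines
// before the JSON. ASSUMPTION: the record is an object with a string `id`.
function parseCreatedId(stdout: string): string {
  const start = stdout.indexOf("{");
  if (start < 0) throw new Error("no JSON object in bd output");

  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < stdout.length; i++) {
    const ch = stdout[i];
    if (inString) {
      // Inside a string literal braces are data, not structure.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) {
        const parsed: unknown = JSON.parse(stdout.slice(start, i + 1));
        const id = (parsed as { id?: unknown }).id;
        if (typeof id !== "string") throw new Error("created record has no string id");
        return id;
      }
    }
  }
  throw new Error("unterminated JSON object in bd output");
}
```

A naive counter would close the object at the `}` inside `closes {issue}`; tracking string and escape state is what keeps the structural depth honest.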

Add 'requirement' to the custom beads types list (intent, risk, healing, wfrun, step → +requirement) so cn8 requirements-analyst personas can emit first-class queryable requirement records rather than modeling them as task/feature.

- recipes/init-beads.sh: CUSTOM_TYPES gains requirement; count comments updated (5→6 custom, 9→10 total)
- docs/beads-mapping.md: Requirement row in summary table; full per-type detail section added before WFRUN section
- docs/adr/0003-beads-schema-mapping.md: D16 updated to 10 types/6 custom; REQUIREMENT row in domain table; bd config set example and Consequence paragraph updated for cn8
- content/skills/beads/SKILL.md: "The 9 record types" heading → 10; new Requirement row in Domain records table; error-recovery snippet updated

Verified: bd types lists 'requirement'; bd create -t requirement succeeds in a fresh scratch workspace.

…clb)

…-2qe)

When the context-pack-assembler renders a scoped STEP, after the notes summary it now queries bd list --label step:<id> --json, gathers the emitted records, and renders each as type+id+title+description under an "#### Emitted Records" sub-heading (D44 D-e).

Key mechanics:
- run_bd_list_by_label: isolated bd I/O seam for the label query
- render_emitted_records: pure fn over raw JSON list (unit-testable without bd); skips malformed records (fail-fast per record), returns "" for zero records
- summarize_bd_record_with_emits: pure fn composing the step heading + notes + emitted-records block; "" emits block => output identical to c30 (superset/graceful-degrade rule)
- summarize_bd_record delegates to summarize_bd_record_with_emits("") so all existing c30 tests pin unchanged behavior

New tests (all pass):
- render_emitted_records_lists_type_id_description_per_record
- render_emitted_records_empty_list_returns_empty_string
- render_emitted_records_tolerates_missing_optional_fields
- step_with_zero_emitted_records_renders_notes_only_identical_to_c30
- step_with_emitted_records_appends_them_after_notes
- smoke_step_with_emitted_records_surfaces_in_bundle (MILLWORKS_SMOKE=1)

Pre-existing rrp failures (bare_task_only, task_with_persona, non_skill_dir_is_ignored, pruning_occurs_when_over_budget) unchanged.

Declare emits contracts in all 20 content/agents/*.md personas and rewrite Output sections for the 5 roles with non-empty contracts.

Emits mapping applied:
- intake-interviewer -> emits: [intent]
- requirements-analyst -> emits: [requirement]
- plan-reviewer -> emits: [decision]
- architect -> emits: [decision]
- plan-writer -> emits: [task]
- all others (15) -> emits: []

For the 5 non-empty emits personas the Output section is rewritten so that canonical output is structured beads records emitted via millworks-emit, with full prose in each record's --description field. Each rewrite:
- instructs emit per unit of substance (one intent / requirement / decision / task)
- uses --link for domain links between emitted records
- ends with millworks-emit complete --summary as the terminal act
- cross-references the millworks:beads skill for mechanics (DRY)
- preserves the persona's posture and quality voice

For the 15 emits:[] personas only the frontmatter field is added; no body changes (clean audits/reviews find nothing and must still settle).

Verified: cargo test -p persona-picker all 53 tests pass; manual pick check confirms requirements-analyst -> emits:[requirement], all others as mapped.

…truction at dispatch (millworks-ypd)

D44 M-1: inject MILLWORKS_STEP_ID/MILLWORKS_WFRUN_ID into the spawned pane's environment via tmux -e so millworks-emit can stamp provenance without the subagent knowing its own ids.

D44 M-2: always grant Bash(millworks-emit:*) in allowedTools for every workflow-step subagent (least-privilege scoped emit path); mapStepTools now returns string[] (never undefined) with the emit tool always appended.

D44 M-4: generate the output-contract instruction from the dispatched persona's emits (single source: frontmatter → picker → drive loop → dispatch args). buildContractInstruction(emits) returns undefined for empty emits (uniform rule: emits: [] → no instruction, cn8 a clean superset of c30). The real impl in index.ts appends the instruction to the assembled bundle file before spawning.

Widen resolvePersonaViaCli to return { file, emits } (was: string | null). The picker output already contained emits (PickResult.emits); the TypeScript cast at workflow-cli.ts:100 is now widened to include it. Direct persona: references return emits: [] (no picker invoked).

Tests: 8 new unit tests (TDD: watched each fail before implementing); 276 total passing (up from 268); typecheck clean.
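The "never undefined, emit tool always appended" rule for mapStepTools might look roughly like the sketch below. The real signature is not shown in this PR, so the parameter shape is an assumption, as is the de-duplication when a persona already lists the emit tool.

```typescript
// The scoped grant named in the commit message.
const EMIT_TOOL = "Bash(millworks-emit:*)";

// Sketch only: the input shape (an optional list of persona tools) is assumed.
function mapStepTools(personaTools: string[] | undefined): string[] {
  const tools = personaTools ?? [];
  // ASSUMPTION: avoid listing the emit tool twice if the persona declares it.
  if (tools.indexOf(EMIT_TOOL) >= 0) return tools.slice();
  return tools.concat([EMIT_TOOL]);
}
```

Returning a concrete array in every case means the dispatch code never has to distinguish "no allowlist" from "empty allowlist", and every step can reach the emit path.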

…st (millworks-d8q)

D44 M-1/M-2/M-4 on the pi surface (lockstep mirror of Claude ypd):
- buildWrapperEnvExports: injects MILLWORKS_STEP_ID / MILLWORKS_WFRUN_ID as export lines in the subagent wrapper.sh (single-quoted, process-env durable)
- addEmitToolAccess: ensures 'bash' is in the pi --tools allowlist when the persona declares a non-empty emits contract (least-privilege emit path; bash is the closest pi analog to Claude Code's Bash(millworks-emit:*))
- buildContractInstruction: generates the canonical output-contract instruction from the persona emits list (null for empty emits, which degrades to c30); appended to the assembler bundle, not a separate flag
- resolveRoleToPersona: widened from Promise<string> to Promise<PersonaPickResult> = { file, emits } to carry the persona-picker emits output through to dispatch
- dispatchStep: wires all three mechanics; personaEmits drives the conditional tool-access and instruction injection

22 new unit tests (buildContractInstruction, addEmitToolAccess, buildWrapperEnvExports). 150 pass total (was 128); 4 pre-existing MILLWORKS_SMOKE smokes skipped.

…lowedTools to string[] (millworks-ypd review)

…d (millworks-d8q)

Review fixes:
- addEmitToolAccess doc: stop claiming a scoped-PATH security property that isn't implemented. The wrapper inherits full PATH/rc, so the bash grant is full bash; the contract instruction is a behavioral nudge only. Structural per-command scoping is tracked as hardening bead millworks-5wz.
- dispatchStep env injection: replace `state.stepRecords[step.id] ?? ""` silent fallback with a hard throw. An empty MILLWORKS_STEP_ID would make millworks-emit mis-attribute/fail silently (project fail-fast rule).

150 tests pass (pre-existing ambient.d.ts glob false-failure tracked as millworks-7s4).
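A minimal sketch of the two pi-side rules discussed here, single-quoted export lines and a hard throw instead of an empty-string fallback, is shown below. The function signature is assumed (the source only names buildWrapperEnvExports and says the throw lives in dispatchStep), and `shellSingleQuote` is a hypothetical helper.

```typescript
// Quote a value for POSIX sh: wrap in single quotes and rewrite any embedded
// single quote as '\'' (close quote, escaped quote, reopen quote).
function shellSingleQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Sketch with an assumed signature. Throwing on a missing id mirrors the
// review fix: an empty MILLWORKS_STEP_ID must never reach the wrapper.
function buildWrapperEnvExports(
  stepId: string | undefined,
  wfrunId: string | undefined,
): string[] {
  if (!stepId) throw new Error("no beads STEP id recorded for this step");
  if (!wfrunId) throw new Error("no beads WFRUN id recorded for this run");
  return [
    `export MILLWORKS_STEP_ID=${shellSingleQuote(stepId)}`,
    `export MILLWORKS_WFRUN_ID=${shellSingleQuote(wfrunId)}`,
  ];
}
```

Single quotes stop the shell from expanding anything inside the value, so the ids reach millworks-emit exactly as the runtime recorded them.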

1. plan-writer: phase->phase ordering link example used the wrong link type (until is task->decision). Changed to blocks:<phase-task-id> and fixed the comment to describe phase ordering, not a decision gate.
2. decompile-synthesizer: converted its record-writing from raw bd create to millworks-emit emit (risk + decision). bd create bypasses the auto-stamp of step:/wfrun:/discovered-from, leaving records unattributed and invisible to assembler expansion. millworks-emit is the only granted, attributed write path. provenance:decompiled (a label, not supported by the emit CLI surface) folded into the decision --description prose. bd remember stays as a direct bd call (free-text memory, not a step-output record). No required-records language added; persona remains emits:[].
3. plan-reviewer: completion-summary template used man-page optional-bracket notation [, <N> risk] that an LLM might emit literally; rewritten as plain prose.

Re-verified: cargo test -p persona-picker (53 + 8 tests) green; the three edited personas parse with expected emits (task / decision / []).

…lworks-2qe review)

Addresses fail-fast review findings (project rule: never silence errors):
1. run_bd_show no longer swallows a bd list COMMAND failure into "zero records" (notes-only). run_bd_list_by_label's Err now propagates via `?`. A command that SUCCEEDS but lists zero records still degrades silently to notes-only (legitimate D-e graceful-degrade), distinguished from a real command failure.
2. render_emitted_records now returns Result<String> and FAILS FAST on malformed bd output (non-empty non-JSON, non-array, or a record missing a required `id`/`title` field) via new AssemblerError::MalformedRecord, instead of silently dropping bad records. A valid empty array `[]` and an empty/blank input string remain the legitimate Ok("") degrade path.
3. summarize_bd_record_with_emits now returns Result<Option<String>> so the malformed-record error propagates through the seam. summarize_bd_record keeps its Option surface for the c30 tests (empty emits can't be malformed).
4. Fixed the contradictory doc comment that claimed "skips malformed ones, keeps valid ones"; it now describes the actual fail-fast behavior.
5. main.rs maps MalformedRecord to exit code 2 (bad-data class).

New tests (all pass):
- render_emitted_records_fails_fast_on_malformed_json
- render_emitted_records_fails_fast_on_record_missing_required_field
- summarize_propagates_malformed_emits_error
- render_emitted_records_empty_list_returns_empty_string (now asserts Ok(""))

Test results: 31 pass, 4 pre-existing rrp failures unchanged, 1 smoke (MILLWORKS_SMOKE=1) ignored by default and passing against live bd.
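The distinction drawn in this review (empty input degrades quietly, malformed input fails loudly) can be sketched as a pure function. This is a TypeScript illustration, not the assembler's Rust; the JSON field names `type` and `description`, the output layout under the heading, and the `MalformedRecordError` class are assumptions.

```typescript
// Hypothetical stand-in for AssemblerError::MalformedRecord.
class MalformedRecordError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "MalformedRecordError";
  }
}

// Pure function over the raw output of the label query.
function renderEmittedRecords(raw: string): string {
  // Blank input is the legitimate notes-only degrade path.
  if (raw.trim() === "") return "";

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    throw new MalformedRecordError("bd list output is not JSON");
  }
  if (!Array.isArray(parsed)) {
    throw new MalformedRecordError("bd list output is not a JSON array");
  }
  // A valid empty array is also the degrade path, not an error.
  if (parsed.length === 0) return "";

  const lines: string[] = ["#### Emitted Records"];
  for (const item of parsed) {
    if (typeof item !== "object" || item === null) {
      throw new MalformedRecordError("record is not an object");
    }
    const rec = item as Record<string, unknown>;
    const id = rec["id"];
    const title = rec["title"];
    // `id` and `title` are required; a record without them fails the render.
    if (typeof id !== "string" || typeof title !== "string") {
      throw new MalformedRecordError("record is missing a required id/title field");
    }
    // ASSUMPTION: `type` and `description` are optional fields with these names.
    const type = typeof rec["type"] === "string" ? (rec["type"] as string) : "unknown";
    lines.push(`- [${type}] ${id}: ${title}`);
    if (typeof rec["description"] === "string" && rec["description"] !== "") {
      lines.push(`  ${rec["description"] as string}`);
    }
  }
  return lines.join("\n");
}
```

Keeping "nothing emitted" and "output unreadable" on separate paths is what lets a clean step render exactly as it did under c30 while a corrupted bundle is stopped before it reaches an agent.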

… runtime closes (millworks-kaa)

STATE MACHINE (lockstep with Claude q2h):
- marker=YES → validate emits → SETTLED (runtime writes outcome:success)
- marker=YES + contract unmet → EmitsContractError → retry path (no false success)
- marker=NO + pane dead → crashed → existing retry/fail path
- marker=NO + pane alive → still running (interruption is not a failure)
- timeout + no marker → TIMEOUT → retry path

CHANGES:
- waitForSettle: reworked to poll bdHasMarker (beads is settle AUTHORITY); transcript/done-file/pane demote to HEALTH inputs. Injectable WaitForSettleDeps for deterministic unit testing (DI seam per millworks-n0f intent).
- validateEmitsContract: validate-then-commit before any outcome:success write. Throws EmitsContractError for missing required types (fail-fast, never silent).
- markStepSettled: calls validateEmitsContract BEFORE writing outcome:success (the sole-writer invariant; agent never writes terminal state, per D44 D-g).
- stepProduced removed from processReadyStep: agent's `millworks-emit complete` already sets STEP notes; runtime must not overwrite them (inc5 notes-write removed).
- buildContractInstruction: ALWAYS returns the completion instruction for ALL steps (universal-completion); emit-types requirement APPENDED only when emits non-empty. COMPLETION_INSTRUCTION constant exported for lockstep verification.
- addEmitToolAccess: bash granted for ALL steps (not conditioned on emits.length). Every step needs millworks-emit complete access (the universal settle signal).
- bdHasMarker + bdCountEmittedByType: new bd helpers for marker poll and validation.
- drainSessionFile: extracted from old waitForSettle for progress/health use.
- StepResult.personaEmits: new field threads persona emits from dispatchStep to markStepSettled for post-settle validation.
- adoptStep: updated to use new waitForSettle + bdHasMarker; cwd added to signature.

PI-SPECIFIC vs q2h:
- bash granted (not scoped Bash(millworks-emit:*)) per accepted d8q decision (5wz tracks scoping hardening).
- Recovery paths pass personaEmits:[] (1i7 follow-up).

TESTS: 24 new unit tests (COMPLETION_INSTRUCTION lockstep, buildContractInstruction universal-completion x5, waitForSettle state matrix x7, validateEmitsContract x3, 2 gated real-bd smoke tests: settle-by-marker round-trip + fail-fast on unmet contract). Total: 174 pass, 8 skipped (4 new gated smokes). Only ambient.d.ts pre-existing fails.
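The validate-then-commit invariant is compact enough to sketch. The version below is a simplified, synchronous TypeScript illustration: the `countByType` callback stands in for a helper such as bdCountEmittedByType, and `writeOutcome` for the runtime's beads write, both hypothetical in shape.

```typescript
// Raised when a required emits type has zero records for the step.
class EmitsContractError extends Error {
  readonly missing: string[];
  constructor(missing: string[]) {
    super(`emits contract unmet: no records of type ${missing.join(", ")}`);
    this.name = "EmitsContractError";
    this.missing = missing;
  }
}

// Validate: every required type must have at least one emitted record.
function validateEmitsContract(
  required: string[],
  countByType: (type: string) => number, // ASSUMED stand-in for a bd query
): void {
  const missing = required.filter((type) => countByType(type) === 0);
  if (missing.length > 0) throw new EmitsContractError(missing);
}

// Commit: the runtime is the only writer of the terminal outcome, and it
// writes only after validation has passed.
function markStepSettled(
  required: string[],
  countByType: (type: string) => number,
  writeOutcome: (label: string) => void, // ASSUMED stand-in for the beads write
): void {
  validateEmitsContract(required, countByType); // throws before any write
  writeOutcome("outcome:success");
}
```

Because the throw happens before the write, there is no code path in which a step with an unmet contract is recorded as successful; an empty contract passes trivially, which is how emits: [] steps still settle.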

…emits → runtime closes

State machine (beads-authoritative, D44 D-f/D-g):
- marker=YES + emits met → runtime writes outcome:success (validate-then-commit)
- marker=YES + type missing → contract-violation → step failure (no false success ever written)
- marker=NO + pane dead → crashed → retry/re-dispatch
- marker=NO + pane alive → still running (interruption is NOT a settle)
- elapsed >= timeout → step failure (backstop for never-signaling agent)

Key changes:
- settle.ts: beads-authoritative state machine (pollSettleMarker, waitForMarker) with full DI seam; pane/transcript demotes to HEALTH input only
- workflow.ts: buildContractInstruction always returns completion instruction (universal, not conditioned on emits); emits types appended only when non-empty; acceptStep validates emits contract BEFORE writing outcome:success (validate-then-commit); inc5 notes-write removed (agent's millworks-emit complete --summary sets notes, runtime does NOT overwrite)
- workflow.ts: StepResult gains emits:[] field (for validate-then-commit routing); rebuildRunState, recovery paths, and tests updated to include emits:[]
- bd.ts: validateStepEmits added (bd list --label step:<id> --type T for each required type)
- index.ts: validateEmits wired into controllerDeps via validateStepEmits
- settle.marker.test.ts + workflow.settle.test.ts: unit coverage of all 5 state transitions
- settle.marker.smoke.test.ts: gated real-bd round-trip (MILLWORKS_SMOKE=1)
- Completion instruction string byte-matches pi mirror (millworks-kaa) exactly

Fixed gaps left by prior agent (tsc --noEmit failures):
- server.test.ts: 2x StepResult object literals missing emits field
- workflow.substitute.test.ts: settled() helper missing emits field
- workflow.recovery.test.ts: expected StepResult missing emits field
- workflow.ts:rebuildRunState: constructed StepResult missing emits field
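The universal-completion rule for buildContractInstruction can be sketched as below. The instruction wording here is a placeholder written for illustration; the actual COMPLETION_INSTRUCTION text and emit-types sentence are not quoted in this PR, and only the shape (always the completion instruction, types appended when non-empty) comes from the description above.

```typescript
// PLACEHOLDER wording: the real constant's text is not shown in this PR.
const COMPLETION_INSTRUCTION =
  "When your work is done, run `millworks-emit complete --summary <text>` as your final act.";

// Every step gets the completion instruction; the emit-types requirement is
// appended only when the persona declares a non-empty emits contract.
function buildContractInstruction(emits: string[]): string {
  if (emits.length === 0) return COMPLETION_INSTRUCTION;
  // PLACEHOLDER wording for the appended requirement.
  const requirement =
    `You MUST also emit at least one record of each type: ${emits.join(", ")}.`;
  return `${COMPLETION_INSTRUCTION}\n${requirement}`;
}
```

Exporting the constant from one place on each surface is what makes a byte-for-byte lockstep check between the two runtimes a simple string comparison.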

… kaa

1. [CRITICAL] Wire waitForMarker into production: buildController.dispatch now uses waitForMarker (beads-authoritative) for workflow steps (stepBeadsId provided). The settle AUTHORITY is the self-report:complete label polled from beads; pane/transcript demotes to health. Ad-hoc dispatch_subagent (no stepBeadsId) keeps transcript-based waitForSettle, so there is no regression.
   - Add bdHasMarker + bdReadNotes to bd.ts
   - Import waitForMarker + BeadsSettleState in index.ts
   - Thread stepBeadsId + stepEmits through WorkflowDeps.dispatch args
   - Override deps.wait per-dispatch with marker-poll lambda for workflow steps
   - Read agent notes from beads after marker-settle resolves
2. [CRITICAL] Remove inc5 notes-write: stepProduced no longer called from dispatchStepWithRetry or processAdoptedOutcome. Notes come from the agent's millworks-emit complete --summary call. Update all tests accordingly.
3. [IMPORTANT] Align buildContractInstruction to kaa byte-for-byte:
   - Add COMPLETION_INSTRUCTION constant (exported, lockstep with kaa)
   - Reorder: completion instruction FIRST, emit-types appended after
   - New emit-types wording: "MUST also emit..." + env trailer
   - Update workflow.settle.test.ts and workflow.drive.test.ts assertions
4. [IMPORTANT] Fix timeout-before-marker ordering in pollSettleMarker: elapsed >= timeout is now checked BEFORE the marker (matching kaa's waitForSettle). Add test proving timeout wins over a present marker.
5. [MINOR] Remove dead paneCheckEvery field from WaitMarkerDeps (the loop body was an empty comment; remove unused field + loop counter variable). Update settle.marker.test.ts to drop the field from all test objects.
6. [MINOR] Route validateEmits bd-errors to step-failure path at all three acceptStep call sites (driveWorkflow, applyGateAndResume, processAdoptedOutcome) so a transient bd throw never propagates uncaught to the MCP caller.

Tests: 310 passed (up from pre-existing 300), 13 skipped. Known failures: index.integration.test.ts (esbuild not in worktree) + ambient.d.ts (no suite).
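The settle decision, including the timeout-before-marker ordering from item 4, can be condensed into one pure function. This sketch is a simplification: in the PR the marker poll and the contract validation live in separate functions, while here a single `contractMet` flag stands in for validation, and the type and field names are hypothetical.

```typescript
type SettleDecision =
  | "timeout"             // backstop for an agent that never signals
  | "settled"             // marker present and contract met
  | "contract-violation"  // marker present but a required type is missing
  | "running"             // no marker, pane alive: an interruption is not a settle
  | "crashed";            // no marker, pane gone

interface SettleInputs {
  elapsedMs: number;
  timeoutMs: number;
  markerPresent: boolean; // self-report:complete label seen in beads
  contractMet: boolean;   // simplified stand-in for emits validation
  paneAlive: boolean;     // health input only, never the settle trigger
}

function decideSettle(input: SettleInputs): SettleDecision {
  // Timeout is checked first, so it wins even over a present marker.
  if (input.elapsedMs >= input.timeoutMs) return "timeout";
  // Beads is the authority: only the marker can lead to "settled".
  if (input.markerPresent) {
    return input.contractMet ? "settled" : "contract-violation";
  }
  // Without a marker the pane only tells us about health.
  return input.paneAlive ? "running" : "crashed";
}
```

Note that a dead pane can never produce "settled" and a live pane can never block it: the pane state is consulted only when the marker is absent.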

…lockstep

Final lockstep divergence on the settle path: a contract violation (marker present, but a required emits type has 0 records) was mapped to status `errored` → markStepFailed (PERMANENT fail, no retry). kaa + the D44 design route a contract violation to the EXISTING RETRY PATH (re-dispatch up to max-retries, then outcome:failed).

Fix:
- Add a distinct `contract-violation` DispatchOutcome status (workflow.ts) so ONLY contract violations get kill-then-retry; a genuine `errored` (pane alive, no marker, wait failed) keeps its non-retryable behavior.
- Add WorkflowDeps.killStepPane({wfrunBeadsId, stepId}), which kills the lingering subagent pane before a re-dispatch so it can't double-spawn (mirrors kaa's killOrphanedPanes-before-retry). Production impl (index.ts) looks up the tagged SubagentRecord and calls realTmux.kill (idempotent).
- dispatchStepWithRetry: on `contract-violation`, killStepPane then retryOrFail (the retryable path) instead of markStepFailed. validate-then-commit holds: no outcome:success is ever written for a violation.
- index.ts marker-wait: capture failed-contract in a closure flag and return an `exited` sentinel from the wait (no throw → not mis-recorded as `errored`), then override the DispatchOutcome to `contract-violation` after dispatchSubagent returns. This distinguishes it from genuine errors through dispatchSubagent's fixed status vocabulary.
- Add killStepPane to all 8 WorkflowDeps test fakes.
- New tests (workflow.settle.test.ts): contract-violation re-dispatches up to max-retries then succeeds (proves retryable + pane killed before retry); and exhausts retries → outcome:failed with pane killed each attempt and NO false success (validate-then-commit invariant preserved).

Tests: 312 passed (up from 310), 13 skipped. Known failures only: index.integration.test.ts (esbuild not in worktree) + ambient.d.ts (no suite).
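The kill-then-retry routing described in this fix can be sketched as a small loop. The version below is synchronous for brevity (the real drive loop is presumably async), and the `RetryDeps` shape and function name are hypothetical; only the behavior (violations retry with the pane killed first, genuine errors do not retry, success is never written for a violation) is taken from the description.

```typescript
type DispatchStatus = "settled" | "contract-violation" | "errored";
type StepOutcome = "outcome:success" | "outcome:failed";

interface RetryDeps {
  dispatch: () => DispatchStatus; // one dispatch attempt, reduced to its status
  killStepPane: () => void;       // idempotent kill of the lingering pane
  maxRetries: number;             // re-dispatches allowed after the first attempt
}

function dispatchWithRetry(deps: RetryDeps): StepOutcome {
  for (let attempt = 0; attempt <= deps.maxRetries; attempt++) {
    const status = deps.dispatch();
    if (status === "settled") return "outcome:success";
    // A genuine error keeps its non-retryable behavior.
    if (status === "errored") return "outcome:failed";
    // Contract violation: kill the pane first so a re-dispatch cannot
    // double-spawn, then loop to retry (or fall through when exhausted).
    deps.killStepPane();
  }
  return "outcome:failed";
}
```

Giving the violation its own status is the key design choice: it lets the retry policy differ from `errored` without changing what either status means elsewhere.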

…s-kaa)

…works-q2h)

…(millworks-1i7)

Fixes the gap left by kaa: recovered steps were passing `personaEmits: []` through all three recovery paths (gate-after, reconcile/adoptStep, pending-validation) which caused validate-then-commit to auto-pass for any step restarted after a crash. Recovery now RE-RESOLVES emits via the same resolveRoleToPersona path that dispatch uses, and re-validates the contract before writing outcome:success.

State-machine additions (pi surface, lockstep with Claude 1i7):
- `BeadsStepRecovery.hasSelfReportComplete`: detected from bd labels in recoveryViewFromRecords; true when STEP open + self-report:complete (crash in the validation window).
- `ResumePlan.pending-validation`: new plan kind (STEP open + marker present). Takes priority over reconcile (agent finished; pane may be gone). driveRun re-resolves emits, then passes a StepResult to processReadyStep which calls markStepSettled → validateEmitsContract → outcome:success (or fails/retries).
- `resolveStepEmits()`: helper that mirrors dispatchStep's resolution path; throws UnrecoverableRunError when the persona/role cannot be resolved (fail the run, same transient-vs-malformed split as inc5).
- `adoptStep()`: now calls resolveStepEmits before entering waitForSettle so the re-resolved emits flow into the returned StepResult. Removes the 1i7 follow-up comment (gap is now closed).
- driveRun gate-after path: also re-resolves emits instead of passing [].

Tests (in-source vitest):
- recoveryViewFromRecords: 3 new tests pinning hasSelfReportComplete detection.
- planResume: 3 new tests: pending-validation produced for open+marker; priority over plain reconcile; false positive excluded (marker absent → reconcile).
- All 174 pre-existing tests still pass (186 total, +12 new).

…ude surface)

Extends inc5 beads-authoritative recovery with the millworks-1i7 contract: a STEP that carried `self-report:complete` but was not yet closed (crash in the validate-then-close window) is now re-validated on recovery, never auto-passed via emits:[]. A running step with a live pane (no marker) now carries re-resolved emits into the adopted waitForMarker.

Recovery state machine (lockstep with pi):
- STEP closed outcome:success/failed → terminal (unchanged, inc5).
- STEP open + self-report:complete → PENDING VALIDATION: re-resolve persona emits via deps.resolvePersona, validateEmits, acceptStep (success) or markStepFailed (contract violation). No pane adoption.
- STEP open + no marker + live pane → adopt, carrying re-resolved emits into waitForMarker (not []). Re-dispatch if pane gone.
- After-gate recovery: reconstructGate re-resolves persona emits so gate_approve validates the real contract (not auto-passing).

FAIL-FAST: unresolvable persona/emits propagates as a transient error.

Changes:
- workflow.ts: BeadsStepRecovery gains markerPresent; RunState gains pendingValidationStepIds + pendingValidationOutputs; rebuildRunState populates both; resumeRecoveredRun handles all three recovery shapes; reconstructGate is now async + re-resolves emits; adoptStep interface adds stepEmits.
- run-tracker.ts: loadRecovery sets markerPresent from SELF_REPORT_COMPLETE.
- run-tracker.testing.ts: recoveryView() includes markerPresent: false.
- index.ts: adoptStep uses waitForMarker with re-resolved stepEmits.
- Tests: 11 new unit + controller-level tests; inc5 recovery tests extended.
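The recovery state machine listed above reduces to a small classification over what beads and tmux report for a step. The sketch below uses hypothetical type and field names and leaves out the after-gate case, which depends on run-level state rather than a single step.

```typescript
type RecoveryPlan = "terminal" | "pending-validation" | "adopt" | "re-dispatch";

interface RecoveredStep {
  closed: boolean;        // STEP already closed with outcome:success/failed
  markerPresent: boolean; // self-report:complete label found in beads
  paneAlive: boolean;     // a live pane still exists for the step
}

function planRecovery(step: RecoveredStep): RecoveryPlan {
  // A closed STEP is definitive; nothing to redo.
  if (step.closed) return "terminal";
  // Open + marker: the agent finished, but the runtime crashed in the
  // validate-then-close window. Re-validate; do not adopt a pane.
  if (step.markerPresent) return "pending-validation";
  // Open + no marker: adopt a live pane, otherwise dispatch again.
  return step.paneAlive ? "adopt" : "re-dispatch";
}
```

Checking the marker before the pane is what gives pending-validation priority: a finished agent whose pane happens to linger is validated rather than waited on again.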

…nsient retry), lockstep with pi

Lockstep divergence fix on the Claude side. When recovery re-resolves a recovered step's persona emits and resolvePersona FAILS, the prior code let the error propagate as transient, so it was effectively retried next session. A deterministic resolution failure (the role no longer resolves) would strand the run open forever, worse than a loud fail. pi's resolveStepEmits throws UnrecoverableRunError; D44's fail-fast intent is "fail the run, don't auto-pass".

Changes (workflow.ts):
- New resolveRecoveredEmits helper: wraps deps.resolvePersona; any failure throws UnrecoverableRunError (the inc5 malformed-recovery path), carrying the original cause + step id. Used at all three re-resolution sites.
- resumeRecoveredRun (pending-validation + no-marker adopt paths): use the helper. A persona failure now fails the run (runDrive closes the WFRUN failed), not a silent retry.
- reconstructGate (after-gate): use the helper; a paused:after:<stepId> for a step absent from the re-parsed workflow is also UnrecoverableRunError.
- doRecover: the gate-pause branch now catches UnrecoverableRunError from reconstructGate, closes the WFRUN failed, and starts clean. currentRun is armed ONLY after a successful reconstruction (no half-built armed run). A transient bd/CLI blip still propagates (run left open, retryable).

Tests:
- resume: persona-unresolvable on a no-marker step and on a pending-validation step both reject with UnrecoverableRunError (was: generic throw).
- controller-recovery: mid-step (no-marker) and after-gate recovered steps whose persona can't be re-resolved close the WFRUN outcome:failed (not left open); the controller stays clean and a fresh run can start.
- substitute test: RunState literal gains the two new recovery fields.

npm test: 326 passed, 13 skipped (only the pre-existing esbuild integration failure). tsc --noEmit: clean.
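The helper's job, turning any persona-resolution failure during recovery into a run-failing error that keeps the original cause and step id, can be sketched like this. It is synchronous for brevity and the resolver's shape is assumed; the class below is an illustrative stand-in for the project's UnrecoverableRunError.

```typescript
// Illustrative stand-in: carries the step id and the original failure.
class UnrecoverableRunError extends Error {
  readonly stepId: string;
  readonly original: unknown;
  constructor(stepId: string, original: unknown) {
    super(`cannot re-resolve persona emits for step ${stepId}`);
    this.name = "UnrecoverableRunError";
    this.stepId = stepId;
    this.original = original;
  }
}

// Wrap the resolver so a failure fails the run instead of being treated as
// a transient blip (which would strand the run open) or as an empty contract
// (which would auto-pass validation).
function resolveRecoveredEmits(
  stepId: string,
  resolvePersona: () => { emits: string[] }, // ASSUMED resolver shape
): string[] {
  try {
    return resolvePersona().emits;
  } catch (err) {
    throw new UnrecoverableRunError(stepId, err);
  }
}
```

Routing every re-resolution site through one wrapper keeps the policy in a single place, which is what lets both surfaces agree on when a run is unrecoverable.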

…teps (millworks-1i7)

…en steps (millworks-1i7)

…illworks-6q0)

/millworks:init registers beads custom types via the Rust `millworks init` binary (init.rs), NOT recipes/init-beads.sh. 6q0 updated the recipe but missed init.rs:129, so `requirement` never registered at workflow runtime; this was caught during cn8 live verification (26e). Extract the list to CUSTOM_BEADS_TYPES (incl. requirement), add a regression test covering this binary path.

richardkiene added 30 commits June 6, 2026 18:10

feat(cn8): millworks-emit scoped attributed-write CLI (millworks-thz)

a06a41b

feat(cn8): parse persona 'emits' frontmatter (millworks-40a)

4ce99df

feat(cn8): register 'requirement' custom beads type (millworks-6q0)

9e6652e

docs(cn8): beads skill emit-as-canonical-output mechanics (millworks-clb)

a95cb5e

chore(beads): sync export after Phase-A closes (thz/40a/clb/6q0)

b805e3f

feat(cn8): persona emits contracts + body rewrites (millworks-kma)

0620990

feat(cn8): assembler step->records expansion (millworks-2qe)

cccb992

feat(cn8): Claude dispatch env+contract+emit-allowlist (millworks-ypd)

ca396ac

feat(cn8): pi dispatch env+contract+emit-allowlist (millworks-d8q)

da17592

refactor(claude): drop dead finalSystemPrompt var; narrow dispatch allowedTools to string[] (millworks-ypd review)

33982bd

fix(cn8): 2qe fail-fast on bd-list failure + malformed records

b22f93d

fix(cn8): kma link-type + bd-create→millworks-emit

245afed

fix(cn8): ypd dead-code + type cleanups

74d344a

fix(cn8): d8q honest comment + fail-fast stepBeadsId throw

4383ad1

richardkiene added 13 commits June 6, 2026 21:50

feat(cn8): pi settle authority flip — marker→validate→close (millworks-kaa)

7efc480

feat(cn8): Claude settle authority flip — marker→validate→close (millworks-q2h)

bca839a

feat(cn8): pi recovery re-resolves emits + re-validates marker-seen steps (millworks-1i7)

443aee3

chore(beads): sync export after 1i7 recovery merges

3b6427e

feat(cn8): Claude recovery re-resolves emits + re-validates marker-seen steps (millworks-1i7)

56238cf

docs(cn8): ADR-0009 D44 as-built (11 beads landed; deferred 26e/5wz/qaq)

52c5542

richardkiene wants to merge 43 commits into main from feat/cn8-structured-records
