feat(orchestration): v2 foundation — durable state, lifecycle, status/resume commands (PR 1 of 3) by kelsonpw · Pull Request #689 · amplitude/wizard

kelsonpw · 2026-05-09T15:10:24Z

TL;DR

Unique to this PR: introduces the durable orchestration store (OrchestrationStore), Task lifecycle state machine, and six new read-only inspection commands (tasks, task, sessions, session, resume, orchestration status) — all behind Zod-validated --json envelopes.
Depends on: nothing — this is the v2 base PR. Stacks under feat(orchestration): v2 PR 2 — user-choice + manual-verification checkpoints + MCP-app lifecycle (stacks on #689) #690 → feat(orchestration): v2 PR 3 — TUI integration + MCP-server tool parity + perf hot-paths + resilience (stacks on #690) #691 → feat(orchestration): v2 PR 4 — widen wiring + retire WizardSession redundancy + supervisor + live status refresh (stacks on #691) #693 → feat(tui): v2 PR 5 — full screen-tree redesign + IA + render-cost teardown (stacks on #693) #696.
Most load-bearing test: src/lib/orchestration/__tests__/lifecycle.test.ts (transition matrix; illegal transitions throw IllegalTaskTransitionError rather than corrupting state).
If you read one section, read this one: "State model" + "Task lifecycle" + "Schemas" below — the rest of the v2 stack is built on these contracts.

Problem

Wizard's WizardSession (src/lib/wizard-session.ts) is the de-facto orchestration state, but it's an in-memory snapshot held in a single process — there's no durable, machine-readable surface for "what's running, what stopped, what's the resume command." Outer agents that wrap the wizard scrape TUI text or grep log.ndjson. Status JSON exists for a few specific commands (wizard status, wizard projects list) but each shape is ad-hoc — there is no unified envelope.

Task lifecycle is implicit in the tasks array (booleans done / error). Last-stopping-point is partially captured by session-checkpoint.ts but only stores intro-phase state, not active task ownership of branches/PRs/worktrees.

Why this is "PR 1 of 3"

This PR introduces a single durable orchestration store that becomes the source of truth for sessions, tasks, subagents, ownership, last-stopping-point, and structured task results.

Foundation only. The TUI redesign, the user-choice / verification / MCP-app lifecycle plumbing, and the MCP-server tool surface deferred to PRs 2 and 3:

PR 1 (this PR) — durable schema, lifecycle, store, six new read CLI commands, last-stopping-point derivation. Legacy WizardSession stays live alongside.
PR 2 — widens PendingCheckpoint with concrete schemas, routes existing user-choice / event-plan-confirm / MCP-app prompt sites through the store. The pendingChoices / pendingMcpActions / pendingManualVerifications arrays start carrying real content.
PR 3 — TUI v2 reads from the store as its source of truth, retires duplicate state, surfaces the new MCP-server read-only tools so outer coding agents can call them as typed tools instead of parsing CLI stdout.

Architecture is documented in docs/orchestration.md.

State model

                 ┌────────────────────────────────────────┐
                 │             Session                    │
                 │  id, installDir, status, goal, branch  │
                 └─────────────┬──────────────────────────┘
                               │ 1..n
                               ▼
                 ┌────────────────────────────────────────┐
                 │              Task                      │
                 │ id, sessionId, parentTaskId,           │
                 │ label, state, ownership[], result      │
                 └─────────┬───────────────────┬──────────┘
                           │ 1..n              │ 1..n
                           ▼                   ▼
            ┌──────────────────────┐  ┌──────────────────┐
            │      Subagent        │  │   Ownership      │
            │ kind, rootTaskId     │  │ kind, …          │
            └──────────────────────┘  └──────────────────┘

Task lifecycle

  ┌──────────┐        ┌──────────┐        ┌────────────┐
  │  queued  │──────► │ running  │──────► │ completed  │ (terminal)
  └──────────┘        └────┬─────┘        └────────────┘
       │                   ├──► waiting_for_user ◄──┐
       │                   ├──► blocked          ◄──┤
       └─► cancelled       └──► failed              │
                                                    │
   any non-terminal  ──────► superseded ────────────┘

Implemented in src/lib/orchestration/lifecycle.ts. The assertTransition(taskId, from, to) helper is the trust boundary — illegal transitions throw IllegalTaskTransitionError rather than corrupt persisted state.

New commands

Command	Purpose
`wizard tasks`	List tasks; filter with `--state`, `--session-id`
`wizard task <id>`	Inspect a single task
`wizard sessions`	List sessions
`wizard session <id>`	Inspect a session and its tasks
`wizard resume <session-id>`	Print (or run with `--execute`) the resume command
`wizard orchestration status`	Print the `LastStoppingPoint` snapshot

Every command supports --json (auto-enabled when stdout isn't a TTY) and validates its JSON payload against a Zod envelope schema before writing. A regression in the producer surfaces as a thrown ZodError on the producer side rather than silent corruption downstream.

Example human output

$ amplitude-wizard orchestration status
Store: /Users/me/.amplitude/wizard/runs/3d8f2a1b9c4e/orchestration.json (generated 2026-05-09T12:34:56.789Z)
Active session: session_a3f9e7c2d1b48a09f5c6
Goal:           set up Amplitude in nextjs
Branch:         feat/amplitude-setup
Active tasks:           1
Stopped tasks (24h):    0
Recently completed:     2

Next action: A task is waiting for user input: review the proposed events.json.
Resume:      amplitude-wizard --install-dir /Users/me/myapp

Example JSON envelope (`wizard orchestration status --json`)

{
  "v": 1,
  "type": "orchestration_status",
  "generatedAt": "2026-05-09T12:34:56.789Z",
  "installDir": "/Users/me/myapp",
  "storePath": "/Users/me/.amplitude/wizard/runs/3d8f2a1b9c4e/orchestration.json",
  "storeExists": true,
  "lastStoppingPoint": {
    "generatedAt": 1715258096789,
    "currentSessionId": "session_a3f9e7c2d1b48a09f5c6",
    "currentGoal": "set up Amplitude in nextjs",
    "currentBranch": "feat/amplitude-setup",
    "currentWorktree": "/Users/me/myapp",
    "activeTasks": [
      {
        "id": "task_b1c2d3e4f5a6b7c8d9e0",
        "sessionId": "session_a3f9e7c2d1b48a09f5c6",
        "label": "event plan confirmation",
        "state": "waiting_for_user",
        "ownership": [],
        "subagentKind": "instrumentation",
        "createdAt": 1715258090000,
        "updatedAt": 1715258091000,
        "startedAt": 1715258090500,
        "waitingFor": {
          "id": "cp_event_plan",
          "kind": "event_plan_confirm",
          "summary": "review the proposed events.json",
          "enteredAt": 1715258091000
        }
      }
    ],
    "stoppedTasks": [],
    "recentlyCompletedTasks": [],
    "relevantOwnership": [],
    "pendingChoices": [
      {
        "id": "cp_event_plan",
        "kind": "event_plan_confirm",
        "summary": "review the proposed events.json",
        "enteredAt": 1715258091000
      }
    ],
    "pendingMcpActions": [],
    "pendingManualVerifications": [],
    "nextAction": {
      "kind": "await_user_choice",
      "description": "A task is waiting for user input: review the proposed events.json.",
      "command": ["amplitude-wizard", "--install-dir", "/Users/me/myapp"]
    },
    "resumeCommand": "amplitude-wizard --install-dir /Users/me/myapp"
  }
}

Schemas

All persisted shapes have a runtime Zod validator in src/lib/orchestration/schemas.ts. The on-disk envelope carries an explicit version: 1 literal — a version-mismatched file returns kind: 'corrupt' from loadStore so readers can distinguish "no store yet" from "found a store but couldn't parse it." See docs/orchestration.md for the full schema reference.

Exit codes

The new commands extend the existing ExitCode contract — see docs/exit-codes.md:

Code	When
0	command succeeded
1	unexpected error reading the store / serialising output
2	invalid args — bad `--state` value, malformed `task_<id>` / `session_<id>` prefix, unknown id
130	SIGINT during the command

These commands are read-only and never trigger auth flows or network calls, so codes 3 / 4 / 10 / 11 / 12 / 13 / 20 are not reachable.

Performance / cost

Each task transition triggers ≤ 1 atomic write of a small JSON file (temp-file + rename via atomicWriteJSON, mode 0o600). A typical wizard run touches a few hundred transitions; the orchestration file therefore stays well under the I/O budget the wizard already spends on runs/<hash>/log.ndjson writes. PR 3 will add a debounced in-memory cache for high-frequency call sites if needed.

Backward compatibility

WizardSession is untouched. Every existing call site continues to read from / write to the in-memory snapshot.
The new orchestration store is mirrored from a single high-leverage hook (session start in src/run.ts). Mirror failures are logged and swallowed — they cannot block the wizard run.
All existing CLI flags, env vars, exit codes, NDJSON envelope (v: 1), and MCP server behavior are preserved.
One new hidden yargs shadow flag (--cache-dir) was added so .strict() accepts the existing AMPLITUDE_WIZARD_CACHE_DIR env-var auto-mapping on every command (the env var was already read by src/utils/storage-paths.ts:getCacheRoot; this PR just unblocks .strict()).

Tests added (48 total, all passing)

src/lib/orchestration/__tests__/lifecycle.test.ts — 10 tests; exhaustive transition matrix, identity rejection, terminal-state outbound rejection, illegal-transition error message contract.
src/lib/orchestration/__tests__/schemas.test.ts — 14 tests; round-trip tests for Task, Session, Subagent, Ownership, TaskResult, OrchestrationStoreFile, StatusEnvelope. Negative cases for malformed ids, unknown lifecycle states, broken discriminated unions, wrong version literal.
src/lib/orchestration/__tests__/store.test.ts — 11 tests; create/list/transition round-trips, illegal-transition rejection, terminal result stamping, idempotent ownership, atomic-write durability (a thrown write leaves the prior file intact and orphan-free), corrupt store handling for invalid JSON and schema mismatch.
src/lib/orchestration/__tests__/last-stopping-point.test.ts — 5 tests; empty store snapshot, populated-store grouping, auth-blocked → fix_auth next-action, 24-hour window expiry, ownership aggregation.
src/commands/__tests__/orchestration.test.ts — 8 CLI smoke tests spawning the real bin.ts against a tmp install dir. Validates JSON output of every new command against its Zod envelope schema. Checks exit codes for both happy-path and error cases (bad state filter, nonexistent ids).

Test plan

pnpm exec vitest run --pool=forks --maxWorkers=1 src/lib/orchestration/ src/commands/__tests__/orchestration.test.ts — 48/48 passing
pnpm exec vitest run --pool=forks --maxWorkers=1 — full suite 3828/3828 passing
pnpm test:bdd — 100 scenarios / 445 steps passing
pnpm lint — clean (one pre-existing warning in EventPlanFullScreen.test.tsx, untouched)
pnpm build — compiles + smoke test passes

Manual verification

Run node dist/bin.js orchestration status --json in a fresh project — empty store, nextAction.kind === "none".
Run a normal wizard session, then re-invoke orchestration status — observe a populated store, currentSessionId set.
Run wizard tasks --state running — filter takes effect.
Run wizard task <bogus-id> — exit code 2.
Run wizard resume <session-id> — prints (does not execute) a resume command. Pass --execute to actually invoke it.
Inspect ~/.amplitude/wizard/runs/<hash>/orchestration.json — file mode is 0o600, JSON validates against OrchestrationStoreFileSchema.

Known limitations (deferred)

TUI hasn't been wired to read from the orchestration store — src/ui/tui/ is unchanged. (PR 3.)
The MCP server (amplitude-wizard mcp serve) hasn't gained read-only tools that wrap the store. (PR 3.)
Existing tasks.push(...) sites in src/ui/tui/store.ts and src/lib/wizard-session.ts continue to use the legacy in-memory shape. The orchestration store is mirrored only from session start in src/run.ts. (PR 2 widens this; PR 3 retires the duplicate.)
pendingChoices / pendingMcpActions / pendingManualVerifications are stub arrays in PR 1 — the schema is stable but the producer sites land in PR 2.

Follow-up roadmap

PR 2 — Concrete PendingCheckpoint schemas; route existing user-choice / event-plan-confirm / MCP-app prompt sites through the store; checkpoints + MCP-app lifecycle.
PR 3 — TUI v2 reads from the store; retire duplicated state; MCP-server tool parity (read-only orchestration tools); perf / batched writes.

🤖 Generated with Claude Code

Note

Medium Risk
Adds a new file-backed orchestration store plus multiple new CLI commands and JSON contracts; while largely read-only, it introduces new persistence and exit-path behavior that could affect tooling and session startup in edge cases (permissions/corrupt state).

Overview
Introduces a new durable v2 orchestration store under src/lib/orchestration/ (file-backed JSON, atomic writes, Zod-validated schemas) with an explicit TaskLifecycle state machine and computeLastStoppingPoint derivation for “what to do next”/resume hints.

Adds six new read-only inspection commands (tasks, task <id>, sessions, session <id>, resume <session-id> [--execute], orchestration status) that emit Zod-validated JSON envelopes (auto-JSON when piped) and standardized exit codes; wires them into bin.ts and exports.

Mirrors wizard session start into the orchestration store from src/run.ts (best-effort, errors logged and swallowed), makes CLI_INVOCATION a constant npx @amplitude/wizard for stable resume commands, adds a hidden --cache-dir passthrough for AMPLITUDE_WIZARD_CACHE_DIR, and includes extensive new unit + CLI smoke tests plus new docs (docs/orchestration.md, docs/exit-codes.md).

^{Reviewed by Cursor Bugbot for commit be4dbdd. Bugbot is set up for automated code reviews on this repo. Configure here.}

…/resume commands (PR 1 of 3) Introduce src/lib/orchestration/ — a durable, file-backed orchestration store that becomes the source of truth for sessions, tasks, subagents, ownership, and last-stopping-point. Adds six new read-only CLI commands (tasks/task/sessions/session/resume/orchestration status), each emitting Zod-validated JSON envelopes for outer agents. Foundation only. Legacy WizardSession remains the live in-memory surface; PR 2 wires checkpoints + MCP-app lifecycle, PR 3 retires duplicate state and ships the TUI redesign + MCP-server tool parity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

✅ Fixed: Raw spawn from node:child_process breaks Windows
- Replaced await import('node:child_process') with await import('../utils/cross-platform-spawn.js') so the amplitude-wizard .cmd shim is resolved correctly on Windows.
✅ Fixed: User-provided --install-dir missing tilde expansion
- Added resolveInstallDir(argv.installDir) in resolveCommonOpts to properly expand tilde and resolve the path, matching the pattern used by the dashboard command.

Or push these changes by commenting:

@cursor push e5add8f93d

Preview (e5add8f93d)

diff --git a/src/commands/orchestration.ts b/src/commands/orchestration.ts
--- a/src/commands/orchestration.ts
+++ b/src/commands/orchestration.ts
@@ -45,7 +45,8 @@
   json?: boolean;
   human?: boolean;
 }): Promise<CommonOpts> {
-  const installDir = argv.installDir ?? process.cwd();
+  const { resolveInstallDir } = await import('../utils/install-dir.js');
+  const installDir = resolveInstallDir(argv.installDir);
   const { resolveMode } = await import('../lib/mode-config.js');
   const { jsonOutput } = resolveMode({
     json: argv.json,
@@ -583,7 +584,7 @@
         if (execute) {
           // Spawn the resume command. Default behavior is "print only" for
           // safety — orchestrators that want auto-execution opt in.
-          const { spawn } = await import('node:child_process');
+          const { spawn } = await import('../utils/cross-platform-spawn.js');
           const [cmd, ...rest] = command;
           if (!cmd) {
             if (opts.jsonOutput)

_{You can send follow-ups to the cloud agent here.}

resolveCommonOpts now passes argv.installDir through resolveInstallDir so quoted/env-sourced `~` actually expands instead of being treated as a literal directory name. The resume --execute path now imports spawn from utils/cross-platform- spawn so the npm-installed `amplitude-wizard` .cmd shim resolves on Windows. Node's built-in spawn does not consult PATHEXT and would fail with ENOENT for every Windows user invoking `wizard resume --execute`.

…kpoints + MCP-app lifecycle Stacks on PR 1 (#689). Adds three typed checkpoint surfaces on top of the v2 orchestration foundation: - Choice — typed user-choice records with stable promptId for de-dup, requiresHuman automation gate, and full status transitions (pending → answered/expired/cancelled/superseded). - Verification — manual out-of-band verification records with status transitions (pending → passed/failed/skipped, skipped/failed may recover to passed; passed/skipped/failed may supersede). - McpAppCapability — durable lifecycle for every MCP-app capability with an anti-nag invariant: install_skipped → needs_user_choice REQUIRES a non-empty lastStateChangeReason. New CLI commands: - wizard choice list / show / answer (with --confirm-human gate) - wizard verification list / show / mark Wires last-stopping-point's pendingChoices / pendingMcpActions / pendingManualVerifications arrays to read real records (was [] in PR 1). Two callsites instrumented as the PR 2 wiring beachhead: - env-selection in src/commands/helpers.ts (Choice mirror + answer) - event-plan-approval in src/lib/wizard-tools.ts (Verification mirror) Adds 42 tests across choices/verifications/mcp-app-lifecycle/last-stopping-point/CLI. No TUI changes (deferred to PR 3); no MCP-server tool changes (deferred to PR 3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Resume command ignores session-id for action derivation
- Added an optional sessionId parameter to computeLastStoppingPoint so it looks up the specific session and scopes task filtering to that session, and updated the resume command handler to pass the user-specified session ID.

Or push these changes by commenting:

@cursor push 13e12425d3

Preview (13e12425d3)

diff --git a/src/commands/orchestration.ts b/src/commands/orchestration.ts
--- a/src/commands/orchestration.ts
+++ b/src/commands/orchestration.ts
@@ -559,7 +559,9 @@
           else getUI().log.error(`Session ${sessionIdRaw} not found`);
           process.exit(ExitCode.INVALID_ARGS);
         }
-        const lsp = computeLastStoppingPoint(opts.installDir);
+        const lsp = computeLastStoppingPoint(opts.installDir, {
+          sessionId: sessionId!,
+        });
         const command = lsp.nextAction.command;
         const description = lsp.nextAction.description;
 

diff --git a/src/lib/orchestration/last-stopping-point.ts b/src/lib/orchestration/last-stopping-point.ts
--- a/src/lib/orchestration/last-stopping-point.ts
+++ b/src/lib/orchestration/last-stopping-point.ts
@@ -16,7 +16,13 @@
 import { execFileSync } from 'node:child_process';
 
 import { TaskLifecycle, isActive } from './lifecycle';
-import type { LastStoppingPoint, NextAction, Ownership, Task } from './state';
+import type {
+  LastStoppingPoint,
+  NextAction,
+  Ownership,
+  SessionId,
+  Task,
+} from './state';
 import { getOrchestrationStore } from './store';
 import { CLI_INVOCATION } from '../../commands/context';
 
@@ -182,23 +188,27 @@
  */
 export function computeLastStoppingPoint(
   installDir: string,
-  options?: { now?: number; cliInvocation?: string[] },
+  options?: { now?: number; cliInvocation?: string[]; sessionId?: SessionId },
 ): LastStoppingPoint {
   const now = options?.now ?? Date.now();
   const cutoff = now - TWENTY_FOUR_HOURS_MS;
   const store = getOrchestrationStore(installDir);
   const file = store.read();
 
-  const session =
-    file.sessions
-      .filter((s) => s.status === 'active')
-      .sort((a, b) => b.createdAt - a.createdAt)[0] ?? null;
+  const session = options?.sessionId
+    ? file.sessions.find((s) => s.id === options.sessionId) ?? null
+    : file.sessions
+        .filter((s) => s.status === 'active')
+        .sort((a, b) => b.createdAt - a.createdAt)[0] ?? null;
 
   const branch = session?.branch ?? tryDetectBranch(installDir);
   const worktree = session?.worktree ?? tryDetectWorktree(installDir);
 
+  const scopeToSession = (t: Task): boolean =>
+    !options?.sessionId || t.sessionId === options.sessionId;
+
   const activeTasks = file.tasks
-    .filter((t) => isActive(t.state))
+    .filter((t) => isActive(t.state) && scopeToSession(t))
     .sort((a, b) => b.updatedAt - a.updatedAt);
   const stoppedTasks = file.tasks
     .filter(
@@ -206,11 +216,17 @@
         (t.state === TaskLifecycle.Failed ||
           t.state === TaskLifecycle.Cancelled ||
           t.state === TaskLifecycle.Superseded) &&
-        t.updatedAt >= cutoff,
+        t.updatedAt >= cutoff &&
+        scopeToSession(t),
     )
     .sort((a, b) => b.updatedAt - a.updatedAt);
   const recentlyCompletedTasks = file.tasks
-    .filter((t) => t.state === TaskLifecycle.Completed && t.updatedAt >= cutoff)
+    .filter(
+      (t) =>
+        t.state === TaskLifecycle.Completed &&
+        t.updatedAt >= cutoff &&
+        scopeToSession(t),
+    )
     .sort((a, b) => b.updatedAt - a.updatedAt);
 
   // Aggregate ownership across all live tasks plus recently-stopped tasks

_{You can send follow-ups to the cloud agent here.}

…ty + perf hot-paths + resilience Stacks on #690 (which stacks on #689). Merge after PRs 1 + 2. PR 3 lands the state-driven foundation that the broader v2 TUI redesign will sit on. Five concerns, all additive — every PR 1 + PR 2 surface keeps working unchanged. A. TUI v2 wiring — `/status` overlay renders the same data `wizard orchestration status --json` emits, sectioned for human reading. ManualVerificationRibbon mounts on OutroScreen so success- looking UI cannot appear while a verification is pending. ChoiceCheckpointBanner is a reusable primitive for surfacing typed Choice records with the full UX contract (why-asking, recommended, safe-default, reversibility, consequence-if-skipped). B. MCP-server tool parity — every read-only orchestration CLI command now has a matching MCP tool. Both surfaces call into the same builders in `src/lib/orchestration/envelopes.ts`, so output is byte-for-byte identical (modulo `generatedAt`). Server stays read-only by design — mutators stay on the CLI. C. Perf hot-paths — `withReadCache(fn)` amortises store reads across builders inside one command/tool invocation. `per-run-cache.ts` memoises repeated `gh pr view` / MCP-availability calls within a single run. D. Bugs found and fixed — - success-looking UI while blocked on a verification → ribbon - choices asked again after a durable answer → addChoice de-dup (covered in PR 2; regression test added) - skipped MCP apps not remembered → covered by anti-nag invariant (PR 2; surfaced via /status) Background agents continuing after cancellation: out of scope — call out as known limitation. E. Resilience — token-expired-during-long-task. agent-runner's AUTH_ERROR branch now mirrors the K/R question to a durable Choice (kind=keep_or_revert_files) plus a manual_pr_test Verification. `wizard status --json` thereafter shows `nextAction.kind === 'await_user_choice'`. F. Tests — 40+ new tests: - envelope schema parity (CLI ↔ MCP tool) - StatusOverlay rendering all sections - ChoiceCheckpoint UX contract (every required field surfaced) - OutroScreen verification ribbon regression - per-run-cache (memoize / memoizeAsync / invalidate) - auth-error resilience (Choice + Verification + LSP shape) - perf-status-cold (internal-cold-start bound) All 3919 unit tests pass; 100/100 BDD scenarios pass. G. Docs — extended `docs/orchestration.md` with PR 3 sections (TUI integration model, envelopes layer, MCP tool parity table, perf measurements, resilience flow). New `docs/agent-consumability.md` covers CLI / MCP / NDJSON consumption with worked examples (Claude Code, Cursor, CI bots, watchdogs). README + CLAUDE.md updated. Out of scope (future PRs): - Full TUI screen-tree redesign / information-architecture refactor. - Widening the Choice/Verification wiring beachhead beyond env-selection + event-plan-approval. - Retiring legacy `WizardSession`. - esbuild-bundled CLI for sub-200ms cold-start. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

kelsonpw · 2026-05-09T16:44:09Z

Bugbot triage pass complete: 0 stale, 2 live (both fixed in e1bad84 — Windows spawn + tilde expansion), 0 defensible.

`wizard resume <session-id>` validated the requested session existed but then called `computeLastStoppingPoint(installDir)` without scoping, which always derived its next action from the *most recently created active session* — not the one the user asked about. The envelope's `sessionId` field still echoed the requested ID, so the command/description shown could describe a different session entirely. `computeLastStoppingPoint` now accepts an optional `sessionId` that restricts both session metadata and task buckets to that session. The resume command threads the resolved session id through, and an added test pins the scoping behavior against a two-session fixture.

…kpoints + MCP-app lifecycle Stacks on PR 1 (#689). Adds three typed checkpoint surfaces on top of the v2 orchestration foundation: - Choice — typed user-choice records with stable promptId for de-dup, requiresHuman automation gate, and full status transitions (pending → answered/expired/cancelled/superseded). - Verification — manual out-of-band verification records with status transitions (pending → passed/failed/skipped, skipped/failed may recover to passed; passed/skipped/failed may supersede). - McpAppCapability — durable lifecycle for every MCP-app capability with an anti-nag invariant: install_skipped → needs_user_choice REQUIRES a non-empty lastStateChangeReason. New CLI commands: - wizard choice list / show / answer (with --confirm-human gate) - wizard verification list / show / mark Wires last-stopping-point's pendingChoices / pendingMcpActions / pendingManualVerifications arrays to read real records (was [] in PR 1). Two callsites instrumented as the PR 2 wiring beachhead: - env-selection in src/commands/helpers.ts (Choice mirror + answer) - event-plan-approval in src/lib/wizard-tools.ts (Verification mirror) Adds 42 tests across choices/verifications/mcp-app-lifecycle/last-stopping-point/CLI. No TUI changes (deferred to PR 3); no MCP-server tool changes (deferred to PR 3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

✅ Fixed: Missing error handler on spawned child process
- Added a child.on('error', ...) handler that logs the error (respecting JSON output mode) and exits with ExitCode.GENERAL_ERROR, preventing an uncaught exception if the spawned binary fails to launch.
✅ Fixed: Redundant identical ensureDir calls in saveStore
- Removed the redundant ensureDir(getRunDir(next.installDir)) call since ensureDir(dirname(path)) already creates the same directory, and removed the now-unused getRunDir import.

Or push these changes by commenting:

@cursor push e57619e3cb

Preview (e57619e3cb)

diff --git a/src/commands/orchestration.ts b/src/commands/orchestration.ts
--- a/src/commands/orchestration.ts
+++ b/src/commands/orchestration.ts
@@ -563,7 +563,7 @@
         // and description belong to the session the user asked for, not the
         // most-recently-active session in the store.
         const lsp = computeLastStoppingPoint(opts.installDir, {
-          sessionId: session!.id,
+          sessionId: session.id,
         });
         const command = lsp.nextAction.command;
         const description = lsp.nextAction.description;
@@ -609,6 +609,15 @@
             process.exit(ExitCode.GENERAL_ERROR);
           }
           const child = spawn(cmd, rest, { stdio: 'inherit' });
+          child.on('error', (err) => {
+            if (opts.jsonOutput)
+              emitJsonError(`Failed to spawn resume command: ${err.message}`);
+            else
+              getUI().log.error(
+                `Failed to spawn resume command: ${err.message}`,
+              );
+            process.exit(ExitCode.GENERAL_ERROR);
+          });
           child.on('exit', (code) => {
             process.exit(code ?? 0);
           });

diff --git a/src/lib/orchestration/store.ts b/src/lib/orchestration/store.ts
--- a/src/lib/orchestration/store.ts
+++ b/src/lib/orchestration/store.ts
@@ -43,7 +43,6 @@
 import { OrchestrationStoreFileSchema } from './schemas';
 import { TaskLifecycle, assertTransition, isTerminal } from './lifecycle';
 import { getOrchestrationStoreFile } from './storage-paths';
-import { getRunDir } from '../../utils/storage-paths';
 import { dirname } from 'node:path';
 
 // ── Id helpers ────────────────────────────────────────────────────────
@@ -155,7 +154,6 @@
   OrchestrationStoreFileSchema.parse(next);
   const path = getOrchestrationStoreFile(next.installDir);
   ensureDir(dirname(path));
-  ensureDir(getRunDir(next.installDir));
   atomicWriteJSON(path, next, { mode: 0o600 });
 }

_{You can send follow-ups to the cloud agent here.}

…ty + perf hot-paths + resilience Stacks on #690 (which stacks on #689). Merge after PRs 1 + 2. PR 3 lands the state-driven foundation that the broader v2 TUI redesign will sit on. Five concerns, all additive — every PR 1 + PR 2 surface keeps working unchanged. A. TUI v2 wiring — `/status` overlay renders the same data `wizard orchestration status --json` emits, sectioned for human reading. ManualVerificationRibbon mounts on OutroScreen so success- looking UI cannot appear while a verification is pending. ChoiceCheckpointBanner is a reusable primitive for surfacing typed Choice records with the full UX contract (why-asking, recommended, safe-default, reversibility, consequence-if-skipped). B. MCP-server tool parity — every read-only orchestration CLI command now has a matching MCP tool. Both surfaces call into the same builders in `src/lib/orchestration/envelopes.ts`, so output is byte-for-byte identical (modulo `generatedAt`). Server stays read-only by design — mutators stay on the CLI. C. Perf hot-paths — `withReadCache(fn)` amortises store reads across builders inside one command/tool invocation. `per-run-cache.ts` memoises repeated `gh pr view` / MCP-availability calls within a single run. D. Bugs found and fixed — - success-looking UI while blocked on a verification → ribbon - choices asked again after a durable answer → addChoice de-dup (covered in PR 2; regression test added) - skipped MCP apps not remembered → covered by anti-nag invariant (PR 2; surfaced via /status) Background agents continuing after cancellation: out of scope — call out as known limitation. E. Resilience — token-expired-during-long-task. agent-runner's AUTH_ERROR branch now mirrors the K/R question to a durable Choice (kind=keep_or_revert_files) plus a manual_pr_test Verification. `wizard status --json` thereafter shows `nextAction.kind === 'await_user_choice'`. F. Tests — 40+ new tests: - envelope schema parity (CLI ↔ MCP tool) - StatusOverlay rendering all sections - ChoiceCheckpoint UX contract (every required field surfaced) - OutroScreen verification ribbon regression - per-run-cache (memoize / memoizeAsync / invalidate) - auth-error resilience (Choice + Verification + LSP shape) - perf-status-cold (internal-cold-start bound) All 3919 unit tests pass; 100/100 BDD scenarios pass. G. Docs — extended `docs/orchestration.md` with PR 3 sections (TUI integration model, envelopes layer, MCP tool parity table, perf measurements, resilience flow). New `docs/agent-consumability.md` covers CLI / MCP / NDJSON consumption with worked examples (Claude Code, Cursor, CI bots, watchdogs). README + CLAUDE.md updated. Out of scope (future PRs): - Full TUI screen-tree redesign / information-architecture refactor. - Widening the Choice/Verification wiring beachhead beyond env-selection + event-plan-approval. - Retiring legacy `WizardSession`. - esbuild-bundled CLI for sub-200ms cold-start. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- `resume --execute` now attaches `child.on('error', …)` before `exit`. Previously a synchronous spawn failure (ENOENT, EACCES, missing PATH entry on Windows) fired an unhandled `error` event, which Node's EventEmitter rethrows — crashing the CLI with a stack trace instead of producing a clean message + GENERAL_ERROR exit. - `saveStore` was calling `ensureDir(dirname(path))` and then `ensureDir(getRunDir(installDir))` — both resolve to the same run directory because `getOrchestrationStoreFile()` is defined as `join(getRunDir(installDir), 'orchestration.json')`. Drop the second call and the now-unused `getRunDir` import.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Description uses hardcoded CLI_INVOCATION instead of configurable invocation
- Replaced the hardcoded CLI_INVOCATION on line 170 with cliPrefix.join(' ') so the description string uses the same configurable invocation as the command array.

Or push these changes by commenting:

@cursor push b9a4e38b5d

Preview (b9a4e38b5d)

diff --git a/src/lib/orchestration/last-stopping-point.ts b/src/lib/orchestration/last-stopping-point.ts
--- a/src/lib/orchestration/last-stopping-point.ts
+++ b/src/lib/orchestration/last-stopping-point.ts
@@ -167,7 +167,9 @@
     const recent = args.stoppedTasks[0];
     return {
       kind: 'inspect_failure',
-      description: `Most recent stop: ${recent.label} (${recent.state}). Inspect with \`${CLI_INVOCATION} task ${recent.id}\`.`,
+      description: `Most recent stop: ${recent.label} (${
+        recent.state
+      }). Inspect with \`${cliPrefix.join(' ')} task ${recent.id}\`.`,
       command: [...cliPrefix, 'task', recent.id, ...installDirArgs],
     };
   }

_{You can send follow-ups to the cloud agent here.}

`deriveNextAction` builds an `inspect_failure` next-action when the most recent task has stopped. The structured `command` array uses the configurable `cliPrefix` (sourced from `args.invocation`, which flows from `options.cliInvocation` on `computeLastStoppingPoint`), but the inline shell hint embedded in `description` was templating the hardcoded module-level `CLI_INVOCATION` constant. A custom invocation (e.g. an alternate `wizard` symlink, or a test harness overriding the binary name) would surface a description that says \`amplitude-wizard task <id>\` while the JSON payload's `command` points at the configured executable. Use `cliPrefix.join(' ')` for both so the human and machine views always agree.

…dundancy + supervisor + live status refresh (stacks on #691) Stacks on #691 → #690 → #689. Merge after PRs 1+2+3. ## Summary - **Beachhead widening**: centralized `record*Choice` / `record*Verification` helpers in `src/lib/orchestration/wiring.ts` and wired them through every major user-choice and manual-verification surface in the wizard (MCP install, Slack, region select, OAuth browser login, project creation, dashboard setup, event-plan revision, logout). Existing TUI screens / agent prompts continue to drive the user-facing flow; the orchestration store mirror is ADDITIVE so outer agents inspecting `wizard status --json` see typed records. Mirror failures swallow + log so they NEVER break the user-facing path. - **WizardSession boundary**: docblock at the top of `wizard-session.ts` now spells out the contract — `WizardSession` = transient TUI display state; `OrchestrationStore` = durable orchestration state; never duplicate fields between them. Audit table in `docs/orchestration.md` (PR 4 section) walks every field. PR 4 deletes zero fields by design — the redundant *concept* (Subagent / Task / Ownership double-bookkeeping) was already avoided in PR 1; PR 4 cements the contract for PR 5's screen-tree redesign. - **Background-agent supervision**: new `Supervisor` class in `src/lib/orchestration/supervisor.ts`. Tracks subprocess PIDs that map to `Subagent` rows, writes `<runDir>/heartbeats/<pid>.txt` every 5s, SIGTERMs on SIGINT/SIGTERM (with 5s grace before SIGKILL), reaps stale heartbeats (>30s old + PID gone) by transitioning the rooted Task to `cancelled`. Startup recovery transitions orphaned-but-running Tasks to `failed: 'process gone'`. Eliminates the "stopped agents shown as running" drift. - **Live `/status` refresh**: new `watchOrchestrationStore` (debounced 200ms, watches the parent dir to survive `atomicWriteJSON`'s rename) + `useOrchestrationStore` React hook. `StatusOverlayScreen` plumbs the hook in so the overlay re-renders when a sibling shell mutates the store via `wizard choice answer`, `wizard verification mark`, etc. ## Tests +30 tests (3919 → 3949 vitest, 100/100 BDD): - 20 wiring tests (each `record*` helper, dedup invariant, answerByPromptId, anti-nag re-record, verification mark-passed contract) - 5 supervisor tests (track + heartbeat write, terminateAll + signal/marking, stale-heartbeat reap, recoverOrphanedSubagents, untrack) - 5 watcher tests (write fires onChange, debounce coalesces a burst, dispose idempotency, no-fire-after-dispose, late-mount before file exists) All test surfaces: - `pnpm exec vitest run --pool=forks --maxWorkers=1` → 3949/3949 - `pnpm test:bdd` → 100/100 - `pnpm build` → green - `pnpm lint` → green (1 pre-existing warning unchanged) - `pnpm exec tsc --noEmit -p tsconfig.json` → clean ## Backward compatibility - No public-contract changes. `wizard status --json`, `wizard choice list`, `wizard verification list`, and the MCP server's read-only tools all keep emitting the same envelope shapes; PR 4 just produces *more* records in them. - AI-SDK migration unaffected — no fields removed from `WizardSession`. ## Known limitations - TUI screen-tree redesign still PR 5. - Cold-start bundling still a follow-up. - Some less-trafficked prompt surfaces (the inner agent's `choose` tool, per-tool MCP auth confirmations) intentionally keep their existing transient-text path. The audit table in `docs/orchestration.md` documents what was wired and what was skipped (and why). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Resume command string breaks on paths with spaces
- Added a shellJoin helper that single-quotes any argv element containing whitespace or shell metacharacters, and replaced all five command.join(' ') call sites in last-stopping-point.ts and orchestration.ts with it.

Or push these changes by commenting:

@cursor push de7da72a3c

Preview (de7da72a3c)

diff --git a/src/commands/orchestration.ts b/src/commands/orchestration.ts
--- a/src/commands/orchestration.ts
+++ b/src/commands/orchestration.ts
@@ -536,7 +536,7 @@
         const { getOrchestrationStore } = await import(
           '../lib/orchestration/store.js'
         );
-        const { computeLastStoppingPoint } = await import(
+        const { computeLastStoppingPoint, shellJoin } = await import(
           '../lib/orchestration/last-stopping-point.js'
         );
         const { ResumeEnvelopeSchema } = await import(
@@ -583,7 +583,7 @@
         } else {
           const ui = getUI();
           ui.log.info(description);
-          ui.log.info(`Resume: ${chalk.bold(command.join(' '))}`);
+          ui.log.info(`Resume: ${chalk.bold(shellJoin(command))}`);
           if (!execute) {
             ui.log.info(
               chalk.dim('(pass --execute to invoke this command directly)'),
@@ -666,7 +666,7 @@
               const { getOrchestrationStore } = await import(
                 '../lib/orchestration/store.js'
               );
-              const { computeLastStoppingPoint } = await import(
+              const { computeLastStoppingPoint, shellJoin } = await import(
                 '../lib/orchestration/last-stopping-point.js'
               );
               const { StatusEnvelopeSchema } = await import(
@@ -696,7 +696,7 @@
                   );
                   ui.log.info(lsp.nextAction.description);
                   ui.log.info(
-                    `Resume: ${chalk.bold(lsp.nextAction.command.join(' '))}`,
+                    `Resume: ${chalk.bold(shellJoin(lsp.nextAction.command))}`,
                   );
                 } else {
                   ui.log.info(
@@ -729,7 +729,7 @@
                   ui.log.info(`Next action: ${lsp.nextAction.description}`);
                   ui.log.info(
                     `Resume:      ${chalk.bold(
-                      lsp.nextAction.command.join(' '),
+                      shellJoin(lsp.nextAction.command),
                     )}`,
                   );
                 }

diff --git a/src/lib/orchestration/last-stopping-point.ts b/src/lib/orchestration/last-stopping-point.ts
--- a/src/lib/orchestration/last-stopping-point.ts
+++ b/src/lib/orchestration/last-stopping-point.ts
@@ -66,6 +66,18 @@
   }
 }
 
+/**
+ * Join an argv array into a copy-pasteable shell string, quoting any
+ * arguments that contain whitespace or shell metacharacters.
+ */
+export function shellJoin(argv: readonly string[]): string {
+  return argv
+    .map((arg) =>
+      /[\s"'\\`$!#&|;()<>]/.test(arg) ? `'${arg.replace(/'/g, "'\\''")}'` : arg,
+    )
+    .join(' ');
+}
+
 function dedupeOwnership(ownership: Ownership[]): Ownership[] {
   const seen = new Set<string>();
   const out: Ownership[] = [];
@@ -171,7 +183,7 @@
     // `CLI_INVOCATION` here meant a custom invocation (e.g. test harness,
     // alternate `wizard` symlink) would print the wrong command name in
     // the human-readable hint while emitting the correct one in JSON.
-    const cliInline = cliPrefix.join(' ');
+    const cliInline = shellJoin(cliPrefix);
     return {
       kind: 'inspect_failure',
       description: `Most recent stop: ${recent.label} (${recent.state}). Inspect with \`${cliInline} task ${recent.id}\`.`,
@@ -296,6 +308,6 @@
     pendingMcpActions,
     pendingManualVerifications,
     nextAction,
-    resumeCommand: nextAction.command.join(' '),
+    resumeCommand: shellJoin(nextAction.command),
   };
 }

_{You can send follow-ups to the cloud agent here.}

…rdown (stacks on #693) Stacks on #693 → #691 → #690 → #689. Merge after PRs 1+2+3+4. PR 5 turns the TUI from "screens that mostly work" into a serious operator interface with a coherent IA, shared glyph vocabulary, and render-cost discipline. IA redesign: - Three-zone layout (header / body / chrome). - Header: JourneyStepper + identity + mode badge. Mode badge surfaces agent / ci / nested / mcp-server states; suppressed in plain interactive mode. - Operator Overview screen (`/status`) reframed: title + mode badge + 1-line summary, then sectioned by Session / Primary work / Background / Pending choices / Pending verifications / MCP capabilities / Owned artifacts / Next action. Live-refresh on orchestration store mutations via PR 4's file-watcher hook. Glyph palette (canonical vocabulary): ○ queued · › running · … waiting · ⏸ blocked · ✓ completed ✗ failed · ⊘ cancelled · ⮕ superseded Centralized in `src/ui/tui/utils/lifecycle-display.ts` so a future "swap one glyph" change is a one-line edit, not a hunt across the screen tree. Pinned by unit tests so silent drift trips a test. Slash command coherence: - New `/help` command lists every registered command grouped by "available anytime" vs "available before/after a setup run". When a run is active, the second group is renamed "paused while a setup run is active (Ctrl+C to cancel, then retry)" so the user knows exactly why a command can't fire. - Multi-line command feedback (e.g. /help, /diagnostics) renders with hanging indent so it reads as one block. Render-cost teardown: - New `useWizardSelector(store, selector, isEqual?)` slice hook. Components subscribed to a slice no longer rerender for unrelated store ticks. `shallowArrayEqual` and `shallowObjectEqual` exported for the common case. - Render-cost benchmark fixture pins the contract: 3 task transitions + 5 status bumps → tasks slice 3 renders, status slice 5 renders, whole-store subscriber 8+ renders. Slicing cuts each subscriber's render budget by ~60%. Tests added (40 over the base 3949): - lifecycle-display vocabulary (5) - mode-badge env resolution (9) - /help text generation (6) - HeaderBar mode badge rendering (5) - useWizardSelector primitives + render-cost ceiling (4 + 3) - StatusOverlayScreen glyph palette + summary + mode badge (7) - StatusOverlayScreen Operator Overview reframing (existing test updated to match new section names) (1) Build, lint, vitest (3989/3989), BDD (100/100) all green. Backward compatibility: - All existing slash commands continue to work the same way; /help is additive. - /status overlay's data shape is unchanged from PR 3; only the rendering reorganized. - --agent, --ci, --json, manifest, plan, apply, verify, MCP server, v: 1 envelope, exit codes — all unchanged. - Mode badge is suppressed in plain interactive mode, preserving the prior header look for the most common case. - ProgressList still uses a blank gutter for `pending` rows rather than the canonical ○ glyph (deliberate UX trade-off — see comment in ProgressList.tsx). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`resumeCommand` is the human-facing copy-pasteable form of `nextAction.command`. It was built with `nextAction.command.join(' ')`, which silently corrupts paths containing spaces (e.g. an `installDir` of `/Users/me/my project` would land in the shell as two separate words). The structured `command` array stayed correct, but the string the user is invited to paste into a terminal would fail. Add a small `shellJoin` / `shellQuote` helper that wraps tokens with shell metacharacters or whitespace in single quotes (with the standard `'\''` close/escape/reopen dance for embedded single quotes). Tokens that are already shell-safe stay unquoted so the common case stays readable.

…estration commands The structured `lsp.nextAction.command` array is correct, but joining it with ` ` for the human-facing `Resume:` line splits paths-with-spaces into multiple shell words. Switch every human-display callsite to `lsp.resumeCommand`, which already shellQuotes via `shellJoin`. Mirrors the fix that was applied to the LSP envelope, but covers the remaining `wizard resume` and `wizard orchestration status` print paths.