feat(agent): inner-agent lifecycle + file_change NDJSON events by kelsonpw · Pull Request #256 · amplitude/wizard

kelsonpw · 2026-04-25T20:08:00Z

Summary

Surface what the inner Claude SDK agent is doing so outer agents (Claude Code, Cursor, Codex) can mirror progress, attribute file changes to specific tools, and decide when to abort. Adds eight new event types to the agent-mode wire format and a hook factory that's ready to wire in.

This is Gap 2 of 4 in the wizard sub-agent contract. Stacked on #255.

New events

inner_agent_started     model + phase + planId at SessionStart
tool_call               every PreToolUse, with privacy-safe summary
file_change_planned     write-tool intent (path + create/modify/delete)
file_change_applied     write-tool success, paired with planned
event_plan_proposed     confirm_event_plan invocation
event_plan_confirmed    decision + source (auto/human/flag)
verification_started    pre-check phase boundary
verification_result     pass/fail with structured failures
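
For an outer agent, the eight events above can be modeled as a closed set of names over the shared envelope. This is a minimal sketch, not the schema in agent-events.ts: only the event names and the envelope keys `v`, `@timestamp`, `session_id`, and `run_id` come from this PR, while the `type` and `data` field names are assumptions made for illustration.

```typescript
// Hypothetical shapes: only the event names and the envelope keys
// v / @timestamp / session_id / run_id are taken from the PR text.
const INNER_EVENT_TYPES = [
  "inner_agent_started",
  "tool_call",
  "file_change_planned",
  "file_change_applied",
  "event_plan_proposed",
  "event_plan_confirmed",
  "verification_started",
  "verification_result",
] as const;

type InnerEventType = (typeof INNER_EVENT_TYPES)[number];

interface InnerEventEnvelope {
  v: 1;
  "@timestamp": string;
  session_id: string;
  run_id: string;
  type: InnerEventType; // assumed discriminator name
  data: Record<string, unknown>; // assumed payload key
}

function isInnerEventType(value: unknown): value is InnerEventType {
  return (
    typeof value === "string" &&
    (INNER_EVENT_TYPES as readonly string[]).includes(value)
  );
}

// Parse one NDJSON line. Returns null for blank lines, invalid JSON,
// unexpected envelope versions, or event types outside the list above.
function parseInnerEventLine(line: string): InnerEventEnvelope | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const obj = parsed as Record<string, unknown>;
  if (obj.v !== 1 || !isInnerEventType(obj.type)) return null;
  return obj as unknown as InnerEventEnvelope;
}
```

Returning null for unknown event types lets an orchestrator skip lines it does not understand, which fits the additive nature of the change.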

Why additive only

Implementation is intentionally additive — no changes to agent-interface.ts to avoid conflicts with:

feat(agent): migrate tool allowlist from canUseTool to PreToolUse hook #243 (PreToolUse migration from canUseTool)
feat: observability spine (Bet 1) — run_id, trace headers, funnel, tool summary, session-trace upload #149 (observability spine touching the same hooks)
feat: structured status via report_status MCP tool (Bet 2 slice 2) #172 (report_status MCP tool that may overlap with tool_call)

The new createInnerLifecycleHooks(config) factory in src/lib/inner-lifecycle.ts returns hook callbacks shaped to merge into the existing buildHooksConfig call. Wiring is documented in the module header so a follow-up PR (or whoever lands #243) can compose the hooks in one ~5-line change:

// src/lib/agent-interface.ts: import at module top, then spread into the buildHooksConfig({...}) call
import { createInnerLifecycleHooks } from './inner-lifecycle.js';
const inner = createInnerLifecycleHooks({ phase: 'wizard' });
buildHooksConfig({
  ...inner.hooks(),
  Stop: createStopHook(...),
});

What's in

src/lib/agent-events.ts — 8 new typed event interfaces, plus three helpers:
- summarizeToolInput(name, input) — privacy-safe summary for PreToolUse payloads (file paths for Read/Edit/Write, command head for Bash, pattern for Grep, etc.)
- classifyWriteOperation(name) — Write→create, Edit→modify, others→null
- summarizeForEvent(s, max) — truncate-with-ellipsis for any string
src/ui/agent-ui.ts — 8 new emit* methods, all thin wrappers over the existing emit() infrastructure. They reuse the same envelope (v: 1, @timestamp, session_id, run_id, …) so outer agents need only one parser.
src/lib/inner-lifecycle.ts — createInnerLifecycleHooks(config) factory. Returns:
- hooks() → { SessionStart, PreToolUse, PostToolUse } ready to merge into buildHooksConfig
- emitEventPlanProposed(events) for the wizard-tools MCP server to call from confirm_event_plan
- emitEventPlanConfirmed(source, decision) for the same
- withVerification(phase, fn) wrapper that emits started/result around any async check
AgentUI emit methods are no-ops in TUI/CI mode — the helper checks getUI() instanceof AgentUI and short-circuits cleanly so the same hook config can be used in any mode.
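
The three helpers listed above can be sketched from their one-line descriptions. This is a re-implementation for illustration and the real code in agent-events.ts may differ: the `file_path` and `path` keys appear in the diff later in this thread, while the `command` and `pattern` keys and the length limits (200 and 80) are assumptions.

```typescript
type WriteOperation = "create" | "modify" | "delete";

// Truncate to at most `max` characters, ending in an ellipsis when cut.
function summarizeForEvent(s: string, max: number): string {
  if (max <= 0) return "";
  if (s.length <= max) return s;
  return s.slice(0, max - 1) + "…";
}

// Write -> create, Edit -> modify, everything else -> null (not a write tool).
function classifyWriteOperation(name: string): WriteOperation | null {
  if (name === "Write") return "create";
  if (name === "Edit") return "modify";
  return null;
}

// Emit only a small, low-risk slice of the tool input: a path, the first
// line of a shell command, or a search pattern. Never the file contents.
function summarizeToolInput(name: string, input: unknown): string {
  const obj =
    input !== null && typeof input === "object"
      ? (input as Record<string, unknown>)
      : {};
  const str = (key: string): string | null => {
    const value = obj[key];
    return typeof value === "string" ? value : null;
  };
  switch (name) {
    case "Read":
    case "Edit":
    case "Write":
      return summarizeForEvent(str("file_path") ?? str("path") ?? "", 200);
    case "Bash": {
      const head = (str("command") ?? "").split("\n")[0] ?? "";
      return summarizeForEvent(head, 80);
    }
    case "Grep":
      return summarizeForEvent(str("pattern") ?? "", 80);
    default:
      // Unknown tools get an empty summary rather than an echoed payload.
      return "";
  }
}
```

Defaulting to an empty summary for unrecognized tools keeps the privacy guarantee conservative: a new tool leaks nothing until a case is added for it.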

Test plan

pnpm test — 1288 passed, 17 skipped (21 new tests in inner-lifecycle.test.ts)
pnpm tsc --noEmit clean
pnpm lint clean

Out of scope (follow-up)

Wiring createInnerLifecycleHooks into buildHooksConfig in agent-interface.ts. This is a small, single-PR change once #243 (feat(agent): migrate tool allowlist from canUseTool to PreToolUse hook) lands, so that the PreToolUse hook composition is clean.
Wiring emitEventPlanProposed / emitEventPlanConfirmed into the confirm_event_plan MCP tool in wizard-tools.ts. Also a small follow-up, gated on whichever Bet 2 PR touches wizard-tools.ts next.
Forwarding report_status (#172, feat: structured status via report_status MCP tool (Bet 2 slice 2)) to NDJSON. The tool_call event covers most of what report_status does for tools; the cross-over is a small refactor once both are in.

cc @amplitude/growth

🤖 Generated with Claude Code

Note

Medium Risk
Medium risk because it expands the agent-mode NDJSON wire schema and adds new emitters/hooks that affect stdout output consumed by external orchestrators. Behavior is otherwise additive and gated to AgentUI (agent mode) only.

Overview
Adds inner-agent lifecycle observability to agent-mode output by introducing 8 new NDJSON event payloads in agent-events.ts (start, tool calls, planned/applied file changes, event-plan proposed/confirmed, and verification started/result), plus helpers to privacy-safely summarize tool inputs and classify write operations.

Introduces createInnerLifecycleHooks (inner-lifecycle.ts) to forward Claude SDK SessionStart/PreToolUse/PostToolUse signals into those NDJSON events (including byte counts for applied writes), and extends AgentUI with corresponding emit* methods.

Adds Vitest coverage validating the summarizers and that the hook factory emits the expected NDJSON lines when AgentUI is active.

Reviewed by Cursor Bugbot for commit cbeeb24. Bugbot is set up for automated code reviews on this repo.

Adds a typed `needs_input` event so outer agents (Claude Code, Cursor, Codex) can deterministically surface decisions to a human instead of the wizard silently auto-selecting. Every event carries `code`, `choices`, `recommended`, and `resumeFlags` so the orchestrator can either re-invoke with a single CLI flag or pipe a JSON line to stdin.

- New `src/lib/agent-events.ts`: source-of-truth schema for the agent-mode wire format. `AgentEventEnvelope`, `NeedsInputData`, `AgentEventType` all live here so future events land against one doc.
- `AgentUI.emitNeedsInput()`: public method any caller (including upcoming plan/apply commands) can use to surface a decision.
- `promptConfirm` and `promptChoice` now emit a `needs_input` event in addition to the legacy `prompt` event so existing orchestrators keep working while new ones key off the canonical shape.
- `promptEnvironmentSelection` emits both event shapes; the existing stdin round-trip behavior is unchanged.
- `ExitCode.INPUT_REQUIRED = 12` so future flag-matrix work can exit cleanly when input is needed and `--auto-approve` is not set.

Tests: +3 in agent-ui.test.ts. Suite green (1240 pass, 17 skip).

Part of the wizard sub-agent design, Gap 1 of 4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
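
How an orchestrator might act on a `needs_input` event can be sketched as follows. The commit message names the four fields but not their types, so everything here is an assumption: in particular, `resumeFlags` is modeled as a map from choice value to CLI arguments, and `resolveNeedsInput` is a hypothetical helper on the outer-agent side, not part of the wizard.

```typescript
// Hypothetical shape: the PR names these four fields but not their types.
interface NeedsInputData {
  code: string;
  choices: { value: string; label: string }[];
  recommended: string | null;
  resumeFlags: Record<string, string[]>; // assumed: choice value -> CLI args
}

type Resolution =
  | { kind: "resume"; argv: string[] }
  | { kind: "ask_human"; code: string; options: string[] };

// With auto-approve, take the recommended choice and return the flags that
// re-invoke the wizard with it. Otherwise hand the decision to a human.
function resolveNeedsInput(
  data: NeedsInputData,
  autoApprove: boolean
): Resolution {
  const pick = autoApprove ? data.recommended : null;
  const flags = pick !== null ? data.resumeFlags[pick] : undefined;
  if (flags !== undefined) {
    return { kind: "resume", argv: flags };
  }
  return {
    kind: "ask_human",
    code: data.code,
    options: data.choices.map((choice) => choice.label),
  };
}
```

Falling back to `ask_human` when no recommendation or no matching flags exist keeps the orchestrator from guessing.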

…-gate

Splits the implicit "agent-mode auto-approves everything" behavior into three explicit, composable capabilities so plan/apply/verify can layer on without ambiguity:

--auto-approve   → silently pick `recommended` on `needs_input` (no writes)
--yes (-y)       → autoApprove + allowWrites (today's --yes / --ci semantics)
--force          → autoApprove + allowWrites + allowDestructive

Back-compat preserved: `--agent` alone still implies `autoApprove + allowWrites`. The upcoming `apply` command will pass `requireExplicitWrites: true` to `resolveMode`, which forces writes to be requested by name.

- `ModeConfig` extends new `CapabilityFlags` interface
- `resolveMode` builds capabilities additively from the flag set
- `evaluateWriteGate(toolName, toolInput, caps)` is a pure function the PreToolUse hook can call to decide allow/deny: it gates Edit/Write/MultiEdit/NotebookEdit on `allowWrites`, and gates a curated set of destructive Bash patterns (rm -rf, git reset --hard, git push --force, DROP TABLE, etc.) on `allowDestructive`
- New `--auto-approve` and `--force` global flags in bin.ts
- `--yes` consolidated to a single global declaration with `-y` alias
- `ExitCode.WRITE_REFUSED = 13` for clean re-invocation by outer agents

Tests: +16 in mode-config.test.ts (35 total). Suite green (1257 pass).

Part of the wizard sub-agent design, Gap 4 of 4. Stacked on #253.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
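
The write gate described in this commit message can be sketched as a pure function. This is an illustration under stated assumptions rather than the code in the PR: the return type, the `command` input key, and the regular expressions are all hypothetical, since the curated pattern list itself is not shown.

```typescript
interface CapabilityFlags {
  autoApprove: boolean;
  allowWrites: boolean;
  allowDestructive: boolean;
}

// Assumed return shape; the PR only says the function decides allow/deny.
type GateDecision = { allow: true } | { allow: false; reason: string };

const WRITE_TOOLS = new Set(["Edit", "Write", "MultiEdit", "NotebookEdit"]);

// Illustrative patterns only, loosely covering the examples in the text.
const DESTRUCTIVE_BASH: RegExp[] = [
  /\brm\s+-(?=\S*r)(?=\S*f)\S+/i, // rm with both r and f in one flag group
  /\bgit\s+reset\s+--hard\b/,
  /\bgit\s+push\b.*--force\b/,
  /\bdrop\s+table\b/i,
];

function evaluateWriteGate(
  toolName: string,
  toolInput: unknown,
  caps: CapabilityFlags
): GateDecision {
  if (WRITE_TOOLS.has(toolName) && !caps.allowWrites) {
    return { allow: false, reason: `${toolName} requires allowWrites` };
  }
  if (toolName === "Bash" && !caps.allowDestructive) {
    const raw =
      typeof toolInput === "object" && toolInput !== null
        ? (toolInput as Record<string, unknown>).command
        : undefined;
    const command = typeof raw === "string" ? raw : "";
    if (DESTRUCTIVE_BASH.some((pattern) => pattern.test(command))) {
      return {
        allow: false,
        reason: "destructive command requires allowDestructive",
      };
    }
  }
  return { allow: true };
}
```

Keeping the gate free of I/O means a PreToolUse hook can call it synchronously and tests can cover the flag matrix without spawning the wizard.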

Splits the wizard's run-everything-at-once flow into three explicit phases so outer agents (Claude Code, Cursor, Codex) can inspect a plan before any writes happen:

npx wizard plan
  → emits a `plan` NDJSON event with planId + framework + sdk + resumeFlags. No writes. Persists the plan to $TMPDIR/amplitude-wizard-plans/<planId>.json with a 24h TTL.
npx wizard apply --plan-id <id> --yes
  → loads + validates the plan, then runs the wizard. Refuses without --yes (exits WRITE_REFUSED=13 with a clear resume hint). Refuses stale or unknown plan IDs (exits INVALID_ARGS=2).
npx wizard verify
  → cheap, no-network check that SDK is installed + API key is configured + framework is detectable. Emits a structured `verification_result` event.

Implementation:

- `src/lib/agent-plans.ts`: typed plan persistence layer. Zod-validated WizardPlan schema (v1), atomic JSON writes (0o600 perms), TTL-based expiry, `pruneStalePlans` for cleanup.
- `src/lib/agent-ops.ts`: adds `runPlan` and `runVerify` business logic alongside the existing `runDetect` / `runStatus`. No UI, no process.exit; thin yargs handlers compose them.
- `bin.ts`: three new yargs `.command()` entries that all pass `requireExplicitWrites: true` to `resolveMode` so the new commands opt out of the agent-implies-writes back-compat.

Smoke tests (manual, in this directory):

$ wizard plan --json                      → emits plan envelope, exit 0
$ wizard apply --plan-id X --json         → exits 13, no writes
$ wizard apply --plan-id X --yes --json   → executes
$ wizard verify --json                    → emits verification_result, exit reflects pass/fail

Tests: +10 in agent-plans.test.ts (1267 total). Suite green.

Part of the wizard sub-agent design, Gap 3 of 4. Stacked on #254.

Note: the `apply` handler currently spawns the existing `--agent --yes` wizard run with the planId in the env. Wiring the plan into the agent prompt (so the inner agent reads `WizardPlan` and reports back) is a follow-up that lands cleanly when #180 (three-phase handoff schemas) ships.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
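
The plan persistence described above (atomic JSON write, 0o600 permissions, 24h TTL) could look roughly like this. It is a sketch under assumptions, not the contents of agent-plans.ts: `PlanRecord` is a hypothetical subset of `WizardPlan`, the `createdAt` field and the planId character check are invented for illustration, and plain type checks stand in for the Zod validation the PR mentions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const PLAN_TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL, as stated in the PR

// Hypothetical subset of WizardPlan; the real schema is Zod-validated.
interface PlanRecord {
  v: 1;
  planId: string;
  framework: string;
  createdAt: number; // epoch milliseconds (assumed field)
}

function plansDir(): string {
  return path.join(os.tmpdir(), "amplitude-wizard-plans");
}

// Write to a temp file with owner-only permissions, then rename into place
// so readers never observe a half-written plan.
function savePlan(plan: PlanRecord, dir: string = plansDir()): string {
  fs.mkdirSync(dir, { recursive: true });
  const finalPath = path.join(dir, `${plan.planId}.json`);
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(plan), { mode: 0o600 });
  fs.renameSync(tmpPath, finalPath);
  return finalPath;
}

// Returns null for malformed IDs, missing files, invalid JSON, shape
// mismatches, and plans older than the TTL.
function loadPlan(
  planId: string,
  now: number = Date.now(),
  dir: string = plansDir()
): PlanRecord | null {
  if (!/^[A-Za-z0-9_-]+$/.test(planId)) return null; // blocks path traversal
  let parsed: unknown;
  try {
    parsed = JSON.parse(
      fs.readFileSync(path.join(dir, `${planId}.json`), "utf8")
    );
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const obj = parsed as Record<string, unknown>;
  if (
    obj.v !== 1 ||
    obj.planId !== planId ||
    typeof obj.framework !== "string" ||
    typeof obj.createdAt !== "number"
  ) {
    return null;
  }
  if (now - obj.createdAt > PLAN_TTL_MS) return null; // stale plan
  return obj as unknown as PlanRecord;
}
```

A caller such as the `apply` handler could treat a null result as the "stale or unknown plan ID" case and exit with INVALID_ARGS.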

Surface what the inner Claude SDK agent is doing so outer agents can mirror progress, attribute file changes to specific tools, and decide when to abort. Adds eight new event types to the agent-mode wire format:

inner_agent_started     model + phase + planId at SessionStart
tool_call               every PreToolUse, with privacy-safe summary
file_change_planned     write-tool intent (path + create/modify/delete)
file_change_applied     write-tool success, paired with planned
event_plan_proposed     confirm_event_plan invocation
event_plan_confirmed    decision + source (auto/human/flag)
verification_started    pre-check phase boundary
verification_result     pass/fail with structured failures

Implementation is intentionally additive: no changes to agent-interface.ts, to avoid conflicts with #243 (PreToolUse migration) and #149 (observability spine). The new `createInnerLifecycleHooks(config)` factory in src/lib/inner-lifecycle.ts returns hook callbacks shaped to merge into the existing `buildHooksConfig` call. Wiring is documented in the module header so a follow-up PR (or whoever lands #243) can compose the hooks in one ~5-line change.

Helpers in agent-events.ts:

- summarizeToolInput(name, input): privacy-safe summary for PreToolUse payloads (file paths for Read/Edit/Write, command head for Bash, etc.)
- classifyWriteOperation(name): Write→create, Edit→modify, others→null
- summarizeForEvent(s, max): truncate-with-ellipsis for any string

AgentUI emit methods are no-ops in TUI/CI mode: the helper checks `getUI() instanceof AgentUI` and short-circuits cleanly.

Tests: +21 in inner-lifecycle.test.ts (1288 total). Suite green.

Part of the wizard sub-agent design, Gap 2 of 4. Stacked on #255.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-25T20:08:13Z

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

/wizard-ci all

Test all apps in a directory:

/wizard-ci django
/wizard-ci fastapi
/wizard-ci flask
/wizard-ci javascript-node
/wizard-ci javascript-web
/wizard-ci next-js
/wizard-ci python
/wizard-ci react-router
/wizard-ci vue

Test an individual app:

/wizard-ci django/django3-saas
/wizard-ci fastapi/fastapi3-ai-saas
/wizard-ci flask/flask3-social-media

/wizard-ci javascript-node/express-todo
/wizard-ci javascript-node/fastify-blog
/wizard-ci javascript-node/hono-links
/wizard-ci javascript-node/koa-notes
/wizard-ci javascript-node/native-http-contacts
/wizard-ci javascript-web/saas-dashboard
/wizard-ci next-js/15-app-router-saas
/wizard-ci next-js/15-app-router-todo
/wizard-ci next-js/15-pages-router-saas
/wizard-ci next-js/15-pages-router-todo
/wizard-ci python/meeting-summarizer
/wizard-ci react-router/react-router-v7-project
/wizard-ci react-router/rrv7-starter
/wizard-ci react-router/saas-template
/wizard-ci react-router/shopper
/wizard-ci vue/movies

Results will be posted here when complete.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Non-Error throws produce [null] in failures array
- Changed (e as Error).message to String((e as Error).message ?? e) so non-Error thrown values produce a human-readable string instead of undefined/null.
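
For context, a verification wrapper with this kind of error normalization might look like the sketch below. It is not the code in inner-lifecycle.ts: the PR's `withVerification(phase, fn)` takes two arguments, whereas this version takes an explicit `sink` so it stays self-contained, and `describeThrown` is a hypothetical helper. One thing worth noting is that an expression of the form `(e as Error).message ?? e` still throws a TypeError when the thrown value is `null` or `undefined`, so the sketch branches on the value's type instead.

```typescript
interface VerificationSink {
  started(phase: string): void;
  result(r: { phase: string; success: boolean; failures: string[] }): void;
}

// Turn any thrown value into a readable string without ever throwing.
function describeThrown(e: unknown): string {
  if (e instanceof Error) return e.message;
  if (typeof e === "string") return e;
  if (e === null || typeof e !== "object") return String(e);
  try {
    return JSON.stringify(e) ?? "[unserializable thrown value]";
  } catch {
    return "[unserializable thrown value]";
  }
}

// Emit started/result around an async check; failures are re-thrown so the
// caller's own error handling is unaffected.
async function withVerification<T>(
  phase: string,
  sink: VerificationSink,
  fn: () => Promise<T>
): Promise<T> {
  sink.started(phase);
  try {
    const value = await fn();
    sink.result({ phase, success: true, failures: [] });
    return value;
  } catch (e) {
    sink.result({ phase, success: false, failures: [describeThrown(e)] });
    throw e;
  }
}
```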

Or push these changes by commenting:

@cursor push 19ca88cf89

Preview (19ca88cf89)

diff --git a/src/lib/inner-lifecycle.ts b/src/lib/inner-lifecycle.ts
--- a/src/lib/inner-lifecycle.ts
+++ b/src/lib/inner-lifecycle.ts
@@ -114,14 +114,14 @@
       typeof input.tool_name === 'string'
         ? input.tool_name
         : typeof input.toolName === 'string'
-        ? input.toolName
-        : 'unknown';
+          ? input.toolName
+          : 'unknown';
     const toolInput =
       typeof input.tool_input !== 'undefined'
         ? input.tool_input
         : typeof input.toolInput !== 'undefined'
-        ? input.toolInput
-        : null;
+          ? input.toolInput
+          : null;
     const summary = summarizeToolInput(toolName, toolInput);
     ui.emitToolCall({ tool: toolName, summary });
 
@@ -137,8 +137,8 @@
         typeof obj.file_path === 'string'
           ? obj.file_path
           : typeof obj.path === 'string'
-          ? obj.path
-          : null;
+            ? obj.path
+            : null;
       if (path) {
         ui.emitFileChangePlanned({ path, operation });
       }
@@ -153,16 +153,16 @@
       typeof input.tool_name === 'string'
         ? input.tool_name
         : typeof input.toolName === 'string'
-        ? input.toolName
-        : 'unknown';
+          ? input.toolName
+          : 'unknown';
     const operation = classifyWriteOperation(toolName);
     if (!operation) return Promise.resolve({});
     const toolInput =
       typeof input.tool_input !== 'undefined'
         ? input.tool_input
         : typeof input.toolInput !== 'undefined'
-        ? input.toolInput
-        : null;
+          ? input.toolInput
+          : null;
     const obj =
       toolInput && typeof toolInput === 'object'
         ? (toolInput as Record<string, unknown>)
@@ -171,8 +171,8 @@
       typeof obj.file_path === 'string'
         ? obj.file_path
         : typeof obj.path === 'string'
-        ? obj.path
-        : null;
+          ? obj.path
+          : null;
     if (path) {
       const content = typeof obj.content === 'string' ? obj.content : null;
       ui.emitFileChangeApplied({
@@ -210,7 +210,7 @@
           ui.emitVerificationResult({
             phase,
             success: false,
-            failures: [(e as Error).message],
+            failures: [String((e as Error).message ?? e)],
           });
         }
         throw e;

kelsonpw · 2026-04-25T21:03:23Z

@cursor push 19ca88c

Applied via @cursor push command

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Falsy check on content drops bytes for empty writes
- Changed content && to content !== null && so that empty string content correctly emits bytes: 0 instead of being silently omitted.

Or push these changes by commenting:

@cursor push 826ae92626

Preview (826ae92626)

diff --git a/src/lib/inner-lifecycle.ts b/src/lib/inner-lifecycle.ts
--- a/src/lib/inner-lifecycle.ts
+++ b/src/lib/inner-lifecycle.ts
@@ -178,7 +178,7 @@
       ui.emitFileChangeApplied({
         path,
         operation,
-        ...(content && { bytes: Buffer.byteLength(content, 'utf8') }),
+        ...(content !== null && { bytes: Buffer.byteLength(content, 'utf8') }),
       });
     }
     return Promise.resolve({});

cursor · 2026-04-25T21:09:02Z

+      ui.emitFileChangeApplied({
+        path,
+        operation,
+        ...(content && { bytes: Buffer.byteLength(content, 'utf8') }),


Falsy check on content drops bytes for empty writes

Low Severity

The content && { bytes: Buffer.byteLength(content, 'utf8') } spread uses a truthiness check. An empty string '' is falsy in JavaScript, so when a Write tool creates an empty file, the && short-circuits to '' and spreading that is a no-op. This means bytes: 0 is silently omitted instead of being reported, creating an inconsistency where non-empty writes include bytes but empty writes don't. Using content !== null instead of content would correctly emit bytes: 0.
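
The difference between the two checks can be reproduced in isolation. This is a minimal sketch with hypothetical names (`buildAppliedBuggy`, `buildAppliedFixed`, `FileChangeApplied`); it uses `TextEncoder` for the UTF-8 byte count so it runs outside Node as well, whereas the PR uses `Buffer.byteLength`.

```typescript
// Hypothetical payload shape, modeled on the emitFileChangeApplied call.
interface FileChangeApplied {
  path: string;
  operation: "create" | "modify";
  bytes?: number;
}

function utf8ByteLength(s: string): number {
  return new TextEncoder().encode(s).length;
}

// Truthiness check: an empty string short-circuits to "", and spreading ""
// adds no properties, so `bytes` is omitted for empty writes.
function buildAppliedBuggy(
  path: string,
  operation: "create" | "modify",
  content: string | null
): FileChangeApplied {
  return {
    path,
    operation,
    ...(content && { bytes: utf8ByteLength(content) }),
  };
}

// Explicit null check: only a missing content value omits `bytes`.
function buildAppliedFixed(
  path: string,
  operation: "create" | "modify",
  content: string | null
): FileChangeApplied {
  return {
    path,
    operation,
    ...(content !== null && { bytes: utf8ByteLength(content) }),
  };
}
```

With an empty file, the first builder returns an object without a `bytes` key while the second returns `bytes: 0`; both agree for non-empty content and for `null`.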

kelsonpw and others added 4 commits April 25, 2026 12:50

kelsonpw requested a review from a team April 25, 2026 20:08

kelsonpw mentioned this pull request Apr 25, 2026

refactor: [BA-47] rename "workspace" to "project" across the wizard #235

Merged

7 tasks

cursor Bot reviewed Apr 25, 2026

Comment thread src/lib/inner-lifecycle.ts

kelsonpw mentioned this pull request Apr 25, 2026

feat(agent): typed UI-hint protocol on needs_input + projects list command #257

Closed

5 tasks

fix: handle non-Error throws in withVerification failures array

cbeeb24

Applied via @cursor push command

cursor Bot reviewed Apr 25, 2026

kelsonpw force-pushed the kelsonpw/agent-plan-apply-verify branch from 62932a3 to db720a0 Compare April 26, 2026 03:33

kelsonpw deleted the branch kelsonpw/agent-plan-apply-verify April 26, 2026 04:03

kelsonpw closed this Apr 26, 2026

kelsonpw mentioned this pull request Apr 26, 2026

feat(agent): inner-agent lifecycle + file_change NDJSON events #270

Merged

4 tasks

kelsonpw wants to merge 5 commits into kelsonpw/agent-plan-apply-verify from kelsonpw/agent-inner-lifecycle
