feat(agent): inner-agent lifecycle + file_change NDJSON events #256

Closed
kelsonpw wants to merge 5 commits into kelsonpw/agent-plan-apply-verify from kelsonpw/agent-inner-lifecycle

@kelsonpw kelsonpw commented Apr 25, 2026

Summary

Surface what the inner Claude SDK agent is doing so outer agents (Claude Code, Cursor, Codex) can mirror progress, attribute file changes to specific tools, and decide when to abort. Adds eight new event types to the agent-mode wire format and a hook factory that's ready to wire in.

This is Gap 2 of 4 in the wizard sub-agent contract. Stacked on #255.

New events

inner_agent_started     model + phase + planId at SessionStart
tool_call               every PreToolUse, with privacy-safe summary
file_change_planned     write-tool intent (path + create/modify/delete)
file_change_applied     write-tool success, paired with planned
event_plan_proposed     confirm_event_plan invocation
event_plan_confirmed    decision + source (auto/human/flag)
verification_started    pre-check phase boundary
verification_result     pass/fail with structured failures
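
A consumer-side sketch of these events, for readers wiring up an outer agent. It assumes the envelope carries the event name in a `type` field and the payload in `data`; the PR only names `v`, `@timestamp`, `session_id`, and `run_id`, so both keys are guesses. Payload fields follow the table above where it lists them, the `tool_call` summary is assumed to be a string, and `parseLine` and `unappliedPaths` are hypothetical helpers, not part of the wizard.

```typescript
// Hypothetical consumer-side types. `type` and `data` are ASSUMED envelope keys.
type WriteOperation = "create" | "modify" | "delete";

type InnerLifecycleEvent =
  | { type: "inner_agent_started"; data: { model: string; phase: string; planId?: string } }
  | { type: "tool_call"; data: { tool: string; summary: string } }
  | { type: "file_change_planned"; data: { path: string; operation: WriteOperation } }
  | { type: "file_change_applied"; data: { path: string; operation: WriteOperation; bytes?: number } }
  | { type: "verification_result"; data: { phase: string; success: boolean; failures?: string[] } }
  // Payload shapes for these three are not spelled out in the PR, so they stay untyped here.
  | { type: "event_plan_proposed" | "event_plan_confirmed" | "verification_started"; data: Record<string, unknown> };

const KNOWN_TYPES = new Set<string>([
  "inner_agent_started", "tool_call", "file_change_planned", "file_change_applied",
  "event_plan_proposed", "event_plan_confirmed", "verification_started", "verification_result",
]);

// Parse one NDJSON line; returns null for malformed JSON or unknown event types.
function parseLine(line: string): InnerLifecycleEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch (_err) {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const eventType = (parsed as { type?: unknown }).type;
  if (typeof eventType !== "string" || !KNOWN_TYPES.has(eventType)) return null;
  return parsed as InnerLifecycleEvent;
}

// Pair planned and applied writes by path; whatever is left was planned but never applied.
function unappliedPaths(lines: string[]): string[] {
  const pending = new Set<string>();
  for (const line of lines) {
    const ev = parseLine(line);
    if (ev === null) continue;
    if (ev.type === "file_change_planned") pending.add(ev.data.path);
    if (ev.type === "file_change_applied") pending.delete(ev.data.path);
  }
  return Array.from(pending);
}
```

An outer agent could call `unappliedPaths` on the lines seen so far when deciding whether an aborted run left planned writes outstanding.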

Why additive only

Implementation is intentionally additive: no changes to agent-interface.ts, to avoid conflicts with #243 (PreToolUse migration) and #149 (observability spine).

The new createInnerLifecycleHooks(config) factory in src/lib/inner-lifecycle.ts returns hook callbacks shaped to merge into the existing buildHooksConfig call. Wiring is documented in the module header so a follow-up PR (or whoever lands #243) can compose the hooks in one ~5-line change:

// src/lib/agent-interface.ts, inside buildHooksConfig({...})
import { createInnerLifecycleHooks } from './inner-lifecycle.js';
const inner = createInnerLifecycleHooks({ phase: 'wizard' });
buildHooksConfig({
  ...inner.hooks(),
  Stop: createStopHook(...),
});

What's in

  • src/lib/agent-events.ts: 8 new typed event interfaces, plus three helpers:
    • summarizeToolInput(name, input): privacy-safe summary for PreToolUse payloads (file paths for Read/Edit/Write, command head for Bash, pattern for Grep, etc.)
    • classifyWriteOperation(name): Write→create, Edit→modify, others→null
    • summarizeForEvent(s, max): truncate-with-ellipsis for any string
  • src/ui/agent-ui.ts: 8 new emit* methods, all thin wrappers over the existing emit() infrastructure. They reuse the same envelope (v: 1, @timestamp, session_id, run_id, …) so outer agents need only one parser.
  • src/lib/inner-lifecycle.ts: createInnerLifecycleHooks(config) factory. Returns:
    • hooks() → { SessionStart, PreToolUse, PostToolUse }, ready to merge into buildHooksConfig
    • emitEventPlanProposed(events) for the wizard-tools MCP server to call from confirm_event_plan
    • emitEventPlanConfirmed(source, decision) for the same
    • withVerification(phase, fn) wrapper that emits started/result around any async check
  • AgentUI emit methods are no-ops in TUI/CI mode: the helper checks getUI() instanceof AgentUI and short-circuits cleanly so the same hook config can be used in any mode.
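
To make the three helper descriptions concrete, here is a rough sketch of what they might look like. It is not the code in agent-events.ts: the input keys `command` and `pattern`, the reading of "command head" as the first whitespace-delimited token, and the 80-character cap are all assumptions made for illustration.

```typescript
type WriteOperation = "create" | "modify" | "delete";

// Truncate to at most `max` UTF-16 code units, ending with an ellipsis when cut.
// Note: slicing by code unit can split a surrogate pair; a real helper may care.
function summarizeForEvent(s: string, max: number): string {
  if (max <= 0) return "";
  if (s.length <= max) return s;
  return s.slice(0, max - 1) + "…";
}

// Write→create, Edit→modify, anything else is not treated as a write.
function classifyWriteOperation(name: string): WriteOperation | null {
  switch (name) {
    case "Write":
      return "create";
    case "Edit":
      return "modify";
    default:
      return null;
  }
}

// Report only a path, a command head, or a pattern; never file contents.
function summarizeToolInput(name: string, input: unknown): string {
  const obj: Record<string, unknown> =
    typeof input === "object" && input !== null ? (input as Record<string, unknown>) : {};
  const str = (key: string): string | null => {
    const value = obj[key];
    return typeof value === "string" ? value : null;
  };
  switch (name) {
    case "Read":
    case "Edit":
    case "Write":
      return str("file_path") ?? str("path") ?? "";
    case "Bash": {
      // ASSUMPTION: "command head" means the first token, so arguments never leak.
      const head = (str("command") ?? "").trim().split(/\s+/)[0] ?? "";
      return summarizeForEvent(head, 80);
    }
    case "Grep":
      return summarizeForEvent(str("pattern") ?? "", 80);
    default:
      return "";
  }
}
```

Returning an empty string for unknown tools keeps the default privacy-safe: nothing from an unrecognized payload is echoed into the event stream.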

Test plan

  • pnpm test: 1288 passed, 17 skipped (21 new tests in inner-lifecycle.test.ts)
  • pnpm tsc --noEmit clean
  • pnpm lint clean

Out of scope (follow-up)

cc @amplitude/growth

🤖 Generated with Claude Code


Note

Medium Risk
Medium risk because it expands the agent-mode NDJSON wire schema and adds new emitters/hooks that affect stdout output consumed by external orchestrators. Behavior is otherwise additive and gated to AgentUI (agent mode) only.

Overview
Adds inner-agent lifecycle observability to agent-mode output by introducing 8 new NDJSON event payloads in agent-events.ts (start, tool calls, planned/applied file changes, event-plan proposed/confirmed, and verification started/result), plus helpers to privacy-safely summarize tool inputs and classify write operations.

Introduces createInnerLifecycleHooks (inner-lifecycle.ts) to forward Claude SDK SessionStart/PreToolUse/PostToolUse signals into those NDJSON events (including byte counts for applied writes), and extends AgentUI with corresponding emit* methods.

Adds Vitest coverage validating the summarizers and that the hook factory emits the expected NDJSON lines when AgentUI is active.

Reviewed by Cursor Bugbot for commit cbeeb24. Bugbot is set up for automated code reviews on this repo.

kelsonpw and others added 4 commits April 25, 2026 12:50
Adds a typed `needs_input` event so outer agents (Claude Code, Cursor,
Codex) can deterministically surface decisions to a human instead of the
wizard silently auto-selecting. Every event carries `code`, `choices`,
`recommended`, and `resumeFlags` so the orchestrator can either re-invoke
with a single CLI flag or pipe a JSON line to stdin.

- New `src/lib/agent-events.ts` — source-of-truth schema for the
  agent-mode wire format. `AgentEventEnvelope`, `NeedsInputData`,
  `AgentEventType` all live here so future events land against one doc.
- `AgentUI.emitNeedsInput()` — public method any caller (including
  upcoming plan/apply commands) can use to surface a decision.
- `promptConfirm` and `promptChoice` now emit a `needs_input` event in
  addition to the legacy `prompt` event so existing orchestrators keep
  working while new ones key off the canonical shape.
- `promptEnvironmentSelection` emits both event shapes; the existing
  stdin round-trip behavior is unchanged.
- `ExitCode.INPUT_REQUIRED = 12` so future flag-matrix work can exit
  cleanly when input is needed and `--auto-approve` is not set.
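
As an illustration of how an orchestrator might act on `needs_input`, the sketch below picks the recommended choice when auto-approval is allowed and otherwise hands the decision to a human. The commit only names the four fields; the shapes used here (choices as value/label pairs, `resumeFlags` as a map from choice value to CLI arguments) are assumptions, and `resolveNeedsInput` is a hypothetical helper.

```typescript
// ASSUMED shapes: the commit message lists the field names but not their types.
interface NeedsInputChoice {
  value: string;
  label: string;
}

interface NeedsInputData {
  code: string;
  choices: NeedsInputChoice[];
  recommended?: string;
  resumeFlags?: Record<string, string[]>; // assumed: choice value -> CLI args for re-invocation
}

type Resolution =
  | { kind: "reinvoke"; flags: string[] }
  | { kind: "ask_human"; code: string; options: string[] };

// Re-invoke with the recommended flags only when auto-approval is on AND the
// event actually supplies flags for that choice; otherwise surface the choices.
function resolveNeedsInput(data: NeedsInputData, autoApprove: boolean): Resolution {
  const recommended = data.recommended;
  const flags = recommended !== undefined ? data.resumeFlags?.[recommended] : undefined;
  if (autoApprove && flags !== undefined) {
    return { kind: "reinvoke", flags };
  }
  return { kind: "ask_human", code: data.code, options: data.choices.map((c) => c.label) };
}
```

Falling back to `ask_human` whenever flags are missing keeps the orchestrator from guessing at a command line the wizard did not offer.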

Tests: +3 in agent-ui.test.ts. Suite green (1240 pass, 17 skip).

Part of the wizard sub-agent design — Gap 1 of 4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-gate

Splits the implicit "agent-mode auto-approves everything" behavior into
three explicit, composable capabilities so plan/apply/verify can layer
on without ambiguity:

  --auto-approve  → silently pick `recommended` on `needs_input` (no writes)
  --yes (-y)      → autoApprove + allowWrites (today's --yes / --ci semantics)
  --force         → autoApprove + allowWrites + allowDestructive

Back-compat preserved: `--agent` alone still implies `autoApprove + allowWrites`.
The upcoming `apply` command will pass `requireExplicitWrites: true` to
`resolveMode`, which forces writes to be requested by name.

- `ModeConfig` extends new `CapabilityFlags` interface
- `resolveMode` builds capabilities additively from the flag set
- `evaluateWriteGate(toolName, toolInput, caps)` is a pure function the
  PreToolUse hook can call to decide allow/deny — gates Edit/Write/
  MultiEdit/NotebookEdit on `allowWrites`, and gates a curated set of
  destructive Bash patterns (rm -rf, git reset --hard, git push --force,
  DROP TABLE, etc.) on `allowDestructive`
- New `--auto-approve` and `--force` global flags in bin.ts
- `--yes` consolidated to a single global declaration with `-y` alias
- `ExitCode.WRITE_REFUSED = 13` for clean re-invocation by outer agents
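
The flag-to-capability mapping and the gate can be sketched as two pure functions. This is an illustrative reading of the bullets above, not the code in mode-config: `resolveCapabilities`, `FlagSet`, and `GateDecision` are hypothetical names, the regexes cover only the four patterns the commit message lists, and it is an assumption that `requireExplicitWrites` suppresses only the write half of the `--agent` back-compat.

```typescript
interface CapabilityFlags {
  autoApprove: boolean;
  allowWrites: boolean;
  allowDestructive: boolean;
}

// Hypothetical parsed-flag shape for this sketch.
interface FlagSet {
  agent?: boolean;
  autoApprove?: boolean;
  yes?: boolean;
  force?: boolean;
}

// Build capabilities additively: each flag can only turn capabilities on.
function resolveCapabilities(
  flags: FlagSet,
  opts: { requireExplicitWrites?: boolean } = {},
): CapabilityFlags {
  const caps: CapabilityFlags = { autoApprove: false, allowWrites: false, allowDestructive: false };
  if (flags.autoApprove) caps.autoApprove = true;
  if (flags.yes) {
    caps.autoApprove = true;
    caps.allowWrites = true;
  }
  if (flags.force) {
    caps.autoApprove = true;
    caps.allowWrites = true;
    caps.allowDestructive = true;
  }
  if (flags.agent) {
    caps.autoApprove = true;
    // ASSUMPTION: requireExplicitWrites only removes the implied write permission.
    if (!opts.requireExplicitWrites) caps.allowWrites = true;
  }
  return caps;
}

const WRITE_TOOLS = new Set(["Edit", "Write", "MultiEdit", "NotebookEdit"]);
const DESTRUCTIVE_BASH = [
  /\brm\s+-rf\b/,
  /\bgit\s+reset\s+--hard\b/,
  /\bgit\s+push\s+--force\b/,
  /\bDROP\s+TABLE\b/i,
];

type GateDecision = { allow: true } | { allow: false; reason: string };

// Pure decision function: no I/O, so a PreToolUse hook can call it directly.
function evaluateWriteGate(toolName: string, toolInput: unknown, caps: CapabilityFlags): GateDecision {
  if (WRITE_TOOLS.has(toolName) && !caps.allowWrites) {
    return { allow: false, reason: `${toolName} requires allowWrites` };
  }
  if (toolName === "Bash" && !caps.allowDestructive) {
    const raw = typeof toolInput === "object" && toolInput !== null
      ? (toolInput as { command?: unknown }).command
      : undefined;
    const command = typeof raw === "string" ? raw : "";
    if (DESTRUCTIVE_BASH.some((re) => re.test(command))) {
      return { allow: false, reason: "destructive Bash command requires allowDestructive" };
    }
  }
  return { allow: true };
}
```

Pattern matching on shell text is a heuristic; it will miss obfuscated commands, so a gate like this is a guard rail rather than a sandbox.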

Tests: +16 in mode-config.test.ts (35 total). Suite green (1257 pass).

Part of the wizard sub-agent design — Gap 4 of 4. Stacked on #253.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Splits the wizard's run-everything-at-once flow into three explicit
phases so outer agents (Claude Code, Cursor, Codex) can inspect a plan
before any writes happen:

  npx wizard plan
    → emits a `plan` NDJSON event with planId + framework + sdk +
      resumeFlags. No writes. Persists the plan to
      $TMPDIR/amplitude-wizard-plans/<planId>.json with a 24h TTL.

  npx wizard apply --plan-id <id> --yes
    → loads + validates the plan, then runs the wizard. Refuses without
      --yes (exits WRITE_REFUSED=13 with a clear resume hint). Refuses
      stale or unknown plan IDs (exits INVALID_ARGS=2).

  npx wizard verify
    → cheap, no-network check that SDK is installed + API key is
      configured + framework is detectable. Emits a structured
      `verification_result` event.

Implementation:

- `src/lib/agent-plans.ts` — typed plan persistence layer.
  Zod-validated WizardPlan schema (v1), atomic JSON writes (0o600
  perms), TTL-based expiry, `pruneStalePlans` for cleanup.
- `src/lib/agent-ops.ts` — adds `runPlan` and `runVerify` business
  logic alongside the existing `runDetect` / `runStatus`. No UI, no
  process.exit; thin yargs handlers compose them.
- `bin.ts` — three new yargs `.command()` entries that all pass
  `requireExplicitWrites: true` to `resolveMode` so the new commands
  opt out of the agent-implies-writes back-compat.
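
The persistence behavior described for agent-plans.ts (atomic write, 0o600 permissions, 24h TTL) can be sketched with Node's standard library alone. This is an assumption-laden illustration rather than the module itself: it swaps the Zod schema for a hand-written type guard to stay dependency-free, the `WizardPlan` fields beyond `planId`, `framework`, and `sdk` are guesses, and `savePlan` / `loadPlan` are hypothetical names.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import { randomUUID } from "node:crypto";

const PLAN_TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL, per the commit message
const DEFAULT_PLAN_DIR = path.join(os.tmpdir(), "amplitude-wizard-plans");

// ASSUMED shape; the real schema is Zod-validated and may carry more fields.
interface WizardPlan {
  v: 1;
  planId: string;
  createdAt: string; // ISO 8601 timestamp
  framework: string;
  sdk: string;
}

function isWizardPlan(x: unknown): x is WizardPlan {
  if (typeof x !== "object" || x === null) return false;
  const p = x as Record<string, unknown>;
  return p["v"] === 1 && typeof p["planId"] === "string" && typeof p["createdAt"] === "string"
    && typeof p["framework"] === "string" && typeof p["sdk"] === "string";
}

// Write to a temp file with owner-only permissions, then rename into place so
// readers never observe a half-written plan.
function savePlan(dir: string, fields: { framework: string; sdk: string }, now: Date = new Date()): WizardPlan {
  const plan: WizardPlan = { v: 1, planId: randomUUID(), createdAt: now.toISOString(), ...fields };
  fs.mkdirSync(dir, { recursive: true });
  const finalPath = path.join(dir, `${plan.planId}.json`);
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(plan), { mode: 0o600 });
  fs.renameSync(tmpPath, finalPath);
  return plan;
}

// Returns null for unknown, malformed, or expired plans.
function loadPlan(dir: string, planId: string, now: Date = new Date()): WizardPlan | null {
  // Reject anything that is not a bare file name so a planId cannot escape `dir`.
  if (path.basename(planId) !== planId) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(fs.readFileSync(path.join(dir, `${planId}.json`), "utf8"));
  } catch (_err) {
    return null;
  }
  if (!isWizardPlan(parsed)) return null;
  const ageMs = now.getTime() - Date.parse(parsed.createdAt);
  if (Number.isNaN(ageMs) || ageMs > PLAN_TTL_MS) return null;
  return parsed;
}
```

Passing `now` explicitly keeps expiry testable without waiting a day; callers that do not care can rely on the default and on `DEFAULT_PLAN_DIR`.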

Smoke tests (manual, in this directory):
  $ wizard plan --json     → emits plan envelope, exit 0
  $ wizard apply --plan-id X --json   → exits 13, no writes
  $ wizard apply --plan-id X --yes --json   → executes
  $ wizard verify --json   → emits verification_result, exit reflects pass/fail
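
For an outer agent, the exit codes named in this stack suggest a small dispatch table after `wizard apply`. The sketch below is hypothetical: the numeric values 2, 12, and 13 come from the commit messages, while the `SUCCESS` name, the `NextStep` labels, and `nextStepForApply` itself are invented for illustration.

```typescript
// Values for INVALID_ARGS, INPUT_REQUIRED and WRITE_REFUSED are from the commit
// messages; SUCCESS = 0 is inferred from "exit 0" in the smoke tests.
const ExitCode = {
  SUCCESS: 0,
  INVALID_ARGS: 2,
  INPUT_REQUIRED: 12,
  WRITE_REFUSED: 13,
} as const;

type NextStep = "done" | "replan" | "surface_needs_input" | "retry_with_yes" | "report_failure";

// Decide what an orchestrator might do after `wizard apply` exits.
function nextStepForApply(exitCode: number): NextStep {
  switch (exitCode) {
    case ExitCode.SUCCESS:
      return "done";
    case ExitCode.INVALID_ARGS:
      // Stale or unknown plan IDs exit with INVALID_ARGS, so a fresh plan is needed.
      return "replan";
    case ExitCode.INPUT_REQUIRED:
      return "surface_needs_input";
    case ExitCode.WRITE_REFUSED:
      // apply refuses to write without --yes.
      return "retry_with_yes";
    default:
      return "report_failure";
  }
}
```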

Tests: +10 in agent-plans.test.ts (1267 total). Suite green.

Part of the wizard sub-agent design — Gap 3 of 4. Stacked on #254.

Note: the `apply` handler currently spawns the existing `--agent --yes`
wizard run with the planId in the env. Wiring the plan into the agent
prompt (so the inner agent reads `WizardPlan` and reports back) is a
follow-up that lands cleanly when #180 (three-phase handoff schemas)
ships.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surface what the inner Claude SDK agent is doing so outer agents can
mirror progress, attribute file changes to specific tools, and decide
when to abort. Adds eight new event types to the agent-mode wire format:

  inner_agent_started     — model + phase + planId at SessionStart
  tool_call               — every PreToolUse, with privacy-safe summary
  file_change_planned     — write-tool intent (path + create/modify/delete)
  file_change_applied     — write-tool success, paired with planned
  event_plan_proposed     — confirm_event_plan invocation
  event_plan_confirmed    — decision + source (auto/human/flag)
  verification_started    — pre-check phase boundary
  verification_result     — pass/fail with structured failures

Implementation is intentionally additive — no changes to agent-interface.ts
to avoid conflicts with #243 (PreToolUse migration) and #149 (observability
spine). The new `createInnerLifecycleHooks(config)` factory in
src/lib/inner-lifecycle.ts returns hook callbacks shaped to merge into
the existing `buildHooksConfig` call. Wiring is documented in the module
header so a follow-up PR (or whoever lands #243) can compose the hooks
in one ~5-line change.

Helpers in agent-events.ts:
  - summarizeToolInput(name, input) — privacy-safe summary for PreToolUse
    payloads (file paths for Read/Edit/Write, command head for Bash, etc.)
  - classifyWriteOperation(name) — Write→create, Edit→modify, others→null
  - summarizeForEvent(s, max) — truncate-with-ellipsis for any string

AgentUI emit methods are no-ops in TUI/CI mode — the helper checks
`getUI() instanceof AgentUI` and short-circuits cleanly.

Tests: +21 in inner-lifecycle.test.ts (1288 total). Suite green.

Part of the wizard sub-agent design — Gap 2 of 4. Stacked on #255.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kelsonpw kelsonpw requested a review from a team April 25, 2026 20:08
@github-actions

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

  • /wizard-ci all

Test all apps in a directory:

  • /wizard-ci django
  • /wizard-ci fastapi
  • /wizard-ci flask
  • /wizard-ci javascript-node
  • /wizard-ci javascript-web
  • /wizard-ci next-js
  • /wizard-ci python
  • /wizard-ci react-router
  • /wizard-ci vue

Test an individual app:

  • /wizard-ci django/django3-saas
  • /wizard-ci fastapi/fastapi3-ai-saas
  • /wizard-ci flask/flask3-social-media
  • /wizard-ci javascript-node/express-todo
  • /wizard-ci javascript-node/fastify-blog
  • /wizard-ci javascript-node/hono-links
  • /wizard-ci javascript-node/koa-notes
  • /wizard-ci javascript-node/native-http-contacts
  • /wizard-ci javascript-web/saas-dashboard
  • /wizard-ci next-js/15-app-router-saas
  • /wizard-ci next-js/15-app-router-todo
  • /wizard-ci next-js/15-pages-router-saas
  • /wizard-ci next-js/15-pages-router-todo
  • /wizard-ci python/meeting-summarizer
  • /wizard-ci react-router/react-router-v7-project
  • /wizard-ci react-router/rrv7-starter
  • /wizard-ci react-router/saas-template
  • /wizard-ci react-router/shopper
  • /wizard-ci vue/movies

Results will be posted here when complete.


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Non-Error throws produce [null] in failures array
    • Changed (e as Error).message to String((e as Error).message ?? e) so non-Error thrown values produce a human-readable string instead of undefined/null.

Or push these changes by commenting:

@cursor push 19ca88cf89
Preview (19ca88cf89)
diff --git a/src/lib/inner-lifecycle.ts b/src/lib/inner-lifecycle.ts
--- a/src/lib/inner-lifecycle.ts
+++ b/src/lib/inner-lifecycle.ts
@@ -114,14 +114,14 @@
       typeof input.tool_name === 'string'
         ? input.tool_name
         : typeof input.toolName === 'string'
-        ? input.toolName
-        : 'unknown';
+          ? input.toolName
+          : 'unknown';
     const toolInput =
       typeof input.tool_input !== 'undefined'
         ? input.tool_input
         : typeof input.toolInput !== 'undefined'
-        ? input.toolInput
-        : null;
+          ? input.toolInput
+          : null;
     const summary = summarizeToolInput(toolName, toolInput);
     ui.emitToolCall({ tool: toolName, summary });
 
@@ -137,8 +137,8 @@
         typeof obj.file_path === 'string'
           ? obj.file_path
           : typeof obj.path === 'string'
-          ? obj.path
-          : null;
+            ? obj.path
+            : null;
       if (path) {
         ui.emitFileChangePlanned({ path, operation });
       }
@@ -153,16 +153,16 @@
       typeof input.tool_name === 'string'
         ? input.tool_name
         : typeof input.toolName === 'string'
-        ? input.toolName
-        : 'unknown';
+          ? input.toolName
+          : 'unknown';
     const operation = classifyWriteOperation(toolName);
     if (!operation) return Promise.resolve({});
     const toolInput =
       typeof input.tool_input !== 'undefined'
         ? input.tool_input
         : typeof input.toolInput !== 'undefined'
-        ? input.toolInput
-        : null;
+          ? input.toolInput
+          : null;
     const obj =
       toolInput && typeof toolInput === 'object'
         ? (toolInput as Record<string, unknown>)
@@ -171,8 +171,8 @@
       typeof obj.file_path === 'string'
         ? obj.file_path
         : typeof obj.path === 'string'
-        ? obj.path
-        : null;
+          ? obj.path
+          : null;
     if (path) {
       const content = typeof obj.content === 'string' ? obj.content : null;
       ui.emitFileChangeApplied({
@@ -210,7 +210,7 @@
           ui.emitVerificationResult({
             phase,
             success: false,
-            failures: [(e as Error).message],
+            failures: [String((e as Error).message ?? e)],
           });
         }
         throw e;
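
One caveat on the `failures` change in this diff: `(e as Error).message` is still a property access, so it throws a TypeError when the thrown value is `null` or `undefined` (for example `throw null`). A more defensive variant is sketched below. `describeThrown` and the `VerificationEmitter` interface are hypothetical, the wrapper takes the UI as an explicit argument for testability (the PR's `withVerification(phase, fn)` does not), and `emitVerificationStarted` is a name inferred from the emit* naming pattern.

```typescript
// Turn any thrown value into a readable string without ever throwing itself.
function describeThrown(e: unknown): string {
  if (e instanceof Error) return e.message;
  if (typeof e === "string") return e;
  if (typeof e === "object" && e !== null) {
    const message = (e as { message?: unknown }).message;
    if (typeof message === "string") return message;
  }
  return String(e); // covers null, undefined, numbers, symbols, plain objects
}

// Hypothetical minimal surface of the UI used by the wrapper.
interface VerificationEmitter {
  emitVerificationStarted(data: { phase: string }): void;
  emitVerificationResult(data: { phase: string; success: boolean; failures?: string[] }): void;
}

// Emit started/result around an async check and rethrow the original value.
async function withVerification<T>(
  ui: VerificationEmitter,
  phase: string,
  fn: () => Promise<T>,
): Promise<T> {
  ui.emitVerificationStarted({ phase });
  try {
    const result = await fn();
    ui.emitVerificationResult({ phase, success: true });
    return result;
  } catch (e) {
    ui.emitVerificationResult({ phase, success: false, failures: [describeThrown(e)] });
    throw e;
  }
}
```

Rethrowing `e` unchanged preserves the caller's error handling; only the emitted event sees the stringified form.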

@kelsonpw

@cursor push 19ca88c


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Falsy check on content drops bytes for empty writes
    • Changed content && to content !== null && so that empty string content correctly emits bytes: 0 instead of being silently omitted.

Or push these changes by commenting:

@cursor push 826ae92626
Preview (826ae92626)
diff --git a/src/lib/inner-lifecycle.ts b/src/lib/inner-lifecycle.ts
--- a/src/lib/inner-lifecycle.ts
+++ b/src/lib/inner-lifecycle.ts
@@ -178,7 +178,7 @@
       ui.emitFileChangeApplied({
         path,
         operation,
-        ...(content && { bytes: Buffer.byteLength(content, 'utf8') }),
+        ...(content !== null && { bytes: Buffer.byteLength(content, 'utf8') }),
       });
     }
     return Promise.resolve({});

Reviewed by Cursor Bugbot for commit cbeeb24.

ui.emitFileChangeApplied({
path,
operation,
...(content && { bytes: Buffer.byteLength(content, 'utf8') }),
Falsy check on content drops bytes for empty writes

Low Severity

The content && { bytes: Buffer.byteLength(content, 'utf8') } spread uses a truthiness check. An empty string '' is falsy in JavaScript, so when a Write tool creates an empty file, the && short-circuits to '' and spreading that is a no-op. This means bytes: 0 is silently omitted instead of being reported, creating an inconsistency where non-empty writes include bytes but empty writes don't. Using content !== null instead of content would correctly emit bytes: 0.
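
The difference is easy to reproduce in isolation. In the sketch below, `appliedPayloadTruthy` and `appliedPayloadFixed` are hypothetical stand-ins for the payload built in the hook, and `TextEncoder` is used in place of `Buffer.byteLength` only so the snippet does not depend on Node's `Buffer`.

```typescript
// UTF-8 byte length without relying on Node's Buffer.
function utf8ByteLength(s: string): number {
  return new TextEncoder().encode(s).length;
}

// Original form: '' is falsy, so the spread receives '' and adds nothing.
function appliedPayloadTruthy(path: string, content: string | null) {
  return { path, ...(content && { bytes: utf8ByteLength(content) }) };
}

// Fixed form: only a missing (null) content omits `bytes`.
function appliedPayloadFixed(path: string, content: string | null) {
  return { path, ...(content !== null && { bytes: utf8ByteLength(content) }) };
}

const emptyTruthy = appliedPayloadTruthy("empty.txt", "");
const emptyFixed = appliedPayloadFixed("empty.txt", "");
console.log("bytes" in emptyTruthy); // false: the empty write loses its byte count
console.log(emptyFixed.bytes); // 0
```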

@kelsonpw kelsonpw force-pushed the kelsonpw/agent-plan-apply-verify branch from 62932a3 to db720a0 Compare April 26, 2026 03:33
@kelsonpw kelsonpw deleted the branch kelsonpw/agent-plan-apply-verify April 26, 2026 04:03
@kelsonpw kelsonpw closed this Apr 26, 2026