feat(agent): inner-agent lifecycle + file_change NDJSON events #256
kelsonpw wants to merge 5 commits into kelsonpw/agent-plan-apply-verify
Adds a typed `needs_input` event so outer agents (Claude Code, Cursor, Codex) can deterministically surface decisions to a human instead of the wizard silently auto-selecting. Every event carries `code`, `choices`, `recommended`, and `resumeFlags` so the orchestrator can either re-invoke with a single CLI flag or pipe a JSON line to stdin.

- New `src/lib/agent-events.ts`: source-of-truth schema for the agent-mode wire format. `AgentEventEnvelope`, `NeedsInputData`, and `AgentEventType` all live here so future events land against one doc.
- `AgentUI.emitNeedsInput()`: public method any caller (including upcoming plan/apply commands) can use to surface a decision.
- `promptConfirm` and `promptChoice` now emit a `needs_input` event in addition to the legacy `prompt` event, so existing orchestrators keep working while new ones key off the canonical shape.
- `promptEnvironmentSelection` emits both event shapes; the existing stdin round-trip behavior is unchanged.
- `ExitCode.INPUT_REQUIRED = 12` so future flag-matrix work can exit cleanly when input is needed and `--auto-approve` is not set.

Tests: +3 in agent-ui.test.ts. Suite green (1240 pass, 17 skip).
Part of the wizard sub-agent design, Gap 1 of 4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
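To make the event shape concrete, here is a minimal sketch of how an outer agent might react to a `needs_input` line. Only `code`, `choices`, `recommended`, and `resumeFlags` come from the commit message; the envelope keys (`type`, `data`), the choice and flag values, and the `decide` helper are assumptions for illustration, not the actual schema in `src/lib/agent-events.ts`.

```typescript
// Hypothetical shapes: only code/choices/recommended/resumeFlags are named in the commit message.
interface NeedsInputData {
  code: string;                        // stable identifier for the decision
  choices: string[];                   // assumed: plain choice identifiers
  recommended: string;                 // the choice the wizard would otherwise auto-select
  resumeFlags: Record<string, string>; // assumed: choice -> CLI flag to re-invoke with
}

interface AgentEventEnvelope {
  type: string;  // assumed discriminator key
  data: unknown; // assumed payload key
}

type Decision =
  | { kind: "reinvoke"; flag: string }
  | { kind: "ask_human"; code: string; choices: string[] };

// Parse one NDJSON line; return a decision for needs_input events, null for anything else.
function decide(line: string, autoApprove: boolean): Decision | null {
  const event = JSON.parse(line) as AgentEventEnvelope;
  if (event.type !== "needs_input") return null;
  const data = event.data as NeedsInputData;
  const flag = data.resumeFlags[data.recommended];
  if (autoApprove && flag !== undefined) return { kind: "reinvoke", flag };
  return { kind: "ask_human", code: data.code, choices: data.choices };
}

// Sample line with made-up values (the `--env` flag is hypothetical).
const sample = JSON.stringify({
  type: "needs_input",
  data: {
    code: "environment_selection",
    choices: ["production", "development"],
    recommended: "production",
    resumeFlags: { production: "--env production", development: "--env development" },
  },
});

console.log(decide(sample, true));
console.log(decide(sample, false));
```

With auto-approval the orchestrator re-invokes using the recommended flag; without it, the decision is handed to a human along with the stable `code`.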
…-gate
Splits the implicit "agent-mode auto-approves everything" behavior into three explicit, composable capabilities so plan/apply/verify can layer on without ambiguity:

  --auto-approve → silently pick `recommended` on `needs_input` (no writes)
  --yes (-y)     → autoApprove + allowWrites (today's --yes / --ci semantics)
  --force        → autoApprove + allowWrites + allowDestructive

Back-compat preserved: `--agent` alone still implies `autoApprove + allowWrites`. The upcoming `apply` command will pass `requireExplicitWrites: true` to `resolveMode`, which forces writes to be requested by name.

- `ModeConfig` extends the new `CapabilityFlags` interface
- `resolveMode` builds capabilities additively from the flag set
- `evaluateWriteGate(toolName, toolInput, caps)` is a pure function the PreToolUse hook can call to decide allow/deny. It gates Edit/Write/MultiEdit/NotebookEdit on `allowWrites`, and gates a curated set of destructive Bash patterns (rm -rf, git reset --hard, git push --force, DROP TABLE, etc.) on `allowDestructive`
- New `--auto-approve` and `--force` global flags in bin.ts
- `--yes` consolidated to a single global declaration with `-y` alias
- `ExitCode.WRITE_REFUSED = 13` for clean re-invocation by outer agents

Tests: +16 in mode-config.test.ts (35 total). Suite green (1257 pass).
Part of the wizard sub-agent design, Gap 4 of 4. Stacked on #253.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
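The flag-to-capability mapping and the write gate described above could be sketched roughly as follows. The capability names and the gated tool list come from the commit message; `capabilitiesFromFlags`, the `GateDecision` return shape, and the exact regular expressions are assumptions made for illustration and will differ from the real `resolveMode` / `evaluateWriteGate`.

```typescript
interface CapabilityFlags {
  autoApprove: boolean;
  allowWrites: boolean;
  allowDestructive: boolean;
}

// Assumed return shape; the real function may report decisions differently.
type GateDecision = { allow: true } | { allow: false; reason: string };

// Hypothetical helper: builds capabilities additively, so --force implies --yes implies --auto-approve.
function capabilitiesFromFlags(flags: { autoApprove?: boolean; yes?: boolean; force?: boolean }): CapabilityFlags {
  const force = flags.force === true;
  const yes = flags.yes === true || force;
  return {
    autoApprove: flags.autoApprove === true || yes,
    allowWrites: yes,
    allowDestructive: force,
  };
}

const WRITE_TOOLS = new Set(["Edit", "Write", "MultiEdit", "NotebookEdit"]);

// Illustrative subset of destructive shell patterns, not the curated list from the PR.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/,
  /\bgit\s+reset\s+--hard\b/,
  /\bgit\s+push\s+.*--force\b/,
  /\bDROP\s+TABLE\b/i,
];

// Pure function: no I/O, the decision depends only on its arguments.
function evaluateWriteGate(toolName: string, toolInput: unknown, caps: CapabilityFlags): GateDecision {
  if (WRITE_TOOLS.has(toolName) && !caps.allowWrites) {
    return { allow: false, reason: `${toolName} requires allowWrites` };
  }
  if (toolName === "Bash" && !caps.allowDestructive) {
    const obj = toolInput !== null && typeof toolInput === "object"
      ? (toolInput as Record<string, unknown>)
      : {};
    const command = typeof obj.command === "string" ? obj.command : "";
    if (DESTRUCTIVE_PATTERNS.some((p) => p.test(command))) {
      return { allow: false, reason: "destructive command requires allowDestructive" };
    }
  }
  return { allow: true };
}

console.log(evaluateWriteGate("Edit", { file_path: "src/app.ts" }, capabilitiesFromFlags({ autoApprove: true })));
console.log(evaluateWriteGate("Bash", { command: "rm -rf dist" }, capabilitiesFromFlags({ yes: true })));
```

Keeping the gate pure makes it cheap to unit-test against a table of tool inputs, which fits the +16 tests noted in the commit message.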
Splits the wizard's run-everything-at-once flow into three explicit
phases so outer agents (Claude Code, Cursor, Codex) can inspect a plan
before any writes happen:
  npx wizard plan
    → emits a `plan` NDJSON event with planId + framework + sdk +
      resumeFlags. No writes. Persists the plan to
      $TMPDIR/amplitude-wizard-plans/<planId>.json with a 24h TTL.
  npx wizard apply --plan-id <id> --yes
    → loads + validates the plan, then runs the wizard. Refuses without
      --yes (exits WRITE_REFUSED=13 with a clear resume hint). Refuses
      stale or unknown plan IDs (exits INVALID_ARGS=2).
  npx wizard verify
    → cheap, no-network check that SDK is installed + API key is
      configured + framework is detectable. Emits a structured
      `verification_result` event.
Implementation:
- `src/lib/agent-plans.ts` — typed plan persistence layer.
Zod-validated WizardPlan schema (v1), atomic JSON writes (0o600
perms), TTL-based expiry, `pruneStalePlans` for cleanup.
- `src/lib/agent-ops.ts` — adds `runPlan` and `runVerify` business
logic alongside the existing `runDetect` / `runStatus`. No UI, no
process.exit; thin yargs handlers compose them.
- `bin.ts` — three new yargs `.command()` entries that all pass
`requireExplicitWrites: true` to `resolveMode` so the new commands
opt out of the agent-implies-writes back-compat.
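As an illustration of the persistence layer described above (atomic JSON writes with 0o600 permissions and TTL-based expiry), here is a minimal sketch. The `WizardPlan` fields, the hand-rolled validator (the PR uses a Zod schema), and the function names `savePlan` / `loadPlan` are assumptions; only the 24h TTL, the file mode, and the `<planId>.json` layout come from the commit message.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import { randomUUID } from "node:crypto";

// Assumed minimal plan shape; the real v1 schema is Zod-validated and likely richer.
interface WizardPlan {
  v: 1;
  planId: string;
  createdAt: number; // epoch milliseconds
  framework: string;
  sdk: string;
}

const PLAN_TTL_MS = 24 * 60 * 60 * 1000;
const SAFE_ID = /^[A-Za-z0-9_-]+$/; // reject IDs that could escape the plans directory

function isWizardPlan(x: unknown): x is WizardPlan {
  if (x === null || typeof x !== "object") return false;
  const o = x as Record<string, unknown>;
  return o.v === 1 && typeof o.planId === "string" && typeof o.createdAt === "number" &&
    typeof o.framework === "string" && typeof o.sdk === "string";
}

// Write to a temp file, then rename it into place so readers never see a partial file.
function savePlan(dir: string, plan: WizardPlan): void {
  fs.mkdirSync(dir, { recursive: true });
  const tmp = path.join(dir, `.${plan.planId}.${process.pid}.tmp`);
  fs.writeFileSync(tmp, JSON.stringify(plan), { mode: 0o600 });
  fs.renameSync(tmp, path.join(dir, `${plan.planId}.json`));
}

// Returns null for unknown, malformed, or expired plans.
function loadPlan(dir: string, planId: string, now: number = Date.now()): WizardPlan | null {
  if (!SAFE_ID.test(planId)) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(fs.readFileSync(path.join(dir, `${planId}.json`), "utf8"));
  } catch {
    return null;
  }
  if (!isWizardPlan(parsed)) return null;
  return now - parsed.createdAt > PLAN_TTL_MS ? null : parsed;
}

// Demo in a throwaway directory (framework and sdk values are made up).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "wizard-plans-demo-"));
const demoPlan: WizardPlan = { v: 1, planId: randomUUID(), createdAt: Date.now(), framework: "nextjs", sdk: "browser" };
savePlan(demoDir, demoPlan);
console.log(loadPlan(demoDir, demoPlan.planId));
```

An `apply` handler built on this would map a `null` result to the `INVALID_ARGS` exit path described above.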
Smoke tests (manual, in this directory):
$ wizard plan --json → emits plan envelope, exit 0
$ wizard apply --plan-id X --json → exits 13, no writes
$ wizard apply --plan-id X --yes --json → executes
$ wizard verify --json → emits verification_result, exit reflects pass/fail
Tests: +10 in agent-plans.test.ts (1267 total). Suite green.
Part of the wizard sub-agent design — Gap 3 of 4. Stacked on #254.
Note: the `apply` handler currently spawns the existing `--agent --yes`
wizard run with the planId in the env. Wiring the plan into the agent
prompt (so the inner agent reads `WizardPlan` and reports back) is a
follow-up that lands cleanly when #180 (three-phase handoff schemas)
ships.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surface what the inner Claude SDK agent is doing so outer agents can mirror progress, attribute file changes to specific tools, and decide when to abort.

Adds eight new event types to the agent-mode wire format:

  inner_agent_started:   model + phase + planId at SessionStart
  tool_call:             every PreToolUse, with privacy-safe summary
  file_change_planned:   write-tool intent (path + create/modify/delete)
  file_change_applied:   write-tool success, paired with planned
  event_plan_proposed:   confirm_event_plan invocation
  event_plan_confirmed:  decision + source (auto/human/flag)
  verification_started:  pre-check phase boundary
  verification_result:   pass/fail with structured failures

Implementation is intentionally additive: no changes to agent-interface.ts, to avoid conflicts with #243 (PreToolUse migration) and #149 (observability spine). The new `createInnerLifecycleHooks(config)` factory in src/lib/inner-lifecycle.ts returns hook callbacks shaped to merge into the existing `buildHooksConfig` call. Wiring is documented in the module header so a follow-up PR (or whoever lands #243) can compose the hooks in one ~5-line change.

Helpers in agent-events.ts:
- summarizeToolInput(name, input): privacy-safe summary for PreToolUse payloads (file paths for Read/Edit/Write, command head for Bash, etc.)
- classifyWriteOperation(name): Write→create, Edit→modify, others→null
- summarizeForEvent(s, max): truncate-with-ellipsis for any string

AgentUI emit methods are no-ops in TUI/CI mode: the helper checks `getUI() instanceof AgentUI` and short-circuits cleanly.

Tests: +21 in inner-lifecycle.test.ts (1288 total). Suite green.
Part of the wizard sub-agent design, Gap 2 of 4. Stacked on #255.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
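The three helpers listed above might look roughly like this. The function names and the Write/Edit mapping come from the commit message; the truncation semantics (result never longer than `max`), the 120-character cap, and the reading of "command head" as the first two tokens are assumptions, so treat this as a sketch rather than the code in `agent-events.ts`.

```typescript
type WriteOperation = "create" | "modify";

// Write -> create, Edit -> modify, anything else -> null.
function classifyWriteOperation(toolName: string): WriteOperation | null {
  if (toolName === "Write") return "create";
  if (toolName === "Edit") return "modify";
  return null;
}

// Truncate with an ellipsis; assumed semantics: the result never exceeds `max` characters.
function summarizeForEvent(s: string, max: number): string {
  if (s.length <= max) return s;
  if (max <= 0) return "";
  return s.slice(0, max - 1) + "…";
}

const SUMMARY_MAX = 120; // assumed cap, not taken from the PR

// Privacy-safe summary: surface only the path, pattern, or command head, never file contents.
function summarizeToolInput(toolName: string, input: unknown): string {
  const obj = input !== null && typeof input === "object" ? (input as Record<string, unknown>) : {};
  const str = (key: string): string => {
    const value = obj[key];
    return typeof value === "string" ? value : "";
  };
  switch (toolName) {
    case "Read":
    case "Edit":
    case "Write":
      return summarizeForEvent(str("file_path"), SUMMARY_MAX);
    case "Bash":
      // Assumption: "command head" means the first two whitespace-separated tokens.
      return summarizeForEvent(str("command").trim().split(/\s+/).slice(0, 2).join(" "), SUMMARY_MAX);
    case "Grep":
      return summarizeForEvent(str("pattern"), SUMMARY_MAX);
    default:
      return "";
  }
}

console.log(summarizeToolInput("Write", { file_path: "src/index.ts", content: "secret" }));
console.log(summarizeToolInput("Bash", { command: "git push origin main" }));
console.log(summarizeForEvent("a very long string", 8));
```

The point of the summary step is that the `content` of a Write never reaches the NDJSON stream, only its path.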
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Non-Error throws produce `[null]` in failures array
  - Changed `(e as Error).message` to `String((e as Error).message ?? e)` so non-Error thrown values produce a human-readable string instead of undefined/null.
Or push these changes by commenting:
@cursor push 19ca88cf89
Preview (19ca88cf89)
diff --git a/src/lib/inner-lifecycle.ts b/src/lib/inner-lifecycle.ts
--- a/src/lib/inner-lifecycle.ts
+++ b/src/lib/inner-lifecycle.ts
@@ -114,14 +114,14 @@
typeof input.tool_name === 'string'
? input.tool_name
: typeof input.toolName === 'string'
- ? input.toolName
- : 'unknown';
+ ? input.toolName
+ : 'unknown';
const toolInput =
typeof input.tool_input !== 'undefined'
? input.tool_input
: typeof input.toolInput !== 'undefined'
- ? input.toolInput
- : null;
+ ? input.toolInput
+ : null;
const summary = summarizeToolInput(toolName, toolInput);
ui.emitToolCall({ tool: toolName, summary });
@@ -137,8 +137,8 @@
typeof obj.file_path === 'string'
? obj.file_path
: typeof obj.path === 'string'
- ? obj.path
- : null;
+ ? obj.path
+ : null;
if (path) {
ui.emitFileChangePlanned({ path, operation });
}
@@ -153,16 +153,16 @@
typeof input.tool_name === 'string'
? input.tool_name
: typeof input.toolName === 'string'
- ? input.toolName
- : 'unknown';
+ ? input.toolName
+ : 'unknown';
const operation = classifyWriteOperation(toolName);
if (!operation) return Promise.resolve({});
const toolInput =
typeof input.tool_input !== 'undefined'
? input.tool_input
: typeof input.toolInput !== 'undefined'
- ? input.toolInput
- : null;
+ ? input.toolInput
+ : null;
const obj =
toolInput && typeof toolInput === 'object'
? (toolInput as Record<string, unknown>)
@@ -171,8 +171,8 @@
typeof obj.file_path === 'string'
? obj.file_path
: typeof obj.path === 'string'
- ? obj.path
- : null;
+ ? obj.path
+ : null;
if (path) {
const content = typeof obj.content === 'string' ? obj.content : null;
ui.emitFileChangeApplied({
@@ -210,7 +210,7 @@
ui.emitVerificationResult({
phase,
success: false,
- failures: [(e as Error).message],
+ failures: [String((e as Error).message ?? e)],
});
}
throw e;
Applied via @cursor push command
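For context on this fix, here is a small sketch of a `withVerification`-style wrapper that normalizes thrown values before putting them in `failures`. The `Emit` callback signature and the `describeThrown` helper are hypothetical, and the sketch uses an `instanceof Error` check rather than the `String((e as Error).message ?? e)` expression from the diff. The `instanceof` variant also covers `throw null`, where reading `.message` off the thrown value would itself throw.

```typescript
// Hypothetical emitter signature; the PR routes these through AgentUI emit methods instead.
type Emit = (type: "verification_started" | "verification_result", data: Record<string, unknown>) => void;

// Turn any thrown value into a human-readable string.
function describeThrown(e: unknown): string {
  return e instanceof Error ? e.message : String(e);
}

// Emit started/result around an async check, rethrowing so callers still see the failure.
async function withVerification<T>(phase: string, fn: () => Promise<T>, emit: Emit): Promise<T> {
  emit("verification_started", { phase });
  try {
    const result = await fn();
    emit("verification_result", { phase, success: true });
    return result;
  } catch (e) {
    emit("verification_result", { phase, success: false, failures: [describeThrown(e)] });
    throw e;
  }
}

// Demo: a check that throws a plain string instead of an Error.
withVerification(
  "sdk_installed",
  async () => {
    throw "not installed";
  },
  (type, data) => console.log(JSON.stringify({ type, data })),
).catch(() => {
  /* rethrown for the caller; ignored in this demo */
});
```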
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Falsy check on content drops bytes for empty writes
  - Changed `content &&` to `content !== null &&` so that empty string content correctly emits `bytes: 0` instead of being silently omitted.
Or push these changes by commenting:
@cursor push 826ae92626
Preview (826ae92626)
diff --git a/src/lib/inner-lifecycle.ts b/src/lib/inner-lifecycle.ts
--- a/src/lib/inner-lifecycle.ts
+++ b/src/lib/inner-lifecycle.ts
@@ -178,7 +178,7 @@
ui.emitFileChangeApplied({
path,
operation,
- ...(content && { bytes: Buffer.byteLength(content, 'utf8') }),
+ ...(content !== null && { bytes: Buffer.byteLength(content, 'utf8') }),
});
}
return Promise.resolve({});
Reviewed by Cursor Bugbot for commit cbeeb24.
  ui.emitFileChangeApplied({
    path,
    operation,
    ...(content && { bytes: Buffer.byteLength(content, 'utf8') }),
Falsy check on content drops bytes for empty writes
Low Severity
The `content && { bytes: Buffer.byteLength(content, 'utf8') }` spread uses a truthiness check. An empty string `''` is falsy in JavaScript, so when a Write tool creates an empty file, the `&&` short-circuits to `''` and spreading that is a no-op. This means `bytes: 0` is silently omitted instead of being reported, creating an inconsistency where non-empty writes include `bytes` but empty writes don't. Using `content !== null` instead of `content` would correctly emit `bytes: 0`.
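The difference between the two checks can be reproduced in isolation. This is a standalone sketch: the `FileChangeApplied` interface and both function names are made up for the comparison, and only the two spread expressions mirror the diff.

```typescript
interface FileChangeApplied {
  path: string;
  operation: string;
  bytes?: number;
}

// Original form: '' is falsy, so the spread receives '' and adds nothing.
function payloadWithTruthyCheck(path: string, content: string | null): FileChangeApplied {
  return {
    path,
    operation: "create",
    ...(content && { bytes: Buffer.byteLength(content, "utf8") }),
  };
}

// Fixed form: only a missing (null) content omits `bytes`.
function payloadWithNullCheck(path: string, content: string | null): FileChangeApplied {
  return {
    path,
    operation: "create",
    ...(content !== null && { bytes: Buffer.byteLength(content, "utf8") }),
  };
}

console.log(payloadWithTruthyCheck("empty.txt", "")); // no `bytes` key
console.log(payloadWithNullCheck("empty.txt", ""));   // includes bytes: 0
console.log(payloadWithNullCheck("unknown.txt", null)); // no `bytes` key
```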
62932a3 to db720a0



Summary
Surface what the inner Claude SDK agent is doing so outer agents (Claude Code, Cursor, Codex) can mirror progress, attribute file changes to specific tools, and decide when to abort. Adds eight new event types to the agent-mode wire format and a hook factory that's ready to wire in.
This is Gap 2 of 4 in the wizard sub-agent contract. Stacked on #255.

New events
- `inner_agent_started`: model + phase + planId at SessionStart
- `tool_call`: every PreToolUse, with privacy-safe summary
- `file_change_planned`: write-tool intent (path + create/modify/delete)
- `file_change_applied`: write-tool success, paired with planned
- `event_plan_proposed`: `confirm_event_plan` invocation
- `event_plan_confirmed`: decision + source (auto/human/flag)
- `verification_started`: pre-check phase boundary
- `verification_result`: pass/fail with structured failures

Why additive only
Implementation is intentionally additive: no changes to `agent-interface.ts`, to avoid conflicts with:
- #243 (PreToolUse migration)
- #149 (observability spine)
- the `report_status` MCP tool work (may overlap with `tool_call`)

The new `createInnerLifecycleHooks(config)` factory in `src/lib/inner-lifecycle.ts` returns hook callbacks shaped to merge into the existing `buildHooksConfig` call. Wiring is documented in the module header so a follow-up PR (or whoever lands #243) can compose the hooks in one ~5-line change.

What's in
- `src/lib/agent-events.ts`: 8 new typed event interfaces, plus three helpers:
  - `summarizeToolInput(name, input)`: privacy-safe summary for PreToolUse payloads (file paths for Read/Edit/Write, command head for Bash, pattern for Grep, etc.)
  - `classifyWriteOperation(name)`: Write→create, Edit→modify, others→null
  - `summarizeForEvent(s, max)`: truncate-with-ellipsis for any string
- `src/ui/agent-ui.ts`: 8 new `emit*` methods, all thin wrappers over the existing `emit()` infrastructure. They reuse the same envelope (`v: 1`, `@timestamp`, `session_id`, `run_id`, …) so outer agents need only one parser.
- `src/lib/inner-lifecycle.ts`: `createInnerLifecycleHooks(config)` factory. Returns:
  - `hooks()` → `{ SessionStart, PreToolUse, PostToolUse }` ready to merge into `buildHooksConfig`
  - `emitEventPlanProposed(events)` for the wizard-tools MCP server to call from `confirm_event_plan`
  - `emitEventPlanConfirmed(source, decision)` for the same
  - `withVerification(phase, fn)` wrapper that emits started/result around any async check
- Emit methods are no-ops in TUI/CI mode: the helper checks `getUI() instanceof AgentUI` and short-circuits cleanly so the same hook config can be used in any mode.

Test plan
- `pnpm test`: 1288 passed, 17 skipped (21 new tests in inner-lifecycle.test.ts)
- `pnpm tsc --noEmit` clean
- `pnpm lint` clean

Out of scope (follow-up)
- Wire `createInnerLifecycleHooks` into `agent-interface.ts:buildHooksConfig`: a small, single-PR change once #243 (feat(agent): migrate tool allowlist from canUseTool to PreToolUse hook) lands, so the PreToolUse hook composition is clean.
- Wire `emitEventPlanProposed` / `emitEventPlanConfirmed` into the `confirm_event_plan` MCP tool in `wizard-tools.ts`: also a small follow-up, gated on whichever Bet 2 PR touches `wizard-tools.ts` next.
- `report_status` (#172, feat: structured status via report_status MCP tool (Bet 2 slice 2)) → NDJSON forwarding. The `tool_call` event covers most of what `report_status` does for tools; the cross-over is a small refactor once both are in.

cc @amplitude/growth
🤖 Generated with Claude Code
Note
Medium Risk
Medium risk because it expands the agent-mode NDJSON wire schema and adds new emitters/hooks that affect stdout output consumed by external orchestrators. Behavior is otherwise additive and gated to `AgentUI` (agent mode) only.
Overview
Adds inner-agent lifecycle observability to agent-mode output by introducing 8 new NDJSON event payloads in `agent-events.ts` (start, tool calls, planned/applied file changes, event-plan proposed/confirmed, and verification started/result), plus helpers to privacy-safely summarize tool inputs and classify write operations.
Introduces `createInnerLifecycleHooks` (`inner-lifecycle.ts`) to forward Claude SDK `SessionStart`/`PreToolUse`/`PostToolUse` signals into those NDJSON events (including byte counts for applied writes), and extends `AgentUI` with corresponding `emit*` methods.
Adds Vitest coverage validating the summarizers and that the hook factory emits the expected NDJSON lines when `AgentUI` is active.
Reviewed by Cursor Bugbot for commit cbeeb24. Bugbot is set up for automated code reviews on this repo.
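The forwarding described in the overview can be pictured with a simplified hook factory. This is a sketch under stated assumptions: the hook input and callback types are stand-ins (not the Claude SDK's real hook types), `createLifecycleHooksSketch` and the `emitLine` callback are hypothetical, and the envelope carries only `v`, `type`, and `data`. The dual `tool_name` / `toolName` lookup and the planned/applied pairing follow the diff previews earlier in the thread.

```typescript
// Stand-in types for illustration only.
type HookInput = Record<string, unknown>;
type HookCallback = (input: HookInput) => Promise<Record<string, unknown>>;
type EmitLine = (line: string) => void;

function toolNameOf(input: HookInput): string {
  if (typeof input.tool_name === "string") return input.tool_name;
  if (typeof input.toolName === "string") return input.toolName;
  return "unknown";
}

function toolInputOf(input: HookInput): Record<string, unknown> {
  const raw = typeof input.tool_input !== "undefined" ? input.tool_input : input.toolInput;
  return raw !== null && typeof raw === "object" ? (raw as Record<string, unknown>) : {};
}

function filePathOf(obj: Record<string, unknown>): string | null {
  if (typeof obj.file_path === "string") return obj.file_path;
  if (typeof obj.path === "string") return obj.path;
  return null;
}

function operationOf(toolName: string): "create" | "modify" | null {
  return toolName === "Write" ? "create" : toolName === "Edit" ? "modify" : null;
}

// Hypothetical factory: each hook turns an SDK signal into one or more NDJSON lines.
function createLifecycleHooksSketch(emitLine: EmitLine): { PreToolUse: HookCallback; PostToolUse: HookCallback } {
  const emit = (type: string, data: Record<string, unknown>): void =>
    emitLine(JSON.stringify({ v: 1, type, data }));
  return {
    PreToolUse: (input) => {
      const tool = toolNameOf(input);
      emit("tool_call", { tool });
      const operation = operationOf(tool);
      const path = filePathOf(toolInputOf(input));
      if (operation && path) emit("file_change_planned", { path, operation });
      return Promise.resolve({});
    },
    PostToolUse: (input) => {
      const operation = operationOf(toolNameOf(input));
      if (!operation) return Promise.resolve({});
      const obj = toolInputOf(input);
      const path = filePathOf(obj);
      if (path) {
        const content = typeof obj.content === "string" ? obj.content : null;
        emit("file_change_applied", {
          path,
          operation,
          ...(content !== null && { bytes: Buffer.byteLength(content, "utf8") }),
        });
      }
      return Promise.resolve({});
    },
  };
}

// Demo: print the lines an outer agent would read from stdout.
const demoHooks = createLifecycleHooksSketch((line) => console.log(line));
const demoInput = { tool_name: "Write", tool_input: { file_path: "src/analytics.ts", content: "export {};" } };
void demoHooks.PreToolUse(demoInput);
void demoHooks.PostToolUse(demoInput);
```

Because both callbacks resolve to an empty object, they only observe tool use and leave allow/deny decisions to whatever other hooks are composed alongside them.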