feat: UserPromptSubmit hydrates recovery note after compaction (Bet 2 slice 4)#174

Closed
kelsonpw wants to merge 4 commits into kelsonpw/agent-loop-precompact-recovered from kelsonpw/agent-loop-hydrate-recovered

Conversation

Collaborator

@kelsonpw kelsonpw commented Apr 21, 2026

Tracker: #143
Bet 2: Agent loop overhaul — prompt caching, three-phase Planner → Integrator → Instrumenter pipeline, structured status, real hooks, eval harness
Kill criterion: cache hit rate <40% after 2 weeks → revert

What changes for users

When Claude compacts its context mid-run, the wizard feeds it back a short recovery note — what files have been edited, last status, compaction count — so the agent re-orients and keeps going. Without this, a mid-run compaction can leave the agent re-editing files it already modified or skipping steps it already completed. The user sees the same wizard; the agent just remembers what it already did.

Scope of this slice

  • loadSnapshot(path), consumeSnapshot(path), buildRecoveryNote(snap) helpers in src/lib/agent-state.ts. loadSnapshot is zod-validated so a malformed file returns null cleanly; consumeSnapshot deletes the file so hydration fires at most once per compaction cycle.
  • New createUserPromptSubmitHook(state) factory in src/lib/agent-interface.ts that injects the recovery note via hookSpecificOutput.additionalContext.
  • Wired into buildHooksConfig so every UserPromptSubmit gets a chance to hydrate.
  • Stops resetting AgentState between retries so a mid-retry compaction still sees the files modified in the prior attempt.
  • +11 hydrate tests (agent-state-hydrate.test.ts) covering the full load → consume → build → hook pipeline.
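
A minimal sketch of the three helpers listed above, assuming a snapshot with four fields. The field names, the note wording, and the hand-written type guard are illustrative assumptions: the PR validates with zod, and the guard here only stands in for that schema so the example has no dependencies.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical snapshot shape; the real schema lives in src/lib/agent-state.ts.
interface Snapshot {
  schema: string;
  compactionCount: number;
  modifiedFiles: string[];
  lastStatus: string | null;
}

const SCHEMA = "amplitude-wizard-agent-state/1";

// Hand-written guard standing in for the zod schema used in the PR.
function isSnapshot(value: unknown): value is Snapshot {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const files = v["modifiedFiles"];
  const status = v["lastStatus"];
  return (
    v["schema"] === SCHEMA &&
    typeof v["compactionCount"] === "number" &&
    Array.isArray(files) &&
    files.every((f) => typeof f === "string") &&
    (status === null || typeof status === "string")
  );
}

// Returns null for a missing file, invalid JSON, or a shape mismatch.
function loadSnapshot(file: string): Snapshot | null {
  try {
    const parsed: unknown = JSON.parse(fs.readFileSync(file, "utf8"));
    return isSnapshot(parsed) ? parsed : null;
  } catch {
    return null;
  }
}

// Loads the snapshot, then deletes the file so it is read at most once.
function consumeSnapshot(file: string): Snapshot | null {
  const snap = loadSnapshot(file);
  fs.rmSync(file, { force: true }); // force: ignore an already-missing file
  return snap;
}

// Builds a short note; sections with nothing to report are left out.
function buildRecoveryNote(snap: Snapshot): string {
  const lines = [`Context has been compacted ${snap.compactionCount} time(s).`];
  if (snap.modifiedFiles.length > 0) {
    lines.push(`Files already modified: ${snap.modifiedFiles.join(", ")}`);
  }
  if (snap.lastStatus !== null) {
    lines.push(`Last status: ${snap.lastStatus}`);
  }
  return lines.join("\n");
}
```

In this sketch `consumeSnapshot` deletes the file even when validation fails, so a corrupt snapshot cannot be retried on every prompt; whether the PR does the same is not stated here.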

How this advances Bet 2

Closes the round-trip started in slice 3 (precompact, #126). PreCompact persists state to disk; UserPromptSubmit consumes it on the next turn. The recovery note is intentionally short (compaction count, modified files, last status) so it doesn't blow the context budget we just reclaimed.
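
The round-trip described above can be sketched as a write on one side and a one-shot read on the other. Everything named here is an assumption for illustration: `persistNote` stands in for the PreCompact side, `createHydrationHook` is a simplified synchronous stand-in for `createUserPromptSubmitHook(state)`, and `HookOutput` is declared locally rather than imported from the SDK.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Locally declared result shape for a hook that adds context to the prompt.
interface HookOutput {
  hookSpecificOutput?: {
    hookEventName: "UserPromptSubmit";
    additionalContext: string;
  };
}

// Hypothetical stand-in for the PreCompact side: write the note to disk.
function persistNote(file: string, note: string): void {
  fs.writeFileSync(file, JSON.stringify({ note }), "utf8");
}

// Hypothetical factory: the returned hook injects the note once, then no-ops.
function createHydrationHook(file: string): () => HookOutput {
  return () => {
    let raw: string;
    try {
      raw = fs.readFileSync(file, "utf8");
    } catch {
      return {}; // no snapshot on disk: nothing to hydrate
    }
    fs.rmSync(file, { force: true }); // consume so the next prompt is a no-op
    try {
      const parsed = JSON.parse(raw) as { note?: unknown };
      if (typeof parsed.note !== "string") return {};
      return {
        hookSpecificOutput: {
          hookEventName: "UserPromptSubmit",
          additionalContext: parsed.note,
        },
      };
    } catch {
      return {}; // malformed file: stay silent rather than break the turn
    }
  };
}
```

Returning an empty object on every failure path keeps the hook from interfering with a normal turn, which matches the no-op-when-absent behavior the tests describe.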


Scope note — recovered chain

The PreCompact half of this round-trip still depends on Bet 1 ToolCallCounters landing first (PR #149). Until that merges, snapshots won't be persisted in production — but UserPromptSubmit hydration is wired and tested, ready to light up the moment PreCompact writes are turned on.

Tests

+11 new (agent-state-hydrate.test.ts):

  • loadSnapshot: null on missing file, round-trip, null on invalid JSON, null on shape mismatch
  • consumeSnapshot: deletes the file
  • buildRecoveryNote: with and without files and status
  • createUserPromptSubmitHook: no-op when the snapshot is absent, injects additionalContext when present, hydrates only once per compaction cycle

1111 total passing (was 1100 on slice 3 recovered).

Test plan

  • pnpm test green
  • pnpm build smoke passes
  • pnpm try locally → force a compaction mid-run, verify agent behavior stays consistent

Deferred to later Bet 2 slices

  • PreCompact + PostToolUse hook wiring (needs Bet 1 ToolCallCounters)
  • Three-phase pipeline (Planner → Integrator → Instrumenter)

Recreated from #127 onto the recovered Bet 2 chain after the 2026-04-20 history reset.

cc @amplitude/growth


Note

Medium Risk
Adds new hook-time prompt augmentation based on persisted snapshots and changes retry behavior: AgentState is no longer reset between retries, so state from a prior attempt carries into later ones. This could affect run consistency across stalls and retries.

Overview
After a context compaction, the wizard now rehydrates the next user prompt with a short recovery note (compaction count, files already modified, last status) by loading a persisted snapshot, validating it, and consuming (deleting) it so it only applies once.

This introduces zod-validated snapshot helpers in agent-state (loadSnapshot, consumeSnapshot, buildRecoveryNote), wires a new createUserPromptSubmitHook into the agent SDK hooks, and adds a dedicated test suite covering snapshot read/consume, recovery note formatting, and one-shot hydration behavior.

Reviewed by Cursor Bugbot for commit 2af00ba.

kelsonpw and others added 4 commits April 20, 2026 23:19
Enable prompt caching via excludeDynamicSections on the systemPrompt
block passed to query(). Strips per-run / per-machine sections (date,
cwd, git status) from the Claude Code preset so the static prefix is
byte-identical across turns — the Claude Agent SDK then attaches
cache_control internally.

Verified upstream that per-run values (projectApiKey, projectId,
framework version) already live in the user-message prompt built by
buildIntegrationPrompt, not in the system prefix, so the prefix is
cacheable as-is with no refactor.

Measurement already lives in the Bet 1 observability spine: the
`wizard cli: agent completed` event now includes `cache read input
tokens`, `cache creation 5m/1h tokens`, and `cache hit rate`.
Success threshold per the Bet 2 brief: ≥50% cache hit rate on run 2+.
Kill criterion: <40% after two weeks → revert.
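
For reference, the two thresholds above can be expressed as a small check. The hit-rate formula (cache-read tokens over all input tokens) and the field names are assumptions; the commit does not say how the `cache hit rate` field is computed, and the "watch" band between the two thresholds is a label invented for this sketch.

```typescript
// Hypothetical per-run token counts, named after the fields in the event above.
interface CacheUsage {
  inputTokens: number; // uncached input tokens
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
}

type Verdict = "success" | "watch" | "kill";

// Assumed definition: share of all input tokens that were served from cache.
function cacheHitRate(usage: CacheUsage): number {
  const total =
    usage.inputTokens + usage.cacheReadInputTokens + usage.cacheCreationInputTokens;
  return total === 0 ? 0 : usage.cacheReadInputTokens / total;
}

// Applies the thresholds: at least 50% is success, under 40% meets the kill criterion.
function classify(rate: number): Verdict {
  if (rate >= 0.5) return "success";
  if (rate < 0.4) return "kill";
  return "watch";
}
```

The time dimension (run 2+, two weeks) is left out on purpose; this only classifies a single measured rate.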

Defers to follow-up slices: three-phase Planner → Integrator →
Instrumenter pipeline, structured status via report_status MCP tool
(#125), real PreCompact / PostToolUse / UserPromptSubmit hooks,
eval harness.

Recreated from closed #123 against flattened open-source main. The
original was auto-closed during the 2026-04-20 history reset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds report_status(kind, code, detail) in-process MCP tool replacing the
[STATUS] / [ERROR-MCP-MISSING] / [ERROR-RESOURCE-MISSING] text-marker regex
scanner. Gives the --agent NDJSON surface and outro screen a typed source
of truth instead of scraping plaintext.

- New report_status MCP tool in src/lib/wizard-tools.ts with Zod validation
  and 5-calls/second-per-(kind,code) rate limit
- StatusReporter interface + _activeStatusReporter slot wired per-run
- Deleted legacy text-marker scanner from src/lib/agent-interface.ts
- New commandment instructing agent to use the tool
- +6 tests in report-status.test.ts; removed 2 deprecated regression tests
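
One way to get a 5-calls/second-per-(kind,code) limit is a sliding window keyed on the pair. This is a sketch of that general technique, not the code in src/lib/wizard-tools.ts; the class name and the injectable `now` parameter are assumptions made to keep it testable.

```typescript
// Sliding-window limiter: each (kind, code) pair gets its own call history.
class StatusRateLimiter {
  private readonly calls = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 5, windowMs: number = 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the pair is under its limit.
  allow(kind: string, code: string, now: number = Date.now()): boolean {
    const key = JSON.stringify([kind, code]); // unambiguous composite key
    const history = this.calls.get(key) ?? [];
    // Keep only timestamps that still fall inside the window.
    const recent = history.filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.calls.set(key, recent);
    return allowed;
  }
}
```

Keying on the pair rather than on the tool as a whole means a burst of one status code cannot starve a different one.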

Bet 2 slice 2. Recreated from closed #125 onto kelsonpw/agent-loop-caching-recovered
after the 2026-04-20 history reset.

Adds per-attempt AgentState bag that tracks modified files, last structured
status, and compaction count. Serializes to a deterministic tmpdir path so a
future post-compaction UserPromptSubmit hook can hydrate the agent back with
the context that compaction dropped.

- New src/lib/agent-state.ts — AgentState class with setAttemptId,
  recordModifiedFile, recordStatus, recordCompaction, snapshot, persist,
  snapshotPath, reset. Schema versioned as amplitude-wizard-agent-state/1.
- Instantiated per runAgent call; reset between retry attempts.
- Wired into StatusReporter.onStatus so the persisted snapshot always carries
  the most recent status message.
- +11 tests (agent-state.test.ts) covering dedup/sort, status tracking,
  compaction count, JSON schema, tmpdir path, reset semantics.
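
A rough sketch of a state bag with the methods named above. Only the method names and the schema string come from the commit message; the field names, the snapshot shape, and the tmpdir filename pattern are assumptions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical serialized shape; only the schema string comes from the commit.
interface StateSnapshot {
  schema: string;
  attemptId: string;
  compactionCount: number;
  modifiedFiles: string[];
  lastStatus: string | null;
}

class AgentStateSketch {
  private attemptId = "unknown";
  private files = new Set<string>(); // a Set deduplicates repeated edits
  private lastStatus: string | null = null;
  private compactionCount = 0;

  setAttemptId(id: string): void { this.attemptId = id; }
  recordModifiedFile(file: string): void { this.files.add(file); }
  recordStatus(message: string): void { this.lastStatus = message; }
  recordCompaction(): void { this.compactionCount += 1; }

  // Sorted file list so the same state always serializes identically.
  snapshot(): StateSnapshot {
    return {
      schema: "amplitude-wizard-agent-state/1",
      attemptId: this.attemptId,
      compactionCount: this.compactionCount,
      modifiedFiles: Array.from(this.files).sort(),
      lastStatus: this.lastStatus,
    };
  }

  // Deterministic path: the same attempt id always maps to the same file.
  snapshotPath(): string {
    return path.join(os.tmpdir(), `agent-state-sketch-${this.attemptId}.json`);
  }

  // Writes the snapshot and returns the path it was written to.
  persist(): string {
    const target = this.snapshotPath();
    fs.writeFileSync(target, JSON.stringify(this.snapshot(), null, 2), "utf8");
    return target;
  }

  // Clears recorded activity but keeps the attempt id.
  reset(): void {
    this.files.clear();
    this.lastStatus = null;
    this.compactionCount = 0;
  }
}
```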

Bet 2 slice 3. Recreated from #126 onto kelsonpw/agent-loop-status-recovered
after the 2026-04-20 history reset. Hook-factory wiring for PreCompact and
PostToolUse depends on Bet 1 ToolCallCounters landing first (PR #149); this
slice ships the persistence primitive so later slices can wire it up with
zero merge friction.

When Claude compacts its context mid-run, the wizard feeds it back a short
recovery note — what files have been edited, last status, compaction count —
so the agent re-orients and continues the run instead of re-writing files
or skipping steps.

- src/lib/agent-state.ts gains loadSnapshot, consumeSnapshot, buildRecoveryNote
  helpers. loadSnapshot uses zod validation so a malformed file returns null
  cleanly (no uncaught JSON errors). consumeSnapshot deletes the file so
  hydration fires at most once per compaction.
- New createUserPromptSubmitHook(state) factory in agent-interface.ts that
  injects the recovery note via hookSpecificOutput.additionalContext.
- Wired into buildHooksConfig so every UserPromptSubmit gets a chance to
  hydrate.
- Stops resetting AgentState between retries so a mid-retry compaction still
  sees the files modified in the prior attempt.
- +11 hydrate tests covering the full load/consume/build/hook pipeline.

Bet 2 slice 4. Recreated from #127 onto kelsonpw/agent-loop-precompact-recovered
after the 2026-04-20 history reset.
@kelsonpw kelsonpw requested a review from a team as a code owner April 21, 2026 06:43
@kelsonpw kelsonpw added the futureproof bet Part of the wizard vision bets rolled up in issue #143 label Apr 21, 2026
@github-actions
Contributor

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

  • /wizard-ci all

Test all apps in a directory:

  • /wizard-ci django
  • /wizard-ci fastapi
  • /wizard-ci flask
  • /wizard-ci javascript-node
  • /wizard-ci javascript-web
  • /wizard-ci next-js
  • /wizard-ci python
  • /wizard-ci react-router
  • /wizard-ci vue

Test an individual app:

  • /wizard-ci django/django3-saas
  • /wizard-ci fastapi/fastapi3-ai-saas
  • /wizard-ci flask/flask3-social-media
  • /wizard-ci javascript-node/express-todo
  • /wizard-ci javascript-node/fastify-blog
  • /wizard-ci javascript-node/hono-links
  • /wizard-ci javascript-node/koa-notes
  • /wizard-ci javascript-node/native-http-contacts
  • /wizard-ci javascript-web/saas-dashboard
  • /wizard-ci next-js/15-app-router-saas
  • /wizard-ci next-js/15-app-router-todo
  • /wizard-ci next-js/15-pages-router-saas
  • /wizard-ci next-js/15-pages-router-todo
  • /wizard-ci python/meeting-summarizer
  • /wizard-ci react-router/react-router-v7-project
  • /wizard-ci react-router/rrv7-starter
  • /wizard-ci react-router/saas-template
  • /wizard-ci react-router/shopper
  • /wizard-ci vue/movies

Results will be posted here when complete.
