
feat: PreCompact persists agent state (Bet 2 slice 3) #126

Closed
kelsonpw wants to merge 2 commits into kelsonpw/agent-loop-status from kelsonpw/agent-loop-precompact
Collaborator

@kelsonpw kelsonpw commented Apr 18, 2026

Tracker: #143
Bet 2: Agent loop overhaul — prompt caching, three-phase Planner → Integrator → Instrumenter pipeline, structured status, real hooks, eval harness
Kill criterion: cache hit rate <40% after 2 weeks → revert

What changes for users

Nothing user-visible yet. When the wizard is mid-run and Claude has to compact its conversation to fit more context, the wizard now saves a small snapshot — which files it has edited, its most recent status, how many compactions have happened — so the agent can pick back up without forgetting. The follow-up slice reads that snapshot back; together they stop the wizard from re-writing files or skipping steps after a compaction.

Scope of this slice

  • New src/lib/agent-state.ts: per-attempt AgentState bag tracking modified files (deduped + sorted), last status code/detail, and compaction count.
  • persist() writes a schema-versioned JSON snapshot to tmpdir()/amplitude-wizard-state-<attemptId>.json with 0o600 perms. Schema tag amplitude-wizard-agent-state/1 so future readers can detect format drift.
  • Extended hooks in src/lib/agent-interface.ts: createPreCompactHook(counters, state?) records the compaction on AgentState and calls state.persist(); createPostToolUseHook(counters, state?) pulls file_path from Write / Edit tool inputs and records it on AgentState. Both state args optional so callers without state still compile.
  • StatusReporter in runAgent now also calls agentState.recordStatus so the persisted snapshot always carries the most recent status message.
  • +11 tests in agent-state.test.ts (dedup + sort, last-status-wins, compactionCount, JSON schema + contents, tmpdir path carries attempt id, hooks no-op when state omitted). 1068 total passing.
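
A minimal sketch of the state bag described in the list above, to make the dedup, sort, and last-status-wins behavior concrete. The method names follow the ones listed for the recreated slice; the constructor argument, the `Status` shape (string code and detail), and the write-to-scratch-then-rename approach to persistence are assumptions for illustration, not the code in `src/lib/agent-state.ts`.

```typescript
import { renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumed shape: the PR text confirms only the schema tag and the tracked fields.
type Status = { code: string; detail: string };

interface SerializedAgentState {
  schema: "amplitude-wizard-agent-state/1";
  attemptId: string;
  modifiedFiles: string[];
  lastStatus: Status | null;
  compactionCount: number;
}

class AgentState {
  private readonly attemptId: string;
  private readonly modifiedFiles = new Set<string>();
  private lastStatus: Status | null = null;
  private compactionCount = 0;

  constructor(attemptId: string) {
    this.attemptId = attemptId;
  }

  recordModifiedFile(filePath: string): void {
    this.modifiedFiles.add(filePath); // a Set drops duplicates
  }

  recordStatus(code: string, detail: string): void {
    this.lastStatus = { code, detail }; // the most recent status wins
  }

  recordCompaction(): void {
    this.compactionCount += 1;
  }

  snapshot(): SerializedAgentState {
    return {
      schema: "amplitude-wizard-agent-state/1",
      attemptId: this.attemptId,
      modifiedFiles: Array.from(this.modifiedFiles).sort(),
      lastStatus: this.lastStatus,
      compactionCount: this.compactionCount,
    };
  }

  snapshotPath(): string {
    return join(tmpdir(), `amplitude-wizard-state-${this.attemptId}.json`);
  }

  // Assumed persistence strategy: write a scratch file with owner-only
  // permissions, then rename it over the target so readers never observe
  // a half-written snapshot.
  persist(): string {
    const target = this.snapshotPath();
    const scratch = `${target}.${process.pid}.tmp`;
    writeFileSync(scratch, JSON.stringify(this.snapshot(), null, 2), { mode: 0o600 });
    renameSync(scratch, target);
    return target;
  }
}
```

Note that the `mode` option only applies when the file is newly created, so a leftover scratch file from an earlier crash would keep its old permissions; a production version would probably want to handle that case explicitly.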

How this advances Bet 2

Bet 2's "real hooks" deliverable turns the previously-declared-but-unwired hooks into actual behavior. This slice lands the persistence half of the PreCompact → UserPromptSubmit round-trip; the on-disk format is stable and versioned so slice 4 can consume it without coordination. Once hydration lands, the three-phase pipeline inherits a grounded recovery point across compactions.


Deferred to next slice

Post-compaction restoration — a UserPromptSubmit hook that reads the snapshot and prepends a recovery block to the first user message after a compaction fires. This PR lands the persistence half of the round-trip; hydration lands next. The on-disk format is stable and versioned so the restoration PR can consume it without coordination.
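
As a rough illustration of what the deferred restoration step might involve (nothing below is part of this PR), a reader could check the schema tag before trusting the file and then format a short recovery block. The function names `parseSnapshot` and `buildRecoveryBlock`, the `RecoverableState` shape, and the wording of the block are all hypothetical.

```typescript
const SNAPSHOT_SCHEMA = "amplitude-wizard-agent-state/1";

// Assumed subset of the snapshot that a recovery block would need.
interface RecoverableState {
  modifiedFiles: string[];
  lastStatus: { code: string; detail: string } | null;
  compactionCount: number;
}

// Returns null for anything that is not a well-formed snapshot with the
// expected schema tag, so format drift is detected instead of misread.
function parseSnapshot(raw: string): RecoverableState | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  if (record["schema"] !== SNAPSHOT_SCHEMA) return null;

  const files = record["modifiedFiles"];
  if (!Array.isArray(files) || !files.every((f) => typeof f === "string")) return null;

  const count = record["compactionCount"];
  if (typeof count !== "number") return null;

  const status = record["lastStatus"] as { code?: unknown; detail?: unknown } | null | undefined;
  const lastStatus =
    status && typeof status.code === "string" && typeof status.detail === "string"
      ? { code: status.code, detail: status.detail }
      : null;

  return { modifiedFiles: files as string[], lastStatus, compactionCount: count };
}

// Formats the text that a UserPromptSubmit hook could prepend to the
// first user message after a compaction.
function buildRecoveryBlock(state: RecoverableState): string {
  const files =
    state.modifiedFiles.length > 0
      ? state.modifiedFiles.map((f) => `- ${f}`).join("\n")
      : "- (none recorded)";
  const status = state.lastStatus
    ? `${state.lastStatus.code}: ${state.lastStatus.detail}`
    : "(none recorded)";
  return [
    `Context was compacted ${state.compactionCount} time(s).`,
    "Files already modified in this attempt:",
    files,
    `Last reported status: ${status}`,
  ].join("\n");
}
```

Rejecting unknown schema tags outright, rather than attempting a best-effort read, is one way to use the version tag for the format-drift detection mentioned in the scope list.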

Also out of scope: a typecheck/lint gate on PostToolUse (the brief says the Stop hook blocks on errors). Lower priority than hydration.

Tests

+11 new in agent-state.test.ts:

  • dedup + sort of modifiedFiles
  • last-status wins in recordStatus
  • compactionCount increments
  • JSON snapshot schema + contents
  • tmpdir path contains attempt id
  • createPreCompactHook writes to disk
  • createPreCompactHook is a no-op on state when state arg is omitted
  • createPostToolUseHook records Write + Edit but ignores Read
  • createPostToolUseHook counter still increments when state is omitted

1068 total passing (was 1057).
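
The hook behavior these tests exercise can be sketched as follows. This is a simplified, synchronous illustration: the `ToolCallCounters` field names, the `StateSink` interface, and the `ToolUseEvent` shape are assumptions, and the real hook callbacks in `src/lib/agent-interface.ts` may have a different signature.

```typescript
// Hypothetical counter and event shapes, simplified for illustration.
interface ToolCallCounters {
  postToolUse: number;
  compactions: number;
}

// The subset of AgentState the hooks rely on.
interface StateSink {
  recordModifiedFile(filePath: string): void;
  recordCompaction(): void;
  persist(): void;
}

interface ToolUseEvent {
  tool_name: string;
  tool_input?: unknown;
}

function createPostToolUseHook(counters: ToolCallCounters, state?: StateSink) {
  return (event: ToolUseEvent): void => {
    counters.postToolUse += 1; // counted whether or not state is supplied
    if (!state) return;
    // Only file-writing tools contribute to modifiedFiles; Read is ignored.
    if (event.tool_name !== "Write" && event.tool_name !== "Edit") return;
    const input = event.tool_input as { file_path?: unknown } | undefined;
    const filePath = input?.file_path;
    if (typeof filePath === "string" && filePath.length > 0) {
      state.recordModifiedFile(filePath);
    }
  };
}

function createPreCompactHook(counters: ToolCallCounters, state?: StateSink) {
  return (): void => {
    counters.compactions += 1;
    // With no state argument the hook only bumps the counter.
    state?.recordCompaction();
    state?.persist();
  };
}
```

Keeping the `state` parameter optional means a caller that only wants counters can pass one argument and get the previous behavior unchanged.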

Test plan

  • pnpm test green
  • pnpm build smoke passes
  • pnpm try with a large-context framework (e.g. Next.js monorepo) until a compaction fires → verify amplitude-wizard-state-*.json exists under os.tmpdir() (/tmp on Linux) with modifiedFiles + lastStatus populated
  • wizard cli: tool summary still reports compactions: N

cc @amplitude/growth

Generated with Claude Code

Bet 2, Slice 3 (real hooks — PreCompact substance).

Why: compaction drops earlier turns from the LLM's context window. The agent
can forget which files it has edited and where it was in the workflow,
leading to re-writes of the same file or skipped steps. Persisting a small
snapshot to disk just before compaction gives a future UserPromptSubmit
hook (or a human debugging a stuck run) a grounded recovery point.

What changed:

- New `src/lib/agent-state.ts` — per-attempt AgentState bag tracking
  modified files (deduped + sorted), last status code/detail, and
  compaction count. Serializes to
  /tmp/amplitude-wizard-state-<attemptId>.json via atomic sync write
  with 0o600 perms. Schema versioned as amplitude-wizard-agent-state/1.
- `createPreCompactHook(counters, state?)` — extended to record the
  compaction on AgentState and persist the snapshot. State arg is
  optional so existing callers without state still compile.
- `createPostToolUseHook(counters, state?)` — extended to record
  file_path from Write and Edit tool inputs into AgentState.
- StatusReporter in `runAgent` now also calls `agentState.recordStatus`
  so the persisted snapshot always carries the most recent status
  message.

Tests: +11 in `agent-state.test.ts` (dedup, sort, status tracking,
compaction counter, JSON schema validity, tmpdir path, hook wiring for
Write/Edit/Read and counter-only fallback). 1068 total passing (was 1057).

Post-compaction hydration (UserPromptSubmit reads the snapshot and
prepends a recovery block to the user message) is a separate follow-up
slice — not in scope here. This PR lands the persistence half of the
round-trip; restoration lands next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kelsonpw kelsonpw requested a review from a team April 18, 2026 15:34
@github-actions
Contributor

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

  • /wizard-ci all

Test all apps in a directory:

  • /wizard-ci django
  • /wizard-ci fastapi
  • /wizard-ci flask
  • /wizard-ci javascript-node
  • /wizard-ci javascript-web
  • /wizard-ci next-js
  • /wizard-ci python
  • /wizard-ci react-router
  • /wizard-ci vue

Test an individual app:

  • /wizard-ci django/django3-saas
  • /wizard-ci fastapi/fastapi3-ai-saas
  • /wizard-ci flask/flask3-social-media
  • /wizard-ci javascript-node/express-todo
  • /wizard-ci javascript-node/fastify-blog
  • /wizard-ci javascript-node/hono-links
  • /wizard-ci javascript-node/koa-notes
  • /wizard-ci javascript-node/native-http-contacts
  • /wizard-ci javascript-web/saas-dashboard
  • /wizard-ci next-js/15-app-router-saas
  • /wizard-ci next-js/15-app-router-todo
  • /wizard-ci next-js/15-pages-router-saas
  • /wizard-ci next-js/15-pages-router-todo
  • /wizard-ci python/meeting-summarizer
  • /wizard-ci react-router/react-router-v7-project
  • /wizard-ci react-router/rrv7-starter
  • /wizard-ci react-router/saas-template
  • /wizard-ci react-router/shopper
  • /wizard-ci vue/movies

Results will be posted here when complete.

Contributor

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: AgentState not reset between retry attempts
    • Added a reset() method to AgentState that clears modifiedFiles, lastStatus, and compactionCount, and called it alongside the other per-attempt resets in the retry loop.

To push these changes, comment:

@cursor push aeb8f831ea
Preview (aeb8f831ea)
diff --git a/src/lib/agent-interface.ts b/src/lib/agent-interface.ts
--- a/src/lib/agent-interface.ts
+++ b/src/lib/agent-interface.ts
@@ -1386,6 +1386,7 @@
         collectedText.length = 0;
         recentStatuses.length = 0;
         authErrorDetected = false;
+        agentState.reset();
       }
 
       // Fresh prompt stream per attempt — stdin stays open until result received

diff --git a/src/lib/agent-state.ts b/src/lib/agent-state.ts
--- a/src/lib/agent-state.ts
+++ b/src/lib/agent-state.ts
@@ -45,6 +45,13 @@
     this.compactionCount += 1;
   }
 
+  /** Reset all mutable state for a fresh retry attempt. */
+  reset(): void {
+    this.modifiedFiles.clear();
+    this.lastStatus = null;
+    this.compactionCount = 0;
+  }
+
   snapshot(): SerializedAgentState {
     return {
       schema: 'amplitude-wizard-agent-state/1',


Reviewed by Cursor Bugbot for commit 9a4870e.

Comment thread src/lib/agent-interface.ts
@kelsonpw
Collaborator Author

@cursor push aeb8f83

@kaiapeacock-eng
Collaborator

Trivial fix

agentState is constructed once before the retry loop and never reset on retry, despite the "Per-attempt" comment. compactionCount inflates across failed attempts and a stale lastStatus gets stamped under the new attemptId.

 for (let attempt = 0; attempt < maxAttempts; attempt++) {
+  agentState.reset();
   // ...existing reset block

(or drop the pre-loop construction and re-instantiate inside the loop.)

@kelsonpw
Collaborator Author

Superseded by #173 after the Bet 2 chain was re-rooted onto #171 (recovered slice 1). The original #126 sat on the orphan kelsonpw/agent-loop-* branch chain whose root predated the 2026-04-20 history reset — merging it would have silently carried pre-reset code into main. The new #173 is the same slice (3) rebased onto the flattened main, authorship preserved via git commit --author.

@kelsonpw kelsonpw closed this Apr 21, 2026
kelsonpw pushed a commit that referenced this pull request Apr 26, 2026
Adds per-attempt AgentState bag that tracks modified files, last structured
status, and compaction count. Serializes to a deterministic tmpdir path so a
future post-compaction UserPromptSubmit hook can hydrate the agent back with
the context that compaction dropped.

- New src/lib/agent-state.ts — AgentState class with setAttemptId,
  recordModifiedFile, recordStatus, recordCompaction, snapshot, persist,
  snapshotPath, reset. Schema versioned as amplitude-wizard-agent-state/1.
- Instantiated per runAgent call; reset between retry attempts.
- Wired into StatusReporter.onStatus so the persisted snapshot always carries
  the most recent status message.
- +11 tests (agent-state.test.ts) covering dedup/sort, status tracking,
  compaction count, JSON schema, tmpdir path, reset semantics.

Bet 2 slice 3. Recreated from #126 onto kelsonpw/agent-loop-status-recovered
after the 2026-04-20 history reset. Hook-factory wiring for PreCompact and
PostToolUse depends on Bet 1 ToolCallCounters landing first (PR #149); this
slice ships the persistence primitive so later slices can wire it up with
zero merge friction.
kelsonpw added a commit that referenced this pull request Apr 26, 2026

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

Labels

futureproof bet Part of the wizard vision bets rolled up in issue #143


3 participants