fix(hooks): break Stop-hook infinite recursion via agent-sdk fallback#187
fix(hooks): break Stop-hook infinite recursion via agent-sdk fallback#187
Conversation
Reported: a user with no provider API key and AGENTMEMORY_AUTO_COMPRESS=false (which they believed protected them) hit unbounded recursion — Stop hook POSTs /agentmemory/summarize, handler calls provider.summarize(), agent-sdk provider spawns @anthropic-ai/claude-agent-sdk query(), which creates a full CC-style child session that reads ~/.claude/settings.json, registers the same plugin hooks, and fires its own Stop -> another child -> loop. ~579 ghost 'entrypoint: sdk-ts' sessions accumulated in a few minutes, draining Claude Pro tokens. #149 only added a stderr warning. AGENTMEMORY_AUTO_COMPRESS gated /compress but never /summarize, so users who followed the warning's implied guidance still got hit. Fix the loop at every layer: 1. config.ts detectProvider - Treat empty-string provider keys (ANTHROPIC_API_KEY=) as unset; they previously passed the truthiness check identically to a real key. - Stop defaulting to agent-sdk. When no key is set, return a 'noop' provider config and warn. Agent-sdk fallback now requires an explicit AGENTMEMORY_ALLOW_AGENT_SDK=true opt-in with a loud second warning. 2. providers/noop.ts (new) + providers/index.ts - NoopProvider implements MemoryProvider and returns empty strings for compress and summarize so callers can detect .name === 'noop' and short-circuit without spawning anything. - Add ProviderType 'noop' and wire it through createBaseProvider. 3. providers/agent-sdk.ts - Before spawning query(), check process.env.AGENTMEMORY_SDK_CHILD === '1' and return '' instead of recursing. Set the env var to '1' before the spawn so any child process (including the Agent SDK session's hooks) inherits it. 4. hooks/sdk-guard.ts (new) + all 12 hook scripts - Shared isSdkChildContext(payload) checks both AGENTMEMORY_SDK_CHILD=1 and payload.entrypoint === 'sdk-ts' (CC writes this into the stdin jsonl for SDK-spawned sessions). Every hook script now bails early when that returns true, so even if one guard layer fails the others break the loop. 5. functions/summarize.ts - Short-circuit with {success:false, error:'no_provider'} when provider.name === 'noop' — never reach .summarize(). - Treat an empty provider response as empty_provider_response instead of trying to parse it. Tests: 74 files / 819 tests pass (+7 new in stop-hook-recursion-guard.test.ts). Defense in depth means any ONE of the five layers breaks the loop.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR implements a comprehensive recursion guard to prevent infinite loops in the Agent SDK integration. It adds an Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/providers/agent-sdk.ts (1)
14-45:⚠️ Potential issue | 🟠 MajorRestore
AGENTMEMORY_SDK_CHILDafter the SDK query completes.Line 26 sets a process-wide environment variable that is never restored. This causes subsequent calls to
compress()orsummarize()in the same process to hit the bailout check on line 15 and return empty strings. Concurrent calls may also be incorrectly treated as SDK children.🐛 Proposed fix
private async query(systemPrompt: string, userPrompt: string): Promise<string> { if (process.env.AGENTMEMORY_SDK_CHILD === "1") { // We are already running inside a Claude Agent SDK-spawned session. // Spawning another one would let its plugin-hook-driven Stop loop // re-enter /agentmemory/summarize and cause unbounded recursion // (`#149` follow-up). Degrade to empty string so callers short-circuit. return "" } // Mark any child process / SDK session spawned from here as a SDK // child. agentmemory hook scripts check this marker and skip their // REST calls to break the recursion loop. + const previousSdkChild = process.env.AGENTMEMORY_SDK_CHILD process.env.AGENTMEMORY_SDK_CHILD = "1" - const { query } = await import('@anthropic-ai/claude-agent-sdk') + try { + const { query } = await import('@anthropic-ai/claude-agent-sdk') - const messages = query({ - prompt: userPrompt, - options: { - systemPrompt, - maxTurns: 1, - allowedTools: [], - }, - }) + const messages = query({ + prompt: userPrompt, + options: { + systemPrompt, + maxTurns: 1, + allowedTools: [], + }, + }) - let result = '' - for await (const msg of messages) { - if (msg.type === 'result') { - result = (msg as any).result ?? '' + let result = '' + for await (const msg of messages) { + if (msg.type === 'result') { + result = (msg as any).result ?? '' + } } + return result + } finally { + if (previousSdkChild === undefined) { + delete process.env.AGENTMEMORY_SDK_CHILD + } else { + process.env.AGENTMEMORY_SDK_CHILD = previousSdkChild + } } - return result }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/providers/agent-sdk.ts` around lines 14 - 45, The query method sets process.env.AGENTMEMORY_SDK_CHILD = "1" and never restores it, causing later calls to short-circuit; fix by capturing the previous value (e.g., const prev = process.env.AGENTMEMORY_SDK_CHILD), set the env var to "1" before importing/iterating the SDK messages, and ensure you restore the original value in a finally block (restore to prev or delete if prev is undefined) so concurrent or subsequent calls aren’t misclassified as SDK children; keep this change localized to the query function and ensure the restore runs even if the import or message loop throws.
🧹 Nitpick comments (1)
plugin/scripts/notification.mjs (1)
2-21: Bundled artifact — matches the source-side guard.Consider whether
plugin/scripts/*.mjsshould be generated in CI rather than committed; the hashed filenamesdk-guard-DI1NUOS9.mjswill churn on every rebuild and make diffs noisy. Non-blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/scripts/notification.mjs` around lines 2 - 21, The committed bundled artifact plugin/scripts/notification.mjs imports a hashed helper file sdk-guard-DI1NUOS9.mjs which will churn diffs on every rebuild; update the build/release process so these generated files are not committed: either (a) stop committing plugin/scripts/*.mjs and generate them during CI (adjust build scripts and CI job to emit notification.mjs and the sdk-guard file at deploy time), or (b) make the guard import use a stable filename and produce that stable artifact from the build; also add plugin/scripts/*.mjs (or the specific hashed outputs) to .gitignore to prevent future noise and document the CI generation step in the repo README.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config.ts`:
- Around line 89-107: The FALLBACK_PROVIDERS parsing in loadFallbackConfig()
must respect the AGENTMEMORY_ALLOW_AGENT_SDK gate just like detectProvider():
when building the fallback provider list from the FALLBACK_PROVIDERS env var,
filter out the "agent-sdk" entry unless env["AGENTMEMORY_ALLOW_AGENT_SDK"] ===
"true"; update loadFallbackConfig() to read AGENTMEMORY_ALLOW_AGENT_SDK and skip
or remove "agent-sdk" from the parsed providers so users cannot bypass the
safety gate applied in detectProvider().
In `@src/functions/summarize.ts`:
- Around line 87-92: The empty_provider_response branch in summarize.ts
currently returns without recording failure metrics or logs; update the branch
after the provider.summarize call (the spot returning { success: false, error:
"empty_provider_response" }) to record the failure and log context: increment
the same failure/metrics counter used for parse/validation errors (e.g.,
recordFailure or metrics.increment with "empty_provider_response"), and emit a
diagnostic log (using the existing logger used elsewhere) that includes
SUMMARY_SYSTEM, the prompt from buildSummaryPrompt(compressed), and the provider
identity so empty provider responses are visible in telemetry and logs. Ensure
the new metric/logging follows the same pattern and function names used for
other failure paths.
---
Outside diff comments:
In `@src/providers/agent-sdk.ts`:
- Around line 14-45: The query method sets process.env.AGENTMEMORY_SDK_CHILD =
"1" and never restores it, causing later calls to short-circuit; fix by
capturing the previous value (e.g., const prev =
process.env.AGENTMEMORY_SDK_CHILD), set the env var to "1" before
importing/iterating the SDK messages, and ensure you restore the original value
in a finally block (restore to prev or delete if prev is undefined) so
concurrent or subsequent calls aren’t misclassified as SDK children; keep this
change localized to the query function and ensure the restore runs even if the
import or message loop throws.
---
Nitpick comments:
In `@plugin/scripts/notification.mjs`:
- Around line 2-21: The committed bundled artifact
plugin/scripts/notification.mjs imports a hashed helper file
sdk-guard-DI1NUOS9.mjs which will churn diffs on every rebuild; update the
build/release process so these generated files are not committed: either (a)
stop committing plugin/scripts/*.mjs and generate them during CI (adjust build
scripts and CI job to emit notification.mjs and the sdk-guard file at deploy
time), or (b) make the guard import use a stable filename and produce that
stable artifact from the build; also add plugin/scripts/*.mjs (or the specific
hashed outputs) to .gitignore to prevent future noise and document the CI
generation step in the repo README.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 31cd2282-70bf-458d-8154-22bc88c04f5e
📒 Files selected for processing (33)
plugin/scripts/notification.mjsplugin/scripts/post-tool-failure.mjsplugin/scripts/post-tool-use.mjsplugin/scripts/pre-compact.mjsplugin/scripts/pre-tool-use.mjsplugin/scripts/prompt-submit.mjsplugin/scripts/sdk-guard-DI1NUOS9.mjsplugin/scripts/session-end.mjsplugin/scripts/session-start.mjsplugin/scripts/stop.mjsplugin/scripts/subagent-start.mjsplugin/scripts/subagent-stop.mjsplugin/scripts/task-completed.mjssrc/config.tssrc/functions/summarize.tssrc/hooks/notification.tssrc/hooks/post-tool-failure.tssrc/hooks/post-tool-use.tssrc/hooks/pre-compact.tssrc/hooks/pre-tool-use.tssrc/hooks/prompt-submit.tssrc/hooks/sdk-guard.tssrc/hooks/session-end.tssrc/hooks/session-start.tssrc/hooks/stop.tssrc/hooks/subagent-start.tssrc/hooks/subagent-stop.tssrc/hooks/task-completed.tssrc/providers/agent-sdk.tssrc/providers/index.tssrc/providers/noop.tssrc/types.tstest/stop-hook-recursion-guard.test.ts
Findings verified against current code on this branch; all four valid. 1. config.ts loadFallbackConfig (L281) — user could set FALLBACK_PROVIDERS=agent-sdk and bypass the AGENTMEMORY_ALLOW_AGENT_SDK gate added to detectProvider. Filter it out at the fallback layer too, with the same warning pointing at the opt-in flag. 2. summarize.ts (L87-92) — the empty_provider_response branch returned without recording failure metrics or a diagnostic log, unlike the parse/validation paths. Record the same metricsStore failure event and log provider name, prompt size, system size, and observation count so empty responses are visible in telemetry. 3. providers/agent-sdk.ts (L14-45) — setting process.env.AGENTMEMORY_SDK_CHILD = '1' without restoring it caused every subsequent .query() in the same parent process to hit the short-circuit guard and return '' (classified as a SDK child it is not). Capture prev, set in try, restore in finally (delete if prev was undefined). Child processes spawned during the for-await loop still inherit the marker because env is inherited at spawn time; we only restore after the loop completes. 4. plugin/scripts/sdk-guard-DI1NUOS9.mjs — tsdown extracted the shared guard helper into a hashed chunk. Hash rotates on every rebuild and churns the diff. Stopped using the shared module from hooks entirely and inlined the 6-line guard function into each hook .ts file instead. sdk-guard.ts stays in the tree because the unit tests cover it directly. Deleted the tracked hashed .mjs and confirmed no new chunk is emitted. Also applied the CI two-step install (npm install --package-lock-only then npm ci) on this branch, matching #184. Without it, npm ci fails because lockfiles are gitignored. Tests: 74 files / 819 tests pass.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/config.ts (1)
115-119:⚠️ Potential issue | 🟡 MinorInconsistent
maxTokenshandling in agent-sdk fallback.All other branches in
detectProvideruse themaxTokensvariable computed fromenv["MAX_TOKENS"](line 49), but this branch hardcodes4096, silently ignoring a user-suppliedMAX_TOKENS. Align with the rest of the function for consistency.Proposed fix
return { provider: "agent-sdk", model: "claude-sonnet-4-20250514", - maxTokens: 4096, + maxTokens, };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config.ts` around lines 115 - 119, The agent-sdk fallback branch in detectProvider currently hardcodes maxTokens to 4096 and ignores the computed maxTokens variable; update the branch that returns { provider: "agent-sdk", model: "claude-sonnet-4-20250514", maxTokens: 4096 } to use the existing maxTokens variable (the one computed from env["MAX_TOKENS"]) so it aligns with the other branches—locate the detectProvider function and replace the hardcoded 4096 in that return object with the maxTokens identifier.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 20-24: The CI workflow is regenerating package-lock.json which
defeats the committed lockfile; edit the .github/workflows/ci.yml job to remove
the step that runs "npm install --package-lock-only --legacy-peer-deps
--no-audit --no-fund" and instead run "npm ci --legacy-peer-deps --no-audit
--no-fund" (or just "npm ci" if legacy flags are unnecessary) so CI uses the
committed package-lock.json; also update the comment above that step to remove
the claim that lockfiles are gitignored.
In @.github/workflows/publish.yml:
- Around line 28-32: Remove the dynamic lockfile generation in the publish job:
delete the run step that executes "npm install --package-lock-only
--legacy-peer-deps --no-audit --no-fund" and rely on the committed
package-lock.json by keeping a single "npm ci --legacy-peer-deps --no-audit
--no-fund" step; also update the preceding comment that claims lockfiles are
gitignored to reflect that package-lock.json is tracked so the workflow uses the
committed lockfile rather than regenerating one at release time.
---
Outside diff comments:
In `@src/config.ts`:
- Around line 115-119: The agent-sdk fallback branch in detectProvider currently
hardcodes maxTokens to 4096 and ignores the computed maxTokens variable; update
the branch that returns { provider: "agent-sdk", model:
"claude-sonnet-4-20250514", maxTokens: 4096 } to use the existing maxTokens
variable (the one computed from env["MAX_TOKENS"]) so it aligns with the other
branches—locate the detectProvider function and replace the hardcoded 4096 in
that return object with the maxTokens identifier.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f1693628-9ec4-4372-898c-fd26008c3480
📒 Files selected for processing (29)
.github/workflows/ci.yml.github/workflows/publish.ymlplugin/scripts/notification.mjsplugin/scripts/post-tool-failure.mjsplugin/scripts/post-tool-use.mjsplugin/scripts/pre-compact.mjsplugin/scripts/pre-tool-use.mjsplugin/scripts/prompt-submit.mjsplugin/scripts/session-end.mjsplugin/scripts/session-start.mjsplugin/scripts/stop.mjsplugin/scripts/subagent-start.mjsplugin/scripts/subagent-stop.mjsplugin/scripts/task-completed.mjssrc/config.tssrc/functions/summarize.tssrc/hooks/notification.tssrc/hooks/post-tool-failure.tssrc/hooks/post-tool-use.tssrc/hooks/pre-compact.tssrc/hooks/pre-tool-use.tssrc/hooks/prompt-submit.tssrc/hooks/session-end.tssrc/hooks/session-start.tssrc/hooks/stop.tssrc/hooks/subagent-start.tssrc/hooks/subagent-stop.tssrc/hooks/task-completed.tssrc/providers/agent-sdk.ts
✅ Files skipped from review due to trivial changes (5)
- plugin/scripts/session-end.mjs
- src/hooks/stop.ts
- src/hooks/session-end.ts
- src/hooks/post-tool-failure.ts
- src/providers/agent-sdk.ts
🚧 Files skipped from review as they are similar to previous changes (17)
- plugin/scripts/pre-compact.mjs
- plugin/scripts/stop.mjs
- plugin/scripts/notification.mjs
- src/hooks/session-start.ts
- src/hooks/task-completed.ts
- src/hooks/prompt-submit.ts
- src/hooks/post-tool-use.ts
- plugin/scripts/subagent-stop.mjs
- src/hooks/notification.ts
- src/hooks/subagent-start.ts
- plugin/scripts/subagent-start.mjs
- src/hooks/pre-tool-use.ts
- plugin/scripts/post-tool-use.mjs
- src/hooks/subagent-stop.ts
- plugin/scripts/prompt-submit.mjs
- plugin/scripts/session-start.mjs
- src/functions/summarize.ts
| # Two-step install: generate a lockfile in-runner with | ||
| # --package-lock-only, then install from it with `npm ci`. | ||
| # Lockfiles are gitignored at the repo level. | ||
| - run: npm install --package-lock-only --legacy-peer-deps --no-audit --no-fund | ||
| - run: npm ci --legacy-peer-deps --no-audit --no-fund |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Tracked npm lockfiles:"
git ls-files | rg '(^|/)(package-lock\.json|npm-shrinkwrap\.json)$' || true
echo
echo "Ignored root lockfile patterns:"
git check-ignore -v package-lock.json npm-shrinkwrap.json 2>/dev/null || true
echo
echo "Publish workflow install commands:"
rg -n 'npm (install --package-lock-only|ci|install )' .github/workflows/publish.yml
echo
echo "Dependency ranges that can resolve differently at release time:"
python - <<'PY'
import json, os, re
skip = {".git", "node_modules", "dist", "build", "coverage"}
range_re = re.compile(r'(^[\^~*])|(\|\|)|([<>]=?)|([xX])')
for root, dirs, files in os.walk("."):
dirs[:] = [d for d in dirs if d not in skip]
if "package.json" not in files:
continue
path = os.path.join(root, "package.json")
try:
data = json.load(open(path, encoding="utf-8"))
except Exception as exc:
print(f"{path}: failed to parse: {exc}")
continue
for section in ("dependencies", "devDependencies", "optionalDependencies"):
for name, version in (data.get(section) or {}).items():
if isinstance(version, str) and range_re.search(version):
print(f"{path}: {section}.{name}={version}")
PYRepository: rohitg00/agentmemory
Length of output: 1448
🏁 Script executed:
cat -n .github/workflows/publish.yml | head -40Repository: rohitg00/agentmemory
Length of output: 1511
🏁 Script executed:
cat -n .gitignore | grep -i package-lock || echo "No package-lock pattern found in .gitignore"Repository: rohitg00/agentmemory
Length of output: 107
Remove the dynamic lockfile generation in the publish job; use the committed lockfile instead.
The code comment incorrectly states that lockfiles are gitignored—package-lock.json is actually tracked in git. The publish workflow should install from this committed lockfile rather than generating a fresh one at release time. Line 31 (npm install --package-lock-only) regenerates the lockfile, which can resolve differently than the CI build due to version ranges like ^0.39.0 in dependencies. This means published artifacts may differ from tested code without any repository change.
Safer publish install using the committed lockfile
- # Two-step install: generate a lockfile in-runner with
- # --package-lock-only, then install from it with `npm ci`.
- # Lockfiles are gitignored at the repo level.
- - run: npm install --package-lock-only --legacy-peer-deps --no-audit --no-fund
+ # Install from the reviewed lockfile committed to git.
- run: npm ci --legacy-peer-deps --no-audit --no-fund📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Two-step install: generate a lockfile in-runner with | |
| # --package-lock-only, then install from it with `npm ci`. | |
| # Lockfiles are gitignored at the repo level. | |
| - run: npm install --package-lock-only --legacy-peer-deps --no-audit --no-fund | |
| - run: npm ci --legacy-peer-deps --no-audit --no-fund | |
| # Install from the reviewed lockfile committed to git. | |
| - run: npm ci --legacy-peer-deps --no-audit --no-fund |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/publish.yml around lines 28 - 32, Remove the dynamic
lockfile generation in the publish job: delete the run step that executes "npm
install --package-lock-only --legacy-peer-deps --no-audit --no-fund" and rely on
the committed package-lock.json by keeping a single "npm ci --legacy-peer-deps
--no-audit --no-fund" step; also update the preceding comment that claims
lockfiles are gitignored to reflect that package-lock.json is tracked so the
workflow uses the committed lockfile rather than regenerating one at release
time.
The agent-sdk branch hardcoded maxTokens: 4096, ignoring the maxTokens variable computed from env['MAX_TOKENS']. Other branches already use the computed value. Make agent-sdk match.
Pulls in #184 runtime fix, #188 viewer pipeline, lockfile gitignore, CI two-step install, StateScope types, firstPrompt typeof guard, content-addressed lesson/crystal IDs, image-quota fail-closed, and onnxruntime optionalDependencies. Resolves workflow conflicts flagged by CodeRabbit — ci.yml / publish.yml on main are already the correct two-step install pattern. # Conflicts: # .github/workflows/publish.yml
- README provider table: surface the no-op default and call out the opt-in Claude-subscription fallback with AGENTMEMORY_ALLOW_AGENT_SDK (from #187) instead of listing 'Claude subscription' as the default. - README env block: document OPENAI_BASE_URL / OPENAI_EMBEDDING_MODEL (#186) and OPENAI_EMBEDDING_DIMENSIONS (#189), plus MINIMAX_API_KEY. - Hero stat-tests SVG: 654 -> 827 (both dark and light variants) to match current suite size after recursion guard + idempotent lesson/crystal tests + openai dimension tests landed. - website/lib/generated-meta.json regenerated by website/scripts/gen-meta.mjs (v0.9.1, 51 tools, 12 hooks, 119 endpoints, 848 tests).
Rolls up #186 (OPENAI_BASE_URL / OPENAI_EMBEDDING_MODEL), #187 (Stop-hook recursion 5-layer defense + NoopProvider + AGENTMEMORY_ALLOW_AGENT_SDK opt-in), #188 (viewer empty-tabs + import-jsonl synthetic compression + auto-derived lessons/crystals + richer session detail + audit/replay/frontier shape fixes), #189 (OPENAI_EMBEDDING_DIMENSIONS + model-dimensions table), and #190 (README/website docs refresh). Bumps: package.json, plugin/.claude-plugin/plugin.json, src/version.ts, src/types.ts ExportData.version union, src/functions/export-import.ts supportedVersions, test/export-import.test.ts assertion, and packages/mcp/package.json shim (was stuck at 0.9.0).
Summary
Closes the infinite-recursion loop reported against v0.9.1 where an unset / empty ANTHROPIC_API_KEY plus AGENTMEMORY_AUTO_COMPRESS=false still produced hundreds of SDK-spawned ghost sessions that drained Claude Pro tokens.
This is the follow-up to #149. #149 only added a stderr warning; AGENTMEMORY_AUTO_COMPRESS gated /compress but never /summarize, so users who followed the warning's implied guidance still got hit.
Root cause
Stop hook → POST /agentmemory/summarize → summarize.ts calls provider.summarize() unconditionally → agent-sdk provider spawns @anthropic-ai/claude-agent-sdk query() → child CC session reads ~/.claude/settings.json, registers the same plugin hooks, fires its own Stop → another child → runaway loop. Empty-string ANTHROPIC_API_KEY= behaved identically to unset because of a plain truthiness check.
Fix (defense in depth — any ONE layer breaks the loop)
config.ts detectProvider
providers/noop.ts (new)
providers/agent-sdk.ts
hooks/sdk-guard.ts (new) + all 12 hook scripts
functions/summarize.ts
Testing
```
Test Files 74 passed (74)
Tests 819 passed (819)
```
+7 new tests in test/stop-hook-recursion-guard.test.ts covering env marker, payload.entrypoint detection, null/non-object payloads, and NoopProvider behaviour.
Migration notes
Local workaround users found (documented here for completeness)
Users could set a fake ANTHROPIC_API_KEY=sk-ant-fake, which would select the Anthropic provider, fail at the real API with 401, and stop.ts's fetch().catch(() => {}) would swallow it. That workaround keeps working; the noop path is cleaner.
Summary by CodeRabbit
New Features
Bug Fixes