fix: stop burning Claude tokens on every PostToolUse hook (#138) #140
mem::observe was firing mem::compress unconditionally on every PostToolUse hook, which called Claude via the user's ANTHROPIC_API_KEY to compress each raw observation into a structured summary. On an active coding session (50-200 tool calls/hour) this silently burned hundreds of thousands of Claude API tokens within minutes. Reported by @olcor1 on a fresh Docker install: his token allocation was gone in 20 minutes.

Make per-observation LLM compression opt-in.

- New env var AGENTMEMORY_AUTO_COMPRESS, default `false`. When false, observe.ts builds a zero-LLM synthetic CompressedObservation from raw tool I/O (title from toolName, narrative from truncated input/output, files extracted from file_path/pattern/etc., type inferred from camelCase-normalized tool name) and stores + indexes it. Recall and BM25 search still work.
- New src/functions/compress-synthetic.ts with buildSyntheticCompression().
- src/functions/observe.ts gates the compress trigger on the flag and writes the synthetic observation + stream events when disabled.
- Startup banner now logs "Auto-compress: OFF (default, #138)" or a loud warning when opt-in is on, so the mode is never silent.
- Regression test: test/auto-compress.test.ts covers the default path (no mem::compress fired), explicit opt-in, tool-name-to-type mapping, file-path extraction, narrative truncation, and the post_tool_failure→error mapping.

Bumps to 0.8.8 (main package + shim). 707 tests pass.

**Migration**: users who were relying on LLM-generated summaries set AGENTMEMORY_AUTO_COMPRESS=true in ~/.agentmemory/.env and restart. Existing compressed observations on disk are untouched.
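To make the zero-LLM path concrete, here is a minimal sketch of how a synthetic observation could be derived from raw tool I/O. It is not the code in src/functions/compress-synthetic.ts: the type shapes, the `ObservationType` values other than "error" and "other", the keyword lists, the 400-character limit, and every function name below are assumptions made for illustration.

```ts
// Hypothetical shapes; the real RawObservation / CompressedObservation types may differ.
type SketchObservationType = "file_read" | "file_edit" | "search" | "command" | "error" | "other";

interface SketchRawObservation {
  hookType: string;
  toolName?: string;
  toolInput?: Record<string, unknown>;
  toolOutput?: unknown;
}

interface SketchSyntheticObservation {
  type: SketchObservationType;
  title: string;
  narrative: string;
  files: string[];
}

const MAX_NARRATIVE_CHARS = 400; // assumed limit, not taken from the PR

// Split camelCase, snake_case and kebab-case names into lowercase words.
function toWords(name: string): string[] {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((word) => word.toLowerCase());
}

// Keyword lists are illustrative; only the failure-to-error rule is named in the PR.
function inferTypeSketch(name: string, hookType: string): SketchObservationType {
  if (hookType === "post_tool_failure") return "error";
  const words = new Set(toWords(name));
  const hasWord = (word: string): boolean => words.has(word);
  if (["read", "view"].some(hasWord)) return "file_read";
  if (["write", "edit"].some(hasWord)) return "file_edit";
  if (["grep", "glob", "search"].some(hasWord)) return "search";
  if (["bash", "shell"].some(hasWord)) return "command";
  return "other";
}

// The PR lists "file_path/pattern/etc."; only those two keys are read here.
function extractFilesSketch(input: Record<string, unknown> = {}): string[] {
  const files: string[] = [];
  for (const key of ["file_path", "pattern"]) {
    const value = input[key];
    if (typeof value === "string" && value.length > 0) files.push(value);
  }
  return files;
}

function truncateSketch(text: string, max: number): string {
  return text.length <= max ? text : text.slice(0, max) + "...";
}

function stringifySketch(value: unknown): string {
  return typeof value === "string" ? value : JSON.stringify(value ?? null);
}

function buildSyntheticSketch(raw: SketchRawObservation): SketchSyntheticObservation {
  // Resolve the name once so title and type derive from the same value.
  const toolName = raw.toolName ?? raw.hookType;
  const narrative = truncateSketch(
    `input: ${stringifySketch(raw.toolInput)} output: ${stringifySketch(raw.toolOutput)}`,
    MAX_NARRATIVE_CHARS,
  );
  return {
    type: inferTypeSketch(toolName, raw.hookType),
    title: toolName,
    narrative,
    files: extractFilesSketch(raw.toolInput),
  };
}
```

Under these assumptions, a `NotebookEdit` call with `file_path: "a.ipynb"` would be typed `file_edit` with `files: ["a.ipynb"]`, and any `post_tool_failure` hook would be typed `error` regardless of the tool name.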
📝 Walkthrough
Per-observation LLM compression is now opt-in.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Observer
participant Config
participant Synthetic as Synthetic<br/>Compressor
participant LLM as LLM<br/>Compressor
participant KV as KV Store
participant Search as Search Index
participant Stream as Stream Events
Client->>Observer: mem::observe (PostToolUse raw)
Observer->>Config: isAutoCompressEnabled()?
alt AGENTMEMORY_AUTO_COMPRESS = "true"
Config-->>Observer: true
Observer->>LLM: trigger mem::compress(observation)
LLM->>LLM: perform LLM compression
LLM-->>Observer: CompressedObservation
Observer->>KV: persist compressed observation
Observer->>Search: add(compressed)
Observer->>Stream: emit stream::set (compressed)
else default (synthetic)
Config-->>Observer: false
Observer->>Synthetic: buildSyntheticCompression(raw)
Synthetic-->>Observer: synthetic CompressedObservation
Observer->>KV: overwrite observation with synthetic
Observer->>Search: add(synthetic)
Observer->>Stream: emit stream::set (compressed)
end
Observer-->>Client: return observationId
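The branch in the diagram above can be sketched as a small function with injected dependencies. This is an illustration only, not the code in src/functions/observe.ts: the `ObserveDeps` interface and all of its member names are hypothetical, and the opt-in branch is simplified to a fire-and-forget trigger, so the persistence that follows LLM compression in the diagram is omitted.

```ts
// Hypothetical stand-in for a synthetic CompressedObservation.
interface SyntheticLike {
  id: string;
  title: string;
}

// Hypothetical dependency bundle; the real observer talks to the SDK, KV store,
// search index and stream directly.
interface ObserveDeps {
  isAutoCompressEnabled(): boolean;
  triggerCompress(observationId: string, raw: unknown): void;
  buildSynthetic(observationId: string, raw: unknown): SyntheticLike;
  persist(obs: SyntheticLike): void;
  addToIndex(obs: SyntheticLike): void;
  emit(event: string, obs: SyntheticLike): void;
}

function observeSketch(deps: ObserveDeps, observationId: string, raw: unknown): string {
  if (deps.isAutoCompressEnabled()) {
    // Opt-in path: hand the raw observation to the LLM compressor.
    deps.triggerCompress(observationId, raw);
    return observationId;
  }
  // Default path: no LLM call; store, index and announce the synthetic record.
  const synthetic = deps.buildSynthetic(observationId, raw);
  deps.persist(synthetic);
  deps.addToIndex(synthetic);
  deps.emit("stream::set", synthetic);
  return observationId;
}

// Usage: record which side effects fire when the flag is off.
const calls: string[] = [];
const recordingDeps: ObserveDeps = {
  isAutoCompressEnabled: () => false,
  triggerCompress: () => { calls.push("compress"); },
  buildSynthetic: (id) => ({ id, title: "Read" }),
  persist: () => { calls.push("persist"); },
  addToIndex: () => { calls.push("index"); },
  emit: (event) => { calls.push(`emit:${event}`); },
};
observeSketch(recordingDeps, "obs_1", { toolName: "Read" });
console.log(calls.join(", ")); // persist, index, emit:stream::set
```

Injecting the dependencies keeps the gate testable without a live provider, which is roughly what a regression test for the default path needs to assert: the compress trigger is never called.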
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🧹 Nitpick comments (4)
CHANGELOG.md (1)
13-13: Tighten phrasing in Line 13.
Consider replacing “the exact opposite” with “the opposite” for cleaner wording.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@CHANGELOG.md` at line 13, The phrasing in the changelog sentence describing mem::observe→mem::compress behavior is a bit wordy; update the sentence that reads “which is the exact opposite of what a memory tool should do” to use “which is the opposite of what a memory tool should do” instead. Edit the line describing the old mem::observe path (mentions mem::compress and PostToolUse hook and Claude/ANTHROPIC_API_KEY) to swap the phrase while keeping the surrounding explanation about synthetic compression and BM25 intact.
src/functions/export-import.ts (1)
178-178: Centralize supported export versions to reduce release drift risk.
The inline literal works, but this list is easy to miss during future bumps. Consider defining a shared `SUPPORTED_EXPORT_VERSIONS` constant (and deriving `VERSION`/type unions from it) to keep one source of truth.
Based on learnings: "When bumping version, update: package.json version field, src/version.ts VERSION constant and type union, src/types.ts ExportData version union, src/functions/export-import.ts supportedVersions set, test/export-import.test.ts version assertion, and plugin.json version field".
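A minimal sketch of the single-source-of-truth idea, assuming a `const`-asserted tuple is acceptable for the project. Only `SUPPORTED_EXPORT_VERSIONS` and `VERSION` are names taken from the comment; `ExportVersion`, `isSupportedExportVersion`, and the "0.0.0-example" entry are hypothetical placeholders, since the actual list of supported versions is not shown here.

```ts
// "0.8.8" comes from this PR; "0.0.0-example" is a placeholder for older entries.
const SUPPORTED_EXPORT_VERSIONS = ["0.0.0-example", "0.8.8"] as const;

// The union type is derived from the tuple, so it cannot drift from the list.
type ExportVersion = (typeof SUPPORTED_EXPORT_VERSIONS)[number];

// Fails to compile if "0.8.8" is ever removed from the tuple above.
const VERSION: ExportVersion = "0.8.8";

// Runtime guard for imported data, backed by the same tuple.
function isSupportedExportVersion(value: string): value is ExportVersion {
  return (SUPPORTED_EXPORT_VERSIONS as readonly string[]).includes(value);
}
```

With this shape, a version bump would touch the tuple and the `VERSION` literal, and the compiler would flag a mismatch between them; files outside TypeScript (package.json, plugin.json) would still need a manual update.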
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/functions/export-import.ts` at line 178, Replace the inline Set literal used in supportedVersions with a single shared constant array/Set named SUPPORTED_EXPORT_VERSIONS and import/use it in the export-import code (replace supportedVersions with SUPPORTED_EXPORT_VERSIONS), then derive the exported VERSION constant and any version type unions (e.g., the ExportData version union) from that single source so there’s one source of truth; update related references/tests (e.g., the version assertion in export-import.test.ts) to import/compare against SUPPORTED_EXPORT_VERSIONS (or a helper like LATEST_EXPORT_VERSION) instead of hard‑coding literals to avoid release drift.
src/functions/compress-synthetic.ts (1)
18-20: Map task signals to `task` to preserve type fidelity.
Line 40 currently maps `"task"` to `"subagent"`, which collapses two distinct `ObservationType` values and can weaken type-based filtering/analytics.
Proposed adjustment
    -  if (hookType === "subagent_stop" || hookType === "task_completed")
    -    return "subagent";
    +  if (hookType === "subagent_stop") return "subagent";
    +  if (hookType === "task_completed") return "task";
    @@
    -  if (["task", "agent"].some(hasWord)) return "subagent";
    +  if (["task"].some(hasWord)) return "task";
    +  if (["agent", "subagent"].some(hasWord)) return "subagent";
Also applies to: 40-40
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/functions/compress-synthetic.ts` around lines 18 - 20, The current mapping collapses distinct ObservationType values by returning "subagent" for task-related signals; update the mapping logic so that hookType === "task" returns "task" (preserving ObservationType fidelity) while keeping hookType === "subagent_stop" and hookType === "task_completed" mapped to "subagent", and "notification" mapped to "notification"; locate the conditional that checks hookType in compress-synthetic.ts and add an explicit branch for "task" (referencing hookType and ObservationType) so analytics/type-based filtering remain accurate.
test/auto-compress.test.ts (1)
172-179: Remove redundant dynamic import inside the loop.
The `buildSyntheticCompression` function is already imported at line 152-154. The import at line 174 inside the loop is unnecessary and adds overhead.
♻️ Suggested simplification
     for (const [name, expectedType] of cases) {
    -  const synthetic = (
    -    await import("../src/functions/compress-synthetic.js")
    -  ).buildSyntheticCompression({ ...base, toolName: name });
    +  const synthetic = buildSyntheticCompression({ ...base, toolName: name });
       expect(synthetic.type, `${name} -> ${expectedType}`).toBe(expectedType);
     }
    -// silence unused warning — buildSyntheticCompression is used above
    -expect(typeof buildSyntheticCompression).toBe("function");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/auto-compress.test.ts` around lines 172 - 179, The loop currently performs a dynamic import of "../src/functions/compress-synthetic.js" each iteration; remove that import and call the already-imported buildSyntheticCompression (the symbol imported earlier at lines ~152-154) directly inside the loop when creating synthetic via buildSyntheticCompression({ ...base, toolName: name }); also remove or adjust the trailing expect that exists solely to silence an "unused" warning (since buildSyntheticCompression will now be used), i.e., delete the final expect(typeof buildSyntheticCompression).toBe("function") line so there is no redundant import or unused-symbol workaround.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/functions/compress-synthetic.ts`:
- Line 79: The computed fallback variable toolName (const toolName =
raw.toolName ?? raw.hookType) is not being used for type inference; replace
usages of raw.toolName in the type-inference call with the normalized toolName
so the fallback is applied (preventing inference falling back to "other").
Locate the inference call near where raw.toolName is passed (around the
subsequent call at ~line 92) and pass toolName instead; ensure any other places
in this function that use raw.toolName for inference also use the normalized
toolName.
In `@test/auto-compress.test.ts`:
- Around line 65-72: Tests for mem::observe rely on dynamic imports that cache
module state; add vi.resetModules() in the test setup to clear the module cache
between runs so environment-driven config reads (e.g.,
AGENTMEMORY_AUTO_COMPRESS) are re-evaluated. Update the beforeEach (and/or
afterEach) block surrounding the describe("mem::observe auto-compress gate
(`#138`)") to call vi.resetModules() before deleting/setting process.env, ensuring
observe.js and any config modules are re-imported fresh for each test.
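The stale-state problem behind the `vi.resetModules()` suggestion can be illustrated without a test runner. The sketch below assumes the flag is enabled only by the exact string "true" and models a cached module as a closure that reads the environment once; `makeCachedConfig`, `makeLiveConfig`, and the `EnvLike` type are hypothetical names, not the project's config API.

```ts
type EnvLike = Record<string, string | undefined>;

// Models a module that reads the flag once at import time and keeps the result.
function makeCachedConfig(env: EnvLike): { isAutoCompressEnabled(): boolean } {
  const enabled = env["AGENTMEMORY_AUTO_COMPRESS"] === "true"; // captured once
  return { isAutoCompressEnabled: () => enabled };
}

// Models a config helper that re-reads the flag on every call.
function makeLiveConfig(env: EnvLike): { isAutoCompressEnabled(): boolean } {
  return { isAutoCompressEnabled: () => env["AGENTMEMORY_AUTO_COMPRESS"] === "true" };
}

const fakeEnv: EnvLike = {};
const cachedConfig = makeCachedConfig(fakeEnv); // "imported" while the flag is unset
const liveConfig = makeLiveConfig(fakeEnv);

fakeEnv["AGENTMEMORY_AUTO_COMPRESS"] = "true"; // a later test flips the flag

const staleValue = cachedConfig.isAutoCompressEnabled(); // false: still the old value
const liveValue = liveConfig.isAutoCompressEnabled(); // true: read at call time

// Re-creating the cached config plays the role of resetting the module cache
// and importing the module again.
const reimportedValue = makeCachedConfig(fakeEnv).isAutoCompressEnabled(); // true
```

If any module in the import chain behaves like `makeCachedConfig`, a test that sets the variable after the first import sees the old value, which is the failure mode that clearing the module cache in `beforeEach` guards against.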
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 575b85ba-658c-4648-b6ca-3dd61eb76ecf
⛔ Files ignored due to path filters (1)
- package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (14)
- CHANGELOG.md
- README.md
- package.json
- packages/mcp/package.json
- plugin/.claude-plugin/plugin.json
- src/config.ts
- src/functions/compress-synthetic.ts
- src/functions/export-import.ts
- src/functions/observe.ts
- src/index.ts
- src/types.ts
- src/version.ts
- test/auto-compress.test.ts
- test/export-import.test.ts
- compress-synthetic.ts:92: use the fallback toolName for inferType so both title and type derive from the same resolved value
- test/auto-compress.test.ts: vi.resetModules() in beforeEach to prevent cached observe.js from seeing stale env var state between the default, opt-in, and explicit-off test cases
🧹 Nitpick comments (2)
src/functions/compress-synthetic.ts (2)
18-19: Map `task_completed` to `task` instead of `subagent`.
Line 18 currently collapses `"task_completed"` into `"subagent"`, which can reduce precision for task-specific retrieval/analytics.
Suggested change
    -  if (hookType === "subagent_stop" || hookType === "task_completed")
    -    return "subagent";
    +  if (hookType === "subagent_stop") return "subagent";
    +  if (hookType === "task_completed") return "task";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/functions/compress-synthetic.ts` around lines 18 - 19, The current conditional in compress-synthetic.ts maps hookType === "task_completed" to "subagent" which reduces task-level precision; update the logic in the function handling hookType (the branch that currently reads if (hookType === "subagent_stop" || hookType === "task_completed") return "subagent";) so that "task_completed" returns "task" while "subagent_stop" still returns "subagent" (i.e., split the OR into two checks or add an explicit branch for "task_completed" returning "task").
7-10: Prefer self-documenting code over WHAT-style banner comments.
This header explains implementation behavior that is already clear from function names and call sites; consider trimming or moving rationale to changelog/docs.
As per coding guidelines, "Use clear, self-documenting variable and function names instead of code comments explaining WHAT".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/functions/compress-synthetic.ts` around lines 7 - 10, Remove the long WHAT-style banner comment describing the "Zero-LLM compression path" in src/functions/compress-synthetic.ts and replace it with a concise one-line summary above the compressSynthetic function (or the function that converts RawObservation to CompressedObservation) such as "Convert RawObservation to CompressedObservation using heuristic (no LLM)"; keep variable and function names (RawObservation, CompressedObservation, compressSynthetic) self-descriptive, and move the rationale about default behavior and AGENTMEMORY_AUTO_COMPRESS to the changelog/docs instead of in-file.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 74c92794-915e-49ca-9ee9-86f17e5d1e08
📒 Files selected for processing (2)
- src/functions/compress-synthetic.ts
- test/auto-compress.test.ts
✅ Files skipped from review due to trivial changes (1)
- test/auto-compress.test.ts
Closes #138.
The bug
@olcor1 reported that installing agentmemory on a fresh Docker stack exhausted his Claude API token allocation within 20 minutes, the exact opposite of what the tool is supposed to do. Reproduced.
Root cause: `src/functions/observe.ts:129` fired `mem::compress` unconditionally on every PostToolUse hook:
```ts
sdk.triggerVoid("mem::compress", { observationId: obsId, sessionId: payload.sessionId, raw });
```
`mem::compress` in turn calls `compressWithRetry(provider, COMPRESSION_SYSTEM, prompt, validator, 1)` which goes to Claude via the user's `ANTHROPIC_API_KEY`. No rate limit, no batching, no opt-in. An active coding session with 50-200 tool calls per hour burned hundreds of thousands of tokens silently in the background.
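As a rough back-of-the-envelope check on the scale, the sketch below multiplies the reported call rate by a per-call token cost. The figure of 2,000 tokens per compression call (prompt plus completion) is purely an assumption for illustration; the PR does not state a per-call cost, and real usage depends on the size of each tool's input and output.

```ts
// ASSUMPTION: average tokens consumed by one compression call. Not from the PR.
const ASSUMED_TOKENS_PER_COMPRESSION = 2000;

// One compression call was fired per PostToolUse hook, so the hourly cost
// scales linearly with the tool-call rate.
function estimateTokensPerHour(toolCallsPerHour: number, tokensPerCall: number): number {
  return toolCallsPerHour * tokensPerCall;
}

const lowEstimate = estimateTokensPerHour(50, ASSUMED_TOKENS_PER_COMPRESSION); // 100000
const highEstimate = estimateTokensPerHour(200, ASSUMED_TOKENS_PER_COMPRESSION); // 400000
```

Under that assumption the 50-200 calls/hour range lands in the hundreds of thousands of tokens per hour, all spent in the background with nothing surfaced to the user.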
The fix
Per-observation LLM compression is now opt-in.
Migration
If you were relying on LLM-generated summaries, add to `~/.agentmemory/.env`:
```env
AGENTMEMORY_AUTO_COMPRESS=true
```
and restart. Existing compressed observations on disk are untouched.
Files changed
Test plan
Release
Bumps to 0.8.8 (main package + `@agentmemory/mcp` shim). CHANGELOG includes a prominent Behavior change banner at the top of the [0.8.8] entry so upgraders see the migration note.