
feat: v0.2.0 -- 12 hooks, MCP tools, skills, pattern detection (#2)

Merged
rohitg00 merged 5 commits into main from feat/full-memory-upgrade
Feb 26, 2026

Conversation

rohitg00 (Owner) commented Feb 25, 2026

Summary

Major upgrade from 5 hooks to full Claude Code integration. agentmemory now covers 12 of 18 hook types, exposes 5 MCP tools for active memory queries, adds 4 slash command skills, and includes intelligent features like pattern detection and memory consolidation.

New Hooks (+7, total 12)

  • PreToolUse -- injects file history before Edit/Write/Read/Glob/Grep operations
  • PostToolUseFailure -- captures error patterns for cross-session learning
  • PreCompact -- preserves memory context through context window compaction
  • SubagentStart/SubagentStop -- tracks multi-agent workflows and transcripts
  • Notification -- captures permission prompts to learn tool preferences
  • TaskCompleted -- tracks team task completions
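
Per the review walkthrough, each hook script reads the hook payload from stdin and POSTs an observation to the agentmemory API fire-and-forget, with optional auth and a short timeout. A minimal sketch of that shape (the endpoint path, env variable names, and payload fields are assumptions modeled on the conventions described in this PR, not the shipped .mjs scripts):

```javascript
// Illustrative hook script skeleton: read the hook payload from stdin,
// POST an observation, and never block the session.
import { stdin, env, exit } from "node:process";

export function buildObservation(hookType, payload) {
  return {
    hook_type: hookType,
    session_id: payload.session_id ?? "unknown",
    data: payload,
  };
}

async function main() {
  let raw = "";
  for await (const chunk of stdin) raw += chunk;

  const headers = { "content-type": "application/json" };
  // Optional auth, mirroring the AGENTMEMORY_SECRET convention in this PR.
  if (env.AGENTMEMORY_SECRET) {
    headers.authorization = `Bearer ${env.AGENTMEMORY_SECRET}`;
  }

  // Fire-and-forget: 2 s timeout, all errors swallowed so the hook exits 0.
  await fetch(
    `${env.AGENTMEMORY_URL ?? "http://localhost:3000"}/agentmemory/observe`,
    {
      method: "POST",
      headers,
      body: JSON.stringify(buildObservation("notification", JSON.parse(raw))),
      signal: AbortSignal.timeout(2000),
    },
  ).catch(() => {});
}

// Guarded so importing this file (e.g. in a test) does not wait on stdin.
if (env.AGENTMEMORY_HOOK_RUN === "1") {
  main().catch(() => exit(0));
}
```

Swallowing errors is deliberate here: a memory hook should degrade silently rather than interrupt the Claude Code session.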

MCP Server (5 tools)

  • memory_recall -- search past observations mid-conversation
  • memory_save -- explicitly save insights to long-term memory
  • memory_file_history -- get past observations about specific files
  • memory_patterns -- detect recurring patterns across sessions
  • memory_sessions -- list recent sessions
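
All five tools go through the generic dispatch route (POST /agentmemory/mcp/call) rather than per-tool endpoints. A client-side sketch; the { name, arguments } body shape follows the common MCP tool-call convention and is an assumption, not a confirmed contract:

```javascript
// Builds a tool-call request for the MCP dispatch endpoint added in this PR.
// The { name, arguments } body shape is assumed from MCP convention.
export function buildToolCall(name, args) {
  return {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ name, arguments: args }),
  };
}

// Example: search past observations mid-conversation via memory_recall.
export async function recall(baseUrl, query) {
  const res = await fetch(
    `${baseUrl}/agentmemory/mcp/call`,
    buildToolCall("memory_recall", { query }),
  );
  if (!res.ok) throw new Error(`MCP call failed: ${res.status}`);
  return res.json();
}
```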

Skills (4 slash commands)

  • /recall [query] -- search memory
  • /remember [insight] -- save to long-term memory
  • /session-history -- show past session timeline
  • /forget [target] -- delete specific data (with confirmation)

New Functions (4)

  • mem::file-context -- file-centric memory for PreToolUse context injection
  • mem::consolidate -- LLM-powered observation merging into long-term memories
  • mem::patterns + mem::generate-rules -- co-change detection, error pattern tracking
  • mem::remember + mem::forget -- explicit save/delete for long-term memory
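
The co-change side of mem::patterns can be illustrated by counting how often pairs of files show up in the same observation across sessions. This is a toy approximation of the idea, not the shipped implementation:

```javascript
// Toy co-change detector: counts how often two files are touched in the
// same observation. Illustrative only; mem::patterns may differ.
export function coChangePairs(observations, minCount = 2) {
  const counts = new Map();
  for (const obs of observations) {
    const files = [...new Set(obs.files ?? [])].sort();
    for (let i = 0; i < files.length; i++) {
      for (let j = i + 1; j < files.length; j++) {
        const key = `${files[i]}\u0000${files[j]}`;
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return [...counts.entries()]
    .filter(([, n]) => n >= minCount)
    .sort((a, b) => b[1] - a[1])
    .map(([key, count]) => {
      const [a, b] = key.split("\u0000");
      return { files: [a, b], count };
    });
}
```

Pairs that recur above the threshold are candidates for generated rules ("editing a.ts usually means touching b.ts").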

New API Endpoints (+7, total 17)

/agentmemory/file-context, /remember, /forget, /consolidate, /patterns, /generate-rules, /mcp/*
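
These endpoints follow the plain auth-plus-JSON pattern used elsewhere in the service. A sketch of a request builder for /agentmemory/remember; the bearer header and body fields mirror the optional-auth convention and the mem::remember input shape shown in this PR, and should be treated as assumptions:

```javascript
// Builds a fetch() init for POST /agentmemory/remember. Field names follow
// the mem::remember input shape (content, type?, concepts?, files?); the
// bearer auth header is assumed from the hooks' optional-auth pattern.
export function rememberRequest(content, opts = {}) {
  return {
    method: "POST",
    headers: {
      "content-type": "application/json",
      ...(opts.secret ? { authorization: `Bearer ${opts.secret}` } : {}),
    },
    body: JSON.stringify({
      content,
      type: opts.type,
      concepts: opts.concepts,
      files: opts.files,
    }),
  };
}
```

JSON.stringify drops the optional fields when they are undefined, so callers only send what they set.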

Stats

  • 33 files changed, +1,716 lines
  • 12 hook scripts (7 new)
  • 5 MCP tools
  • 4 skills
  • 17 API endpoints (7 new)
  • 11 iii Functions (4 new)
  • Build: 56KB main + 12 hook scripts
  • Tests: 45/45 passing

Test plan

  • npx tsc --noEmit -- zero type errors
  • npm test -- 45/45 tests pass
  • npm run build -- all 3 build targets succeed
  • Integration test with all 12 hooks registered in Claude Code
  • MCP tool calls via /agentmemory/mcp/call
  • PreCompact context preservation through compaction cycle
  • Memory consolidation with real observations

Summary by CodeRabbit

  • New Features

    • Save/recall/forget memories, session-history and file-context lookups, memory consolidation, pattern detection with generated maintenance rules, eviction tools, and MCP tool endpoints.
    • New lifecycle hooks (pre/post-tool, compaction, subagent start/stop, notification, task-completed) with contextual helpers and pretool file-context.
    • Health & metrics reporting, runtime health monitoring, deduplication, and quality/metrics scoring for operations.
  • Documentation

    • Added user-facing skill guides for remember, recall, forget, and session-history.
  • Tests

    • New tests covering circuit-breaker and evaluation/validation/quality logic.
  • Chores

    • Version bumped to 0.3.0.

… and intelligence

New hooks (7):
- PreToolUse: inject file history before edits (Edit/Write/Read/Glob/Grep)
- PostToolUseFailure: capture error patterns for learning
- PreCompact: preserve memory context through compaction
- SubagentStart/SubagentStop: track multi-agent workflows
- Notification: capture permission prompts (tool preferences)
- TaskCompleted: track team task completions

New functions (4):
- mem::file-context: file-centric memory index for PreToolUse
- mem::consolidate: merge observations into long-term memories via LLM
- mem::patterns + mem::generate-rules: detect co-change patterns and recurring errors
- mem::remember + mem::forget: explicit save/delete for long-term memory

MCP server (2 endpoints):
- GET /agentmemory/mcp/tools: list 5 MCP tools (recall, save, file_history, patterns, sessions)
- POST /agentmemory/mcp/call: dispatch tool calls to iii functions

Skills (4):
- /recall [query]: search past observations
- /remember [insight]: save to long-term memory
- /session-history: show past session timeline
- /forget [target]: delete specific memory data

New API endpoints (7):
- POST /agentmemory/file-context
- POST /agentmemory/remember
- POST /agentmemory/forget
- POST /agentmemory/consolidate
- POST /agentmemory/patterns
- POST /agentmemory/generate-rules
- GET/POST /agentmemory/mcp/*

Updated: types (7 new HookTypes, 3 new ObservationTypes), version 0.2.0
coderabbitai (Bot) commented Feb 25, 2026

📝 Walkthrough

Adds plugin/hook metadata and seven new hook scripts; implements multiple mem::* functions (remember/forget/consolidate/patterns/file-context/evict), metrics, dedup/health tooling, resilient provider/circuit breaker, MCP/API endpoints, eval schemas/validators/quality scoring, BM25 search index, tests, and docs.

Changes

Cohort / File(s) -- Summary

  • Version & Plugin Metadata (package.json, plugin/.claude-plugin/plugin.json, tsdown.config.ts) -- Package and plugin versions updated; plugin description expanded and skills path added; new hook entry files added to build config.
  • Hook Definitions & Scripts (plugin/hooks/hooks.json, plugin/scripts/*.mjs, src/hooks/*) -- Added seven hook definitions and corresponding script and TS hook implementations that read stdin JSON and POST observations/context to AGENTMEMORY endpoints (fire-and-forget, optional auth, timeouts).
  • Memory Functions & Runtime (src/functions/*: compress.ts, observe.ts, consolidate.ts, remember.ts, patterns.ts, file-index.ts, evict.ts, summarize.ts, dedup.ts) -- New mem::* functions (remember/forget, consolidate, patterns, file-context, evict); observe/compress/summarize updated to support dedup, metrics, retries, validation, and quality scoring; added DedupMap and ObservationQueue integrations.
  • Eval / Quality / Validation (src/eval/*.ts: schemas.ts, validator.ts, self-correct.ts, quality.ts, metrics-store.ts) -- Added Zod schemas for inputs/outputs, validateInput/validateOutput, compressWithRetry, scoring utilities, and a MetricsStore for per-function metrics.
  • Telemetry, Health & Recovery (src/telemetry/setup.ts, src/health/*) -- Added OTEL scaffolding, a health monitor with snapshot storage, a health thresholds evaluator, and an ObservationQueue to buffer hook payloads.
  • Providers & Resilience (src/providers/*: circuit-breaker.ts, resilient.ts, index.ts) -- New CircuitBreaker and ResilientProvider; the provider factory now returns a ResilientProvider wrapper exposing circuit state.
  • MCP, API & Startup Wiring (src/mcp/server.ts, src/triggers/api.ts, src/index.ts) -- New MCP tool endpoints and server; many new API routes (remember/forget/consolidate/patterns/file-context/evict/generate-rules) and startup wiring to register new functions, metrics, dedup, and the health monitor.
  • Types, Prompts & State Schema (src/types.ts, src/prompts/compression.ts, src/state/schema.ts) -- Extended ObservationType and HookType with new values; compression prompt updated to include the new types; added KV keys for metrics and health.
  • Search Index & File Context (src/state/search-index.ts, src/functions/file-index.ts) -- Replaced the simple index with BM25-like scoring and exposed index size; added a file-context aggregation function producing contextual text for files.
  • Skills Documentation (plugin/skills/*/SKILL.md) -- Added four skill docs (remember, recall, session-history, forget) with usage examples and curl commands.
  • Tests (test/circuit-breaker.test.ts, test/eval.test.ts) -- Added tests for CircuitBreaker state transitions and comprehensive tests for eval schemas, validators, and quality scoring functions.
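
The CircuitBreaker state transitions covered by test/circuit-breaker.test.ts typically follow the closed -> open -> half-open cycle. A generic sketch under assumed thresholds, not the code in src/providers/circuit-breaker.ts:

```javascript
// Generic circuit breaker: opens after `threshold` consecutive failures,
// allows a probe after `cooldownMs` (half-open), and closes again on a
// successful probe. Thresholds and the injected clock are illustrative.
export class CircuitBreaker {
  constructor(threshold = 3, cooldownMs = 30_000, now = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.failures = 0;
    this.state = "closed";
    this.openedAt = 0;
  }

  canRequest() {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow one probe through
    }
    return this.state !== "open";
  }

  recordSuccess() {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}
```

Wrapping the model provider this way keeps a flaky upstream from stalling every hook; callers check canRequest() before each compress/summarize call.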

Sequence Diagram

sequenceDiagram
    participant Hook as Hook Script (stdin)
    participant API as AgentMemory API
    participant SDK as SDK / Function Runner
    participant Provider as Model Provider
    participant KV as KV Storage
    participant MCP as MCP Client

    Hook->>API: POST /agentmemory/observe or /context (hook payload)
    API->>KV: persist observation / read context
    API->>SDK: trigger mem::... (observe, file-context, consolidate, etc.)
    SDK->>KV: read/write observations & memories
    SDK->>Provider: compress / summarize requests
    Provider-->>SDK: compressed XML / summary
    SDK->>KV: store consolidated memories / summaries
    MCP->>API: GET/POST /agentmemory/mcp/... (list / call)
    API->>SDK: invoke mem::* for MCP tool call
    SDK-->>API: formatted tool response
    API-->>MCP: return tool output
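
The BM25-like scoring that replaced the simple search index (src/state/search-index.ts in the Changes list) can be sketched as below; the k1/b constants and whitespace tokenizer are conventional defaults, not necessarily the shipped values:

```javascript
// Minimal BM25 scorer over an in-memory doc set. k1/b are the usual
// defaults; the real index in src/state/search-index.ts may differ.
export function bm25Scores(query, docs, k1 = 1.2, b = 0.75) {
  const tokenize = (s) => s.toLowerCase().split(/\W+/).filter(Boolean);
  const tokenized = docs.map(tokenize);
  const avgLen =
    tokenized.reduce((sum, d) => sum + d.length, 0) / (docs.length || 1);
  const N = docs.length;
  return tokenized.map((doc) => {
    let score = 0;
    for (const term of tokenize(query)) {
      const df = tokenized.filter((d) => d.includes(term)).length;
      if (df === 0) continue;
      const idf = Math.log(1 + (N - df + 0.5) / (df + 0.5));
      const tf = doc.filter((t) => t === term).length;
      const norm = 1 - b + b * (doc.length / avgLen); // length normalization
      score += (idf * tf * (k1 + 1)) / (tf + k1 * norm);
    }
    return score;
  });
}
```

Unlike a plain term-count index, BM25 discounts terms that appear in most observations and favors matches in shorter documents.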

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 I nibble logs and follow the trail,
Hooks sprout like carrots, long and pale,
I stitch observations into lore,
Consolidate, recall — memories galore!
Hooray for bumps and tools to explore.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage -- ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check -- ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check -- ✅ Passed: the title clearly and concisely summarizes the main features added in this version bump: 12 hooks, MCP tools, skills, and pattern detection capabilities.



coderabbitai (Bot) left a comment

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/index.ts (1)

62-83: ⚠️ Potential issue | 🟡 Minor

Missing /agentmemory/generate-rules from readiness log.

Per the API triggers (see src/triggers/api.ts snippet — api::generate-rules: POST /agentmemory/generate-rules), this endpoint exists but isn't listed in the startup banner. Minor omission.

Add the missing log line
   console.log(`  POST /agentmemory/patterns          - Detect patterns`);
+  console.log(`  POST /agentmemory/generate-rules    - Generate rules from memory`);
   console.log(`  GET  /agentmemory/sessions          - List sessions`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 62 - 83, Startup readiness logs in src/index.ts
omit the /agentmemory/generate-rules endpoint; add a console.log line matching
the style of the other entries to include "POST /agentmemory/generate-rules -
Generate rules" (or similar descriptive text) near the other endpoint prints so
the startup banner reflects the actual API triggers (refer to the existing
console.log block that lists endpoints and the api::generate-rules trigger
string).
🧹 Nitpick comments (5)
src/functions/remember.ts (1)

10-15: Validate content before saving memory.

Rejecting empty or whitespace-only content will prevent low-value entries from being persisted.

Proposed refactor
     async (data: {
       content: string;
       type?: string;
       concepts?: string[];
       files?: string[];
     }) => {
+      if (!data.content || !data.content.trim()) {
+        return { success: false, error: "content is required" };
+      }
       const ctx = getContext();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/remember.ts` around lines 10 - 15, The handler's input object
(the async (data: { content: string; ... }) => { ... }) does not validate
data.content; add a guard at the start of that anonymous async function to trim
and check data.content for non-empty, non-whitespace content and reject early
(throw an Error or return a 4xx validation response) when the check fails;
ensure you reference data.content in the check and only proceed to the existing
save/persist logic when data.content.trim().length > 0 to avoid storing
empty/whitespace-only memories.
src/hooks/post-tool-failure.ts (1)

39-43: Consider truncating tool_input and error to bound payload size.

last_assistant_message is truncated to 4000 chars in subagent-stop.ts and task_description to 2000 chars in task-completed.mjs, but tool_input and error are sent unbounded here. A large file write or verbose stack trace could produce an oversized request body.

Proposed truncation
       data: {
         tool_name: data.tool_name,
-        tool_input: data.tool_input,
-        error: data.error,
+        tool_input: typeof data.tool_input === "string"
+          ? data.tool_input.slice(0, 4000)
+          : data.tool_input,
+        error: typeof data.error === "string"
+          ? data.error.slice(0, 4000)
+          : data.error,
       },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/post-tool-failure.ts` around lines 39 - 43, The payload in
src/hooks/post-tool-failure.ts sends unbounded data.tool_input and data.error
which can make the request body oversized; before constructing the data object
in the post-tool-failure handler, truncate tool_input (e.g., to 4000 chars) and
error (e.g., to 2000 chars) by using a small helper or a .slice and append an
ellipsis when shortened, then assign the truncated values to data.tool_input and
data.error so the rest of the code (the same data object used when calling the
downstream API) sends bounded strings.
src/hooks/subagent-stop.ts (1)

46-46: Inconsistent timeout: 3000 ms here vs 2000 ms in subagent-start.ts and most other hooks.

subagent-start.ts (line 42) and notification.ts (line 45) use AbortSignal.timeout(2000), while this and post-tool-failure.ts use 3000 ms. If intentional (e.g., larger payloads), a brief comment would help; otherwise, consider aligning to a single constant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/subagent-stop.ts` at line 46, The timeout used for the signal
assignment (AbortSignal.timeout(3000)) in subagent-stop.ts is inconsistent with
other hooks; change AbortSignal.timeout(3000) to AbortSignal.timeout(2000) to
match subagent-start.ts and notification.ts, or if the longer timeout is
intentional, add a brief inline comment next to the signal assignment explaining
why 3000ms is required (and consider mirroring that comment in
post-tool-failure.ts if it uses the same rationale).
src/functions/file-index.ts (1)

36-43: Avoid repeated KV observation reads per file.

Line 41 through Line 43 re-fetches the same session observations for each requested file. Cache observations per session once, then reuse in the file loop to cut read amplification.

Refactor sketch
+      const observationsBySession = new Map<string, CompressedObservation[]>();
+      for (const session of otherSessions) {
+        observationsBySession.set(
+          session.id,
+          await kv.list<CompressedObservation>(KV.observations(session.id)),
+        );
+      }
+
       for (const file of data.files) {
@@
         for (const session of otherSessions) {
-          const observations = await kv.list<CompressedObservation>(
-            KV.observations(session.id),
-          );
+          const observations = observationsBySession.get(session.id) || [];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/file-index.ts` around lines 36 - 43, Currently the code calls
kv.list<CompressedObservation>(KV.observations(session.id)) inside the inner
loop over files, causing repeated reads for the same session; instead, before
iterating data.files build a cache (e.g., a Map keyed by session.id) by
iterating otherSessions once and fetching kv.list for each session, storing the
resulting observations; then in the for (const file of data.files) loop reuse
that cached observations map (referencing KV.observations, otherSessions,
session.id, and the existing history/FileHistory logic) so you only read each
session's observations once.
src/triggers/api.ts (1)

195-300: Consider extracting an authenticated passthrough helper for these endpoints.

The new blocks repeat the same auth + sdk.trigger pattern; a small helper would reduce duplication and future drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` around lines 195 - 300, Extract the repeated auth +
trigger pattern into a reusable helper (e.g., createAuthPassthrough or
registerAuthPassthrough) that takes the triggerId (like "mem::file-context",
"mem::remember", "mem::forget", "mem::consolidate", "mem::patterns",
"mem::generate-rules"), the http response status (200/201), and optionally the
request body type; implement the helper to perform checkAuth(req, secret),
return the auth error if present, call sdk.trigger(triggerId, req.body) and
return the response object, then replace each inline handler (the async
functions registered for "api::file-context", "api::remember", "api::forget",
"api::consolidate", "api::patterns", "api::generate-rules") with a call to this
helper and keep the existing sdk.registerTrigger calls unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugin/scripts/post-tool-failure.mjs`:
- Around line 31-35: The persisted data object in post-tool-failure.mjs
currently stores tool_input and error verbatim; implement a sanitization step
(e.g., a function like redactAndTruncate or sanitizeToolPayload) and call it
before building the data payload so that tool_input and error have secrets
redacted (mask likely secrets/PII) and are size-bounded (truncate stringified
values to a safe length, e.g., 1000 chars) and converted to safe strings;
replace direct uses of data.tool_input and data.error with the sanitized outputs
when constructing the data object for persistence.

In `@plugin/scripts/pre-tool-use.mjs`:
- Around line 30-35: The loop that pushes keys
["file_path","path","file","pattern"] into the files array is treating Grep's
search expression ("pattern") as a file path; update the loop in
pre-tool-use.mjs so it skips adding the value when key === "pattern" for the
Grep tool (e.g., check tool name/id or tool.type equals "Grep" before pushing),
or more generally only push when the key is "pattern" and the value matches a
filesystem path pattern; modify the logic around the for (const key of [...])
block that populates files and ensure /agentmemory/file-context only receives
actual file paths.

In `@plugin/skills/recall/SKILL.md`:
- Line 10: Replace the raw interpolation of $ARGUMENTS into the curl JSON
payload with an encoded JSON body: instead of embedding "$ARGUMENTS" directly in
the -d string, build the request body using a JSON encoder (e.g., jq or a small
python/perl/json-encode call) so that quotes/newlines are escaped safely, then
pass that encoded string to curl -d; update the curl invocation that references
$ARGUMENTS, AGENTMEMORY_URL and AGENTMEMORY_SECRET to use the encoded payload
variable.

In `@src/functions/consolidate.ts`:
- Around line 69-72: The code currently sets minObs with falsy-coercion which
treats 0 as default and allows negatives; update the async handler "(data: {
project?: string; minObservations?: number }) =>" to validate
data.minObservations explicitly: if data.minObservations is undefined use
default 10, else ensure it's a finite integer >= 0 (e.g. Number.isFinite and
Number.isInteger or at least >=0) and if invalid throw/return a clear error
(RangeError or appropriate ctx error) before using minObs; replace the current
"const minObs = data.minObservations || 10" with this validated assignment to
ensure 0 is accepted and negatives/non-numeric values are rejected.
- Around line 124-128: The call to provider.compress (inside the try block using
CONSOLIDATION_SYSTEM with `concept` and `prompt`) lacks a timeout and can hang;
wrap this call in a timeout guard (e.g., Promise.race with a timeout promise or
use an abortable API) so the compress call rejects after a configurable timeout
(e.g., COMPRESS_TIMEOUT_MS) and then handle that rejection in the existing catch
to log/propagate a clear timeout error; ensure you reference and use
provider.compress exactly as currently invoked and keep the timeout value
configurable and documented.

In `@src/functions/file-index.ts`:
- Around line 27-34: Filter sessions to the current project before
sorting/slicing: when building otherSessions from kv.list<Session>(KV.sessions),
add a filter that keeps only sessions whose project identifier matches the
current project's id (e.g., session.projectId === data.projectId or similar) in
addition to excluding data.sessionId, then sort and slice as before; update the
otherSessions expression (the filter chain starting from
sessions.filter(...).sort(...).slice(...)) to include this project-scoping check
so only same-project session history is returned.

In `@src/functions/observe.ts`:
- Around line 35-45: The code reads fields from payload.data (d) allowing
unsanitized content to persist; change the extraction to use the
already-sanitized object (sanitizedRaw) instead of payload.data so raw.toolName,
raw.toolInput, raw.toolOutput and raw.userPrompt are populated from
sanitizedRaw["tool_name"], sanitizedRaw["tool_input"],
sanitizedRaw["tool_output"] (or ["error"]) and sanitizedRaw["prompt"]
respectively; update the conditional blocks that check payload.hookType
("post_tool_use", "post_tool_failure", "prompt_submit") to reference
sanitizedRaw for all extracted values to ensure sensitive data is not bypassed.

In `@src/functions/patterns.ts`:
- Around line 49-50: The grouping key for error patterns is being truncated at
80 chars (the key variable created where obs.type === "error" and obs.title is
used), which causes distinct errors with a common prefix to be merged; update
the logic that produces key in patterns.ts to use a full normalized key (e.g.,
obs.title.trim().toLowerCase() without .slice(0,80)) and optionally extend it to
include other stable fields such as obs.message or obs.stack if available, so
grouping uses the complete normalized identifier rather than a truncated prefix.

In `@src/functions/remember.ts`:
- Around line 68-76: The forget flow currently treats an empty observationIds
array as a silent no-op; update the handler in remember.ts to explicitly detect
Array.isArray(data.observationIds) && data.observationIds.length === 0 and
return a clear error/early response (or throw) instead of proceeding, so callers
get notified; modify the block around the checks that reference
data.observationIds, data.sessionId, kv.delete and KV.observations (and the
deleted counter) to perform this explicit empty-array validation before
attempting deletions or listing.

In `@src/mcp/server.ts`:
- Around line 147-211: The handler assumes req.body, name and args exist and
directly calls sdk.trigger in each case; add input validation and an error
boundary: first check req.body is an object and that body.name is a non-empty
string and body.arguments is an object (return a 400 McpResponse if not), then
wrap the entire switch dispatch (the switch on name in the exported handler) in
a try/catch so any sdk.trigger or kv.list failures return a controlled 500
McpResponse with an error message; additionally, validate per-tool required args
before calling sdk.trigger (e.g., for "memory_recall" ensure arguments.query is
a string, for "memory_file_history" ensure arguments.files is a string) and
return 400 if missing, and normalize parsing (safe string splits) inside the
respective "memory_recall", "memory_save", "memory_file_history",
"memory_patterns", and "memory_sessions" cases.

In `@src/triggers/api.ts`:
- Around line 195-300: These handlers (api::remember, api::forget,
api::consolidate, api::patterns, api::generate-rules and api::file-context)
currently forward req.body directly; add explicit request-body validation at the
top of each registered function (after auth via checkAuth) to return a 400
Response for malformed or missing required fields. For api::file-context ensure
sessionId is present and files is a non-empty array of strings; for
api::remember ensure content is a non-empty string (and if present, files is an
array, concepts is an array, type is a string); for api::forget ensure at least
one of sessionId, observationIds (non-empty array) or memoryId is provided; for
api::consolidate, api::patterns and api::generate-rules validate types (project
if provided is a string, minObservations if provided is a positive number) and
return { status_code: 400, body: { error: '...' } } on validation failure before
calling sdk.trigger. Use the function ids (e.g., "api::remember", "api::forget")
to locate the handlers and keep checkAuth usage unchanged.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cde447b and b7d00ea.

📒 Files selected for processing (33)
  • package.json
  • plugin/.claude-plugin/plugin.json
  • plugin/hooks/hooks.json
  • plugin/scripts/notification.mjs
  • plugin/scripts/post-tool-failure.mjs
  • plugin/scripts/pre-compact.mjs
  • plugin/scripts/pre-tool-use.mjs
  • plugin/scripts/subagent-start.mjs
  • plugin/scripts/subagent-stop.mjs
  • plugin/scripts/task-completed.mjs
  • plugin/skills/forget/SKILL.md
  • plugin/skills/recall/SKILL.md
  • plugin/skills/remember/SKILL.md
  • plugin/skills/session-history/SKILL.md
  • src/functions/compress.ts
  • src/functions/consolidate.ts
  • src/functions/file-index.ts
  • src/functions/observe.ts
  • src/functions/patterns.ts
  • src/functions/remember.ts
  • src/hooks/notification.ts
  • src/hooks/post-tool-failure.ts
  • src/hooks/pre-compact.ts
  • src/hooks/pre-tool-use.ts
  • src/hooks/subagent-start.ts
  • src/hooks/subagent-stop.ts
  • src/hooks/task-completed.ts
  • src/index.ts
  • src/mcp/server.ts
  • src/prompts/compression.ts
  • src/triggers/api.ts
  • src/types.ts
  • tsdown.config.ts

Comment on lines +31 to +35
data: {
  tool_name: data.tool_name,
  tool_input: data.tool_input,
  error: data.error
}


⚠️ Potential issue | 🟠 Major

Redact and bound failure payload fields before persistence.

Lines 31–35 persist tool_input and error verbatim. Those fields can include sensitive values and oversized blobs, which is a privacy and retention risk.

Proposed fix
+function truncateString(v, max = 2000) {
+	return typeof v === "string" ? v.slice(0, max) : v;
+}
+
+function safeToolInput(input) {
+	if (!input || typeof input !== "object") return input;
+	const clone = { ...input };
+	for (const key of ["content", "text", "prompt", "token", "secret", "password", "apiKey"]) {
+		if (key in clone) clone[key] = "[redacted]";
+	}
+	return clone;
+}
+
 ...
 				data: {
 					tool_name: data.tool_name,
-					tool_input: data.tool_input,
-					error: data.error
+					tool_input: safeToolInput(data.tool_input),
+					error: truncateString(data.error, 2000)
 				}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/scripts/post-tool-failure.mjs` around lines 31 - 35, The persisted
data object in post-tool-failure.mjs currently stores tool_input and error
verbatim; implement a sanitization step (e.g., a function like redactAndTruncate
or sanitizeToolPayload) and call it before building the data payload so that
tool_input and error have secrets redacted (mask likely secrets/PII) and are
size-bounded (truncate stringified values to a safe length, e.g., 1000 chars)
and converted to safe strings; replace direct uses of data.tool_input and
data.error with the sanitized outputs when constructing the data object for
persistence.
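Assuming the `truncateString` and `safeToolInput` helpers sketched in the proposed fix above (names and redaction key list are illustrative, not the shipped implementation), a quick sanity check of the intended behavior:

```typescript
// Hypothetical helpers mirroring the proposed redaction fix; illustrative only.
function truncateString(v: unknown, max = 2000): unknown {
  return typeof v === "string" ? v.slice(0, max) : v;
}

function safeToolInput(input: unknown): unknown {
  if (!input || typeof input !== "object") return input;
  const clone: Record<string, unknown> = { ...(input as Record<string, unknown>) };
  for (const key of ["content", "text", "prompt", "token", "secret", "password", "apiKey"]) {
    if (key in clone) clone[key] = "[redacted]";
  }
  return clone;
}

const sanitized = safeToolInput({ file_path: "a.ts", content: "top secret" }) as Record<string, unknown>;
// file_path passes through, content is masked, long errors are size-bounded.
if (sanitized.content !== "[redacted]") throw new Error("content not redacted");
if (sanitized.file_path !== "a.ts") throw new Error("file_path should pass through");
if ((truncateString("x".repeat(5000)) as string).length !== 2000) throw new Error("not truncated");
```

Note the shallow clone: nested objects under other keys would still carry their original values, which is why the AI-agent prompt above also asks for stringification and size bounding before persistence.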

Comment thread plugin/scripts/pre-tool-use.mjs Outdated
Comment thread plugin/skills/recall/SKILL.md Outdated
Comment on lines +69 to +72
async (data: { project?: string; minObservations?: number }) => {
const ctx = getContext();
const minObs = data.minObservations || 10;


⚠️ Potential issue | 🟡 Minor

Validate minObservations before use.

Line 71 allows invalid numbers (for example negative) and treats 0 as default via falsy coercion.

Proposed validation
-      const minObs = data.minObservations || 10;
+      const minObs =
+        Number.isInteger(data.minObservations) &&
+        (data.minObservations as number) > 0
+          ? (data.minObservations as number)
+          : 10;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/consolidate.ts` around lines 69 - 72, The code currently sets
minObs with falsy-coercion which treats 0 as default and allows negatives;
update the async handler "(data: { project?: string; minObservations?: number })
=>" to validate data.minObservations explicitly: if data.minObservations is
undefined use default 10, else ensure it's a finite integer >= 0 (e.g.
Number.isFinite and Number.isInteger or at least >=0) and if invalid
throw/return a clear error (RangeError or appropriate ctx error) before using
minObs; replace the current "const minObs = data.minObservations || 10" with
this validated assignment to ensure 0 is accepted and negatives/non-numeric
values are rejected.

Comment thread src/functions/consolidate.ts Outdated
Comment thread src/functions/observe.ts Outdated
Comment thread src/functions/patterns.ts Outdated
Comment thread src/functions/remember.ts Outdated
Comment on lines +68 to +76
if (data.observationIds && data.sessionId) {
for (const obsId of data.observationIds) {
await kv.delete(KV.observations(data.sessionId), obsId);
deleted++;
}
}

if (data.sessionId && !data.observationIds) {
const observations = await kv.list<{ id: string }>(

⚠️ Potential issue | 🟠 Major

Handle empty observationIds explicitly in forget flow.

When sessionId is set and observationIds is an empty array, neither deletion branch runs, yet the function still reports success. This silent no-op is easy to trigger from API clients.

Proposed fix
-      if (data.observationIds && data.sessionId) {
+      if (Array.isArray(data.observationIds) && data.sessionId && data.observationIds.length > 0) {
         for (const obsId of data.observationIds) {
           await kv.delete(KV.observations(data.sessionId), obsId);
           deleted++;
         }
       }

-      if (data.sessionId && !data.observationIds) {
+      if (data.sessionId && data.observationIds === undefined) {
         const observations = await kv.list<{ id: string }>(
           KV.observations(data.sessionId),
         );
         for (const obs of observations) {
           await kv.delete(KV.observations(data.sessionId), obs.id);
           deleted++;
         }
         await kv.delete(KV.sessions, data.sessionId);
         await kv.delete(KV.summaries, data.sessionId);
         deleted += 2;
       }
+
+      if (data.sessionId && Array.isArray(data.observationIds) && data.observationIds.length === 0) {
+        return { success: false, error: "observationIds cannot be empty" };
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/remember.ts` around lines 68 - 76, The forget flow currently
treats an empty observationIds array as a silent no-op; update the handler in
remember.ts to explicitly detect Array.isArray(data.observationIds) &&
data.observationIds.length === 0 and return a clear error/early response (or
throw) instead of proceeding, so callers get notified; modify the block around
the checks that reference data.observationIds, data.sessionId, kv.delete and
KV.observations (and the deleted counter) to perform this explicit empty-array
validation before attempting deletions or listing.

Comment thread src/mcp/server.ts
Comment thread src/triggers/api.ts
Comment on lines +195 to +300
sdk.registerFunction(
{ id: "api::file-context" },
async (
req: ApiRequest<{ sessionId: string; files: string[] }>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
const result = await sdk.trigger("mem::file-context", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::file-context",
config: { api_path: "/agentmemory/file-context", http_method: "POST" },
});

sdk.registerFunction(
{ id: "api::remember" },
async (
req: ApiRequest<{
content: string;
type?: string;
concepts?: string[];
files?: string[];
}>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
const result = await sdk.trigger("mem::remember", req.body);
return { status_code: 201, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::remember",
config: { api_path: "/agentmemory/remember", http_method: "POST" },
});

sdk.registerFunction(
{ id: "api::forget" },
async (
req: ApiRequest<{
sessionId?: string;
observationIds?: string[];
memoryId?: string;
}>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
const result = await sdk.trigger("mem::forget", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::forget",
config: { api_path: "/agentmemory/forget", http_method: "POST" },
});

sdk.registerFunction(
{ id: "api::consolidate" },
async (
req: ApiRequest<{ project?: string; minObservations?: number }>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
const result = await sdk.trigger("mem::consolidate", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::consolidate",
config: { api_path: "/agentmemory/consolidate", http_method: "POST" },
});

sdk.registerFunction(
{ id: "api::patterns" },
async (req: ApiRequest<{ project?: string }>): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
const result = await sdk.trigger("mem::patterns", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::patterns",
config: { api_path: "/agentmemory/patterns", http_method: "POST" },
});

sdk.registerFunction(
{ id: "api::generate-rules" },
async (req: ApiRequest<{ project?: string }>): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
const result = await sdk.trigger("mem::generate-rules", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::generate-rules",
config: { api_path: "/agentmemory/generate-rules", http_method: "POST" },
});

⚠️ Potential issue | 🟠 Major

Return explicit 400s for malformed request bodies in new endpoints.

These handlers forward req.body directly; malformed payloads will fail downstream with less actionable errors. Validate required fields at the API boundary.

Validation pattern example
   sdk.registerFunction(
     { id: "api::file-context" },
     async (
       req: ApiRequest<{ sessionId: string; files: string[] }>,
     ): Promise<Response> => {
       const authErr = checkAuth(req, secret);
       if (authErr) return authErr;
+      if (
+        !req.body ||
+        typeof req.body.sessionId !== "string" ||
+        !Array.isArray(req.body.files) ||
+        req.body.files.length === 0
+      ) {
+        return {
+          status_code: 400,
+          body: { error: "sessionId and non-empty files are required" },
+        };
+      }
       const result = await sdk.trigger("mem::file-context", req.body);
       return { status_code: 200, body: result };
     },
   );

Apply analogous guards for api::remember, api::forget, api::consolidate, api::patterns, and api::generate-rules.

📝 Committable suggestion

sdk.registerFunction(
{ id: "api::file-context" },
async (
req: ApiRequest<{ sessionId: string; files: string[] }>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
if (
!req.body ||
typeof req.body.sessionId !== "string" ||
!Array.isArray(req.body.files) ||
req.body.files.length === 0
) {
return {
status_code: 400,
body: { error: "sessionId and non-empty files are required" },
};
}
const result = await sdk.trigger("mem::file-context", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::file-context",
config: { api_path: "/agentmemory/file-context", http_method: "POST" },
});
sdk.registerFunction(
{ id: "api::remember" },
async (
req: ApiRequest<{
content: string;
type?: string;
concepts?: string[];
files?: string[];
}>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
if (
!req.body ||
typeof req.body.content !== "string"
) {
return {
status_code: 400,
body: { error: "content is required" },
};
}
const result = await sdk.trigger("mem::remember", req.body);
return { status_code: 201, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::remember",
config: { api_path: "/agentmemory/remember", http_method: "POST" },
});
sdk.registerFunction(
{ id: "api::forget" },
async (
req: ApiRequest<{
sessionId?: string;
observationIds?: string[];
memoryId?: string;
}>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
if (!req.body) {
return {
status_code: 400,
body: { error: "request body is required" },
};
}
const result = await sdk.trigger("mem::forget", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::forget",
config: { api_path: "/agentmemory/forget", http_method: "POST" },
});
sdk.registerFunction(
{ id: "api::consolidate" },
async (
req: ApiRequest<{ project?: string; minObservations?: number }>,
): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
if (!req.body) {
return {
status_code: 400,
body: { error: "request body is required" },
};
}
const result = await sdk.trigger("mem::consolidate", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::consolidate",
config: { api_path: "/agentmemory/consolidate", http_method: "POST" },
});
sdk.registerFunction(
{ id: "api::patterns" },
async (req: ApiRequest<{ project?: string }>): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
if (!req.body) {
return {
status_code: 400,
body: { error: "request body is required" },
};
}
const result = await sdk.trigger("mem::patterns", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::patterns",
config: { api_path: "/agentmemory/patterns", http_method: "POST" },
});
sdk.registerFunction(
{ id: "api::generate-rules" },
async (req: ApiRequest<{ project?: string }>): Promise<Response> => {
const authErr = checkAuth(req, secret);
if (authErr) return authErr;
if (!req.body) {
return {
status_code: 400,
body: { error: "request body is required" },
};
}
const result = await sdk.trigger("mem::generate-rules", req.body);
return { status_code: 200, body: result };
},
);
sdk.registerTrigger({
type: "http",
function_id: "api::generate-rules",
config: { api_path: "/agentmemory/generate-rules", http_method: "POST" },
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` around lines 195 - 300, These handlers (api::remember,
api::forget, api::consolidate, api::patterns, api::generate-rules and
api::file-context) currently forward req.body directly; add explicit
request-body validation at the top of each registered function (after auth via
checkAuth) to return a 400 Response for malformed or missing required fields.
For api::file-context ensure sessionId is present and files is a non-empty array
of strings; for api::remember ensure content is a non-empty string (and if
present, files is an array, concepts is an array, type is a string); for
api::forget ensure at least one of sessionId, observationIds (non-empty array)
or memoryId is provided; for api::consolidate, api::patterns and
api::generate-rules validate types (project if provided is a string,
minObservations if provided is a positive number) and return { status_code: 400,
body: { error: '...' } } on validation failure before calling sdk.trigger. Use
the function ids (e.g., "api::remember", "api::forget") to locate the handlers
and keep checkAuth usage unchanged.

…orrectness

- observe.ts: extract fields from sanitizedRaw instead of payload.data
- mcp/server.ts: add input validation per tool and try-catch error boundary
- api.ts: validate request bodies for /remember, /forget, /migrate
- post-tool-failure.ts: truncate tool_input and error to 4000 chars
- pre-tool-use.ts: skip "pattern" key for Grep (regex, not file path)
- recall/SKILL.md: escape $ARGUMENTS before injecting into curl JSON
- consolidate.ts: use ?? instead of || for minObservations, add 30s timeout
- file-index.ts: filter sessions by project, cache observations per session
- patterns.ts: remove 80-char truncation on error keys
- remember.ts: validate content non-empty, skip empty observationIds
- index.ts: add missing /generate-rules log line
- subagent-stop.ts: normalize timeout to 2000ms
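The `??` versus `||` change called out for consolidate.ts matters because `||` treats every falsy value, including an explicit 0, as absent. A minimal illustration (function names are illustrative, not from the codebase):

```typescript
// Illustrative only: minObservations = 0 is a legitimate explicit value.
function withOr(minObservations?: number): number {
  return minObservations || 10; // 0 is falsy, so it collapses to the default
}

function withNullish(minObservations?: number): number {
  return minObservations ?? 10; // only undefined/null fall back to the default
}

if (withOr(0) !== 10) throw new Error("|| should swallow 0");
if (withNullish(0) !== 0) throw new Error("?? should keep 0");
if (withNullish(undefined) !== 10) throw new Error("?? should default undefined");
```

This is why the review still asks for range validation on top of `??`: nullish coalescing fixes the zero case but does nothing about negative or non-integer inputs.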

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (6)
plugin/skills/recall/SKILL.md (1)

10-10: ⚠️ Potential issue | 🟠 Major

Use a real JSON encoder for the request body (Line 10).

Current sed escaping handles \ and ", but not other JSON-significant control characters (e.g., newline/tab), so some valid queries can still break the payload.

Proposed fix
-!`QUERY=$(echo "$ARGUMENTS" | sed 's/\\/\\\\/g; s/"/\\"/g') && curl -s -H "Content-Type: application/json" -H "Authorization: Bearer ${AGENTMEMORY_SECRET:-}" -X POST http://${AGENTMEMORY_URL:-localhost:3111}/agentmemory/search -d "{\"query\": \"${QUERY}\", \"limit\": 10}" 2>/dev/null || echo '{"results":[]}'`
+!`BODY="$(jq -cn --arg query "$ARGUMENTS" '{query: $query, limit: 10}')" && \
+curl -s -H "Content-Type: application/json" \
+  -H "Authorization: Bearer ${AGENTMEMORY_SECRET:-}" \
+  -X POST "http://${AGENTMEMORY_URL:-localhost:3111}/agentmemory/search" \
+  -d "$BODY" 2>/dev/null || echo '{"results":[]}'`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/skills/recall/SKILL.md` at line 10, The request body is being built
with ad-hoc sed escaping (escaping backslashes and quotes) which misses other
JSON control characters; replace that with a proper JSON encoder when
constructing the payload in the request-body creation (the code referenced by
"sed escaping" at SKILL.md line 10) — e.g., use a JSON serializer (jq for shell,
JSON.stringify in JS/TS, or json.dumps in Python) to produce the body and pass
that encoded string directly to the HTTP client; ensure you remove the sed-based
escaping and instead call the serializer on the raw query string so newlines,
tabs, unicode, etc. are correctly escaped.
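The same point in the codebase's own TypeScript register: a real JSON encoder escapes quotes, backslashes, and control characters in one pass, which ad-hoc `sed` substitution cannot. A sketch, not tied to the skill file itself:

```typescript
// JSON.stringify handles all JSON-significant characters, so the payload
// round-trips losslessly regardless of what the query contains.
function buildSearchBody(query: string, limit = 10): string {
  return JSON.stringify({ query, limit });
}

const body = buildSearchBody('line one\nline "two"\t\\end');
const parsed = JSON.parse(body) as { query: string; limit: number };
if (parsed.query !== 'line one\nline "two"\t\\end') throw new Error("round trip failed");
if (parsed.limit !== 10) throw new Error("limit missing");
```

The proposed `jq -cn --arg` invocation above is the shell equivalent: `--arg` passes the raw string in and jq performs the same complete escaping when it serializes the object.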
src/triggers/api.ts (2)

252-256: ⚠️ Potential issue | 🟠 Major

api::forget should accept observationIds as a standalone selector.

Line 252 currently rejects requests that only provide observationIds, even though the endpoint type includes it and forget flows commonly target observation subsets.

Proposed fix
-      if (!req.body?.sessionId && !req.body?.memoryId) {
+      const hasObservationIds =
+        Array.isArray(req.body?.observationIds) &&
+        req.body!.observationIds.length > 0;
+      if (!req.body?.sessionId && !req.body?.memoryId && !hasObservationIds) {
         return {
           status_code: 400,
-          body: { error: "sessionId or memoryId is required" },
+          body: { error: "sessionId, memoryId, or non-empty observationIds is required" },
         };
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` around lines 252 - 256, The request validation in the
api::forget handler currently rejects requests unless req.body.sessionId or
req.body.memoryId are present; change the conditional in the request check (the
if that returns status_code 400) to also accept req.body.observationIds (e.g.,
treat the request as valid if sessionId OR memoryId OR a non-empty
observationIds array is provided). Update the error response message accordingly
so it requires one of sessionId, memoryId, or observationIds, and ensure you
reference req.body.observationIds (and if desired check length > 0) in that
validation branch inside the api::forget logic.

195-313: ⚠️ Potential issue | 🟠 Major

Add explicit request-body validation for new POST endpoints.

Several handlers pass req.body directly to internal triggers without type checks, so malformed payloads can fail downstream with 500s instead of actionable 400s.

Validation pattern (example for key endpoints)
   sdk.registerFunction(
     { id: "api::file-context" },
@@
       const authErr = checkAuth(req, secret);
       if (authErr) return authErr;
+      if (
+        !req.body ||
+        typeof req.body.sessionId !== "string" ||
+        !Array.isArray(req.body.files) ||
+        req.body.files.length === 0 ||
+        req.body.files.some((f) => typeof f !== "string" || !f.trim())
+      ) {
+        return {
+          status_code: 400,
+          body: { error: "sessionId and non-empty files[] are required" },
+        };
+      }
       const result = await sdk.trigger("mem::file-context", req.body);
@@
   sdk.registerFunction(
     { id: "api::consolidate" },
@@
       const authErr = checkAuth(req, secret);
       if (authErr) return authErr;
+      if (
+        req.body &&
+        (typeof req.body !== "object" ||
+          (req.body.project !== undefined && typeof req.body.project !== "string") ||
+          (req.body.minObservations !== undefined &&
+            (!Number.isInteger(req.body.minObservations) ||
+              req.body.minObservations < 0)))
+      ) {
+        return { status_code: 400, body: { error: "invalid consolidate payload" } };
+      }
       const result = await sdk.trigger("mem::consolidate", req.body);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` around lines 195 - 313, Handlers are passing req.body
straight to sdk.trigger and need explicit validation: in the functions
registered as "api::file-context", "api::consolidate", "api::patterns", and
"api::generate-rules" (and re-check "api::forget" already has sessionId/memoryId
validation), add type checks after checkAuth and before calling sdk.trigger to
return 400 with a clear error when payloads are malformed — for
"api::file-context" ensure req.body.sessionId is a non-empty string and
req.body.files is an array of non-empty strings; for "api::consolidate" ensure
project (if present) is a string and minObservations (if present) is a
non-negative integer; for "api::patterns" and "api::generate-rules" ensure
project (if present) is a string; keep error shapes consistent (e.g., { error:
"..." }) and only call sdk.trigger("mem::file-context" / "mem::consolidate" /
"mem::patterns" / "mem::generate-rules") when validation passes.
src/mcp/server.ts (1)

153-157: ⚠️ Potential issue | 🟠 Major

Validate arguments payload type before dispatch.

Line 157 destructures arguments without ensuring it is an object. Malformed payloads (for example arguments: null) currently fall into the catch and return 500 instead of a client 400.

Proposed hardening patch
-      if (!req.body || typeof req.body.name !== "string") {
+      if (!req.body || typeof req.body !== "object" || req.body === null) {
+        return { status_code: 400, body: { error: "invalid request body" } };
+      }
+      if (typeof req.body.name !== "string" || !req.body.name.trim()) {
         return { status_code: 400, body: { error: "name is required" } };
       }
-
-      const { name, arguments: args = {} } = req.body;
+      if (
+        req.body.arguments !== undefined &&
+        (typeof req.body.arguments !== "object" ||
+          req.body.arguments === null ||
+          Array.isArray(req.body.arguments))
+      ) {
+        return { status_code: 400, body: { error: "arguments must be an object" } };
+      }
+      const name = req.body.name;
+      const args = (req.body.arguments ?? {}) as Record<string, unknown>;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mcp/server.ts` around lines 153 - 157, The handler currently destructures
"arguments" into args without validating its type, so malformed payloads like
arguments: null cause a 500; after extracting const { name, arguments: args = {}
} (in the request handler in server.ts) add a check that args is a plain object
(e.g., typeof args === "object" && args !== null && !Array.isArray(args)) and if
it fails return a 400 response with an explanatory error (e.g., { status_code:
400, body: { error: "arguments must be an object" } }) before proceeding to
dispatch.
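The plain-object check the hardening patch above relies on can be captured in one small guard (a hypothetical helper, named here for illustration):

```typescript
// Accept only plain objects: rejects null, arrays, and primitives, all of
// which JSON.parse can legally produce for an "arguments" field.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

if (isPlainObject(null)) throw new Error("null is not a plain object");
if (isPlainObject([1, 2])) throw new Error("arrays are not plain objects");
if (isPlainObject("x")) throw new Error("strings are not plain objects");
if (!isPlainObject({ query: "x" })) throw new Error("objects should pass");
```

With such a guard in place, `arguments: null` fails fast with a 400 instead of throwing inside the tool dispatch and surfacing as a 500.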
src/functions/consolidate.ts (1)

69-72: ⚠️ Potential issue | 🟡 Minor

Validate minObservations range before use.

Line 71 accepts invalid values (negative/non-integer) and can produce unintended consolidation behavior.

Proposed validation
-      const minObs = data.minObservations ?? 10;
+      const minObs =
+        Number.isInteger(data.minObservations) &&
+        (data.minObservations as number) >= 0
+          ? (data.minObservations as number)
+          : 10;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/consolidate.ts` around lines 69 - 72, The code reads
data.minObservations into minObs without validating, allowing negative or
non-integer values; update the async handler to validate data.minObservations
before use: ensure it's a finite integer >= 1 (or use a safe lower bound like 1)
and otherwise fall back to the default (10) or return a clear error. Reference
the variables data.minObservations and minObs in the async function so the
validation runs immediately after obtaining ctx = getContext() and before any
consolidation logic uses minObs.
src/functions/file-index.ts (1)

27-31: ⚠️ Potential issue | 🟠 Major

Default file-history scope should be current session’s project.

When data.project is omitted, Line 28 includes sessions from all projects, which can leak unrelated file history.

Proposed scoping fix
-      const sessions = await kv.list<Session>(KV.sessions);
-      let otherSessions = sessions.filter((s) => s.id !== data.sessionId);
-      if (data.project) {
-        otherSessions = otherSessions.filter((s) => s.project === data.project);
-      }
+      const sessions = await kv.list<Session>(KV.sessions);
+      const currentSession = await kv.get<Session>(KV.sessions, data.sessionId);
+      const projectScope = data.project ?? currentSession?.project;
+      let otherSessions = sessions.filter(
+        (s) => s.id !== data.sessionId && (!projectScope || s.project === projectScope),
+      );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/file-index.ts` around lines 27 - 31, The current filter builds
otherSessions from kv.list(KV.sessions) but only restricts by project when
data.project is provided, which can include sessions from other projects;
retrieve the current session (find session where s.id === data.sessionId) to
determine its project and use that as the default scope (e.g., const
projectScope = data.project ?? currentSession.project) and then filter
otherSessions by s.project === projectScope; update references to sessions,
otherSessions, data.sessionId, and data.project accordingly so file-history is
always scoped to the current session’s project when data.project is omitted.
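The scoping rule in the proposed fix above can be expressed as a pure function (types and names here are illustrative; the real `Session` type lives in src/types.ts):

```typescript
// Illustrative stand-in for the real Session type.
interface SessionLike { id: string; project?: string }

// Default to the current session's project when no explicit project is given,
// so file history never leaks across unrelated projects.
function scopedSessions(
  sessions: SessionLike[],
  currentId: string,
  explicitProject?: string,
): SessionLike[] {
  const current = sessions.find((s) => s.id === currentId);
  const scope = explicitProject ?? current?.project;
  return sessions.filter((s) => s.id !== currentId && (!scope || s.project === scope));
}

const all = [
  { id: "a", project: "p1" },
  { id: "b", project: "p1" },
  { id: "c", project: "p2" },
];
// With no explicit project, only same-project sessions survive.
const scoped = scopedSessions(all, "a");
if (scoped.length !== 1 || scoped[0].id !== "b") throw new Error("scoping failed");
```

An explicit `project` argument still overrides the default, preserving the existing cross-project query path for callers that opt in.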
🧹 Nitpick comments (1)
src/functions/consolidate.ts (1)

125-133: Use a timeout helper that clears timers.

The inline Promise.race timeout leaves its timer running after successful provider.compress calls. A helper that clears the timer on settlement handles this cleanly.

Refactor example
+function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
+  return new Promise((resolve, reject) => {
+    const timer = setTimeout(() => reject(new Error("compress timeout")), ms);
+    promise.then(
+      (value) => {
+        clearTimeout(timer);
+        resolve(value);
+      },
+      (error) => {
+        clearTimeout(timer);
+        reject(error);
+      },
+    );
+  });
+}
@@
-          const response = await Promise.race([
-            provider.compress(
-              CONSOLIDATION_SYSTEM,
-              `Concept: "${concept}"\n\nObservations:\n${prompt}`,
-            ),
-            new Promise<never>((_, reject) =>
-              setTimeout(() => reject(new Error("compress timeout")), 30_000),
-            ),
-          ]);
+          const response = await withTimeout(
+            provider.compress(
+              CONSOLIDATION_SYSTEM,
+              `Concept: "${concept}"\n\nObservations:\n${prompt}`,
+            ),
+            30_000,
+          );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/consolidate.ts` around lines 125 - 133, The inline Promise.race
timeout in the consolidate flow leaves timers running; replace it with a timeout
helper that clears the timer when provider.compress (called with
CONSOLIDATION_SYSTEM and the concept/Observations prompt) resolves or rejects.
Implement/usage guidance: create a helper (e.g., withTimeout or
promiseWithTimeout) that starts a setTimeout returning a reject after 30_000ms
but also clears that timeout on resolution/rejection of provider.compress, then
call that helper instead of Promise.race around provider.compress; keep the same
error message ("compress timeout") and preserve the original call signature to
provider.compress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@plugin/skills/recall/SKILL.md`:
- Line 10: The request body is being built with ad-hoc sed escaping (escaping
backslashes and quotes) which misses other JSON control characters; replace that
with a proper JSON encoder when constructing the payload in the request-body
creation (the code referenced by "sed escaping" at SKILL.md line 10) — e.g., use
a JSON serializer (jq for shell, JSON.stringify in JS/TS, or json.dumps in
Python) to produce the body and pass that encoded string directly to the HTTP
client; ensure you remove the sed-based escaping and instead call the serializer
on the raw query string so newlines, tabs, unicode, etc. are correctly escaped.

In `@src/functions/consolidate.ts`:
- Around line 69-72: The code reads data.minObservations into minObs without
validating, allowing negative or non-integer values; update the async handler to
validate data.minObservations before use: ensure it's a finite integer >= 1 (or
use a safe lower bound like 1) and otherwise fall back to the default (10) or
return a clear error. Reference the variables data.minObservations and minObs in
the async function so the validation runs immediately after obtaining ctx =
getContext() and before any consolidation logic uses minObs.

In `@src/functions/file-index.ts`:
- Around line 27-31: The current filter builds otherSessions from
kv.list(KV.sessions) but only restricts by project when data.project is
provided, which can include sessions from other projects; retrieve the current
session (find session where s.id === data.sessionId) to determine its project
and use that as the default scope (e.g., const projectScope = data.project ??
currentSession.project) and then filter otherSessions by s.project ===
projectScope; update references to sessions, otherSessions, data.sessionId, and
data.project accordingly so file-history is always scoped to the current
session’s project when data.project is omitted.

In `@src/mcp/server.ts`:
- Around line 153-157: The handler currently destructures "arguments" into args
without validating its type, so malformed payloads like arguments: null cause a
500; after extracting const { name, arguments: args = {} } (in the request
handler in server.ts) add a check that args is a plain object (e.g., typeof args
=== "object" && args !== null && !Array.isArray(args)) and if it fails return a
400 response with an explanatory error (e.g., { status_code: 400, body: { error:
"arguments must be an object" } }) before proceeding to dispatch.

In `@src/triggers/api.ts`:
- Around line 252-256: The request validation in the api::forget handler
currently rejects requests unless req.body.sessionId or req.body.memoryId are
present; change the conditional in the request check (the if that returns
status_code 400) to also accept req.body.observationIds (e.g., treat the request
as valid if sessionId OR memoryId OR a non-empty observationIds array is
provided). Update the error response message accordingly so it requires one of
sessionId, memoryId, or observationIds, and ensure you reference
req.body.observationIds (and if desired check length > 0) in that validation
branch inside the api::forget logic.
- Around line 195-313: Handlers are passing req.body straight to sdk.trigger and
need explicit validation: in the functions registered as "api::file-context",
"api::consolidate", "api::patterns", and "api::generate-rules" (and re-check
"api::forget" already has sessionId/memoryId validation), add type checks after
checkAuth and before calling sdk.trigger to return 400 with a clear error when
payloads are malformed — for "api::file-context" ensure req.body.sessionId is a
non-empty string and req.body.files is an array of non-empty strings; for
"api::consolidate" ensure project (if present) is a string and minObservations
(if present) is a non-negative integer; for "api::patterns" and
"api::generate-rules" ensure project (if present) is a string; keep error shapes
consistent (e.g., { error: "..." }) and only call
sdk.trigger("mem::file-context" / "mem::consolidate" / "mem::patterns" /
"mem::generate-rules") when validation passes.

---

Nitpick comments:
In `@src/functions/consolidate.ts`:
- Around line 125-133: The inline Promise.race timeout in the consolidate flow
leaves timers running; replace it with a timeout helper that clears the timer
when provider.compress (called with CONSOLIDATION_SYSTEM and the
concept/Observations prompt) resolves or rejects. Implement/usage guidance:
create a helper (e.g., withTimeout or promiseWithTimeout) that starts a
setTimeout returning a reject after 30_000ms but also clears that timeout on
resolution/rejection of provider.compress, then call that helper instead of
Promise.race around provider.compress; keep the same error message ("compress
timeout") and preserve the original call signature to provider.compress.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b7d00ea and 824b0ff.

📒 Files selected for processing (15)
  • plugin/scripts/post-tool-failure.mjs
  • plugin/scripts/pre-tool-use.mjs
  • plugin/scripts/subagent-stop.mjs
  • plugin/skills/recall/SKILL.md
  • src/functions/consolidate.ts
  • src/functions/file-index.ts
  • src/functions/observe.ts
  • src/functions/patterns.ts
  • src/functions/remember.ts
  • src/hooks/post-tool-failure.ts
  • src/hooks/pre-tool-use.ts
  • src/hooks/subagent-stop.ts
  • src/index.ts
  • src/mcp/server.ts
  • src/triggers/api.ts
🚧 Files skipped from review as they are similar to previous changes (7)
  • src/hooks/pre-tool-use.ts
  • src/functions/observe.ts
  • src/functions/patterns.ts
  • plugin/scripts/post-tool-failure.mjs
  • src/functions/remember.ts
  • src/hooks/post-tool-failure.ts
  • src/hooks/subagent-stop.ts

Add production-grade self-evaluation, health monitoring, and resilience:

- Zod input/output validation for all 11 functions with quality scoring (0-100)
- Self-correcting LLM retry on validation failure
- Per-function metrics tracking (success rate, latency, quality) in KV
- Circuit breaker for LLM providers (closed→open→half-open state machine)
- ResilientProvider wrapper with automatic circuit breaker integration
- Periodic health monitoring (CPU, memory, event loop lag, connection state)
- Health threshold evaluation (healthy/degraded/critical with alerts)
- Local observation queue for engine disconnection recovery
- SHA-256 content-hash deduplication with 5min TTL window
- Memory lifecycle eviction (stale sessions, low-importance obs, per-project caps)
- BM25 search scoring replacing basic TF (k1=1.2, b=0.75, IDF weighting)
- OTEL telemetry setup with counters and histograms
- Enhanced /health endpoint returning real metrics, function stats, circuit state
- New /evict endpoint for memory lifecycle management
- 82 tests passing (37 new: eval schemas, quality scoring, circuit breaker)

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/functions/compress.ts (1)

104-167: ⚠️ Potential issue | 🟠 Major

Invalid compression outputs can still be stored after retry exhaustion.

At Line 104, compressWithRetry may return an invalid response after retries. After Line 112, only XML parsing is checked; schema validation is not enforced before KV write/stream.

Suggested fix
         const parsed = parseCompressionXml(response);
         if (!parsed) {
           const latencyMs = Date.now() - startMs;
           if (metricsStore) {
             await metricsStore.record("mem::compress", latencyMs, false);
           }
           ctx.logger.warn("Failed to parse compression XML", {
             obsId: data.observationId,
             retried,
           });
           return { success: false, error: "parse_failed" };
         }
 
-        const qualityScore = scoreCompression(parsed);
+        const validated = validateOutput(
+          CompressOutputSchema,
+          parsed,
+          "mem::compress",
+        );
+        if (!validated.valid) {
+          const latencyMs = Date.now() - startMs;
+          if (metricsStore) {
+            await metricsStore.record("mem::compress", latencyMs, false);
+          }
+          ctx.logger.warn("Compression output failed schema validation", {
+            obsId: data.observationId,
+            retried,
+            errors: validated.result.errors,
+          });
+          return { success: false, error: "validation_failed", errors: validated.result.errors };
+        }
+
+        const qualityScore = scoreCompression(validated.data);
 
         const compressed: CompressedObservation = {
           id: data.observationId,
           sessionId: data.sessionId,
           timestamp: data.raw.timestamp,
-          ...parsed,
+          ...validated.data,
         };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/compress.ts` around lines 104 - 167, compressWithRetry may
return a response that parseCompressionXml accepts but that doesn't meet your
compression schema, and the code currently writes to KV, search index and
triggers the stream before validating; update the flow after parseCompressionXml
to validate the parsed object (e.g., call an existing validator or implement
validateCompressionSchema(parsed)) and if validation fails log via
ctx.logger.warn with obsId and retried, record the failure via
metricsStore.record("mem::compress", latencyMs, false), and return { success:
false, error: "validation_failed" } without calling kv.set,
getSearchIndex().add, sdk.trigger, or scoreCompression; only compute
qualityScore, persist to KV, add to index and trigger the stream after
validation succeeds.
♻️ Duplicate comments (2)
src/functions/observe.ts (1)

65-65: ⚠️ Potential issue | 🟡 Minor

Using || may incorrectly fallback on empty string.

d["tool_output"] || d["error"] will fallback to error if tool_output is an empty string "". If empty string is a valid output, use ?? instead.

🔧 Proposed fix
-          raw.toolOutput = d["tool_output"] || d["error"];
+          raw.toolOutput = d["tool_output"] ?? d["error"];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/observe.ts` at line 65, Replace the fallback using logical OR
in the assignment to raw.toolOutput so an empty string from d["tool_output"] is
preserved; change the expression raw.toolOutput = d["tool_output"] || d["error"]
to use nullish coalescing (raw.toolOutput = d["tool_output"] ?? d["error"]) so
only null/undefined fall back to d["error"] while "" remains a valid toolOutput.
src/triggers/api.ts (1)

219-337: ⚠️ Potential issue | 🟠 Major

New endpoint validation is still incomplete at the API boundary.

Line 226, Line 299, Line 314, and Line 329 forward payloads without shape checks; and Line 276 rejects observationIds-only forget requests even though observationIds is part of the accepted body type.

Suggested fix pattern
   sdk.registerFunction(
     { id: "api::file-context" },
     async (
       req: ApiRequest<{ sessionId: string; files: string[] }>,
     ): Promise<Response> => {
       const authErr = checkAuth(req, secret);
       if (authErr) return authErr;
+      if (
+        !req.body ||
+        typeof req.body.sessionId !== "string" ||
+        !Array.isArray(req.body.files) ||
+        req.body.files.length === 0 ||
+        req.body.files.some((f) => typeof f !== "string" || !f.trim())
+      ) {
+        return { status_code: 400, body: { error: "sessionId and non-empty files[] are required" } };
+      }
       const result = await sdk.trigger("mem::file-context", req.body);
       return { status_code: 200, body: result };
     },
   );
-      if (!req.body?.sessionId && !req.body?.memoryId) {
+      const hasObservationIds =
+        Array.isArray(req.body?.observationIds) &&
+        req.body!.observationIds.length > 0;
+      if (!req.body?.sessionId && !req.body?.memoryId && !hasObservationIds) {
         return {
           status_code: 400,
-          body: { error: "sessionId or memoryId is required" },
+          body: { error: "sessionId, memoryId, or non-empty observationIds is required" },
         };
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` around lines 219 - 337, Several API handlers forward
unvalidated request bodies and the api::forget handler incorrectly rejects
requests that include observationIds only; update each registered function
(api::file-context, api::remember, api::forget, api::consolidate, api::patterns,
api::generate-rules) to validate the expected body shape before calling
sdk.trigger: for api::file-context ensure req.body.sessionId is a string and
req.body.files is a non-empty string[]; for api::remember keep the existing
content string check and validate optional type/concepts/files shapes; for
api::forget accept either sessionId (string) or memoryId (string) or
observationIds (string[]), returning 400 if none match; for api::consolidate,
api::patterns, and api::generate-rules validate optional fields (project as
string, minObservations as number where applicable) and return a 400 with a
clear error when validation fails; only then call sdk.trigger with the
validated/normalized payload.
🧹 Nitpick comments (10)
src/state/search-index.ts (2)

73-87: Prefix matching has O(vocabulary) complexity per query term.

The loop iterates over the entire invertedIndex for each query term to find prefix matches. With a large vocabulary (e.g., 10k+ unique terms), this becomes expensive: O(queryTerms × vocabularySize).

If the index grows significantly, consider optimizing with a sorted term array + binary search, or a trie for prefix lookups.

♻️ Alternative: sorted prefix lookup
// Add as class field:
private sortedTerms: string[] = [];

// Update in add() after modifying invertedIndex:
this.sortedTerms = Array.from(this.invertedIndex.keys()).sort();

// Replace prefix iteration with binary search:
private getPrefixMatches(prefix: string): string[] {
  // Binary search for the first term >= prefix (findIndex would be linear).
  let lo = 0;
  let hi = this.sortedTerms.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (this.sortedTerms[mid] < prefix) lo = mid + 1;
    else hi = mid;
  }
  const matches: string[] = [];
  for (let i = lo; i < this.sortedTerms.length && this.sortedTerms[i].startsWith(prefix); i++) {
    matches.push(this.sortedTerms[i]);
  }
  return matches;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state/search-index.ts` around lines 73 - 87, The current prefix-match
loop iterates the entire invertedIndex (invertedIndex) per query term, causing
O(vocab) cost; fix by maintaining a sorted term list (add a class field
sortedTerms: string[]) and update it inside add() whenever invertedIndex
changes, then replace the full invertedIndex scan with a prefix lookup helper
(e.g., getPrefixMatches(prefix: string)) that uses binary search on sortedTerms
to find the start index and collects contiguous startsWith matches, and finally
iterate only those matched terms to compute prefixIdf and update scores. Ensure
getPrefixMatches is used in the scoring path instead of the for (const
[indexTerm, obsIds] of this.invertedIndex) loop.

103-108: Add a remove(obsId) method to prevent search index pollution.

When observations are evicted from storage, they remain in the in-memory SearchIndex because evict.ts does not synchronize deletions with the index. While search results are filtered (stale entries are skipped at the KV lookup), the index itself becomes polluted with non-existent documents, degrading search quality:

  • BM25 scoring becomes inaccurate (IDF calculations include non-existent docs)
  • Memory usage grows with stale entries
  • Extra KV lookups waste resources

Without per-document removal, the index can only be fully rebuilt when size === 0, which requires near-complete eviction.

♻️ Proposed remove method
+  remove(obsId: string): boolean {
+    const entry = this.entries.get(obsId);
+    if (!entry) return false;
+
+    const docTerms = this.docTermCounts.get(obsId);
+    if (docTerms) {
+      for (const term of docTerms.keys()) {
+        const termDocs = this.invertedIndex.get(term);
+        if (termDocs) {
+          termDocs.delete(obsId);
+          if (termDocs.size === 0) {
+            this.invertedIndex.delete(term);
+          }
+        }
+      }
+    }
+
+    this.totalDocLength -= entry.termCount;
+    this.docTermCounts.delete(obsId);
+    this.entries.delete(obsId);
+    return true;
+  }

Call this method in evict.ts after deletion, or batch-call it when eviction completes to keep the index in sync with storage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state/search-index.ts` around lines 103 - 108, Add a remove(obsId) method
to the SearchIndex class to delete a single document from the in-memory index:
locate the clear() method and the instance fields entries, invertedIndex,
docTermCounts, and totalDocLength, then implement remove(obsId) to (1)
early-return if entries doesn't contain obsId, (2) fetch the document's term
frequency map from docTermCounts.get(obsId), (3) for each term decrement or
delete the term's posting from invertedIndex (removing the obsId entry and
removing the term entirely if no postings remain), (4) subtract the document's
length from totalDocLength and delete entries and docTermCounts entries for
obsId, and (5) ensure any derived counts (e.g., total document count used in
IDF) reflect the removal; finally call this new SearchIndex.remove(obsId) from
evict.ts after deleting an observation (or batch-call it after an eviction run)
so the index stays in sync.
src/functions/evict.ts (1)

85-89: Sequential deletes may be slow for large eviction batches.

The sequential await in the loop could become a performance bottleneck when evicting many observations. For a maintenance operation like eviction, this is likely acceptable, but consider batching or parallel deletion if performance becomes an issue in production.

♻️ Optional: Batch deletions with Promise.all
         if (!dryRun) {
-          for (const o of toEvict) {
-            await kv.delete(KV.observations(o.sessionId), o.id).catch(() => {});
-          }
+          await Promise.all(
+            toEvict.map((o) =>
+              kv.delete(KV.observations(o.sessionId), o.id).catch(() => {})
+            )
+          );
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/evict.ts` around lines 85 - 89, The loop that performs
sequential deletes (checking dryRun then awaiting kv.delete for each item in
toEvict) will be slow for large batches; change it to issue deletes in parallel
(e.g., map toEvict to promises calling kv.delete(KV.observations(o.sessionId),
o.id) and await Promise.all on the array) and optionally limit concurrency by
chunking the toEvict array if needed to avoid overwhelming the store; keep the
dryRun guard and preserve the existing error swallowing behavior (.catch(() =>
{})) on each delete promise.
src/providers/circuit-breaker.ts (1)

13-23: Side effect in getter is intentional but worth documenting.

The isAllowed getter mutates state (line 17: transitions from open to half-open). This is a common pattern for circuit breakers (lazy state transition), but it means calling the getter twice can yield different results. The test file correctly handles this by explicitly calling cb.isAllowed before asserting state.

Consider adding a brief inline comment explaining this behavior for future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/circuit-breaker.ts` around lines 13 - 23, The isAllowed getter
mutates circuit state (it transitions this.state from "open" to "half-open" when
this.openedAt + RECOVERY_TIMEOUT_MS has elapsed), which is intentional lazy
state transition behavior; add a concise inline comment above the isAllowed
getter explaining that accessing isAllowed can change state (i.e., calling
cb.isAllowed may return different values on consecutive calls because it
performs the time-based transition), and mention the key fields involved (state,
openedAt, RECOVERY_TIMEOUT_MS) so future maintainers understand the side effect.
src/functions/dedup.ts (1)

11-55: No maximum size limit could cause unbounded memory growth.

Under sustained high load with unique inputs, the map can grow indefinitely until TTL expiry. Consider adding a max-size eviction policy.

💡 Suggestion

Add a MAX_ENTRIES constant and evict oldest entries when exceeded in record():

+const MAX_ENTRIES = 10_000;
+
   record(hash: string): void {
+    if (this.entries.size >= MAX_ENTRIES) {
+      const oldest = this.entries.keys().next().value;
+      if (oldest) this.entries.delete(oldest);
+    }
     this.entries.set(hash, { hash, expiresAt: Date.now() + TTL_MS });
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/dedup.ts` around lines 11 - 55, DedupMap can grow unbounded;
add a MAX_ENTRIES constant and enforce it in record(): after inserting (or
before), if this.entries.size > MAX_ENTRIES evict oldest entries until size <=
MAX_ENTRIES (use insertion order of this.entries Map to remove earliest keys via
this.entries.keys().next().value). Update record() to perform eviction and keep
cleanup()/stop()/computeHash()/isDuplicate() unchanged; ensure MAX_ENTRIES is
configurable or documented and referenced by record() and any tests.
src/health/monitor.ts (1)

29-35: Silent failure on worker list fetch hides operational issues.

The empty catch {} block discards errors completely. Consider logging failures at debug level for troubleshooting.

📝 Proposed fix
     try {
       const result = await sdk.trigger<unknown, { workers?: HealthSnapshot["workers"] }>(
         "engine::workers::list",
         {},
       );
       if (result?.workers) workers = result.workers;
-    } catch {}
+    } catch (err) {
+      // Non-critical: worker list unavailable; continue with empty array.
+      console.debug("engine::workers::list failed", err);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/health/monitor.ts` around lines 29 - 35, The empty catch on the
sdk.trigger call swallowing errors should be replaced with a debug-level log so
failures to fetch workers are visible; update the catch block after the
sdk.trigger<...>("engine::workers::list", ...) call to log the caught error
(include error.message/stack) and context such as the action name and that
workers remain unchanged (e.g., use an existing logger.debug or
processLogger.debug, falling back to console.debug) while keeping the current
behavior of not throwing.
src/providers/resilient.ts (1)

12-38: Consider extracting duplicated circuit-breaker logic.

The compress and summarize methods have identical try/catch/record patterns. Extracting to a helper would reduce duplication.

♻️ Optional refactor
+  private async withBreaker<T>(fn: () => Promise<T>): Promise<T> {
+    if (!this.breaker.isAllowed) {
+      throw new Error("circuit_breaker_open");
+    }
+    try {
+      const result = await fn();
+      this.breaker.recordSuccess();
+      return result;
+    } catch (err) {
+      this.breaker.recordFailure();
+      throw err;
+    }
+  }
+
   async compress(systemPrompt: string, userPrompt: string): Promise<string> {
-    if (!this.breaker.isAllowed) {
-      throw new Error("circuit_breaker_open");
-    }
-    try {
-      const result = await this.inner.compress(systemPrompt, userPrompt);
-      this.breaker.recordSuccess();
-      return result;
-    } catch (err) {
-      this.breaker.recordFailure();
-      throw err;
-    }
+    return this.withBreaker(() => this.inner.compress(systemPrompt, userPrompt));
   }
 
   async summarize(systemPrompt: string, userPrompt: string): Promise<string> {
-    if (!this.breaker.isAllowed) {
-      throw new Error("circuit_breaker_open");
-    }
-    try {
-      const result = await this.inner.summarize(systemPrompt, userPrompt);
-      this.breaker.recordSuccess();
-      return result;
-    } catch (err) {
-      this.breaker.recordFailure();
-      throw err;
-    }
+    return this.withBreaker(() => this.inner.summarize(systemPrompt, userPrompt));
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/resilient.ts` around lines 12 - 38, Both compress and summarize
duplicate the same circuit-breaker check and try/catch recording logic; create a
single private helper (e.g., executeWithCircuit or runWithBreaker) that takes an
async callback, performs the this.breaker.isAllowed check, awaits the callback
(call sites will pass () => this.inner.compress(...) and () =>
this.inner.summarize(...)), calls this.breaker.recordSuccess() on success and
this.breaker.recordFailure() on error, and rethrows the error; then replace the
bodies of compress and summarize to simply return the helper invoked with the
appropriate inner method call.
src/telemetry/setup.ts (1)

82-96: Redundant destructuring in getter functions.

The getters destructure c and h from initMetrics() only to return them, while initMetrics() already assigns the module-level counters and histograms as a side effect. Keeping two paths to the same values works, but it's confusing.

♻️ Cleaner approach
 export function getCounters(): Counters {
   if (!counters) {
-    const { counters: c } = initMetrics();
-    return c;
+    initMetrics();
   }
   return counters;
 }
 
 export function getHistograms(): Histograms {
   if (!histograms) {
-    const { histograms: h } = initMetrics();
-    return h;
+    initMetrics();
   }
   return histograms;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/telemetry/setup.ts` around lines 82 - 96, The getters getCounters and
getHistograms call initMetrics() but destructure unused local vars (c, h);
remove the unused destructuring and either call initMetrics() for its side
effects (e.g., just invoke initMetrics() when counters or histograms are
undefined) or assign the returned objects to the module-level
counters/histograms variables; update getCounters/getHistograms to call
initMetrics() without creating unused c/h locals and then return the
module-level counters/histograms so the code is clear and free of unused
variables (referencing getCounters, getHistograms, initMetrics, counters,
histograms).
src/eval/quality.ts (1)

44-47: Weak heuristic for detecting structured content.

Checking context.includes("<") could match code with comparison operators. The subsequent regex /<\w+>/g is more reliable. Consider removing the standalone < check or tightening it.

💡 Suggested improvement
-  if (context.includes("<")) score += 15;
   const sectionCount = (context.match(/<\w+>/g) || []).length;
-  if (sectionCount >= 2) score += 15;
+  if (sectionCount >= 1) score += 15;
+  if (sectionCount >= 2) score += 15;
   if (sectionCount >= 4) score += 10;

This bases the score entirely on actual tag matches rather than a generic < presence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eval/quality.ts` around lines 44 - 47, Remove the weak check that awards
points when context.includes("<") is true and instead rely solely on actual tag
matches; in the block that computes sectionCount using (context.match(/<\w+>/g)
|| []).length, delete the if (context.includes("<")) score += 15; line and keep
the existing sectionCount thresholds (if (sectionCount >= 2) score += 15; if
(sectionCount >= 4) score += 10;) so scoring is based only on real tag matches;
reference variables: context, sectionCount, score, and the regex /<\w+>/g.
src/functions/compress.ts (1)

23-38: Prefer a single source of truth for observation type values.

VALID_TYPES duplicates the values in src/eval/schemas.ts ObservationTypeEnum. Keeping both lists manual creates drift risk over time.

Refactor direction
-const VALID_TYPES = new Set<string>([
-  "file_read",
-  "file_write",
-  "file_edit",
-  "command_run",
-  "search",
-  "web_fetch",
-  "conversation",
-  "error",
-  "decision",
-  "discovery",
-  "subagent",
-  "notification",
-  "task",
-  "other",
-]);
+// Reuse exported observation-type values from eval schema module
+// to avoid divergence between parser and validator.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/compress.ts` around lines 23 - 38, VALID_TYPES duplicates the
observation values declared in ObservationTypeEnum in src/eval/schemas.ts;
replace the hard-coded list by importing ObservationTypeEnum and building
VALID_TYPES from it (e.g., construct a Set from
Object.values(ObservationTypeEnum)) so there's a single source of truth; update
the file to import ObservationTypeEnum and create VALID_TYPES using
ObservationTypeEnum instead of the manual array, keeping the variable name
VALID_TYPES and behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@package.json`:
- Line 3: The PR title and package.json are out of sync: package.json's
"version" field is "0.3.0" but the PR title references "v0.2.0"; update the PR
title to "v0.3.0" (or alternatively update the package.json "version" field if
the intended release is 0.2.0) so the PR title and the version field match
exactly.

In `@src/eval/metrics-store.ts`:
- Around line 49-52: getAll() currently returns only in-memory cache entries
when this.cache.size > 0, which can omit metrics stored in KV; change getAll()
in metrics-store.ts to merge cached entries with the full KV list instead of
short-circuiting: always fetch the KV.metrics via
this.kv.list<FunctionMetrics>(KV.metrics), then combine KV results with entries
from this.cache (overwriting KV entries with cache values where keys collide) so
callers receive a complete, up-to-date set; reference getAll(), this.cache,
this.kv.list<FunctionMetrics>(KV.metrics), KV.metrics and existing
record()/get() behavior when implementing the merge.
- Around line 36-38: The averaging logic for qualityScore in FunctionMetrics is
wrong because it divides by m.totalCalls even when qualityScore is absent; add a
new numeric field qualityCallCount to FunctionMetrics, increment it only when
qualityScore !== undefined, and compute m.avgQualityScore as (m.avgQualityScore
* prev + qualityScore) / m.qualityCallCount (or initialize when first quality
score arrives). Update any code that initializes or persists FunctionMetrics to
include qualityCallCount and ensure m.totalCalls remains unchanged.
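The counting fix reads roughly like this sketch (field names follow the prompt and are assumptions about the final shape):

```typescript
interface Metrics {
  totalCalls: number;
  qualityCallCount: number; // counts only calls that carried a quality score
  avgQualityScore: number;
}

// Incremental mean over quality-bearing calls only, so calls without a score
// no longer dilute the average.
function recordQuality(m: Metrics, qualityScore?: number): void {
  m.totalCalls++;
  if (qualityScore !== undefined) {
    const prev = m.qualityCallCount;
    m.qualityCallCount = prev + 1;
    m.avgQualityScore =
      (m.avgQualityScore * prev + qualityScore) / m.qualityCallCount;
  }
}
```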

In `@src/functions/dedup.ts`:
- Around line 15-17: The repeating timer created in the constructor
(this.cleanupTimer = setInterval(() => this.cleanup(), CLEANUP_INTERVAL_MS)) is
keeping the Node process alive; modify the constructor to call unref() on the
timer so it won't keep the event loop active (e.g., invoke
this.cleanupTimer.unref() or use a safe optional call if typing requires) after
creating the interval; keep references to this.cleanupTimer,
CLEANUP_INTERVAL_MS, and cleanup() unchanged except for adding the unref call.
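A sketch of the unref fix (class and constant names mirror the prompt; the optional call keeps non-Node typings happy, as the prompt itself suggests):

```typescript
const CLEANUP_INTERVAL_MS = 60_000;

class DedupMap {
  private cleanupTimer: ReturnType<typeof setInterval>;

  constructor() {
    this.cleanupTimer = setInterval(() => this.cleanup(), CLEANUP_INTERVAL_MS);
    // On Node, Timeout.unref() stops this interval from keeping the event
    // loop alive; the optional call is a no-op where unref is absent.
    (this.cleanupTimer as { unref?: () => void }).unref?.();
  }

  cleanup(): void {
    // evict expired entries (elided)
  }

  stop(): void {
    clearInterval(this.cleanupTimer);
  }
}
```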

In `@src/functions/summarize.ts`:
- Around line 100-128: Validation result from validateOutput (called with
SummaryOutputSchema and summaryForValidation) is computed but never enforced:
change summarize flow to check validation.valid before persisting or returning
success. Specifically, after calling validateOutput and scoreSummary, only call
kv.set(KV.summaries, data.sessionId, summary) and record
metricsStore.record(...) when validation.valid is true; otherwise record a
failed metric (or skip success=true metric), log the validation errors via
ctx.logger.warn/error including validation.details, and return { success: false,
summary: null, qualityScore } (or an error object) so invalid summaries are not
stored or treated as successful. Ensure references are to validateOutput,
SummaryOutputSchema, summaryForValidation, kv.set, metricsStore.record,
ctx.logger.info to locate the changes.

In `@src/health/monitor.ts`:
- Around line 46-50: The cpu.percent field is hardcoded to 0; either remove it
or compute it by sampling process.cpuUsage over time: add module-scoped
previousCpuUsage and previousCpuTime, then in the function that builds the cpu
object (where cpu.user and cpu.system are read) call currentCpu =
process.cpuUsage(previousCpuUsage) (or compute diffs using process.cpuUsage()
and previous snapshot), compute elapsedMicros = (nowMs - previousTimeMs) * 1000,
diffMicros = currentCpu.user + currentCpu.system, and set cpu.percent =
(diffMicros / (elapsedMicros * require('os').cpus().length)) * 100; finally
update previousCpuUsage and previousCpuTime; alternatively remove cpu.percent
until implemented.
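The percentage math from that prompt, as a pure helper (snapshot bookkeeping is left to the caller; names are illustrative):

```typescript
// diff is the process.cpuUsage() delta since the previous snapshot, in
// microseconds; elapsedMs is wall-clock time since that snapshot.
function cpuPercent(
  diff: { user: number; system: number },
  elapsedMs: number,
  cores: number,
): number {
  if (elapsedMs <= 0 || cores <= 0) return 0;
  const elapsedMicros = elapsedMs * 1000;
  return ((diff.user + diff.system) / (elapsedMicros * cores)) * 100;
}
```

Keeping the math pure makes the delta-sampling testable without actually burning CPU.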
- Around line 7-17: The module-level mutable variable connectionState causes
cross-instance contamination; move connectionState into the
registerHealthMonitor function so each call gets its own closure-scoped state,
update the sdk.on("connection_state", ...) handler to write the local
connectionState, and ensure any places that read or return connection state
(including the returned stop closure) reference that local variable instead of
the module-level one (remove the top-level connectionState declaration).

In `@src/health/thresholds.ts`:
- Around line 54-63: The current memPercent calculation uses snapshot.memory.rss
/ snapshot.memory.heapTotal which mixes RSS and V8 heap and causes spurious
>100% results; change the calculation to use V8 heap utilization: memPercent =
snapshot.memory.heapTotal > 0 ? (snapshot.memory.heapUsed /
snapshot.memory.heapTotal) * 100 : 0; keep the existing alert logic using
cfg.memoryCriticalPercent and cfg.memoryWarnPercent but update alert labels if
you want to be explicit (e.g., memory_heap_critical_... / memory_heap_warn_...);
alternatively, if you prefer to monitor RSS, compute and compare
snapshot.memory.rss against a configured absolute threshold (e.g.,
cfg.maxRssBytes) instead of dividing by heapTotal.
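The corrected ratio as a one-liner (fields mirror process.memoryUsage(); the zero guard matches the prompt):

```typescript
// Heap utilization in percent; avoids the rss/heapTotal mix that could
// spuriously exceed 100%.
function memPercent(mem: { heapUsed: number; heapTotal: number }): number {
  return mem.heapTotal > 0 ? (mem.heapUsed / mem.heapTotal) * 100 : 0;
}
```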

In `@src/providers/index.ts`:
- Around line 11-33: Replace non-null assertions on getEnvVar(...) in the
provider switch so missing keys produce a clear configuration error: call
getEnvVar for each provider (e.g., for "anthropic", "gemini", "openrouter") into
a local const, validate it's defined, and throw or log a descriptive error like
"Missing API key for ANTHROPIC/GEMINI/OPENROUTER" before constructing
AnthropicProvider or OpenRouterProvider; update the cases that reference
AnthropicProvider and OpenRouterProvider to use the validated variable instead
of getEnvVar(... )!.
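One way to replace the non-null assertion (getEnvVar is the project's helper; this sketch takes the env map explicitly for testability):

```typescript
// Validated lookup: fail fast with a descriptive error instead of letting an
// undefined key surface later via a non-null assertion.
function requireEnvVar(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing API key: set ${name} in the environment`);
  }
  return value;
}
```

Each provider case would call this once and pass the validated const into the provider constructor.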

In `@src/triggers/api.ts`:
- Around line 364-365: The dryRun value passed to sdk.trigger("mem::evict") can
be a non-boolean (e.g. the string "false"); change the assignment so dryRun is a
strict boolean by normalizing req.query_params?.["dryRun"] and req.body?.dryRun
into true only when they are the boolean true or the string "true" (e.g. const
dryRun = req.query_params?.["dryRun"] === "true" || req.body?.dryRun === true ||
req.body?.dryRun === "true"), then pass that normalized dryRun into
sdk.trigger("mem::evict") instead of the current OR-ed expression.
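The normalization as a helper (query params arrive as strings, body values as booleans or strings; only these forms count as true):

```typescript
// Strict boolean: "false", undefined, or anything else maps to false.
function normalizeDryRun(queryParam?: string, bodyValue?: unknown): boolean {
  return queryParam === "true" || bodyValue === true || bodyValue === "true";
}
```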

---

Outside diff comments:
In `@src/functions/compress.ts`:
- Around line 104-167: compressWithRetry may return a response that
parseCompressionXml accepts but that doesn't meet your compression schema, and
the code currently writes to KV, search index and triggers the stream before
validating; update the flow after parseCompressionXml to validate the parsed
object (e.g., call an existing validator or implement
validateCompressionSchema(parsed)) and if validation fails log via
ctx.logger.warn with obsId and retried, record the failure via
metricsStore.record("mem::compress", latencyMs, false), and return { success:
false, error: "validation_failed" } without calling kv.set,
getSearchIndex().add, sdk.trigger, or scoreCompression; only compute
qualityScore, persist to KV, add to index and trigger the stream after
validation succeeds.

---

Duplicate comments:
In `@src/functions/observe.ts`:
- Line 65: Replace the fallback using logical OR in the assignment to
raw.toolOutput so an empty string from d["tool_output"] is preserved; change the
expression raw.toolOutput = d["tool_output"] || d["error"] to use nullish
coalescing (raw.toolOutput = d["tool_output"] ?? d["error"]) so only
null/undefined fall back to d["error"] while "" remains a valid toolOutput.
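The ?? vs || difference in two lines (illustrative types):

```typescript
// Nullish coalescing preserves "" as a valid tool output; only null/undefined
// fall back to the error field.
function pickToolOutput(
  toolOutput: string | null | undefined,
  error: string,
): string {
  return toolOutput ?? error;
}
```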

In `@src/triggers/api.ts`:
- Around line 219-337: Several API handlers forward unvalidated request bodies
and the api::forget handler incorrectly rejects requests that include
observationIds only; update each registered function (api::file-context,
api::remember, api::forget, api::consolidate, api::patterns,
api::generate-rules) to validate the expected body shape before calling
sdk.trigger: for api::file-context ensure req.body.sessionId is a string and
req.body.files is a non-empty string[]; for api::remember keep the existing
content string check and validate optional type/concepts/files shapes; for
api::forget accept either sessionId (string) or memoryId (string) or
observationIds (string[]), returning 400 if none match; for api::consolidate,
api::patterns, and api::generate-rules validate optional fields (project as
string, minObservations as number where applicable) and return a 400 with a
clear error when validation fails; only then call sdk.trigger with the
validated/normalized payload.

---

Nitpick comments:
In `@src/eval/quality.ts`:
- Around line 44-47: Remove the weak check that awards points when
context.includes("<") is true and instead rely solely on actual tag matches; in
the block that computes sectionCount using (context.match(/<\w+>/g) ||
[]).length, delete the if (context.includes("<")) score += 15; line and keep the
existing sectionCount thresholds (if (sectionCount >= 2) score += 15; if
(sectionCount >= 4) score += 10;) so scoring is based only on real tag matches;
reference variables: context, sectionCount, score, and the regex /<\w+>/g.
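With the weak includes("<") check dropped, the scoring reduces to this sketch (thresholds as quoted above):

```typescript
// Score only on real <tag> matches; a bare "<" no longer earns points.
function sectionScore(context: string): number {
  let score = 0;
  const sectionCount = (context.match(/<\w+>/g) || []).length;
  if (sectionCount >= 2) score += 15;
  if (sectionCount >= 4) score += 10;
  return score;
}
```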

In `@src/functions/compress.ts`:
- Around line 23-38: VALID_TYPES duplicates the observation values declared in
ObservationTypeEnum in src/eval/schemas.ts; replace the hard-coded list by
importing ObservationTypeEnum and building VALID_TYPES from it (e.g., construct
a Set from Object.values(ObservationTypeEnum)) so there's a single source of
truth; update the file to import ObservationTypeEnum and create VALID_TYPES
using ObservationTypeEnum instead of the manual array, keeping the variable name
VALID_TYPES and behavior unchanged.

In `@src/functions/dedup.ts`:
- Around line 11-55: DedupMap can grow unbounded; add a MAX_ENTRIES constant and
enforce it in record(): after inserting (or before), if this.entries.size >
MAX_ENTRIES evict oldest entries until size <= MAX_ENTRIES (use insertion order
of this.entries Map to remove earliest keys via
this.entries.keys().next().value). Update record() to perform eviction and keep
cleanup()/stop()/computeHash()/isDuplicate() unchanged; ensure MAX_ENTRIES is
configurable or documented and referenced by record() and any tests.
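Insertion-order eviction against a cap, per the prompt (the MAX_ENTRIES value is illustrative):

```typescript
const MAX_ENTRIES = 3;

// JavaScript Map iteration order is insertion order, so the first key out of
// keys() is always the oldest entry.
class BoundedDedupMap {
  private entries = new Map<string, number>();

  record(key: string, seenAt: number): void {
    this.entries.set(key, seenAt);
    while (this.entries.size > MAX_ENTRIES) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```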

In `@src/functions/evict.ts`:
- Around line 85-89: The loop that performs sequential deletes (checking dryRun
then awaiting kv.delete for each item in toEvict) will be slow for large
batches; change it to issue deletes in parallel (e.g., map toEvict to promises
calling kv.delete(KV.observations(o.sessionId), o.id) and await Promise.all on
the array) and optionally limit concurrency by chunking the toEvict array if
needed to avoid overwhelming the store; keep the dryRun guard and preserve the
existing error swallowing behavior (.catch(() => {})) on each delete promise.
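A chunked Promise.all sketch (kv.delete is abstracted as a callback here; the chunk size is an assumption):

```typescript
// Delete observations in parallel, chunked to bound concurrency; individual
// failures are swallowed, matching the existing .catch(() => {}) behavior.
async function deleteAll(
  items: Array<{ sessionId: string; id: string }>,
  del: (sessionId: string, id: string) => Promise<void>,
  chunkSize = 25,
): Promise<void> {
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    await Promise.all(
      chunk.map((o) => del(o.sessionId, o.id).catch(() => {})),
    );
  }
}
```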

In `@src/health/monitor.ts`:
- Around line 29-35: The empty catch on the sdk.trigger call swallowing errors
should be replaced with a debug-level log so failures to fetch workers are
visible; update the catch block after the
sdk.trigger<...>("engine::workers::list", ...) call to log the caught error
(include error.message/stack) and context such as the action name and that
workers remain unchanged (e.g., use an existing logger.debug or
processLogger.debug, falling back to console.debug) while keeping the current
behavior of not throwing.

In `@src/providers/circuit-breaker.ts`:
- Around line 13-23: The isAllowed getter mutates circuit state (it transitions
this.state from "open" to "half-open" when this.openedAt + RECOVERY_TIMEOUT_MS
has elapsed), which is intentional lazy state transition behavior; add a concise
inline comment above the isAllowed getter explaining that accessing isAllowed
can change state (i.e., calling cb.isAllowed may return different values on
consecutive calls because it performs the time-based transition), and mention
the key fields involved (state, openedAt, RECOVERY_TIMEOUT_MS) so future
maintainers understand the side effect.

In `@src/providers/resilient.ts`:
- Around line 12-38: Both compress and summarize duplicate the same
circuit-breaker check and try/catch recording logic; create a single private
helper (e.g., executeWithCircuit or runWithBreaker) that takes an async
callback, performs the this.breaker.isAllowed check, awaits the callback (call
sites will pass () => this.inner.compress(...) and () =>
this.inner.summarize(...)), calls this.breaker.recordSuccess() on success and
this.breaker.recordFailure() on error, and rethrows the error; then replace the
bodies of compress and summarize to simply return the helper invoked with the
appropriate inner method call.
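The shared helper could look like this sketch (the breaker is a minimal stand-in for the project's CircuitBreaker surface):

```typescript
// Minimal stand-in breaker exposing the surface the prompt references.
class StubBreaker {
  isAllowed = true;
  successes = 0;
  failures = 0;
  recordSuccess(): void { this.successes++; }
  recordFailure(): void { this.failures++; }
}

// Single place for the gate plus success/failure bookkeeping; compress() and
// summarize() would each call this with their inner-provider closure.
async function runWithBreaker<T>(
  breaker: StubBreaker,
  fn: () => Promise<T>,
): Promise<T> {
  if (!breaker.isAllowed) throw new Error("circuit open");
  try {
    const result = await fn();
    breaker.recordSuccess();
    return result;
  } catch (err) {
    breaker.recordFailure();
    throw err;
  }
}
```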

In `@src/state/search-index.ts`:
- Around line 73-87: The current prefix-match loop iterates the entire
invertedIndex (invertedIndex) per query term, causing O(vocab) cost; fix by
maintaining a sorted term list (add a class field sortedTerms: string[]) and
update it inside add() whenever invertedIndex changes, then replace the full
invertedIndex scan with a prefix lookup helper (e.g., getPrefixMatches(prefix:
string)) that uses binary search on sortedTerms to find the start index and
collects contiguous startsWith matches, and finally iterate only those matched
terms to compute prefixIdf and update scores. Ensure getPrefixMatches is used in
the scoring path instead of the for (const [indexTerm, obsIds] of
this.invertedIndex) loop.
- Around line 103-108: Add a remove(obsId) method to the SearchIndex class to
delete a single document from the in-memory index: locate the clear() method and
the instance fields entries, invertedIndex, docTermCounts, and totalDocLength,
then implement remove(obsId) to (1) early-return if entries doesn't contain
obsId, (2) fetch the document's term frequency map from
docTermCounts.get(obsId), (3) for each term decrement or delete the term's
posting from invertedIndex (removing the obsId entry and removing the term
entirely if no postings remain), (4) subtract the document's length from
totalDocLength and delete entries and docTermCounts entries for obsId, and (5)
ensure any derived counts (e.g., total document count used in IDF) reflect the
removal; finally call this new SearchIndex.remove(obsId) from evict.ts after
deleting an observation (or batch-call it after an eviction run) so the index
stays in sync.

In `@src/telemetry/setup.ts`:
- Around line 82-96: The getters getCounters and getHistograms call
initMetrics() but destructure unused local vars (c, h); remove the unused
destructuring and either call initMetrics() for its side effects (e.g., just
invoke initMetrics() when counters or histograms are undefined) or assign the
returned objects to the module-level counters/histograms variables; update
getCounters/getHistograms to call initMetrics() without creating unused c/h
locals and then return the module-level counters/histograms so the code is clear
and free of unused variables (referencing getCounters, getHistograms,
initMetrics, counters, histograms).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 824b0ff and 2319ee1.

📒 Files selected for processing (25)
  • package.json
  • src/eval/metrics-store.ts
  • src/eval/quality.ts
  • src/eval/schemas.ts
  • src/eval/self-correct.ts
  • src/eval/validator.ts
  • src/functions/compress.ts
  • src/functions/dedup.ts
  • src/functions/evict.ts
  • src/functions/observe.ts
  • src/functions/summarize.ts
  • src/health/monitor.ts
  • src/health/recovery.ts
  • src/health/thresholds.ts
  • src/index.ts
  • src/providers/circuit-breaker.ts
  • src/providers/index.ts
  • src/providers/resilient.ts
  • src/state/schema.ts
  • src/state/search-index.ts
  • src/telemetry/setup.ts
  • src/triggers/api.ts
  • src/types.ts
  • test/circuit-breaker.test.ts
  • test/eval.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/types.ts

Comment thread package.json
 {
   "name": "agentmemory",
-  "version": "0.1.0",
+  "version": "0.3.0",

⚠️ Potential issue | 🟡 Minor

Version mismatch between PR title and package.json.

The PR title references "v0.2.0" but the package version is set to "0.3.0". Consider updating the PR title to reflect the actual version being released to avoid confusion during release tracking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 3, The PR title and package.json are out of sync:
package.json's "version" field is "0.3.0" but the PR title references "v0.2.0";
update the PR title to "v0.3.0" (or alternatively update the package.json
"version" field if the intended release is 0.2.0) so the PR title and the
version field match exactly.

Comment thread src/eval/metrics-store.ts
Comment thread src/eval/metrics-store.ts
Comment thread src/functions/dedup.ts
Comment thread src/functions/summarize.ts
Comment thread src/health/monitor.ts Outdated
Comment thread src/health/monitor.ts
Comment thread src/health/thresholds.ts Outdated
Comment thread src/providers/index.ts
Comment thread src/triggers/api.ts Outdated
- metrics-store: track quality score calls separately to avoid diluted averages
- metrics-store: getAll() merges KV + cache for complete data
- summarize: reject and don't persist summaries that fail validation
- dedup: unref cleanup timer so it doesn't block process shutdown
- health/monitor: move connectionState into closure (no module-level mutation)
- health/monitor: compute real CPU percent via delta sampling
- health/thresholds: use heapUsed/heapTotal instead of rss/heapTotal
- providers/index: throw clear error on missing API keys instead of non-null assert
- remember: handle empty observationIds array explicitly in forget flow
- triggers/api: normalize dryRun to strict boolean
- Extract shared circuit breaker call() helper in ResilientProvider
- Use COUNTER_NAMES/HISTOGRAM_NAMES arrays with shared noop objects in telemetry
- Add MS_PER_DAY constant to replace repeated 24*60*60*1000 in eviction

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
src/eval/metrics-store.ts (1)

7-8: ⚠️ Potential issue | 🟠 Major

avgQualityScore can become incorrect after restart or cache miss.

qualityCallCounts is process-local (Line 7). When metrics are reloaded from KV (Line 19), the count resets, so the next quality sample is averaged as if it were a near-first sample, not an incremental continuation.

🛠️ Proposed fix
+type StoredFunctionMetrics = FunctionMetrics & { qualityCallCount: number };
+
 export class MetricsStore {
-  private cache = new Map<string, FunctionMetrics>();
-  private qualityCallCounts = new Map<string, number>();
+  private cache = new Map<string, StoredFunctionMetrics>();

   constructor(private kv: StateKV) {}

   async record(
@@
-    let m = this.cache.get(functionId);
+    let m = this.cache.get(functionId);
     if (!m) {
-      m = (await this.kv.get<FunctionMetrics>(KV.metrics, functionId)) ?? {
+      m = (await this.kv.get<StoredFunctionMetrics>(KV.metrics, functionId)) ?? {
         functionId,
         totalCalls: 0,
         successCount: 0,
         failureCount: 0,
         avgLatencyMs: 0,
         avgQualityScore: 0,
+        qualityCallCount: 0,
       };
     }
@@
     if (qualityScore !== undefined) {
-      const prevQualityCalls = this.qualityCallCounts.get(functionId) || 0;
+      const prevQualityCalls = m.qualityCallCount ?? 0;
+      m.qualityCallCount = prevQualityCalls + 1;
       m.avgQualityScore =
         (m.avgQualityScore * prevQualityCalls + qualityScore) /
-        (prevQualityCalls + 1);
-      this.qualityCallCounts.set(functionId, prevQualityCalls + 1);
+        m.qualityCallCount;
     }

Also update FunctionMetrics in src/types.ts to include qualityCallCount so this stays type-safe end-to-end.

Also applies to: 19-26, 37-43

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eval/metrics-store.ts` around lines 7 - 8, The in-memory Map
qualityCallCounts is reset on process restart or when metrics are reloaded from
KV, which corrupts incremental avgQualityScore; update the persistence model and
restore logic: add qualityCallCount to the FunctionMetrics type in src/types.ts,
persist that count alongside avgQualityScore when saving metrics, and when
loading metrics from KV populate qualityCallCounts (the private Map
qualityCallCounts in src/eval/metrics-store.ts) from the stored qualityCallCount
so future calls compute a true incremental average; also update any code that
increments/updates avgQualityScore (e.g., the method that records new quality
samples) to use and increment the persisted count instead of relying on a
process-local default.
src/functions/remember.ts (1)

71-86: ⚠️ Potential issue | 🟠 Major

Empty observationIds currently escalates to full-session deletion.

Line 84 treats observationIds: [] as “delete entire session,” which can wipe all observations when a client accidentally sends an empty list. This should return a validation error instead, and full-session deletion should only run when observationIds is truly omitted.

🛠️ Proposed fix
       if (
         data.sessionId &&
         data.observationIds &&
         data.observationIds.length > 0
       ) {
         for (const obsId of data.observationIds) {
           await kv.delete(KV.observations(data.sessionId), obsId);
           deleted++;
         }
       }

+      if (
+        data.sessionId &&
+        Array.isArray(data.observationIds) &&
+        data.observationIds.length === 0
+      ) {
+        return { success: false, error: "observationIds cannot be empty" };
+      }
+
       if (
         data.sessionId &&
-        (!data.observationIds || data.observationIds.length === 0) &&
+        data.observationIds === undefined &&
         !data.memoryId
       ) {
         const observations = await kv.list<{ id: string }>(
           KV.observations(data.sessionId),
         );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/remember.ts` around lines 71 - 86, The handler currently treats
an empty array (data.observationIds === []) as a signal to delete the whole
session; change this so an empty array is a validation error and full-session
deletion only happens when observationIds is absent. Concretely: add an explicit
validation early (e.g., in the function handling request `data`) that if
Array.isArray(data.observationIds) && data.observationIds.length === 0 you
return/throw a validation error; then change the full-session deletion branch
condition from (!data.observationIds || data.observationIds.length === 0) to
(data.observationIds == null) so only a missing observationIds triggers the
KV.observations(sessionId) full deletion path that calls kv.delete and
KV.observations.
src/triggers/api.ts (1)

260-285: ⚠️ Potential issue | 🟡 Minor

Validation does not allow observationIds as a standalone identifier.

The type signature includes observationIds?: string[], but the validation (lines 271-276) only checks for sessionId or memoryId. Users cannot delete specific observations by ID alone.

Suggested fix
-      if (!req.body?.sessionId && !req.body?.memoryId) {
+      const hasValidIds =
+        req.body?.sessionId ||
+        req.body?.memoryId ||
+        (Array.isArray(req.body?.observationIds) &&
+          req.body.observationIds.length > 0);
+      if (!hasValidIds) {
         return {
           status_code: 400,
-          body: { error: "sessionId or memoryId is required" },
+          body: {
+            error: "sessionId, memoryId, or non-empty observationIds required",
+          },
         };
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` around lines 260 - 285, The validation in the
api::forget handler (registered via sdk.registerFunction) only accepts sessionId
or memoryId but ignores observationIds, preventing deletion by observation IDs;
update the input validation in the async function for id "api::forget" to accept
any one of sessionId, memoryId, or a non-empty observationIds array (e.g., check
req.body?.sessionId || req.body?.memoryId ||
(Array.isArray(req.body?.observationIds) && req.body.observationIds.length >
0)), return the same 400 error if none are present, and keep forwarding req.body
to sdk.trigger("mem::forget") unchanged.
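The widened check as a predicate (body shape follows the prompt):

```typescript
// Accept any one of sessionId, memoryId, or a non-empty observationIds array
// as a valid forget target; an empty array does not count.
function hasValidForgetTarget(body?: {
  sessionId?: string;
  memoryId?: string;
  observationIds?: unknown;
}): boolean {
  if (!body) return false;
  const ids = body.observationIds;
  return Boolean(
    body.sessionId || body.memoryId || (Array.isArray(ids) && ids.length > 0),
  );
}
```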
🧹 Nitpick comments (1)
src/telemetry/setup.ts (1)

48-56: Inconsistent metric naming: circuit_breaker.open uses underscore while others use dots.

All other metrics use dots as delimiters (e.g., compression.success, search.total), but circuit_breaker.open mixes underscore and dot. This inconsistency complicates metric queries and dashboard configurations.

Proposed fix for consistent naming
 const COUNTER_NAMES: Array<[keyof Counters, string]> = [
   ["observationsTotal", "observations.total"],
   ["compressionSuccess", "compression.success"],
   ["compressionFailure", "compression.failure"],
   ["searchTotal", "search.total"],
   ["dedupSkipped", "dedup.skipped"],
   ["evictionTotal", "eviction.total"],
-  ["circuitBreakerOpen", "circuit_breaker.open"],
+  ["circuitBreakerOpen", "circuit.breaker.open"],
 ];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/telemetry/setup.ts` around lines 48 - 56, The COUNTER_NAMES entry for the
circuit breaker metric is inconsistent: change the string value for the
"circuitBreakerOpen" tuple in COUNTER_NAMES from "circuit_breaker.open" to use
dot separators (e.g., "circuit.breaker.open", per project convention) so it
matches the dot-delimited format used by other
metrics; update only the string part of the tuple in the COUNTER_NAMES array to
the chosen dot-delimited name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/evict.ts`:
- Around line 67-71: The stale-session branch in evict (where you increment
stats.staleSessions and call kv.delete(KV.sessions, session.id)) only removes
the session record and leaves per-session observations behind; update that block
so when not dryRun you also remove all keys under the observations namespace for
that session (use the KV.observations(session.id) namespace/identifier to list
or delete the observation keys and delete them—either via a
deletePrefix/list+batch-delete or an API that deletes a namespace), ensuring
errors are caught the same way as the existing kv.delete call and preserving the
dryRun behavior.

In `@src/telemetry/setup.ts`:
- Around line 7-11: The OTEL_CONFIG object has a mismatched serviceVersion
("0.3.0") versus the PR/version in use; update the serviceVersion field on
OTEL_CONFIG to "0.2.0" (or to the authoritative release variable if one exists)
so telemetry/serviceName (agentmemory) and serviceVersion are consistent; locate
OTEL_CONFIG in src/telemetry/setup.ts and change the serviceVersion property
accordingly.

---

Duplicate comments:
In `@src/eval/metrics-store.ts`:
- Around line 7-8: The in-memory Map qualityCallCounts is reset on process
restart or when metrics are reloaded from KV, which corrupts incremental
avgQualityScore; update the persistence model and restore logic: add
qualityCallCount to the FunctionMetrics type in src/types.ts, persist that count
alongside avgQualityScore when saving metrics, and when loading metrics from KV
populate qualityCallCounts (the private Map qualityCallCounts in
src/eval/metrics-store.ts) from the stored qualityCallCount so future calls
compute a true incremental average; also update any code that increments/updates
avgQualityScore (e.g., the method that records new quality samples) to use and
increment the persisted count instead of relying on a process-local default.

In `@src/functions/remember.ts`:
- Around line 71-86: The handler currently treats an empty array
(data.observationIds === []) as a signal to delete the whole session; change
this so an empty array is a validation error and full-session deletion only
happens when observationIds is absent. Concretely: add an explicit validation
early (e.g., in the function handling request `data`) that if
Array.isArray(data.observationIds) && data.observationIds.length === 0 you
return/throw a validation error; then change the full-session deletion branch
condition from (!data.observationIds || data.observationIds.length === 0) to
(data.observationIds == null) so only a missing observationIds triggers the
KV.observations(sessionId) full deletion path that calls kv.delete and
KV.observations.

In `@src/triggers/api.ts`:
- Around line 260-285: The validation in the api::forget handler (registered via
sdk.registerFunction) only accepts sessionId or memoryId but ignores
observationIds, preventing deletion by observation IDs; update the input
validation in the async function for id "api::forget" to accept any one of
sessionId, memoryId, or a non-empty observationIds array (e.g., check
req.body?.sessionId || req.body?.memoryId ||
(Array.isArray(req.body?.observationIds) && req.body.observationIds.length >
0)), return the same 400 error if none are present, and keep forwarding req.body
to sdk.trigger("mem::forget") unchanged.

---

Nitpick comments:
In `@src/telemetry/setup.ts`:
- Around line 48-56: The COUNTER_NAMES entry for the circuit breaker metric is
inconsistent: change the string value for the "circuitBreakerOpen" tuple in
COUNTER_NAMES from "circuit_breaker.open" to use dot separators (e.g.,
"circuit.breaker.open", per project convention) so it
matches the dot-delimited format used by other metrics; update only the string
part of the tuple in the COUNTER_NAMES array to the chosen dot-delimited name.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2319ee1 and f1afd49.

📒 Files selected for processing (11)
  • src/eval/metrics-store.ts
  • src/functions/dedup.ts
  • src/functions/evict.ts
  • src/functions/remember.ts
  • src/functions/summarize.ts
  • src/health/monitor.ts
  • src/health/thresholds.ts
  • src/providers/index.ts
  • src/providers/resilient.ts
  • src/telemetry/setup.ts
  • src/triggers/api.ts

Comment thread src/functions/evict.ts
Comment on lines +67 to +71
if (age > staleDays && !summaryIds.has(session.id)) {
  stats.staleSessions++;
  if (!dryRun) {
    await kv.delete(KV.sessions, session.id).catch(() => {});
  }

⚠️ Potential issue | 🟠 Major

Stale-session eviction leaves orphaned observations behind.

In the stale-session path, only KV.sessions is deleted. Observations in KV.observations(session.id) can remain indefinitely after the session record is gone.

🛠️ Proposed fix
         if (age > staleDays && !summaryIds.has(session.id)) {
           stats.staleSessions++;
           if (!dryRun) {
+            const staleObs = await kv
+              .list<CompressedObservation>(KV.observations(session.id))
+              .catch(() => []);
+            for (const o of staleObs) {
+              await kv.delete(KV.observations(session.id), o.id).catch(() => {});
+            }
             await kv.delete(KV.sessions, session.id).catch(() => {});
           }
+          continue;
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (age > staleDays && !summaryIds.has(session.id)) {
  stats.staleSessions++;
  if (!dryRun) {
    await kv.delete(KV.sessions, session.id).catch(() => {});
  }
if (age > staleDays && !summaryIds.has(session.id)) {
  stats.staleSessions++;
  if (!dryRun) {
    const staleObs = await kv
      .list<CompressedObservation>(KV.observations(session.id))
      .catch(() => []);
    for (const o of staleObs) {
      await kv.delete(KV.observations(session.id), o.id).catch(() => {});
    }
    await kv.delete(KV.sessions, session.id).catch(() => {});
  }
  continue;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/evict.ts` around lines 67 - 71, the stale-session branch in
evict (where stats.staleSessions is incremented and kv.delete(KV.sessions,
session.id) is called) removes only the session record and leaves the
per-session observations behind. Update that block so that, when not dryRun,
it also removes all keys under the observations namespace for that session:
use the KV.observations(session.id) namespace/identifier to list the
observation keys and delete them, either via a deletePrefix/list+batch-delete
or an API that deletes a whole namespace. Catch errors the same way as the
existing kv.delete call and preserve the dryRun behavior.

Comment thread src/telemetry/setup.ts
Comment on lines +7 to +11
export const OTEL_CONFIG: OtelConfig = {
serviceName: "agentmemory",
serviceVersion: "0.3.0",
metricsExportIntervalMs: 30_000,
};

⚠️ Potential issue | 🟡 Minor

Version mismatch: PR is v0.2.0 but serviceVersion is "0.3.0".

The PR title indicates this is v0.2.0, but the telemetry config specifies version "0.3.0". This inconsistency will cause confusion in observability data and dashboards.

Proposed fix
 export const OTEL_CONFIG: OtelConfig = {
   serviceName: "agentmemory",
-  serviceVersion: "0.3.0",
+  serviceVersion: "0.2.0",
   metricsExportIntervalMs: 30_000,
 };
📝 Committable suggestion


Suggested change
 export const OTEL_CONFIG: OtelConfig = {
   serviceName: "agentmemory",
-  serviceVersion: "0.3.0",
+  serviceVersion: "0.2.0",
   metricsExportIntervalMs: 30_000,
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/telemetry/setup.ts` around lines 7 - 11, the OTEL_CONFIG object's
serviceVersion ("0.3.0") does not match the version this PR ships. Update the
serviceVersion property to "0.2.0" (or to the authoritative release variable,
if one exists) so the serviceName ("agentmemory") and serviceVersion reported
by telemetry stay consistent; locate OTEL_CONFIG in src/telemetry/setup.ts and
change the serviceVersion property accordingly.
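One way to implement the "authoritative release variable" mentioned above is to derive serviceVersion from a single shared constant, so the telemetry config cannot drift from the release version. The `VERSION` constant here is an illustrative name, not something defined in this repo; a minimal sketch:

```typescript
// Single source of truth for the release version, bumped once per release.
// (Projects often read this from package.json instead; a constant keeps the
// sketch self-contained.)
export const VERSION = "0.2.0";

interface OtelConfig {
  serviceName: string;
  serviceVersion: string;
  metricsExportIntervalMs: number;
}

export const OTEL_CONFIG: OtelConfig = {
  serviceName: "agentmemory",
  serviceVersion: VERSION, // stays in lockstep with the release version
  metricsExportIntervalMs: 30_000,
};
```

With this shape, the v0.2.0/0.3.0 mismatch the review flags cannot recur, since there is no second hardcoded version string to forget.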
