
feat(multimodal): Phase 1 — CLIP visual embeddings + vision-search#179

Merged
rohitg00 merged 3 commits into main from feat/multimodal-phase1-vision-embeddings
Apr 21, 2026

Conversation

@rohitg00
Owner

@rohitg00 rohitg00 commented Apr 21, 2026

Summary

Phase 1 of the multimodal plan. Builds on PR #111 (image store + ref counting + VLM description). Adds the missing piece: a direct visual embedding space so agents can search screenshots by natural language or by similar image, without routing through VLM-describe → BM25-text.

Depends on #111. This PR is stacked on top of fix/multimodal-conflicts and will rebase cleanly once #111 merges.

Motivation

  • Mem0 / Zep / Letta are publicly text-only in the 2026 landscape comparisons — explicit multimodal gap.
  • AgenticVision ships CLIP ViT-B/32 but requires manual vision_capture invocation.
  • agentmemory already auto-captures screenshots via the 12 Claude Code hooks. Adding CLIP on top means the agent remembers what it sees, and you can retrieve cross-modally without asking. No comparable system ships this combination.

Full design doc lives locally at ~/specs-backup/agentmemory-multimodal-plan.md (per our no-SPEC-in-PR convention).

What ships

New

  • ClipEmbeddingProvider — CLIP ViT-B/32 (Xenova/clip-vit-base-patch32) via @xenova/transformers (already a dep). 512-dim normalized vectors. Handles base64, data URL, and file paths. Lazy model download on first call.
  • createImageEmbeddingProvider() — gated on AGENTMEMORY_IMAGE_EMBEDDINGS=true. Default OFF, same opt-in pattern as AGENTMEMORY_AUTO_COMPRESS (#138) and AGENTMEMORY_INJECT_CONTEXT (#143).
  • KV.imageEmbeddings scope keyed by imageRef path.
  • mem::vision-embed — stores a CLIP vector for a given imageRef.
  • mem::vision-search — cosine similarity over stored vectors. Accepts text query (text→image cross-modal), image path (image→image), or base64. topK clamped to 50. Optional sessionId filter.
  • POST /agentmemory/vision-embed, POST /agentmemory/vision-search — auth-gated HTTP triggers, 503 when the flag is off.
  • MCP tool memory_vision_search — exposed via the running-server handler in src/mcp/server.ts (standalone MCP unchanged by design; this is server-backed).
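
The brute-force ranking described for mem::vision-search can be sketched roughly as follows. This is an illustration only; the type and function names (StoredEmbedding, visionSearch) are hypothetical, not the actual module's API.

```typescript
// Illustrative sketch of brute-force cosine ranking with topK clamping.
// Names here are hypothetical, not the real src/functions/vision-search.ts.
interface StoredEmbedding {
  imageRef: string;
  sessionId?: string;
  vector: number[]; // L2-normalized (512-dim for CLIP ViT-B/32)
}

// For L2-normalized vectors, cosine similarity reduces to a dot product.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot;
}

function visionSearch(
  query: number[],
  stored: StoredEmbedding[],
  topK = 10,
  sessionId?: string,
): { imageRef: string; score: number }[] {
  const k = Math.min(Math.max(1, Math.floor(topK)), 50); // clamp topK to 50
  return stored
    .filter((e) => sessionId === undefined || e.sessionId === sessionId)
    .map((e) => ({ imageRef: e.imageRef, score: cosineSim(query, e.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

At the scales the PR targets (hundreds to thousands of images) this linear scan is cheap, which is why an ANN index is deferred to Phase 2+.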

Wired into existing flows

  • EmbeddingProvider.embedImage?() — optional slot on the interface. Non-breaking.
  • Observe pipeline: when an image is persisted and the flag is on, fires mem::vision-embed via triggerVoid so embedding runs off the critical path.
  • decrementImageRef now drops the embedding alongside the disk file + ref row. Forget / evict / retention won't orphan vectors.

README / counts

MCP tool count 44 → 45 across README stats, install instructions, and config block. The CORE_TOOLS test expectation was bumped accordingly.

Design choices

| Decision | Why |
| --- | --- |
| Opt-in default | Matches #138 / #143 / #160. Don't surprise users with a 350 MB model download. |
| Lazy model load | Same pattern as iii-engine's curl installer. Avoids npx install bloat. |
| Brute-force cosine | Image memory counts are small (hundreds to thousands). An ANN index is premature optimization; Phase 2+ work if warranted. |
| triggerVoid for auto-embed | Don't block observe on model inference; fire-and-forget. |
| memory_vision_search server-only | Standalone MCP is a deliberately narrow offline fallback; vision needs the server anyway. |

Test plan

  • npm run build — clean
  • npm test — 794/794 pass (+6 new vision-search tests)
  • Unit coverage: embed→store, text→image ranking, image→image self-match, sessionId filter, provider-absent 503 error, missing-query 400 error
  • Manual: AGENTMEMORY_IMAGE_EMBEDDINGS=true npx @agentmemory/agentmemory, hook-capture a screenshot, curl -sX POST :3111/agentmemory/vision-search -d '{"queryText":"error banner"}'
  • Manual: confirm decrementImageRef cleans up the vector when refcount hits 0 (via evict dry-run then real run)

Not in this PR (Phase 2+)

  • Thumbnail cache (/images/thumb/{sha})
  • Viewer "Screenshots" tab
  • Hybrid BM25 + CLIP ranking in mem::smart-search (reciprocal rank fusion)
  • Entity↔image graph edges
  • vision_diff, OCR fallback, video keyframes

Ship Phase 1 first, watch real usage, then pick the next axis.

Summary by CodeRabbit

  • New Features

    • Vision search: cross-modal image search using CLIP embeddings.
    • New HTTP endpoints and MCP tool for vision search and embedding.
    • Optional image-embedding provider, toggleable via AGENTMEMORY_IMAGE_EMBEDDINGS.
  • Bug Fixes

    • Improved image reference cleanup to also remove related stored embeddings when images are deleted.
  • Documentation

    • README and plugin metadata updated to reflect 45 tools and the new env var.
  • Tests

    • Added end-to-end tests for vision-embed/search, validation, ranking, and session scoping.

@vercel

vercel Bot commented Apr 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agentmemory | Ready | Preview, Comment | Apr 21, 2026 0:29am |

@coderabbitai

coderabbitai Bot commented Apr 21, 2026

📝 Walkthrough


Adds optional CLIP-based image embeddings and cross-modal search: new CLIP embedding provider, SDK functions mem::vision-embed / mem::vision-search, API and MCP endpoints, KV schema for image embeddings, automatic vision-embed triggering when images are saved (configurable via env), and minor README/manifest updates.

Changes

  • Vision search functions (src/functions/vision-search.ts): New module registering mem::vision-embed and mem::vision-search: validates inputs, builds/embeds query vectors (text/image/ref/base64), stores embeddings in KV.imageEmbeddings, computes cosine similarity, supports session scoping and topK clamping, and logs/audits.
  • Embedding provider, CLIP (src/providers/embedding/clip.ts): New ClipEmbeddingProvider (name "clip", dims 512): lazy-loads @xenova/transformers, exposes embed, embedBatch, embedImage, handles data: URIs and filesystem paths, and L2-normalizes vectors.
  • Provider wiring (src/providers/embedding/index.ts, src/providers/index.ts): Re-exports the CLIP provider and adds a createImageEmbeddingProvider() factory guarded by the AGENTMEMORY_IMAGE_EMBEDDINGS env var; the provider is cached as a singleton.
  • Startup & wiring (src/index.ts): Instantiates imageEmbeddingProvider at startup (when enabled), logs provider name/dimensions, and registers vision-search functions during worker setup.
  • Observation & refcount changes (src/functions/observe.ts, src/functions/image-refs.ts): observe conditionally emits mem::vision-embed after saving images when AGENTMEMORY_IMAGE_EMBEDDINGS=true; decrementImageRef now deletes the KV.imageEmbeddings entry when the refcount reaches zero, before removing the file.
  • MCP & API exposure (src/mcp/server.ts, src/mcp/tools-registry.ts, src/triggers/api.ts): Adds MCP tool memory_vision_search and its dispatch branch; new HTTP endpoints POST /agentmemory/vision-search and POST /agentmemory/vision-embed with input validation and downstream error mapping.
  • Types & schema (src/types.ts, src/state/schema.ts): Adds optional embedImage?(src: string) to EmbeddingProvider, a new vision_embed audit operation, and KV.imageEmbeddings = "mem:image-embeddings".
  • Docs & small updates (README.md, plugin/.claude-plugin/plugin.json): Updated MCP tool count (44 → 45) in README and plugin manifest; added AGENTMEMORY_IMAGE_EMBEDDINGS env var docs describing optional screenshot embedding and lazy-loading behavior.
  • Tests (test/vision-search.test.ts, test/mcp-standalone.test.ts): New end-to-end vision-search tests exercising embedding and search flows; updated tools-count assertion (CORE_TOOLS length 12).
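
The provider summary above mentions that vectors are L2-normalized. That step amounts to dividing each component by the vector's Euclidean norm; a minimal sketch (the helper name is hypothetical, and the real provider code may differ):

```typescript
// Minimal L2-normalization sketch; the actual clip.ts implementation may differ.
function l2Normalize(v: Float32Array): Float32Array {
  let sumSq = 0;
  for (let i = 0; i < v.length; i++) sumSq += v[i] * v[i];
  const norm = Math.sqrt(sumSq) || 1; // guard against the zero vector
  return v.map((x) => x / norm);
}
```

Normalizing at write time is what lets the search side treat cosine similarity as a plain dot product.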

Sequence Diagram(s)

sequenceDiagram
    participant Observe as Observer
    participant VisionEmbed as mem::vision-embed
    participant ImageProvider as CLIP Provider
    participant KV as KV Store

    Observe->>Observe: Save extracted image to disk
    Observe->>Observe: Increment image ref count
    Note right of Observe: if AGENTMEMORY_IMAGE_EMBEDDINGS=true
    Observe->>VisionEmbed: Trigger mem::vision-embed (imageRef, sessionId, obsId)
    VisionEmbed->>KV: kv.get(KV.imageRefs, imageRef) verify ref exists
    VisionEmbed->>ImageProvider: embedImage(filePath)
    ImageProvider->>ImageProvider: Load image & run CLIP pipeline
    ImageProvider-->>VisionEmbed: Float32Array embedding
    VisionEmbed->>KV: kv.put(KV.imageEmbeddings, imageRef, StoredEmbedding)
    VisionEmbed-->>Observe: Return success / dimensions
sequenceDiagram
    participant Client as Client/API
    participant MCP as mcp::tools / API
    participant VisionSearch as mem::vision-search
    participant ImageProvider as CLIP Provider
    participant KV as KV Store

    Client->>MCP: Request (queryText or queryImageRef / queryImageBase64, topK, sessionId)
    MCP->>VisionSearch: Trigger mem::vision-search with payload
    alt queryText provided
        VisionSearch->>ImageProvider: embed(queryText)
    else queryImageBase64 provided
        VisionSearch->>ImageProvider: embedImage(base64)
    else queryImageRef provided
        VisionSearch->>KV: kv.get(KV.imageRefs, queryImageRef) verify
        VisionSearch->>ImageProvider: embedImage(queryImageRef)
    end
    ImageProvider-->>VisionSearch: query vector
    VisionSearch->>KV: list KV.imageEmbeddings (optionally filter by sessionId)
    KV-->>VisionSearch: stored embeddings
    VisionSearch->>VisionSearch: compute cosine similarities, sort, slice topK
    VisionSearch-->>MCP: Return ranked results
    MCP-->>Client: Respond with results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I saved the pic and nudged it to sleep,

Numbers hummed softly from pixels so deep,
I hop through vectors, find a match in a sweep,
A carrot, a click — cross-modal memories keep. 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main feature addition: CLIP-based visual embeddings and vision search capabilities for the multimodal phase. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



Builds on PR #111's image store. Adds a dedicated visual embedding
path so agents can search screenshots by natural language or by
similar image, without routing through VLM-describe → BM25-text.

## Additions

- `EmbeddingProvider.embedImage?(src)` slot (non-breaking, optional)
- `ClipEmbeddingProvider` — CLIP ViT-B/32 via `@xenova/transformers`
  (already a dep). 512d, normalized. Supports base64, data URL, and
  file paths. Lazy model load; same pattern as the existing local
  text provider.
- `createImageEmbeddingProvider()` gated on
  `AGENTMEMORY_IMAGE_EMBEDDINGS=true` — opt-in, matches the
  auto-compress/context-injection opt-in pattern from #138 and #143.
- `KV.imageEmbeddings` scope keyed by imageRef path.
- `mem::vision-embed` + `mem::vision-search` functions. Cosine
  similarity, brute-force over the stored set (fine at image-memory
  scales; ANN can come later).
- `POST /agentmemory/vision-embed` + `POST /agentmemory/vision-search`
  HTTP triggers with auth + 400/503 error shapes.
- MCP tool `memory_vision_search` exposed via the running-server
  handler. Accepts queryText, queryImageRef, queryImageBase64, topK,
  sessionId.
- Observe pipeline: when an image is stored and the flag is on,
  fires `mem::vision-embed` via triggerVoid so embedding happens off
  the critical path.
- `decrementImageRef` sweeps the embedding too when refcount hits
  zero, so forget/evict/retention stop orphaning vectors.
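
Since the provider accepts base64, data URLs, and file paths, raw base64 input can be normalized to a data: URL before embedding. A hypothetical helper sketch (the helper name and default MIME type are assumptions; the real code may handle this differently):

```typescript
// Hypothetical helper: wrap raw base64 into a data: URL; pass data: URLs through.
function toDataUrl(input: string, mime = "image/png"): string {
  return input.startsWith("data:") ? input : `data:${mime};base64,${input}`;
}
```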

## Why

Mem0/Zep/Letta are text-only in 2026. AgenticVision ships CLIP but
needs manual capture. agentmemory already auto-captures via hooks;
bolting CLIP onto that pipeline makes it the only memory system
where the agent remembers what it sees and can retrieve cross-modally
without manual invocation.

Full plan: local spec at ~/specs-backup/agentmemory-multimodal-plan.md.

## Tests

- test/vision-search.test.ts: 6 new cases covering embed→store, text
  query ranking, image-to-image query, sessionId filter, provider-
  absent error, missing-query rejection.
- Full suite: 794/794 pass (+6 new).

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
README.md (2)

813-847: ⚠️ Potential issue | 🟡 Minor

Document the opt-in vision embeddings flag.

The new MCP tool says it requires AGENTMEMORY_IMAGE_EMBEDDINGS=true, but the configuration block does not show that env var. Add it here so users can enable the feature without reading source.

Suggested env var entry
 # GRAPH_EXTRACTION_ENABLED=false
+# AGENTMEMORY_IMAGE_EMBEDDINGS=false # OFF by default. Set true to enable CLIP image embeddings
+                                   # and vision-search over stored screenshots.
 # CONSOLIDATION_ENABLED=true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 813 - 847, Add a new documented env var entry for
AGENTMEMORY_IMAGE_EMBEDDINGS in the configuration block under the "Features"
section so users can opt into image embedding support for the MCP tool; declare
it default false (e.g., AGENTMEMORY_IMAGE_EMBEDDINGS=false) and include a short
comment explaining that setting it to true enables image embeddings required by
the new MCP tool and may affect storage/token usage, similar to the other
AGENTMEMORY_* flags (reference the existing AGENTMEMORY_AUTO_COMPRESS and
AGENTMEMORY_INJECT_CONTEXT entries to match style).

337-366: ⚠️ Potential issue | 🟡 Minor

Finish updating the MCP tool counts and list.

The README now says 45 tools in some places, but this section still has all 43 memory tools, ### 44 Tools, and Extended tools (44 total...). The core tool table also omits memory_vision_search.

Suggested documentation updates
-Install agentmemory for OpenClaw. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to my OpenClaw MCP config so agentmemory is available with all 43 memory tools:
+Install agentmemory for OpenClaw. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to my OpenClaw MCP config so agentmemory is available with all 45 memory tools:

-Install agentmemory for Hermes. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to ~/.hermes/config.yaml so Hermes can use agentmemory as an MCP server with all 43 memory tools:
+Install agentmemory for Hermes. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to ~/.hermes/config.yaml so Hermes can use agentmemory as an MCP server with all 45 memory tools:

-### 44 Tools
+### 45 Tools

+| `memory_vision_search` | Search screenshots by text or similar image |
+
-<summary>Extended tools (44 total — set AGENTMEMORY_TOOLS=all)</summary>
+<summary>Extended tools (45 total — set AGENTMEMORY_TOOLS=all)</summary>

Based on learnings: When adding or removing MCP tools, update ALL of: tools-registry.ts, server.ts, triggers/api.ts, index.ts, test/mcp-standalone.test.ts, README.md, and plugin/.claude-plugin/plugin.json.

Also applies to: 596-620

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 337 - 366, README and some docs still reference
inconsistent counts and omit memory_vision_search; update all mentions of tool
counts and the core tool list so they match the current MCP registry. Edit
README.md to replace any "43", "44", or "45" occurrences with the authoritative
current count, add memory_vision_search to the core tools table, and ensure
headings like "### 44 Tools" reflect the same number; then synchronize the
code-side registries by confirming tools-registry.ts, server.ts,
triggers/api.ts, index.ts, test/mcp-standalone.test.ts, and
plugin/.claude-plugin/plugin.json include the same set (including
memory_vision_search) so all counts and lists match across README.md and those
files.
src/functions/observe.ts (1)

136-155: ⚠️ Potential issue | 🟠 Major

Trigger vision embedding only after the observation is persisted.

Lines 143-149 enqueue mem::vision-embed before kv.set(...) succeeds. If persistence fails, the async embed can race with cleanup and leave a searchable vector for a deleted/uncommitted observation.

Proposed ordering fix
+        let imageRefToEmbed: string | undefined;
         if (pendingImageData && (pendingImageData.startsWith("data:image/") || pendingImageData.startsWith("iVBORw0KGgo") || pendingImageData.startsWith("/9j/"))) {
           const { saveImageToDisk } = await import("../utils/image-store.js");
           const { filePath, bytesWritten } = await saveImageToDisk(pendingImageData);
           raw.imageData = filePath;
           const { incrementImageRef } = await import("./image-refs.js");
           await incrementImageRef(kv, filePath);
           sdk.triggerVoid("mem::disk-size-delta", { deltaBytes: bytesWritten });
-          if (process.env["AGENTMEMORY_IMAGE_EMBEDDINGS"] === "true") {
-            sdk.triggerVoid("mem::vision-embed", {
-              imageRef: filePath,
-              sessionId: payload.sessionId,
-              observationId: obsId,
-            });
-          }
+          imageRefToEmbed = filePath;
         }
 
         try {
 
           await kv.set(KV.observations(payload.sessionId), obsId, raw);
+          if (imageRefToEmbed && process.env["AGENTMEMORY_IMAGE_EMBEDDINGS"] === "true") {
+            sdk.triggerVoid("mem::vision-embed", {
+              imageRef: imageRefToEmbed,
+              sessionId: payload.sessionId,
+              observationId: obsId,
+            });
+          }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/observe.ts` around lines 136 - 155, The mem::vision-embed
trigger is being fired before the observation is persisted, which can create
embeddings for observations that fail to save; change the ordering so that after
saving image to disk (saveImageToDisk) and incrementing refs (incrementImageRef)
you call kv.set(KV.observations(payload.sessionId), obsId, raw) first and only
when that Promise resolves successfully then call
sdk.triggerVoid("mem::vision-embed", { imageRef: filePath, sessionId:
payload.sessionId, observationId: obsId }); keep the disk-size-delta trigger
where it is (it can run before or after persistence) but ensure the vision
embedding enqueue happens strictly after kv.set resolves to avoid embeddings for
unpersisted observations.
🧹 Nitpick comments (2)
test/mcp-standalone.test.ts (1)

49-50: Assert the new tool explicitly, not only the count.

This keeps the test tied to the memory_vision_search registration rather than any arbitrary 12th core tool.

Suggested test tightening
-  it("CORE_TOOLS has 12 items", () => {
+  it("CORE_TOOLS includes memory_vision_search", () => {
     expect(CORE_TOOLS.length).toBe(12);
+    expect(CORE_TOOLS.some((tool) => tool.name === "memory_vision_search")).toBe(true);
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/mcp-standalone.test.ts` around lines 49 - 50, The test only asserts
CORE_TOOLS.length; update the spec to also assert that the new tool registration
is present by name/ID so it doesn't pass due to any unrelated 12th item. In the
test around CORE_TOOLS (the it block that reads CORE_TOOLS has 12 items) add an
assertion that CORE_TOOLS contains the memory_vision_search tool (e.g., check
CORE_TOOLS.some(t => t.name === 'memory_vision_search' or t.id ===
'memory_vision_search') or expect(find(...) toBeDefined), keeping the length
assertion but explicitly verifying the memory_vision_search entry exists.
src/state/schema.ts (1)

45-45: Add a typed record for this new KV scope.

KV.imageEmbeddings needs a matching exported interface in src/types.ts for the stored embedding shape, including the vector plus metadata used for filtering such as imageRef, sessionId, and observationId.

Based on learnings: When adding new KV scopes, add to the KV object in src/state/schema.ts and add corresponding interface in src/types.ts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state/schema.ts` at line 45, Add an exported typed record that matches
the new KV scope KV.imageEmbeddings by creating an interface (e.g.,
ImageEmbeddingRecord) in src/types.ts that describes the stored embedding shape:
include a numeric vector (number[]), required metadata fields imageRef,
sessionId, observationId (strings), and any optional fields you need such as
timestamp or extra metadata; export the interface and ensure the KV mapping for
KV.imageEmbeddings uses this type so reads/writes are strongly typed (reference
KV.imageEmbeddings and the new ImageEmbeddingRecord interface when updating
usages).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/image-refs.ts`:
- Around line 24-25: Currently the code deletes KV.imageRefs first and
suppresses errors on deleting KV.imageEmbeddings, which can leave orphan
searchable embeddings; change the order to delete KV.imageEmbeddings before
KV.imageRefs and remove the .catch(() => {}) suppression so errors from
kv.delete(KV.imageEmbeddings, filePath) are allowed to surface (i.e., await
kv.delete(KV.imageEmbeddings, filePath) followed by await
kv.delete(KV.imageRefs, filePath)), ensuring failures prevent ref deletion and
avoiding orphan embeddings.

In `@src/functions/vision-search.ts`:
- Around line 35-53: The vision-embed path currently writes a StoredEmbedding to
KV.imageEmbeddings but doesn't record an audit entry; after the successful
kv.set in the try block (the block that constructs StoredEmbedding and calls
kv.set(KV.imageEmbeddings, data.imageRef, stored)), call recordAudit(...) to
emit an audit record for operation "mem::vision-embed" including relevant
metadata (at minimum imageRef, sessionId, observationId, modelName or provider
name, dimensions and updatedAt) so operators can trace the write; ensure you
import/use the existing recordAudit function and keep the audit call inside the
success path before returning the success object.

In `@src/mcp/tools-registry.ts`:
- Around line 129-143: Add the new tool "memory_vision_search" to the remaining
MCP surfaces: add test coverage in mcp-standalone.test.ts to exercise the
memory_vision_search tool (use the tool name "memory_vision_search" and its
inputSchema fields like queryText, queryImageBase64, queryImageRef, topK,
sessionId), update README.md to document AGENTMEMORY_IMAGE_EMBEDDINGS and usage
examples for memory_vision_search, and add the corresponding plugin metadata
entry in plugin/.claude-plugin/plugin.json so the plugin includes
memory_vision_search in its tools list and metadata.

In `@src/providers/embedding/clip.ts`:
- Around line 85-104: The heuristic in loadImage (function loadImage, returning
Promise<RawImageInstance>, using t.RawImage.fromBlob and readFile) incorrectly
tries to detect raw base64 by testing src.slice(0,64) and !src.includes("/"),
which rejects valid base64 payloads and misclassifies paths; remove that
heuristic and only treat inputs starting with "data:" as in-memory base64 blobs
(handled by the existing data: branch) and treat all other src values as
filesystem paths passed to readFile; if raw base64 support is required later,
add an explicit option or a clear "base64:" prefix parameter rather than
guessing the format.

In `@src/triggers/api.ts`:
- Around line 1282-1343: The api::vision-search and api::vision-embed handlers
currently forward req.body verbatim and only truthiness-check fields; instead,
whitelist and validate expected fields (e.g., for api::vision-search accept only
queryText:string, queryImageRef:string, queryImageBase64:string,
sessionId:string, topK:number and for api::vision-embed accept only
imageRef:string, sessionId:string, observationId:string), perform typeof checks,
clamp/normalize topK to an integer within acceptable bounds, build a new payload
object containing only those validated fields, and pass that payload to
sdk.trigger({ function_id: "mem::vision-search" }) / sdk.trigger({ function_id:
"mem::vision-embed" }) so downstream functions receive only well-typed,
whitelisted data.
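
The whitelisting the reviewer describes can be sketched as follows. Field names are taken from the PR description; the function and interface names are hypothetical, not the actual handler code in src/triggers/api.ts.

```typescript
// Sketch of body whitelisting/validation for the vision-search endpoint.
// Names are hypothetical; the real handler may differ.
interface VisionSearchPayload {
  queryText?: string;
  queryImageRef?: string;
  queryImageBase64?: string;
  sessionId?: string;
  topK?: number;
}

const STRING_FIELDS = ["queryText", "queryImageRef", "queryImageBase64", "sessionId"] as const;

function sanitizeVisionSearchBody(body: unknown): VisionSearchPayload {
  const b = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;
  const out: VisionSearchPayload = {};
  for (const key of STRING_FIELDS) {
    const v = b[key];
    if (typeof v === "string") out[key] = v; // copy only well-typed, expected fields
  }
  if (typeof b["topK"] === "number" && Number.isFinite(b["topK"])) {
    out.topK = Math.min(Math.max(1, Math.floor(b["topK"])), 50); // clamp to [1, 50]
  }
  return out;
}
```

Building a fresh object (rather than forwarding req.body) guarantees that unexpected or mistyped fields never reach the downstream function.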

---

Outside diff comments:
In `@README.md`:
- Around line 813-847: Add a new documented env var entry for
AGENTMEMORY_IMAGE_EMBEDDINGS in the configuration block under the "Features"
section so users can opt into image embedding support for the MCP tool; declare
it default false (e.g., AGENTMEMORY_IMAGE_EMBEDDINGS=false) and include a short
comment explaining that setting it to true enables image embeddings required by
the new MCP tool and may affect storage/token usage, similar to the other
AGENTMEMORY_* flags (reference the existing AGENTMEMORY_AUTO_COMPRESS and
AGENTMEMORY_INJECT_CONTEXT entries to match style).
- Around line 337-366: README and some docs still reference inconsistent counts
and omit memory_vision_search; update all mentions of tool counts and the core
tool list so they match the current MCP registry. Edit README.md to replace any
"43", "44", or "45" occurrences with the authoritative current count, add
memory_vision_search to the core tools table, and ensure headings like "### 44
Tools" reflect the same number; then synchronize the code-side registries by
confirming tools-registry.ts, server.ts, triggers/api.ts, index.ts,
test/mcp-standalone.test.ts, and plugin/.claude-plugin/plugin.json include the
same set (including memory_vision_search) so all counts and lists match across
README.md and those files.

In `@src/functions/observe.ts`:
- Around line 136-155: The mem::vision-embed trigger is being fired before the
observation is persisted, which can create embeddings for observations that fail
to save; change the ordering so that after saving image to disk
(saveImageToDisk) and incrementing refs (incrementImageRef) you call
kv.set(KV.observations(payload.sessionId), obsId, raw) first and only when that
Promise resolves successfully then call sdk.triggerVoid("mem::vision-embed", {
imageRef: filePath, sessionId: payload.sessionId, observationId: obsId }); keep
the disk-size-delta trigger where it is (it can run before or after persistence)
but ensure the vision embedding enqueue happens strictly after kv.set resolves
to avoid embeddings for unpersisted observations.

---

Nitpick comments:
In `@src/state/schema.ts`:
- Line 45: Add an exported typed record that matches the new KV scope
KV.imageEmbeddings by creating an interface (e.g., ImageEmbeddingRecord) in
src/types.ts that describes the stored embedding shape: include a numeric vector
(number[]), required metadata fields imageRef, sessionId, observationId
(strings), and any optional fields you need such as timestamp or extra metadata;
export the interface and ensure the KV mapping for KV.imageEmbeddings uses this
type so reads/writes are strongly typed (reference KV.imageEmbeddings and the
new ImageEmbeddingRecord interface when updating usages).

In `@test/mcp-standalone.test.ts`:
- Around line 49-50: The test only asserts CORE_TOOLS.length; update the spec to
also assert that the new tool registration is present by name/ID so it doesn't
pass due to any unrelated 12th item. In the test around CORE_TOOLS (the it block
that reads CORE_TOOLS has 12 items) add an assertion that CORE_TOOLS contains
the memory_vision_search tool (e.g., check CORE_TOOLS.some(t => t.name ===
'memory_vision_search' or t.id === 'memory_vision_search') or expect(find(...)
toBeDefined), keeping the length assertion but explicitly verifying the
memory_vision_search entry exists.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f40a010a-a1ea-4df6-8cff-61067a4bfd74

📥 Commits

Reviewing files that changed from the base of the PR and between bb60b7a and bf6f33f.

📒 Files selected for processing (15)
  • README.md
  • src/functions/image-refs.ts
  • src/functions/observe.ts
  • src/functions/vision-search.ts
  • src/index.ts
  • src/mcp/server.ts
  • src/mcp/tools-registry.ts
  • src/providers/embedding/clip.ts
  • src/providers/embedding/index.ts
  • src/providers/index.ts
  • src/state/schema.ts
  • src/triggers/api.ts
  • src/types.ts
  • test/mcp-standalone.test.ts
  • test/vision-search.test.ts

Comment thread src/functions/image-refs.ts Outdated
Comment thread src/functions/vision-search.ts
Comment thread src/mcp/tools-registry.ts
Comment on lines +129 to +143
{
  name: "memory_vision_search",
  description:
    "Cross-modal image search via CLIP embeddings. Pass queryText to find screenshots matching a description, or queryImageBase64/queryImageRef to find similar images. Requires AGENTMEMORY_IMAGE_EMBEDDINGS=true.",
  inputSchema: {
    type: "object",
    properties: {
      queryText: { type: "string", description: "Text query (e.g. 'login form with error banner')" },
      queryImageRef: { type: "string", description: "Absolute path to a stored image to match against" },
      queryImageBase64: { type: "string", description: "Raw base64 image bytes or data URL" },
      topK: { type: "number", description: "Max results (default 10, max 50)" },
      sessionId: { type: "string", description: "Filter to a single session" },
    },
  },
},

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

targets=(
  "src/mcp/tools-registry.ts"
  "src/mcp/server.ts"
  "src/triggers/api.ts"
  "src/index.ts"
  "test/mcp-standalone.test.ts"
  "README.md"
  "plugin/.claude-plugin/plugin.json"
)

for target in "${targets[@]}"; do
  if [[ ! -e "$target" ]]; then
    printf '\nMISSING: %s\n' "$target"
    continue
  fi

  printf '\n== %s ==\n' "$target"
  rg -n 'memory_vision_search|mem::vision-search|vision-search|AGENTMEMORY_IMAGE_EMBEDDINGS' "$target" || true
done

Repository: rohitg00/agentmemory

Length of output: 1128


Update missing MCP surfaces for memory_vision_search.

The tool is registered in tools-registry.ts, server.ts, triggers/api.ts, and index.ts, but is missing from:

  • test/mcp-standalone.test.ts — add test coverage
  • README.md — add AGENTMEMORY_IMAGE_EMBEDDINGS documentation
  • plugin/.claude-plugin/plugin.json — add plugin metadata

Per the MCP integration pattern, all seven surfaces must be updated when adding tools.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mcp/tools-registry.ts` around lines 129 - 143, Add the new tool
"memory_vision_search" to the remaining MCP surfaces: add test coverage in
mcp-standalone.test.ts to exercise the memory_vision_search tool (use the tool
name "memory_vision_search" and its inputSchema fields like queryText,
queryImageBase64, queryImageRef, topK, sessionId), update README.md to document
AGENTMEMORY_IMAGE_EMBEDDINGS and usage examples for memory_vision_search, and
add the corresponding plugin metadata entry in plugin/.claude-plugin/plugin.json
so the plugin includes memory_vision_search in its tools list and metadata.

- src/functions/image-refs.ts: delete imageEmbeddings BEFORE imageRefs
  and drop the silent .catch() so transient KV errors surface. Stops
  orphan searchable embeddings when the ref delete fails halfway.
- src/functions/vision-search.ts: record an audit row for every
  successful mem::vision-embed write (new "vision_embed" op). State-
  changing functions must leave an audit trail per the repo policy.
- src/functions/vision-search.ts: mem::vision-search now wraps raw
  queryImageBase64 into a data: URL before handing to embedImage, so
  the provider contract stays "data: URL or filesystem path".
- src/providers/embedding/clip.ts: drop the raw-base64 heuristic in
  loadImage. The old regex whitelisted "/" in the char class but the
  !src.includes("/") guard rejected any realistic base64 blob,
  falling through to readFile and throwing ENOENT. Now: data: URL or
  filesystem path only; callers with raw base64 pass a data: URL.
  Also drop unused RawImage.read type and collapse the redundant
  Float32Array copy in embedImage.
- src/triggers/api.ts: both handlers now whitelist fields with
  asNonEmptyString and parseOptionalPositiveInt before constructing
  the payload. Matches api::observe / api::search style, stops
  forwarding the raw req.body, clamps topK at the boundary.
- src/types.ts: add "vision_embed" to AuditEntry.operation union and
  rename EmbeddingProvider.embedImage's param to src (matches the
  implementation now that raw base64 isn't accepted).
- README.md, plugin/.claude-plugin/plugin.json: document
  AGENTMEMORY_IMAGE_EMBEDDINGS and bump tool count string.

Full suite still 794/794.
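The base64-wrapping change described above amounts to a one-line guard before handing input to the provider. A minimal sketch — `toDataUrl` is an illustrative helper name and the image/png media type is an assumption, not the committed code:

```typescript
// Illustrative helper for the "data: URL or filesystem path" provider contract:
// raw base64 from a caller is wrapped into a data: URL before reaching embedImage.
function toDataUrl(input: string): string {
  return input.startsWith("data:") ? input : `data:image/png;base64,${input}`;
}
```

Inputs that already carry a `data:` prefix pass through untouched, so callers can supply either form.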
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
README.md (1)

337-366: ⚠️ Potential issue | 🟡 Minor

Make the MCP tool counts and list consistent.

The README now advertises 45 tools, but nearby sections still say 43/44 tools and the extended list omits memory_vision_search. Update the headings/prompts and add the new tool row so users can discover it. Based on learnings: When adding or removing MCP tools, update ALL of: tools-registry.ts, server.ts, triggers/api.ts, index.ts, test/mcp-standalone.test.ts, README.md, and plugin/.claude-plugin/plugin.json.

📝 Proposed README fixes
-Install agentmemory for OpenClaw. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to my OpenClaw MCP config so agentmemory is available with all 43 memory tools:
+Install agentmemory for OpenClaw. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to my OpenClaw MCP config so agentmemory is available with all 45 memory tools:

-Install agentmemory for Hermes. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to ~/.hermes/config.yaml so Hermes can use agentmemory as an MCP server with all 43 memory tools:
+Install agentmemory for Hermes. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to ~/.hermes/config.yaml so Hermes can use agentmemory as an MCP server with all 45 memory tools:

-### 44 Tools
+### 45 Tools

-<summary>Extended tools (44 total — set AGENTMEMORY_TOOLS=all)</summary>
+<summary>Extended tools (45 total — set AGENTMEMORY_TOOLS=all)</summary>

Also add a row for:

| `memory_vision_search` | Cross-modal screenshot search via CLIP embeddings |

Also applies to: 596-620

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 337 - 366, Update the README and related registries
to make the MCP tool count and list consistent: change all advertised tool
counts to 45 and add the missing memory_vision_search tool entry to the tools
list in README.md; then propagate the addition/removal to the canonical code
locations — tools-registry.ts, server.ts, triggers/api.ts, index.ts,
test/mcp-standalone.test.ts, and plugin/.claude-plugin/plugin.json — ensuring
each file registers or tests the new memory_vision_search symbol (and any
updated tool-count constants) so documentation and code remain in sync.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/vision-search.ts`:
- Line 75: The topK calculation can become NaN or fractional (since Math.max(1,
NaN) yields NaN), so normalize data?.topK to a finite integer before clamping:
coerce the incoming value (data?.topK) to a Number, guard with Number.isFinite,
fall back to a default (e.g., 10) if invalid, then use Math.floor to remove
fractions and finally clamp with Math.min(50, Math.max(1, normalizedTopK));
update the expression that sets topK in vision-search.ts to use this
normalizedTopK variable.
- Around line 33-37: The code currently calls imageProvider.embedImage(...) with
raw data.imageRef and data.queryImageRef, which allows arbitrary filesystem
reads; before calling embedImage, verify that the provided ref exists in the
managed store by checking KV.imageRefs.get(imageRef) (and likewise
KV.imageRefs.get(queryImageRef)) and return an error if not found or invalid;
update the validation around the embedImage call in the vision-search handler
(where embedImage is invoked) to perform this KV.imageRefs lookup for both
data.imageRef and data.queryImageRef and only call imageProvider.embedImage when
the KV check succeeds.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5b217ef9-9e54-4279-a256-f46ec9b6d3a5

📥 Commits

Reviewing files that changed from the base of the PR and between bf6f33f and 241be26.

📒 Files selected for processing (7)
  • README.md
  • plugin/.claude-plugin/plugin.json
  • src/functions/image-refs.ts
  • src/functions/vision-search.ts
  • src/providers/embedding/clip.ts
  • src/triggers/api.ts
  • src/types.ts
✅ Files skipped from review due to trivial changes (1)
  • plugin/.claude-plugin/plugin.json
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/functions/image-refs.ts
  • src/types.ts
  • src/triggers/api.ts

CodeRabbit major finding on mem::vision-embed: an authenticated caller
could POST { "imageRef": "/etc/passwd" } and the CLIP provider would
happily readFile that path and store its vector. Same vector on the
queryImageRef path of mem::vision-search. Arbitrary filesystem probing
via a REST endpoint.

- src/functions/vision-search.ts: both handlers now require the
  incoming ref to satisfy isManagedImagePath() and to exist in
  KV.imageRefs. Either check failing short-circuits with a specific
  error. Only paths that went through the managed image store (and
  still have at least one live reference) can be embedded or used as
  a query target.

- src/functions/vision-search.ts: topK is now normalised before
  clamping. `Math.max(1, NaN)` is NaN, so `slice(0, NaN)` returned
  nothing for malformed input. Coerce to finite integer, default 10,
  then clamp to [1, 50].

- test/vision-search.test.ts: imageRefs use the managed path, seed
  KV.imageRefs before each embed, and add coverage for
  managed-path rejection, unregistered-ref rejection, queryImageRef
  rejection, and NaN/fractional topK clamp. 10/10 pass locally, 798
  in the full suite.
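The topK normalisation described above can be sketched roughly like this (`normalizeTopK` is an illustrative name, not the code as committed):

```typescript
// Illustrative sketch: coerce to a finite integer, default 10, clamp to [1, 50].
// The coercion must happen before clamping, because Math.max(1, NaN) is NaN
// and slice(0, NaN) silently returns an empty array.
function normalizeTopK(raw: unknown, fallback = 10): number {
  const n = Number(raw);
  if (!Number.isFinite(n)) return fallback;
  return Math.min(50, Math.max(1, Math.floor(n)));
}
```

Fractional values floor down, out-of-range values clamp, and anything non-numeric falls back to the default instead of poisoning the slice.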
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/vision-search.ts`:
- Around line 93-97: The inline base64 image path (data.queryImageBase64)
bypasses image-store limits and is passed directly to imageProvider.embedImage;
fix by validating and bounding the decoded byte length before constructing the
data URL and calling embedImage: decode the base64 string to bytes, enforce a
configurable maximum (e.g., 1MB or project limit) and throw a validation error
if exceeded, only then build the `data:image/...;base64,` string and call
`imageProvider.embedImage(b64)`; also ensure this input validation runs as part
of the function registration/validation (sdk.registerFunction) for the
vision-search handler and record the check outcome via recordAudit() so
oversized attempts are rejected and audited.
- Around line 40-42: The refCount read via kv.get<number>(KV.imageRefs,
data.imageRef) can be a non-numeric/truthy value; coerce it immediately to a
Number (e.g., const count = Number(refCount)) and validate that
Number.isFinite(count) && count > 0 before accepting the imageRef, returning the
error when that check fails; update the same pattern used later (the other
occurrence around the 102-104 region) to perform the single coercion and the
finite positive check so malformed mem:image-refs rows are rejected consistently
(refer to kv.get, KV.imageRefs and the refCount variable for locating code).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 35e3ae9f-42b3-4cc5-aeef-64315a063ad4

📥 Commits

Reviewing files that changed from the base of the PR and between 241be26 and c736b39.

📒 Files selected for processing (2)
  • src/functions/vision-search.ts
  • test/vision-search.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/vision-search.test.ts

Comment on lines +40 to +42
const refCount = await kv.get<number>(KV.imageRefs, data.imageRef);
if (!refCount || Number(refCount) < 1) {
  return { success: false, error: "imageRef not registered in mem:image-refs" };

⚠️ Potential issue | 🟡 Minor

Reject malformed imageRefs rows instead of treating them as registered.

kv.get<number>() does not enforce the runtime value. A truthy non-numeric value can make Number(refCount) < 1 evaluate false, allowing corrupted mem:image-refs rows to pass validation. Coerce once and require a finite positive count in both paths. As per coding guidelines: "src/{functions,triggers}/**/*.ts: Register functions using sdk.registerFunction() with validation of inputs, work via kv.get/kv.set/kv.list, and record audit operations via recordAudit()."

🛡️ Proposed validation hardening
-      const refCount = await kv.get<number>(KV.imageRefs, data.imageRef);
-      if (!refCount || Number(refCount) < 1) {
+      const refCount = Number(await kv.get<number>(KV.imageRefs, data.imageRef));
+      if (!Number.isFinite(refCount) || refCount < 1) {
         return { success: false, error: "imageRef not registered in mem:image-refs" };
       }
...
-          const refCount = await kv.get<number>(KV.imageRefs, data.queryImageRef);
-          if (!refCount || Number(refCount) < 1) {
+          const refCount = Number(await kv.get<number>(KV.imageRefs, data.queryImageRef));
+          if (!Number.isFinite(refCount) || refCount < 1) {
             return { success: false, error: "queryImageRef not registered in mem:image-refs" };
           }

Also applies to: 102-104
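The hardened check amounts to something like the following (the helper name is illustrative, not the committed code):

```typescript
// Illustrative: a ref count read from KV is only trusted if it coerces to a
// finite number >= 1; objects, NaN, non-numeric strings, and 0 are all rejected.
function isLiveRefCount(raw: unknown): boolean {
  const count = Number(raw);
  return Number.isFinite(count) && count >= 1;
}
```

Coercing once and testing `Number.isFinite` closes the gap where a truthy non-numeric value slips past `Number(refCount) < 1`.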


Comment on lines +93 to +97
} else if (data?.queryImageBase64) {
  const b64 = data.queryImageBase64.startsWith("data:")
    ? data.queryImageBase64
    : `data:image/png;base64,${data.queryImageBase64}`;
  queryVec = await imageProvider.embedImage(b64);

⚠️ Potential issue | 🟠 Major

Bound inline image queries before decoding them.

queryImageBase64 bypasses the managed image-store limits and is passed straight into embedImage(). Add a decoded-byte cap before constructing the data URL so an authenticated caller cannot force unbounded image decoding/model work. As per coding guidelines: "src/{functions,triggers}/**/*.ts: Register functions using sdk.registerFunction() with validation of inputs, work via kv.get/kv.set/kv.list, and record audit operations via recordAudit()."

🛡️ Proposed guard
         } else if (data?.queryImageBase64) {
+          const inlineBase64 = data.queryImageBase64.startsWith("data:")
+            ? data.queryImageBase64.split(",", 2)[1]
+            : data.queryImageBase64;
+          if (!inlineBase64) {
+            return { success: false, error: "queryImageBase64 invalid" };
+          }
+          const estimatedBytes =
+            Math.floor((inlineBase64.length * 3) / 4) -
+            (inlineBase64.endsWith("==") ? 2 : inlineBase64.endsWith("=") ? 1 : 0);
+          if (estimatedBytes > 10 * 1024 * 1024) {
+            return { success: false, error: "queryImageBase64 too large" };
+          }
           const b64 = data.queryImageBase64.startsWith("data:")
             ? data.queryImageBase64
             : `data:image/png;base64,${data.queryImageBase64}`;
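The size estimate in the guard above uses the standard base64 length relation (decoded bytes are roughly 3/4 of the encoded length, minus `=` padding). A standalone sketch, with a hypothetical helper name:

```typescript
// Estimate the decoded byte length of a base64 payload without decoding it.
// Strips an optional data: URL prefix first, then accounts for '=' padding.
function decodedBase64Bytes(b64: string): number {
  const payload = b64.startsWith("data:") ? b64.split(",", 2)[1] ?? "" : b64;
  const padding = payload.endsWith("==") ? 2 : payload.endsWith("=") ? 1 : 0;
  return Math.floor((payload.length * 3) / 4) - padding;
}
```

Checking the estimate before constructing the data URL keeps oversized inline images out of the model entirely.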

@rohitg00 rohitg00 merged commit 6bf1b67 into main Apr 21, 2026
5 checks passed
@rohitg00 rohitg00 deleted the feat/multimodal-phase1-vision-embeddings branch April 21, 2026 17:22
rohitg00 added a commit that referenced this pull request Apr 21, 2026
…bal scope

Introduces memory slots as a new first-class memory primitive, orthogonal
to observations / memories / graph. A slot is a labelled, size-limited,
agent-editable chunk of context. 8 defaults seed on first run
(persona, user_preferences, tool_guidelines, project_context, guidance,
pending_items, session_patterns, self_notes). Agent reads/writes via
MCP tools; pinned slots serialise cleanly into SessionStart context.

Functions (`src/functions/slots.ts`):
- mem::slot-list
- mem::slot-get
- mem::slot-create
- mem::slot-append   (refuses writes over sizeLimit)
- mem::slot-replace  (refuses content over sizeLimit)
- mem::slot-delete
- mem::slot-reflect  (used by idle reflection)
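The sizeLimit refusal in slot-append / slot-replace can be sketched as follows (names and types are hypothetical, not the actual slots.ts implementation):

```typescript
// Illustrative: an append is refused outright, not truncated, when the result
// would exceed the slot's sizeLimit (measured here in characters for simplicity).
type SlotWriteResult =
  | { success: true; content: string }
  | { success: false; error: string };

function appendToSlot(current: string, addition: string, sizeLimit: number): SlotWriteResult {
  const next = current ? `${current}\n${addition}` : addition;
  if (next.length > sizeLimit) {
    return { success: false, error: `append would exceed sizeLimit (${sizeLimit})` };
  }
  return { success: true, content: next };
}
```

Refusing rather than truncating keeps slot content agent-authored end to end: the agent sees the error and decides what to drop.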

REST (`src/triggers/api.ts`):
- GET    /agentmemory/slots
- GET    /agentmemory/slot?label=...
- POST   /agentmemory/slot
- POST   /agentmemory/slot/append
- POST   /agentmemory/slot/replace
- DELETE /agentmemory/slot?label=...
- POST   /agentmemory/slot/reflect

MCP tools (`src/mcp/tools-registry.ts`, `src/mcp/server.ts`):
- memory_slot_list
- memory_slot_get
- memory_slot_create
- memory_slot_append
- memory_slot_replace
- memory_slot_delete

event::session::stopped fires mem::slot-reflect via triggerVoid when
AGENTMEMORY_REFLECT=true. The reflector reads the last N observations,
extracts TODO-flavoured titles into pending_items, counts error /
command types into session_patterns, and records touched files in
project_context. Fire-and-forget: does not block summarise.
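The TODO-extraction step of the reflector might look roughly like this (the types and names are illustrative, not the actual code):

```typescript
// Illustrative: pull TODO-flavoured observation titles into pending_items.
type Observation = { type: string; title: string };

function extractPendingItems(observations: Observation[]): string[] {
  return observations
    .map((o) => o.title)
    .filter((title) => /\btodo\b/i.test(title));
}
```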

Slots carry a scope field ("project" | "global"). Global slots live at
KV.globalSlots ("mem:slots:global") — one brain across all projects for
things like persona / user_preferences / tool_guidelines. Project-
scoped slot with the same label shadows the global one on read.
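Project-shadows-global resolution is a two-level lookup; a minimal sketch with hypothetical types:

```typescript
// Illustrative: a project-scoped slot with the same label wins on read;
// the global slot is the fallback.
type Slot = { label: string; content: string; scope: "project" | "global" };

function resolveSlot(
  label: string,
  projectSlots: Map<string, Slot>,
  globalSlots: Map<string, Slot>,
): Slot | undefined {
  return projectSlots.get(label) ?? globalSlots.get(label);
}
```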

- AGENTMEMORY_SLOTS=true    — enable the whole primitive
- AGENTMEMORY_REFLECT=true  — enable Stop-hook reflection (requires SLOTS)

Both default OFF. Matches the opt-in convention of auto-compress (#138),
inject-context (#143), vision embeddings (#179).

44 MCP tools → 50 across README (stats badge, install instructions,
config block) and plugin/.claude-plugin/plugin.json description. New
AGENTMEMORY_SLOTS and AGENTMEMORY_REFLECT entries in the config block.

test/slots.test.ts — 12 new cases: default seeding by scope, label
validation, create/get/append/replace/delete round-trip, duplicate-
create rejection, sizeLimit enforcement on append + replace, project
shadows global, listPinnedSlots + renderPinnedContext, reflect no-op
on empty session, reflect updates pending_items / session_patterns /
project_context from TODO + error + file signals.

Full suite: 800/800 pass (+12 new). Build clean.