Persistent memory for AI coding agents.
Powered by iii-engine.
Quick Start • Why • Agents • How It Works • Search • Memory Evolution • MCP • Viewer • Configuration • API
Every AI coding agent has the same blind spot. Session ends, memory vanishes. You re-explain architecture. You re-discover bugs. You re-teach preferences. Built-in memory files like CLAUDE.md and .cursorrules are 200-line sticky notes that overflow and go stale. agentmemory replaces that with a searchable, versioned, cross-agent database — 38 MCP tools, triple-stream retrieval (BM25 + vector + knowledge graph), 4-tier memory consolidation, provenance-tracked citations, and cascading staleness so retired facts never pollute your context again. One instance serves Claude Code, Cursor, Codex, Windsurf, and any MCP client simultaneously. 573 tests. Zero external DB dependencies.
The result is measurable. On 240 real observations across 30 sessions, agentmemory hits 64% Recall@10 and perfect MRR while using 92% fewer tokens than dumping everything into context. When an agent searches "database performance optimization," it finds the N+1 fix you made three weeks ago — something keyword grep literally cannot do. Memories version automatically, supersede each other, propagate staleness to related graph nodes, and sync across agent instances via P2P mesh. Your agents stop repeating mistakes. Your context stays clean. Your sessions start fast.
This repository is a modified public fork of rohitg00/agentmemory. It remains licensed under Apache-2.0, keeps upstream attribution, and distributes local changes under the same license terms.
For a public summary of why this fork exists and which files changed by intent, see docs/fork-intent.md.
git clone https://github.com/rohitg00/agentmemory.git && cd agentmemory
docker compose up -d --build
curl http://127.0.0.1:3111/agentmemory/health

The included docker-compose.yml starts both iii-engine and the agentmemory-worker, mounts iii-config.yaml into the engine container, and persists iii state in the named iii-data volume.
AI coding agents forget everything between sessions. You explain the same architecture, re-discover the same patterns, and re-learn the same preferences every time. agentmemory fixes that.
Session 1: "Add auth to the API"
Agent writes code, runs tests, fixes bugs
agentmemory silently captures every tool use
Session ends -> observations compressed into structured memory
Session 2: "Now add rate limiting"
agentmemory injects context from Session 1:
- Auth uses JWT middleware in src/middleware/auth.ts
- Tests in test/auth.test.ts cover token validation
- Decision: chose jose over jsonwebtoken for Edge compatibility
Agent starts with full project awareness
No manual notes. No copy-pasting. The agent just knows.
| Capability | What it does |
|---|---|
| Automatic capture | Every tool use, file edit, test run, and error is silently recorded via hooks |
| LLM compression | Raw observations are compressed into structured facts, concepts, and narratives |
| Context injection | Past knowledge is injected at session start within a configurable token budget |
| Semantic search | Hybrid BM25 + vector search finds relevant memories even with different wording |
| Memory evolution | Memories version over time, supersede each other, and form relationship graphs |
| Project profiles | Aggregated per-project intelligence: top concepts, files, conventions, common errors |
| Auto-forgetting | TTL expiry, contradiction detection, and importance-based eviction keep memory clean |
| Privacy first | API keys, secrets, and <private> tags are stripped before anything is stored |
| Self-healing | Circuit breaker, provider fallback chain, self-correcting LLM output, health monitoring |
| Claude Code bridge | Bi-directional sync with ~/.claude/projects/*/memory/MEMORY.md |
| Cross-agent MCP | Standalone MCP server for Cursor, Codex, Gemini CLI, Windsurf, any MCP client |
| Citation provenance | JIT verification traces any memory back to source observations and sessions |
| Cascading staleness | Superseded memories auto-flag related graph nodes, edges, and siblings as stale |
| Knowledge graph | Entity extraction + BFS traversal across files, functions, concepts, errors |
| 4-tier memory | Working → episodic → semantic → procedural consolidation with strength decay |
| Team memory | Namespaced shared + private memory across team members |
| Governance | Edit, delete, bulk-delete, and audit trail for all memory operations |
| Git snapshots | Version, rollback, and diff memory state via git commits |
Every AI coding agent now ships with built-in memory — Claude Code has MEMORY.md, Cursor has notepads, Windsurf has Cascade memories, Cline has memory bank. These work like sticky notes: fast, always-on, but fundamentally limited.
agentmemory is the searchable database behind the sticky notes.
| | Built-in (CLAUDE.md, .cursorrules) | agentmemory |
|---|---|---|
| Scale | 200-line cap (MEMORY.md) | Unlimited |
| Search | Loads everything into context | BM25 + vector + graph (returns top-K only) |
| Token cost | 22K+ tokens at 240 observations | ~1,900 tokens (92% less) |
| At 1K observations | 80% of memories invisible | 100% searchable |
| At 5K observations | Exceeds context window | Still ~2K tokens |
| Cross-session recall | Only within line cap | Full corpus search |
| Cross-agent | Per-agent files (no sharing) | MCP + REST API (any agent) |
| Multi-agent coordination | Impossible | Leases, signals, actions, routines |
| Cross-agent sync | No | P2P mesh (7 scopes: memories, actions, semantic, procedural, relations, graph) |
| Memory trust | No verification | Citation chain back to source observations with confidence scores |
| Semantic search | No (keyword grep) | Yes (Recall@10: 64% vs 56% for grep) |
| Memory lifecycle | Manual pruning | Ebbinghaus decay + tiered eviction |
| Knowledge graph | No | Entity extraction + temporal versioning |
| Observability | Read files manually | Real-time viewer on :3113 |
Evaluated on 240 real-world coding observations across 30 sessions with 20 labeled queries:
| System | Recall@10 | NDCG@10 | MRR | Tokens/query |
|---|---|---|---|---|
| Built-in (grep all into context) | 55.8% | 80.3% | 82.5% | 19,462 |
| agentmemory BM25 (stemmed + synonyms) | 55.9% | 82.7% | 95.5% | 1,571 |
| agentmemory + Xenova embeddings | 64.1% | 94.9% | 100.0% | 1,571 |
With real embeddings, agentmemory finds "N+1 query fix" when you search "database performance optimization" — something keyword matching literally cannot do.
Full benchmark reports: benchmark/QUALITY.md, benchmark/SCALE.md, benchmark/REAL-EMBEDDINGS.md
agentmemory works with any agent that supports hooks, MCP, or via its REST API.
These agents support hooks natively. agentmemory captures tool usage automatically via its 12 hooks.
| Agent | Integration | Setup |
|---|---|---|
| Claude Code | 12 hooks (all types) | /plugin install agentmemory or manual hook config |
| Claude Code SDK | Agent SDK provider | Built-in AgentSDKProvider uses your Claude subscription |
Some host forks may integrate with agentmemory natively without using the
Claude plugin path. In that model, the host posts lifecycle events directly to
/agentmemory/observe and uses /agentmemory/context for bounded retrieval.
| Agent | Integration | Setup |
|---|---|---|
| Codex forks with an agentmemory adapter | Native lifecycle adapter | Host-specific fork/config; not shipped by this repo |
Any agent that connects to MCP servers can use agentmemory's 38 tools, 6 resources, and 3 prompts. The agent actively queries and saves memory through MCP calls.
| Agent | How to connect |
|---|---|
| Claude Desktop | Add to claude_desktop_config.json MCP servers |
| Cursor | Add MCP server in settings |
| Windsurf | MCP server configuration |
| Cline / Continue | MCP server configuration |
| Any MCP client | Point to http://localhost:3111/agentmemory/mcp/* |
Agents without hooks or MCP can integrate via 93 REST endpoints directly. This works with any agent, language, or framework.
POST /agentmemory/observe # Capture what the agent did
POST /agentmemory/smart-search # Find relevant memories
POST /agentmemory/context # Get context for injection
POST /agentmemory/enrich # Get enriched context (files + memories + bugs)
POST /agentmemory/remember # Save long-term memory
GET  /agentmemory/profile      # Get project intelligence

| Your situation | Use |
|---|---|
| Claude Code user | Plugin install (hooks + MCP + skills) |
| Running a Codex fork that emits agentmemory lifecycle events directly | Native adapter path (/agentmemory/observe + /agentmemory/context) |
| Building a custom agent with Claude SDK | AgentSDKProvider (zero config) |
| Using stock Codex, Cursor, Windsurf, or any MCP client | MCP server (38 tools + 6 resources + 3 prompts) |
| Building your own agent framework | REST API (93 endpoints) |
| Sharing memory across multiple agents | All agents point to the same iii-engine instance |
claude plugins marketplace add /path/to/agentmemory
claude plugins install agentmemory

This repo currently installs cleanly through a local Claude marketplace path. After install, start a fresh Claude Code session so the plugin hooks and skills are loaded.
git clone https://github.com/rohitg00/agentmemory.git
cd agentmemory
docker compose up -d --build

Useful lifecycle commands:
docker compose logs -f agentmemory-worker
docker compose restart agentmemory-worker
docker compose stop agentmemory-worker

curl http://127.0.0.1:3111/agentmemory/health
# Real-time viewer (auto-starts on port 3113)
open http://localhost:3113

{
"status": "healthy",
"service": "agentmemory",
"version": "0.6.1",
"health": {
"memory": { "heapUsed": 42000000, "heapTotal": 67000000 },
"cpu": { "percent": 2.1 },
"eventLoopLagMs": 1.2,
"status": "healthy"
},
"circuitBreaker": { "state": "closed", "failures": 0 }
}

If you prefer not to use the plugin, add hooks directly to ~/.claude/settings.json:
{
"hooks": {
"SessionStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-start.mjs" }],
"UserPromptSubmit": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/prompt-submit.mjs" }],
"PreToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-tool-use.mjs" }],
"PostToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-use.mjs" }],
"PostToolUseFailure": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-failure.mjs" }],
"PreCompact": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-compact.mjs" }],
"SubagentStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-start.mjs" }],
"SubagentStop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-stop.mjs" }],
"Notification": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/notification.mjs" }],
"TaskCompleted": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/task-completed.mjs" }],
"Stop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/stop.mjs" }],
"SessionEnd": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-end.mjs" }]
}
}

Freshness note:
- the shipped `prompt-submit`, `post-tool-use`, `post-tool-failure`, and `stop` hooks now forward `turn_id` when the host provides it, so observations from the same turn can be stitched into a single turn capsule
- Claude Code currently relies on the `Stop` hook plus `last_assistant_message` for final-turn freshness; that is the default supported path today
- if your host/runtime exposes a dedicated final assistant-result hook, point it at `node ~/agentmemory/dist/hooks/assistant-result.mjs` so the latest assistant conclusion is available to retrieval before summarization finishes
PostToolUse hook fires
-> Dedup check SHA-256 hash (5min window, no duplicates)
-> mem::privacy Strip secrets, API keys, <private> tags
-> mem::observe Store raw observation, push to real-time stream
-> mem::compress LLM extracts: type, facts, narrative, concepts, files
Validates with Zod, scores quality (0-100)
Self-corrects on validation failure (1 retry)
Generates vector embedding for semantic search
SessionStart hook fires
-> mem::context Load recent sessions for this project
Hybrid search (BM25 + vector) across observations
Inject project profile (top concepts, files, patterns)
Apply token budget (default: 2000 tokens)
-> stdout Agent receives context in the conversation
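The budget step above can be sketched as follows. This is a minimal illustration assuming memories arrive pre-ranked by relevance and tokens are estimated at roughly 4 characters each; both are assumptions, not the shipped logic.

```typescript
// Greedy token-budget packing: keep the highest-ranked memories that fit.
interface Memory {
  title: string;
  body: string;
}

// Rough heuristic: ~4 characters per token (an assumption for this sketch).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function packContext(ranked: Memory[], budget = 2000): Memory[] {
  const picked: Memory[] = [];
  let used = 0;
  for (const m of ranked) {
    const cost = estimateTokens(m.title + m.body);
    if (used + cost > budget) continue; // skip items that would overflow
    picked.push(m);
    used += cost;
  }
  return picked;
}
```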
| Hook | Captures |
|---|---|
| `SessionStart` | Project path, session ID, working directory |
| `UserPromptSubmit` | User prompts (privacy-filtered) plus `turn_id` when available |
| `PreToolUse` | File access patterns + enriched context injection (Read, Write, Edit, Glob, Grep) |
| `PostToolUse` | Tool name, input, output, and `turn_id` when available |
| `PostToolUseFailure` | Failed tool invocations with error context and `turn_id` when available |
| `PreCompact` | Re-injects memory context before context compaction |
| `SubagentStart/Stop` | Sub-agent lifecycle events |
| `Notification` | System notifications |
| `TaskCompleted` | Task completion events |
| `Stop` | Persists the latest assistant message for the active turn via `last_assistant_message`, then triggers end-of-session summary |
| `SessionEnd` | Marks session complete |
Claude Code currently uses the Stop path above for final-turn freshness.
If another agent runtime provides a separate final assistant-result hook, route
it to dist/hooks/assistant-result.mjs to improve same-session freshness even
further.
agentmemory uses triple-stream retrieval combining three signals for maximum recall.
| Stream | What it does | When |
|---|---|---|
| BM25 | Stemmed keyword matching with synonym expansion and binary-search prefix matching | Always on |
| Vector | Cosine similarity over dense embeddings (Xenova, OpenAI, Gemini, Voyage, Cohere, OpenRouter) | Any embedding provider configured |
| Graph | Knowledge graph traversal via entity matching and co-occurrence edges | Entities detected in query |
All three streams are fused with Reciprocal Rank Fusion (RRF, k=60) and session-diversified (max 3 results per session) to maximize coverage.
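The fusion step can be sketched as follows: a minimal RRF illustration assuming each stream yields an ordered list of result IDs (session diversification omitted, and the exact rank offset in the shipped code may differ).

```typescript
// Reciprocal Rank Fusion: each stream contributes 1 / (k + rank) per
// result; scores are summed across streams and results re-sorted.
function rrfFuse(streams: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranked of streams) {
    ranked.forEach((id, rank) => {
      // rank is 0-based here, so the first hit scores 1 / (k + 1)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

A result that appears in several streams accumulates score from each, which is why fused ranking rewards agreement between BM25, vector, and graph hits.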
BM25 enhancements (v0.6.0): Porter stemmer normalizes word forms ("authentication" ↔ "authenticating"), coding-domain synonyms expand queries ("db" ↔ "database", "perf" ↔ "performance"), and binary-search prefix matching replaces O(n) scans.
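Query-time synonym expansion might look like this sketch, with an illustrative (not the shipped) synonym map:

```typescript
// Coding-domain synonym map; entries here are examples from the text
// above, not the project's actual dictionary.
const SYNONYMS: Record<string, string[]> = {
  db: ["database"],
  perf: ["performance"],
  auth: ["authentication"],
};

// Expand a query into its original terms plus any mapped synonyms,
// deduplicated, so BM25 can match documents using either wording.
function expandQuery(query: string): string[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const expanded = new Set(terms);
  for (const t of terms) {
    for (const syn of SYNONYMS[t] ?? []) expanded.add(syn);
  }
  return [...expanded];
}
```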
agentmemory auto-detects which provider to use. For best results, install local embeddings (no API key needed):
npm install @xenova/transformers

| Provider | Model | Dimensions | Env Var | Notes |
|---|---|---|---|---|
| Local (recommended) | `all-MiniLM-L6-v2` | 384 | `EMBEDDING_PROVIDER=local` | Free, offline, +8pp recall over BM25-only |
| Gemini | `gemini-embedding-2-preview` | 3072 full / configurable lower | `GEMINI_API_KEY` | Set `GEMINI_EMBEDDING_MODEL` or `GEMINI_EMBEDDING_DIMENSIONS` to override |
| OpenAI | `text-embedding-3-small` | 1536 | `OPENAI_API_KEY` | $0.02/1M tokens |
| Voyage AI | `voyage-code-3` | 1024 | `VOYAGE_API_KEY` | Optimized for code |
| Cohere | `embed-english-v3.0` | 1024 | `COHERE_API_KEY` | Free trial available |
| OpenRouter | Any embedding model | varies | `OPENROUTER_API_KEY` | Multi-model proxy |
No embedding provider? BM25-only mode with stemming and synonyms still outperforms built-in memory.
Smart search returns compact results first (title, type, score, timestamp) to save tokens. Expand specific IDs to get full observation details.
# Compact results (50-100 tokens each)
curl -X POST http://localhost:3111/agentmemory/smart-search \
-d '{"query": "database migration"}'
# Expand specific results (500-1000 tokens each)
curl -X POST http://localhost:3111/agentmemory/smart-search \
  -d '{"expandIds": ["obs_abc123", "obs_def456"]}'

Memories in agentmemory are not static. They version, evolve, and form relationships.
When you save a memory that's similar to an existing one (Jaccard > 0.7), the old memory is superseded:
v1: "Use Express for API routes"
v2: "Use Fastify instead of Express for API routes" (supersedes v1)
v3: "Use Hono instead of Fastify for Edge API routes" (supersedes v2)
Only the latest version is returned in search results. The full chain is preserved for audit.
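The similarity gate can be sketched as follows, assuming whitespace-token Jaccard over the memory text with the 0.7 threshold from above; the real tokenization may differ.

```typescript
// Jaccard similarity over word sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  const inter = [...ta].filter((t) => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}

// A new memory supersedes an old one when similarity exceeds 0.7.
function shouldSupersede(newMem: string, oldMem: string): boolean {
  return jaccard(newMem, oldMem) > 0.7;
}
```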
Memories can be linked: supersedes, extends, derives, contradicts, related. Each relationship carries a confidence score (0-1) computed from co-occurrence, recency, and relation type. Traversal follows these links up to N hops, with optional minConfidence filtering.
agentmemory automatically cleans itself:
| Mechanism | What it does |
|---|---|
| TTL expiry | Memories with forgetAfter date are deleted when expired |
| Contradiction detection | Near-duplicate memories (Jaccard > 0.9) — older one is demoted |
| Low-value eviction | Observations older than 90 days with importance < 3 are removed |
| Per-project cap | Projects are capped at 10,000 observations (lowest importance evicted first) |
Run POST /agentmemory/auto-forget?dryRun=true to preview what would be cleaned.
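The low-value rule from the table can be sketched as follows; field names here are illustrative, not the actual storage schema.

```typescript
// Drop observations older than 90 days whose importance is below 3.
interface Obs {
  id: string;
  importance: number; // low-to-high importance score (scale assumed)
  createdAt: number;  // epoch milliseconds
}

function evictLowValue(all: Obs[], now = Date.now()): Obs[] {
  const ninetyDays = 90 * 24 * 60 * 60 * 1000;
  return all.filter(
    (o) => !(now - o.createdAt > ninetyDays && o.importance < 3)
  );
}
```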
agentmemory aggregates observations into per-project intelligence:
curl "http://localhost:3111/agentmemory/profile?project=/my/project"

Returns top concepts, most-touched files, coding conventions, common errors, and a session count. This profile is automatically injected into session context.
Navigate observations chronologically around any anchor point:
curl -X POST http://localhost:3111/agentmemory/timeline \
  -d '{"anchor": "2026-02-15", "before": 5, "after": 5}'

Full data portability:
# Export everything
curl http://localhost:3111/agentmemory/export > backup.json
# Import with merge strategy
curl -X POST http://localhost:3111/agentmemory/import \
  -d '{"exportData": ..., "strategy": "merge"}'

Strategies: merge (combine), replace (overwrite), skip (ignore duplicates).
agentmemory monitors its own health and validates its own output.
Every LLM compression is scored 0-100 based on structured facts, narrative quality, concept extraction, title quality, and importance range. Scores are tracked per-function and exposed via /health.
When LLM output fails Zod validation, agentmemory retries with a stricter prompt explaining the exact errors. This recovers from malformed JSON, missing fields, and out-of-range values.
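A sketch of that retry loop, with the LLM call and the Zod schema abstracted behind stand-in function types (the real prompt wording and schema are the project's, not shown here):

```typescript
type LlmCall = (prompt: string) => Promise<string>;
type Validate = (raw: string) => string[]; // empty array = valid

// Run compression once; on validation failure, retry exactly once with a
// stricter prompt that names the exact errors. Returns null if the retry
// also fails, so the caller can fall back to storing the raw observation.
async function compressWithRetry(
  llm: LlmCall,
  validate: Validate,
  prompt: string
): Promise<string | null> {
  let output = await llm(prompt);
  let errors = validate(output);
  if (errors.length === 0) return output;
  output = await llm(
    `${prompt}\n\nYour previous output was invalid:\n- ${errors.join(
      "\n- "
    )}\nReturn corrected JSON only.`
  );
  errors = validate(output);
  return errors.length === 0 ? output : null;
}
```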
Primary provider fails
-> Circuit breaker opens (3 failures in 60s)
-> Falls back to next provider in FALLBACK_PROVIDERS chain
-> 30s cooldown -> half-open -> test call -> recovery
Configure with FALLBACK_PROVIDERS=anthropic,gemini,openrouter. When all providers are down, observations are stored raw without compression. No data is lost.
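The breaker logic above can be sketched as follows, using the thresholds from the text (3 failures in 60s, 30s cooldown); the provider fallback chain is omitted.

```typescript
// Minimal circuit breaker: opens after 3 failures inside a 60s window,
// transitions to half-open after a 30s cooldown, closes on success.
class CircuitBreaker {
  private failures: number[] = []; // timestamps of recent failures
  private openedAt: number | null = null;

  state(now = Date.now()): "closed" | "open" | "half-open" {
    if (this.openedAt === null) return "closed";
    return now - this.openedAt >= 30_000 ? "half-open" : "open";
  }

  recordFailure(now = Date.now()): void {
    // keep only failures within the 60s window
    this.failures = this.failures.filter((t) => now - t < 60_000);
    this.failures.push(now);
    if (this.failures.length >= 3) this.openedAt = now;
  }

  recordSuccess(): void {
    // a successful test call in half-open state closes the breaker
    this.failures = [];
    this.openedAt = null;
  }
}
```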
Collects every 30 seconds: heap usage, CPU percentage (delta sampling), event loop lag, connection state. Alerts at warning (80% CPU, 100ms lag) and critical (90% CPU, 500ms lag) thresholds. GET /agentmemory/health returns HTTP 503 when critical.
| Tool | Description |
|---|---|
| `memory_recall` | Search past observations by keyword |
| `memory_save` | Save an insight, decision, or pattern |
| `memory_file_history` | Get past observations about specific files |
| `memory_patterns` | Detect recurring patterns across sessions |
| `memory_sessions` | List recent sessions with status |
| `memory_smart_search` | Hybrid semantic + keyword search with progressive disclosure |
| `memory_timeline` | Chronological observations around an anchor point |
| `memory_profile` | Project profile with top concepts, files, patterns |
| `memory_export` | Export all memory data as JSON |
| `memory_relations` | Query memory relationship graph (with confidence filtering) |
| `memory_claude_bridge_sync` | Sync memory to/from Claude Code's native MEMORY.md |
| `memory_graph_query` | Query the knowledge graph for entities and relationships |
| `memory_consolidate` | Run 4-tier memory consolidation pipeline |
| `memory_team_share` | Share a memory or observation with team members |
| `memory_team_feed` | Get recent shared items from all team members |
| `memory_audit` | View the audit trail of memory operations |
| `memory_governance_delete` | Delete specific memories with audit trail |
| `memory_snapshot_create` | Create a git-versioned snapshot of memory state |
| `memory_action_create` | Create actionable work items with typed dependencies |
| `memory_action_update` | Update action status, priority, or details |
| `memory_frontier` | Get unblocked actions ranked by priority and urgency |
| `memory_next` | Get the single most important next action |
| `memory_lease` | Acquire, release, or renew exclusive action leases |
| `memory_routine_run` | Instantiate a frozen workflow routine into action chains |
| `memory_signal_send` | Send threaded messages between agents |
| `memory_signal_read` | Read messages for an agent with read receipts |
| `memory_checkpoint` | Create or resolve external condition gates (CI, approval, deploy) |
| `memory_mesh_sync` | Sync memories and actions with peer instances |
| `memory_sentinel_create` | Create event-driven condition watchers |
| `memory_sentinel_trigger` | Externally fire a sentinel to unblock gated actions |
| `memory_sketch_create` | Create ephemeral action graphs for exploratory work |
| `memory_sketch_promote` | Promote sketch actions to permanent actions |
| `memory_crystallize` | LLM-powered compaction of completed action chains |
| `memory_diagnose` | Health checks across all subsystems |
| `memory_heal` | Auto-fix stuck, orphaned, and inconsistent state |
| `memory_facet_tag` | Attach structured dimension:value tags to targets |
| `memory_facet_query` | Query targets by facet tags with AND/OR logic |
| `memory_verify` | Trace a memory's provenance back to source observations and sessions |
| URI | Description |
|---|---|
| `agentmemory://status` | Session count, memory count, health status |
| `agentmemory://project/{name}/profile` | Per-project intelligence (concepts, files, conventions) |
| `agentmemory://project/{name}/recent` | Last 5 session summaries for a project |
| `agentmemory://memories/latest` | Latest 10 active memories (id, title, type, strength) |
| `agentmemory://graph/stats` | Knowledge graph node and edge counts by type |
| `agentmemory://team/{id}/profile` | Team memory profile with shared concepts and patterns |
| Prompt | Arguments | Description |
|---|---|---|
| `recall_context` | `task_description` | Searches observations + memories, returns context messages |
| `session_handoff` | `session_id` | Returns session data + summary for handoff between agents |
| `detect_patterns` | `project` (optional) | Analyzes recurring patterns across sessions |
Run agentmemory as a standalone MCP server for MCP-compatible agents such as Cursor, Gemini CLI, Windsurf, or stock Codex clients that are using MCP only:
npx agentmemory-mcp

Or add to your agent's MCP config:
{
"mcpServers": {
"agentmemory": {
"command": "npx",
"args": ["agentmemory-mcp"]
}
}
}

The standalone server uses in-memory KV with optional JSON persistence (STANDALONE_PERSIST_PATH).
Important:
- this standalone MCP path is not equivalent to native lifecycle capture
- a Codex fork that posts `SessionStart`, `UserPromptSubmit`, `PostToolUse`, `Stop`, or `AssistantResult`-style events directly into `/agentmemory/observe` is using a different, richer integration level
- this repo ships the Claude Code native hook/plugin path; Codex-native adapter integrations are host-specific and must be implemented by the fork
GET /agentmemory/mcp/tools — List available tools
POST /agentmemory/mcp/call — Execute a tool
GET /agentmemory/mcp/resources — List available resources
POST /agentmemory/mcp/resources/read — Read a resource by URI
GET /agentmemory/mcp/prompts — List available prompts
POST /agentmemory/mcp/prompts/get    — Get a prompt with arguments

Four slash commands for interacting with memory:
| Skill | Usage |
|---|---|
| `/recall` | Search memory for past context (`/recall auth middleware`) |
| `/remember` | Save something to long-term memory (`/remember always use jose for JWT`) |
| `/session-history` | Show recent session summaries |
| `/forget` | Delete specific observations or entire sessions |
agentmemory includes a real-time web dashboard that auto-starts on port 3113 (configurable via III_REST_PORT + 2).
- Live observation stream via WebSocket
- Session explorer with observation details
- Memory browser with search and filtering
- Knowledge graph visualization
- Health and metrics dashboard
Access at http://localhost:3113 or via GET /agentmemory/viewer on the API port. Protected by AGENTMEMORY_SECRET when set. CSP headers applied to all HTML responses.
agentmemory needs an LLM for compressing observations and generating summaries. It auto-detects from your environment.
| Provider | Config | Notes |
|---|---|---|
| Claude subscription (default) | No config needed | Uses @anthropic-ai/claude-agent-sdk. Zero cost beyond your Max/Pro plan |
| Anthropic API | `ANTHROPIC_API_KEY` | Direct API access, per-token billing |
| Gemini | `GEMINI_API_KEY` | Also enables Gemini embeddings (free tier) |
| OpenRouter | `OPENROUTER_API_KEY` | Access any model through one API |
No API key? agentmemory uses your Claude subscription automatically. Zero config.
Create ~/.agentmemory/.env:
# LLM provider (pick one, or leave empty for Claude subscription)
ANTHROPIC_API_KEY=sk-ant-...
# GEMINI_API_KEY=...
# GEMINI_MODEL=gemini-flash-latest
# GEMINI_EMBEDDING_MODEL=gemini-embedding-2-preview
# GEMINI_EMBEDDING_DIMENSIONS=3072
# OPENROUTER_API_KEY=...
# Embedding provider (auto-detected from LLM keys, or override)
# EMBEDDING_PROVIDER=gemini
# VOYAGE_API_KEY=...
# OPENAI_API_KEY=...
# COHERE_API_KEY=...
# Hybrid search weights (quality-leaning Gemini profile)
# BM25_WEIGHT=0.15
# VECTOR_WEIGHT=0.85
# Provider fallback chain (comma-separated, tried in order)
# FALLBACK_PROVIDERS=anthropic,gemini,openrouter
# Bearer token for API auth
# AGENTMEMORY_SECRET=your-secret-here
# Engine connection
# III_ENGINE_URL=ws://localhost:49134
# III_REST_PORT=3111
# III_STREAMS_PORT=3112
# Viewer runs on III_REST_PORT + 2 (default: 3113)
# Memory tuning
# TOKEN_BUDGET=8000
# MAX_TOKENS=8192
# MAX_OBS_PER_SESSION=500
# Claude Code Memory Bridge (v0.5.0)
# CLAUDE_MEMORY_BRIDGE=false
# CLAUDE_MEMORY_LINE_BUDGET=200
# Standalone MCP Server (v0.5.0)
# STANDALONE_MCP=false
# STANDALONE_PERSIST_PATH=~/.agentmemory/standalone.json
# Knowledge Graph (v0.5.0)
# GRAPH_EXTRACTION_ENABLED=true
# GRAPH_EXTRACTION_BATCH_SIZE=10
# Consolidation Pipeline (v0.5.0)
# CONSOLIDATION_ENABLED=true
# CONSOLIDATION_DECAY_DAYS=30
# Team Memory (v0.5.0)
# TEAM_ID=
# USER_ID=
# TEAM_MODE=private
# Git Snapshots (v0.5.0)
# SNAPSHOT_ENABLED=false
# SNAPSHOT_INTERVAL=3600
# SNAPSHOT_DIR=~/.agentmemory/snapshots

95 endpoints on port 3111 (89 core + 6 MCP protocol). Protected endpoints require Authorization: Bearer <secret> when AGENTMEMORY_SECRET is set. The table below shows a representative subset; see src/api.ts for the full endpoint list.
| Method | Path | Description |
|---|---|---|
| `GET` | `/agentmemory/health` | Health check with metrics (always public) |
| `GET` | `/agentmemory/livez` | Liveness probe (always public) |
| `POST` | `/agentmemory/session/start` | Start session + get context |
| `POST` | `/agentmemory/session/end` | Mark session complete |
| `POST` | `/agentmemory/observe` | Capture observation |
| `POST` | `/agentmemory/context` | Generate context |
| `POST` | `/agentmemory/search` | Search observations (BM25) |
| `POST` | `/agentmemory/smart-search` | Hybrid search with progressive disclosure |
| `POST` | `/agentmemory/summarize` | Generate session summary |
| `POST` | `/agentmemory/remember` | Save to long-term memory |
| `POST` | `/agentmemory/forget` | Delete observations/sessions |
| `POST` | `/agentmemory/consolidate` | Merge duplicate observations |
| `POST` | `/agentmemory/patterns` | Detect recurring patterns |
| `POST` | `/agentmemory/generate-rules` | Generate CLAUDE.md rules from patterns |
| `POST` | `/agentmemory/file-context` | Get file-specific history |
| `POST` | `/agentmemory/enrich` | Unified enrichment (file context + memories + bugs) |
| `POST` | `/agentmemory/evict` | Evict stale memories (?dryRun=true) |
| `POST` | `/agentmemory/migrate` | Import from SQLite |
| `POST` | `/agentmemory/timeline` | Chronological observations around anchor |
| `POST` | `/agentmemory/relations` | Create memory relationship (with confidence) |
| `POST` | `/agentmemory/evolve` | Evolve memory (new version) |
| `POST` | `/agentmemory/auto-forget` | Run auto-forget (?dryRun=true) |
| `POST` | `/agentmemory/import` | Import data from JSON |
| `GET` | `/agentmemory/profile` | Project profile (?project=/path) |
| `GET` | `/agentmemory/export` | Export all data as JSON |
| `GET` | `/agentmemory/sessions` | List all sessions |
| `GET` | `/agentmemory/observations` | Session observations (?sessionId=X) |
| `GET` | `/agentmemory/viewer` | Real-time web viewer (also at http://localhost:3113) |
| `GET` | `/agentmemory/claude-bridge/read` | Read Claude Code native MEMORY.md |
| `POST` | `/agentmemory/claude-bridge/sync` | Sync memories to MEMORY.md |
| `POST` | `/agentmemory/graph/query` | Query knowledge graph (BFS traversal) |
| `GET` | `/agentmemory/graph/stats` | Knowledge graph node/edge counts |
| `POST` | `/agentmemory/graph/extract` | Extract entities from observations |
| `POST` | `/agentmemory/consolidate-pipeline` | Run 4-tier consolidation pipeline |
| `POST` | `/agentmemory/team/share` | Share memory with team members |
| `GET` | `/agentmemory/team/feed` | Recent shared items from team |
| `GET` | `/agentmemory/team/profile` | Aggregated team memory profile |
| `GET` | `/agentmemory/audit` | Query audit trail (?operation=X&limit=N) |
| `DELETE` | `/agentmemory/governance/memories` | Delete specific memories with audit |
| `POST` | `/agentmemory/governance/bulk-delete` | Bulk delete by type/date/quality |
| `GET` | `/agentmemory/snapshots` | List git snapshots |
| `POST` | `/agentmemory/snapshot/create` | Create git-versioned snapshot |
| `POST` | `/agentmemory/snapshot/restore` | Restore from snapshot commit |
| `GET` | `/agentmemory/mcp/tools` | List MCP tools |
| `POST` | `/agentmemory/mcp/call` | Execute MCP tool |
| `GET` | `/agentmemory/mcp/resources` | List MCP resources |
| `POST` | `/agentmemory/mcp/resources/read` | Read MCP resource by URI |
| `GET` | `/agentmemory/mcp/prompts` | List MCP prompts |
| `POST` | `/agentmemory/mcp/prompts/get` | Get MCP prompt with arguments |
claude plugins marketplace add /path/to/agentmemory
claude plugins install agentmemory

Start a fresh Claude Code session. All 12 hooks, 4 skills, and 38 MCP tools are registered automatically.
claude plugins install agentmemory # Install
claude plugins disable agentmemory # Disable without uninstalling
claude plugins enable agentmemory # Re-enable
claude plugins uninstall agentmemory  # Remove

agentmemory is built on iii-engine's three primitives:
| What you'd normally need | What agentmemory uses |
|---|---|
| Express.js / Fastify | iii HTTP Triggers |
| SQLite / Postgres + pgvector | iii KV State + in-memory vector index |
| SSE / Socket.io | iii Streams (WebSocket) |
| pm2 / systemd | iii-engine worker management |
| Prometheus / Grafana | iii OTEL + built-in health monitor |
| Redis (circuit breaker) | In-process circuit breaker + fallback chain |
105+ source files. ~16,000 LOC. 573 tests. Zero external DB dependencies.
| Function | Purpose |
|---|---|
| `mem::observe` | Store raw observation with dedup check |
| `mem::compress` | LLM compression with validation + quality scoring + embedding |
| `mem::search` | BM25-ranked full-text search |
| `mem::smart-search` | Hybrid search with progressive disclosure |
| `mem::context` | Build session context within token budget |
| `mem::summarize` | Generate validated session summaries |
| `mem::remember` | Save to long-term memory (auto-supersedes similar) |
| `mem::forget` | Delete observations, sessions, or memories |
| `mem::file-index` | File-specific observation lookup |
| `mem::consolidate` | Merge duplicate observations |
| `mem::patterns` | Detect recurring patterns |
| `mem::generate-rules` | Generate CLAUDE.md rules from patterns |
| `mem::migrate` | Import from SQLite |
| `mem::evict` | Age + importance + cap-based memory eviction |
| `mem::relate` | Create relationships between memories |
| `mem::evolve` | Create a new version of a memory |
| `mem::get-related` | Traverse the memory relationship graph |
| `mem::timeline` | Chronological observations around an anchor |
| `mem::profile` | Aggregate project profile |
| `mem::auto-forget` | TTL expiry + contradiction detection |
| `mem::enrich` | Unified enrichment (file context + observations + bug memories) |
| `mem::export` / `mem::import` | Full JSON round-trip (v0.3.0 + v0.4.0 + v0.5.0 formats) |
| `mem::claude-bridge-read` | Read Claude Code's native MEMORY.md |
| `mem::claude-bridge-sync` | Sync top memories back to MEMORY.md |
| `mem::graph-extract` | LLM-powered entity extraction from observations |
| `mem::graph-query` | BFS traversal of the knowledge graph |
| `mem::graph-stats` | Node/edge counts by type |
| `mem::consolidate-pipeline` | 4-tier memory consolidation with strength decay |
| `mem::team-share` | Share a memory/observation with a team namespace |
| `mem::team-feed` | Fetch recent shared items from a team |
| `mem::team-profile` | Aggregate team concepts, files, patterns |
| `mem::governance-delete` | Delete specific memories with an audit trail |
| `mem::governance-bulk` | Bulk delete by type/date/quality filter |
| `mem::snapshot-create` | Git-commit the memory state |
| `mem::snapshot-list` | List all snapshots |
| `mem::snapshot-restore` | Restore memory from a snapshot commit |
| `mem::action-create` / `action-update` | Dependency-aware work items with typed edges |
| `mem::frontier` / `mem::next` | Priority-ranked queue of unblocked actions |
| `mem::lease-acquire` / `release` / `renew` | TTL-based atomic agent claims |
| `mem::routine-create` / `run` / `status` | Frozen workflow templates instantiated into action chains |
| `mem::signal-send` / `read` / `threads` | Threaded inter-agent messaging with read receipts |
| `mem::checkpoint-create` / `resolve` | External condition gates (CI, approval, deploy) |
| `mem::flow-compress` | LLM-powered summarization of completed action chains |
| `mem::mesh-register` / `sync` / `receive` | P2P sync between agentmemory instances |
| `mem::detect-worktree` / `branch-sessions` | Git worktree detection for shared memory |
| `mem::sentinel-create` / `trigger` / `check` | Event-driven condition watchers (webhook, timer, threshold, pattern, approval) |
| `mem::sketch-create` / `add` / `promote` / `discard` | Ephemeral action graphs for exploratory work, with auto-expiry |
| `mem::crystallize` / `auto-crystallize` | LLM-powered compaction of completed action chains into crystal digests |
| `mem::diagnose` / `heal` | Self-diagnosis across 8 categories with auto-fix for stuck/orphaned/stale state |
| `mem::facet-tag` / `query` / `stats` | Multi-dimensional tagging with AND/OR queries on actions, memories, observations |
| `mem::expand-query` | LLM-generated query reformulations for improved recall |
| `mem::sliding-window` | Context-window enrichment at ingestion (resolves pronouns, abbreviations) |
| `mem::temporal-graph` | Append-only versioned edges with point-in-time queries |
| `mem::retention-score` / `evict` | Ebbinghaus-inspired decay with tiered storage (hot/warm/cold/evictable) |
| `mem::graph-retrieval` | Entity search + chunk expansion + temporal queries via the knowledge graph |
| `mem::verify` | JIT verification — trace memory provenance back to source observations |
| `mem::cascade-update` | Propagate staleness to graph nodes, edges, and sibling memories |
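`mem::retention-score` above is described as Ebbinghaus-inspired decay with hot/warm/cold/evictable tiers. A toy sketch of how such a score could map onto those tiers — the exponential formula, half-life, and thresholds here are assumptions for illustration, not the shipped scoring:

```typescript
// Toy Ebbinghaus-style retention: R = importance * exp(-ageDays / halfLifeDays),
// bucketed into the four storage tiers named in the table above.
type Tier = "hot" | "warm" | "cold" | "evictable";

function retentionScore(importance: number, ageDays: number, halfLifeDays = 14): number {
  // importance in [0, 1]; the score decays exponentially with age.
  return importance * Math.exp(-ageDays / halfLifeDays);
}

function tierFor(score: number): Tier {
  if (score >= 0.5) return "hot";   // keep in the fastest store
  if (score >= 0.2) return "warm";
  if (score >= 0.05) return "cold";
  return "evictable";               // candidate for mem::evict
}
```

Under this sketch, a high-importance memory stays "hot" when fresh and drifts down through the tiers as it ages, which is the shape eviction needs: old, unimportant items become evictable without ever being deleted abruptly.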
| Scope | Stores |
|---|---|
| `mem:sessions` | Session metadata, project, timestamps |
| `mem:obs:{session_id}` | Compressed observations with embeddings |
| `mem:summaries` | End-of-session summaries |
| `mem:memories` | Long-term memories (versioned, with relationships) |
| `mem:relations` | Memory relationship graph |
| `mem:profiles` | Aggregated project profiles |
| `mem:emb:{obs_id}` | Vector embeddings |
| `mem:index:bm25` | Persisted BM25 index |
| `mem:metrics` | Per-function metrics |
| `mem:health` | Health snapshots |
| `mem:config` | Runtime configuration overrides |
| `mem:confidence` | Confidence scores for memories |
| `mem:claude-bridge` | Claude Code MEMORY.md bridge state |
| `mem:graph:nodes` | Knowledge graph entities |
| `mem:graph:edges` | Knowledge graph relationships |
| `mem:semantic` | Semantic memories (consolidated facts) |
| `mem:procedural` | Procedural memories (extracted workflows) |
| `mem:team:{id}:shared` | Team shared items |
| `mem:team:{id}:users:{uid}` | Per-user team state |
| `mem:team:{id}:profile` | Aggregated team profile |
| `mem:audit` | Audit trail for all operations |
| `mem:actions` | Dependency-aware work items |
| `mem:action-edges` | Typed edges (requires, unlocks, gated_by, etc.) |
| `mem:leases` | TTL-based agent work claims |
| `mem:routines` | Frozen workflow templates |
| `mem:routine-runs` | Instantiated routine execution tracking |
| `mem:signals` | Inter-agent messages with threading |
| `mem:checkpoints` | External condition gates |
| `mem:mesh` | Registered P2P sync peers |
| `mem:sentinels` | Event-driven condition watchers |
| `mem:sketches` | Ephemeral action graphs |
| `mem:crystals` | Compacted action chain digests |
| `mem:facets` | Multi-dimensional tags |
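The `{session_id}`, `{obs_id}`, `{id}`, and `{uid}` placeholders above are per-item key templates in the KV namespace. A small helper showing how such keys compose — the builder itself is a hypothetical illustration; only the `mem:*` prefixes come from the table:

```typescript
// Compose KV keys following the mem:* scope layout listed above.
// The helper is illustrative; the key templates mirror the documented scopes.
const keys = {
  observations: (sessionId: string) => `mem:obs:${sessionId}`,
  embedding: (obsId: string) => `mem:emb:${obsId}`,
  teamShared: (teamId: string) => `mem:team:${teamId}:shared`,
  teamUser: (teamId: string, userId: string) => `mem:team:${teamId}:users:${userId}`,
};
```

Centralizing key construction like this keeps prefix scans (e.g. listing every observation in a session) and per-item lookups consistent across callers.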
```shell
npm run dev                # Hot reload
npm run build              # Production build (365KB)
npm test                   # Unit tests (573 tests, ~1.5s)
npm run test:integration   # API tests (requires running services)
```

- Node.js >= 18
- Docker
This repository is distributed under Apache-2.0.
If you publish or redistribute this fork:
- keep the LICENSE file with the source and any redistributions
- keep this fork's NOTICE file with the source and any redistributions
- retain upstream copyright and attribution notices that still apply
- clearly mark any files you modify when redistributing in source form, per Apache-2.0 section 4(b)
Original upstream project: rohitg00/agentmemory

