Persistent, searchable memory for AI assistants. Works with Claude Desktop, Cursor, Continue, and any MCP-compatible client.
Store conversations, decisions, preferences, and facts in plain Markdown files with hybrid vector + keyword search.
Based on the excellent work by Manthan Gupta
- Hybrid Search - Vector embeddings + BM25 keyword matching (70/30 blend)
- Plain Markdown - Human-readable files you can edit directly
- Maintenance LLM (OpenAI) - Promotion and deep compaction run via maintenance scripts
- Smart Compaction - TF-IDF clustering catches semantic duplicates
- Concurrent Access - WAL mode supports multiple connections
- Team Ready - Distributed sync architecture (Phase 2)
```bash
git clone <repo-url>
cd memory-mcp-node
npm install
npm run build
```

Run the guided setup (recommended) to generate a client config:

```bash
npm run init
```

Or add the MCP server manually (example):
```json
{
  "mcpServers": {
    "memory": {
      "command": "node",
      "args": ["/absolute/path/to/memory-mcp-node/dist/index.js"]
    }
  }
}
```

If your client supports it, add the instructions from MEMORY_PROTOCOL.md to your agent rules.
Maintenance runs outside MCP and uses OpenAI for promotion/deep compaction.
Create a repo-local `.env` with `OPENAI_API_KEY`, then run:

```bash
npm run maintenance -- --action check
```

| Technology | What It Does | Why We Chose It |
|---|---|---|
| Node.js 18+ | Runtime environment | Native ES modules, excellent async I/O, ubiquitous in AI tooling ecosystem |
| TypeScript | Type-safe JavaScript | Catches bugs at compile time, better IDE support, self-documenting code |
| SQLite + sqlite-vec | Vector database with FTS5 | Zero-config embedded database, native vector search, full-text search in one package. No external services needed |
| WAL Mode | Write-ahead logging | Enables concurrent reads during writes, critical for multiple MCP connections from different clients |
| Transformers.js Embeddings | Xenova/all-MiniLM-L6-v2 | Local semantic search, 384 dimensions, no external embedding API |
| BM25 (FTS5) | Keyword search | Industry-standard relevance ranking, handles exact matches that vector search misses |
| TF-IDF Clustering | Topic grouping | Groups semantically similar entries before deduplication, catches "Chose Stripe" + "Using Stripe API" as related |
| Markdown Files | Storage format | Human-readable, git-friendly, editable with any text editor, survives tool changes |
| MCP Protocol | AI tool interface | Anthropic's standard for tool use, works with Claude, Cursor, Continue, and growing ecosystem |
| dotenv | Configuration | Simple secrets management, 12-factor app compliance, easy local development |
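To make the embedding step concrete, here is a minimal sketch using Transformers.js with the model named above; the repo's actual wrapper code may differ.

```typescript
// Minimal embedding sketch (assumed usage — the repo's own wrapper may
// differ). Xenova/all-MiniLM-L6-v2 yields a 384-dimension vector per input.
import { pipeline } from "@xenova/transformers";

const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

export async function embed(text: string): Promise<Float32Array> {
  // Mean-pool token embeddings and L2-normalize — the standard setup
  // for sentence similarity with this model.
  const output = await extractor(text, { pooling: "mean", normalize: true });
  return output.data as Float32Array; // length 384
}
```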
```mermaid
flowchart TB
subgraph clients [MCP Clients]
Claude[Claude Desktop]
Cursor[Cursor IDE]
Continue[Continue.dev]
end
subgraph server [Memory MCP Server]
Tools[MCP Tools Layer]
subgraph core [Core Services]
Indexer[Indexer]
Search[Hybrid Search]
Compact[Compaction]
Promote[Promotion]
end
subgraph storage [Storage Layer]
SQLite[(SQLite + WAL)]
Files[Markdown Files]
end
end
subgraph external [External APIs]
OpenAI["OpenAI API (maintenance)"]
end
clients -->|MCP Protocol| Tools
Tools --> core
Promote -->|Scoring| OpenAI
Compact -.->|Deep Mode| OpenAI
core --> storage
```

```mermaid
sequenceDiagram
participant Client as MCP Client
participant Server as Memory Server
participant Embed as Local Embeddings
participant Vec as Vector Search
participant BM25 as BM25 Search
participant DB as SQLite
Client->>Server: memory_search(query)
Server->>Embed: Generate query embedding
Embed-->>Server: 384-dim vector
par Parallel Search
Server->>Vec: Vector similarity search
Vec->>DB: SELECT with cosine distance
DB-->>Vec: Top K results
and
Server->>BM25: Keyword search
BM25->>DB: FTS5 MATCH query
DB-->>BM25: Top K results
end
Server->>Server: Merge results (70% vec + 30% BM25)
Server->>Server: Deduplicate and rank
Server-->>Client: Ranked memory chunks
```
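To make the merge step concrete, here is a hedged sketch of the 70/30 blend; the server's actual normalization and deduplication logic may differ.

```typescript
// Illustrative 70/30 blend of vector and BM25 results (names and
// normalization are assumptions, not the server's exact code).
interface Hit { id: string; score: number }

function blendResults(vecHits: Hit[], bm25Hits: Hit[], k = 10): Hit[] {
  // Scale each list into [0, 1] so the two score spaces are comparable.
  const norm = (hits: Hit[]) => {
    const max = Math.max(1e-9, ...hits.map((h) => h.score));
    return new Map(hits.map((h) => [h.id, h.score / max]));
  };
  const vec = norm(vecHits);
  const bm25 = norm(bm25Hits);
  const ids = new Set([...vec.keys(), ...bm25.keys()]); // dedupe by id
  return [...ids]
    .map((id) => ({ id, score: 0.7 * (vec.get(id) ?? 0) + 0.3 * (bm25.get(id) ?? 0) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```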
```mermaid
flowchart LR
subgraph daily [Daily Memory]
Store[memory_store] --> Daily[memory/YYYY-MM-DD.md]
Daily --> Compact[Compaction]
end
subgraph longterm [Long-Term Memory]
Compact --> Promote{Promotion Score}
Promote -->|Score >= 0.8| MEMORY[MEMORY.md]
Promote -->|Score < 0.8| Archive[Retained in daily]
end
subgraph cleanup [Maintenance]
Retention[Retention Policy] --> Delete[Delete old files]
Delete --> Vacuum[Clean DB entries]
end
Daily --> Retention
```
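In code terms, the promotion gate above reduces to a threshold check. This sketch assumes scores come back from the maintenance LLM and uses the default 0.8 from config.json; the real script's file handling is likely more involved.

```typescript
// Sketch of the promotion gate (assumed shape, not the script's code).
import { appendFileSync } from "node:fs";

interface ScoredEntry { text: string; score: number } // scored during maintenance

function promote(entries: ScoredEntry[], threshold = 0.8): void {
  for (const entry of entries) {
    if (entry.score >= threshold) {
      appendFileSync(".memory/MEMORY.md", `\n- ${entry.text}`);
    }
    // Entries below the threshold stay in their daily file.
  }
}
```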
```mermaid
flowchart TB
subgraph root [.memory/]
MEMORY[MEMORY.md<br/>Long-term memories]
DB[(index.sqlite<br/>Search index)]
WAL[index.sqlite-wal<br/>WAL journal]
subgraph daily [memory/]
D1[2025-01-26.md]
D2[2025-01-27.md]
D3[2025-01-28.md]
end
subgraph team [team/]
T1[Synced team knowledge<br/>Phase 2]
end
end
style MEMORY fill:#e1f5fe
style DB fill:#fff3e0
style daily fill:#f3e5f5
style team fill:#e8f5e9
```
`MEMORY.md` - Promoted long-term memories:

```markdown
# Memory
## User Preferences
- Prefers TypeScript over JavaScript
- Uses Vim keybindings
## Important Decisions
### 2025-01-28
Chose PostgreSQL for the new project because...
## Key Contacts
- Alice (Tech Lead) - alice@company.com
```

`memory/2025-01-28.md` - Daily conversation memory:

```markdown
# 2025-01-28
## 10:30
Working on the authentication module. Decided to use JWT tokens
with refresh token rotation for better security.
## 14:15
User prefers detailed explanations over brief answers when
discussing architecture decisions.
```

`.env` holds API keys and paths. Create it in the repo root:
```
# Optional - for maintenance LLM (promotion, deep compaction)
OPENAI_API_KEY=sk-...
```
All other configuration lives in `config.json` in the repo root:

```json
{
"embeddingModel": "Xenova/all-MiniLM-L6-v2",
"maxDailyChats": 180,
"maintenance": {
"compactionThresholdKB": 50,
"compactionThresholdEntries": 30,
"compactionMode": "quick",
"promotionScoreThreshold": 0.8,
"promotionLookbackDays": 30,
"autoMaintenanceIntervalHours": 24
},
"distributed": {
"enabled": false,
"autoSync": true
}
}
```

| Setting | Description |
|---|---|
| `embeddingModel` | Transformers.js embedding model |
| `compactionMode` | `quick` (TF-IDF + dedup) or `deep` (LLM summarization) |
| `distributed` | Team sync settings (Phase 2) |
`~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
"mcpServers": {
"memory": {
"command": "node",
"args": ["/absolute/path/to/memory-mcp-node/dist/index.js"]
}
}
}
```

Settings → Features → MCP Servers:

```json
{
"memory": {
"command": "node",
"args": ["/absolute/path/to/memory-mcp-node/dist/index.js"]
}
}
```

Tip: Run `npm run init` to get the exact paths for your system.
Codex reads MCP settings from `~/.codex/config.toml` (TOML format, not JSON).
Add the MCP server there using the Codex-specific format.
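As a rough guide (verify key names against the current Codex docs), the entry usually looks something like this:

```toml
# Assumed shape for ~/.codex/config.toml — check the Codex documentation
# for the exact key names before relying on this.
[mcp_servers.memory]
command = "node"
args = ["/absolute/path/to/memory-mcp-node/dist/index.js"]
```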
| Tool | Purpose |
|---|---|
| `memory_search` | Find relevant memories using hybrid search |
| `memory_store` | Save new information |
| `memory_get` | Read specific content by path |
| `memory_list_recent` | Load recent context |
| `memory_forget` | Remove memories |
| `memory_status` | Check system health |
The AI uses these tools automatically based on conversation context.
The system maintains itself through retention, compaction, and promotion:
```mermaid
flowchart LR
subgraph triggers [Triggers]
Start[Conversation Start]
Writes[Many Writes]
Manual[User Request]
end
subgraph actions [Maintenance Actions]
Check{Overdue?}
Retention[Retention<br/>Delete old files]
Compaction[Compaction<br/>Deduplicate]
Promotion[Promotion<br/>Extract facts]
end
Start --> Check
Writes --> Check
Manual --> Check
Check -->|Yes| Retention
Retention --> Compaction
Compaction --> Promotion
Check -->|No| Skip[Skip]
```
| Action | What it does |
|---|---|
| Retention | Keeps latest N daily files, cleans old database entries |
| Compaction | Deduplicates (quick) or summarizes (deep) large daily files |
| Promotion | Extracts long-term facts to MEMORY.md |
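For intuition on quick-mode compaction, here is a hedged sketch of TF-IDF similarity grouping; the function names and the 0.6 threshold are assumptions, not the repo's actual values.

```typescript
// Sketch of quick compaction's grouping idea: TF-IDF weights plus cosine
// similarity flag near-duplicate entries before deduplication.
function tokenize(s: string): string[] {
  return s.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

function tfidfVectors(docs: string[]): Map<string, number>[] {
  const tokenized = docs.map(tokenize);
  const df = new Map<string, number>(); // document frequency per term
  for (const toks of tokenized) {
    for (const t of new Set(toks)) df.set(t, (df.get(t) ?? 0) + 1);
  }
  return tokenized.map((toks) => {
    const vec = new Map<string, number>();
    for (const t of toks) vec.set(t, (vec.get(t) ?? 0) + 1); // term frequency
    for (const [t, tf] of vec) vec.set(t, tf * Math.log(docs.length / df.get(t)!));
    return vec;
  });
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, na = 0, nb = 0;
  for (const [t, w] of a) { na += w * w; dot += w * (b.get(t) ?? 0); }
  for (const w of b.values()) nb += w * w;
  return dot / (Math.sqrt(na * nb) || 1);
}

// Across a month of entries, "Chose Stripe for payments" and
// "Using Stripe API" share rare, heavily weighted terms, so they score
// above the similarity cutoff (say 0.6) and get grouped for deduplication.
```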
```bash
# Check status
npm run maintenance -- --action check

# Full maintenance (dry run)
npm run maintenance -- --action full

# Full maintenance (execute)
npm run maintenance -- --action full --dry-run false

# Specific actions
npm run maintenance -- --action retention
npm run maintenance -- --action compact
npm run maintenance -- --action promote
```

Embeddings run locally via transformers.js. OpenAI is used only for maintenance (promotion scoring and deep compaction) via the maintenance script.
| Component | Provider | Model | When |
|---|---|---|---|
| Embeddings | Local (transformers.js) | Xenova/all-MiniLM-L6-v2 | Every search/store |
| Promotion Scoring | OpenAI | gpt-4o-mini | Maintenance runs |
| Deep Compaction | OpenAI | gpt-4o-mini | When enabled |
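As a rough illustration of the OpenAI usage (the prompt wording and response parsing are assumptions), promotion scoring might look like this:

```typescript
// Hedged sketch of promotion scoring with the openai SDK; the real
// prompt and parsing in the maintenance script will differ.
import OpenAI from "openai";

const openai = new OpenAI(); // picks up OPENAI_API_KEY loaded via dotenv

async function scoreForPromotion(entry: string): Promise<number> {
  const res = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "Rate from 0 to 1 how valuable this note is as long-term memory. Reply with the number only.",
      },
      { role: "user", content: entry },
    ],
  });
  return Number(res.choices[0]?.message?.content ?? 0);
}
```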
Troubleshooting:

- Ensure `.env` exists in the repo root with `OPENAI_API_KEY`
- Check for typos in the key
```bash
npm install
npm run build
```

- Verify absolute paths in client config
- Ensure `dist/index.js` exists (run `npm run build`)
- Check client logs for error messages
- Restart the MCP client after config changes
- Rerun the maintenance script after updating `.env`
- Verify the key is valid at platform.openai.com
- `index.sqlite-wal` and `index.sqlite-shm` are normal
- They're part of WAL mode for better concurrency
- Don't delete them while the server is running
Requirements:

- Node.js 18+
- OpenAI API key (required for maintenance only)
License: MIT