Memmem - Conversation memory with observation-based semantic search across Claude Code sessions.
Gives Claude persistent memory across sessions by automatically indexing conversations and providing progressive disclosure search through structured observations. Based on @obra/episodic-memory with integration into the Claude Code plugin ecosystem.
- Automatic Indexing: SessionEnd hook syncs conversations automatically
- Observation-Based Search: Structured insights from past sessions (~30t each)
- Progressive Disclosure: 3-layer pattern saves 50-100x context
- Layer 1: search() returns compact observations
- Layer 2: get_observations() returns full details
- Layer 3: read() returns raw conversation (rarely needed)
- Semantic Search: Vector embeddings for intelligent similarity matching
- Text Search: Fast exact-text matching for specific terms
- Advanced Filtering: Filter by observation type, concepts, files, projects
- Multi-Concept Search: AND search across 2-5 concepts (legacy exchange-based)
- Date Filtering: Search within specific time ranges
- Conversation Reading: Full conversation retrieval with pagination
- Inline Exclusion Markers: Exclude sensitive conversations with `DO NOT INDEX THIS CHAT`
- Index Verification: Check index health and repair issues
- CLI Interface: Direct CLI access for manual operations
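The "50-100x" claim from progressive disclosure can be sanity-checked with back-of-envelope arithmetic. The sketch below is illustrative only (the token figures are the rough per-layer sizes quoted above; the variable names are hypothetical):

```typescript
// Rough estimate of progressive-disclosure context savings.
// Token sizes are the README's approximate per-layer figures.
const OBS_TOKENS = 30; // Layer 1: one compact observation
const DETAIL_TOKENS = 350; // Layer 2: midpoint of ~200-500 tokens
const RAW_TOKENS = 1250; // Layer 3: midpoint of ~500-2000 tokens

// Loading 10 raw conversations vs. scanning 10 observations and
// expanding only the 2 that actually matter:
const rawCost = 10 * RAW_TOKENS; // 12500 tokens
const layeredCost = 10 * OBS_TOKENS + 2 * DETAIL_TOKENS; // 1000 tokens
const savings = rawCost / layeredCost; // 12.5x here; longer transcripts
                                       // push this toward 50-100x
console.log(rawCost, layeredCost, savings);
```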
Specialized agent for searching and synthesizing conversation history using observations (structured insights). Saves 50-100x context by using progressive disclosure and returning synthesized insights.
The agent automatically:
- Searches observations (Layer 1: compact results ~30t each)
- Gets full observation details (Layer 2: complete context ~200-500t each)
- Reads raw conversations only if needed (Layer 3: full transcript ~500-2000t)
- Synthesizes findings into 200-1000 word summary
- Returns actionable insights with sources
Always use the agent instead of MCP tools directly to avoid wasting context.
See agents/search-conversation.md for implementation details.
A skill that guides Claude to search conversation history before reinventing solutions or repeating mistakes.
Core principle: Always dispatch the search-conversation agent. Never use MCP tools directly.
When to use:
- User asks "how should I..." or "what's the best approach..."
- You're stuck after investigating a problem
- User references past work ("last time", "we discussed", etc.)
- Need to follow an unfamiliar workflow
What it does:
- Forces agent delegation (YOU MUST dispatch search-conversation agent)
- Prevents direct MCP tool usage (wastes context)
- Saves 50-100x context vs. loading raw conversations
See skills/remembering-conversations/SKILL.md for complete usage guide.
Use the search-conversation agent instead to save 50-100x context.
These tools are exposed for advanced usage only. See skills/remembering-conversations/MCP-TOOLS.md for complete API reference.
Restores context by searching past conversations using observations (structured insights). Uses progressive disclosure to minimize context usage.
Use the search-conversation agent instead of calling this directly.
Parameters:
- `query` (string | string[], required): Search query. A single string performs observation-based search; an array of 2-5 strings performs multi-concept AND search (deprecated - use a single-concept query with filters instead)
- `limit` (number, optional): Maximum results to return (1-50, default: 10)
- `mode` (string, optional): Search mode - "vector", "text", or "both" (default: "both"; single-concept only)
- `before` (string, optional): Only conversations before this date (YYYY-MM-DD)
- `after` (string, optional): Only conversations after this date (YYYY-MM-DD)
- `projects` (string[], optional): Filter results to specific project names
- `types` (string[], optional): Filter by observation types (single-concept only)
- `concepts` (string[], optional): Filter by tagged concepts (single-concept only)
- `files` (string[], optional): Filter by files mentioned/modified (single-concept only)
- `response_format` (string, optional): "markdown" or "json" (default: "markdown")
Examples:
```typescript
// Observation-based search (recommended)
{ query: "React Router authentication errors" }

// Text search for exact match
{ query: "a1b2c3d4e5f6", mode: "text" }

// Advanced filtering (single-concept only)
{ query: "authentication", types: ["decision", "bug-fix"], concepts: ["JWT"] }

// Multi-concept AND search (deprecated, uses exchanges)
{ query: ["React Router", "authentication", "JWT"] }
// Instead use:
// { query: "React Router authentication JWT", concepts: ["React Router", "authentication", "JWT"], mode: "both" }

// Date filtering
{ query: "refactoring", after: "2025-09-01" }

// Project filtering
{ query: "authentication", projects: ["my-project"] }
```

Gets full observation details (Layer 2 of progressive disclosure). Use after search() to retrieve complete information including narrative, facts, concepts, and files.
Use the search-conversation agent instead of calling this directly.
Parameters:
ids(string[], required): Array of observation IDs (1-20)
Example:
```typescript
// Get full details for specific observations
{ ids: ["obs-abc123", "obs-def456", "obs-ghi789"] }
```

Reads full conversations (Layer 3 of progressive disclosure). Use to extract detailed context after finding relevant observations with search() and getting full details with get_observations(). Essential for understanding the complete rationale, evolution, and gotchas behind past decisions.
Use the search-conversation agent instead of calling this directly.
Parameters:
- `path` (string, required): Conversation file path from search results
- `startLine` (number, optional): Starting line number (1-indexed) for pagination
- `endLine` (number, optional): Ending line number (1-indexed) for pagination
Note: Most searches are satisfied by layers 1-2 (search + get_observations). To conserve context, only use this tool when absolutely necessary.
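The 1-indexed, inclusive startLine/endLine semantics can be sketched as follows (a hypothetical helper, not the plugin's actual implementation):

```typescript
// Hypothetical helper mirroring read()'s 1-indexed, inclusive
// startLine/endLine pagination semantics.
function sliceTranscript(
  lines: string[],
  startLine = 1,
  endLine = lines.length,
): string[] {
  // Convert 1-indexed inclusive bounds to Array.slice's
  // 0-indexed, end-exclusive bounds.
  return lines.slice(startLine - 1, endLine);
}

const transcript = ["user: hi", "assistant: hello", "user: bye", "assistant: ok"];
console.log(sliceTranscript(transcript, 2, 3)); // ["assistant: hello", "user: bye"]
```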
```shell
# Install dependencies
cd plugins/memmem
npm install

# Build the plugin
npm run build
```

The plugin automatically:
- Creates the `~/.config/memmem/` directory
- Begins indexing conversations via SessionEnd hook
- Provides MCP tools for semantic search
When each Claude Code session starts (startup or resume), the hook (hooks/hooks.json) runs:
```shell
node dist/cli.mjs sync
```

This:
- Scans `~/.claude/sessions/` for new/modified conversations
- Generates embeddings using Transformers.js
- Stores results in the SQLite database (`~/.config/memmem/conversations.db`)
- Runs in the background (non-blocking, silent on errors)
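The incremental-sync decision above can be sketched as a pure function: only sessions modified since the last recorded sync are (re)indexed. This is an illustrative sketch; the interface and function names are hypothetical, not the plugin's actual code:

```typescript
// Hypothetical sketch of incremental sync: only sessions whose
// filesystem mtime is newer than the last sync get reindexed.
interface SessionFile {
  path: string;
  mtimeMs: number; // filesystem modification time, milliseconds
}

function needsIndexing(sessions: SessionFile[], lastSyncMs: number): string[] {
  return sessions
    .filter((s) => s.mtimeMs > lastSyncMs)
    .map((s) => s.path);
}

const sessions: SessionFile[] = [
  { path: "2025-02-01-old.jsonl", mtimeMs: 100 },
  { path: "2025-02-06-new.jsonl", mtimeMs: 900 },
];
console.log(needsIndexing(sessions, 500)); // ["2025-02-06-new.jsonl"]
```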
```
~/.config/memmem/
├── conversations.db   # SQLite database with embeddings
└── config.json        # User settings (optional)
```
There are two ways to exclude conversations from indexing:
1. Directory-level exclusion:

Create a `.no-memmem` marker file in the conversation directory:

```shell
touch /path/to/conversation/dir/.no-memmem
```

2. Inline content exclusion:

Include one of these markers anywhere in the conversation content:

- `DO NOT INDEX THIS CHAT`
- `DO NOT INDEX THIS CONVERSATION`
- `이 대화는 인덱싱하지 마세요` (Korean: "Do not index this conversation")
- `이 대화는 검색에서 제외하세요` (Korean: "Exclude this conversation from search")

The entire conversation is excluded from indexing when any of these markers is detected.
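The marker check amounts to a substring scan over the conversation text. A minimal sketch (the marker strings are the documented ones; the function name is illustrative):

```typescript
// Hypothetical sketch of inline exclusion-marker detection.
// A conversation containing any of these exact strings is skipped.
const EXCLUSION_MARKERS = [
  "DO NOT INDEX THIS CHAT",
  "DO NOT INDEX THIS CONVERSATION",
  "이 대화는 인덱싱하지 마세요",
  "이 대화는 검색에서 제외하세요",
];

function isExcluded(conversationText: string): boolean {
  return EXCLUSION_MARKERS.some((m) => conversationText.includes(m));
}

console.log(isExcluded("notes... DO NOT INDEX THIS CHAT ...more")); // true
console.log(isExcluded("an ordinary conversation")); // false
```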
Summarization requires an LLM provider configuration. Create a config file at ~/.config/memmem/config.json:
Supported providers: gemini, zai
```json
{
  "provider": "gemini",
  "apiKey": "your-gemini-api-key",
  "model": "gemini-2.0-flash"
}
```

Getting a Gemini API key:
- Go to Google AI Studio
- Create a new API key
- Add it to your config.json
```json
{
  "provider": "zai",
  "apiKey": "your-zai-api-key",
  "model": "glm-4.7"
}
```

Configuration options:
- `provider`: LLM provider name (`gemini` or `zai`)
- `apiKey`: API key for the provider
- `model`: Optional model name (defaults: `gemini-2.0-flash` for Gemini, `glm-4.7` for Z.AI)
Note: If no config file is found, conversations will still be indexed but not summarized.
You'll see [Not summarized - no LLM config found] placeholders instead of summaries.
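The config resolution described above (documented defaults per provider, graceful fallback when no file exists) might look roughly like this. The shape matches the config.json examples; the function and type names are hypothetical:

```typescript
// Hypothetical sketch of summarizer-config loading with the
// documented per-provider model defaults and no-config fallback.
interface LlmConfig {
  provider: "gemini" | "zai";
  apiKey: string;
  model?: string;
}

const DEFAULT_MODELS: Record<LlmConfig["provider"], string> = {
  gemini: "gemini-2.0-flash",
  zai: "glm-4.7",
};

function resolveConfig(raw: string | null): LlmConfig | null {
  // No config file: index anyway, but skip summarization.
  if (raw === null) return null;
  const parsed = JSON.parse(raw) as LlmConfig;
  return { ...parsed, model: parsed.model ?? DEFAULT_MODELS[parsed.provider] };
}

console.log(resolveConfig(null)); // null -> "[Not summarized - no LLM config found]"
console.log(resolveConfig('{"provider":"zai","apiKey":"k"}')?.model); // "glm-4.7"
```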
```shell
npm run build
```

Bundles:
- `src/mcp/server.ts` → `dist/mcp-server.mjs` (MCP server)
- `src/cli/index-cli.ts` → `dist/cli.mjs` (CLI for hooks)
```shell
npm run typecheck
```

The plugin provides a CLI interface for manual operations:
```shell
# Show help
memmem --help

# Sync new conversations
memmem sync

# Sync with parallel summarization
memmem sync --concurrency 4

# Index a specific session
memmem index-session 2025-02-06-123456

# Verify index health
memmem verify

# Repair detected issues
memmem repair

# Rebuild entire index
memmem rebuild --concurrency 8
```

```
plugins/memmem/
├── .claude-plugin/
│   └── plugin.json              # Plugin metadata
├── .mcp.json                    # MCP server registration
├── hooks/
│   └── hooks.json               # Auto-sync on session start (startup|resume)
├── src/
│   ├── core/                    # Core library (from @obra/episodic-memory)
│   │   ├── indexer.ts           # Conversation indexing
│   │   ├── searcher.ts          # Semantic + text search
│   │   ├── storage.ts           # SQLite + embeddings
│   │   └── types.ts             # Type definitions
│   ├── cli/                     # CLI commands
│   │   ├── sync-cli.ts          # Sync command
│   │   ├── search-cli.ts        # Search command
│   │   ├── show-cli.ts          # Show command
│   │   └── stats-cli.ts         # Stats command
│   └── mcp/
│       └── server.ts            # MCP server (search, read tools)
├── dist/
│   ├── mcp-server.mjs           # Bundled MCP server
│   ├── mcp-wrapper.mjs          # Cross-platform wrapper
│   └── cli.mjs                  # Bundled CLI (for hooks)
├── scripts/
│   ├── build.mjs                # esbuild config
│   └── mcp-server-wrapper.mjs   # Wrapper script
├── package.json
├── tsconfig.json
└── README.md
```
- `@google/generative-ai` ^0.24.1 - Conversation summarization (Gemini API)
- `@modelcontextprotocol/sdk` ^1.0.4 - MCP protocol implementation
- `@huggingface/transformers` ^3.8.1 - ML embeddings (Transformers.js v3)
- `better-sqlite3` ^9.6.0 - SQLite database
- `sqlite-vec` ^0.1.6 - Vector similarity search extension
- `zod` ^3.22.4 - Schema validation

- `typescript` ^5.3.3
- `node`: Build and test runtime (Node.js 18+)
IMPORTANT: Version 2.0+ uses EmbeddingGemma with 768-dimensional embeddings (vs 384 in v1.x). The database must be recreated as vector dimensions are incompatible.
```shell
# 1. Backup existing database (optional)
cp ~/.config/memmem/conversations.db \
   ~/.config/memmem/conversations.db.backup

# 2. Remove old database
rm ~/.config/memmem/conversations.db

# 3. Reinstall plugin dependencies
cd plugins/memmem
npm install

# 4. Rebuild plugin
npm run build

# 5. Reindex all conversations (downloads ~197MB model on first run)
node dist/cli.mjs index-all
```

First sync timing:
- Model download: ~197MB (one-time, cached to `.cache/`)
- Reindexing time: varies by conversation count
- Initial ONNX runtime warmup: ~30 seconds
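Why the database must be recreated: a vector index is built for a fixed dimensionality, so v1.x rows (384-dim) cannot be compared against v2.0+ query vectors (768-dim). A trivial sketch of that compatibility check (names are illustrative, not the plugin's code):

```typescript
// Hypothetical sketch of the dimension-compatibility check behind
// the "recreate the database" requirement.
const V1_DIMS = 384; // v1.x embedding size
const V2_DIMS = 768; // v2.0+ EmbeddingGemma size

function isCompatible(storedDims: number, modelDims: number): boolean {
  // Cosine/dot-product similarity is undefined across mismatched
  // dimensions, so the index must be rebuilt on any mismatch.
  return storedDims === modelDims;
}

console.log(isCompatible(V1_DIMS, V2_DIMS)); // false -> rebuild required
console.log(isCompatible(V2_DIMS, V2_DIMS)); // true
```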
- ✅ Better Korean Support: 83.86 MRR@10 (vs 55.4 in v1.x) - +51% improvement
- ✅ 100+ Languages: Multilingual coverage including Korean, Japanese, Chinese, etc.
- ✅ Higher Dimensions: 768-dim embeddings (vs 384) for better semantic representation
- ✅ Memory Efficient: < 200MB RAM usage with Q4 quantization
- ✅ Official Package: Migrated to `@huggingface/transformers` v3
The plugin automatically installs dependencies on first run. If you encounter errors:
Symptoms: Error messages containing "EACCES" or "permission denied"
Fix:

```shell
sudo chown -R $(whoami) ~/.npm
```

Then restart Claude Code.
Symptoms: Timeout or connection errors during dependency installation
Fix:

- Check your internet connection
- If behind a corporate firewall, configure the npm proxy:

```shell
npm config set proxy http://your-proxy:port
npm config set https-proxy http://your-proxy:port
```

- Try installing manually:

```shell
cd plugins/memmem
npm install
```
Symptoms: Error messages containing "ENOSPC"
Fix:

- Check available disk space:

```shell
df -h
```

- Free up space by cleaning the npm cache:

```shell
npm cache clean --force
```

- Remove old node_modules and reinstall:

```shell
cd plugins/memmem
rm -rf node_modules
npm install
```
If automatic installation fails repeatedly, install dependencies manually:
```shell
cd plugins/memmem
npm install
npm run build
```

- Standalone Plugin: Complete implementation (not a wrapper)
- Based on @obra/episodic-memory: Forked and integrated into Claude Code plugin ecosystem
- Storage Location: `~/.config/memmem/` (not `.claude/`)
- Naming: All public interfaces use `memmem` for clarity
- Embedding Model: Google EmbeddingGemma-300M (ONNX, Q4 quantized)
- 768 dimensions (Matryoshka-enabled: 128-768)
- 100+ languages including Korean (MRR@10: 83.86 on XTREME-UP)
- Model size: ~197MB (Q4 quantization)
- Memory usage: < 200MB RAM
- MTEB Multilingual score: 60.62
- Task prefix: "title: none | text: ..." (automatically applied)
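The task prefix above is a fixed string wrapper applied to text before it is embedded. A one-line sketch of that convention (the helper name is hypothetical):

```typescript
// Hypothetical sketch of the automatically-applied task-prefix
// convention: "title: none | text: ..." wraps the raw input.
function withTaskPrefix(text: string): string {
  return `title: none | text: ${text}`;
}

console.log(withTaskPrefix("React Router authentication errors"));
// "title: none | text: React Router authentication errors"
```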
- Slash commands: `/memmem search`, `/memmem stats`
- Conversation tagging/categorization
- Export/import functionality
- Web UI for browsing history
- Integration with other plugins (e.g., context-restore)
- Original project: episodic-memory
- MCP Protocol: Model Context Protocol
- Claude Code: anthropics/claude-code
MIT