baleen37/memmem

Memmem

Memmem - Conversation memory with observation-based semantic search across Claude Code sessions.

Purpose

Gives Claude persistent memory across sessions by automatically indexing conversations and providing progressive disclosure search through structured observations. Based on @obra/episodic-memory with integration into the Claude Code plugin ecosystem.

Features

  • Automatic Indexing: SessionStart hook syncs conversations automatically
  • Observation-Based Search: Structured insights from past sessions (~30t each)
  • Progressive Disclosure: 3-layer pattern saves 50-100x context
    • Layer 1: search() returns compact observations
    • Layer 2: get_observations() returns full details
    • Layer 3: read() returns raw conversation (rarely needed)
  • Semantic Search: Vector embeddings for intelligent similarity matching
  • Text Search: Fast exact-text matching for specific terms
  • Advanced Filtering: Filter by observation type, concepts, files, projects
  • Multi-Concept Search: AND search across 2-5 concepts (legacy exchange-based)
  • Date Filtering: Search within specific time ranges
  • Conversation Reading: Full conversation retrieval with pagination
  • Inline Exclusion Markers: Exclude sensitive conversations with DO NOT INDEX THIS CHAT
  • Index Verification: Check index health and repair issues
  • CLI Interface: Direct CLI access for manual operations

Agents

search-conversation

Specialized agent for searching and synthesizing conversation history using observations (structured insights). Saves 50-100x context by using progressive disclosure and returning synthesized insights.

The agent automatically:

  1. Searches observations (Layer 1: compact results ~30t each)
  2. Gets full observation details (Layer 2: complete context ~200-500t each)
  3. Reads raw conversations only if needed (Layer 3: full transcript ~500-2000t)
  4. Synthesizes findings into 200-1000 word summary
  5. Returns actionable insights with sources
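The layer costs above can be put into rough arithmetic. This is an illustration only, using midpoints of the per-item token ranges quoted above; the function and figures are hypothetical, and actual savings depend on how many hits escalate to deeper layers:

```typescript
// Illustrative per-item token costs, taken as midpoints of the ranges
// quoted above (~30t observations, ~200-500t details, ~500-2000t transcripts).
const TOKENS = { observation: 30, details: 350, transcript: 1250 };

// Estimated context cost of answering from Layer 1 plus a few Layer 2
// lookups, compared with loading a raw transcript for every hit.
function estimatedSavings(hits: number, detailedLookups: number): number {
  const layered = hits * TOKENS.observation + detailedLookups * TOKENS.details;
  const raw = hits * TOKENS.transcript;
  return raw / layered;
}
```

With 10 hits and 2 detail lookups, the layered approach uses ~1000 tokens versus ~12500 for raw transcripts; escalating fewer hits widens the ratio further.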

Always use the agent instead of MCP tools directly to avoid wasting context.

See agents/search-conversation.md for implementation details.

Skills

remembering-conversations

A skill that guides Claude to search conversation history before reinventing solutions or repeating mistakes.

Core principle: Always dispatch the search-conversation agent. Never use MCP tools directly.

When to use:

  • User asks "how should I..." or "what's the best approach..."
  • You're stuck after investigating a problem
  • User references past work ("last time", "we discussed", etc.)
  • Need to follow an unfamiliar workflow

What it does:

  • Forces agent delegation (YOU MUST dispatch search-conversation agent)
  • Prevents direct MCP tool usage (wastes context)
  • Saves 50-100x context vs. loading raw conversations

See skills/remembering-conversations/SKILL.md for complete usage guide.

MCP Tools

⚠️ Warning: Direct MCP tool usage is discouraged. Always use the search-conversation agent instead to save 50-100x context.

These tools are exposed for advanced usage only. See skills/remembering-conversations/MCP-TOOLS.md for complete API reference.

memmem__search

Restores context by searching past conversations using observations (structured insights). Uses progressive disclosure to minimize context usage.

Use the search-conversation agent instead of calling this directly.

Parameters:

  • query (string | string[], required): Search query. A single string performs observation-based search; an array of 2-5 strings performs multi-concept AND search (deprecated; prefer a single-concept query with filters instead)
  • limit (number, optional): Maximum results to return (1-50, default: 10)
  • mode (string, optional): Search mode - "vector", "text", or "both" (default: "both", only for single-concept)
  • before (string, optional): Only conversations before this date (YYYY-MM-DD)
  • after (string, optional): Only conversations after this date (YYYY-MM-DD)
  • projects (string[], optional): Filter results to specific project names
  • types (string[], optional): Filter by observation types (single-concept only)
  • concepts (string[], optional): Filter by tagged concepts (single-concept only)
  • files (string[], optional): Filter by files mentioned/modified (single-concept only)
  • response_format (string, optional): "markdown" or "json" (default: "markdown")

Examples:

// Observation-based search (recommended)
{ query: "React Router authentication errors" }

// Text search for exact match
{ query: "a1b2c3d4e5f6", mode: "text" }

// Advanced filtering (single-concept only)
{ query: "authentication", types: ["decision", "bug-fix"], concepts: ["JWT"] }

// Multi-concept AND search (deprecated, uses exchanges)
// Instead use: { query: "React Router authentication JWT", concepts: ["React Router", "authentication", "JWT"], mode: "both" }
{ query: ["React Router", "authentication", "JWT"] }

// Date filtering
{ query: "refactoring", after: "2025-09-01" }

// Project filtering
{ query: "authentication", projects: ["my-project"] }
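Validating inputs against the documented constraints (2-5 concepts, limit 1-50, YYYY-MM-DD dates) might look like the following sketch. The function name and plain checks are hypothetical; the plugin itself lists zod as a dependency and may validate differently:

```typescript
// Hypothetical sketch validating the documented search-parameter constraints.
function validateSearchParams(p: {
  query: string | string[];
  limit?: number;
  after?: string;
}): string[] {
  const errors: string[] = [];
  if (Array.isArray(p.query) && (p.query.length < 2 || p.query.length > 5)) {
    errors.push("multi-concept query requires 2-5 strings");
  }
  if (p.limit !== undefined && (p.limit < 1 || p.limit > 50)) {
    errors.push("limit must be between 1 and 50");
  }
  if (p.after !== undefined && !/^\d{4}-\d{2}-\d{2}$/.test(p.after)) {
    errors.push("after must be YYYY-MM-DD");
  }
  return errors;
}
```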

memmem__get_observations

Gets full observation details (Layer 2 of progressive disclosure). Use after search() to retrieve complete information including narrative, facts, concepts, and files.

Use the search-conversation agent instead of calling this directly.

Parameters:

  • ids (string[], required): Array of observation IDs (1-20)

Example:

// Get full details for specific observations
{ ids: ["obs-abc123", "obs-def456", "obs-ghi789"] }

memmem__read

Reads full conversations (Layer 3 of progressive disclosure). Use to extract detailed context after finding relevant observations with search() and getting full details with get_observations(). Essential for understanding the complete rationale, evolution, and gotchas behind past decisions.

Use the search-conversation agent instead of calling this directly.

Parameters:

  • path (string, required): Conversation file path from search results
  • startLine (number, optional): Starting line number (1-indexed) for pagination
  • endLine (number, optional): Ending line number (1-indexed) for pagination

Note: Most searches are satisfied with layers 1-2 (search + get_observations). Only use this when absolutely necessary to save context.
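The 1-indexed, inclusive line-range pagination described above can be sketched as follows (a hypothetical helper, not the plugin's implementation):

```typescript
// Hypothetical sketch of 1-indexed, inclusive line-range pagination
// over a conversation transcript, mirroring startLine/endLine semantics.
function readLines(content: string, startLine?: number, endLine?: number): string {
  const lines = content.split("\n");
  const start = (startLine ?? 1) - 1; // convert 1-indexed start to 0-indexed
  const end = endLine ?? lines.length; // inclusive upper bound
  return lines.slice(start, end).join("\n");
}
```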

Installation

# Install dependencies
cd plugins/memmem
npm install

# Build the plugin
npm run build

The plugin automatically:

  1. Creates ~/.config/memmem/ directory
  2. Begins indexing conversations via SessionStart hook
  3. Provides MCP tools for semantic search

How It Works

Automatic Indexing (SessionStart Hook)

When each Claude Code session starts (startup or resume), the hook (hooks/hooks.json) runs:

node dist/cli.mjs sync

This:

  1. Scans ~/.claude/sessions/ for new/modified conversations
  2. Generates embeddings using Transformers.js
  3. Stores in SQLite database (~/.config/memmem/conversations.db)
  4. Runs in background (non-blocking, silent on errors)
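Step 1's incremental scan can be sketched with a modification-time comparison. This is an in-memory illustration with hypothetical names; the plugin persists its index state in the SQLite database rather than in maps:

```typescript
// Hypothetical sketch: select files whose mtime is newer than what was
// recorded at the last sync, so only new or changed conversations reindex.
function filesNeedingSync(
  onDisk: Map<string, number>,  // path -> current mtime (ms)
  indexed: Map<string, number>, // path -> mtime recorded at last index
): string[] {
  const stale: string[] = [];
  for (const [path, mtime] of onDisk) {
    const last = indexed.get(path);
    if (last === undefined || mtime > last) stale.push(path);
  }
  return stale;
}
```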

Storage Structure

~/.config/memmem/
├── conversations.db          # SQLite database with embeddings
└── config.json              # User settings (optional)

Exclusion

There are two ways to exclude conversations from indexing:

1. Directory-level exclusion:

Create a .no-memmem marker file in the conversation directory:

touch /path/to/conversation/dir/.no-memmem

2. Inline content exclusion:

Include one of these markers anywhere in the conversation content:

  • DO NOT INDEX THIS CHAT
  • DO NOT INDEX THIS CONVERSATION
  • 이 대화는 인덱싱하지 마세요 (Korean: "Do not index this conversation")
  • 이 대화는 검색에서 제외하세요 (Korean: "Exclude this conversation from search")

The entire conversation will be excluded from indexing when any of these markers are detected.
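The inline marker check amounts to a substring scan over the conversation content. A minimal sketch (the function name is hypothetical; only the marker strings come from the list above):

```typescript
// Hypothetical sketch of inline exclusion-marker detection.
// The marker strings are the ones documented above.
const EXCLUSION_MARKERS = [
  "DO NOT INDEX THIS CHAT",
  "DO NOT INDEX THIS CONVERSATION",
  "이 대화는 인덱싱하지 마세요",
  "이 대화는 검색에서 제외하세요",
];

// Returns true when any marker appears anywhere in the conversation text.
function shouldExclude(content: string): boolean {
  return EXCLUSION_MARKERS.some((marker) => content.includes(marker));
}
```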

LLM Configuration (Required for Summarization)

Summarization requires an LLM provider configuration. Create a config file at ~/.config/memmem/config.json:

Supported providers: gemini, zai

Gemini Configuration

{
  "provider": "gemini",
  "apiKey": "your-gemini-api-key",
  "model": "gemini-2.0-flash"
}

Getting a Gemini API key:

  1. Go to Google AI Studio
  2. Create a new API key
  3. Add it to your config.json

Z.AI Configuration

{
  "provider": "zai",
  "apiKey": "your-zai-api-key",
  "model": "glm-4.7"
}

Configuration options:

  • provider: LLM provider name (gemini or zai)
  • apiKey: API key for the provider
  • model: Optional model name (defaults: gemini-2.0-flash for Gemini, glm-4.7 for Z.AI)

Note: If no config file is found, conversations will still be indexed but not summarized. You'll see [Not summarized - no LLM config found] placeholders instead of summaries.
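Config parsing with per-provider model defaults might look like the following sketch. The default model names come from the options listed above; the function, its return shape, and the plain validation (rather than the plugin's zod schemas) are illustrative assumptions:

```typescript
// Hypothetical sketch of config parsing with per-provider model defaults.
type LlmConfig = { provider: "gemini" | "zai"; apiKey: string; model: string };

// Defaults as documented: gemini-2.0-flash for Gemini, glm-4.7 for Z.AI.
const DEFAULT_MODELS: Record<string, string> = {
  gemini: "gemini-2.0-flash",
  zai: "glm-4.7",
};

// Returns null for a missing/invalid config, in which case conversations
// are indexed but not summarized.
function parseConfig(json: string): LlmConfig | null {
  let raw: { provider?: unknown; apiKey?: unknown; model?: unknown };
  try {
    raw = JSON.parse(json);
  } catch {
    return null;
  }
  if (raw.provider !== "gemini" && raw.provider !== "zai") return null;
  if (typeof raw.apiKey !== "string" || raw.apiKey.length === 0) return null;
  return {
    provider: raw.provider,
    apiKey: raw.apiKey,
    model: typeof raw.model === "string" ? raw.model : DEFAULT_MODELS[raw.provider],
  };
}
```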

Development

Build

npm run build

Bundles:

  • src/mcp/server.ts → dist/mcp-server.mjs (MCP server)
  • src/cli/index-cli.ts → dist/cli.mjs (CLI for hooks)

Type Check

npm run typecheck

CLI Usage

The plugin provides a CLI interface for manual operations:

# Show help
memmem --help

# Sync new conversations
memmem sync

# Sync with parallel summarization
memmem sync --concurrency 4

# Index a specific session
memmem index-session 2025-02-06-123456

# Verify index health
memmem verify

# Repair detected issues
memmem repair

# Rebuild entire index
memmem rebuild --concurrency 8
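The --concurrency flag bounds how many summarization tasks run in parallel. A bounded parallel map can be sketched like this (a hypothetical illustration, not the plugin's code):

```typescript
// Hypothetical sketch: run async tasks with at most `concurrency` in
// flight, as the --concurrency flag suggests for parallel summarization.
async function mapConcurrent<T, R>(
  items: T[],
  concurrency: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index until done.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(concurrency, items.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```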

Project Structure

plugins/memmem/
├── .claude-plugin/
│   └── plugin.json              # Plugin metadata
├── .mcp.json                     # MCP server registration
├── hooks/
│   └── hooks.json               # Auto-sync on session start (startup|resume)
├── src/
│   ├── core/                    # Core library (from @obra/episodic-memory)
│   │   ├── indexer.ts           # Conversation indexing
│   │   ├── searcher.ts          # Semantic + text search
│   │   ├── storage.ts           # SQLite + embeddings
│   │   └── types.ts             # Type definitions
│   ├── cli/                     # CLI commands
│   │   ├── sync-cli.ts          # Sync command
│   │   ├── search-cli.ts        # Search command
│   │   ├── show-cli.ts          # Show command
│   │   └── stats-cli.ts         # Stats command
│   └── mcp/
│       └── server.ts            # MCP server (search, read tools)
├── dist/
│   ├── mcp-server.mjs           # Bundled MCP server
│   ├── mcp-wrapper.mjs          # Cross-platform wrapper
│   └── cli.mjs                  # Bundled CLI (for hooks)
├── scripts/
│   ├── build.mjs                # esbuild config
│   └── mcp-server-wrapper.mjs   # Wrapper script
├── package.json
├── tsconfig.json
└── README.md

Dependencies

Runtime

  • @google/generative-ai: ^0.24.1 - For conversation summarization (Gemini API)
  • @modelcontextprotocol/sdk: ^1.0.4 - MCP protocol implementation
  • @huggingface/transformers: ^3.8.1 - ML embeddings (Transformers.js v3)
  • better-sqlite3: ^9.6.0 - SQLite database
  • sqlite-vec: ^0.1.6 - Vector similarity search extension
  • zod: ^3.22.4 - Schema validation

Development Dependencies

  • typescript: ^5.3.3
  • Node.js 18+: build and test runtime

Upgrading from v1.x (multilingual-e5-small)

IMPORTANT: Version 2.0+ uses EmbeddingGemma with 768-dimensional embeddings (vs 384 in v1.x). The database must be recreated as vector dimensions are incompatible.
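The incompatibility is dimensional: a table created for 384-dim v1.x vectors cannot store 768-dim v2.0 vectors. A startup guard for this might look like the following sketch (hypothetical; the dimension constants come from the note above):

```typescript
// Hypothetical sketch: refuse to reuse a database whose stored embedding
// dimension does not match the current model's output dimension.
const MODEL_DIMENSIONS = 768; // EmbeddingGemma (v2.0+); v1.x used 384

function checkDimensions(storedDim: number): void {
  if (storedDim !== MODEL_DIMENSIONS) {
    throw new Error(
      `Embedding dimension mismatch: database has ${storedDim}, ` +
      `model produces ${MODEL_DIMENSIONS}. Delete the database and reindex.`,
    );
  }
}
```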

Migration Steps

# 1. Backup existing database (optional)
cp ~/.config/memmem/conversations.db \
   ~/.config/memmem/conversations.db.backup

# 2. Remove old database
rm ~/.config/memmem/conversations.db

# 3. Reinstall plugin dependencies
cd plugins/memmem
npm install

# 4. Rebuild plugin
npm run build

# 5. Reindex all conversations (downloads ~197MB model on first run)
node dist/cli.mjs index-all

First sync timing:

  • Model download: ~197MB (one-time, cached to .cache/)
  • Reindexing time: Varies by conversation count
  • Initial ONNX runtime warmup: ~30 seconds

What's New in v2.0

  • Better Korean Support: 83.86 MRR@10 vs. 55.4 in v1.x (+51% improvement)
  • 100+ Languages: Multilingual coverage including Korean, Japanese, Chinese, etc.
  • Higher Dimensions: 768-dim embeddings (vs 384) for better semantic representation
  • Memory Efficient: < 200MB RAM usage with Q4 quantization
  • Official Package: Migrated to @huggingface/transformers v3

Troubleshooting

Installation Errors

The plugin automatically installs dependencies on first run. If you encounter errors:

Permission Denied (EACCES)

Symptoms: Error messages containing "EACCES" or "permission denied"

Fix:

sudo chown -R $(whoami) ~/.npm

Then restart Claude Code.

Network Errors (ETIMEDOUT, ECONNRESET, ENOTFOUND)

Symptoms: Timeout or connection errors during dependency installation

Fix:

  1. Check your internet connection

  2. If behind a corporate firewall, configure npm proxy:

    npm config set proxy http://your-proxy:port
    npm config set https-proxy http://your-proxy:port
  3. Try installing manually:

    cd plugins/memmem
    npm install

Disk Space Full (ENOSPC)

Symptoms: Error messages containing "ENOSPC"

Fix:

  1. Check available disk space: df -h

  2. Free up space by cleaning npm cache:

    npm cache clean --force
  3. Remove old node_modules:

    cd plugins/memmem
    rm -rf node_modules
    npm install

Manual Installation

If automatic installation fails repeatedly, install dependencies manually:

cd plugins/memmem
npm install
npm run build

Architecture Notes

  • Standalone Plugin: Complete implementation (not a wrapper)
  • Based on @obra/episodic-memory: Forked and integrated into Claude Code plugin ecosystem
  • Storage Location: ~/.config/memmem/ (not .claude/)
  • Naming: All public interfaces use memmem for clarity
  • Embedding Model: Google EmbeddingGemma-300M (ONNX, Q4 quantized)
    • 768 dimensions (Matryoshka-enabled: 128-768)
    • 100+ languages including Korean (MRR@10: 83.86 on XTREME-UP)
    • Model size: ~197MB (Q4 quantization)
    • Memory usage: < 200MB RAM
    • MTEB Multilingual score: 60.62
    • Task prefix: "title: none | text: ..." (automatically applied)
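The task-prefix convention noted above can be sketched as follows (illustrative only; the plugin applies this automatically, and the helper name is hypothetical):

```typescript
// Hypothetical sketch of the EmbeddingGemma task prefix noted above:
// inputs take the form "title: <title> | text: <content>", with "none"
// substituted when no title is available.
function withTaskPrefix(text: string, title = "none"): string {
  return `title: ${title} | text: ${text}`;
}
```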

Future Enhancements

  • Slash commands: /memmem search, /memmem stats
  • Conversation tagging/categorization
  • Export/import functionality
  • Web UI for browsing history
  • Integration with other plugins (e.g., context-restore)

License

MIT
