Codemem

A standalone Rust memory engine for AI coding assistants. Single binary, zero runtime deps.

Codemem stores what your AI assistant discovers -- files read, symbols searched, edits made -- so repositories don't need re-exploring across sessions.

Quick Start

Install

# Shell (macOS/Linux)
curl -fsSL https://raw.githubusercontent.com/cogniplex/codemem/main/install.sh | sh

# Homebrew
brew install cogniplex/tap/codemem

# Cargo
cargo install codemem-cli

Or download a prebuilt binary from Releases.

| Platform | Architecture | Binary |
|---|---|---|
| macOS | ARM64 (Apple Silicon) | codemem-macos-arm64.tar.gz |
| Linux | x86_64 | codemem-linux-amd64.tar.gz |
| Linux | ARM64 | codemem-linux-arm64.tar.gz |

Initialize

cd your-project
codemem init

This downloads the local embedding model (~440MB, one-time), registers lifecycle hooks, and configures the MCP server for your AI assistant. It automatically detects Claude Code, Cursor, and Windsurf.

That's it

Codemem now automatically captures context, injects prior knowledge at session start, and provides 38 MCP tools to your assistant.

Map your codebase (optional)

Index your codebase to build a structural knowledge graph with call relationships, dependency edges, and PageRank-based importance scores:

codemem index
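
For readers unfamiliar with PageRank-based importance, the idea can be sketched with plain power iteration over a call graph. This is a minimal illustration, not Codemem's implementation: the function name `pagerank`, the adjacency-list representation, and the damping and iteration parameters are all assumptions made for the example.

```rust
/// Power-iteration PageRank over a directed graph given as adjacency lists.
/// `edges[i]` lists the nodes that node `i` points to (for example, callees).
/// Hypothetical sketch: the names and graph representation are illustrative only.
fn pagerank(edges: &[Vec<usize>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = edges.len();
    if n == 0 {
        return Vec::new();
    }
    // Start from a uniform distribution.
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Every node receives the same "teleport" share.
        let mut next = vec![(1.0 - damping) / n as f64; n];
        // Rank held by nodes without outgoing edges is spread evenly afterwards.
        let mut dangling = 0.0;
        for (src, targets) in edges.iter().enumerate() {
            if targets.is_empty() {
                dangling += rank[src];
            } else {
                let share = rank[src] / targets.len() as f64;
                for &dst in targets {
                    next[dst] += damping * share;
                }
            }
        }
        let dangling_share = damping * dangling / n as f64;
        for value in next.iter_mut() {
            *value += dangling_share;
        }
        rank = next;
    }
    rank
}
```

Calling `pagerank(&[vec![2], vec![2], vec![0]], 0.85, 50)` ranks node 2 (called by two others) above node 1 (called by nobody), which is the intuition behind treating heavily referenced symbols as more important.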

Then use the built-in code-mapper agent to analyze architecture, detect clusters, and store insights:

# In your AI assistant, the code-mapper agent runs these MCP tools:
get_pagerank { "top_k": 20 }              # Find most important symbols
get_clusters { "resolution": 1.0 }        # Detect architectural modules
get_impact { "qualified_name": "...", "depth": 2 }  # Blast radius analysis
search_code { "query": "database connection" }       # Semantic code search

See examples/agents/code-mapper.md for the full workflow.
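
To make the `depth` parameter of a blast-radius query concrete, here is a minimal sketch of a depth-limited breadth-first search over reverse-dependency edges. It is an illustration under stated assumptions, not the `get_impact` implementation: the function `impact_set`, the map layout, and the symbol names in the usage note are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Returns every symbol reachable from `start` within `max_depth` hops,
/// following edges from a symbol to the symbols that depend on it.
/// Hypothetical sketch: the real tool's data model is not shown in this README.
fn impact_set(
    dependents: &HashMap<&str, Vec<&str>>,
    start: &str,
    max_depth: usize,
) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<(&str, usize)> = VecDeque::new();
    let mut affected = Vec::new();
    seen.insert(start);
    queue.push_back((start, 0));
    while let Some((node, depth)) = queue.pop_front() {
        // Nodes already at the depth limit are reported but not expanded.
        if depth == max_depth {
            continue;
        }
        if let Some(next) = dependents.get(node) {
            for &dep in next {
                // `insert` returns false for symbols that were already visited.
                if seen.insert(dep) {
                    affected.push(dep.to_string());
                    queue.push_back((dep, depth + 1));
                }
            }
        }
    }
    affected
}
```

With a graph where `db::connect` is used by `repo::load`, which is used by `api::get_user`, which is used by `main`, a depth of 2 reports `repo::load` and `api::get_user` but stops before `main`.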

Key Features

  • Graph-vector hybrid architecture -- HNSW vector search (768-dim) + petgraph knowledge graph with 25 algorithms (PageRank, Louvain, betweenness centrality, BFS/DFS, and more)
  • 38 MCP tools -- Memory CRUD, self-editing (refine/split/merge), graph traversal, code search, consolidation, impact analysis, metrics, and pattern detection over JSON-RPC
  • 4 lifecycle hooks -- Automatic context injection (SessionStart), prompt capture (UserPromptSubmit), observation capture (PostToolUse), and session summaries (Stop)
  • 9-component hybrid scoring -- Vector similarity, graph strength, BM25 token overlap, temporal alignment, tag matching, importance, confidence, and recency
  • Code-aware indexing -- tree-sitter structural extraction for 13 languages (Rust, TypeScript/JS/JSX, Python, Go, C/C++, Java, Ruby, C#, Kotlin, Swift, PHP, Scala, HCL/Terraform)
  • Contextual embeddings -- Memories are enriched with metadata and graph context before embedding, for more precise recall
  • Pluggable embeddings -- Candle (local BERT, default), Ollama, or any OpenAI-compatible API
  • Cross-session intelligence -- Pattern detection, file hotspot tracking, decision chains, and session continuity
  • Memory consolidation -- 5 neuroscience-inspired cycles: Decay (power-law), Creative/REM (semantic KNN), Cluster (cosine + union-find), Summarize (LLM-powered), Forget
  • Self-editing memory -- Refine, split, and merge memories with full provenance tracking via temporal graph edges
  • Operational metrics -- Per-tool latency percentiles (p50/p95/p99), call counters, and gauges via the codemem_metrics tool
  • Real-time file watching -- notify-based watcher with <50ms debounce and .gitignore support
  • Persistent config -- TOML-based configuration at ~/.codemem/config.toml
  • Production hardened -- Zero .unwrap() in production code, safe concurrency, versioned schema migrations
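
The Cluster consolidation cycle is described above as "cosine + union-find". A minimal sketch of that combination follows, assuming a simple all-pairs comparison and a fixed similarity threshold. The function names and the threshold are illustrative assumptions, not taken from Codemem's source.

```rust
/// Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Finds the representative of `i`, compressing the path as it goes.
fn find(parent: &mut [usize], i: usize) -> usize {
    let mut root = i;
    while parent[root] != root {
        root = parent[root];
    }
    let mut cur = i;
    while parent[cur] != root {
        let next = parent[cur];
        parent[cur] = root;
        cur = next;
    }
    root
}

/// Labels each embedding with a cluster id; any pair whose cosine similarity
/// is at or above `threshold` ends up in the same cluster (hypothetical rule).
fn cluster(embeddings: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    let n = embeddings.len();
    let mut parent: Vec<usize> = (0..n).collect();
    for i in 0..n {
        for j in (i + 1)..n {
            if cosine(&embeddings[i], &embeddings[j]) >= threshold {
                let (ri, rj) = (find(&mut parent, i), find(&mut parent, j));
                if ri != rj {
                    parent[rj] = ri;
                }
            }
        }
    }
    (0..n).map(|i| find(&mut parent, i)).collect()
}
```

Union-find keeps the merge step cheap and makes clusters transitive: if A is similar to B and B to C, all three share a label even when A and C fall below the threshold.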

How It Works

graph LR
    A[AI Assistant] -->|SessionStart hook| B[codemem context]
    A -->|PostToolUse hooks| C[codemem ingest]
    A -->|Stop hook| E[codemem summarize]
    A -->|MCP tools| D[codemem serve]
    B -->|Inject context| A
    C --> F[Storage + Vector + Graph]
    D --> F
    F -->|Recall| A
  1. Passively captures what your AI reads, searches, and edits via lifecycle hooks
  2. Actively recalls relevant context via MCP tools with 9-component hybrid scoring
  3. Injects context at session start so your assistant picks up where it left off

Hybrid scoring

| Component | Weight |
|---|---|
| Vector similarity | 25% |
| Graph strength (PageRank + betweenness + degree + cluster) | 25% |
| BM25 token overlap | 15% |
| Temporal | 10% |
| Tags | 10% |
| Importance | 5% |
| Confidence | 5% |
| Recency | 5% |

Weights are configurable at runtime via the set_scoring_weights MCP tool and persist in config.toml.
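
As a rough illustration of how the weights in the table could combine into one score, the sketch below computes a weighted average of per-component signals. It assumes each signal is already normalized to the range 0 to 1 and that weights are renormalized by their sum. Both assumptions, along with the names `DEFAULT_WEIGHTS` and `hybrid_score`, are illustrative and not confirmed by the project.

```rust
/// Default weights in table order: vector similarity, graph strength, BM25,
/// temporal, tags, importance, confidence, recency.
const DEFAULT_WEIGHTS: [f64; 8] = [0.25, 0.25, 0.15, 0.10, 0.10, 0.05, 0.05, 0.05];

/// Weighted average of the component signals (hypothetical sketch).
/// `signals` uses the same order as `weights`; each value is assumed to be in 0.0..=1.0.
fn hybrid_score(signals: &[f64; 8], weights: &[f64; 8]) -> f64 {
    let total: f64 = weights.iter().sum();
    if total == 0.0 {
        return 0.0;
    }
    let weighted: f64 = signals.iter().zip(weights).map(|(s, w)| s * w).sum();
    // Dividing by the total keeps the score in 0.0..=1.0 even if custom weights do not sum to 1.
    weighted / total
}
```

Under these assumptions, a memory that matches only on vector similarity scores 0.25, while one that is perfect on every component scores 1.0.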

Configuration

Embedding providers

By default, Codemem runs a local BERT model (no API key needed). To use a remote provider:

# Ollama (local server)
export CODEMEM_EMBED_PROVIDER=ollama

# OpenAI-compatible (works with Voyage AI, Together, Azure, etc.)
export CODEMEM_EMBED_PROVIDER=openai
export CODEMEM_EMBED_URL=https://api.voyageai.com/v1
export CODEMEM_EMBED_MODEL=voyage-3
export CODEMEM_EMBED_API_KEY=pa-...
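
The selection logic implied by these variables might look like the sketch below. Only the variable name `CODEMEM_EMBED_PROVIDER` and the values `ollama` and `openai` come from this README. The enum, the function names, the `candle` value, and the choice to reject unknown values are assumptions for illustration.

```rust
/// Embedding backends named in this README (the enum itself is hypothetical).
#[derive(Debug, PartialEq)]
enum EmbedProvider {
    Candle,
    Ollama,
    OpenAiCompatible,
}

/// Maps the raw variable value to a provider. Unset or empty falls back to the
/// local model; accepting "candle" and rejecting unknown values are assumptions.
fn parse_provider(value: Option<&str>) -> Result<EmbedProvider, String> {
    match value.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
        None | Some("") | Some("candle") => Ok(EmbedProvider::Candle),
        Some("ollama") => Ok(EmbedProvider::Ollama),
        Some("openai") => Ok(EmbedProvider::OpenAiCompatible),
        Some(other) => Err(format!("unknown embedding provider: {other}")),
    }
}

/// Reads CODEMEM_EMBED_PROVIDER from the process environment.
fn provider_from_env() -> Result<EmbedProvider, String> {
    parse_provider(std::env::var("CODEMEM_EMBED_PROVIDER").ok().as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the fallback behavior easy to test without touching process state.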

Observation compression

Optionally compress raw tool observations via LLM before storage:

export CODEMEM_COMPRESS_PROVIDER=ollama   # or openai, anthropic

Persistent config

Scoring weights, vector/graph tuning, and storage settings persist in ~/.codemem/config.toml. Partial configs merge with defaults.
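
"Partial configs merge with defaults" can be pictured as optional fields overlaid on a default struct. The sketch below shows that pattern with three scoring weights. The struct and field names are hypothetical and do not reflect the actual schema of `config.toml`.

```rust
/// Fully resolved settings (hypothetical subset of the real configuration).
#[derive(Debug, Clone, PartialEq)]
struct ScoringConfig {
    vector_weight: f64,
    graph_weight: f64,
    bm25_weight: f64,
}

impl Default for ScoringConfig {
    fn default() -> Self {
        // Defaults taken from the hybrid scoring table.
        ScoringConfig { vector_weight: 0.25, graph_weight: 0.25, bm25_weight: 0.15 }
    }
}

/// What a user-supplied file might contain: every field is optional.
#[derive(Debug, Default)]
struct PartialScoringConfig {
    vector_weight: Option<f64>,
    graph_weight: Option<f64>,
    bm25_weight: Option<f64>,
}

/// Overlays the values that are present onto the defaults.
fn merge_with_defaults(partial: &PartialScoringConfig) -> ScoringConfig {
    let defaults = ScoringConfig::default();
    ScoringConfig {
        vector_weight: partial.vector_weight.unwrap_or(defaults.vector_weight),
        graph_weight: partial.graph_weight.unwrap_or(defaults.graph_weight),
        bm25_weight: partial.bm25_weight.unwrap_or(defaults.bm25_weight),
    }
}
```

A file that sets only the vector weight therefore leaves the other values at their defaults instead of zeroing them.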

MCP Tools

38 tools organized by category. See MCP Tools Reference for full API documentation.

| Category | Tools |
|---|---|
| Core Memory (8) | store_memory, recall_memory, update_memory, delete_memory, associate_memories, graph_traverse, codemem_stats, codemem_health |
| Self-Editing (3) | refine_memory, split_memory, merge_memories |
| Structural Index (10) | index_codebase, search_symbols, get_symbol_info, get_dependencies, get_impact, get_clusters, get_cross_repo, get_pagerank, search_code, set_scoring_weights |
| Export/Import (2) | export_memories, import_memories |
| Recall & Namespace (4) | recall_with_expansion, list_namespaces, namespace_stats, delete_namespace |
| Consolidation (6) | consolidate_decay, consolidate_creative, consolidate_cluster, consolidate_forget, consolidate_summarize, consolidation_status |
| Impact & Patterns (4) | recall_with_impact, get_decision_chain, detect_patterns, pattern_insights |
| Observability (1) | codemem_metrics |
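
Since the tools are exposed over JSON-RPC, a client call generally follows the MCP `tools/call` shape. The sketch below builds such a request by hand to show the envelope. The helper `tool_call_request` is hypothetical, and the sketch assumes the tool name needs no JSON escaping and that the arguments are already a valid JSON object.

```rust
/// Builds a JSON-RPC 2.0 request for the MCP `tools/call` method.
/// `arguments_json` must already be a valid JSON object; `tool` is inserted
/// unescaped, which is only safe for plain identifiers like the tool names above.
fn tool_call_request(id: u64, tool: &str, arguments_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{id},"method":"tools/call","params":{{"name":"{tool}","arguments":{arguments_json}}}}}"#
    )
}
```

For example, `tool_call_request(1, "get_pagerank", r#"{"top_k":20}"#)` yields a single-line request that could be written to the stdin of `codemem serve`. A real client would normally use a JSON library and perform the MCP initialization handshake first.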

CLI

codemem init          # Initialize project (model + hooks + MCP)
codemem search        # Search memories
codemem stats         # Database statistics
codemem serve         # Start MCP server (JSON-RPC stdio)
codemem index         # Index codebase with tree-sitter
codemem consolidate   # Run consolidation cycles
codemem viz           # Interactive memory graph dashboard
codemem watch         # Real-time file watcher
codemem export/import # Backup and restore (JSONL, JSON, CSV, Markdown)
codemem sessions      # Session management (list, start, end)
codemem doctor        # Health checks on installation
codemem config        # Get/set configuration values
codemem migrate       # Run pending schema migrations

See CLI Reference for full usage.

Performance

| Operation | Target |
|---|---|
| HNSW search k=10 (100K vectors) | < 2ms |
| Embedding (single sentence) | < 50ms |
| Embedding (cache hit) | < 0.01ms |
| Graph BFS depth=2 | < 1ms |
| Hook ingest (Read) | < 200ms |
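
One way to compare measured latencies against targets like these is to summarize samples as p50/p95/p99, the same percentiles the metrics feature reports. The sketch below uses the nearest-rank method. The function name, the microsecond unit, and the choice of method are assumptions; this README does not say how Codemem computes its percentiles.

```rust
/// Nearest-rank percentile of latency samples (for example, in microseconds).
/// `p` is a whole-number percentile such as 50, 95, or 99.
/// Returns `None` when there are no samples.
fn percentile(samples: &[u64], p: usize) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // Integer form of ceil(p / 100 * n), which avoids floating-point rounding.
    let rank = (p * sorted.len() + 99) / 100;
    // Clamp so that p = 0 maps to the smallest sample and p >= 100 to the largest.
    Some(sorted[rank.clamp(1, sorted.len()) - 1])
}
```

With samples of 1 through 100, this returns 50, 95, and 99 for the three percentiles, so a target such as "< 2ms" can be checked against the p99 instead of the mean.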

Building from Source

git clone https://github.com/cogniplex/codemem.git
cd codemem
cargo build --release          # Optimized binary at target/release/codemem
cargo test --workspace         # Run all 415 tests
cargo bench                    # Criterion benchmarks

12-crate Cargo workspace. See CONTRIBUTING.md for development guidelines.

Research and Inspirations

Codemem builds on ideas from several research papers, blog posts, and open-source projects.

Papers

| Paper | Venue | Key Contribution |
|---|---|---|
| HippoRAG | NeurIPS 2024 | Neurobiologically-inspired long-term memory using LLMs + knowledge graphs + Personalized PageRank. Up to 20% improvement on multi-hop QA. |
| From RAG to Memory | ICML 2025 | Non-parametric continual learning for LLMs (HippoRAG 2). 7% improvement in associative memory tasks. |
| A-MEM | 2025 | Zettelkasten-inspired agentic memory with dynamic indexing, linking, and memory evolution. |
| MemGPT | ICLR 2024 | OS-inspired hierarchical memory tiers for LLMs -- self-editing memory via function calls. |
| MELODI | Google DeepMind 2024 | Hierarchical short-term + long-term memory compression. 8x memory footprint reduction. |
| ReadAgent | Google DeepMind 2024 | Human-inspired reading agent with episodic gist memories for 20x context extension. |
| LoCoMo | ACL 2024 | Benchmark for evaluating very long-term conversational memory (300-turn, 9K-token conversations). |
| Mem0 | 2025 | Production-ready AI agents with scalable long-term memory. 26% accuracy improvement over OpenAI Memory. |
| Zep | 2025 | Temporal knowledge graph architecture for agent memory with bi-temporal data model. |
| Memory in the Age of AI Agents | Survey 2024 | Comprehensive taxonomy of agent memory: factual, experiential, working memory. |
| AriGraph | 2024 | Episodic + semantic memory in knowledge graphs for LLM agent exploration. |

Blog posts and techniques

  • Contextual Retrieval (Anthropic, 2024) -- Prepending chunk-specific context before embedding reduces failed retrievals by 49%. Codemem adapts this as template-based contextual enrichment using metadata + graph relationships.
  • Contextual Embeddings Cookbook (Anthropic) -- Implementation guide for contextual embeddings with prompt caching.

Open-source projects

  • AutoMem -- Graph-vector hybrid memory achieving 90.53% on LoCoMo. Direct inspiration for Codemem's hybrid scoring and consolidation cycles.
  • claude-mem -- Persistent memory compression via Claude Agent SDK. Inspired lifecycle hooks and observation compression.
  • Mem0 -- Production memory layer for AI (47K+ stars). Informed memory type design.
  • Zep/Graphiti -- Temporal knowledge graph engine. Inspired graph persistence model.
  • Letta (MemGPT) -- Stateful AI agents with self-editing memory.
  • Cognee -- Knowledge graph memory via triplet extraction.
  • claude-context -- AST-aware code search via MCP (by Zilliz).

See docs/comparison.md for detailed feature comparisons.

License

Apache 2.0
