Add long-term, semantic, and contextual memory to any AI system.
Open source. Self-hosted. Explainable. Framework-agnostic.
OpenMemory is a self-hosted, modular AI memory engine designed to provide persistent, structured, and semantic memory for large language model (LLM) applications.
It enables AI agents, assistants, and copilots to remember user data, preferences, and prior interactions — securely and efficiently.
Unlike traditional vector databases or SaaS “memory layers”, OpenMemory implements a Hierarchical Memory Decomposition (HMD) architecture:
- One canonical node per memory (no data duplication)
- Multi-sector embeddings (episodic, semantic, procedural, emotional, reflective)
- Single-waypoint linking (sparse, biologically-inspired graph)
- Composite similarity retrieval (sector fusion + activation spreading)
This design offers better recall, lower latency, and explainable reasoning at a fraction of the cost.
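To make the structure concrete, here is a minimal sketch of how an HMD memory node could be modeled. The field names are illustrative, not OpenMemory's actual schema:

```typescript
// Hypothetical shape of an HMD memory node (field names are illustrative).
type Sector = "episodic" | "semantic" | "procedural" | "emotional" | "reflective";

interface MemoryNode {
  id: string;                                      // one canonical node per memory
  content: string;                                 // raw memory text
  embeddings: Partial<Record<Sector, number[]>>;   // multi-sector embeddings
  salience: number;                                // importance weight, decayed over time
  lastAccessed: number;                            // unix ms, feeds recency scoring
  waypoint?: { targetId: string; weight: number }; // single sparse link to a related node
}
```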
| Feature / Metric | OpenMemory | Supermemory (SaaS) | Mem0 | OpenAI Memory | LangChain Memory | Vector DBs (Chroma / Weaviate / Pinecone) |
|---|---|---|---|---|---|---|
| Open-source | ✅ MIT | ❌ Closed | ✅ Apache | ❌ Closed | ✅ Apache | ✅ Varies |
| Self-hosted | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ |
| Architecture | HMD v2 (multi-sector + single-waypoint graph) | Flat embeddings | Flat JSON memory | Proprietary long-term cache | Context cache | Vector index |
| Avg response time (100k nodes) | 110-130 ms | 350-400 ms | 250 ms | 300 ms | 200 ms | 160 ms |
| Retrieval depth | Multi-sector fusion + 1-hop waypoint | Single embedding | Single embedding | Unspecified | 1 session only | Single embedding |
| Explainable recall paths | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Cost per 1M tokens (with hosted embeddings) | ~$0.30-0.40 | ~$2.50+ | ~$1.20 | ~$3.00 | User-managed | User-managed |
| Local embeddings support | ✅ (Ollama / E5 / BGE) | ❌ | ✅ | ❌ | Partial | ✅ |
| Ingestion | ✅ (pdf, docx, txt, audio, website) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Scalability model | Horizontally sharded by sector | Vendor scale only | Single node | Vendor scale | In-memory | Horizontally scalable |
| Deployment | Local / Docker / Cloud | Web only | Node app | Cloud | Python SDK | Docker / Cloud |
| Data ownership | 100% yours | Vendor | 100% yours | Vendor | Yours | Yours |
| Use-case fit | Long-term agent memory, assistants, journaling, enterprise copilots | SaaS AI assistants | Basic agent memory | ChatGPT-only | LLM framework | Generic vector search |
Summary:
OpenMemory delivers 2-3× faster contextual recall and 6-10× lower cost than hosted “memory APIs”, while being fully explainable and privacy-controlled.
Prerequisites
- Node.js 20+
- SQLite 3.40+ (bundled)
- Optional: Ollama / OpenAI / Gemini embeddings
```bash
git clone https://github.com/nullure/openmemory.git
cd openmemory/backend
cp .env.example .env
npm install
npm run dev
```

Example `.env` configuration:

```env
OM_PORT=8080
OM_DB_PATH=./data/openmemory.sqlite
OM_EMBEDDINGS=openai
OPENAI_API_KEY=
GEMINI_API_KEY=
OLLAMA_URL=http://localhost:11434
OM_VEC_DIM=768
OM_MIN_SCORE=0.3
OM_DECAY_LAMBDA=0.02
```

Start the server:

```bash
npx tsx src/server.ts
```

OpenMemory runs on http://localhost:8080.

To run with Docker instead:

```bash
docker compose up --build -d
```

Defaults:
- `8080` → OpenMemory API
- Data persisted in `/data/openmemory.sqlite`
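Either way, you can verify the instance through the `/health` endpoint. A minimal check in TypeScript (Node.js 20+ ships a global `fetch`):

```typescript
// Quick health check against a local OpenMemory instance.
const res = await fetch("http://localhost:8080/health");
console.log(res.ok ? "OpenMemory is healthy" : `Unexpected status: ${res.status}`);
```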
| Layer | Technology | Description |
|---|---|---|
| Backend | TypeScript | REST API and orchestration |
| Storage | SQLite (WAL) | Memory metadata, vectors, waypoints |
| Embeddings | E5 / BGE / OpenAI / Gemini / Ollama | Sector-specific embeddings |
| Graph Logic | In-process | Single-waypoint associative graph |
| Scheduler | node-cron | Decay, pruning, log repair |
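As one illustration of the scheduler's decay pass, salience might be attenuated exponentially using the `OM_DECAY_LAMBDA` setting from the `.env` file. This is a hedged sketch under that assumption, not the shipped implementation:

```typescript
// Hypothetical decay pass: exponentially attenuate salience with age,
// using OM_DECAY_LAMBDA (e.g. 0.02 per day) from the .env configuration.
const lambda = Number(process.env.OM_DECAY_LAMBDA ?? 0.02);

function decayedSalience(salience: number, lastAccessedMs: number, nowMs = Date.now()): number {
  const ageDays = (nowMs - lastAccessedMs) / 86_400_000; // ms per day
  return salience * Math.exp(-lambda * ageDays);
}
```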
- User request → Text sectorized into 2–3 likely memory types
- Query embeddings generated for those sectors
- Search over sector vectors + optional mean cache
- Top-K matches → one-hop waypoint expansion
- Ranked by composite score (see the sketch below): `0.6 × similarity + 0.2 × salience + 0.1 × recency + 0.1 × link weight`
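A minimal sketch of that ranking step in TypeScript; the weights come from the formula above, while the field names are assumptions:

```typescript
// Composite ranking from the formula above; all inputs normalized to [0, 1].
interface Candidate {
  similarity: number;  // sector-fused cosine similarity
  salience: number;    // decayed importance weight
  recency: number;     // normalized time since last access
  linkWeight: number;  // weight of the one-hop waypoint edge (0 for direct hits)
}

const compositeScore = (c: Candidate): number =>
  0.6 * c.similarity + 0.2 * c.salience + 0.1 * c.recency + 0.1 * c.linkWeight;

// Rank top-K matches plus their one-hop waypoint expansions.
const rank = (candidates: Candidate[]): Candidate[] =>
  [...candidates].sort((a, b) => compositeScore(b) - compositeScore(a));
```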
```text
[User / Agent]
       │
       ▼
[OpenMemory API]
       │
┌───────────────┬───────────────┐
│ SQLite (meta) │ Vector Store  │
│ memories.db   │ sector blobs  │
└───────────────┴───────────────┘
       │
       ▼
[Waypoint Graph]
```
| Method | Endpoint | Description |
|---|---|---|
| POST | `/memory/add` | Add a memory item |
| POST | `/memory/query` | Retrieve similar memories |
| GET | `/memory/all` | List all stored memories |
| DELETE | `/memory/:id` | Delete a memory |
| GET | `/health` | Health check |
Example
```bash
curl -X POST http://localhost:8080/memory/add \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers dark mode"}'
```
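A corresponding retrieval call from TypeScript; the request body shape here is an assumption based on the endpoint list above, so check the actual API schema:

```typescript
// Hypothetical query against /memory/query; the exact payload fields may differ.
const res = await fetch("http://localhost:8080/memory/query", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "What theme does the user prefer?", k: 5 }),
});
console.log(await res.json()); // ranked memories with composite scores
```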
| Metric | OpenMemory (self-hosted) | Supermemory (SaaS) | Mem0 | Vector DB (avg) |
|---|---|---|---|---|
| Query latency (100k nodes) | 110-130 ms | 350-400 ms | 250 ms | 160 ms |
| Memory addition throughput | ~40 ops/s | ~10 ops/s | ~25 ops/s | ~35 ops/s |
| CPU usage | Moderate (vector math only) | Serverless billed | Moderate | High |
| Storage cost (per 1M memories) | | ~$60+ | ~$20 | ~$10-25 |
| Hosted embedding cost | ~$0.30-0.40 / 1M tokens | ~$2.50+ | ~$1.20 | User-managed |
| Local embedding cost | $0 (Ollama/E5/BGE) | Not supported | Partial | Supported |
| Expected monthly cost (100k memories) | ~$5-8 (self-hosted) | ~$60-120 | ~$25-40 | ~$15-40 |
Result: 5-10× cheaper than SaaS memory solutions, with comparable or better recall fidelity.
- Bearer authentication required for write APIs
- Optional AES-GCM content encryption (see the sketch after this list)
- PII scrubbing and anonymization hooks
- Tenant isolation for multi-user deployments
- Full erasure via `DELETE /memory/:id` or `/memory/delete_all?tenant=X`
- No vendor data exposure; 100% local control
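As an illustration of the optional AES-GCM content encryption noted above, here is a sketch using Node's built-in `crypto` module. Key management is up to your deployment, and this is not OpenMemory's internal code:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypt memory content with AES-256-GCM; prepend the IV and auth tag
// so the ciphertext is self-describing. `key` must be 32 random bytes.
function encrypt(plaintext: string, key: Buffer): Buffer {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const enc = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), enc]);
}

function decrypt(blob: Buffer, key: Buffer): string {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const enc = blob.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // verifies integrity on final()
  return Buffer.concat([decipher.update(enc), decipher.final()]).toString("utf8");
}
```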
| Phase | Focus | Status |
|---|---|---|
| v1.0 | Core HMD backend (multi-sector memory) | ✅ Complete |
| v1.1 | Python SDK + Node SDK | ✅ Complete |
| v1.2 | Dashboard (React) + metrics | ⏳ In progress |
| v1.3 | Learned sector classifier (Tiny Transformer) | 🔜 Planned |
| v1.4 | Federated multi-node mode | 🔜 Planned |
| v1.5 | Pluggable vector backends (pgvector, Weaviate) | 🔜 Planned |
Contributions are welcome.
See CONTRIBUTING.md, GOVERNANCE.md, and CODE_OF_CONDUCT.md for guidelines.
```bash
make build
make test
```

MIT License.
Copyright (c) 2025 OpenMemory.
OpenMemory aims to become the standard open-source memory layer for AI agents and assistants — combining persistent semantic storage, graph-based recall, and explainability in a system that runs anywhere.
It bridges the gap between vector databases and cognitive memory systems, delivering high-recall reasoning at low cost — a foundation for the next generation of intelligent, memory-aware AI.