Claude-style long-term memory with 4/6/8-bit TurboQuant compression, running on a laptop.
Live Demo (Colab) · Docs · Roadmap · Discussions
| Feature | TurboMemory | Mem0 | Zep | LangMem |
|---|---|---|---|---|
| Embedding Compression | ✅ 4/6/8-bit packed | ❌ | ❌ | ❌ |
| Self-Healing (autoDream) | ✅ Merge, dedupe, resolve contradictions | Partial | Partial | ❌ |
| Retrieval Verification | ✅ Cross-reference scoring | ❌ | ❌ | ❌ |
| Quality Scoring | ✅ Confidence + freshness + specificity | ❌ | ❌ | ❌ |
| Exclusion Rules | ✅ Configurable "what NOT to store" | ❌ | ❌ | ❌ |
| Runs on Laptop | ✅ SQLite + local models | ❌ Needs server | | |
| Memory Size (10K chunks) | ~5 MB (6-bit) | ~150 MB | ~200 MB | ~150 MB |
| Open Source | ✅ MIT | ✅ Apache 2.0 | ❌ | ❌ |
| Plugin System | ✅ Scorers, providers, storage | ❌ | ❌ | ❌ |
The compression advantage: TurboMemory's 6-bit quantization stores embeddings at under 20% of their full float32 size (a 5.3x reduction) while keeping >0.95 cosine similarity. That means 10,000 memories in ~5 MB instead of ~150 MB.
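The arithmetic behind these ratios can be sketched in a few lines. The snippet below is an illustrative re-implementation, not TurboMemory's actual codec: `quantize` and `pack` are hypothetical names, and the min/max scaling scheme is an assumption.

```python
import math
import random

def quantize(vec, bits):
    """Scalar-quantize a float vector to n-bit integer codes via min/max scaling."""
    lo, hi = min(vec), max(vec)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels or 1.0
    codes = [round((v - lo) / scale) for v in vec]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from the codes."""
    return [lo + c * scale for c in codes]

def pack(codes, bits):
    """Pack n-bit codes tightly into bytes (no per-value padding)."""
    buf, acc, nbits = bytearray(), 0, 0
    for c in codes:
        acc = (acc << bits) | c
        nbits += bits
        while nbits >= 8:
            nbits -= 8
            buf.append((acc >> nbits) & 0xFF)
    if nbits:
        buf.append((acc << (8 - nbits)) & 0xFF)
    return bytes(buf)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

random.seed(0)
vec = [random.gauss(0, 1) for _ in range(384)]          # a 384-dim embedding
codes, lo, scale = quantize(vec, bits=6)
packed = pack(codes, bits=6)
approx = dequantize(codes, lo, scale)
print(len(packed))                                      # 288 bytes vs 1536 for float32
print(cosine(vec, approx) > 0.95)                       # True
```

384 dims x 6 bits = 2304 bits = 288 bytes exactly, which is where the 5.3x figure in the benchmark table comes from.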
We especially need:
- Benchmarks: compare against Mem0, Zep, LangMem
- LangChain integration: retriever + chat history (started, needs testing)
- Web dashboard: Streamlit app for browsing memories
- Documentation: tutorials, API docs, architecture diagrams
Good First Issues · Contributing Guide
```bash
pip install -e .
```

```python
from turbomemory import TurboMemory

with TurboMemory(root="my_memory") as tm:
    # Add memory
    tm.add_memory("python", "Python uses dynamic typing and garbage collection")

    # Query with verification
    results = tm.verify_and_score("How does Python work?")
    for score, topic, chunk, verif in results:
        print(f"{'✓' if verif.verified else '?'} {chunk['text']}")
```

**Memory = index, not storage**
`MEMORY.md` stores only pointers (~150 chars/line). Actual knowledge lives in topic files.
**3-layer bandwidth-aware design.** Index (always) → Topics (on-demand) → Transcripts (append-only)
**Strict write discipline.** Write to file, then update the index. Never dump content into the index.
**Background memory rewriting (autoDream).** Merges duplicates, resolves contradictions, converts vague → absolute. Memory is continuously edited.
**Staleness is first-class.** If memory ≠ reality, memory is wrong. Code-derived facts are never stored.
**Retrieval is skeptical, not blind.** Memory is a hint, not truth. Cross-reference verification before use.
**What we don't store is the real insight.** No debug logs, no code structure, no PR history. If derivable, don't persist.
- SQLite index with connection pooling
- Packed quantization (4-bit / 6-bit / 8-bit) with up to 8x compression
- Topic centroid prefilter for fast retrieval
- Contradiction detection + confidence decay
- TTL (time-to-live) for memory chunks
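The topic-centroid prefilter can be sketched as follows. This is a toy model with 2-d vectors and a hypothetical data layout (the real implementation works over the SQLite index): the query is compared against one centroid per topic, and only chunks in the best-matching topics are scanned.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def centroid(vectors):
    """Mean vector of a topic's chunk embeddings."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def prefilter_search(query, topics, top_topics=1):
    """Rank topics by centroid similarity, then scan chunks only in the best topics."""
    ranked = sorted(topics, key=lambda t: cosine(query, t["centroid"]), reverse=True)
    hits = [(cosine(query, vec), t["name"], text)
            for t in ranked[:top_topics]
            for text, vec in t["chunks"]]
    return sorted(hits, reverse=True)

# Toy corpus: two topics with tiny 2-d "embeddings".
topics = [
    {"name": "python", "chunks": [("gc", [0.9, 0.1]), ("typing", [0.8, 0.2])]},
    {"name": "rust",   "chunks": [("borrowck", [0.1, 0.9])]},
]
for t in topics:
    t["centroid"] = centroid([vec for _, vec in t["chunks"]])

print(prefilter_search([1.0, 0.0], topics)[0][1])  # python
```

The win is that full chunk scoring happens only inside `top_topics` topics instead of across the whole store.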
- Cross-reference verification across topics
- Agreement scoring between related chunks
- Contradiction flagging during retrieval
- Optional "verified-only" query mode
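Cross-reference agreement scoring can be sketched like this. It is a toy model, not TurboMemory's implementation: a chunk counts as verified when enough related chunks point the same way, and both thresholds are invented for illustration.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def verify(chunk_vec, related_vecs, threshold=0.8, min_agreement=0.5):
    """A chunk is 'verified' when enough related chunks agree with it."""
    if not related_vecs:
        return {"verified": False, "agreement": 0.0}
    agree = sum(1 for v in related_vecs if cosine(chunk_vec, v) >= threshold)
    agreement = agree / len(related_vecs)
    return {"verified": agreement >= min_agreement, "agreement": agreement}

chunk = [1.0, 0.0, 0.0]            # the claim being checked
related = [
    [0.95, 0.05, 0.0],             # agrees
    [0.0, 1.0, 0.0],               # unrelated, counts against agreement
    [0.9, 0.1, 0.0],               # agrees
]
print(verify(chunk, related))      # verified: 2 of 3 related chunks agree
```

A "verified-only" query mode then simply filters results to those with `verified == True`.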
- Per-chunk quality scores (confidence + freshness + specificity + verification)
- Automatic quality decay over time
- Quality-based ranking adjustments
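A composite score with decay along these lines might look as follows. The weights, the verification bonus, and the 30-day half-life below are invented for illustration, not TurboMemory's shipped values.

```python
def quality(confidence, specificity, verified, age_days, half_life_days=30.0):
    """Combine confidence, freshness, specificity, and a verification bonus.
    Freshness decays exponentially: it halves every `half_life_days`."""
    freshness = 0.5 ** (age_days / half_life_days)
    bonus = 0.1 if verified else 0.0
    return min(0.4 * confidence + 0.3 * freshness + 0.3 * specificity + bonus, 1.0)

print(quality(0.9, 0.8, True, age_days=0))    # 1.0 (capped)
print(quality(0.9, 0.8, True, age_days=60))   # ~0.775 after two half-lives
```

Ranking by such a score lets stale-but-confident memories lose gradually to fresher, verified ones.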
- Configurable patterns for what NOT to store
- Blocks: debug output, code snippets, secrets, PR history
- Exclusion logging for auditability
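In spirit, the exclusion filter might look like the sketch below. The patterns are hypothetical examples of the listed categories, not TurboMemory's shipped rules.

```python
import re

# Hypothetical exclusion rules, one per category from the list above.
EXCLUSION_PATTERNS = [
    (re.compile(r"(?i)\b(api[_-]?key|secret|password|token)\b"), "secret"),
    (re.compile(r"(?m)^\s*(Traceback \(most recent call last\)|DEBUG[: ])"), "debug output"),
    (re.compile(r"(?m)^(    |\t)\S"), "code snippet"),
    (re.compile(r"(?i)\bPR #\d+\b"), "PR history"),
]

def should_store(text, log=None):
    """Return False, and log the reason for auditability, if text matches a rule."""
    for pattern, reason in EXCLUSION_PATTERNS:
        if pattern.search(text):
            if log is not None:
                log.append((reason, text[:40]))
            return False
    return True

audit = []
print(should_store("Python uses dynamic typing", audit))        # True
print(should_store("my api_key is sk-123", audit), audit[0][0])  # False secret
```

Keeping the rules as data (pattern, reason) is what makes them configurable and the exclusions auditable.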
- Semantic merging of similar chunks
- Contradiction resolution (older chunks decayed)
- Vague-to-absolute language conversion
- Aggressive deduplication and pruning
- Per-topic health scores (0.0 - 1.0)
- Consolidation event logging
- Comprehensive metrics (JSON output)
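A minimal sketch of the dedup/merge step, assuming cosine similarity over stored embeddings; the 0.95 threshold and the survivors-over-originals health metric are illustrative, not the real consolidator's logic.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def consolidate(chunks, merge_threshold=0.95):
    """Greedy semantic dedup: fold each chunk into an earlier near-duplicate.
    Health here is survivors/originals; lower means more redundancy existed."""
    survivors = []
    for text, vec in chunks:
        for i, (kept_text, kept_vec, count) in enumerate(survivors):
            if cosine(vec, kept_vec) >= merge_threshold:
                survivors[i] = (kept_text, kept_vec, count + 1)  # merge, keep first wording
                break
        else:
            survivors.append((text, vec, 1))
    health = len(survivors) / len(chunks) if chunks else 1.0
    return survivors, health

chunks = [
    ("Python has GC", [1.0, 0.0]),
    ("Python is garbage-collected", [0.999, 0.04]),  # near-duplicate, gets merged
    ("Rust has a borrow checker", [0.0, 1.0]),
]
survivors, health = consolidate(chunks)
print(len(survivors), round(health, 2))  # 2 0.67
```

Contradiction resolution would hook in at the merge point, decaying the older chunk instead of folding it in.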
- Custom quality scorers
- Custom embedding providers
- Custom storage backends (Redis, PostgreSQL, etc.)
- Custom verification strategies
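A plugin hook of this kind is often expressed as a small structural interface. The sketch below assumes a `QualityScorer` protocol and a `LengthPenaltyScorer` example plugin; both names are hypothetical, and the real hook may differ.

```python
from typing import Protocol

class QualityScorer(Protocol):
    """Hypothetical plugin interface: anything with a score(chunk) method qualifies."""
    def score(self, chunk: dict) -> float: ...

class LengthPenaltyScorer:
    """Example plugin: treat longer chunks as more specific (toy heuristic)."""
    def score(self, chunk: dict) -> float:
        return min(len(chunk.get("text", "")) / 100.0, 1.0)

def rank(chunks: list, scorer: QualityScorer) -> list:
    """Core code depends only on the interface, never on a concrete scorer."""
    return sorted(chunks, key=scorer.score, reverse=True)

chunks = [
    {"text": "GC"},
    {"text": "Python uses reference counting plus a cyclic garbage collector"},
]
print(rank(chunks, LengthPenaltyScorer())[0]["text"][:6])  # Python
```

The same shape works for providers and storage backends: the host calls the protocol, plugins supply the class.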
- LangChain: `TurboMemoryRetriever`, `TurboMemoryChatMessageHistory`
- CrewAI: memory provider example
- More coming: AutoGen, LlamaIndex, Haystack
```bash
# Add memory
python cli.py add_memory --topic turboquant.video --text "TurboQuant-v3 uses block matching" --bits 6

# Query with verification
python cli.py query --query "How does TurboQuant work?" --verify

# Stats with topic health
python cli.py stats

# Consolidate
python consolidator.py
```

```python
from turbomemory.integrations.langchain import TurboMemoryRetriever

retriever = TurboMemoryRetriever(root="my_memory", k=5, enable_verification=True)
docs = retriever.invoke("What is TurboQuant?")
```

```bash
pip install streamlit
streamlit run dashboard.py
```

| Bits | Original (384-dim) | Compressed | Ratio | Similarity |
|---|---|---|---|---|
| 4-bit | 1536 bytes | ~192 bytes | 8.0x | >0.90 |
| 6-bit | 1536 bytes | ~288 bytes | 5.3x | >0.95 |
| 8-bit | 1536 bytes | ~384 bytes | 4.0x | >0.99 |
Run benchmarks: `python -m benchmarks.compression_bench`
```
MEMORY.md         (index, always loaded)
    ↓
topics/*.tmem     (structured topic files, loaded on demand)
    ↓
sessions/*.jsonl  (immutable logs, appended only)
    ↓
db/index.sqlite   (fast retrieval, connection pooled)
    ↓
plugins/          (custom scorers, providers, storage, verification)
```
We welcome contributions! See CONTRIBUTING.md for details.
Good first issues: View issues
Roadmap: ROADMAP.md
MIT; see LICENSE