A local-first, cognitively-inspired memory system for AI agents. Built on Complementary Learning Systems (CLS) theory, it provides persistent, searchable, and self-organizing memory with emotional modulation, graph-based relationships, and sleep-inspired consolidation.
The system uses four complementary storage layers:
| Layer | Technology | Role |
|---|---|---|
| Raw Logs | Append-only JSONL | Immutable conversation history with byte-offset indexing |
| Metadata | SQLite (async) | Memory records, access tracking, compaction history, policy decisions |
| Graph | Kuzu (embedded) | Relationships between memories and entities |
| Vector | Qdrant (embedded) | Semantic search via text embeddings, visual search via CLIP |
- LLM-driven decisions — Each conversation turn is evaluated by an LLM to decide whether it should be saved as a memory, with structured confidence scoring
- Fast-path bypass — High-arousal or high-surprise events are saved immediately without LLM evaluation
- Retrieval gap awareness (A2.1) — When the system detects a gap in its knowledge (recent retrieval misses), the save threshold is lowered to capture more information in that area
- Three-layer search — Grep (ripgrep subprocess), keyword (SQLite FTS), and semantic (Qdrant vector similarity) layers run in parallel, results are merged
- Graph traversal — Related memories are discovered via graph walks up to configurable depth
- Mood-congruent weighting — Emotionally similar memories are boosted based on the current emotional context
- Decay scoring — Memory relevance decays over time using an exponential model with recency and frequency components
- Arousal and surprise slow decay — High-arousal or surprising memories decay more slowly, reflecting the psychological finding that emotional events are better remembered
- Semantic floor — Semantic (high-generation) memories never decay below a floor of 0.3, preserving distilled knowledge
- Keyword-overlap grouping — Memories sharing sufficient keyword overlap are candidates for merging
- Valence exclusion — Emotionally opposed memories (large valence delta) are never merged together
- Generation gap guard (A2.3) — Only memories within 1 generation of each other can be merged, preventing loss of abstraction hierarchy
- Merge validation via generative replay (A2.4) — Before committing a merge, synthetic queries test whether the merged memory retrieves as well as the originals; merges that degrade retrieval beyond tolerance are rejected
- Tiering — Memories flow through hot → warm → cold tiers based on age, access patterns, and generation
- Lineage tracking — Full
EVOLVED_FROMgraph edges preserve compaction history
- Keywords shared by graph-connected memories are boosted in weight, improving retrieval for strongly related topics
- Random walk strategy — Semi-random graph traversals discover latent connections between memories from different sessions
- Cluster bridge strategy — Identifies memories at the boundary of semantic clusters and tests cross-cluster connections
- Edge commitment — Discovered relationships are committed to the graph with full logging of exploration runs
- Decision-outcome pairing — Save and retrieval decisions are logged with outcomes to enable future policy learning
- Outcome assessment — Periodic assessment marks whether saved memories were actually useful (accessed) and whether retrievals were helpful (no immediate re-query on same topic)
- Training data export — JSONL export of labeled decision data for future RL-based policy training
- Heuristic-based controller providing
should_save,retrieval_config, andcompaction_prioritymethods - Hard constraints enforced: never delete raw logs, never compact fast-path gen-0 memories
- Designed as a drop-in point for a future learned policy (v2)
- CLIP embeddings — Images are embedded using OpenCLIP for cross-modal retrieval
- Dual-code search — Text queries can retrieve visual memories and vice versa
agent_memory/
├── config.py # Central configuration
├── models.py # Data models (Memory, SaveDecision, etc.)
├── core/
│ ├── memory_manager.py # Main orchestrator
│ ├── save_decision.py # LLM-driven save logic with gap awareness
│ ├── retrieval.py # Three-layer retrieval with mood weighting
│ ├── compaction.py # Merge, tier, validate with generative replay
│ ├── decay.py # Exponential decay with emotional modulation
│ ├── keyword_reweight.py # Graph-informed keyword boosting
│ └── dream_explorer.py # Exploratory graph walks
├── storage/
│ ├── jsonl_log.py # Immutable JSONL log with byte-offset index
│ ├── sqlite_store.py # Async SQLite for metadata and policy data
│ ├── graph_store.py # Kuzu graph for relationships
│ └── vector_store.py # Qdrant vectors for semantic search
├── embeddings/
│ ├── text_embedder.py # Sentence-transformer embeddings
│ └── visual_embedder.py # CLIP embeddings for images
├── emotion/
│ └── scorer.py # Emotional dimension scoring
├── llm/
│ └── client.py # LiteLLM provider-agnostic inference
└── policy/
├── controller.py # Heuristic policy controller (v1)
├── outcome_assessor.py # Decision-outcome assessment
└── export.py # Training data export
pip install -e ".[dev]"Requires Python 3.11+.
- litellm — Provider-agnostic LLM inference
- sentence-transformers — Text embeddings (default:
all-MiniLM-L6-v2) - open-clip-torch — Visual embeddings (default:
ViT-B-32) - qdrant-client — Embedded vector store
- kuzu — Embedded graph database
- aiosqlite — Async SQLite access
from agent_memory.core.memory_manager import MemoryManager
manager = MemoryManager()
await manager.initialize()
# Process a conversation turn
await manager.process_turn(
session_id="session-001",
turn_number=1,
content="The user asked about Python async patterns...",
emotional_context={"valence": 0.3, "arousal": 0.2},
)
# Retrieve relevant memories
results = await manager.retrieve(
query="async patterns in Python",
emotional_context={"valence": 0.0, "arousal": 0.0},
)
# Run compaction (includes keyword reweighting, dream exploration, outcome assessment)
await manager.run_compaction()
# Clean up
await manager.close()All settings are in agent_memory/config.py via the MEMORY_CONFIG dictionary. Key settings:
| Setting | Default | Description |
|---|---|---|
save_confidence_threshold |
0.5 | Minimum LLM confidence to save a memory |
fast_path_arousal |
0.85 | Arousal threshold for fast-path save bypass |
gap_threshold_reduction |
0.7 | Multiplier applied to save threshold when a retrieval gap is detected |
decay_halflife_days |
7 | Half-life for memory decay scoring |
keyword_overlap_merge_threshold |
0.6 | Minimum keyword overlap to consider merging |
max_generation_gap_for_merge |
1 | Maximum generation gap allowed for merging |
merge_degradation_tolerance |
0.15 | Maximum retrieval quality degradation allowed for a merge |
dream_walk_count |
50 | Number of random walks per dream exploration run |
dream_similarity_threshold |
0.7 | Minimum similarity for a discovered edge |
hot_tier_threshold |
500 | Hot-tier count that triggers compaction |
pytest93 tests covering all subsystems: storage layers, save decisions, retrieval, compaction, decay, keyword reweighting, dream exploration, policy logging, and policy controller.
See repository for license details.