A Graph-Theoretic Memory Kernel for Agentic AI Systems
"Beyond RAG: Stateful memory for AI agents that actually remembers."
Launch the interactive dashboard and explore the memory graph:
```bash
# Clone and set up
git clone https://github.com/ARYAN2302/ContextOS.git
cd ContextOS

# Install dependencies
pip install -r requirements.txt

# Launch the demo dashboard
streamlit run examples/app.py
```

Then open http://localhost:8501 and click "Load Demo Brain" to see the 3D memory visualization!
ContextOS is a framework for building AI agents with persistent, structured memory. Unlike standard RAG (Retrieval-Augmented Generation) which treats documents as flat vectors, ContextOS models memory as a Knowledge Graph where:
- Nodes are memories (semantic facts, episodic events, procedural rules)
- Edges encode relationships (temporal, causal, associative)
- Retrieval uses hybrid scoring: PageRank centrality + vector similarity
This enables multi-hop reasoning that pure vector search cannot achieve.
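For illustration (these names are not the ContextOS API), the advantage can be sketched with NetworkX: vector search anchors the query to its best-matching node, and a short graph walk surfaces facts that an embedding match alone would rank poorly:

```python
# Hypothetical sketch, not the ContextOS internals: pure vector search scores
# each memory independently, so a fact two hops from the query never surfaces.
# Walking the graph outward from the best vector match recovers it.
import networkx as nx

G = nx.DiGraph()
G.add_edge("user prefers Python", "user asked about decorators", relation="ASSOCIATIVE")
G.add_edge("user asked about decorators", "agent sent a decorator tutorial", relation="TEMPORAL")

def multi_hop(graph: nx.DiGraph, anchor: str, hops: int = 2) -> set:
    """Return every memory within `hops` edges of the vector-matched anchor."""
    return set(nx.single_source_shortest_path_length(graph, anchor, cutoff=hops))

# The traversal also surfaces the two-hop episodic memory that a
# similarity-only lookup for "Python preferences" would miss.
print(multi_hop(G, "user prefers Python"))
```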
Interactive 3D visualization of the memory graph showing Semantic (Cyan), Episodic (Orange), and Procedural (Green) nodes.
- CoALA Memory Architecture - Semantic, Episodic, and Procedural memory types
- Graph-Native Storage - NetworkX topology + ChromaDB vectors
- Hybrid Retrieval - Relevance = (α × Vector) + (β × Graph)
- Persistent by Default - Memory survives restarts
- Framework Agnostic - Works with LangChain, LlamaIndex, or raw API calls
The Streamlit dashboard includes:
- 3D Brain Visualization - Drag, zoom, and explore nodes
- Color-Coded Nodes - Cyan (Semantic), Orange (Episodic), Green (Procedural)
- Size by PageRank - Important memories appear larger
- Hover Details - See memory content and metadata
- Total Memories - Count of stored memories
- Connections - Relationship edges in the graph
- Vector Index - Semantic search index size
- Graph Density - Connectivity measure
- One-Click Brain Load - Generate 60+ memories instantly
- Needle in Haystack Demo - See how important facts become graph hubs
- Chat Interface - Add memories and query context
- 15x Memory Boost - Llama-8B with ContextOS vs. Llama-8B alone
- 100% vs 6.7% - Accuracy comparison
- Multi-Hop Reasoning - Graph traversal advantages
```bash
pip install agentic-memory
```

Or for local development:

```bash
git clone https://github.com/ARYAN2302/ContextOS.git
cd ContextOS
pip install -r requirements.txt
pip install -e .
```

```python
from agentic_memory import ContextClient, MemoryType

# Initialize (loads existing memory if available)
client = ContextClient()

# Add memories
client.add_memory("User prefers dark mode", MemoryType.SEMANTIC)
client.add_memory("User asked about Python yesterday", MemoryType.EPISODIC)

# Compile context for a query
context = client.compile("What are the user's preferences?")
print(context)
```

```python
from agentic_memory import ContextClient

client = ContextClient()

def my_llm(system_prompt: str, user_query: str) -> str:
    # Your LLM call here (OpenAI, Groq, Anthropic, etc.)
    return llm.invoke(system_prompt + user_query)

# Run a full RAG loop with automatic memory logging
response = client.chat("What should I work on today?", llm_callable=my_llm)
```

```python
from agentic_memory import ContextGraph, ContextNode, ContextEdge, MemoryType, ContextCompiler

# Direct graph access
kernel = ContextGraph()
node1 = ContextNode(content="Important fact", type=MemoryType.SEMANTIC)
node2 = ContextNode(content="Related fact", type=MemoryType.SEMANTIC)
kernel.add_node(node1)
kernel.add_node(node2)

# Add relationships
edge = ContextEdge(source=node1.id, target=node2.id, relation="CAUSES")
kernel.add_edge(edge)

# Compile context
compiler = ContextCompiler(kernel)
context = compiler.compile("query", token_budget=500, alpha=50, beta=50)
```

Does memory make small LLMs useful?
| Setting | Accuracy |
|---|---|
| Llama-8B + ContextOS | 100% |
| Llama-8B alone (stateless) | 6.7% |
A 15x improvement: a small LM with structured memory far outperforms the same model alone.
```bash
cd experiments && python memory_boost_benchmark.py
```

Real HotpotQA dev set with 10 paragraphs per question (2 relevant + 8 distractors).
| Method | Exact Match | F1 Score |
|---|---|---|
| ContextOS | 54.0% | 67.7% |
| Vector-only RAG | 48.0% | 64.3% |
| No retrieval (stateless) | 0.0% | 10.8% |
+3.4% F1 over pure vector RAG. Graph structure helps filter noisy distractors.
```bash
cd experiments && python hotpotqa_real_benchmark.py
```

| Configuration | Multi-Hop Accuracy | Analysis |
|---|---|---|
| Vector Only (RAG) | 50% | Found first hop, missed connections |
| Graph Only | 50% | Failed to ground initial query |
| ContextOS (Hybrid) | 100% | Anchored via Vector, traversed via Graph |
```
agentic_memory/
├── client.py          # ContextClient - main entry point
├── core/
│   ├── schema.py      # Pydantic models (ContextNode, ContextEdge, MemoryType)
│   └── graph.py       # Hybrid storage (NetworkX + ChromaDB)
├── memory/
│   ├── ingestor.py    # LLM-powered memory classification
│   └── compiler.py    # PageRank + Vector hybrid retrieval
└── utils/
    └── text.py        # Text processing utilities
```
relevance(node, query) = (α × semantic_similarity) + (β × pagerank_centrality × time_decay)

- α (alpha): Weight for semantic similarity (vector search)
- β (beta): Weight for graph centrality (structural importance)
- time_decay: Recency factor for episodic memories
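A minimal numeric sketch of the formula above (the function name and the exponential half-life are illustrative assumptions, not the library's internals), with similarity and PageRank assumed pre-normalized to [0, 1]:

```python
import math

def relevance(similarity: float, pagerank: float, age_seconds: float,
              alpha: float = 50.0, beta: float = 50.0,
              half_life: float = 86_400.0) -> float:
    """relevance = alpha * similarity + beta * pagerank * time_decay."""
    # Exponential decay: a memory's structural weight halves every `half_life`.
    time_decay = math.exp(-math.log(2) * age_seconds / half_life)
    return alpha * similarity + beta * pagerank * time_decay

# Two equally similar, equally central memories: the fresher one wins.
fresh = relevance(similarity=0.8, pagerank=0.3, age_seconds=0)           # 40 + 15 = 55.0
stale = relevance(similarity=0.8, pagerank=0.3, age_seconds=7 * 86_400)  # decay ≈ 0.008
print(fresh > stale)  # True
```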
```bash
# Install dependencies
pip install -r requirements.txt

# Launch Streamlit dashboard
streamlit run examples/app.py
```

Dashboard Features:
- Interactive 3D memory graph visualization
- Real-time statistics and metrics
- One-click demo brain generation
- Chat interface with memory persistence
- Benchmark comparison charts
```python
client = ContextClient(
    storage_path="my_memory.json",  # Graph persistence
    chroma_path="my_vectors/",      # Vector store
    auto_persist=True               # Save on every change
)

# Retrieval tuning
context = client.compile(
    query="...",
    token_budget=1000,  # Max tokens in context
    alpha=50.0,         # Vector weight
    beta=50.0           # Graph weight
)
```

v0.2 (Planned):

Multi-Session Persistence: Long-term user profiles and cross-session entity resolution.
Memory Consolidation (Sleep Cycles): Background merging of duplicate nodes and pruning of stale memories to optimize graph density.
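Since v0.2 is unreleased, the actual mechanism is still open; one plausible pruning pass over a NetworkX memory graph (purely illustrative, not a committed design) could look like:

```python
# Illustrative only: drop memories whose PageRank falls below a floor,
# which is one way a "sleep-cycle" pass could keep graph density in check.
import networkx as nx

def prune_stale(G: nx.Graph, pagerank_floor: float = 0.1) -> list:
    """Remove low-centrality nodes; return the list of pruned nodes."""
    ranks = nx.pagerank(G)
    stale = [n for n, r in ranks.items() if r < pagerank_floor]
    G.remove_nodes_from(stale)
    return stale

# A hub with nine weakly-connected leaves: pruning keeps only the hub.
G = nx.star_graph(9)  # node 0 is the hub
pruned = prune_stale(G)
print(sorted(G.nodes))  # [0]
```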
v0.3 (Planned):
Causal Reasoning Engine: New edge types (CAUSES, PREVENTS) for deeper logic.
Document Ingestion: Parsing full PDFs into knowledge sub-graphs.
MIT License - See LICENSE for details.
Inspired by the CoALA architecture for cognitive agents.
Aryan - GitHub
Star ⭐ the repo if you find it useful!
