Documentation | Architecture | API | Quick Start
The epistemics layer for AI systems — memory that knows what it knows, what it doesn't, and why it believes what it does.
Most vector databases treat embeddings as the whole story. But AI systems that interact with the real world need facts that expire, sources that lie, memory that decays, and context that matters. contextdb handles all four.
Problem: AI apps forget context across sessions. Each chat starts fresh. RAG is static. Knowledge graphs don't track trust.
Solution: ContextDB is a temporal graph-vector database that remembers, evolves, and validates AI memory.
30-second demo:
# Register a source with a Beta(8, 2) credibility prior
curl -X POST http://localhost:7701/v1/sources \
-d '{"name": "standup_notes", "alpha": 8, "beta": 2}'
# Search with semantic + credibility ranking
curl "http://localhost:7701/v1/search?q=project+status"

Why you need it:
| Without ContextDB | With ContextDB |
|---|---|
| "I don't know" | "Based on yesterday's standup (high credibility)..." |
| Static RAG dumps | Temporal versioning - facts evolve |
| All sources equal | Bayesian credibility propagation |
| "I can't explain my reasoning" | Narrative retrieval with evidence chains and citations |
| Uncalibrated confidence | Platt scaling: 0.7 confidence means 70% true |
| Forget-nothing or forget-everything | GDPR erasure, interference protection, cascade retraction |
See docs/concepts/credibility.md and docs/concepts/sm2.md.
graph LR
subgraph Client SDKs
GO[Go SDK]
PY[Python SDK]
TS[TypeScript SDK]
end
subgraph Server
GRPC[gRPC :7700]
REST[REST :7701]
OBS[Observe :7702]
ADMIN[Admin UI]
end
subgraph Core Pipeline
EMB[Auto-Embedding]
ING[Ingest + Conflict Detection]
RET[Retrieval + Reranking]
BG[Background Workers]
end
subgraph Backends
MEM[Memory]
BAD[BadgerDB + HNSW]
PG[Postgres + pgvector]
QD[Qdrant]
RD[Redis]
end
GO --> GRPC
PY --> REST
TS --> REST
GRPC --> EMB
REST --> EMB
EMB --> ING
EMB --> RET
ING --> MEM & BAD & PG & QD & RD
RET --> MEM & BAD & PG & QD & RD
BG --> MEM & BAD & PG & QD & RD
style GRPC fill:#4a9eff,stroke:#333,color:#fff
style REST fill:#4a9eff,stroke:#333,color:#fff
style MEM fill:#2ecc71,stroke:#333,color:#fff
style BAD fill:#27ae60,stroke:#333,color:#fff
style PG fill:#16a085,stroke:#333,color:#fff
style QD fill:#8e44ad,stroke:#333,color:#fff
style RD fill:#c0392b,stroke:#333,color:#fff
Go:
db := client.MustOpen(client.Options{})
defer db.Close()
ns := db.Namespace("my-app", namespace.ModeGeneral)
res, _ := ns.Write(ctx, client.WriteRequest{
Content: "Go 1.22 added routing patterns to net/http",
SourceID: "docs-crawler", // tracks credibility of this source over time
Vector: embedding, // or omit — auto-embedded when an Embedder is configured
})
results, _ := ns.Retrieve(ctx, client.RetrieveRequest{
Vector: queryVec,
TopK: 5, // return the 5 highest-scoring results (default: 10)
// TopK controls the result set size. Lower values (1–5) are faster and
// more focused — good for single-answer lookups. Higher values (20–50)
// give the retrieval pipeline more candidates to score, rerank, and
// diversify — better for RAG contexts where you need broad coverage.
})

Python:
from contextdb import ContextDB
db = ContextDB("http://localhost:7701")
ns = db.namespace("my-app", mode="general")
ns.write(content="Go 1.22 added routing patterns", source_id="docs-crawler")
results = ns.retrieve(text="What changed in Go 1.22?", top_k=5)

TypeScript:
import { ContextDB } from "contextdb";
const db = new ContextDB("http://localhost:7701");
const ns = db.namespace("my-app", "general");
await ns.write({ content: "Go 1.22 added routing patterns", sourceId: "docs-crawler" });
const results = await ns.retrieve({ text: "What changed in Go 1.22?", topK: 5 });

Zero external dependencies for embedded mode. One go get and you're running.
Bi-temporal storage -- Every node tracks valid_time (when the fact was true in the world) and transaction_time (when the system learned it). Query what the database knew on any date. Typical vector DBs: single timestamp or none.
// What did we know about rate limits on June 1st?
results, _ := ns.Retrieve(ctx, client.RetrieveRequest{
Text: "API rate limit", TopK: 1, AsOf: time.Date(2024, 6, 1, 0, 0, 0, 0, time.UTC),
})

Source credibility -- Sources have Bayesian credibility (Beta distribution). The admission gate rejects low-credibility writes before they enter the graph. Typical vector DBs: trust everything equally.
// Moderator writes are always admitted at full confidence
ns.LabelSource(ctx, "moderator:alice", []string{"moderator"})
// Flag a troll -- all future writes rejected at the gate
ns.LabelSource(ctx, "user:spammer", []string{"troll"})

Conflict detection -- Contradictions are identified at write time via semantic similarity + label overlap, then tracked as contradicts edges in the graph. Example. Typical vector DBs: no contradiction awareness.
Credibility learning -- Sources that produce validated claims gain trust; those that contradict reliable facts lose it. Updates are Bayesian -- uncertainty decreases with more observations. Example. Typical vector DBs: static trust scores.
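The Beta update behind credibility learning can be sketched in a few lines (illustrative type and method names, not the internal API):

```go
package main

import "fmt"

// BetaCredibility sketches Bayesian source trust: Alpha counts validated
// claims, Beta counts contradicted ones.
type BetaCredibility struct{ Alpha, Beta float64 }

// Observe updates the posterior: a validation adds to Alpha,
// a contradiction adds to Beta.
func (b *BetaCredibility) Observe(validated bool) {
	if validated {
		b.Alpha++
	} else {
		b.Beta++
	}
}

// Mean is the expected credibility.
func (b BetaCredibility) Mean() float64 { return b.Alpha / (b.Alpha + b.Beta) }

// Variance shrinks as evidence accumulates -- more observations,
// less uncertainty, regardless of direction.
func (b BetaCredibility) Variance() float64 {
	n := b.Alpha + b.Beta
	return (b.Alpha * b.Beta) / (n * n * (n + 1))
}

func main() {
	src := BetaCredibility{Alpha: 8, Beta: 2} // prior from the demo above
	before := src.Variance()
	src.Observe(true) // one of this source's claims was validated
	fmt.Printf("mean=%.3f varianceShrank=%v\n", src.Mean(), src.Variance() < before)
}
```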
Memory decay -- Different knowledge decays at different rates. Episodic memories (half-life ~9h) fade quickly; procedural skills (half-life ~29d) persist. Background workers consolidate episodic memories into durable semantic knowledge. Example. Typical vector DBs: no decay model.
Hybrid retrieval -- Fan out to vector ANN, graph walk, and session context simultaneously, then fuse with configurable weights. MMR diversity prevents near-duplicate results. Example. Typical vector DBs: vector-only.
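The MMR diversification step can be sketched as follows (an illustrative implementation of the standard algorithm, not the retrieval package's API):

```go
package main

import "fmt"

// mmr greedily picks k candidates, trading relevance against similarity to
// already-picked results. lambda=1 is pure relevance, lambda=0 pure diversity.
func mmr(relevance []float64, sim [][]float64, k int, lambda float64) []int {
	picked := []int{}
	used := make([]bool, len(relevance))
	for len(picked) < k && len(picked) < len(relevance) {
		best, bestScore := -1, -1e18
		for i := range relevance {
			if used[i] {
				continue
			}
			// penalty: similarity to the closest already-picked result
			maxSim := 0.0
			for _, j := range picked {
				if sim[i][j] > maxSim {
					maxSim = sim[i][j]
				}
			}
			s := lambda*relevance[i] - (1-lambda)*maxSim
			if s > bestScore {
				best, bestScore = i, s
			}
		}
		used[best] = true
		picked = append(picked, best)
	}
	return picked
}

func main() {
	rel := []float64{0.9, 0.88, 0.5} // candidates 0 and 1 are near-duplicates
	sim := [][]float64{{1, 0.95, 0.1}, {0.95, 1, 0.1}, {0.1, 0.1, 1}}
	fmt.Println(mmr(rel, sim, 2, 0.5)) // [0 2]: the diverse 2 beats duplicate 1
}
```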
Reranking -- Optional LLM cross-encoder reranking after fusion. Falls back gracefully on LLM failure. Typical vector DBs: no reranking.
Caller-supplied weights -- Every query can tune the balance between similarity, confidence, recency, and utility. Or use namespace mode defaults. Typical vector DBs: fixed ranking.
// Boost recency for a news-focused query
results, _ := ns.Retrieve(ctx, client.RetrieveRequest{
Text: "latest updates", TopK: 10,
ScoreParams: core.ScoreParams{SimilarityWeight: 0.3, RecencyWeight: 0.5, ConfidenceWeight: 0.1, UtilityWeight: 0.1},
})

Query DSL -- Two syntax tiers. Pipe syntax for the REPL (search "x" | where confidence > 0.7 | top 5). CQL for apps (FIND "x" WHERE ... WEIGHT ... LIMIT 5). Both compile to the same AST. Example. Typical vector DBs: API-only.
Label filtering -- Restrict retrieval to nodes carrying specific labels. Push-down to backend when supported. Typical vector DBs: no label support.
Belief reconciliation -- When agents disagree, get a structured diff: which claims conflict, the evidence chain for each side, the credibility gap. "Git diff for beliefs." Typical vector DBs: no belief tracking.
diff, _ := retrieval.ComputeBeliefDiff(ctx, graph, "ops", nil)
// diff.Conflicts[0].ClaimA: "Deploy uses blue-green" (conf: 0.9, 3 supporters)
// diff.Conflicts[0].ClaimB: "Deploy uses canary" (conf: 0.7, 1 supporter)

Narrative retrieval -- "Walk me through what you know about X and why." Returns a structured report with citations, evidence chains, contradictions, and a confidence explanation. Example. Typical vector DBs: ranked list of chunks.
Calibration pipeline -- Measure how well confidence predicts truth (Brier score, ECE). Correct it with Platt scaling or isotonic regression. A claim with 0.7 confidence should be true ~70% of the time. Example. Typical vector DBs: uncalibrated scores.
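The Brier score half of that pipeline is simple to sketch (illustrative, not the calibration package's API):

```go
package main

import "fmt"

// brier measures calibration: the mean squared gap between a claim's
// predicted confidence and its 0/1 outcome. Lower is better; 0 is perfect.
func brier(confidence []float64, truth []bool) float64 {
	sum := 0.0
	for i, c := range confidence {
		y := 0.0
		if truth[i] {
			y = 1.0
		}
		d := c - y
		sum += d * d
	}
	return sum / float64(len(confidence))
}

func main() {
	conf := []float64{0.9, 0.7, 0.7, 0.2}
	truth := []bool{true, true, false, false}
	fmt.Printf("brier=%.3f\n", brier(conf, truth))
}
```

A pipeline like this computes the score on validated claims, then fits Platt scaling or isotonic regression to pull stated confidence toward observed frequency.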
Interference detection -- A low-credibility source can't erode a well-established, well-cited claim. The contradiction is tracked, but the original claim's confidence is protected. Example. Typical vector DBs: last-write-wins.
Knowledge gap detection -- "What don't I know about X?" Detects sparse regions in the semantic space and suggests what information to acquire next. Example. Typical vector DBs: no gap awareness.
Retraction -- Non-destructive "I take this back" that cascades through derives_from edges. The audit trail is preserved -- retraction markers, not deletion. Example. Typical vector DBs: hard delete or nothing.
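The cascade can be sketched as a walk over derives_from edges, inverted so we can follow derivations forward (node IDs and the map structure here are illustrative):

```go
package main

import "fmt"

// retractCascade marks a claim retracted and follows derivations to retract
// everything built on it. derivedBy maps a node to the nodes derived from
// it. Nothing is deleted: each entry becomes a retraction marker, so the
// audit trail survives.
func retractCascade(root string, derivedBy map[string][]string) []string {
	retracted := []string{}
	queue := []string{root}
	seen := map[string]bool{}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		if seen[n] {
			continue // derivation graphs can share ancestors
		}
		seen[n] = true
		retracted = append(retracted, n)
		queue = append(queue, derivedBy[n]...)
	}
	return retracted
}

func main() {
	derivedBy := map[string][]string{
		"claim:pricing-v1": {"summary:q3", "answer:42"},
		"summary:q3":       {"report:board"},
	}
	fmt.Println(retractCascade("claim:pricing-v1", derivedBy))
	// [claim:pricing-v1 summary:q3 answer:42 report:board]
}
```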
GDPR erasure -- Audit-trailed right-to-erasure across graph, vectors, KV cache, and event log. Retracts nodes, deletes embeddings, invalidates edges, preserves the audit shape. Example. Typical vector DBs: manual deletion.
Claim federation -- Gossip-based multi-instance replication using hashicorp/memberlist. Source credibility merges additively in Beta space -- two instances observing the same source is more evidence, not a conflict. Example. Typical vector DBs: single instance only.
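One plausible way to merge two instances' views additively in Beta space without double-counting a shared prior (a sketch of the idea; the actual merge rule may differ):

```go
package main

import "fmt"

// Beta holds the pseudo-counts of a source's credibility distribution.
type Beta struct{ Alpha, Beta float64 }

// merge adds the two instances' evidence and subtracts the shared prior
// once, so it is not counted twice. More observers means a tighter
// posterior, not a conflict.
func merge(a, b, prior Beta) Beta {
	return Beta{
		Alpha: a.Alpha + b.Alpha - prior.Alpha,
		Beta:  a.Beta + b.Beta - prior.Beta,
	}
}

func main() {
	prior := Beta{1, 1}
	nodeA := Beta{8, 2} // instance A's view of the source
	nodeB := Beta{5, 1} // instance B's view of the same source
	fmt.Println(merge(nodeA, nodeB, prior)) // combined evidence: {12 2}
}
```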
Namespace modes -- belief_system, agent_memory, general, procedural. Each ships tuned defaults for scoring weights, decay rates, and compaction. Switch with one parameter. Typical vector DBs: one-size-fits-all.
Auto-embedding -- Text automatically embedded via OpenAI, local, or custom providers with LRU cache. Send text, get vectors — or bring your own. Typical vector DBs: bring your own vectors.
Active recall -- SM-2 spaced repetition boosts utility for memories that are successfully recalled. Background worker handles scheduling. Example. Typical vector DBs: no recall model.
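The classic SM-2 update behind that scheduling looks like this (a sketch of the textbook algorithm; the worker's implementation may differ):

```go
package main

import "fmt"

// sm2 updates the review interval (days), ease factor, and repetition count
// after a recall graded 0-5, following the classic SM-2 schedule.
func sm2(interval int, ease float64, reps int, grade int) (int, float64, int) {
	if grade < 3 {
		return 1, ease, 0 // failed recall: restart the repetition sequence
	}
	switch reps {
	case 0:
		interval = 1
	case 1:
		interval = 6
	default:
		interval = int(float64(interval) * ease)
	}
	// adjust ease by recall quality, floored at 1.3
	q := float64(grade)
	ease += 0.1 - (5-q)*(0.08+(5-q)*0.02)
	if ease < 1.3 {
		ease = 1.3
	}
	return interval, ease, reps + 1
}

func main() {
	interval, ease, reps := 0, 2.5, 0
	for i := 0; i < 3; i++ {
		interval, ease, reps = sm2(interval, ease, reps, 5)
		fmt.Printf("review %d: next in %d days (ease %.2f)\n", reps, interval, ease)
	}
}
```

Each successful recall widens the interval (1, 6, then interval × ease days), which maps naturally onto boosting a memory's utility term so well-recalled knowledge outranks stale material.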
Memory consolidation -- RAPTOR compaction clusters similar nodes and summarizes them. Episodic memories promote to durable semantic knowledge. Typical vector DBs: no consolidation.
RBAC -- Token-based tenant:permissions:secret controlling read/write/admin per namespace. Typical vector DBs: no access control.
Snapshot/restore -- NDJSON export and import per namespace, including full version history. Typical vector DBs: no portability.
Admin UI -- Built-in dashboard on the observe port with stats, metrics links, and time-travel queries. Typical vector DBs: external tooling.
score = w_sim * cosine(candidate, query) + w_conf * confidence + w_rec * exp(-alpha * age) + w_util * utility
All weights normalised at query time. Different namespace modes ship tuned defaults.
| Mode | Backend | Use case |
|---|---|---|
| Embedded | In-memory or BadgerDB | Dev, testing, sidecars, CLIs |
| Standard | Postgres + pgvector | Production single-node |
| Remote | gRPC to contextdb server | Microservices, multi-language clients |
| Scaled | Qdrant + Redis + Postgres | High-throughput production |
go get github.com/antiartificial/contextdb@latest

# Run the server (no external dependencies)
make run
# With Postgres
docker compose up --build
# Scaled mode (Qdrant + Redis + Postgres)
docker compose --profile scaled up --build
# Run all tests
make test
# Coverage
make cover-text
# Benchmarks
make bench-mteb # MTEB retrieval quality
make bench-adversarial # Poisoning resistance, temporal consistency
make bench # Full benchmark suite with HTML report

contextdb/
├── cmd/contextdb/ # server entrypoint (gRPC + REST + observe)
├── internal/
│ ├── core/ # Node, Edge, Source, ScoreParams
│ ├── store/ # GraphStore, VectorIndex, KVStore, EventLog
│ │ ├── memory/ # in-process backend
│ │ ├── badger/ # BadgerDB + HNSW backend
│ │ ├── postgres/ # Postgres + pgvector backend
│ │ ├── qdrant/ # Qdrant vector backend (scaled mode)
│ │ ├── redis/ # Redis KV + EventLog backend (scaled mode)
│ │ └── remote/ # gRPC remote store client
│ ├── embedding/ # auto-embedding (OpenAI, local, cached)
│ ├── extract/ # LLM entity/relation extraction
│ ├── ingest/ # admission gate, conflict detection, credibility learning
│ ├── compact/ # RAPTOR compaction, memory consolidation, active recall
│ ├── dsl/ # query languages (pipe syntax + CQL)
│ ├── retrieval/ # hybrid retrieval, scoring, reranking
│ ├── server/ # gRPC + REST + RBAC + auth
│ ├── admin/ # admin dashboard UI
│ ├── snapshot/ # NDJSON export/import
│ ├── namespace/ # mode presets
│ ├── federation/ # gossip-based claim federation
│ └── observe/ # metrics, pprof, health
├── pkg/client/ # Go SDK
├── sdk/
│ ├── python/ # Python SDK (pip install contextdb)
│ └── typescript/ # TypeScript SDK (npm install contextdb)
├── bench/
│ ├── longmemeval/ # LongMemEval benchmark
│ ├── mteb/ # MTEB retrieval quality
│ └── adversarial/ # adversarial resistance
├── deploy/helm/contextdb/ # Helm chart
└── docs/ # Documentation (GitHub Pages)
- Zep / Graphiti -- bi-temporal KG for agent memory
- Hindsight -- TEMPR multi-strategy retrieval
- RAPTOR -- hierarchical summarisation for compaction
- A-MAC -- adaptive memory admission control
MIT