contextdb

Documentation | Architecture | API | Quick Start

The epistemics layer for AI systems — memory that knows what it knows, what it doesn't, and why it believes what it does.

Most vector databases treat embeddings as the whole story. But AI systems that interact with the real world must handle facts that expire, sources that lie, memory that decays, and context that matters. contextdb handles all four.

Documentation | Quick Start | API Reference | Python SDK | TypeScript SDK

What & Why

Problem: AI apps forget context across sessions. Each chat starts fresh. RAG is static. Knowledge graphs don't track trust.

Solution: ContextDB is a temporal graph-vector database that remembers, evolves, and validates AI memory.

30-second demo:

# Register a source with a credibility prior (8 validated claims, 2 contradicted)
curl -X POST http://localhost:7701/v1/sources \
  -d '{"name": "standup_notes", "alpha": 8, "beta": 2}'

# Search with semantic + credibility ranking
curl "http://localhost:7701/v1/search?q=project+status"

Why you need it:

| Without ContextDB | With ContextDB |
| --- | --- |
| "I don't know" | "Based on yesterday's standup (high credibility)..." |
| Static RAG dumps | Temporal versioning - facts evolve |
| All sources equal | Bayesian credibility propagation |
| "I can't explain my reasoning" | Narrative retrieval with evidence chains and citations |
| Uncalibrated confidence | Platt scaling: 0.7 confidence means 70% true |
| Forget-nothing or forget-everything | GDPR erasure, interference protection, cascade retraction |

See docs/concepts/credibility.md and docs/concepts/sm2.md.

Architecture

graph LR
    subgraph Client SDKs
        GO[Go SDK]
        PY[Python SDK]
        TS[TypeScript SDK]
    end

    subgraph Server
        GRPC[gRPC :7700]
        REST[REST :7701]
        OBS[Observe :7702]
        ADMIN[Admin UI]
    end

    subgraph Core Pipeline
        EMB[Auto-Embedding]
        ING[Ingest + Conflict Detection]
        RET[Retrieval + Reranking]
        BG[Background Workers]
    end

    subgraph Backends
        MEM[Memory]
        BAD[BadgerDB + HNSW]
        PG[Postgres + pgvector]
        QD[Qdrant]
        RD[Redis]
    end

    GO --> GRPC
    PY --> REST
    TS --> REST
    GRPC --> EMB
    REST --> EMB
    EMB --> ING
    EMB --> RET
    ING --> MEM & BAD & PG & QD & RD
    RET --> MEM & BAD & PG & QD & RD
    BG --> MEM & BAD & PG & QD & RD

    style GRPC fill:#4a9eff,stroke:#333,color:#fff
    style REST fill:#4a9eff,stroke:#333,color:#fff
    style MEM fill:#2ecc71,stroke:#333,color:#fff
    style BAD fill:#27ae60,stroke:#333,color:#fff
    style PG fill:#16a085,stroke:#333,color:#fff
    style QD fill:#8e44ad,stroke:#333,color:#fff
    style RD fill:#c0392b,stroke:#333,color:#fff

Five lines to a working database

Go:

db := client.MustOpen(client.Options{})
defer db.Close()

ns := db.Namespace("my-app", namespace.ModeGeneral)
res, _ := ns.Write(ctx, client.WriteRequest{
    Content:  "Go 1.22 added routing patterns to net/http",
    SourceID: "docs-crawler",  // tracks credibility of this source over time
    Vector:   embedding,       // or omit — auto-embedded when an Embedder is configured
})
results, _ := ns.Retrieve(ctx, client.RetrieveRequest{
    Vector: queryVec,
    TopK:   5, // return the 5 highest-scoring results (default: 10)
    // TopK controls the result set size. Lower values (1–5) are faster and
    // more focused — good for single-answer lookups. Higher values (20–50)
    // give the retrieval pipeline more candidates to score, rerank, and
    // diversify — better for RAG contexts where you need broad coverage.
})

Python:

from contextdb import ContextDB

db = ContextDB("http://localhost:7701")
ns = db.namespace("my-app", mode="general")
ns.write(content="Go 1.22 added routing patterns", source_id="docs-crawler")
results = ns.retrieve(text="What changed in Go 1.22?", top_k=5)

TypeScript:

import { ContextDB } from "contextdb";

const db = new ContextDB("http://localhost:7701");
const ns = db.namespace("my-app", "general");
await ns.write({ content: "Go 1.22 added routing patterns", sourceId: "docs-crawler" });
const results = await ns.retrieve({ text: "What changed in Go 1.22?", topK: 5 });

Zero external dependencies for embedded mode. One go get and you're running.

What makes it different

Data model

Bi-temporal storage -- Every node tracks valid_time (when the fact was true in the world) and transaction_time (when the system learned it). Query what the database knew on any date. Typical vector DBs: single timestamp or none.

// What did we know about rate limits on June 1st?
results, _ := ns.Retrieve(ctx, client.RetrieveRequest{
    Text: "API rate limit", TopK: 1, AsOf: time.Date(2024, 6, 1, 0, 0, 0, 0, time.UTC),
})

Source credibility -- Sources have Bayesian credibility (Beta distribution). The admission gate rejects low-credibility writes before they enter the graph. Typical vector DBs: trust everything equally.

// Moderator writes are always admitted at full confidence
ns.LabelSource(ctx, "moderator:alice", []string{"moderator"})

// Flag a troll -- all future writes rejected at the gate
ns.LabelSource(ctx, "user:spammer", []string{"troll"})

Conflict detection -- Contradictions are identified at write time via semantic similarity + label overlap, then tracked as contradicts edges in the graph. Example. Typical vector DBs: no contradiction awareness.

Credibility learning -- Sources that produce validated claims gain trust; those that contradict reliable facts lose it. Updates are Bayesian -- uncertainty decreases with more observations. Example. Typical vector DBs: static trust scores.
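The Beta-distribution bookkeeping behind this fits in a few lines. The type below is a sketch, not contextdb's internal representation; it shows why uncertainty (variance) shrinks as observations accumulate:

```go
package main

import "fmt"

// Source credibility as a Beta(alpha, beta) distribution:
// alpha counts validated claims, beta counts contradicted ones.
type Credibility struct{ Alpha, Beta float64 }

// Mean is the expected credibility, alpha / (alpha + beta).
func (c Credibility) Mean() float64 { return c.Alpha / (c.Alpha + c.Beta) }

// Variance shrinks as observations accumulate: more evidence, less uncertainty.
func (c Credibility) Variance() float64 {
	n := c.Alpha + c.Beta
	return (c.Alpha * c.Beta) / (n * n * (n + 1))
}

// Observe applies a Bayesian update: a validated claim bumps alpha,
// a contradicted claim bumps beta.
func (c Credibility) Observe(validated bool) Credibility {
	if validated {
		c.Alpha++
	} else {
		c.Beta++
	}
	return c
}

func main() {
	c := Credibility{Alpha: 8, Beta: 2} // the standup_notes source from the demo
	c = c.Observe(true)                 // one more validated claim
	fmt.Printf("mean=%.3f variance=%.5f\n", c.Mean(), c.Variance())
}
```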

Memory decay -- Different knowledge decays at different rates. Episodic memories (half-life ~9h) fade quickly; procedural skills (half-life ~29d) persist. Background workers consolidate episodic memories into durable semantic knowledge. Example. Typical vector DBs: no decay model.

Retrieval

Hybrid retrieval -- Fan out to vector ANN, graph walk, and session context simultaneously, then fuse with configurable weights. MMR diversity prevents near-duplicate results. Example. Typical vector DBs: vector-only.
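The MMR diversity step is greedy selection: each pick trades relevance against redundancy with what is already chosen. This standalone sketch illustrates the idea, not contextdb's internal implementation:

```go
package main

import "fmt"

// mmrSelect picks k indices by Maximal Marginal Relevance. rel[i] is the
// query relevance of candidate i; sim[i][j] is pairwise similarity.
// lambda=1 is pure relevance, lambda=0 is pure diversity.
func mmrSelect(rel []float64, sim [][]float64, k int, lambda float64) []int {
	selected := []int{}
	used := make([]bool, len(rel))
	for len(selected) < k && len(selected) < len(rel) {
		best, bestScore := -1, 0.0
		for i := range rel {
			if used[i] {
				continue
			}
			maxSim := 0.0 // redundancy: similarity to the closest already-picked result
			for _, j := range selected {
				if sim[i][j] > maxSim {
					maxSim = sim[i][j]
				}
			}
			score := lambda*rel[i] - (1-lambda)*maxSim
			if best == -1 || score > bestScore {
				best, bestScore = i, score
			}
		}
		used[best] = true
		selected = append(selected, best)
	}
	return selected
}

func main() {
	rel := []float64{0.95, 0.94, 0.60} // candidates 0 and 1 are near-duplicates
	sim := [][]float64{
		{1.0, 0.98, 0.10},
		{0.98, 1.0, 0.12},
		{0.10, 0.12, 1.0},
	}
	fmt.Println(mmrSelect(rel, sim, 2, 0.5)) // prints [0 2]: diversity beats the duplicate
}
```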

Reranking -- Optional LLM cross-encoder reranking after fusion. Falls back gracefully on LLM failure. Typical vector DBs: no reranking.

Caller-supplied weights -- Every query can tune the balance between similarity, confidence, recency, and utility. Or use namespace mode defaults. Typical vector DBs: fixed ranking.

// Boost recency for a news-focused query
results, _ := ns.Retrieve(ctx, client.RetrieveRequest{
    Text: "latest updates", TopK: 10,
    ScoreParams: core.ScoreParams{SimilarityWeight: 0.3, RecencyWeight: 0.5, ConfidenceWeight: 0.1, UtilityWeight: 0.1},
})

Query DSL -- Two syntax tiers. Pipe syntax for the REPL (search "x" | where confidence > 0.7 | top 5). CQL for apps (FIND "x" WHERE ... WEIGHT ... LIMIT 5). Both compile to the same AST. Example. Typical vector DBs: API-only.

Label filtering -- Restrict retrieval to nodes carrying specific labels. Push-down to backend when supported. Typical vector DBs: no label support.

Trust & epistemics

Belief reconciliation -- When agents disagree, get a structured diff: which claims conflict, the evidence chain for each side, the credibility gap. "Git diff for beliefs." Typical vector DBs: no belief tracking.

diff, _ := retrieval.ComputeBeliefDiff(ctx, graph, "ops", nil)
// diff.Conflicts[0].ClaimA: "Deploy uses blue-green" (conf: 0.9, 3 supporters)
// diff.Conflicts[0].ClaimB: "Deploy uses canary" (conf: 0.7, 1 supporter)

Narrative retrieval -- "Walk me through what you know about X and why." Returns a structured report with citations, evidence chains, contradictions, and a confidence explanation. Example. Typical vector DBs: ranked list of chunks.

Calibration pipeline -- Measure how well confidence predicts truth (Brier score, ECE). Correct it with Platt scaling or isotonic regression. A claim with 0.7 confidence should be true ~70% of the time. Example. Typical vector DBs: uncalibrated scores.

Interference detection -- A low-credibility source can't erode a well-established, well-cited claim. The contradiction is tracked, but the original claim's confidence is protected. Example. Typical vector DBs: last-write-wins.

Knowledge gap detection -- "What don't I know about X?" Detects sparse regions in the semantic space and suggests what information to acquire next. Example. Typical vector DBs: no gap awareness.

Operations

Retraction -- Non-destructive "I take this back" that cascades through derives_from edges. The audit trail is preserved -- retraction markers, not deletion. Example. Typical vector DBs: hard delete or nothing.
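The cascade is a transitive walk over derives_from edges. A minimal sketch with a hypothetical edge map (the real implementation records retraction markers on each node in place):

```go
package main

import "fmt"

// retract marks a node and everything derived from it as retracted,
// following derives_from edges transitively. Nothing is deleted; the
// markers preserve the audit trail.
func retract(derivedFrom map[string][]string, root string) map[string]bool {
	retracted := map[string]bool{}
	queue := []string{root}
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		if retracted[id] {
			continue // already visited; graphs may have shared derivations
		}
		retracted[id] = true
		queue = append(queue, derivedFrom[id]...) // cascade to derived claims
	}
	return retracted
}

func main() {
	// summary derives from claim-1; report derives from summary.
	edges := map[string][]string{
		"claim-1": {"summary"},
		"summary": {"report"},
	}
	fmt.Println(retract(edges, "claim-1")) // all three are marked
}
```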

GDPR erasure -- Audit-trailed right-to-erasure across graph, vectors, KV cache, and event log. Retracts nodes, deletes embeddings, invalidates edges, preserves the audit shape. Example. Typical vector DBs: manual deletion.

Claim federation -- Gossip-based multi-instance replication using hashicorp/memberlist. Source credibility merges additively in Beta space -- two instances observing the same source is more evidence, not a conflict. Example. Typical vector DBs: single instance only.
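The additive merge in Beta space can be sketched as follows, assuming both instances start from the same prior (counted once, not twice). The exact merge rule contextdb uses may differ:

```go
package main

import "fmt"

// merge combines two instances' Beta-distributed views of the same source.
// Each instance's counts include the shared prior (a0, b0), so the prior
// is subtracted once to avoid double-counting it.
func merge(a1, b1, a2, b2, a0, b0 float64) (float64, float64) {
	return a1 + a2 - a0, b1 + b2 - b0
}

func main() {
	// Two instances observed the source independently, both from Beta(1, 1).
	a, b := merge(8, 2, 5, 1, 1, 1)
	fmt.Printf("merged Beta(%.0f, %.0f), mean %.3f\n", a, b, a/(a+b))
}
```

Note how two instances observing the same source produce a tighter posterior than either alone: more evidence, not a conflict.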

Namespace modes -- belief_system, agent_memory, general, procedural. Each ships tuned defaults for scoring weights, decay rates, and compaction. Switch with one parameter. Typical vector DBs: one-size-fits-all.

Auto-embedding -- Text automatically embedded via OpenAI, local, or custom providers with LRU cache. Send text, get vectors — or bring your own. Typical vector DBs: bring your own vectors.

Active recall -- SM-2 spaced repetition boosts utility for memories that are successfully recalled. Background worker handles scheduling. Example. Typical vector DBs: no recall model.
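The classic SM-2 update this is based on fits in one function: quality >= 3 grows the review interval and adjusts the ease factor, a failed recall resets the schedule. contextdb's worker may tune the constants; this is the textbook algorithm:

```go
package main

import "fmt"

// sm2 returns the next review interval (days), ease factor, and repetition
// count after a recall graded quality (0-5, >= 3 is a success).
func sm2(interval, ease float64, reps, quality int) (float64, float64, int) {
	if quality < 3 {
		return 1, ease, 0 // failed recall: start over
	}
	switch reps {
	case 0:
		interval = 1
	case 1:
		interval = 6
	default:
		interval = interval * ease // successful repetitions stretch the interval
	}
	q := float64(quality)
	ease += 0.1 - (5-q)*(0.08+(5-q)*0.02)
	if ease < 1.3 {
		ease = 1.3 // SM-2's floor on the ease factor
	}
	return interval, ease, reps + 1
}

func main() {
	interval, ease, reps := 0.0, 2.5, 0
	for _, q := range []int{5, 5, 4} { // three successful recalls
		interval, ease, reps = sm2(interval, ease, reps, q)
		fmt.Printf("interval=%.1f days, ease=%.2f, reps=%d\n", interval, ease, reps)
	}
}
```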

Memory consolidation -- RAPTOR compaction clusters similar nodes and summarizes them. Episodic memories promote to durable semantic knowledge. Typical vector DBs: no consolidation.

RBAC -- Token-based tenant:permissions:secret controlling read/write/admin per namespace. Typical vector DBs: no access control.

Snapshot/restore -- NDJSON export and import per namespace, including full version history. Typical vector DBs: no portability.

Admin UI -- Built-in dashboard on the observe port with stats, metrics links, and time-travel queries. Typical vector DBs: external tooling.

Scoring function

score = w_sim * cosine(candidate, query) + w_conf * confidence + w_rec * exp(-alpha * age) + w_util * utility

All weights normalised at query time. Different namespace modes ship tuned defaults.

Deployment modes

| Mode | Backend | Use case |
| --- | --- | --- |
| Embedded | In-memory or BadgerDB | Dev, testing, sidecars, CLIs |
| Standard | Postgres + pgvector | Production single-node |
| Remote | gRPC to contextdb server | Microservices, multi-language clients |
| Scaled | Qdrant + Redis + Postgres | High-throughput production |

Quick start

go get github.com/antiartificial/contextdb@latest
# Run the server (no external dependencies)
make run

# With Postgres
docker compose up --build

# Scaled mode (Qdrant + Redis + Postgres)
docker compose --profile scaled up --build

# Run all tests
make test

# Coverage
make cover-text

# Benchmarks
make bench-mteb          # MTEB retrieval quality
make bench-adversarial   # Poisoning resistance, temporal consistency
make bench               # Full benchmark suite with HTML report

Project layout

contextdb/
├── cmd/contextdb/           # server entrypoint (gRPC + REST + observe)
├── internal/
│   ├── core/                # Node, Edge, Source, ScoreParams
│   ├── store/               # GraphStore, VectorIndex, KVStore, EventLog
│   │   ├── memory/          # in-process backend
│   │   ├── badger/          # BadgerDB + HNSW backend
│   │   ├── postgres/        # Postgres + pgvector backend
│   │   ├── qdrant/          # Qdrant vector backend (scaled mode)
│   │   ├── redis/           # Redis KV + EventLog backend (scaled mode)
│   │   └── remote/          # gRPC remote store client
│   ├── embedding/           # auto-embedding (OpenAI, local, cached)
│   ├── extract/             # LLM entity/relation extraction
│   ├── ingest/              # admission gate, conflict detection, credibility learning
│   ├── compact/             # RAPTOR compaction, memory consolidation, active recall
│   ├── dsl/                 # query languages (pipe syntax + CQL)
│   ├── retrieval/           # hybrid retrieval, scoring, reranking
│   ├── server/              # gRPC + REST + RBAC + auth
│   ├── admin/               # admin dashboard UI
│   ├── snapshot/            # NDJSON export/import
│   ├── namespace/           # mode presets
│   ├── federation/          # gossip-based claim federation
│   └── observe/             # metrics, pprof, health
├── pkg/client/              # Go SDK
├── sdk/
│   ├── python/              # Python SDK (pip install contextdb)
│   └── typescript/          # TypeScript SDK (npm install contextdb)
├── bench/
│   ├── longmemeval/         # LongMemEval benchmark
│   ├── mteb/                # MTEB retrieval quality
│   └── adversarial/         # adversarial resistance
├── deploy/helm/contextdb/   # Helm chart
└── docs/                    # Documentation (GitHub Pages)

Related work

  • Zep / Graphiti -- bi-temporal KG for agent memory
  • Hindsight -- TEMPR multi-strategy retrieval
  • RAPTOR -- hierarchical summarisation for compaction
  • A-MAC -- adaptive memory admission control

License

MIT
