Releases: asheshgoplani/opengraphdb
v0.6.0
OpenGraphDB v0.6.0
A single-binary embedded graph database in Rust — openCypher + vector + full-text + MCP + RDF. Apache-2.0, no JVM, no AGPL, one file. This is the first public release: the milestone where correctness, reachable authentication, and AI-native retrieval all became true over the wire, not just in the engine.
We lead with honesty. Everything below is backed by a reproducible test or a live transcript, and the "Honest limitations" section is as much a feature of this release as the highlights. Where a competitor would round up, we tell you exactly where the edges are.
Pre-1.0 notice. OpenGraphDB v0.6.0 is single-process and embedded. The API and on-disk format may still change, durability and concurrency are not yet hardened for untrusted multi-tenant network exposure, and several capabilities below are new in this release. Run it where that posture fits — local-first apps, AI agent memory, GraphRAG, a graph that ships inside your binary — and read the limitations before you put it on a shared network.
Why OpenGraphDB
It's the graph database shaped the way an AI actually works: understand knowledge and code, retrieve it efficiently, keep it secured and shareable under access control — all from one file you drop into a binary instead of a cluster you operate.
- One Rust binary. No JVM, no sidecar, no server farm. The visual playground SPA ships inside the binary.
- openCypher with documented dialect limits, plus RDF (Turtle/OWL) import/export round-trip.
- AI-native retrieval: HNSW vectors + BM25 full-text + graph-walk, fused with Reciprocal Rank Fusion (RRF).
- A built-in MCP server so AI agents (Claude Desktop, Cursor, the MCP SDKs) connect directly.
- Apache-2.0. A graph database you can embed, fork, and ship without an AGPL or JVM tax — the Neo4j alternative for people who don't want a cluster.
Highlights
🔐 Authentication that is actually reachable and enforced
RBAC has existed in the engine for several releases, but nothing in a shipped binary could activate it; the token path was effectively dead code. v0.6.0 wires it to a real front door:
- `ogdb user add / grant / revoke / list`: creating the first user activates RBAC for that database. Roles are `admin`, `read_write`, `read_only`. `user list` never prints token values.
- `--require-auth` (or `OGDB_REQUIRE_AUTH=1`) is fail-closed: the server refuses to start if auth is required but no users exist, and every data route returns `401` to anonymous or invalid requests. `/health` stays open for liveness.
- `--require-auth-reads` gates reads (`/query`, `/schema`, `/metrics`) too, while the open, anonymous-read playground remains the explicit default so the demo experience is unchanged unless you opt in.
- Label-injection hardening: the MCP `search_nodes` tool validates labels against `^[A-Za-z_][A-Za-z0-9_]*$` before interpolation, so a crafted label is rejected with `400`, never executed.
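The label check can be illustrated with a minimal Python sketch. Only the regular expression comes from these notes; the function name, the exception type, and the query shape are hypothetical and do not reflect OpenGraphDB's actual (Rust) implementation.

```python
import re

# Pattern quoted from the release notes; everything else is illustrative.
LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


class InvalidLabel(ValueError):
    """Raised for labels that must never be interpolated (a 400 in the notes)."""


def build_label_query(label: str) -> str:
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    if not LABEL_RE.fullmatch(label):
        raise InvalidLabel(f"invalid label: {label!r}")
    # Hypothetical query shape; the real search_nodes query is not shown here.
    return f"MATCH (n:{label}) RETURN n"
```

Validating before interpolation means a crafted label such as `Person) DETACH DELETE n //` fails the check and never reaches the query string.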
🧠 Vector retrieval, end to end over the wire
The HNSW engine (three distance metrics, 1–4096 dimensions, crash-safe persistence) was always strong — but you couldn't create a vector index from any client. Now you can:
- `CREATE VECTOR INDEX` and `CREATE FULLTEXT INDEX` parse, route, and execute (previously they failed at the parser).
- kNN over the wire: `CALL db.index.vector.queryNodes(name, vec, k)` returns correctly ranked nearest neighbors.
- Numeric Cypher list literals written as embeddings are coerced to indexable vectors at harvest time, so vectors you author in Cypher actually get indexed.
This makes the bring-your-own-vectors hybrid retrieval story true end-to-end: store your embeddings, then fuse vector + graph-walk + BM25 with RRF in one round-trip. (Bring-your-own-model — managed auto-embeddings — is next; see What's next.)
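For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal Python sketch of the general technique: each result list contributes `1 / (k + rank)` per item, and items are re-ranked by the summed score. The constant `k = 60` is the value commonly used in the literature and is an assumption here, as are the function name and sample ids; the notes do not say which constant or tie-breaking rule OpenGraphDB uses.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several best-first ranked lists of ids into one ranked list."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))


# Illustrative result lists from the three retrievers (made-up ids).
vector_hits = ["a", "b", "c"]
fulltext_hits = ["b", "d", "a"]
graph_hits = ["b", "a", "e"]

fused = rrf_fuse([vector_hits, fulltext_hits, graph_hits])
```

Because RRF uses only ranks, it needs no score normalization across vector distance, BM25, and graph-walk results, which is why it suits fusing heterogeneous retrievers.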
🤝 An MCP server that stock AI clients connect to cleanly
The MCP front door is now spec-conformant for the path real clients use (initialize → notifications/initialized → tools/list → tools/call):
- `protocolVersion` is negotiated, not hardcoded: the server echoes a supported version the client asks for.
- Capabilities are objects, notifications correctly get no reply, and `tools/call` results are wrapped in the proper MCP content envelope with `structuredContent`.
- Clean typed cells: query results come back as native JSON (`30`, `"Alice"`) instead of double-encoded, type-prefixed strings (`"i64:30"`, `"string:Alice"`).
- stdout is pure JSON-RPC: a stray status line that used to corrupt the stream for line-by-line clients now goes to stderr.
Point Claude Desktop or Cursor at a single-file graph and it just works.
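The client side of that message sequence can be sketched in Python as JSON-RPC 2.0 messages, one per line. The protocol version string, the client name, and the `search_nodes` argument shape are assumptions for illustration; the server may negotiate a different version and the tool's real input schema is whatever `tools/list` reports.

```python
import json


def request(msg_id, method, params=None):
    """Build a JSON-RPC 2.0 request (expects a reply with the same id)."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method):
    """Build a JSON-RPC 2.0 notification: no id, so no reply is expected."""
    return {"jsonrpc": "2.0", "method": method}


handshake = [
    request(1, "initialize", {
        "protocolVersion": "2025-06-18",  # assumed; the server may answer differently
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    }),
    notification("notifications/initialized"),
    request(2, "tools/list"),
    request(3, "tools/call", {
        "name": "search_nodes",
        "arguments": {"label": "Person"},  # hypothetical argument shape
    }),
]

# Newline-delimited JSON, as a line-by-line stdio client would write it.
wire = "\n".join(json.dumps(m) for m in handshake)
```

A line-by-line client reads one JSON object per line from the server's stdout, which is why a stray non-JSON status line on stdout would break it.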
📡 Opt-in realtime change feed + a live-polling playground
- `GET /changes` exposes a monotonic mutation counter, enabled with `--enable-changes` (or `OGDB_ENABLE_CHANGES=1`). Reads don't bump it; writes do. Disabled by default: when off, the route returns a clear `404` explaining how to turn it on.
- The visual playground auto-refreshes when the counter advances: write a node from a separate terminal and watch the canvas update within ~1 second, no manual re-run. The badge advances "Live → Polling" when subscribed, and silently falls back to plain "Live" against a server that hasn't enabled the feed.
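A polling client for a monotonic counter can be sketched as follows. This is a generic illustration, not the playground's code: `fetch` is a hypothetical callable that returns the current counter, or `None` when the feed is disabled (for example on a `404`), and the response format of `GET /changes` is not specified in these notes.

```python
import time
from typing import Callable, Iterator, Optional


def watch_counter(
    fetch: Callable[[], Optional[int]],
    interval: float = 1.0,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[int]:
    """Yield the counter each time it advances past the last seen value."""
    last = None
    for _ in range(max_polls):
        current = fetch()
        if current is None:
            return  # feed disabled: stop polling and fall back quietly
        if last is not None and current > last:
            yield current  # a write happened since the previous poll
        last = current
        sleep(interval)
```

Because reads do not bump the counter, an unchanged value means nothing needs re-running; a consumer would refresh its view only when the generator yields.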
✅ Seven engine-correctness fixes + two crash fixes
These are the ones that matter most for anything built on top, because they were silently wrong:
- `labels()`, `id()`, `type()`, `toString()` returned NULL instead of real values, which quietly broke `search_nodes` and any query relying on them. Now implemented and verified (`labels(n)` → `[Person]`, not NULL).
- `ORDER BY <projection alias>` (e.g. `RETURN n.name AS myname ORDER BY myname`) is resolved correctly instead of erroring or falling back to insertion order.
- Parser-routing fixes: a trailing semicolon and `UNION` queries no longer silently fall back to the legacy parser.
- Importer phantom nodes: an edge referencing a non-existent node is now rejected instead of fabricating endpoints, and gap-filled node counts are reported instead of happening silently.
- Two crash fixes: a client that disconnects mid-write no longer takes the whole server down, and the MCP stdout-pollution issue above is closed.
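The phantom-node fix amounts to checking edge endpoints against the set of known nodes before accepting an edge. A minimal Python sketch of that idea, with hypothetical names and a simplified `(source, target)` edge shape that is not OpenGraphDB's importer format:

```python
def check_edges(node_ids, edges):
    """Split edges into accepted and rejected by whether both endpoints exist."""
    known = set(node_ids)
    accepted, rejected = [], []
    for src, dst in edges:
        if src in known and dst in known:
            accepted.append((src, dst))
        else:
            # Reject rather than inventing the missing endpoint.
            rejected.append((src, dst))
    return accepted, rejected


accepted, rejected = check_edges([1, 2, 3], [(1, 2), (2, 9), (3, 1)])
```

Returning the rejected edges, rather than dropping them, lets an importer report exactly what it refused.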
🛡️ Secure-by-default networking + a TLS deployment recipe
- Bolt and gRPC now bind `127.0.0.1` by default (previously `0.0.0.0`). Exposing all interfaces requires an explicit `--bind 0.0.0.0:<port>` and prints a loud warning.
- A reverse-proxy TLS recipe ships in `SECURITY.md` (nginx/Caddy, including SSE buffering and a Bolt L4 note). The supported secure posture today is bind-loopback + terminate TLS at the proxy. (Native in-process TLS is deliberately deferred; see below.)
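The loopback-by-default posture can be illustrated with a small Python sketch that classifies a bind address and produces a warning for anything reachable from other hosts. The function name and warning text are hypothetical, and port `7687` in the usage below is only an example value; this is not the server's actual startup code.

```python
import ipaddress
from typing import Optional


def bind_warning(bind: str) -> Optional[str]:
    """Return a warning for non-loopback bind addresses, else None."""
    host, _, port = bind.rpartition(":")
    ip = ipaddress.ip_address(host.strip("[]"))  # accepts "[::1]" style too
    if ip.is_loopback:
        return None  # only reachable from this machine
    if ip.is_unspecified:
        return f"WARNING: port {port} is exposed on ALL interfaces without TLS"
    return f"WARNING: port {port} is exposed on {ip} without TLS"
```

For example, `bind_warning("127.0.0.1:7687")` returns `None`, while `bind_warning("0.0.0.0:7687")` returns a warning string.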
📊 Honest, reproducible benchmarks
Measured on a fixed box (i9-10920X, powersave governor, N=5, lower-median; CPU-bound and reproduces closely):
| Operation | p50 | Notes |
|---|---|---|
| Point read (`neighbors()`, 10k graph) | 5.8 µs | ~166k qps |
| 2-hop traversal | 22.9 µs | ~48k qps |
| Hybrid retrieval (RRF) | 204 µs | published number is conservative; verified runs ~2× faster |
| Footprint (10k graph) | — | ~28 MB RSS, ~39 MB on disk, sub-second load |
| Visual playground | — | 5,000 nodes @ 58–61 fps (17k edges @ 53–59 fps) |
We publish our methodology and the harness to reproduce these — including the numbers that aren't flattering. We do not publish head-to-head "X× faster than $competitor" bars; those comparisons are directional, not measured.
Honest limitations / not yet
We'd rather you find these here than discover them in production.
- No native in-process TLS. HTTP, Bolt, and MCP are cleartext on the wire. The supported secure deployment is bind-loopback + a reverse proxy terminating TLS (recipe in `SECURITY.md`). Native TLS is on the roadmap and was deliberately not half-built: half a TLS stack is worse than none.
- Durability is not 1.0-grade. In particular, edge type and properties are not yet WAL-logged and can be lost if the sidecar is corrupted after a crash. Single-process embedded durability, not a replicated cluster.
- Concurrency is single-process. This is an embedded database. There is no multi-tenant isolation, per-object ACL, or row/node-level security — process-per-tenant is the supported isolation model.
- RBAC is coarse. Database-wide read / write / admin. There is no per-label or per-row enforcement yet, and per-message Bolt/gRPC gating beyond the existing token check is not part of this release.
- Bring your own vectors, not your own model (yet). There is no built-in embedder. You supply the vectors; OGDB indexes and retrieves them. Managed auto-embeddings are the next headline.
- Not a wire-compatible Neo4j driver drop-in. The Bolt server negotiates v1 only, so modern Neo4j 5.x drivers won't connect. The honest claim is "openCypher dialect familiar to Neo4j users, no JVM, no AGPL, single file" — a migration target, not a drop-in driver replacement.
- `$param` binding is not implemented. To avoid a silent correctness bug, `$param` now hard-errors loudly instead of mis-resolving to a literal string. Real parameter binding is roadmapped.
- Cypher dialect limits. Some reserved-word labels (`:Order`, `:CONTAINS`) require escaping; the limitation is documented in the quickstart and migration guide.
- Bitemporal / time-travel is not a claimed feature in this release. `AT TIME` is not yet a verified, shipping capability; it is deliberately left off the highlights rather than advertised as working.
- Visualization ceiling. The 2D canvas is verified at 5,000 nodes @ ~58 fps and degrades beyond that. We claim the number we measured, not more.
- Language bindings (`ogdb-node`, `ogdb-python`) are preview / build-from-source. The first-class AI integration surface is the MCP server, whi...
v0.5.5
v0.5.4
Added
- bolt: v3 + v4.0 + v4.4 protocol support (handshake + HELLO + BEGIN/COMMIT/ROLLBACK + PULL{n,qid}) — c9fa7bd. Closes the major 'Neo4j-compatible' gap. Modern neo4j drivers (Python>=5, JS>=5) now negotiate v4.4 successfully. v1 unchanged.
- testkit: scaffold cross-binding parity harness (YAML corpus + Python driver + nightly non-required CI) — ece5d93. Initial corpus covers 10 entries across the 4 cypher engine fixes + CRUD + traversal + aggregation. Phase 1 of 8-step rollout.
- ogdb-eval: factory + dispatch unit tests for real-LLM adapters (OpenAI + local + mod-internals) — 9c2ade9. Closes 0% coverage on OpenAI/local backends.
Changed
- refactor(ogdb-core): Phase C lib.rs split slice 2 — extracted errors.rs + metrics.rs + concurrency.rs (90116b4). Slice 3 — extracted audit/replication/compaction/batch.rs (12be861). Pure file moves, zero API change. lib.rs trimmed by ~700 lines across both slices.
- docs: SPEC.md + DESIGN.md refreshed to match current code reality — async runtime (std::net not tokio), compression (zstd only), vector backend (instant-distance), perf numbers marked measured-vs-aspirational, removed fictional PageType enum (ed268cd).
Fixed
- ci: coverage uncovered-lines ratchet 5000→6000 to accommodate the 11 silent-null function branches added in v0.5.3 (33d563f).
- ci: CONTRIBUTING.md coverage claim aligned with scripts/coverage.sh (1d50c58).
- ci: 5 CI cascade fixes — testkit toolchain pin, cargo fmt sweeps on bolt/eval/replication, doc-intra-link fixes for [QueryError] + [SharedDatabase] (e8740bb, 14f9ea2, 5ef19c5, 4c7f2e5, 451bcc9).
- test: scoped #[allow(clippy::approx_constant)] on vector_search_direct fixture literals — they're rounded diagonals, not constants (c791673).
Full Changelog: v0.5.3...v0.5.4