epic: multi-backend storage — native graph DBs (neo4j / falkordb / surrealdb / GrafeoDB) #554

@ohdearquant

Goal

Let khive run on a native graph database (neo4j, falkordb, surrealdb, GrafeoDB, …) in addition to the embedded SQLite backend, per the ADR-009 "one crate per backend" pattern (khive-db-neo4j was already named there).

What's already backend-agnostic (good news — ~60% done by design)

  • The GqlQuery AST is backend-neutral (ADR-008). Both GQL and SPARQL parse into it.
  • The 8 storage traits (ADR-005: SqlAccess, EntityStore, NoteStore, GraphStore, VectorStore, TextSearch, EventStore, SnapshotStore) are the abstraction boundary. CRUD + neighbors/traverse/get/create/link route through traits, so they work on any backend that implements them.
  • RRF fusion (khive-score) is deterministic i64 math — backend-neutral.
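To illustrate why integer RRF is backend-neutral, here is a minimal sketch of reciprocal rank fusion in fixed-point i64. It is not the khive-score implementation: `SCALE`, `K`, and `rrf_fuse` are hypothetical names, and the constants are assumptions chosen for the example. The point is only that the same ranked lists produce the same fused order regardless of which backend supplied them.

```rust
use std::collections::BTreeMap;

/// Hypothetical fixed-point scale: scores are stored as integer millionths.
const SCALE: i64 = 1_000_000;
/// Hypothetical RRF smoothing constant (60 is a commonly used value).
const K: i64 = 60;

/// Fuse several ranked id lists into one list of (id, score) pairs.
/// Each list contributes SCALE / (K + rank) per id, with rank starting at 1.
/// Integer division keeps the arithmetic free of floating-point variation.
pub fn rrf_fuse(rankings: &[Vec<&str>]) -> Vec<(String, i64)> {
    let mut scores: BTreeMap<String, i64> = BTreeMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = idx as i64 + 1;
            *scores.entry((*id).to_string()).or_insert(0) += SCALE / (K + rank);
        }
    }
    let mut fused: Vec<(String, i64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the order is fully deterministic.
    fused.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Calling `rrf_fuse(&[vec!["a", "b"], vec!["b", "c"]])` ranks `b` first because it appears in both lists; the inputs could equally come from SQLite FTS, a neo4j vector index, or any other `TextSearch`/`VectorStore` implementation.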

The actual coupling (code-verified)

The query verb (GQL/SPARQL pattern matching) is the only SQLite-bound path. Trace (khive-runtime/src/operations.rs:1685-1728):

query str → parse_auto → GqlQuery AST → khive_query::compile(ast) → CompiledQuery{sql,params}
          → self.sql().reader().query_all(SqlStatement) → SqlRow

Two hardcoded couplings:

  1. Singular SQLite compiler. khive-query/src/compilers/mod.rs is literally pub mod sql;. compile() emits SQLite SQL (JOIN-chain for fixed-length, recursive CTE for variable-length). No dialect selection. (ADR-008 already lists "Cypher output / SQL dialects" as future scope.)
  2. Execution hardwired to SqlAccess, returns SqlRow (operations.rs:1722). The query verb bypasses GraphStore for pattern matching and shoves raw SQL through SqlAccess — a native graph DB speaks neither SQL nor SqlRow.

Required changes (tracked as sub-issues)

  • A — dialect-keyed query compiler (khive-query/compilers/): trait QueryCompiler with SqliteCompiler, CypherCompiler (neo4j + falkordb share Cypher), SurrealCompiler. AST stays; only codegen forks.
  • B — abstract query execution off SqlAccess (khive-storage + khive-runtime): a GraphQuery/QueryExecutor trait + a backend-neutral row type (not SqlRow). SQLite impl = compile→SQL→SqlAccess; neo4j impl = compile→Cypher→bolt→records. Runtime hands the backend the AST + namespace scope and stops knowing SQL exists.
  • C — dialect-aware namespace scoping: today injected as SQL WHERE namespace=? (opts.scopes). neo4j scopes via label/property or db-per-namespace; surreal via NS/DB. Scope injection must move into the per-dialect compiler. (See also khive-runtime/src/portability.rs:11: GraphStore::query_edges already has no namespace column; edges are scoped via their entity endpoints, an asymmetry each backend handles differently.)
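Sub-issues A and C could look roughly like the sketch below: one `QueryCompiler` trait, one implementation per dialect, and namespace scoping injected inside each compiler rather than by the runtime. Everything here is illustrative. The `GqlQuery` struct is a simplified stand-in for the real AST, the `Compiled` and `Value` types are hypothetical, and the table, column, and property names (`entities`, `edges`, `kind`, `namespace`) are assumptions rather than the actual khive schema.

```rust
/// Simplified stand-in for the backend-neutral AST (the real GqlQuery is richer).
#[derive(Debug, Clone)]
pub struct GqlQuery {
    pub src_kind: String,
    pub edge_kind: String,
    pub dst_kind: String,
}

/// Hypothetical backend-neutral parameter value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Text(String),
    Int(i64),
}

/// Hypothetical compiler output: query text plus named parameters.
/// For SQLite the names are informational and the vector order is the bind order.
#[derive(Debug, Clone, PartialEq)]
pub struct Compiled {
    pub text: String,
    pub params: Vec<(String, Value)>,
}

/// One implementation per dialect; the AST stays shared, only codegen forks.
pub trait QueryCompiler {
    fn dialect(&self) -> &'static str;
    fn compile(&self, ast: &GqlQuery, namespace: &str) -> Compiled;
}

fn text(name: &str, value: &str) -> (String, Value) {
    (name.to_string(), Value::Text(value.to_string()))
}

pub struct SqliteCompiler;

impl QueryCompiler for SqliteCompiler {
    fn dialect(&self) -> &'static str {
        "sqlite"
    }

    fn compile(&self, ast: &GqlQuery, namespace: &str) -> Compiled {
        // Assumed schema. Edges carry no namespace column, so the scope is
        // applied to both entity endpoints.
        let text_sql = "SELECT a.id, b.id FROM entities a \
                        JOIN edges e ON e.src = a.id \
                        JOIN entities b ON b.id = e.dst \
                        WHERE a.kind = ? AND e.kind = ? AND b.kind = ? \
                        AND a.namespace = ? AND b.namespace = ?";
        Compiled {
            text: text_sql.to_string(),
            params: vec![
                text("src", &ast.src_kind),
                text("edge", &ast.edge_kind),
                text("dst", &ast.dst_kind),
                text("ns_src", namespace),
                text("ns_dst", namespace),
            ],
        }
    }
}

pub struct CypherCompiler;

impl QueryCompiler for CypherCompiler {
    fn dialect(&self) -> &'static str {
        "cypher"
    }

    fn compile(&self, ast: &GqlQuery, namespace: &str) -> Compiled {
        // Property-based scoping is one of the options named above; a
        // db-per-namespace design would drop the namespace property entirely.
        let text_cypher = "MATCH (a {kind: $src, namespace: $ns})-[e]->\
                           (b {kind: $dst, namespace: $ns}) \
                           WHERE type(e) = $edge RETURN a.id, b.id";
        Compiled {
            text: text_cypher.to_string(),
            params: vec![
                text("src", &ast.src_kind),
                text("edge", &ast.edge_kind),
                text("dst", &ast.dst_kind),
                text("ns", namespace),
            ],
        }
    }
}
```

Because the namespace is an argument to `compile`, the runtime never needs to know whether scoping turns into a WHERE clause, a node property, or a database selection.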

Per-backend, then

Each DB = a new khive-db-<x> crate implementing the 8 traits + its QueryCompiler/QueryExecutor. Vector/FTS map onto VectorStore/TextSearch (neo4j vector indexes, surreal native, etc.). Migrations are inherently per-backend.

Candidate backends

neo4j (Cypher/bolt), falkordb (Cypher/Redis), surrealdb (SurrealQL), GrafeoDB.

Open design question

Push-down vs. translate: should the runtime hand the AST to the backend (backend owns its compiler — cleaner, option B) or keep a central multi-dialect compiler and just swap the executor? Leaning push-down: the runtime should not know SQL exists. Needs an ADR (amends/extends ADR-008 + ADR-009).
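As a rough illustration of the push-down option, the sketch below has the runtime hold only a `QueryExecutor` trait object and pass it the AST plus namespace; the backend compiles and executes internally and returns backend-neutral rows. All names are hypothetical (`Row`, `Value`, `Runtime`, `MockCypherBackend`), the `GqlQuery` struct is a toy stand-in for the real AST, and the mock backend only echoes what it would have sent instead of talking to a database.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the backend-neutral AST.
#[derive(Debug, Clone)]
pub struct GqlQuery {
    pub pattern: String,
}

/// Hypothetical backend-neutral cell value, replacing SqlRow columns.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Text(String),
    Int(i64),
}

/// Hypothetical backend-neutral row: column name to value.
pub type Row = BTreeMap<String, Value>;

/// The only query surface the runtime sees under push-down.
pub trait QueryExecutor {
    fn execute(&self, ast: &GqlQuery, namespace: &str) -> Result<Vec<Row>, String>;
}

/// Mock backend that "compiles" to Cypher internally and reports what it built.
/// A real backend would send the query over bolt and map records into rows.
pub struct MockCypherBackend;

impl QueryExecutor for MockCypherBackend {
    fn execute(&self, ast: &GqlQuery, namespace: &str) -> Result<Vec<Row>, String> {
        let cypher = format!("MATCH {} RETURN *", ast.pattern);
        let mut row = Row::new();
        row.insert("dialect".to_string(), Value::Text("cypher".to_string()));
        row.insert("query".to_string(), Value::Text(cypher));
        row.insert("namespace".to_string(), Value::Text(namespace.to_string()));
        Ok(vec![row])
    }
}

/// The runtime owns a trait object and contains no SQL or Cypher at all.
pub struct Runtime {
    backend: Box<dyn QueryExecutor>,
}

impl Runtime {
    pub fn new(backend: Box<dyn QueryExecutor>) -> Self {
        Self { backend }
    }

    pub fn query(&self, ast: &GqlQuery, namespace: &str) -> Result<Vec<Row>, String> {
        self.backend.execute(ast, namespace)
    }
}
```

Under the alternative (central multi-dialect compiler, swapped executor), `Runtime::query` would instead select a compiler by dialect and hand the backend a compiled string, which keeps dialect knowledge in the runtime's dependency graph.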
