selene-db v1.0.0

Released by @jscott3201 on 16 May 18:50 (commit 1a4979d);
915 commits to development since this release.

[1.0.0] — 2026-05-16

First stable release. selene-db is now usable as a Rust dependency for
embedding a property graph engine that targets ISO/IEC 39075:2024 (GQL)
conformance. The public API surface across selene-core,
selene-graph, selene-persist, selene-gql, and selene-pack is
considered stable: subsequent 1.x releases will make only
backwards-compatible additions and reserve breaking changes for major
version bumps.

Highlights

  • Strict ISO/IEC 39075:2024 GQL parser, semantic analyzer, planner,
    optimizer, and executor. The GQL Flagger (Clause 24.6) rejects
    non-standard or unclaimed constructs at parse time. No Cypher, no
    SQL, no SPARQL grammar leaks into the engine.
  • In-memory property graph with copy-on-write isolation built on
    ArcSwap + parking_lot::RwLock + imbl persistent collections +
    RoaringBitmap label indexes + typed secondary indexes. Single graph
    write-lock with lock-free reads; strict-serializable transaction
    isolation.
  • Write-ahead log (SLDB magic) and rkyv-archived snapshots
    (SLSN magic) with two-step recovery; the persistence crate is
    graph-blind (operates on &[Change]) so graph types can evolve
    independently.
  • Procedure-pack registry with JSON Schema 2020-12 manifest
    validation, typestate-sealed activation, and a single mutation funnel
    shared between graph writes and pack lifecycle audit (atomic via the
    WAL).
  • Graph algorithm library with 15 algorithms across four families
    (structural, pathfinding, centrality, community), exposed through 19
    algo.* procedures via selene-algorithms-pack. Rayon-parallel
    implementations for APSP, betweenness, PageRank, and triangle count.
  • Vector index extension with HNSW and IVF providers, SQ8/PQ/OPQ
    quantization, and 9 vector.* procedures via selene-vector-pack.
  • Snapshot-protected runtime surfaces: planner, executor,
    procedure-pack, and algorithm outputs are pinned by golden snapshots
    for drift detection (D21 snapshot harness pattern).
  • Engineering posture: #![forbid(unsafe_code)] workspace-wide,
    missing_docs = "deny", 700 LOC per-file cap, rustls-only TLS in
    transitive dependencies, no hand-rolled crypto / TLS / async / serde
    primitives.
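
To illustrate the copy-on-write isolation described above, here is a minimal sketch using only the standard library. It is not the selene-graph implementation: SharedGraphSketch and GraphVersion are hypothetical names, a std RwLock around an Arc stands in for the ArcSwap slot, and a plain BTreeMap clone stands in for the structural sharing that imbl collections provide.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical immutable graph version: node id -> label.
#[derive(Clone, Default)]
struct GraphVersion {
    nodes: BTreeMap<u64, String>,
}

/// Hypothetical shared handle. `current` stands in for an ArcSwap slot;
/// `write_lock` plays the role of the single graph write-lock.
struct SharedGraphSketch {
    current: RwLock<Arc<GraphVersion>>,
    write_lock: Mutex<()>,
}

impl SharedGraphSketch {
    fn new() -> Self {
        Self {
            current: RwLock::new(Arc::new(GraphVersion::default())),
            write_lock: Mutex::new(()),
        }
    }

    /// Readers grab the current version and keep using it, even if a
    /// writer publishes a newer one in the meantime.
    fn snapshot(&self) -> Arc<GraphVersion> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Writers copy the current version, mutate the copy, then publish it.
    fn write<F: FnOnce(&mut GraphVersion)>(&self, mutate: F) {
        let _guard = self.write_lock.lock().unwrap();
        let mut next = (*self.snapshot()).clone();
        mutate(&mut next);
        *self.current.write().unwrap() = Arc::new(next);
    }
}

fn main() {
    let graph = SharedGraphSketch::new();
    let before = graph.snapshot();
    graph.write(|g| {
        g.nodes.insert(1, "Person".to_string());
    });
    let after = graph.snapshot();
    println!(
        "before: {} nodes, after: {} nodes",
        before.nodes.len(),
        after.nodes.len()
    );
}
```

A reader holding `before` never observes the write; it sees the new node only after taking a fresh snapshot. That is the isolation property the sketch is meant to show.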

Crates

  • selene-core — foundation types: Value, IStr interner,
    PropertyMap, LabelSet, schema types, Codec, Origin,
    Changeset, feature_register (ISO feature claims).
  • selene-graph — in-memory property graph: storage primitives,
    Mutator write funnel, label/typed/composite indexes,
    IndexProvider extension trait, GraphTypeDef runtime binding.
  • selene-persist — WAL format (SLDB), snapshot format with
    TLV-tagged sections (SLSN), recovery pipeline. Graph-blind.
  • selene-gql — pest GQL grammar, AST, semantic analyzer, planner,
    rule-based optimizer (13 rules), row-at-a-time executor,
    ProcedureRegistry trait, GQL Flagger.
  • selene-pack — procedure-pack registry, manifest validator,
    typestate activation state machine, atomic mutation-funnel audit,
    blake3 content hashing, platform built-ins (selene.health,
    selene.create_index, selene.drop_index, selene.pack.history,
    selene.feature_status).
  • selene-algorithms — GraphProjection + ProjectionCatalog
    foundation, four algorithm families. Independent of selene-gql.
  • selene-algorithms-pack — procedure-pack adapters that expose
    selene-algorithms through GQL CALL.
  • selene-vector — opt-in HNSW and IVF vector indexes with search,
    mutation replay, snapshots, quantization, IndexProvider
    registration.
  • selene-vector-pack — procedure-pack adapters for vector search,
    mutation, bulk mutation, IVF search, and IVF stats through CALL.
  • selene-testing — shared fixtures, synthetic graph generators, and
    pure-mirror snapshot-harness DSLs. Dev-only.
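
The typestate activation mentioned for selene-pack can be pictured with a small sketch. Everything here is hypothetical (the Pack type, the Loaded / Validated / Active states, and the "demo" pack); it only shows the general technique of encoding lifecycle stages in the type system so that calling a procedure on a pack that was never activated fails to compile. The sealing of the state types is not shown.

```rust
use std::marker::PhantomData;

// Hypothetical lifecycle states, used only as type-level markers.
struct Loaded;
struct Validated;
struct Active;

/// A pack whose lifecycle stage is tracked in the type parameter.
struct Pack<State> {
    name: String,
    _state: PhantomData<State>,
}

impl Pack<Loaded> {
    fn load(name: &str) -> Self {
        Pack { name: name.to_string(), _state: PhantomData }
    }

    /// Validation consumes the loaded pack; a bad manifest yields an error.
    fn validate(self) -> Result<Pack<Validated>, String> {
        if self.name.is_empty() {
            return Err("manifest has no name".to_string());
        }
        Ok(Pack { name: self.name, _state: PhantomData })
    }
}

impl Pack<Validated> {
    /// Only a validated pack can be activated.
    fn activate(self) -> Pack<Active> {
        Pack { name: self.name, _state: PhantomData }
    }
}

impl Pack<Active> {
    /// Only an active pack exposes procedures.
    fn call(&self, procedure: &str) -> String {
        format!("{}.{}", self.name, procedure)
    }
}

fn main() {
    let active = Pack::load("demo")
        .validate()
        .expect("valid manifest")
        .activate();
    println!("{}", active.call("echo"));
    // Pack::load("demo").call("echo") would be rejected by the compiler,
    // because `call` exists only on Pack<Active>.
}
```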

ISO/IEC 39075:2024 conformance

selene-db targets minimum conformance plus a curated subset of
optional features:

  • Mandatory data types: STRING, BOOLEAN, INT, FLOAT.
  • Optional types behind ISO feature gates: date/time, decimal, list,
    record, path, references.
  • Both GG01 (open graph) and GG02 (closed graph) are supported;
    per-graph choice.
  • Default transaction isolation is serializable (Clause 4.6).
  • Implementation-defined hooks claimed: IW010 (external procedures
    via CALL), IV011 (dynamic property value type), ID001 /
    IW002 / ID003 (principals, authorization, and privileges as
    embedder responsibilities), IE002 / IE004 (transaction
    isolation).
  • No wire format is in scope (Clause 4.2.3 is explicit). Embedders pick
    their own transport.

The feature_register in selene-core is the authoritative list of
claimed Implication-table features. The Flagger rejects unclaimed
constructs at parse time so embedders cannot accidentally rely on
non-standard syntax.
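
As a rough illustration of how a register of claimed features can drive rejection, consider the sketch below. The FeatureRegister and Construct types and the flag function are hypothetical and are not the selene-core or selene-gql API; GG01 and GG02 are taken from the list above, while XX99 is a made-up identifier standing for any unclaimed feature.

```rust
use std::collections::BTreeSet;

/// Hypothetical feature register: the set of claimed feature IDs.
struct FeatureRegister {
    claimed: BTreeSet<&'static str>,
}

/// A parsed construct tagged with the optional feature it requires.
/// `None` means the construct is part of the mandatory core.
struct Construct {
    text: &'static str,
    requires: Option<&'static str>,
}

/// Rejects the first construct that depends on an unclaimed feature.
fn flag(register: &FeatureRegister, constructs: &[Construct]) -> Result<(), String> {
    for c in constructs {
        if let Some(feature) = c.requires {
            if !register.claimed.contains(feature) {
                return Err(format!(
                    "`{}` requires unclaimed feature {}",
                    c.text, feature
                ));
            }
        }
    }
    Ok(())
}

fn main() {
    let register = FeatureRegister {
        claimed: ["GG01", "GG02"].iter().copied().collect(),
    };
    let statement = [
        Construct { text: "open graph", requires: Some("GG01") },
        Construct { text: "made-up extension", requires: Some("XX99") },
    ];
    match flag(&register, &statement) {
        Ok(()) => println!("accepted"),
        Err(reason) => println!("rejected: {}", reason),
    }
}
```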

Snapshot and WAL format

  • WAL magic: SLDB. Append-only log of Change records with
    configurable SyncPolicy (synchronous, batched, or unsynced for
    testing).
  • Snapshot magic: SLSN. rkyv-archived TLV sections; producer-tagged
    for graph core (GRPH), vectors (VECS), quantization (QUNT,
    CQNT, IPQB), IVF (IVF1), OPQ rotation (ROTN), and PQ
    training (PQTC). Extensions register their own section tags.
  • Two-step recovery: load snapshot, replay WAL from snapshot's last
    seqno. Selene preserves byte-parity for prior snapshot section
    versions on load while writing the current version.
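
The two-step recovery above can be sketched as follows. This is an illustration under stated assumptions, not the selene-persist API: the Change, Snapshot, and WalEntry types are simplified stand-ins, the magic values are assumed to be four ASCII bytes at the start of a file, and replay is assumed to apply only WAL entries with a seqno strictly greater than the snapshot's last seqno.

```rust
/// Simplified stand-in for a change record.
enum Change {
    AddNode(u64),
    RemoveNode(u64),
}

/// Hypothetical snapshot: the state as of `last_seqno`.
struct Snapshot {
    last_seqno: u64,
    nodes: Vec<u64>,
}

/// Hypothetical WAL entry.
struct WalEntry {
    seqno: u64,
    change: Change,
}

// Assumption: each magic is a 4-byte ASCII prefix of its file.
const WAL_MAGIC: &[u8; 4] = b"SLDB";
const SNAPSHOT_MAGIC: &[u8; 4] = b"SLSN";

fn has_magic(bytes: &[u8], magic: &[u8; 4]) -> bool {
    bytes.starts_with(magic)
}

/// Step 1: start from the snapshot state.
/// Step 2: replay WAL entries newer than the snapshot.
fn recover(snapshot: Snapshot, wal: &[WalEntry]) -> (u64, Vec<u64>) {
    let base = snapshot.last_seqno;
    let mut seqno = base;
    let mut nodes = snapshot.nodes;
    for entry in wal.iter().filter(|e| e.seqno > base) {
        match &entry.change {
            Change::AddNode(id) => nodes.push(*id),
            Change::RemoveNode(id) => nodes.retain(|n| n != id),
        }
        seqno = entry.seqno;
    }
    (seqno, nodes)
}

fn main() {
    assert!(has_magic(b"SLSN\x00", SNAPSHOT_MAGIC));
    assert!(has_magic(b"SLDB\x00", WAL_MAGIC));

    let snapshot = Snapshot { last_seqno: 2, nodes: vec![1, 2] };
    let wal = vec![
        WalEntry { seqno: 2, change: Change::AddNode(2) }, // already in snapshot
        WalEntry { seqno: 3, change: Change::AddNode(3) },
        WalEntry { seqno: 4, change: Change::RemoveNode(1) },
    ];
    let (seqno, nodes) = recover(snapshot, &wal);
    println!("recovered to seqno {} with nodes {:?}", seqno, nodes);
}
```

Because the replay step only sees a slice of change records, it needs no knowledge of graph internals, which mirrors the graph-blind design of the persistence crate.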

Performance posture

Benchmarks are local-only via scripts/run-benches.sh (criterion and
iai-callgrind, sequential, with mimalloc as the global allocator).
BENCHMARKS.md is the committed, dated measurement
record. Selected v1.0.0 headlines (Apple M5, 100k-scale full profile):

Surface                                   Measurement
graph_node_fetch                          ~2.10 ns (columnar storage, lock-free read)
graph_typed_index_point                   ~4.53 ns (flat-curve Cow tri-state)
gql_analyze_corpus/m5c                    ~5.32 µs (full analyzer pipeline)
algo_betweenness 100k parallel speedup    ~2.40× over sequential
vector_ivf_search                         ~2.88 µs per query

See docs/performance.md for the full surface,
tuning knobs, and methodology.

Documentation

This release introduces a full user-facing documentation set under
docs/.

The README is now focused on evaluation and orientation; depth lives in
the documentation pages under docs/.

Stability guarantees

The following surfaces are stable starting with 1.0.0:

  • Public types and traits in selene-core (Value, IStr,
    PropertyMap, LabelSet, Change, Codec).
  • Public types and methods in selene-graph (SharedGraph,
    Mutator, IndexProvider, write-transaction commit flow).
  • Public types and methods in selene-persist (WAL format, snapshot
    format, recovery API).
  • Public types and methods in selene-gql (parse, analyze,
    plan, execute_statement, Session, ProcedureRegistry).
  • Public types and methods in selene-pack (manifest schema,
    ExternalProcedurePack, lifecycle events).
  • On-disk formats for WAL and snapshot sections (read-side compatibility
    preserved across 1.x).
  • The 13 optimizer rules and their static effects on plans.
  • The 19 algo.* and 9 vector.* procedure surfaces.

Platform support

Platform                        Status
Linux (x86_64, aarch64)         Primary deployment target
macOS (Apple Silicon, Intel)    Primary development target
Windows                         Out of scope

Known deferrals (post-1.0.0)

The following items are intentionally deferred and tracked for future
1.x releases:

  • Louvain parallelization (currently sequential).
  • Edge-index planner support (typed/composite indexes for edges
    currently return Linear selectivity).
  • Analyzer recursion-depth bound (parser is bounded at 64; analyzer
    binding is not yet contractually bounded).
  • Mutation/MATCH planner threading of BindingId (currently uses
    per-statement context).
  • OPQ rotation inner-allocation tightening.
  • Fresh extension crates beyond selene-vector and selene-algorithms.
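
For readers unfamiliar with recursion-depth bounds such as the parser limit of 64 noted above, the following toy sketch shows the general technique on nested parentheses. It is unrelated to the actual pest grammar in selene-gql; the function names are hypothetical, and treating exactly 64 levels as accepted is an assumption.

```rust
const MAX_DEPTH: usize = 64;

/// Walks balanced parentheses recursively and returns the deepest level
/// reached, failing once the nesting exceeds MAX_DEPTH.
fn parse_nested(input: &[u8], pos: &mut usize, depth: usize) -> Result<usize, String> {
    if depth > MAX_DEPTH {
        return Err(format!("nesting deeper than {}", MAX_DEPTH));
    }
    let mut deepest = depth;
    while *pos < input.len() {
        match input[*pos] {
            b'(' => {
                *pos += 1;
                deepest = deepest.max(parse_nested(input, pos, depth + 1)?);
            }
            b')' => {
                if depth == 0 {
                    return Err("unbalanced ')'".to_string());
                }
                *pos += 1;
                return Ok(deepest);
            }
            _ => *pos += 1,
        }
    }
    if depth == 0 {
        Ok(deepest)
    } else {
        Err("unclosed '('".to_string())
    }
}

fn max_nesting(input: &str) -> Result<usize, String> {
    let mut pos = 0;
    parse_nested(input.as_bytes(), &mut pos, 0)
}

fn main() {
    println!("{:?}", max_nesting("(()())"));
    let too_deep = format!("{}{}", "(".repeat(65), ")".repeat(65));
    println!("{:?}", max_nesting(&too_deep));
}
```

The bound turns a potential stack overflow on adversarial input into an ordinary error value. A contractual bound on analyzer binding, once it lands, would presumably provide the same kind of guarantee.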