nedb-engine v0.9.0 — Causal Write Provenance

@Eth-Interchained Eth-Interchained released this 15 Jun 00:08

The first embedded database with cryptographically-sealed causal chains.
Every write can now declare why it happened, what caused it, and how confident
the writer was — sealed inside the hash chain at write time, tamper-evident,
time-travelable, and queryable in both directions.


Install / Upgrade

pip install --upgrade nedb-engine

Supports Python ≥ 3.8. Native Rust-core wheels ship for Linux (x86_64 manylinux),
macOS (arm64), and Windows (x86_64). All platforms fall back to the pure-Python
reference engine automatically.


What's New in v0.9.0 — Causal Write Provenance

The feature

AI agents write data constantly. Until now, no embedded database tracked why a
write happened — which inputs triggered it, what it was inferred from, or how
certain the writer was. That meant auditing an agent's reasoning required
reconstructing causality from application logs, which are untrustworthy and
inevitably stale.

NEDB v0.9.0 makes causality a first-class storage primitive. Three optional
fields on every put() call:

Field        Type        Meaning
caused_by    List[int]   Seqs of the ops that caused this write
evidence     str         Source type: user_message · inference · tool_result · correction · external
confidence   float       Agent certainty, 0.0–1.0

These are sealed inside the BLAKE2b hash chain at write time — if you change
them after the fact, verify() fails. They are also mirrored into the document
as queryable _caused_by, _evidence, _confidence fields so they work with
any WHERE clause.
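
To make the sealing idea concrete, here is a minimal sketch of a BLAKE2b hash chain whose links cover the provenance fields. It is not the engine's implementation: the op layout, the JSON serialization, and the names seal, append, and verify are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting link for an empty chain


def seal(prev_hash: str, op: dict) -> str:
    """Hash an op together with the previous link (assumed layout)."""
    body = json.dumps(op, sort_keys=True, separators=(",", ":"))
    data = (prev_hash + body).encode("utf-8")
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def append(chain: list, op: dict) -> None:
    """Append an op, sealing it against the current chain tip."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"op": op, "hash": seal(prev, op)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited field breaks the comparison."""
    prev = GENESIS
    for entry in chain:
        if seal(prev, entry["op"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


chain = []
append(chain, {"seq": 1, "key": "turn_1", "doc": {"text": "I hate bright screens"}})
append(chain, {"seq": 2, "key": "dark_mode_pref", "doc": {"value": True},
               "caused_by": [1], "evidence": "user_message", "confidence": 0.95})

ok_before = verify(chain)

# Rewrite the confidence after the fact, without resealing the op.
chain[1]["op"]["confidence"] = 0.10
ok_after = verify(chain)
```

Because the provenance fields are part of the hashed body in this sketch, ok_before is True and ok_after is False: the stored link no longer matches the edited op.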

API

from nedb import NEDB

db = NEDB("./agent-memory")

# Raw inputs — uncaused roots
db.put("inputs", "turn_1", {"role": "user", "text": "I hate bright screens"})
db.put("inputs", "turn_2", {"role": "user", "text": "I have migraines"})
# db.seq is the seq of the most recent write, so these are turn_1 and turn_2
seq_1, seq_2 = db.seq - 1, db.seq

# Derived belief, sealed in the chain
db.put("beliefs", "dark_mode_pref",
    {"value": True, "summary": "User prefers dark mode"},
    caused_by=[seq_1, seq_2],
    evidence="user_message",
    confidence=0.95)

# Second-order inference
db.put("beliefs", "low_blue_light",
    {"value": True},
    caused_by=[db.seq],       # caused by dark_mode_pref
    evidence="inference",
    confidence=0.82)

# Provenance is queryable via normal WHERE
db.query('FROM beliefs WHERE _evidence = "user_message"')
db.query('FROM beliefs WHERE _confidence > 0.9')

NQL: TRACE operator

FROM beliefs WHERE _id = "dark_mode_pref" TRACE caused_by

Backward traversal — recursively follows caused_by seqs to their originating
documents. Answers: "Why does the agent believe this?"

FROM inputs WHERE _id = "turn_1" TRACE caused_by REVERSE

Forward traversal — uses the in-memory reverse index to find all documents that
declared this op as a cause. Answers: "What did this input cause downstream?"
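
The two traversal directions can be sketched over a plain dictionary. This only illustrates the graph walk, not NQL or the engine's index: the ops table, the function names, and the choice to follow effects transitively in the forward direction are assumptions.

```python
from collections import defaultdict

# Hypothetical op log: seq -> (document id, caused_by seqs)
ops = {
    1: ("turn_1", []),
    2: ("turn_2", []),
    3: ("dark_mode_pref", [1, 2]),
    4: ("low_blue_light", [3]),
}


def trace_back(seq, seen=None):
    """Backward: collect every ancestor seq reachable through caused_by."""
    seen = set() if seen is None else seen
    for cause in ops[seq][1]:
        if cause not in seen:
            seen.add(cause)
            trace_back(cause, seen)
    return seen


# Reverse index: cause seq -> seqs that declared it as a cause.
effects = defaultdict(list)
for seq, (_, causes) in ops.items():
    for cause in causes:
        effects[cause].append(seq)


def trace_forward(seq):
    """Forward: collect every seq that descends from this op."""
    found, stack = set(), [seq]
    while stack:
        for effect in effects.get(stack.pop(), []):
            if effect not in found:
                found.add(effect)
                stack.append(effect)
    return found


ancestors = sorted(trace_back(4))       # why low_blue_light exists
descendants = sorted(trace_forward(1))  # what turn_1 led to
```

Here ancestors is [1, 2, 3] and descendants is [3, 4]. The reverse index is built once, so the forward walk never has to scan the whole log.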

Why this matters

  • EU AI Act (general applicability: August 2026): Articles 12 and 13 set
    record-keeping and transparency obligations for high-risk AI systems.
    Causal provenance at the storage layer makes such records verifiable, not
    self-reported.
  • OWASP Agentic Top 10 (ASI06 — Memory & Context Poisoning): A tamper-evident
    causal chain means you can detect injected beliefs — they either lack provenance
    or break verify().
  • Operational auditing: "Why does the agent recommend X?" is now a database
    query, not a forensic reconstruction exercise.

Engram and Operad build causal provenance at the application layer over PostgreSQL
and Neo4j. NEDB builds it at the storage layer — sealed in the same hash chain
that already proves tamper-evidence and time-travel.

Backward compatibility

Fully backward-compatible. Existing databases and AOF files verify without
modification. Old ops without provenance omit the fields from their hash body —
mixed chains (some ops with provenance, some without) verify correctly. No
migration required.
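
One way to picture why mixed chains keep verifying: if unset provenance fields are dropped before hashing, an op written without provenance serializes to the same bytes as a pre-v0.9.0 op. The sketch below assumes a JSON hash body and a hypothetical hash_body helper; the engine's real serialization may differ.

```python
import hashlib
import json

PROVENANCE_FIELDS = ("caused_by", "evidence", "confidence")


def hash_body(op: dict) -> bytes:
    """Serialize an op for hashing, omitting provenance fields that are unset."""
    body = {k: v for k, v in op.items()
            if k not in PROVENANCE_FIELDS or v is not None}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


# An op as an older version would have written it (assumed shape).
legacy_op = {"seq": 7, "coll": "inputs", "key": "turn_1", "doc": {"text": "hi"}}

# The same op written by new code without provenance, and with it.
new_op_no_prov = dict(legacy_op, caused_by=None, evidence=None, confidence=None)
new_op_with_prov = dict(legacy_op, caused_by=[3], evidence="inference", confidence=0.8)

same_bytes = hash_body(legacy_op) == hash_body(new_op_no_prov)
differs = hash_body(legacy_op) != hash_body(new_op_with_prov)
legacy_digest = hashlib.blake2b(hash_body(legacy_op), digest_size=32).hexdigest()
```

In this sketch same_bytes and differs are both True, so old digests stay valid while ops that do carry provenance hash differently.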


What Changed Since v0.7.4

v0.8.3 — Deploy fix (encrypted new databases)

Bug: Creating a new database while NEDB_TMK (encryption) was set failed with
FileNotFoundError: ./nedb-data/<name>/key.enc.tmp because load_or_create_dek()
ran before _open() created the directory.

Fix: os.makedirs(path, exist_ok=True) immediately before the DEK call.

This was the root cause of every "Deploy failed (502)" in the studio when the
daemon was running with at-rest encryption enabled.
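
The failure mode is easy to reproduce outside the engine. The sketch below uses a hypothetical write_key_file helper as a stand-in for DEK creation; the final file name key.enc and the placeholder contents are assumptions, and only the os.makedirs call mirrors the fix described above.

```python
import os
import tempfile


def write_key_file(db_dir: str, create_dir: bool) -> str:
    """Write a placeholder key via a temp file, standing in for DEK creation."""
    if create_dir:
        # The v0.8.3 fix: make sure the directory exists first.
        os.makedirs(db_dir, exist_ok=True)
    tmp_path = os.path.join(db_dir, "key.enc.tmp")
    with open(tmp_path, "wb") as fh:
        fh.write(b"placeholder, not a real key")
    final_path = os.path.join(db_dir, "key.enc")  # assumed final name
    os.replace(tmp_path, final_path)
    return final_path


with tempfile.TemporaryDirectory() as root:
    db_dir = os.path.join(root, "nedb-data", "mydb")

    # Without the directory, opening the temp file raises FileNotFoundError.
    try:
        write_key_file(db_dir, create_dir=False)
        failed_without_dir = False
    except FileNotFoundError:
        failed_without_dir = True

    # With makedirs first, the same call succeeds.
    key_path = write_key_file(db_dir, create_dir=True)
    key_exists = os.path.exists(key_path)
```

Passing exist_ok=True keeps the call safe to repeat when the directory already exists, which is why it can sit unconditionally in front of the key step.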

v0.8.2 — Structured logging + deploy integration test

  • NEDBD_DEBUG=1 / nedbd --log-level N (0=errors only, 1=requests, 2=deploy phases, 3=verbose)
  • Full traceback always printed on unhandled exceptions (never swallowed silently)
  • tests/test_deploy.py — integration test for the full scaffold deploy path
    (the test that would have caught the v0.8.3 bug before production)

v0.8.0 — Concurrent daemon (group-commit sequencer)

The daemon's ThreadingHTTPServer handlers were not thread-safe: concurrent
writes to one database raced the hash chain, causing 500s that the studio
surfaced as 502s. The fix:

  • Single-writer per database: writers enqueue intents, one committer thread
    owns all mutation — correct chain by construction, zero write locks.
  • Group commit: the committer drains the whole queue, applies every op,
    then issues one fsync per batch. More concurrent writers → bigger batches →
    higher throughput. ~15,000 writes/s under load.
  • Lock-free MVCC reads: reads run at the last committed seq (snapshot
    isolation); they never touch the write queue or take a lock.
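
The single-writer, group-commit pattern described above can be sketched with the standard library alone. This is an illustration under assumptions, not the daemon's code: the GroupCommitter class, its method names, and the newline-delimited log format are hypothetical.

```python
import os
import queue
import tempfile
import threading

_STOP = object()  # sentinel that tells the committer to shut down


class GroupCommitter:
    """One thread owns the log file; writers only enqueue intents."""

    def __init__(self, path: str):
        self._fh = open(path, "ab")
        self._queue = queue.Queue()
        self.committed_seq = 0   # readers could use this as their snapshot point
        self.fsync_count = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, payload: bytes) -> threading.Event:
        """Enqueue a write; the returned event is set once it is durable."""
        done = threading.Event()
        self._queue.put((payload, done))
        return done

    def _run(self):
        running = True
        while running:
            batch = [self._queue.get()]           # block for the first intent
            while True:                           # then drain whatever is queued
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            writes = [item for item in batch if item is not _STOP]
            running = len(writes) == len(batch)   # stop after this batch if asked
            if not writes:
                continue
            for payload, _ in writes:
                self._fh.write(payload + b"\n")
            self._fh.flush()
            os.fsync(self._fh.fileno())           # one fsync for the whole batch
            self.fsync_count += 1
            self.committed_seq += len(writes)
            for _, done in writes:
                done.set()

    def close(self):
        self._queue.put(_STOP)
        self._thread.join()
        self._fh.close()


with tempfile.TemporaryDirectory() as root:
    log_path = os.path.join(root, "ops.aof")
    committer = GroupCommitter(log_path)

    def writer(i: int) -> None:
        committer.submit(f"op-{i}".encode("utf-8")).wait(timeout=10)

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(50)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    committer.close()

    total = committer.committed_seq
    batches = committer.fsync_count
    with open(log_path, "rb") as fh:
        lines = fh.read().splitlines()
```

All 50 writes land in the log, while the number of fsync calls is at most 50 and usually lower, since writes that arrive while a batch is being flushed are grouped into the next one. Only the committer thread touches the file, so the order of ops in the log is decided in exactly one place.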

v0.7.6 — Self-healing chains

The encryption backfill (plaintext DB opened with NEDB_TMK) appended a
checkpoint op that was never persisted — creating a permanent gap in the chain.
Every subsequent open returned verify() = False, showing the "tampered" pill
in the studio even though nothing was tampered.

  • Backfill no longer checkpoints — it drops the stale snapshot and rewrites
    the AOF cleanly.
  • _self_heal_if_needed() on open: if verify() fails but every op is
    internally consistent (only the linkage broke, content is intact), the chain
    is re-linked in place and the AOF rewritten. If content was genuinely
    tampered with, verify() stays False and a warning is emitted, so real
    attacks are never masked.
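
The distinction between a broken linkage and tampered content can be sketched as follows. Everything here is a toy model: the split into a per-op content hash and a chained link hash, and the names build, verify, and self_heal, are assumptions and do not describe how the engine decides that an op is internally consistent.

```python
import hashlib
import json

GENESIS = "0" * 64


def content_hash(op: dict) -> str:
    """Hash of the op on its own (assumed per-op integrity check)."""
    data = json.dumps(op, sort_keys=True).encode("utf-8")
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def link_hash(prev_link: str, c_hash: str) -> str:
    """Hash that ties an op to its predecessor."""
    data = (prev_link + c_hash).encode("utf-8")
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def build(ops: list) -> list:
    chain, prev = [], GENESIS
    for op in ops:
        c = content_hash(op)
        prev = link_hash(prev, c)
        chain.append({"op": op, "content": c, "link": prev})
    return chain


def verify(chain: list) -> bool:
    prev = GENESIS
    for entry in chain:
        if content_hash(entry["op"]) != entry["content"]:
            return False
        if link_hash(prev, entry["content"]) != entry["link"]:
            return False
        prev = entry["link"]
    return True


def self_heal(chain: list) -> bool:
    """Re-link only if every op's content is intact; report whether it verifies."""
    if verify(chain):
        return True
    if any(content_hash(e["op"]) != e["content"] for e in chain):
        return False  # content changed: leave the chain failing
    prev = GENESIS
    for entry in chain:
        prev = entry["link"] = link_hash(prev, entry["content"])
    return True


def make_ops() -> list:
    return [{"seq": i, "doc": {"n": i}} for i in (1, 2, 3)]


# Case 1: an op went missing, so linkage breaks but content is intact.
gapped = build(make_ops())
del gapped[1]
gap_detected = not verify(gapped)
gap_healed = self_heal(gapped) and verify(gapped)

# Case 2: a document was edited in place, so the content hash no longer matches.
tampered = build(make_ops())
tampered[1]["op"]["doc"]["n"] = 99
tamper_healed = self_heal(tampered)
```

In the sketch gap_detected and gap_healed are True while tamper_healed is False, which matches the behavior described above: a linkage gap is repaired, an edited op is not.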

v0.7.5 — MongoDB compatibility adapter

nedb.mongo.MongoCompat (MongoClient alias) — the third compatibility layer
alongside SQL and Redis adapters. Full document/collection API:

from nedb import NEDB, MongoClient

db = NEDB()
users = MongoClient(db)["users"]
users.insert_many([{"name": "Alice", "age": 31}, {"name": "Bob", "age": 24}])
list(users.find({"age": {"$gt": 25}}).sort("age", -1))
users.update_one({"name": "Alice"}, {"$inc": {"logins": 1}})
users.aggregate([{"$group": {"_id": None, "avg": {"$avg": "$age"}}}])

Supports: find findOne count distinct aggregate insertOne insertMany
updateOne updateMany deleteOne deleteMany replaceOne

Query operators: $eq $ne $gt $gte $lt $lte $in $nin $exists $regex $and $or
$nor $not $size $all $mod $elemMatch

Update operators: $set $unset $inc $mul $min $max $rename $push $addToSet
$pull $pop $setOnInsert

Aggregation: $match $group $sort $skip $limit $count $project

Also exposed over nedbd: POST /v1/databases/:name/mongo


Full Changelog (v0.7.4 → v0.9.0)

Version  Date        Summary
v0.7.4   2026-06-14  Fix maturin native wheel: stage Python source into crate so the built wheel contains the full package, not just _native.so
v0.7.5   2026-06-14  MongoDB compatibility adapter (MongoCompat/MongoClient)
v0.7.6   2026-06-14  Self-healing chains; encrypt-backfill gap fix
v0.8.0   2026-06-14  Concurrent daemon: single-writer group-commit sequencer
v0.8.1   2026-06-14  MongoDB nedbd endpoint (POST /v1/databases/:name/mongo)
v0.8.2   2026-06-14  Structured logging (--log-level); deploy integration test
v0.8.3   2026-06-14  Fix encrypted new-DB deploy (makedirs before DEK creation)
v0.9.0   2026-06-15  Causal Write Provenance: caused_by, evidence, confidence, TRACE NQL

Architecture Summary

nedb-engine
├── Hash-chained append-only OpLog     (tamper-evident, replay-protected)
│   └── Causal provenance v0.9.0      (caused_by / evidence / confidence)
├── MVCC store                         (time-travel AS OF seq)
├── Relations + adjacency index        (TRAVERSE, link/unlink)
├── Eq / Ordered / Search indexes
├── Concurrent Sequencer               (group-commit, lock-free reads)
├── AES-256-GCM at-rest encryption    (TMK / DEK double-envelope)
├── AOF durable persistence + snapshots
└── Compatibility adapters
    ├── SQL    (SELECT/INSERT/UPDATE/DELETE)
    ├── Redis  (GET/HSET/SADD/LPUSH/…)
    └── MongoDB (find/aggregate/update/…)

Built by INTERCHAINED LLC × Claude Sonnet 4.6
Apache-2.0 (engine) · GPLv3 (studio)