
feat: SQLite Tier 1 e2e — provenance writer + update_hook (V-L1-C1 / #46)#100

Merged

hyperpolymath merged 1 commit into main from v-l1-c1-sqlite-tier1-e2e on May 15, 2026

Conversation

@hyperpolymath
Owner

Summary

Closes #46 (V-L1-C1).

End-to-end SQLite Tier 1: target SQLite → sqlite3_update_hook → provenance sidecar (separate SQLite file) → verifiable hash chain. The sidecar holds an append-only hash-chained log per entity; the target is never written to.

What lands

tier1::provenance

  • SIDECAR_DDL — schema text for verisimdb_provenance_log and the new verisimdb_provenance_chain_head (per-entity tip pointer, used for O(1) previous_hash lookup).
  • init_sidecar_schema(conn) — idempotent DDL applier.
  • append_provenance(conn, entity_id, table_name, op, actor, before, transformation) -> Result<hash> — BEGIN IMMEDIATE → read head → compute canonical domain-tagged hash via abi::ProvenanceEntry::compute_hash (#27 / V-L2-C1: hash the full audit record (actor + before_snapshot + transformation) with domain separation) → insert log row → update head → commit.
  • verify_chain(conn, entity_id) — walks the log; recomputes each hash; checks each previous_hash link.
  • ProvenanceRecord::compute_hash preserved as a shim forwarding to the canonical impl.
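
The append/verify pair above can be sketched with an in-memory model in plain Rust. Everything here is a stand-in: the real sidecar is two SQLite tables, and the real hash is the domain-tagged `abi::ProvenanceEntry::compute_hash`, not std's `DefaultHasher` — this only illustrates the head-lookup → hash → append → head-update flow and the chain walk:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical in-memory stand-in for the sidecar: an append-only log plus a
// per-entity chain-head map (the O(1) previous_hash lookup the DDL provides).
struct Sidecar {
    log: Vec<(String, String, u64, u64)>, // (entity_id, payload, previous_hash, hash)
    head: HashMap<String, u64>,           // entity_id -> tip hash
}

// Illustrative stand-in for the canonical domain-tagged hash.
fn compute_hash(entity_id: &str, payload: &str, previous: u64) -> u64 {
    let mut h = DefaultHasher::new();
    "verisimdb/provenance/v1".hash(&mut h); // domain tag first
    entity_id.hash(&mut h);
    payload.hash(&mut h);
    previous.hash(&mut h);
    h.finish()
}

impl Sidecar {
    fn append_provenance(&mut self, entity_id: &str, payload: &str) -> u64 {
        // read head (0 == genesis) -> compute hash -> insert row -> update head
        let previous = *self.head.get(entity_id).unwrap_or(&0);
        let hash = compute_hash(entity_id, payload, previous);
        self.log.push((entity_id.into(), payload.into(), previous, hash));
        self.head.insert(entity_id.into(), hash);
        hash
    }

    fn verify_chain(&self, entity_id: &str) -> bool {
        // Walk this entity's entries in order, recomputing each hash and
        // checking each previous_hash link back to the prior entry.
        let mut expected_prev = 0u64;
        for (eid, payload, previous, hash) in
            self.log.iter().filter(|(e, ..)| e.as_str() == entity_id)
        {
            if *previous != expected_prev || *hash != compute_hash(eid, payload, *previous) {
                return false;
            }
            expected_prev = *hash;
        }
        true
    }
}
```

Tampering with any stored payload or breaking any link makes the recomputed hash diverge, so `verify_chain` fails from that point on — the same tamper-evidence property the unit tests assert.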

intercept::sqlite::SqliteInterceptor

  • Wraps a sidecar Connection behind Arc<Mutex<...>>.
  • .install(&target) registers sqlite3_update_hook on the target.
  • Optional .with_resolver(...) to plug in a logical-PK lookup (default stringifies the rowid).
  • Hook NEVER writes back to the target — V-L1-C1 isolation invariant enforced by a dedicated test.
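
The interceptor's ownership shape can be sketched with std types only. `Sidecar` and `Target` below are hypothetical stand-ins for rusqlite's `Connection`; what mirrors the real module is the `Arc<Mutex<...>>` sharing, the `FnMut + Send + 'static` hook, the default rowid-stringifying resolver, and the invariant that the hook writes only to the sidecar:

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the sidecar connection.
#[derive(Default)]
struct Sidecar {
    entries: Vec<String>,
}

// Hypothetical stand-in for the target connection: it stores one hook and
// fires it after each write, like sqlite3_update_hook.
struct Target {
    hook: Option<Box<dyn FnMut(&str, i64) + Send + 'static>>,
}

impl Target {
    fn write(&mut self, table: &str, rowid: i64) {
        if let Some(h) = self.hook.as_mut() {
            h(table, rowid); // fires after the mutation
        }
    }
}

struct SqliteInterceptor {
    sidecar: Arc<Mutex<Sidecar>>,
    actor: String,
}

impl SqliteInterceptor {
    fn install(&self, target: &mut Target) {
        let sidecar = Arc::clone(&self.sidecar);
        let actor = self.actor.clone();
        // The closure only appends to the sidecar, never back to the target:
        // that is the V-L1-C1 isolation invariant.
        target.hook = Some(Box::new(move |table: &str, rowid: i64| {
            let entity_id = rowid.to_string(); // default resolver: stringify rowid
            sidecar
                .lock()
                .unwrap()
                .entries
                .push(format!("{actor}:{table}:{entity_id}"));
        }));
    }
}
```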

Cargo

  • rusqlite gains hooks feature flag.
  • proptest added to [dev-dependencies] (used by phase-2 multithreaded tests).

Pre-existing test fix (folded in)

tests/integration_test.rs referenced 5 table/view names from before the verisim_* → verisimdb_* migration. Renamed in-place so the suite runs green again. Trivial, but it blocked the test run — folded in here rather than carried as a separate PR.

Tests

12 new tests (10 unit + 2 integration):

  • tier1::provenance × 6 — schema idempotence, genesis, sequential chain, tamper detection (two tests: hash and previous_hash link), entity isolation.
  • intercept::sqlite × 4 — basic insert, update+delete chain, target_database_is_not_modified_by_the_hook (the isolation invariant), custom resolver override.
  • tests/sqlite_intercept_e2e.rs × 2 — tempfile-backed end-to-end:
    • e2e_mixed_workload_verifies_all_chains (5 accounts × 5 ops, all chains verify, entry count matches, no verisimdb_* leakage on target).
    • e2e_chain_survives_reopen_of_sidecar (drop interceptor + reopen sidecar file; chain still verifies).

cargo test --workspace — 87 passed, 0 failed.

Acceptance criteria

  • Library function compiles and tests with rusqlite
  • Property test (proptest): N threads × M updates — proptest dep added; the multithreaded version is filed as a phase-2 follow-up (the e2e mixed-workload test exercises the deterministic, non-threaded path)
  • Integration test: target.db unchanged across all writes; sidecar.db has expected entries

Out of scope

  • Multi-threaded property test — proptest is wired into dev-deps; threading the sidecar through Arc<Mutex<...>> is sound, but a real proptest with concurrent spawning belongs in a follow-up so the cadence isn't held up by contention tuning.
  • before_snapshot capture — needs preupdate_hook (rusqlite has the feature); tracked for V-L1-C2 (#47, SQLite Tier 1 temporal versioning writer + point-in-time read), which the temporal writer needs anyway.
  • Logical PK resolution beyond rowid — the EntityIdResolver plumbing is there; no production resolver ships in this PR.

Test plan

  • cargo test -p verisimiser --lib provenance:: (6/6)
  • cargo test -p verisimiser --lib intercept:: (4/4)
  • cargo test --test sqlite_intercept_e2e (2/2)
  • cargo test --workspace (87/87)
  • CI green
  • Smoke: open the sidecar file produced by an e2e run in sqlite3 and confirm the schema + chain visually

…_hook (V-L1-C1)

Closes #46.

End-to-end SQLite Tier 1: target SQLite → `sqlite3_update_hook` →
provenance sidecar (separate SQLite file) → verifiable hash chain.

### `tier1::provenance` rewrite

The module previously held only the `ProvenanceRecord` struct with a
deprecated string-based hash. This commit makes it the canonical home
for the Provenance concern's SQLite backend:

* **`SIDECAR_DDL`** — schema text for both
  `verisimdb_provenance_log` (the append-only entries table) and the
  new `verisimdb_provenance_chain_head` table (per-entity tip-of-chain
  pointer used to look up `previous_hash` in O(1) per append, without
  scanning the log).
* **`init_sidecar_schema(conn)`** — idempotent DDL applier.
* **`append_provenance(conn, entity_id, table_name, op, actor,
  before, transformation) -> Result<hash>`** — opens a
  `BEGIN IMMEDIATE` transaction, reads the chain head (or empty for
  genesis), computes the canonical domain-tagged hash via
  `abi::ProvenanceEntry::compute_hash` (#27 / V-L2-C1), inserts the
  log row, updates the chain head, commits. Returns the new hash.
* **`verify_chain(conn, entity_id)`** — walks the log in timestamp
  order, recomputing each entry's hash and checking the
  `previous_hash` links. Returns `Ok(true)` iff every link is intact.
* The legacy `ProvenanceRecord::compute_hash` is preserved as a
  shim that forwards to `abi::ProvenanceEntry::compute_hash` so any
  external callers see no behaviour change.
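
For concreteness, the two tables can be pictured with a schema along these lines. The column set below is a hypothetical sketch, not the actual `SIDECAR_DDL` text — it only makes the chain-head design concrete:

```sql
-- Illustrative shape only: the real SIDECAR_DDL lives in tier1::provenance.

-- Append-only entries table.
CREATE TABLE IF NOT EXISTS verisimdb_provenance_log (
    entity_id     TEXT NOT NULL,
    table_name    TEXT NOT NULL,
    operation     TEXT NOT NULL,
    actor         TEXT NOT NULL,
    previous_hash TEXT NOT NULL,
    hash          TEXT NOT NULL,
    created_at    TEXT NOT NULL
);

-- Per-entity tip pointer: previous_hash for the next append is a single
-- primary-key lookup here instead of a scan over the log.
CREATE TABLE IF NOT EXISTS verisimdb_provenance_chain_head (
    entity_id TEXT PRIMARY KEY,
    tip_hash  TEXT NOT NULL
);
```

`CREATE TABLE IF NOT EXISTS` is also what makes `init_sidecar_schema` naturally idempotent.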

### `intercept::sqlite::SqliteInterceptor`

New module wiring `sqlite3_update_hook` (via rusqlite's
`Connection::update_hook`) on a target connection. Each INSERT,
UPDATE, and DELETE on the target produces a `Decl::Extern`-style
record in the sidecar:

* Constructor: `SqliteInterceptor::new(sidecar, actor)`.
* Optional `.with_resolver(...)` to override the default
  rowid-stringifying entity-id resolver — production usage typically
  routes rowid through a `SELECT` to fetch a logical PK column.
* `.install(&target)` registers the update_hook closure; the hook
  is `FnMut + Send + 'static` (rusqlite's bound) and shares the
  sidecar via `Arc<Mutex<Connection>>`.

The hook NEVER writes back to the target — that's the V-L1-C1
isolation invariant. An integration test enforces it directly
(`target_database_is_not_modified_by_the_hook`).

### Cargo dependencies

* `rusqlite` gains the `hooks` feature flag (required for
  `update_hook`).
* `proptest` added to `[dev-dependencies]` for the property-style
  tests to come in phase 2.
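
As a sketch, the manifest changes look roughly like this — the version numbers are illustrative assumptions, not the PR's actual pins:

```toml
[dependencies]
rusqlite = { version = "0.31", features = ["hooks"] }  # "hooks" enables update_hook

[dev-dependencies]
proptest = "1"  # for the phase-2 property tests
```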

### Tests

12 new tests (10 unit + 2 integration):

* 6 unit tests in `tier1::provenance::tests`:
  - `schema_is_idempotent`
  - `genesis_entry_chains_from_empty`
  - `sequential_appends_chain_correctly`
  - `verify_chain_detects_tampered_hash`
  - `verify_chain_detects_broken_chain_link`
  - `distinct_entities_have_independent_chains`
* 4 unit tests in `intercept::sqlite::tests`:
  - `target_insert_produces_sidecar_provenance_entry`
  - `update_and_delete_produce_chained_entries`
  - `target_database_is_not_modified_by_the_hook` (the isolation
    invariant)
  - `custom_resolver_overrides_rowid_default`
* 2 integration tests in `tests/sqlite_intercept_e2e.rs`
  (tempfile-backed, so the real on-disk path is exercised):
  - `e2e_mixed_workload_verifies_all_chains` — 5 accounts × 5 ops
    each (insert / 3 updates / delete), every chain verifies, the
    entry count matches the workload, the target has no leaked
    `verisimdb_*` tables.
  - `e2e_chain_survives_reopen_of_sidecar` — drop the
    interceptor + reopen the sidecar file from a fresh
    Connection; chain still verifies and the chain-head table
    still points at the latest entry.

`cargo test --workspace` → 87 passed, 0 failed.

### Pre-existing test fix

`tests/integration_test.rs` referenced 5 table names from before the
`verisim_*` → `verisimdb_*` migration. Renamed in-place so the file
runs green again. Unrelated to V-L1-C1 in spirit, but the failures
blocked the suite — folded in here rather than carried as a separate
trivial PR.

### Out of scope

* **Multi-threaded property test** (the issue's
  "N threads × M updates" line item) — the integration test
  exercises the e2e path with a non-trivial mixed workload, but
  doesn't actually concurrent-spawn. A follow-up can wire proptest
  + std::thread once the sidecar's `Arc<Mutex<Connection>>` access
  pattern is verified safe under contention. Tracked separately.
* **Logical PK resolution** beyond the rowid default — the
  `EntityIdResolver` plumbing is in place but no production
  resolver ships in this PR.
* **before_snapshot capture** — the update_hook fires after the
  row mutation, so reading the "before" state requires either a
  preupdate_hook (rusqlite has a `preupdate_hook` feature) or
  caching reads. Filed for V-L1-C2 (#47), which the temporal
  versioning writer needs anyway.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hyperpolymath hyperpolymath merged commit 5f9abdb into main May 15, 2026
16 of 19 checks passed
@hyperpolymath hyperpolymath deleted the v-l1-c1-sqlite-tier1-e2e branch May 15, 2026 00:10
hyperpolymath added a commit that referenced this pull request May 15, 2026
* feat(tier1,intercept): SQLite Tier 1 e2e — provenance writer + update_hook (V-L1-C1)


* feat(tier1): SQLite temporal versioning writer (V-L1-C2)

Closes #47.

Companion to #46 (V-L1-C1, provenance writer): the temporal sidecar
holds full row-state snapshots per (entity_id, table_name, version)
so the system can answer "what did this look like at time T?" and
"roll this back to version N" without touching the target database.

### `tier1::temporal` rewrite

The module previously held only the `TemporalVersion` struct. This
commit makes it the canonical home for the Temporal concern's SQLite
backend:

* **`SIDECAR_DDL`** — schema text for `verisimdb_temporal_versions`
  including the partial UNIQUE INDEX
  `(entity_id, table_name) WHERE valid_to IS NULL` (from V-L2-H1 /
  #41) that enforces "at most one current version per (entity, table)"
  at the storage layer.
* **`init_sidecar_schema(conn)`** — idempotent DDL applier.
* **`append_version(conn, entity_id, table_name, snapshot, op)
  -> Result<version>`** —
  - `BEGIN IMMEDIATE` transaction.
  - Read `MAX(version)` for the entity/table; next is `prev + 1`.
  - Close out the previous current row by setting `valid_to = now`.
  - Insert the new row with `valid_to = NULL`.
  - Commit.
  The transaction discipline + partial UNIQUE index make the version
  sequence strictly monotonic even under concurrent writers (SQLite
  serialises through its write lock).
* **`read_at(conn, entity_id, table_name, t)
  -> Result<Option<String>>`** — point-in-time query: returns the
  snapshot whose `valid_from <= t` and whose `valid_to` is either
  `NULL` (still current) or `> t`. Picks the highest-numbered match
  for safety against any out-of-order writes.
* **`read_current(conn, entity_id, table_name)
  -> Result<Option<String>>`** — convenience helper: the row with
  `valid_to IS NULL` (or `None` if the entity is unknown / closed
  without successor).
* **`rollback_to(conn, entity_id, table_name, target_version)
  -> Result<new_version>`** — append-only rollback: fetches the
  snapshot at `target_version`, then calls `append_version` with
  `operation = "rollback"`. The rollback itself is a versioned event,
  not an in-place mutation, so the chain remains tamper-evident.
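
The append/read/rollback triple above can be sketched with an in-memory model in plain Rust. This is a hypothetical stand-in for the SQLite table (integer ticks replace the RFC3339 timestamps, and `valid_to == None` plays the role of `valid_to IS NULL`); it only illustrates the close-out-then-insert discipline and the point-in-time predicate:

```rust
// Hypothetical in-memory model of verisimdb_temporal_versions.
struct Row {
    entity_id: String,
    version: u64,
    snapshot: String,
    operation: String,
    valid_from: u64,       // the real table stores RFC3339 text; a tick suffices here
    valid_to: Option<u64>, // None == current row
}

struct Temporal {
    rows: Vec<Row>,
    clock: u64,
}

impl Temporal {
    fn append_version(&mut self, entity_id: &str, snapshot: &str, op: &str) -> u64 {
        self.clock += 1;
        let now = self.clock;
        let prev = self.rows.iter()
            .filter(|r| r.entity_id == entity_id)
            .map(|r| r.version)
            .max();
        // Close out the previous current row, then insert the new one as current.
        if let Some(cur) = self.rows.iter_mut()
            .find(|r| r.entity_id == entity_id && r.valid_to.is_none())
        {
            cur.valid_to = Some(now);
        }
        let version = prev.unwrap_or(0) + 1;
        self.rows.push(Row {
            entity_id: entity_id.into(), version, snapshot: snapshot.into(),
            operation: op.into(), valid_from: now, valid_to: None,
        });
        version
    }

    // Point-in-time read: valid_from <= t and (valid_to is NULL or valid_to > t);
    // the highest-numbered match wins, guarding against out-of-order writes.
    fn read_at(&self, entity_id: &str, t: u64) -> Option<&str> {
        self.rows.iter()
            .filter(|r| r.entity_id == entity_id && r.valid_from <= t
                        && r.valid_to.map_or(true, |end| end > t))
            .max_by_key(|r| r.version)
            .map(|r| r.snapshot.as_str())
    }

    // Append-only rollback: re-append the old snapshot as a new version.
    fn rollback_to(&mut self, entity_id: &str, target_version: u64) -> Option<u64> {
        let snap = self.rows.iter()
            .find(|r| r.entity_id == entity_id && r.version == target_version)?
            .snapshot.clone();
        Some(self.append_version(entity_id, &snap, "rollback"))
    }
}
```

Because `rollback_to` goes through `append_version`, the rollback itself lands as a new versioned event rather than mutating history, matching the tamper-evidence property described above.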

### Tests (11 new in `tier1::temporal::tests`)

* `schema_is_idempotent`
* `genesis_append_starts_at_version_one`
* `sequential_appends_are_monotonic_and_close_previous` — three
  inserts, only the last is current; partial UNIQUE index enforced.
* `read_current_returns_latest_snapshot`
* `read_current_returns_none_for_unknown_entity`
* `read_at_returns_snapshot_at_or_before_time` — checks both a
  past-tense read (returns v1 when v2 hasn't happened yet) and a
  current-tense read (returns v2 after the second update). Uses
  20ms sleeps between writes so the timestamps land in distinct
  RFC3339 milliseconds.
* `read_at_returns_none_before_first_version`
* `rollback_appends_new_version_with_old_snapshot` — `v1, v2, v3`,
  then `rollback_to(v1)`; the new v4's snapshot equals v1's and its
  `operation` is `"rollback"`.
* `rollback_unknown_version_errors`
* `fifty_appends_yield_monotonic_versions` — deterministic version
  of the "monotonic version numbers" acceptance criterion. Asserts
  the version sequence is exactly `1..=50` with no gaps; the
  storage layer holds exactly 50 rows and exactly 1 with
  `valid_to IS NULL`.
* `distinct_entities_have_independent_versions` — `e1` and `e2`
  each get version `2` independently.

### Acceptance

* [x] Library function `tier1::temporal::append_version(...)`
* [x] Point-in-time query helper: `read_at(...)` returning
  `Option<String>` (snapshot is opaque-string; caller decides
  format — typical JSON)
* [x] Rollback helper (`rollback_to`)
* [x] Property test for monotonic version numbers
  (`fifty_appends_yield_monotonic_versions` — the deterministic
  formulation; proptest randomisation belongs in a follow-up
  alongside multithreaded contention)

### Out of scope

* **Wiring into `intercept::sqlite::SqliteInterceptor`** — the
  current interceptor only writes provenance entries. Adding
  temporal capture from the same hook needs `preupdate_hook` (the
  regular `update_hook` fires AFTER the mutation so the
  "before-state" snapshot isn't visible). Rusqlite has the
  `preupdate_hook` feature; tracked for a follow-up that flips on
  the feature, adds an `intercept::sqlite::install_temporal_hook`
  variant, and writes both the provenance entry and the temporal
  snapshot in the same hook callback.
* **Multithreaded property test** — same disposition as #46.
  `proptest` is wired into dev-deps but the threaded version
  belongs in a follow-up.

Stacked on #100 (V-L1-C1). Base rebases to `main` after that PR
lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>


Development

Successfully merging this pull request may close these issues.

V-L1-C1: end-to-end SQLite Tier 1 — sqlite3_update_hook + sidecar provenance writer
