
feat: SQLite temporal versioning writer (V-L1-C2 / #47)#101

Merged
hyperpolymath merged 2 commits into main from v-l1-c2-sqlite-temporal-writer
May 15, 2026

Conversation

@hyperpolymath
Owner

Summary

Closes #47 (V-L1-C2).

Companion to #100: the temporal sidecar holds full row-state snapshots so the system can answer "what did this look like at time T?" and "roll this back to version N" — both without ever touching the target database.

Stacked on #100. Base rebases to main once V-L1-C1 lands.

API

  • SIDECAR_DDL / init_sidecar_schema(conn) — schema with the partial UNIQUE INDEX (entity_id, table_name) WHERE valid_to IS NULL from V-L2-H1 (temporal versions need UNIQUE partial index + valid_to CHECK, #41).
  • append_version(conn, entity_id, table_name, snapshot, op) -> Result<version> — BEGIN IMMEDIATE → read MAX(version) → close previous current row (valid_to = now) → insert new row with valid_to = NULL.
  • read_at(conn, entity_id, table_name, t) -> Result<Option<String>> — point-in-time query: returns the snapshot whose valid_from <= t and valid_to is either NULL or > t.
  • read_current(conn, entity_id, table_name) -> Result<Option<String>> — convenience: the row with valid_to IS NULL.
  • rollback_to(conn, entity_id, table_name, target_version) -> Result<new_version> — append-only rollback: re-append_version with the historical snapshot and operation = "rollback". No in-place mutation, audit trail preserved.
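The append path above can be sketched end to end. This is an illustrative stand-in in Python's stdlib sqlite3, not the Rust implementation: the table and column names are inferred from this PR's description, and the real schema lives in `tier1::temporal::SIDECAR_DDL`.

```python
import sqlite3

# Table/column names are assumptions inferred from the PR text.
DDL = """
CREATE TABLE IF NOT EXISTS verisimdb_temporal_versions (
    entity_id  TEXT    NOT NULL,
    table_name TEXT    NOT NULL,
    version    INTEGER NOT NULL,
    snapshot   TEXT    NOT NULL,
    operation  TEXT    NOT NULL,
    valid_from TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    valid_to   TEXT
);
-- "at most one current version per (entity, table)" at the storage layer
CREATE UNIQUE INDEX IF NOT EXISTS verisimdb_temporal_current
    ON verisimdb_temporal_versions (entity_id, table_name)
    WHERE valid_to IS NULL;
"""

def append_version(conn, entity_id, table_name, snapshot, op):
    """BEGIN IMMEDIATE -> read MAX(version) -> close current row -> insert new current."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        (prev,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM verisimdb_temporal_versions"
            " WHERE entity_id = ? AND table_name = ?",
            (entity_id, table_name)).fetchone()
        # Close the previous current row (valid_to = now) ...
        conn.execute(
            "UPDATE verisimdb_temporal_versions"
            " SET valid_to = strftime('%Y-%m-%dT%H:%M:%fZ', 'now')"
            " WHERE entity_id = ? AND table_name = ? AND valid_to IS NULL",
            (entity_id, table_name))
        # ... then insert the new current row with valid_to = NULL.
        conn.execute(
            "INSERT INTO verisimdb_temporal_versions"
            " (entity_id, table_name, version, snapshot, operation)"
            " VALUES (?, ?, ?, ?, ?)",
            (entity_id, table_name, prev + 1, snapshot, op))
        conn.commit()
        return prev + 1
    except BaseException:
        conn.rollback()
        raise
```

Open the connection with `isolation_level=None` so the explicit `BEGIN IMMEDIATE` is the only transaction control in play; the partial UNIQUE index then rejects any second open row for the same (entity, table).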

Tests (11 new in tier1::temporal::tests)

  • schema_is_idempotent
  • genesis_append_starts_at_version_one
  • sequential_appends_are_monotonic_and_close_previous — three inserts, only the last is current.
  • read_current_returns_latest_snapshot
  • read_current_returns_none_for_unknown_entity
  • read_at_returns_snapshot_at_or_before_time — both a past-tense and a current-tense read against a workload with timestamped gaps.
  • read_at_returns_none_before_first_version
  • rollback_appends_new_version_with_old_snapshot — rollback_to(v1) produces a v4 with the v1 snapshot and operation = "rollback".
  • rollback_unknown_version_errors
  • fifty_appends_yield_monotonic_versions — deterministic monotonic-version property: exactly 1..=50, no gaps; storage holds 50 rows with exactly one valid_to IS NULL (the partial UNIQUE INDEX enforces it).
  • distinct_entities_have_independent_versions

Acceptance

  • Library function tier1::temporal::append_version(...)
  • Point-in-time query helper read_at(...) returning Option<String> (caller chooses serialisation; JSON is typical)
  • Rollback helper (rollback_to)
  • Property test for monotonic version numbers (deterministic fifty_appends_yield_monotonic_versions; proptest randomisation belongs in a follow-up)

Out of scope (filed for follow-up)

  • Wiring into intercept::sqlite: the existing interceptor uses update_hook which fires after the row mutation, so the "before-state" isn't visible. Capturing temporal snapshots needs preupdate_hook (rusqlite has the feature flag). Follow-up will turn it on and add an install_temporal_hook variant.
  • Multithreaded proptest — same disposition as #100 (V-L1-C1 / #46).

Test plan

  • cargo test -p verisimiser --lib tier1::temporal:: (11/11)
  • cargo test --workspace (clean — 87 existing + 11 new = 98 passed)
  • CI green
  • Visual: sqlite3 sidecar.db "SELECT entity_id, version, valid_from, valid_to, operation FROM verisimdb_temporal_versions ORDER BY version" after a run produces a coherent history with exactly one open row per entity

hyperpolymath and others added 2 commits May 15, 2026 00:48
…_hook (V-L1-C1)

Closes #46.

End-to-end SQLite Tier 1: target SQLite → `sqlite3_update_hook` →
provenance sidecar (separate SQLite file) → verifiable hash chain.

### `tier1::provenance` rewrite

The module previously held only the `ProvenanceRecord` struct with a
deprecated string-based hash. This commit makes it the canonical home
for the Provenance concern's SQLite backend:

* **`SIDECAR_DDL`** — schema text for both
  `verisimdb_provenance_log` (the append-only entries table) and the
  new `verisimdb_provenance_chain_head` table (per-entity tip-of-chain
  pointer used to look up `previous_hash` in O(1) per append, without
  scanning the log).
* **`init_sidecar_schema(conn)`** — idempotent DDL applier.
* **`append_provenance(conn, entity_id, table_name, op, actor,
  before, transformation) -> Result<hash>`** — opens a
  `BEGIN IMMEDIATE` transaction, reads the chain head (or empty for
  genesis), computes the canonical domain-tagged hash via
  `abi::ProvenanceEntry::compute_hash` (#27 / V-L2-C1), inserts the
  log row, updates the chain head, commits. Returns the new hash.
* **`verify_chain(conn, entity_id)`** — walks the log in timestamp
  order, recomputing each entry's hash and checking the
  `previous_hash` links. Returns `Ok(true)` iff every link is intact.
* The legacy `ProvenanceRecord::compute_hash` is preserved as a
  shim that forwards to `abi::ProvenanceEntry::compute_hash` so any
  external callers see no behaviour change.
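The chain discipline those functions enforce can be sketched in miniature. This is a hedged illustration, not the real implementation: the domain tag, payload shape, and hash layout are assumptions — the canonical hash lives in `abi::ProvenanceEntry::compute_hash` (#27 / V-L2-C1), and the Rust version reads `previous_hash` from the `verisimdb_provenance_chain_head` table rather than from an in-memory list.

```python
import hashlib
import json

# Hypothetical domain tag for illustration; the real one is defined in abi.
DOMAIN_TAG = b"verisimdb/provenance/v1\x00"

def entry_hash(previous_hash, payload):
    h = hashlib.sha256()
    h.update(DOMAIN_TAG)                                    # domain separation
    h.update(previous_hash.encode())                        # link to predecessor
    h.update(json.dumps(payload, sort_keys=True).encode())  # canonical payload bytes
    return h.hexdigest()

def append(log, payload):
    # log[-1] plays the role of the chain-head pointer (O(1) lookup, no log scan).
    prev = log[-1]["hash"] if log else ""   # genesis chains from empty
    digest = entry_hash(prev, payload)
    log.append({"previous_hash": prev, "hash": digest, "payload": payload})
    return digest

def verify_chain(log):
    prev = ""
    for entry in log:
        if entry["previous_hash"] != prev:
            return False                    # broken link
        if entry_hash(prev, entry["payload"]) != entry["hash"]:
            return False                    # tampered entry
        prev = entry["hash"]
    return True
```

Verification recomputes every hash from the stored payloads, so both a tampered entry and a re-spliced link surface as `False`.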

### `intercept::sqlite::SqliteInterceptor`

New module wiring `sqlite3_update_hook` (via rusqlite's
`Connection::update_hook`) on a target connection. Each INSERT,
UPDATE, and DELETE on the target produces a `Decl::Extern`-style
record in the sidecar:

* Constructor: `SqliteInterceptor::new(sidecar, actor)`.
* Optional `.with_resolver(...)` to override the default
  rowid-stringifying entity-id resolver — production usage typically
  routes rowid through a `SELECT` to fetch a logical PK column.
* `.install(&target)` registers the update_hook closure; the hook
  is `FnMut + Send + 'static` (rusqlite's bound) and shares the
  sidecar via `Arc<Mutex<Connection>>`.

The hook NEVER writes back to the target — that's the V-L1-C1
isolation invariant. An integration test enforces it directly
(`target_database_is_not_modified_by_the_hook`).
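One observable half of that invariant — no `verisimdb_*` objects ever leak into the target — reduces to a catalogue query. A stand-alone sketch of the check in Python's stdlib sqlite3 (the hook side is Rust and not reproduced here; the table contents are invented for illustration):

```python
import sqlite3

# Simulated target database: application tables only.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
target.execute("INSERT INTO accounts (balance) VALUES (100)")
target.commit()

# The sidecar lives in a separate file/connection, so after any run the
# target's catalogue must contain no verisimdb_* tables or indexes.
leaked = target.execute(
    "SELECT name FROM sqlite_master WHERE name LIKE 'verisimdb_%'").fetchall()
```

An empty `leaked` list is exactly what `target_database_is_not_modified_by_the_hook` and the e2e "no leaked tables" assertion check for.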

### Cargo dependencies

* `rusqlite` gains the `hooks` feature flag (required for
  `update_hook`).
* `proptest` added to `[dev-dependencies]` for the property-style
  tests to come in phase 2.

### Tests

10 new tests:

* 6 unit tests in `tier1::provenance::tests`:
  - `schema_is_idempotent`
  - `genesis_entry_chains_from_empty`
  - `sequential_appends_chain_correctly`
  - `verify_chain_detects_tampered_hash`
  - `verify_chain_detects_broken_chain_link`
  - `distinct_entities_have_independent_chains`
* 4 unit tests in `intercept::sqlite::tests`:
  - `target_insert_produces_sidecar_provenance_entry`
  - `update_and_delete_produce_chained_entries`
  - `target_database_is_not_modified_by_the_hook` (the isolation
    invariant)
  - `custom_resolver_overrides_rowid_default`
* 2 integration tests in `tests/sqlite_intercept_e2e.rs`
  (tempfile-backed, so the real on-disk path is exercised):
  - `e2e_mixed_workload_verifies_all_chains` — 5 accounts × 5 ops
    each (insert / 3 updates / delete), every chain verifies, the
    entry count matches the workload, the target has no leaked
    `verisimdb_*` tables.
  - `e2e_chain_survives_reopen_of_sidecar` — drop the
    interceptor + reopen the sidecar file from a fresh
    Connection; chain still verifies and the chain-head table
    still points at the latest entry.

`cargo test --workspace` → 87 passed, 0 failed.

### Pre-existing test fix

`tests/integration_test.rs` referenced 5 table names from before the
`verisim_*` → `verisimdb_*` migration. Renamed in-place so the file
runs green again. Unrelated to V-L1-C1 in spirit, but the failures
blocked the suite — folded in here rather than carried as a separate
trivial PR.

### Out of scope

* **Multi-threaded property test** (the issue's
  "N threads × M updates" line item) — the integration test
  exercises the e2e path with a non-trivial mixed workload, but
  doesn't actually concurrent-spawn. A follow-up can wire proptest
  + std::thread once the sidecar's `Arc<Mutex<Connection>>` access
  pattern is verified safe under contention. Tracked separately.
* **Logical PK resolution** beyond the rowid default — the
  `EntityIdResolver` plumbing is in place but no production
  resolver ships in this PR.
* **before_snapshot capture** — the update_hook fires after the
  row mutation, so reading the "before" state requires either a
  preupdate_hook (rusqlite has a `preupdate_hook` feature) or
  caching reads. Filed for V-L1-C2 (#47), which the temporal
  versioning writer needs anyway.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes #47.

Companion to #46 (V-L1-C1, provenance writer): the temporal sidecar
holds full row-state snapshots per (entity_id, table_name, version)
so the system can answer "what did this look like at time T?" and
"roll this back to version N" without touching the target database.

### `tier1::temporal` rewrite

The module previously held only the `TemporalVersion` struct. This
commit makes it the canonical home for the Temporal concern's SQLite
backend:

* **`SIDECAR_DDL`** — schema text for `verisimdb_temporal_versions`
  including the partial UNIQUE INDEX
  `(entity_id, table_name) WHERE valid_to IS NULL` (from V-L2-H1 /
  #41) that enforces "at most one current version per (entity, table)"
  at the storage layer.
* **`init_sidecar_schema(conn)`** — idempotent DDL applier.
* **`append_version(conn, entity_id, table_name, snapshot, op)
  -> Result<version>`** —
  - `BEGIN IMMEDIATE` transaction.
  - Read `MAX(version)` for the entity/table; next is `prev + 1`.
  - Close out the previous current row by setting `valid_to = now`.
  - Insert the new row with `valid_to = NULL`.
  - Commit.
  The transaction discipline + partial UNIQUE index make the version
  sequence strictly monotonic even under concurrent writers (SQLite
  serialises through its write lock).
* **`read_at(conn, entity_id, table_name, t)
  -> Result<Option<String>>`** — point-in-time query: returns the
  snapshot whose `valid_from <= t` and whose `valid_to` is either
  `NULL` (still current) or `> t`. Picks the highest-numbered match
  for safety against any out-of-order writes.
* **`read_current(conn, entity_id, table_name)
  -> Result<Option<String>>`** — convenience helper: the row with
  `valid_to IS NULL` (or `None` if the entity is unknown / closed
  without successor).
* **`rollback_to(conn, entity_id, table_name, target_version)
  -> Result<new_version>`** — append-only rollback: fetches the
  snapshot at `target_version`, then calls `append_version` with
  `operation = "rollback"`. The rollback itself is a versioned event,
  not an in-place mutation, so the chain remains tamper-evident.
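The point-in-time predicate in `read_at` is a single query over the validity interval. A self-contained sketch with hand-written rows (Python stdlib sqlite3 standing in for the Rust helper; note that RFC3339 timestamps of uniform precision compare correctly as text, which is what makes the string comparison sound):

```python
import sqlite3

# Minimal stand-in table; the real schema is tier1::temporal::SIDECAR_DDL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE verisimdb_temporal_versions ("
    " entity_id TEXT, table_name TEXT, version INTEGER,"
    " snapshot TEXT, valid_from TEXT, valid_to TEXT)")
conn.executemany(
    "INSERT INTO verisimdb_temporal_versions VALUES (?, ?, ?, ?, ?, ?)",
    [("e1", "accounts", 1, '{"balance":10}',
      "2026-05-15T00:00:01Z", "2026-05-15T00:00:02Z"),   # superseded version
     ("e1", "accounts", 2, '{"balance":20}',
      "2026-05-15T00:00:02Z", None)])                    # current version

def read_at(conn, entity_id, table_name, t):
    # Snapshot whose validity interval covers t; highest version wins,
    # for safety against any out-of-order writes.
    row = conn.execute(
        "SELECT snapshot FROM verisimdb_temporal_versions"
        " WHERE entity_id = ? AND table_name = ?"
        " AND valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)"
        " ORDER BY version DESC LIMIT 1",
        (entity_id, table_name, t, t)).fetchone()
    return row[0] if row else None
```

A `t` inside v1's interval returns the old snapshot, a later `t` falls through to the open (`valid_to IS NULL`) row, and a `t` before the first `valid_from` matches nothing.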

### Tests (11 new in `tier1::temporal::tests`)

* `schema_is_idempotent`
* `genesis_append_starts_at_version_one`
* `sequential_appends_are_monotonic_and_close_previous` — three
  inserts, only the last is current; partial UNIQUE index enforced.
* `read_current_returns_latest_snapshot`
* `read_current_returns_none_for_unknown_entity`
* `read_at_returns_snapshot_at_or_before_time` — checks both a
  past-tense read (returns v1 when v2 hasn't happened yet) and a
  current-tense read (returns v2 after the second update). Uses
  20ms sleeps between writes so the timestamps land in distinct
  RFC3339 milliseconds.
* `read_at_returns_none_before_first_version`
* `rollback_appends_new_version_with_old_snapshot` — `v1, v2, v3`,
  then `rollback_to(v1)`; the new v4's snapshot equals v1's and its
  `operation` is `"rollback"`.
* `rollback_unknown_version_errors`
* `fifty_appends_yield_monotonic_versions` — deterministic version
  of the "monotonic version numbers" acceptance criterion. Asserts
  the version sequence is exactly `1..=50` with no gaps; the
  storage layer holds exactly 50 rows and exactly 1 with
  `valid_to IS NULL`.
* `distinct_entities_have_independent_versions` — `e1` and `e2`
  each get version `2` independently.

### Acceptance

* [x] Library function `tier1::temporal::append_version(...)`
* [x] Point-in-time query helper: `read_at(...)` returning
  `Option<String>` (snapshot is opaque-string; caller decides
  format — typical JSON)
* [x] Rollback helper (`rollback_to`)
* [x] Property test for monotonic version numbers
  (`fifty_appends_yield_monotonic_versions` — the deterministic
  formulation; proptest randomisation belongs in a follow-up
  alongside multithreaded contention)

### Out of scope

* **Wiring into `intercept::sqlite::SqliteInterceptor`** — the
  current interceptor only writes provenance entries. Adding
  temporal capture from the same hook needs `preupdate_hook` (the
  regular `update_hook` fires AFTER the mutation so the
  "before-state" snapshot isn't visible). Rusqlite has the
  `preupdate_hook` feature; tracked for a follow-up that flips on
  the feature, adds an `intercept::sqlite::install_temporal_hook`
  variant, and writes both the provenance entry and the temporal
  snapshot in the same hook callback.
* **Multithreaded property test** — same disposition as #46.
  `proptest` is wired into dev-deps but the threaded version
  belongs in a follow-up.

Stacked on #100 (V-L1-C1). Base rebases to `main` after that PR
lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hyperpolymath hyperpolymath changed the base branch from v-l1-c1-sqlite-tier1-e2e to main May 15, 2026 00:00
@hyperpolymath hyperpolymath merged commit 88e5528 into main May 15, 2026
15 of 16 checks passed
@hyperpolymath hyperpolymath deleted the v-l1-c2-sqlite-temporal-writer branch May 15, 2026 00:11


Development

Successfully merging this pull request may close these issues.

V-L1-C2: SQLite Tier 1 temporal versioning writer + point-in-time read