fix: persist GSI queue to PostgreSQL for crash safety #128
LeeroyHannigan wants to merge 1 commit into
SELECT id FROM gsi_pending \
WHERE ready_at <= NOW() \
ORDER BY id \
LIMIT $1 \
Is LIMIT applied before or after ORDER BY?
I did a bit of reading. It looks like Postgres sorts the entire result set first and then applies the limit. So if "ready_at" is far enough in the past on a busy table, the sort could be expensive, though the index on ready_at should help mitigate that. My guess is that BATCH_SIZE is small enough to discourage the planner from believing that a full table scan would be cheaper than an index scan.
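For reference, the logical evaluation order in SQL is WHERE, then ORDER BY, then LIMIT. The planner may use an index to avoid sorting every row, but the rows returned must match that order. Below is a minimal Rust sketch that models those semantics in memory. The `PendingRow` struct, the `claim_ids` function, and the integer timestamps are illustrative assumptions, not code from this PR.

```rust
/// Hypothetical in-memory stand-in for a `gsi_pending` row. Only the two
/// columns used by the query under review are modeled.
#[derive(Debug)]
struct PendingRow {
    id: i64,
    ready_at: u64, // seconds since an arbitrary epoch (assumption)
}

/// Models the logical semantics of
/// `SELECT id FROM gsi_pending WHERE ready_at <= NOW() ORDER BY id LIMIT $1`.
fn claim_ids(rows: &[PendingRow], now: u64, limit: usize) -> Vec<i64> {
    let mut ids: Vec<i64> = rows
        .iter()
        .filter(|r| r.ready_at <= now) // WHERE ready_at <= NOW()
        .map(|r| r.id)
        .collect();
    ids.sort_unstable(); // ORDER BY id
    ids.truncate(limit); // LIMIT is applied to the already sorted rows
    ids
}
```

With rows whose ids are 7, 3, 9 and 1, where only 9 is not yet ready, a limit of 2 yields ids 1 and 3: the two lowest ready ids, not two arbitrary rows sorted afterwards.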
pk_hash(pk_text.as_ref()),
&key_info.account_id,
&key_info.table_name,
if has_async_indexes(&indexes, sys_delay) {
This could probably be called once at the beginning (right after indexes is populated).
-- Inserted atomically within the base write transaction, consumed by
-- background workers. Survives process crash/restart.

CREATE TABLE IF NOT EXISTS gsi_pending (
I don't recall where it's buried, but somewhere there is a catalog version identifier that should be updated when the metadata/system table schema is updated, so that extenddb migrate will know that there is a migration to perform. I believe it's in storage-postgres somewhere.
jcshepherd left a comment
A couple of initial questions/comments. Probably the one I'm most concerned with is the evaluation order of LIMIT and ORDER BY. If I were a gambler, I'd wager results are LIMITed before ORDER BY, which may not give you the ordering guarantees you want.
What
Replaces the in-memory `VecDeque` GSI propagation queue with a PostgreSQL-backed persistent queue (`gsi_pending` table). Pending GSI updates are now inserted within the same transaction as the base table write and processed by workers that claim ready rows using `DELETE ... RETURNING` with `FOR UPDATE SKIP LOCKED`.
Key changes:
- `gsi_pending` table in the data database (migration `002_gsi_pending.sql`)
- `enqueue_gsi_pending()` inserts inside the base write transaction, zero crash window
- Delay governed by `ready_at` timestamp, not by sleeping inside transactions
- Cache keyed by `table_id` with 30s TTL to avoid repeated catalog queries
Why
Closes #125
The previous in-memory queue lost all pending GSI updates on process crash or restart, causing permanent GSI inconsistency with no recovery path. The only workaround was re-touching every item in the base table, effectively data loss at scale.
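The `table_id` entry with a 30s TTL under Key changes reads like a small time-based cache in front of the catalog. A minimal sketch of that idea follows, assuming an `i64` table id and a generic cached value. `TtlCache` and `get_or_load` are hypothetical names, not the PR's actual types. The current time is passed in explicitly so that expiry can be exercised without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical TTL cache keyed by table id. `V` stands in for whatever
/// per-table metadata the real code caches.
struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<i64, (Instant, V)>,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached value while it is younger than `ttl`; otherwise
    /// calls `load` (for example a catalog query) and stores the result.
    fn get_or_load(&mut self, table_id: i64, now: Instant, load: impl FnOnce() -> V) -> V {
        if let Some((stored_at, value)) = self.entries.get(&table_id) {
            if now.duration_since(*stored_at) < self.ttl {
                return value.clone();
            }
        }
        let value = load();
        self.entries.insert(table_id, (now, value.clone()));
        value
    }
}
```

With `TtlCache::new(Duration::from_secs(30))`, a second lookup for the same table 29 seconds after the first is served from memory, while a lookup 31 seconds later invokes the loader again. The trade-off is that index changes can take up to one TTL to become visible to a worker.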
Testing done
- `cargo fmt --all -- --check`: clean
- `cargo clippy --all-targets -- -D warnings`: clean
- `cargo test --workspace`: 375 tests pass
- `gsi_pending` table schema created correctly via `\d gsi_pending`
- `test_gsi_async.py` validates propagation delay behavior end-to-end
Checklist
- `cargo test --workspace`
- `cargo fmt --check`
- `cargo clippy -- -W clippy::pedantic`
Breaking changes
None. The `gsi_pending` table is created automatically via the data migration on first `init` or server startup. Existing deployments gain crash safety transparently.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache License 2.0 and I agree to the Developer Certificate of Origin (DCO). See CONTRIBUTING.md for details.