Skip to content

fix: persist GSI queue to PostgreSQL for crash safety#128

Open
LeeroyHannigan wants to merge 1 commit into
mainfrom
fix/persistent-gsi-queue
Open

fix: persist GSI queue to PostgreSQL for crash safety#128
LeeroyHannigan wants to merge 1 commit into
mainfrom
fix/persistent-gsi-queue

Conversation

@LeeroyHannigan
Copy link
Copy Markdown
Collaborator

What

Replaces the in-memory VecDeque GSI propagation queue with a PostgreSQL-backed persistent queue (gsi_pending table). Pending GSI updates are now inserted within the same transaction as the base table write and processed by workers that claim ready rows using DELETE ... RETURNING with FOR UPDATE SKIP LOCKED.

Key changes:

  • New gsi_pending table in the data database (migration 002_gsi_pending.sql)
  • enqueue_gsi_pending() inserts inside the base write transaction, zero crash window
  • Propagation delay enforced by ready_at timestamp, not by sleeping inside transactions
  • Workers never hold connections idle, they only touch rows whose delay has elapsed
  • Index metadata cached per table_id with 30s TTL to avoid repeated catalog queries

Why

Closes #125

The previous in-memory queue lost all pending GSI updates on process crash or restart, causing permanent GSI inconsistency with no recovery path. The only workaround was re-touching every item in the base table, effectively data loss at scale.

Testing done

  • cargo fmt --all -- --check — clean
  • cargo clippy --all-targets -- -D warnings — clean
  • cargo test --workspace — 375 tests pass
  • Verified gsi_pending table schema created correctly via \d gsi_pending
  • Existing test_gsi_async.py validates propagation delay behavior end-to-end

Checklist

  • I have read CONTRIBUTING.md
  • All tests pass (cargo test --workspace)
  • Code is formatted (cargo fmt --check)
  • Clippy is clean (cargo clippy -- -W clippy::pedantic)
  • I have added or updated tests for new functionality
  • I have updated documentation if behavior changed
  • Breaking changes are noted below (if any)

Breaking changes

None. The gsi_pending table is created automatically via the data migration on first init or server startup. Existing deployments gain crash safety transparently.


By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache License 2.0 and I agree to the Developer Certificate of
Origin (DCO). See CONTRIBUTING.md for details.

SELECT id FROM gsi_pending \
WHERE ready_at <= NOW() \
ORDER BY id \
LIMIT $1 \
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is LIMIT applied before or after ORDER BY?

Copy link
Copy Markdown
Collaborator

@jcshepherd jcshepherd May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a bit of reading ... It looks like Postgres will sort the entire resultset first, and then apply the limit. So if "ready_at" is far enough in the past on a busy table, the sort could be expensive, though the index on ready_at should help mitigate that. My guess is that BATCH_SIZE is small enough to discourage the planner from believing that a full table scan would be cheaper than an index scan.

pk_hash(pk_text.as_ref()),
&key_info.account_id,
&key_info.table_name,
if has_async_indexes(&indexes, sys_delay) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be called once at the beginning (right after indexes is populated).

-- Inserted atomically within the base write transaction, consumed by
-- background workers. Survives process crash/restart.

CREATE TABLE IF NOT EXISTS gsi_pending (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall where it's buried, but somewhere there is a catalog version identifier that should be updated when the metadata/system table schema is updated, so that extenddb migrate will know that there is a migration to perform. I believe it's in storage-postgres somewhere.

Copy link
Copy Markdown
Collaborator

@jcshepherd jcshepherd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple initial questions/comments. Probably the one I'm most concerned with is the evaluation order of LIMIT and ORDER BY. If I were a gambler, I'd wager results are LIMITed before ORDER BY, which may not given you the ordering guarantees you want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Data Loss / Inconsistency on process exit

2 participants