fix: add index on GSI base key columns for efficient propagation deletes by LeeroyHannigan · Pull Request #127 · ExtendDB/extenddb

LeeroyHannigan · 2026-05-25T11:39:21Z

What

Add a B-tree index on (base_pk, base_sk_*) to every GSI/LSI table, both at creation time and retroactively on server startup for existing tables.

Why

Closes #115, Closes #124

When an item with GSI keys is updated, ExtendDB deletes the old GSI row using WHERE base_pk = $1 AND base_sk_s = $2. The existing indexes on GSI tables lead with the GSI partition key (pk), so PostgreSQL cannot use them for this lookup, it falls back to a sequential scan of the entire GSI table. At 20GB this produces 730ms median latency; at 200GB it times out entirely.

Testing done

cargo fmt --all -- --check — clean
cargo clippy --all-targets -- -D warnings — clean
cargo test --workspace — 375 tests pass
Verified DDL path: created table with GSI, confirmed ddb_base_key_idx index on (base_pk, base_sk_s) via pg_indexes
Verified startup migration path: dropped the index manually, restarted server, confirmed ensure_gsi_base_key_indexes recreated it using CREATE INDEX CONCURRENTLY IF NOT EXISTS

Checklist

I have read CONTRIBUTING.md
All tests pass (cargo test --workspace)
Code is formatted (cargo fmt --check)
Clippy is clean (cargo clippy -- -W clippy::pedantic)
I have added or updated tests for new functionality
I have updated documentation if behavior changed
Breaking changes are noted below (if any)

Breaking changes

None. The change is purely additive, an extra index on existing columns. No schema, wire protocol, or behavioral changes.

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache License 2.0 and I agree to the Developer Certificate of
Origin (DCO). See CONTRIBUTING.md for details.

Closes #124

parsnips · 2026-05-25T19:52:09Z

this resolves #115

jcshepherd

I'm okay with the secondary index approach in ddl,rs. Happy to debate further if other folks have concerns or alternate proposals.

I actually think the auto-upgrade in lib.rs opens a can of worms. As a one-off for a 0.11 release: fine, whatever. I don't know that we want this lurking in the code forever though. And I suspect it's not going to be the last time we need to change something about the schema of a user-data table.

Should we be looking at a way to migrate user-data tables to new schema, just like we have for metadata tables? That way the user can pick when they want the migration to happen (because they perform it using the CLI), and also more easily see what is going to change.

Any thoughts trying that approach?

fix: add index on GSI base key columns for efficient propagation deletes

ce1b2b3

Closes #124

parsnips mentioned this pull request May 25, 2026

perf(gsi): add index on (base_pk, base_sk_*) for cascade deletes #116

Closed

7 tasks

jcshepherd reviewed May 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: add index on GSI base key columns for efficient propagation deletes#127

fix: add index on GSI base key columns for efficient propagation deletes#127
LeeroyHannigan wants to merge 1 commit into
mainfrom
fix/gsi-base-key-index

LeeroyHannigan commented May 25, 2026 •

edited

Loading

Uh oh!

parsnips commented May 25, 2026

Uh oh!

jcshepherd left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

LeeroyHannigan commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

Testing done

Checklist

Uh oh!

parsnips commented May 25, 2026

Uh oh!

jcshepherd left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

LeeroyHannigan commented May 25, 2026 •

edited

Loading