Merged
🦋 Changeset detected (latest commit: cbdc4d3). The changes in this PR will be included in the next version bump. This PR includes changesets to release 12 packages.
rkistner commented Mar 30, 2026

Rentacookie (Contributor) approved these changes on Mar 30, 2026:

That's a really big improvement in execution time! 🎉
Background
We run per-table validations for Postgres, both for the validation API (using the passed-in sync config), and the diagnostics API (using the current sync config).
This runs six checks sequentially for each table:

1. A query against `pg_class`.
2. Another query against `pg_class`.
3. A query against `pg_attribute` and `pg_index`.
4. `SELECT 1 FROM <table> LIMIT 1`.
5. A query against `pg_publication_tables`.
6. A query against `pg_class` and `pg_roles`.

This results in 6 sequential queries for every table. This is fine for a small number of tables, or when we have low latency to the source database. However, we have seen cases with many tables combined with high latency to the source database, where the validation and diagnostics APIs then take longer than the 60s timeout to complete.
The fix
This refactors all these checks into a single plpgsql script, using a total of 3 round-trips to the source database.
This also adds more comprehensive tests to check that we're covering each case.
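As an illustrative sketch of this pattern (the setting name, temp-table layout, and check shown here are assumptions for illustration, not the actual PR code), the per-table checks can be consolidated into one anonymous plpgsql block that loops over the requested tables and records results in a temp table, read back in a separate statement:

```sql
-- Sketch only: 'powersync.tables' and table_check_results are hypothetical names.
BEGIN;

-- Round-trip 1: pass "parameters" via a transaction-local setting.
SELECT set_config('powersync.tables', 'public.users,public.orders', true);

-- Round-trip 2: run all per-table checks in a single plpgsql block.
DO $$
DECLARE
  tbl text;
BEGIN
  CREATE TEMP TABLE table_check_results (table_name text, error text)
    ON COMMIT DROP;
  FOREACH tbl IN ARRAY string_to_array(current_setting('powersync.tables'), ',')
  LOOP
    BEGIN
      -- e.g. the accessibility probe; ::regclass validates and quotes the name.
      EXECUTE format('SELECT 1 FROM %s LIMIT 1', tbl::regclass);
      INSERT INTO table_check_results VALUES (tbl, NULL);
    EXCEPTION WHEN OTHERS THEN
      INSERT INTO table_check_results VALUES (tbl, SQLERRM);
    END;
  END LOOP;
END
$$;

-- Round-trip 3: read back the collected results.
SELECT * FROM table_check_results;

COMMIT;
```

Collecting per-table errors in a temp table works around the fact that a `DO` block cannot return a result set directly.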
For the most part, the behavior is kept the same. One exception is `REPLICA IDENTITY NOTHING`: this was previously reported as an error, but we do support it in the replication code (with limits, since postgres prevents you from updating or deleting rows), so this check is now relaxed.

Testing
Testing validation with a high-latency db connection (200ms+) and 70 tables:

Before: 80s (roughly consistent with 70 tables × 6 round-trips × 200ms ≈ 84s of latency alone).

After: 6s, of which around 2s is from `getDebugTablesInfo`. The rest is from other connection checks and fetching of the schema.

The implementation
The actual queries are to a large extent based on the old ones, ported by Codex to a single plpgsql script. Since these scripts / DO blocks can't take in parameters directly, this uses `set_config` / `current_setting` to set the parameters in a local transaction.

Instead of a single plpgsql script, it could be feasible to run this as a couple of separate queries. The biggest difficulty would be the loop doing `SELECT 1 FROM <table> LIMIT 1`, which is tricky without a script (short of individual per-table queries, which would be too slow).
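For illustration, the `set_config` / `current_setting` parameter-passing pattern looks like this (the setting name below is hypothetical). Passing `true` as the third argument of `set_config` makes the setting local to the current transaction, so it cannot leak to other sessions on a pooled connection:

```sql
BEGIN;
-- Third argument true = is_local: the setting lives only for this transaction.
SELECT set_config('myapp.param_tables', 'public.users,public.orders', true);

DO $$
BEGIN
  -- Read the "parameter" back inside the script.
  RAISE NOTICE 'tables: %', current_setting('myapp.param_tables');
END
$$;
COMMIT;
```

Custom settings must use a dotted, namespaced name (like `myapp.param_tables`) so they don't collide with built-in configuration parameters.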