@rkistner
Contributor

This adds a check for replication slots that are "lost" because max_slot_wal_keep_size was exceeded, and automatically re-creates the slot if needed. Postgres 13 and later expose an explicit wal_status field, which we now check, along with some additional error message checks.
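As a rough illustration of the wal_status check described above (a minimal sketch, not the code in this PR): the `pg_replication_slots` view reports `wal_status` as one of `reserved`, `extended`, `unreserved` or `lost` on Postgres 13+. The `SlotRow` shape, the `SLOT_STATUS_QUERY` constant and the `isSlotLost` helper are hypothetical names used only for this example.

```typescript
// Hypothetical row shape for a query against pg_replication_slots.
// wal_status is optional here because the column only exists on Postgres 13+.
interface SlotRow {
  slot_name: string;
  wal_status?: string | null;
}

// Illustrative query; the actual query used by the service may differ.
const SLOT_STATUS_QUERY =
  'SELECT slot_name, wal_status FROM pg_replication_slots WHERE slot_name = $1';

// Returns true only when Postgres explicitly reports the slot as lost,
// i.e. the WAL it needs has already been removed.
function isSlotLost(row: SlotRow): boolean {
  return row.wal_status === 'lost';
}
```

A caller would drop and re-create the slot when `isSlotLost` returns true. Servers without the column still need the error message checks mentioned above.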

The slot health check now also waits at most 2 minutes for the slot to become healthy, rather than making up to 120 attempts. This matters because each individual attempt can take 2 minutes in some scenarios, so the overall check could take 4 hours (120 × 2 minutes) to fail.
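The difference between an attempt-bounded and a time-bounded wait can be sketched as follows. This is an illustrative example only, not the implementation in this PR; `waitForHealthySlot`, `WaitOptions` and the injectable `now`/`sleep` hooks are hypothetical names. Note that a single slow check can still overrun the deadline by its own duration.

```typescript
interface WaitOptions {
  timeoutMs: number;       // overall budget, e.g. 120_000 for 2 minutes
  pollIntervalMs: number;  // pause between checks
  now?: () => number;                    // injectable clock (for testing)
  sleep?: (ms: number) => Promise<void>; // injectable sleep (for testing)
}

// Polls `check` until it reports healthy or the overall deadline passes.
// The bound is on elapsed time, not on the number of attempts.
async function waitForHealthySlot(
  check: () => Promise<boolean>,
  options: WaitOptions
): Promise<boolean> {
  const now = options.now ?? Date.now;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  const deadline = now() + options.timeoutMs;
  while (true) {
    if (await check()) {
      return true;
    }
    if (now() >= deadline) {
      return false; // give up once the time budget is spent
    }
    await sleep(options.pollIntervalMs);
  }
}
```

With this shape, a check that itself takes 2 minutes causes the wait to give up after one attempt instead of repeating 120 times.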

This does not yet solve the issue of this health check potentially causing high load on the source database. For that we should probably use an exponential back-off mechanism for the overall replication retries (separate PR).
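For reference, an exponential back-off for the replication retries could compute its delays roughly like this. This is only a sketch of the general technique under assumed parameters; `BackoffOptions` and `backoffDelayMs` are hypothetical names, and the actual follow-up PR may look quite different.

```typescript
interface BackoffOptions {
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound on any single delay
  factor: number;      // multiplier applied per failed attempt
}

// Delay before retry number `attempt` (0-based): base * factor^attempt, capped.
function backoffDelayMs(attempt: number, options: BackoffOptions): number {
  const raw = options.baseDelayMs * Math.pow(options.factor, attempt);
  return Math.min(options.maxDelayMs, raw);
}
```

With a base of 1 second, a factor of 2 and a cap of 60 seconds, the delays grow as 1s, 2s, 4s, 8s and so on until they reach the cap. Adding random jitter is a common refinement when several instances retry at once.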

@changeset-bot

changeset-bot bot commented Oct 30, 2025

🦋 Changeset detected

Latest commit: 72b12b1

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 12 packages
| Name | Type |
| --- | --- |
| @powersync/service-module-postgres | Patch |
| @powersync/lib-service-postgres | Patch |
| @powersync/service-schema | Patch |
| @powersync/service-image | Patch |
| @powersync/service-module-postgres-storage | Patch |
| @powersync/service-module-mongodb | Patch |
| @powersync/service-module-mysql | Patch |
| @powersync/service-core | Patch |
| @powersync/service-core-tests | Patch |
| @powersync/service-module-core | Patch |
| @powersync/service-module-mongodb-storage | Patch |
| test-client | Patch |

@rkistner rkistner marked this pull request as ready for review October 30, 2025 02:00
stevensJourney previously approved these changes Oct 30, 2025
Collaborator

@stevensJourney stevensJourney left a comment

Looks good to me :)

Base automatically changed from test-pg-18 to main October 31, 2025 09:38
@rkistner rkistner dismissed stevensJourney’s stale review October 31, 2025 09:38

The base branch was changed.

@rkistner rkistner force-pushed the lost-replication-slot branch from e1f9e83 to 5b6e088 Compare October 31, 2025 09:39
@rkistner rkistner merged commit 0e9aa94 into main Oct 31, 2025
22 checks passed
@rkistner rkistner deleted the lost-replication-slot branch October 31, 2025 10:40
3 participants