[Incremental Reprocessing] [MongoDB] stream while snapshotting by rkistner · Pull Request #641 · powersync-ja/powersync-service

rkistner · 2026-05-18T15:12:14Z

This is a re-implementation of #450.

This refactors MongoDB replication to:

Split out snapshotting implementation from streaming replication.
Start streaming concurrently with the initial snapshot.

If a table needs a re-snapshot (e.g. due to replica identity changes), this also happens concurrently with streaming (although it still blocks the next commit until it has completed). Truncating tables still blocks the replication stream.

The change

Currently, the replication process is effectively linear / "single threaded". When new sync config is deployed, we create a new replication stream, which performs a snapshot on each table, then starts streaming. This has a couple of limitations:

Replication is slower than it needs to be due to not being able to replicate tables concurrently.
After the initial table snapshots are complete, there could be a significant replication lag that we need to catch up on.

The changes here are also part of the bigger project to implement differential sync config updates - only re-replicating for changed bucket definitions / sync stream definitions. Part of that requires switching to a single replication stream for all copies of sync config, and this builds the base to implement that.

Implementation notes

There is a rare but important edge case when we start doing snapshots while streaming: if we don't use soft deletes, as introduced in storage version 3 in #425, then we can miss deletes that were made during the initial snapshot. Because of this, concurrent snapshots are disabled on the older storage versions. This also adds a test for that specific case. Because reproducing the issue requires very specific timing, this adds some storage hooks that allow executing logic before and after a flush. We can later investigate using the same or similar hooks to simplify other tests that currently rely on timing only.
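To make the edge case concrete, here is a toy in-memory model of the race. It is only a sketch: `ToyStorage`, `writeSnapshotRow`, `streamDelete` and `replay` are hypothetical names, and the tombstone mechanism is an assumption about how soft deletes could prevent the problem, not a description of the actual storage implementation.

```typescript
// Hypothetical toy model; none of these names come from the PR.
type Row = { id: string; value: string };
type Entry = { kind: 'row'; row: Row } | { kind: 'tombstone' };

class ToyStorage {
  private entries = new Map<string, Entry>();
  private softDeletes: boolean;

  constructor(softDeletes: boolean) {
    this.softDeletes = softDeletes;
  }

  // Write coming from a table snapshot. It may be based on a stale read.
  writeSnapshotRow(row: Row): void {
    const existing = this.entries.get(row.id);
    if (existing?.kind === 'tombstone') {
      // Assumption: a recorded delete wins over a snapshot copy of the row.
      return;
    }
    this.entries.set(row.id, { kind: 'row', row });
  }

  // Delete coming from the change stream.
  streamDelete(id: string): void {
    if (this.softDeletes) {
      // Soft delete: remember that the row was deleted.
      this.entries.set(id, { kind: 'tombstone' });
    } else {
      // Hard delete: no trace of the delete remains.
      this.entries.delete(id);
    }
  }

  has(id: string): boolean {
    return this.entries.get(id)?.kind === 'row';
  }
}

// Replays the problematic ordering and reports whether the row survives.
function replay(softDeletes: boolean): boolean {
  const storage = new ToyStorage(softDeletes);
  // 1. The snapshot reads the row from the source, but has not flushed it yet.
  const staleRead: Row = { id: 'a', value: 'v1' };
  // 2. The row is deleted at the source and the stream applies the delete first.
  storage.streamDelete('a');
  // 3. The snapshot flushes its stale read.
  storage.writeSnapshotRow(staleRead);
  return storage.has('a');
}

const resurrectedWithoutSoftDeletes = replay(false);
const resurrectedWithSoftDeletes = replay(true);
```

In this model the hard-delete variant ends up with the deleted row still present, while the soft-delete variant does not. A real implementation would also need to order tombstones against later re-inserts, which this sketch ignores.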

The main implementation change is to move snapshotting to a separate MongoSnapshotter class, using a separate "queue" that can operate concurrently with streaming replication. This does not yet allow using separate processes for snapshotting versus streaming, or using multiple concurrent snapshotters. Support for those can be added in the future, but requires more careful checks for potential race conditions / consistency issues.
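As a rough illustration of the queue idea, the following sketch runs table snapshots one at a time on a promise chain while the streaming side keeps working, and only waits for the queue before a commit. `SnapshotQueue`, `enqueue`, `idle` and `demo` are hypothetical names; this is not the MongoSnapshotter API.

```typescript
// Hypothetical sketch; the real MongoSnapshotter may be structured differently.
type SnapshotFn = (table: string) => Promise<void>;

class SnapshotQueue {
  // Tail of the promise chain: resolves once every queued snapshot has finished.
  private tail: Promise<void> = Promise.resolve();
  private snapshotTable: SnapshotFn;

  constructor(snapshotTable: SnapshotFn) {
    this.snapshotTable = snapshotTable;
  }

  // Queue a table snapshot. Snapshots run sequentially, never in parallel.
  // A failed snapshot rejects the chain; the error surfaces from idle().
  enqueue(table: string): void {
    this.tail = this.tail.then(() => this.snapshotTable(table));
  }

  // Wait until the queue is empty, including snapshots queued while waiting.
  async idle(): Promise<void> {
    let seen: Promise<void>;
    do {
      seen = this.tail;
      await seen;
    } while (seen !== this.tail);
  }
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  const queue = new SnapshotQueue(async (table) => {
    log.push(`snapshot:start:${table}`);
    await Promise.resolve(); // stands in for real I/O
    log.push(`snapshot:done:${table}`);
  });

  queue.enqueue('lists');
  // Streaming is not blocked by the queued snapshot.
  log.push('stream:change');
  // The next commit waits for pending snapshots to complete.
  await queue.idle();
  log.push('stream:commit');
  return log;
}
```

A single sequential queue keeps the consistency reasoning simple, which matches the note above that multiple concurrent snapshotters are left for later.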

There are additional edge cases around table snapshots: we can now get markSnapshotDone() at the same time that another snapshot is queued. To handle this, markSnapshotDone() now explicitly checks that there are no pending individual table snapshots. The old markAllSnapshotDone() is still kept around for tests.
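The guard can be pictured with a small sketch like the one below. Only the method names markSnapshotDone() and markAllSnapshotDone() come from the description; `ToySnapshotState`, the pending set, and the choice to throw on a failed check are assumptions made for illustration.

```typescript
// Hypothetical sketch of the guard, not the actual storage implementation.
class ToySnapshotState {
  private pendingTables = new Set<string>();
  snapshotDone = false;

  queueTableSnapshot(table: string): void {
    this.pendingTables.add(table);
  }

  completeTableSnapshot(table: string): void {
    this.pendingTables.delete(table);
  }

  // Guarded: refuses to mark the snapshot done while table snapshots are pending.
  // Assumption: a failed check is reported by throwing.
  markSnapshotDone(): void {
    if (this.pendingTables.size > 0) {
      const names = [...this.pendingTables].join(', ');
      throw new Error(`Cannot mark snapshot done, pending tables: ${names}`);
    }
    this.snapshotDone = true;
  }

  // Unguarded: in this sketch it skips the check, as a test-only shortcut.
  markAllSnapshotDone(): void {
    this.pendingTables.clear();
    this.snapshotDone = true;
  }
}

const state = new ToySnapshotState();
state.queueTableSnapshot('lists');

let rejectedWhilePending = false;
try {
  state.markSnapshotDone();
} catch {
  rejectedWhilePending = true;
}

state.completeTableSnapshot('lists');
state.markSnapshotDone();
```

With a guard of this shape, a table that is registered but never snapshotted would keep the check failing indefinitely, which is the kind of situation the MSSQL change below avoids.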

The above has a specific effect on MSSQL: there were cases where a capture instance is not present, in which the SourceTable was created but the snapshot was never performed, which then failed the above check during markSnapshotDone(). This is now changed to not persist the SourceTable at all in that case.

AI Usage & Implementation notes

Used Codex gpt-5.5 to assist with the implementation, manually guiding and reviewing the changes. A lot of the changes were ported directly from #450, but with additional changes to keep the old behavior for older storage versions, and to fix more edge cases.

changeset-bot · 2026-05-18T15:12:25Z

🦋 Changeset detected

Latest commit: 32a5292

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 11 packages

Name	Type
@powersync/service-module-postgres-storage	Minor
@powersync/service-module-mongodb-storage	Minor
@powersync/service-module-postgres	Minor
@powersync/service-module-mongodb	Minor
@powersync/service-core	Minor
@powersync/service-module-mssql	Minor
@powersync/service-module-mysql	Minor
@powersync/service-schema	Minor
@powersync/service-image	Minor
@powersync/service-module-core	Patch
test-client	Patch

rkistner · 2026-05-20T09:57:13Z

@Rentacookie Could you check the CDCStream changes here please? See PR description for background on that.

Rentacookie

This looks good to me. I like that most of the snapshotting logic has been split out of the ChangeStream, it makes it much easier to reason about.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dfff28e672

stevensJourney

This looks good to me also. I really like the addition of hooks to the storage.

rkistner added 3 commits May 18, 2026 17:02

[WIP] Concurrent snapshot + streaming for MongoDB.

71dcdbb

Cleanup dead code.

1e2750e

Use native Promise.withResolvers()

bfec131

rkistner added 7 commits May 18, 2026 17:25

Fix queue race conditions.

c277e31

Rename abortSignal.

0478069

Add failing regression test for resurrected deletes.

f4a3d4e

Avoid concurrent snapshots on older storage versions.

eb3230b

Fix uncaught rejection.

b9c07cb

Fix test.

6b4780c

Add changeset.

7c88a35

rkistner marked this pull request as ready for review May 19, 2026 11:10

Add comments back.

5292186

rkistner marked this pull request as draft May 19, 2026 11:47

rkistner added 4 commits May 19, 2026 13:55

Redo populatePersistentChecksumCache ordering.

4e925a0

Guard snapshot complete condition.

c03e260

Handle non-concurrent inline snapshots.

519bacd

Bypass the snapshot guard for tests.

a7ec8a9

rkistner marked this pull request as ready for review May 19, 2026 12:20

SQL server: Don't resolve tables we can't snapshot.

617c01b

rkistner marked this pull request as draft May 19, 2026 13:08

rkistner added 2 commits May 19, 2026 16:42

"Capture instance created" is no longer a schema change.

f0ac876

Handle snapshot queue race conditions.

0fbec3e

rkistner changed the title ~~[Incremental Reprocessing] [WIP] [MongoDB] stream while snapshotting~~ [Incremental Reprocessing] [MongoDB] stream while snapshotting May 20, 2026

Update changeset.

5f52a0a

rkistner marked this pull request as ready for review May 20, 2026 09:53

Refactor.

dfdc4a7

rkistner force-pushed the mongo-concurrent-streaming-2 branch from fd14b43 to dfdc4a7 Compare May 20, 2026 10:04

rkistner added 3 commits May 20, 2026 12:05

Fix resnapshotting tables.

8d590b6

Remove relationCache from MongoSnapshotter.

51e48aa

Simplify relationCache usage.

17813a6

rkistner requested a review from stevensJourney May 20, 2026 10:57

Rentacookie reviewed May 20, 2026

Comment thread modules/module-mssql/src/replication/CDCStream.ts

Rentacookie reviewed May 20, 2026

Comment thread modules/module-mongodb/src/replication/ChangeStream.ts Outdated

Rentacookie reviewed May 20, 2026

Comment thread modules/module-mongodb/src/replication/ChangeStream.ts Outdated

Rentacookie previously approved these changes May 20, 2026

Clarify replication error handling.

b148243

rkistner dismissed Rentacookie’s stale review via b148243 May 20, 2026 14:15

Move concurrency logic to MongoSnapshotter.

dfff28e

chatgpt-codex-connector Bot reviewed May 20, 2026

Comment thread modules/module-mssql/src/replication/CDCStream.ts

rkistner added 3 commits May 20, 2026 17:25

Use read-only table status when snapshotting.

a27916a

Remove dead code.

71c26cb

Merge remote-tracking branch 'origin/main' into mongo-concurrent-stre…

32a5292

…aming-2

stevensJourney approved these changes May 21, 2026

rkistner merged commit 15e2466 into main May 21, 2026
44 checks passed

rkistner deleted the mongo-concurrent-streaming-2 branch May 21, 2026 09:11

rkistner merged 28 commits into main from mongo-concurrent-streaming-2

3 participants
