Auto-detect starting ID from partition range in synchronize#20
When `--start` is omitted, synchronize now resolves the partition time range from the intermediate table and uses it to scope the initial `MIN(id)` lookup and batch queries, matching how fill already works. Shared partition resolution logic (`transformIdValue`, `resolvePartitionContext`, `resolvePartitionTimeFilter`) is consolidated in `table.ts`, eliminating duplication across `synchronizer.ts` and `filler.ts`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    } else {
      // Get max from dest - resume from where we left off (exclusive)
      const destMaxId = await destTable.maxId(tx);
      startingId = destMaxId ?? undefined;
The description says that fill auto-calculates the starting ID. Is that true? Asking because, from the code above, it looks like it actually selects the greatest ID from the destination table (`destTable` is generally the intermediate table).
So if you are resuming a previous fill, it'll start where it left off. But if it's the first fill, then `startingId` will be undefined. Does this mean we are relying on the `timeFilter` to ensure that fill gets the oldest ID from the source table that would fit into the oldest partition?
Curious what you found from testing. This part of the code was always a bit murky to me.
The description could've been clearer. The behaviour of fill is unchanged. The auto-detection is the new synchronize behaviour.
As you pointed out, during fill's first run the `startingId` is undefined, so `timeFilter` does the scoping. What I changed here is to make synchronize partition-aware too, by computing a time-bounded `minId` instead of an unbounded `MIN(id)`.
I added a test ("scopes to partition range on first fill") that demonstrates the fill behaviour explicitly: the test inserts an in-range and an out-of-range row and asserts that only the in-range row gets filled.
The test is the fill counterpart to the synchronize test at https://github.com/workos/pgslice/pull/20/changes#diff-61fad5299499932598bc32e477627a209e7dfaa4be682eaac2b88db17c7b2ed3R292
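To make the first-run versus resume behaviour discussed above concrete, here is a minimal in-memory sketch. It is not the pgslice implementation: `Row`, `TimeFilter`, `maxId`, and `nextFillBatch` are hypothetical names, and the real code runs SQL against Postgres rather than filtering arrays.

```typescript
// Hypothetical in-memory stand-ins; the real code queries Postgres.
interface Row {
  id: number;
  createdAt: Date;
}

// Assumed shape: a half-open range [from, to) covering the existing partitions.
interface TimeFilter {
  from: Date;
  to: Date;
}

// Greatest ID in a table, or undefined when the table is empty.
function maxId(rows: Row[]): number | undefined {
  return rows.length === 0 ? undefined : Math.max(...rows.map((r) => r.id));
}

function nextFillBatch(
  sourceRows: Row[],
  destRows: Row[],
  timeFilter: TimeFilter,
  batchSize: number,
): Row[] {
  // Resume after the destination's max ID (exclusive); undefined on a first run.
  const startingId = maxId(destRows);
  return sourceRows
    .filter(
      (r) =>
        r.createdAt.getTime() >= timeFilter.from.getTime() &&
        r.createdAt.getTime() < timeFilter.to.getTime(),
    )
    .filter((r) => startingId === undefined || r.id > startingId)
    .sort((a, b) => a.id - b.id)
    .slice(0, batchSize);
}

const partitionRange: TimeFilter = {
  from: new Date("2024-01-01T00:00:00Z"),
  to: new Date("2024-03-01T00:00:00Z"),
};
const outOfRange: Row = { id: 1, createdAt: new Date("2023-06-15T00:00:00Z") };
const inRangeA: Row = { id: 2, createdAt: new Date("2024-01-10T00:00:00Z") };
const inRangeB: Row = { id: 3, createdAt: new Date("2024-02-20T00:00:00Z") };
const allSourceRows: Row[] = [outOfRange, inRangeA, inRangeB];

// First run: destination is empty, so only the time filter scopes the batch.
const firstRun = nextFillBatch(allSourceRows, [], partitionRange, 10);
// Resume: destination already holds id 2, so the batch starts after it.
const resumed = nextFillBatch(allSourceRows, [inRangeA], partitionRange, 10);
```

In this sketch the out-of-range row is never selected, even though it has the smallest ID, because the time filter is applied before the starting ID is considered.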
Cool, appreciate you taking another look.
Moves `resolvePartitionContext` and `resolvePartitionTimeFilter` from free functions to `Table#partitionContext` and `Table#partitionTimeFilter`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
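One way the method-based shape could look is sketched below. Everything except the two method names is an assumption: the `PartitionSettings`, `Partition`, `PartitionTimeFilter`, and `PartitionContext` types are invented for illustration, the class is named `TableSketch` to make clear it is not the real `Table`, and the methods here are synchronous and in-memory, whereas the real ones presumably query the database.

```typescript
// All types below are hypothetical; only the two method names come from the commit message.
interface PartitionSettings {
  column: string;
  period: "day" | "month" | "year";
}

interface Partition {
  name: string;
  from: Date; // inclusive lower bound
  to: Date; // exclusive upper bound
}

interface PartitionTimeFilter {
  column: string;
  from: Date;
  to: Date;
}

interface PartitionContext {
  settings: PartitionSettings;
  partitions: Partition[];
  timeFilter: PartitionTimeFilter | undefined;
}

class TableSketch {
  private readonly settings: PartitionSettings;
  private readonly parts: Partition[];

  constructor(settings: PartitionSettings, parts: Partition[]) {
    this.settings = settings;
    this.parts = parts;
  }

  // Returns settings, partitions, and the derived time filter in one call.
  partitionContext(): PartitionContext {
    const sorted = [...this.parts].sort(
      (a, b) => a.from.getTime() - b.from.getTime(),
    );
    let timeFilter: PartitionTimeFilter | undefined;
    if (sorted.length > 0) {
      const latestEnd = Math.max(...sorted.map((p) => p.to.getTime()));
      timeFilter = {
        column: this.settings.column,
        from: new Date(Math.min(...sorted.map((p) => p.from.getTime()))),
        to: new Date(latestEnd),
      };
    }
    return { settings: this.settings, partitions: sorted, timeFilter };
  }

  // Convenience wrapper for callers that only need the time filter.
  partitionTimeFilter(): PartitionTimeFilter | undefined {
    return this.partitionContext().timeFilter;
  }
}

const monthly: PartitionSettings = { column: "created_at", period: "month" };
const eventsTable = new TableSketch(monthly, [
  { name: "events_202402", from: new Date("2024-02-01T00:00:00Z"), to: new Date("2024-03-01T00:00:00Z") },
  { name: "events_202401", from: new Date("2024-01-01T00:00:00Z"), to: new Date("2024-02-01T00:00:00Z") },
]);
const eventsFilter = eventsTable.partitionTimeFilter();
const emptyFilter = new TableSketch(monthly, []).partitionTimeFilter();
```

Keeping the wrapper as a thin call into `partitionContext()` means the range computation lives in one place, which is the kind of deduplication the commit describes.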
Summary
- When `--start` is omitted, synchronize now uses the intermediate table's partition range to compute the starting ID, instead of scanning the entire source table with an unbounded `MIN(id)`.
- This avoids "no partition found" errors when source rows fall outside existing partitions.
- Shared partition resolution helpers (`transformIdValue`, `resolvePartitionContext`, `resolvePartitionTimeFilter`) are consolidated in `table.ts`, eliminating duplication across `synchronizer.ts` and `filler.ts`.

Motivation
Fill has always resolved partition settings from the destination table and applied a time filter to its batch queries. Synchronize was the odd one out: it did a bare `MIN(id)` across the entire source table and had no time bounds on batch fetches.

Callers no longer need to manually compute a starting ID based on partition boundaries. The `--start` option still works for explicit overrides, but the default behavior is now partition-aware.
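The difference between the bare and the time-bounded `MIN(id)` lookup can be illustrated with a small query builder. This is a sketch under assumptions, not the SQL pgslice actually emits: `buildMinIdQuery`, `quoteIdent`, `TimeBound`, and `SqlQuery` are hypothetical names, and the half-open `>=` / `<` range is an assumed convention.

```typescript
// Hypothetical types; the real project builds its queries through its own SQL layer.
interface TimeBound {
  column: string;
  from: Date; // inclusive
  to: Date; // exclusive
}

interface SqlQuery {
  text: string;
  values: Date[];
}

// Quote an identifier by wrapping it in double quotes and doubling embedded quotes.
function quoteIdent(identifier: string): string {
  return `"${identifier.replace(/"/g, '""')}"`;
}

function buildMinIdQuery(
  table: string,
  idColumn: string,
  bound?: TimeBound,
): SqlQuery {
  const base = `SELECT MIN(${quoteIdent(idColumn)}) FROM ${quoteIdent(table)}`;
  if (bound === undefined) {
    // Old synchronize behaviour: scans the whole source table.
    return { text: base, values: [] };
  }
  // Partition-aware behaviour: only rows inside the partition range count.
  const col = quoteIdent(bound.column);
  return {
    text: `${base} WHERE ${col} >= $1 AND ${col} < $2`,
    values: [bound.from, bound.to],
  };
}

const unboundedQuery = buildMinIdQuery("events", "id");
const boundedQuery = buildMinIdQuery("events", "id", {
  column: "created_at",
  from: new Date("2024-01-01T00:00:00Z"),
  to: new Date("2024-03-01T00:00:00Z"),
});
```

With the bounded form, a source row older than the first partition can no longer become the starting ID, which is what would otherwise lead to the "no partition found" error mentioned in the summary.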
What changed
- `table.ts`: `transformIdValue` exported (with nullable/non-nullable overloads). New `resolvePartitionContext` returns settings, partitions, and time filter in one call. `resolvePartitionTimeFilter` is a convenience wrapper. `partitions()` signature widened to `CommonQueryMethods`.
- `synchronizer.ts`: Uses shared helpers. `init` passes the time filter to `sourceTable.minId()` when `--start` is omitted. `#fetchBatch` applies the time filter to batch queries.
- `filler.ts`: Replaced inline partition resolution and local `transformIdValue` with shared imports from `table.ts`.

Test plan
- … `--start` is omitted and partitions exist
- … `PGSLICE_URL`

🤖 Generated with Claude Code