[SPARK-56566][SS] Stop applying prev lower bound to transformWithState timer scans#55474
Closed
HeartSaVioR wants to merge 1 commit into
Closed
[SPARK-56566][SS] Stop applying prev lower bound to transformWithState timer scans#55474HeartSaVioR wants to merge 1 commit into
HeartSaVioR wants to merge 1 commit into
Conversation
HyukjinKwon
approved these changes
Apr 22, 2026
Contributor
Author
|
Thanks! Merging to master. |
longvu-db
pushed a commit
to longvu-db/spark
that referenced
this pull request
Apr 22, 2026
…e timer scans ### What changes were proposed in this pull request? This PR changes `TransformWithStateExec.processTimers` to stop passing `prevBatchTimestampMs` / `eventTimeWatermarkForLateEvents` as the exclusive lower bound to `processorHandle.getExpiredTimers(...)`. Timer scans now always run without a lower bound (full scan up to the current batch timestamp / eviction watermark). Concretely: - ProcessingTime branch: `getExpiredTimers(batchTimestamp, prevBatchTimestampMs)` -> `getExpiredTimers(batchTimestamp)`. - EventTime branch: the `STATEFUL_OPERATOR_ALLOW_MULTIPLE` conditional that computed `prevWatermark` is removed; `getExpiredTimers(watermark, prevWatermark)` -> `getExpiredTimers(watermark)`. The `TimerStateImpl.getExpiredTimers` signature is unchanged — its `prevExpiryTimestampMs` parameter (added in SPARK-56400) is retained as a library primitive so that we can re-enable the optimization in the future once `registerTimer` enforces `ts > currentBatchTimestamp / watermark` server-side. `TTLState.ttlEvictionIterator` is also unchanged. TTL expirations are always strictly above `prevBatchTimestampMs` by construction (TTL is processing-time-only, `ttlDuration` is validated `> 0`, and `batchTimestampMs` is monotonically non-decreasing), so the TTL path is not affected by this bug. ### Why are the changes needed? SPARK-56400 introduced a bounded `rangeScan` in `TimerStateImpl.getExpiredTimers` whose exclusive lower bound is `prevBatchTimestampMs` (ProcessingTime) or `eventTimeWatermarkForLateEvents` (EventTime, multi-op — the default). `registerTimer` has guard on the registered expiry, so a user can legally call `registerTimer(ts)` with `ts` at or below that lower bound. In that case the bounded scan excludes the entry, and because the lower bound is monotonically non-decreasing across batches, the timer is never fired in any subsequent batch either — silently lost. Reproduction: - ProcessingTime: from batch 2 onwards (`prevBatchTimestampMs` is non-`None`), any `registerTimer(ts <= prevBatchTimestampMs)` never fires. - EventTime (default `STATEFUL_OPERATOR_ALLOW_MULTIPLE=true`): any `registerTimer(ts <= eventTimeWatermarkForLateEvents)` never fires. - EventTime legacy single-op (`STATEFUL_OPERATOR_ALLOW_MULTIPLE=false`): unaffected — `processTimers` already falls back to a full scan in that mode. Affected idioms the bug silently breaks: - `registerTimer(0L)` as a "fire on the next batch" pattern. - Event-time timers derived from a record's `event_time` in multi-op chains where a downstream operator's late-events watermark is looser than the upstream. - Any `registerTimer(ts)` that hits `ts <= prev` even once — the timer is dead forever unless the user also calls `deleteTimer`. Reverting the lower bound is the most conservative fix. A complementary follow-up (tracked separately) should restrict the valid timestamp range on `registerTimer` so that `ts > currentBatchTimestamp / watermark` is enforced; after that lands we can re-enable the bounded scan. ### Does this PR introduce _any_ user-facing change? No, since the bug wasn't released yet. ### How was this patch tested? New UTs. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.7 Closes apache#55474 from HeartSaVioR/SPARK-56566. Authored-by: Jungtaek Lim <kabhwan.opensource@gmail.com> Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
This PR changes
TransformWithStateExec.processTimersto stop passingprevBatchTimestampMs/eventTimeWatermarkForLateEventsas the exclusive lower bound toprocessorHandle.getExpiredTimers(...). Timer scans now always run without a lower bound (full scan up to the current batch timestamp / eviction watermark).Concretely:
getExpiredTimers(batchTimestamp, prevBatchTimestampMs)->getExpiredTimers(batchTimestamp).STATEFUL_OPERATOR_ALLOW_MULTIPLEconditional that computedprevWatermarkis removed;getExpiredTimers(watermark, prevWatermark)->getExpiredTimers(watermark).The
TimerStateImpl.getExpiredTimerssignature is unchanged — itsprevExpiryTimestampMsparameter (added in SPARK-56400) is retained as a library primitive so that we can re-enable the optimization in the future onceregisterTimerenforcests > currentBatchTimestamp / watermarkserver-side.TTLState.ttlEvictionIteratoris also unchanged. TTL expirations are always strictly aboveprevBatchTimestampMsby construction (TTL is processing-time-only,ttlDurationis validated> 0, andbatchTimestampMsis monotonically non-decreasing), so the TTL path is not affected by this bug.Why are the changes needed?
SPARK-56400 introduced a bounded
rangeScaninTimerStateImpl.getExpiredTimerswhose exclusive lower bound isprevBatchTimestampMs(ProcessingTime) oreventTimeWatermarkForLateEvents(EventTime, multi-op — the default).registerTimerhas guard on the registered expiry, so a user can legally callregisterTimer(ts)withtsat or below that lower bound. In that case the bounded scan excludes the entry, and because the lower bound is monotonically non-decreasing across batches, the timer is never fired in any subsequent batch either — silently lost.Reproduction:
prevBatchTimestampMsis non-None), anyregisterTimer(ts <= prevBatchTimestampMs)never fires.STATEFUL_OPERATOR_ALLOW_MULTIPLE=true): anyregisterTimer(ts <= eventTimeWatermarkForLateEvents)never fires.STATEFUL_OPERATOR_ALLOW_MULTIPLE=false): unaffected —processTimersalready falls back to a full scan in that mode.Affected idioms the bug silently breaks:
registerTimer(0L)as a "fire on the next batch" pattern.event_timein multi-op chains where a downstream operator's late-events watermark is looser than the upstream.registerTimer(ts)that hitsts <= preveven once — the timer is dead forever unless the user also callsdeleteTimer.Reverting the lower bound is the most conservative fix. A complementary follow-up (tracked separately) should restrict the valid timestamp range on
registerTimerso thatts > currentBatchTimestamp / watermarkis enforced; after that lands we can re-enable the bounded scan.Does this PR introduce any user-facing change?
No, since the bug wasn't released yet.
How was this patch tested?
New UTs.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7