branch-4.1: [fix](streaming-job) Avoid NPE on cross-table DML during snapshot chunk read #63435 #63503

Merged
yiguolei merged 1 commit into branch-4.1 from auto-pick-63435-branch-4.1 on May 22, 2026

@github-actions
Contributor

Cherry-picked from #63435

…nk read (#63435)

### What problem does this PR solve?

During the snapshot phase, the chunk reader (`IncrementalSourceScanFetcher`) consumes from a change-event queue that may also contain DML records from **other tables** being captured concurrently. When such a foreign-table record reached `isChangeRecordInChunkRange`, the code compared it against the **current chunk's** PK range via `isRecordBetween(...)`. This caused two problems:

1. The foreign table's schema may not yet be loaded for this fetcher, so extracting its PK throws an NPE.
2. Even if the schema were loaded, the foreign table's PK columns do not necessarily align with this chunk's bounds, so any range comparison is meaningless and the record would be incorrectly merged into the wrong chunk's output buffer.

This patch adds an explicit `TableId` check at the very start of `isChangeRecordInChunkRange`: records whose `TableId` does not match `currentSnapshotSplit.getTableId()` are skipped before any PK-based comparison runs.
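
For readers who want to see the shape of the guard, here is a minimal, self-contained sketch. It is not the code from this patch: `ChangeRecord`, `SnapshotSplit`, the `loadedPkColumns` map, and both method signatures are hypothetical simplifications (a single integer PK column, a half-open range), and only the method names are borrowed from the description above. It assumes the NPE comes from a missing schema entry for the foreign table.

```java
import java.util.List;
import java.util.Map;

public class ChunkRangeGuardSketch {

    /** Hypothetical change event: owning table plus row values keyed by column name. */
    static final class ChangeRecord {
        final String tableId;
        final Map<String, Integer> row;

        ChangeRecord(String tableId, Map<String, Integer> row) {
            this.tableId = tableId;
            this.row = row;
        }
    }

    /** Hypothetical snapshot split: one table and a half-open PK range [start, end). */
    static final class SnapshotSplit {
        final String tableId;
        final int start;
        final int end;

        SnapshotSplit(String tableId, int start, int end) {
            this.tableId = tableId;
            this.start = start;
            this.end = end;
        }
    }

    /** Simplified stand-in for the PK range comparison; performs no table check. */
    static boolean isRecordBetween(ChangeRecord rec, SnapshotSplit split,
                                   Map<String, List<String>> loadedPkColumns) {
        // Null when the schema of the record's table has not been loaded yet.
        List<String> pkColumns = loadedPkColumns.get(rec.tableId);
        // For an unloaded (foreign) table this line throws NullPointerException.
        int key = rec.row.get(pkColumns.get(0));
        return key >= split.start && key < split.end;
    }

    /** Guarded version: reject foreign-table records before touching any PK. */
    static boolean isChangeRecordInChunkRange(ChangeRecord rec, SnapshotSplit split,
                                              Map<String, List<String>> loadedPkColumns) {
        if (!split.tableId.equals(rec.tableId)) {
            return false; // different table: never part of this chunk
        }
        return isRecordBetween(rec, split, loadedPkColumns);
    }

    public static void main(String[] args) {
        // Only the current table's schema is loaded, as in problem 1 above.
        Map<String, List<String>> schemas = Map.of("db.orders", List.of("id"));
        SnapshotSplit split = new SnapshotSplit("db.orders", 0, 100);
        ChangeRecord own = new ChangeRecord("db.orders", Map.of("id", 42));
        ChangeRecord foreign = new ChangeRecord("db.users", Map.of("uid", 42));

        System.out.println(isChangeRecordInChunkRange(own, split, schemas));
        System.out.println(isChangeRecordInChunkRange(foreign, split, schemas));
    }
}
```

In this sketch, calling `isRecordBetween` directly with the `db.users` record fails on the null schema lookup, while the guarded method returns `false` without reaching the comparison. The early return also covers problem 2: even with the foreign schema loaded, the record never gets compared against another table's bounds.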
@github-actions github-actions Bot requested a review from yiguolei as a code owner May 22, 2026 02:12
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Contributor

run buildall

@yiguolei yiguolei merged commit f4d6208 into branch-4.1 May 22, 2026
28 of 30 checks passed

3 participants