[client] SortMergeReader stops after deleted snapshot rows #3134

@luoyuxia

Describe the bug

SortMergeReader.SnapshotMergedRowIteratorWrapper#hasNext() can terminate iteration early when a snapshot row is deleted by the changelog.

When sortMergeWithChangeLog() returns SortMergeRows.EMPTY, hasNext() leaves currentMergedRows as null and immediately returns false. The caller then treats the snapshot iterator as exhausted, so all remaining snapshot rows are skipped.

To Reproduce

A minimal reproduction is:

  1. Build a snapshot iterator with ordered primary keys, for example 0..9.
  2. Build a changelog iterator that contains DELETE records for keys that exist in the snapshot, for example 2, 5, and 8.
  3. Create a SortMergeReader from the snapshot iterator and the changelog iterator, then call readBatch().
  4. Iterate through the returned reader.

Expected keys: 0,1,3,4,6,7,9

Actual behavior before the fix: iteration stops at the first deleted snapshot key (2 in this example, so only 0,1 are returned) and later snapshot rows are never returned.

Expected behavior

Deleted snapshot rows should be skipped, but iteration should continue until the snapshot iterator is actually exhausted.

Root cause

The bug is in SnapshotMergedRowIteratorWrapper#hasNext() in fluss-client/src/main/java/org/apache/fluss/client/table/scanner/SortMergeReader.java.

The method only tries to fetch one SortMergeRows result. If that result is empty, it does not advance to the next snapshot row and returns false immediately.
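The failure mode can be illustrated with a simplified, self-contained model. This is only a sketch and not the actual Fluss code: BuggyMergeSketch, BuggyIterator, and merge() are hypothetical stand-ins, integer keys replace real rows, and an empty list plays the role of SortMergeRows.EMPTY.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical stand-in for the wrapper; it does not use the real Fluss classes. */
public class BuggyMergeSketch {

    static final class BuggyIterator implements Iterator<Integer> {
        private final Iterator<Integer> snapshotKeys;
        private final Set<Integer> deletedKeys;
        private Iterator<Integer> currentMergedRows; // null means nothing is buffered

        BuggyIterator(Iterator<Integer> snapshotKeys, Set<Integer> deletedKeys) {
            this.snapshotKeys = snapshotKeys;
            this.deletedKeys = deletedKeys;
        }

        // Stand-in for sortMergeWithChangeLog(): a deleted key merges to an empty result.
        private List<Integer> merge(Integer key) {
            if (deletedKeys.contains(key)) {
                return Collections.emptyList();
            }
            return Collections.singletonList(key);
        }

        @Override
        public boolean hasNext() {
            if (currentMergedRows != null && currentMergedRows.hasNext()) {
                return true;
            }
            currentMergedRows = null;
            // Only ONE snapshot row is merged per call.
            if (snapshotKeys.hasNext()) {
                List<Integer> merged = merge(snapshotKeys.next());
                if (!merged.isEmpty()) {
                    currentMergedRows = merged.iterator();
                }
            }
            // Bug: an empty merge result leaves currentMergedRows null, so this
            // reports exhaustion although snapshot rows may remain.
            return currentMergedRows != null;
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return currentMergedRows.next();
        }
    }

    /** Consumes the iterator the way a typical caller would: stop at the first false. */
    static List<Integer> run() {
        List<Integer> snapshot = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        Set<Integer> deleted = new HashSet<>(Arrays.asList(2, 5, 8));
        Iterator<Integer> it = new BuggyIterator(snapshot.iterator(), deleted);
        List<Integer> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [0, 1]
    }
}
```

In this model the caller stops as soon as key 2 merges to an empty result, which mirrors the early termination described above.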

Proposed fix

Change hasNext() to keep advancing until it either:

  • finds a non-empty merged result, so that next() can return a row; or
  • truly exhausts the snapshot iterator.

This means empty merge results produced by changelog deletes are skipped instead of terminating the whole scan.
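A minimal sketch of the looping approach, using a simplified model rather than the real Fluss types: FixedMergeSketch, SkippingIterator, and merge() are hypothetical names, integer keys stand in for rows, and an empty list stands in for SortMergeRows.EMPTY. The actual patch may be structured differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical stand-in for the wrapper; it does not use the real Fluss classes. */
public class FixedMergeSketch {

    static final class SkippingIterator implements Iterator<Integer> {
        private final Iterator<Integer> snapshotKeys;
        private final Set<Integer> deletedKeys;
        private Iterator<Integer> currentMergedRows; // null means nothing is buffered

        SkippingIterator(Iterator<Integer> snapshotKeys, Set<Integer> deletedKeys) {
            this.snapshotKeys = snapshotKeys;
            this.deletedKeys = deletedKeys;
        }

        // Stand-in for sortMergeWithChangeLog(): a deleted key merges to an empty result.
        private List<Integer> merge(Integer key) {
            if (deletedKeys.contains(key)) {
                return Collections.emptyList();
            }
            return Collections.singletonList(key);
        }

        @Override
        public boolean hasNext() {
            // Keep advancing until a non-empty merge result is buffered
            // or the snapshot iterator is truly exhausted.
            while (currentMergedRows == null || !currentMergedRows.hasNext()) {
                if (!snapshotKeys.hasNext()) {
                    currentMergedRows = null;
                    return false;
                }
                List<Integer> merged = merge(snapshotKeys.next());
                // An empty result is skipped instead of ending the scan.
                currentMergedRows = merged.isEmpty() ? null : merged.iterator();
            }
            return true;
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return currentMergedRows.next();
        }
    }

    /** Drains the iterator for the 0..9 snapshot with deletes for 2, 5, and 8. */
    static List<Integer> run() {
        List<Integer> snapshot = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        Set<Integer> deleted = new HashSet<>(Arrays.asList(2, 5, 8));
        Iterator<Integer> it = new SkippingIterator(snapshot.iterator(), deleted);
        List<Integer> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [0, 1, 3, 4, 6, 7, 9]
    }
}
```

The loop also keeps hasNext() idempotent: calling it repeatedly without next() does not consume additional snapshot rows, because the buffered result is reused until it is drained.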

Additional context

I prepared a companion fix and regression test in SortMergeReaderTest that covers the delete scenario.
