[flink] Fix batch fallback generating mixed split types for primary-key tables by matrixsparse · Pull Request #3296 · apache/fluss

matrixsparse · 2026-05-10T07:28:38Z

Summary

Follow-up fix for #3208.

For primary-key tables in batch mode, when no lake snapshot exists, the previous
fallback logic had two issues:

- Mixed split types: initPartitionedSplits() / initNonPartitionedSplits()
  may produce both HybridSnapshotLogSplit and LogSplit for the same table,
  a combination the Flink connector does not support.
- Missing stopping offset: getLogSplit() used the 3-parameter LogSplit
  constructor, which takes no stopping offset, so the reader never stopped
  in batch mode.

Changes

- Use initLogTablePartitionSplits() / getLogSplit() in the fallback path
  so that every bucket gets a LogSplit and no other split type is generated.
- Call stoppingOffsetsInitializer.getBucketOffsets() in getLogSplit() to
  provide the stopping offsets that bounded batch-mode reads need.
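
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the connector code: `FallbackSplitSketch`, `LogOnlySplit`, `planLogOnlySplits`, and the `EARLIEST` sentinel value are hypothetical stand-ins, and the per-bucket stopping offsets are passed in as a plain map rather than fetched through stoppingOffsetsInitializer.getBucketOffsets().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-ins for illustration; these are not the Fluss connector classes. */
final class FallbackSplitSketch {

    /** Assumed sentinel meaning "start from the earliest available log offset". */
    static final long EARLIEST = -2L;

    /** Simplified log-only split: one bucket, a starting offset, and a bounded end. */
    static final class LogOnlySplit {
        final int bucket;
        final long startingOffset;
        final long stoppingOffset;

        LogOnlySplit(int bucket, long startingOffset, long stoppingOffset) {
            this.bucket = bucket;
            this.startingOffset = startingOffset;
            this.stoppingOffset = stoppingOffset;
        }
    }

    /** Builds exactly one bounded log-only split per bucket, so split types are never mixed. */
    static List<LogOnlySplit> planLogOnlySplits(
            List<Integer> buckets, Map<Integer, Long> stoppingOffsets) {
        List<LogOnlySplit> splits = new ArrayList<>();
        for (int bucket : buckets) {
            Long stop = stoppingOffsets.get(bucket);
            if (stop == null) {
                // Without a stopping offset the split would be unbounded in batch mode.
                throw new IllegalStateException("No stopping offset for bucket " + bucket);
            }
            splits.add(new LogOnlySplit(bucket, EARLIEST, stop));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<LogOnlySplit> splits =
                planLogOnlySplits(List.of(0, 1), Map.of(0, 120L, 1, 95L));
        for (LogOnlySplit s : splits) {
            System.out.println("bucket=" + s.bucket + " stop=" + s.stoppingOffset);
        }
    }
}
```

Failing fast when a bucket has no stopping offset is a design choice of this sketch; the PR itself only states that the stopping offset is now passed to the 4-parameter LogSplit constructor.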

matrixsparse · 2026-05-10T07:52:03Z

Hi @luoyuxia, this is a follow-up fix for the mixed split types issue you mentioned in #3208.

Could you PTAL? Thanks! cc @fresh-borzoni

fresh-borzoni

@matrixsparse Ty, LGTM in general, one comment:

It matches the Spark logic, but this scenario is a bit dangerous: suppose we have a big table that was never tiered to the lake, then we decide to tier it and run a batch query through this fallback. Instead of using the kv_snapshot and replaying the log on top, we read from the earliest offset, which is potentially a lot of records. So it's not very efficient and is potentially OOM-prone.

Let's file an issue about this to address separately for Spark and Flink?
cc @luoyuxia WDYT about this plan?
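
To make the efficiency concern concrete, here is a back-of-the-envelope sketch comparing the two read strategies for one bucket. `FallbackCostSketch` and both methods are hypothetical, and the model is deliberately simplistic: it only counts records and assumes offsets advance by one per record.

```java
/** Hypothetical cost model for illustration only; not part of Fluss. */
final class FallbackCostSketch {

    /** Records read when the whole log is scanned from the earliest offset. */
    static long recordsFromEarliest(long earliestOffset, long stoppingOffset) {
        return Math.max(0L, stoppingOffset - earliestOffset);
    }

    /** Records read when a KV snapshot is loaded and only the log tail is replayed on top. */
    static long recordsFromSnapshot(
            long snapshotRows, long snapshotLogOffset, long stoppingOffset) {
        return snapshotRows + Math.max(0L, stoppingOffset - snapshotLogOffset);
    }
}
```

With made-up numbers, a bucket whose log spans offsets 0 to 10,000,000 costs 10,000,000 records from earliest, while a snapshot of 50,000 rows taken at offset 9,990,000 costs 50,000 + 10,000 = 60,000 records. The gap grows with the number of updates per key, which is the case the comment above warns about.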

luoyuxia

@matrixsparse Thanks for the quick fix. Left some comments. PTAL.

luoyuxia · 2026-05-12T06:25:01Z

+                                // Use log-only splits to avoid generating mixed split
+                                // types (HybridSnapshotLogSplit + LogSplit) for
+                                // primary-key tables, which is not supported.
+                                splits = this.initLogTablePartitionSplits(partitions);


Note it'll just generate a log split without a stopping offset, which will then never stop.

matrixsparse · May 15, 2026

Good catch! Updated getLogSplit() to use stoppingOffsetsInitializer.getBucketOffsets() and pass the stopping offset to the 4-parameter LogSplit constructor. PTAL.
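
Why a missing stopping offset keeps a batch reader running can be shown with a small sketch. Everything here is hypothetical (`BoundedLogReadSketch`, the `NO_STOPPING_OFFSET` sentinel, and the assumption that the stopping offset is exclusive); the `maxPolls` cap exists only so the unbounded case terminates in the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader loop for illustration; not the Fluss split reader. */
final class BoundedLogReadSketch {

    /** Assumed sentinel meaning "this split has no stopping offset". */
    static final long NO_STOPPING_OFFSET = Long.MIN_VALUE;

    /** A split finishes only if it has a stopping offset and the reader has reached it. */
    static boolean isFinished(long nextOffset, long stoppingOffset) {
        return stoppingOffset != NO_STOPPING_OFFSET && nextOffset >= stoppingOffset;
    }

    /** Consumes offsets one at a time until the split finishes or the poll cap is hit. */
    static List<Long> readBounded(long startOffset, long stoppingOffset, int maxPolls) {
        List<Long> consumed = new ArrayList<>();
        long next = startOffset;
        for (int poll = 0; poll < maxPolls && !isFinished(next, stoppingOffset); poll++) {
            consumed.add(next);
            next++;
        }
        return consumed;
    }
}
```

Calling `readBounded(0, 3, 100)` consumes offsets 0, 1, and 2 and then stops by itself, whereas `readBounded(0, NO_STOPPING_OFFSET, 10)` stops only because the cap of 10 polls is reached.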

fresh-borzoni · 2026-05-16T01:47:49Z

@matrixsparse @luoyuxia filed followups:
#3327
#3317 - resolved

fresh-borzoni

LGTM 👍

matrixsparse force-pushed the feature/fix-batch-fallback-mixed-splits branch from 09b5acd to 7ac1f8c Compare May 10, 2026 07:45

fresh-borzoni reviewed May 11, 2026

luoyuxia reviewed May 12, 2026

[flink] Fix batch fallback generating mixed split types for primary-key tables

927825a

matrixsparse force-pushed the feature/fix-batch-fallback-mixed-splits branch from 7ac1f8c to 927825a Compare May 15, 2026 13:27

fresh-borzoni mentioned this pull request May 16, 2026

[flink] Lake-batch fallback should use KV snapshot instead of reading log from earliest #3327

Open

2 tasks

fresh-borzoni approved these changes May 16, 2026

matrixsparse wants to merge 1 commit into apache:main from matrixsparse:feature/fix-batch-fallback-mixed-splits
