
[Flink] Add option to avoid shuffle in bucket unaware append sink#7717

Merged
JingsongLi merged 1 commit into apache:master from nickdelnano:nickdelnano/bucket-unaware-append-no-shuffle
May 7, 2026

Contributor

@nickdelnano nickdelnano commented Apr 27, 2026

Purpose

Bucket-unaware append tables [1] are a great choice for streaming events into Paimon for batch consumers. These streams can be very high-throughput, for example clickstream data. Currently the writer introduces a shuffle (added in #4203), and in my production use cases (Kafka --> Paimon) this shuffles a lot of data.

rebalance was added in #4203 to avoid operator chaining. Instead, we can use startNewChain, which avoids the deadlock issue described in that PR without a shuffle.

[1] https://cwiki.apache.org/confluence/display/PAIMON/PIP-6%3A+Unaware-Bucket+Table
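
To make the cost difference concrete, here is a minimal plain-Java toy model of record routing. It is not Paimon or Flink code: the class `ShuffleSketch` and its methods are hypothetical names made up for illustration, and the round-robin rule is a simplification of how a rebalance-style partitioner spreads records. In Flink's DataStream API the two calls being compared are `DataStream#rebalance()` and `SingleOutputStreamOperator#startNewChain()`; as I understand it, the latter only breaks the operator chain and keeps the one-to-one (forward) connection between subtasks.

```java
/** Toy model of routing between source and writer subtasks (assumption: not real Flink code). */
public class ShuffleSketch {

    /** Round-robin target, the way a rebalance-style partitioner spreads records over writers. */
    public static int rebalanceTarget(int recordIndex, int parallelism) {
        return recordIndex % parallelism;
    }

    /** Forward target: the record stays on the subtask index that produced it. */
    public static int forwardTarget(int sourceSubtask) {
        return sourceSubtask;
    }

    /** Counts records whose writer subtask differs from their source subtask. */
    public static int countCrossSubtask(int parallelism, int recordsPerSubtask, boolean rebalance) {
        int crossed = 0;
        for (int source = 0; source < parallelism; source++) {
            for (int i = 0; i < recordsPerSubtask; i++) {
                int target = rebalance
                        ? rebalanceTarget(i, parallelism)
                        : forwardTarget(source);
                if (target != source) {
                    crossed++;
                }
            }
        }
        return crossed;
    }

    public static void main(String[] args) {
        // 4 subtasks, 8 records each: compare how many records change subtask.
        System.out.println("rebalance: " + countCrossSubtask(4, 8, true));
        System.out.println("forward:   " + countCrossSubtask(4, 8, false));
    }
}
```

In this model, with 4 subtasks and 8 records per subtask, the round-robin routing moves 24 of the 32 records to a different subtask, while the forward routing moves none. Real jobs differ in the details, but the proportion that gets redistributed grows with parallelism, which is why the shuffle is expensive for high-throughput streams.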

Tests

FlinkCdcSyncTableSinkITCase.java has tests for schema evolution. However, #4203 mentions that the deadlock is hard to exercise in tests.

@nickdelnano force-pushed the nickdelnano/bucket-unaware-append-no-shuffle branch 7 times, most recently from ec17872 to 90c6196, on April 28, 2026 16:20
@nickdelnano force-pushed the nickdelnano/bucket-unaware-append-no-shuffle branch from 90c6196 to 8f91105 on April 28, 2026 17:37
@nickdelnano
Contributor Author

@JingsongLi I saw you authored #4203, which relates to this PR. Could you review it?

@nickdelnano nickdelnano marked this pull request as ready for review April 29, 2026 15:37
@JingsongLi
Contributor

+1

@JingsongLi JingsongLi merged commit b0e238b into apache:master May 7, 2026
13 checks passed