@SteNicholas
Member

What is the purpose of the change

Shuffle data compression is currently disabled by default. Compression is important for blocking data shuffle, and enabling it by default improves usability, so taskmanager.network.blocking-shuffle.compression.enabled should default to true.

Brief change log

  • Updates the default value of taskmanager.network.blocking-shuffle.compression.enabled to true.
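
To illustrate the effect of the new default, here is a minimal sketch in plain Java. It is not Flink code: the class CompressionDefaultSketch and the method isCompressionEnabled are hypothetical stand-ins, and the configuration is modeled as a simple string map. Only the option key and its new default value come from this pull request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is not Flink's ConfigOption API.
class CompressionDefaultSketch {

    // Option key taken from the PR description.
    static final String KEY = "taskmanager.network.blocking-shuffle.compression.enabled";

    // Default value after this change (compression was disabled by default before).
    static final boolean DEFAULT_VALUE = true;

    // Returns the explicitly configured value, or the default when the key is absent.
    static boolean isCompressionEnabled(Map<String, String> conf) {
        String raw = conf.get(KEY);
        return raw == null ? DEFAULT_VALUE : Boolean.parseBoolean(raw);
    }

    public static void main(String[] args) {
        // A configuration that never mentions the key now gets compression.
        Map<String, String> untouched = new HashMap<>();

        // A configuration that opts out explicitly keeps compression disabled.
        Map<String, String> optedOut = new HashMap<>();
        optedOut.put(KEY, "false");

        System.out.println(isCompressionEnabled(untouched)); // prints "true"
        System.out.println(isCompressionEnabled(optedOut));  // prints "false"
    }
}
```

Under these assumptions, only jobs that do not set the key see a behavior change; an explicit false still disables compression.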

Verifying this change

  • The testSortMergeShuffleConfigOptionsCorrelation test in NettyShuffleEnvironmentConfigurationTest is extended to verify that the default value of taskmanager.network.blocking-shuffle.compression.enabled is true.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit e3269ea (Wed Nov 17 03:00:01 UTC 2021)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!
  • This pull request references an unassigned Jira ticket. According to the code contribution guide, tickets need to be assigned before starting with the implementation work.

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Nov 17, 2021

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@SteNicholas
Member Author

@wsry @zhuzhurk, could you please take a look at this pull request?

@wsry
Contributor

wsry commented Nov 26, 2021

Thanks for preparing this PR, I will take a look soon.

Contributor

@hililiwei hililiwei left a comment

left some comments. :-)

@SteNicholas
Member Author

SteNicholas commented Dec 24, 2021

@hililiwei, thanks for your detailed review. I have addressed the above comment. IMO, the boolean-type options in NettyShuffleEnvironmentOptions have no such definition, so I have not added it.
@wsry, could you please take a look at the changes?

@wsry
Contributor

wsry commented Dec 24, 2021

Hi, thanks a lot for the PR and review. I will take a look after finishing this discussion: https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p.

@wsry
Contributor

wsry commented Jan 13, 2022

@SteNicholas Thanks for your contribution. I have left a small comment.
BTW, could you please also update the corresponding documentation for blocking shuffle? (docs/content/docs/ops/batch/blocking_shuffle.md & docs/content.zh/docs/ops/batch/blocking_shuffle.md)

@wsry
Contributor

wsry commented Jan 13, 2022

The description for taskmanager.network.sort-shuffle.min-parallelism also needs to be updated.

@SteNicholas
Member Author

The description for taskmanager.network.sort-shuffle.min-parallelism also needs to be updated.

@wsry, I would like to update the description for taskmanager.network.sort-shuffle.min-parallelism.

@SteNicholas SteNicholas requested a review from wsry January 17, 2022 09:17
@wsry
Contributor

wsry commented Jan 17, 2022

@SteNicholas Thanks for the update, changes LGTM. I will help to merge it after the CI passes.

@SteNicholas SteNicholas requested a review from wsry January 17, 2022 15:10
@wsry wsry closed this in a14c554 Jan 18, 2022
niklassemmler pushed a commit to niklassemmler/flink that referenced this pull request Feb 3, 2022

5 participants