[FLINK-15031][runtime] Calculate required shuffle memory before allocating slots if resources are specified #16173
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
Automated Checks
Last check on commit 536c625 (Thu Sep 23 17:57:03 UTC 2021)
Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process.
Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
Thanks for the PR, @jinxing64. Would you like to rebase on the latest
9b3f70d to c036cb8
zhuzhurk
left a comment
Thanks for opening this PR @jinxing64
The change generally looks good to me. There are some minor comments though.
My major concern is that the network memory requirement may be doubled if floating buffers are included in the announced memory. One idea I can think of is to introduce a fraction-style config option that makes the extra network memory configurable. See the comment in FLINK-15031.
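To make the fraction idea concrete, here is a minimal sketch of one possible reading of it: the memory for exclusive buffers is always announced, and only a configurable fraction of the floating-buffer memory is added on top. The class name, method name, and parameters are hypothetical and do not correspond to any existing Flink API or config key.

```java
/** Hypothetical sketch of a fraction-style setting; not part of Flink. */
final class ExtraNetworkMemorySketch {

    /**
     * Returns the memory to announce: all exclusive-buffer bytes plus
     * the given fraction (0.0 to 1.0) of the floating-buffer bytes.
     */
    static long announcedBytes(long exclusiveBytes, long floatingBytes, double floatingFraction) {
        if (exclusiveBytes < 0 || floatingBytes < 0) {
            throw new IllegalArgumentException("byte counts must be non-negative");
        }
        if (floatingFraction < 0.0 || floatingFraction > 1.0) {
            throw new IllegalArgumentException("fraction must be within [0, 1]");
        }
        // Round up so that a non-zero fraction never announces less than it asks for.
        long extra = (long) Math.ceil(floatingBytes * floatingFraction);
        return exclusiveBytes + extra;
    }
}
```

With a fraction of 1.0 the floating buffers are fully included (the case raised as a concern above), and with 0.0 only the exclusive buffers are announced.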
flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/ShuffleMaster.java
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/PointwisePatternTest.java
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SSGNetworkMemoryCalculator.java
flink-runtime/src/test/java/org/apache/flink/runtime/shuffle/NettyShuffleUtilsTest.java
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/EdgeManagerBuildUtil.java
8862f89 to 7ee72c3
Thanks a lot for shepherding this, @zhuzhurk ~
...e/src/test/java/org/apache/flink/runtime/scheduler/SSGNetworkMemoryCalculationUtilsTest.java
...ntime/src/main/java/org/apache/flink/runtime/scheduler/SSGNetworkMemoryCalculationUtils.java
...-runtime/src/test/java/org/apache/flink/runtime/executiongraph/EdgeManagerBuildUtilTest.java
flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/NettyShuffleUtils.java
…e grained network buffer requirement
...-runtime/src/test/java/org/apache/flink/runtime/executiongraph/EdgeManagerBuildUtilTest.java
…in ResourceProfile of SSG
@zhuzhurk Thanks for the thorough review ~ I verified the algorithm for calculating the network memory requirement with a real Flink job. The result is in line with expectations.
zhuzhurk
left a comment
Thanks for addressing all the comments! @jinxing64
The change looks good to me.
Will merge it once CI gives green.
Merging.
What is the purpose of the change
Calculating and announcing the amount of network memory required for shuffle is an important part of fine-grained resource management (FLIP-156). This PR proposes an implementation that helps users avoid network memory shortages.
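As a rough illustration of what "calculating the required network memory" can mean, the sketch below sums buffers over input gates and result partitions and multiplies by the buffer size. It is a simplified model, not the code in this PR: the class and method names are hypothetical, and the per-partition formula (one buffer per subpartition plus one spare) is an assumption made for the example.

```java
/** Hypothetical, simplified model of a shuffle memory estimate; not Flink's implementation. */
final class ShuffleMemorySketch {

    /** Buffers for one input gate: exclusive buffers per channel plus a floating pool shared by the gate. */
    static long inputGateBuffers(int channels, int exclusivePerChannel, int floatingPerGate) {
        return (long) channels * exclusivePerChannel + floatingPerGate;
    }

    /** Buffers for one result partition: assumed to be one per subpartition plus one spare. */
    static long resultPartitionBuffers(int subpartitions) {
        return subpartitions + 1L;
    }

    /** Total bytes for a task with the given input gates and result partitions. */
    static long requiredBytes(
            int[] channelsPerGate,
            int[] subpartitionsPerPartition,
            int exclusivePerChannel,
            int floatingPerGate,
            int bufferSizeBytes) {
        long buffers = 0;
        for (int channels : channelsPerGate) {
            buffers += inputGateBuffers(channels, exclusivePerChannel, floatingPerGate);
        }
        for (int subpartitions : subpartitionsPerPartition) {
            buffers += resultPartitionBuffers(subpartitions);
        }
        return buffers * bufferSizeBytes;
    }
}
```

For example, a task with one input gate of 4 channels and one result partition of 3 subpartitions, using illustrative values of 2 exclusive buffers per channel, 8 floating buffers per gate, and 32 KiB buffers, needs (4 * 2 + 8) + (3 + 1) = 20 buffers under this model.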
Brief change log
Verifying this change
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes / no)
Documentation