[SPARK-41376][CORE] Correct the Netty preferDirectBufs check logic on executor start #38901
… spark.shuffle.io.preferDirectBufs
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
Can one of the admins verify this patch?
@srowen code and comment have been updated.
It's a bigger change; maybe someone with more understanding of the code can evaluate this.
+CC @Ngone51 (who last updated this) and @cloud-fan (who merged the commit).
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
val preferDirectBufsForSharedByteBufAllocators =
  env.conf.getBoolean("spark.network.io.preferDirectBufs", true)
val preferDirectBufs =
  env.conf.getBoolean("spark.shuffle.io.preferDirectBufs", true)
Did you see where this conf is accessed? I didn't find it used anywhere.
It's a combination of ("spark", module, "io.preferDirectBufs"). For the shuffle client the module is "shuffle"; see org.apache.spark.network.util.TransportConf.
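To make the key composition concrete, here is a minimal sketch of how a key such as spark.shuffle.io.preferDirectBufs could be assembled from the module name. The class `ConfKeySketch` and the method `confKey` are hypothetical names used only for illustration; they are not the actual TransportConf code.

```java
// Illustrative sketch only: ConfKeySketch and confKey are hypothetical names,
// not part of Spark. It shows the ("spark", module, suffix) composition
// described in the comment above.
final class ConfKeySketch {

    // Joins "spark", the module name, and the suffix with dots.
    static String confKey(String module, String suffix) {
        return "spark." + module + "." + suffix;
    }

    public static void main(String[] args) {
        // For the shuffle client the module is "shuffle".
        System.out.println(confKey("shuffle", "io.preferDirectBufs"));
    }
}
```

With module "shuffle" this yields spark.shuffle.io.preferDirectBufs, the key read in the snippet under review, which is why a plain text search for the full key string may find no direct usage.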
@@ -85,7 +85,19 @@ private[spark] class CoarseGrainedExecutorBackend(

    logInfo("Connecting to driver: " + driverUrl)
    try {
      if (PlatformDependent.directBufferPreferred() &&
      // The following logic originally comes from the constructor of
Shall we add this logic into a NettyUtils function?
I'm afraid not. NettyUtils is located in network-common, so it cannot access SparkConf, which is in core.
How about having NettyUtils accept just boolean inputs rather than the config?
Updated, @Ngone51 would you please take a look again?
+1, LGTM
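For readers following the thread, the idea of a helper that takes plain booleans instead of a SparkConf could look roughly like the sketch below. This is not the merged code. The class and method names are hypothetical, and the `sharedAllocators` flag that chooses between the two settings is an assumption, since the thread does not spell out how they are selected.

```java
// Illustrative sketch only, not the code merged in this PR.
// All names here are hypothetical; the helper takes plain booleans so that
// it needs no access to SparkConf.
final class PreferDirectBufsSketch {

    // sharedAllocators: ASSUMPTION, whether shared ByteBuf allocators are in use.
    // preferDirectBufsForShared: value read from spark.network.io.preferDirectBufs.
    // preferDirectBufs: value read from spark.shuffle.io.preferDirectBufs.
    // platformPrefersDirect: in a real caller this would be the result of
    //   PlatformDependent.directBufferPreferred().
    static boolean preferDirectBufs(
            boolean sharedAllocators,
            boolean preferDirectBufsForShared,
            boolean preferDirectBufs,
            boolean platformPrefersDirect) {
        // Pick the setting that applies to the allocator mode in use.
        boolean allowDirectBufs =
            sharedAllocators ? preferDirectBufsForShared : preferDirectBufs;
        // Direct buffers are used only if both the config and the platform agree.
        return allowDirectBufs && platformPrefersDirect;
    }

    public static void main(String[] args) {
        // Shared allocators on, shared setting disabled: direct buffers not preferred.
        System.out.println(preferDirectBufs(true, false, true, true));
        // Shared allocators off, shuffle setting enabled, platform agrees.
        System.out.println(preferDirectBufs(false, false, true, true));
    }
}
```

A caller in core would read the configuration values itself and pass them in, which keeps the helper free of any dependency on SparkConf.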
Seems OK. Can we, or should we, share this logic elsewhere? Can it be called at other sites, or is this the only place this check happens?
I think yes. The following logic looks similar but is not the same; I think the existing code is clear enough.
spark/common/network-common/src/main/java/org/apache/spark/network/server/TransportServer.java, lines 73 to 79 in 1d2159f
Lines 110 to 116 in 1d2159f
Merged to master.
Thank you all.
Thanks @dongjoon-hyun for checking. I think you are right; it should be ported to 3.2/3.3.
Could you make a backporting PR, @pan3793?
… executor start

### What changes were proposed in this pull request?
Fix the condition that determines whether Netty prefers direct memory on executor start; the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?
The check logic was added in SPARK-27991. The original intention was to avoid a potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient. This PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual testing.

Closes apache#38901 from pan3793/SPARK-41376.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Sean Owen <srowen@gmail.com>
@dongjoon-hyun I opened #38981 (for branch-3.3) and #38982 (for branch-3.2), please take a look.
…ic on executor start

### What changes were proposed in this pull request?
Backport #38901 to branch-3.3. Fix the condition that determines whether Netty prefers direct memory on executor start; the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?
The check logic was added in SPARK-27991. The original intention was to avoid a potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient. This PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual testing.

Closes #38981 from pan3793/SPARK-41376-3.3.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…ic on executor start

### What changes were proposed in this pull request?
Backport #38901 to branch-3.2. Fix the condition that determines whether Netty prefers direct memory on executor start; the logic should match `org.apache.spark.network.client.TransportClientFactory`.

### Why are the changes needed?
The check logic was added in SPARK-27991. The original intention was to avoid a potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient. This PR updates the logic to match `org.apache.spark.network.client.TransportClientFactory`.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual testing.

Closes #38982 from pan3793/SPARK-41376-3.2.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
Fix the condition that determines whether Netty prefers direct memory on executor start; the logic should match org.apache.spark.network.client.TransportClientFactory.
Why are the changes needed?
The check logic was added in SPARK-27991. The original intention was to avoid a potential Netty OOM issue when Netty uses direct memory to consume shuffle data, but the condition is not sufficient. This PR updates the logic to match org.apache.spark.network.client.TransportClientFactory.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manual testing.