New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-26632][Core] Separate Thread Configurations of Driver and Executor #23560
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very useful in RPC performance tuning, especially on large scale. Can you please fix a few style issue?
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
This pr looks very clear. |
@zjf2012 , can you please update the PR description with the performance test result? |
cc @cloud-fan @vanzin @jerryshao, can you please help review this? In our test, configuring RPC threads separately for driver and executor are very usefully for performance, especially for large scale. |
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
cc @zsxwing |
Can you please fix the conflict? |
ok to test. |
Maybe we should also update the docs. |
Test build #101625 has finished for PR 23560 at commit
|
…ds, clientThreads and dispatcher Threads
…verThreads, clientThreads and dispatcher Threads" This reverts commit dd54f80a09dfcb92f4959b557e738a94288bd982.
…ds, clientThreads and dispatcher Threads
d55d6cb
to
ecab1cc
Compare
Test build #101660 has finished for PR 23560 at commit
|
@jerryshao , I checked configuration.md as well as other related md files under the docs folder. I cannot find a proper place to put my documentation. Actually, the spark documentation even doesn't mention IO threads-related configurations. Would you please give some suggestion? thanks. |
OK, I see. |
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
Test build #101804 has finished for PR 23560 at commit
|
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
retest this please |
Test build #103464 has started for PR 23560 at commit |
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
Outdated
Show resolved
Hide resolved
@VaibhavDesai ?!?!? |
wont happen again! |
Test build #103489 has finished for PR 23560 at commit
|
Test build #103491 has finished for PR 23560 at commit
|
@vanzin, Is this ticket ok for you to merge? any more comments? thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid adding comments that just repeat what the code is doing.
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
Outdated
Show resolved
Hide resolved
Also it would be good to update |
@vanzin , I tried to give some documentation about this PR. Please help review. |
@carsonwang , please help review. |
Test build #105023 has finished for PR 23560 at commit
|
core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/org/apache/spark/network/netty/SparkTransportConfSuite.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/org/apache/spark/network/netty/SparkTransportConfSuite.scala
Outdated
Show resolved
Hide resolved
Test build #105138 has finished for PR 23560 at commit
|
|
||
private[netty] val transportConf = SparkTransportConf.fromSparkConf( | ||
conf.clone.set(RPC_IO_NUM_CONNECTIONS_PER_PEER, 1), | ||
"rpc", | ||
conf.get(RPC_IO_THREADS).getOrElse(numUsableCores)) | ||
conf.get(RPC_IO_THREADS).getOrElse(numUsableCores), | ||
role) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems not necessary to separate into another line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine (one arg per line).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jerryshao , The guide says,
For method declarations, use 4 space indentation for their parameters and put each in each line when the parameters don't fit in two lines.
I'll keep this code.
core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
Outdated
Show resolved
Hide resolved
|
||
private[netty] val transportConf = SparkTransportConf.fromSparkConf( | ||
conf.clone.set(RPC_IO_NUM_CONNECTIONS_PER_PEER, 1), | ||
"rpc", | ||
conf.get(RPC_IO_THREADS).getOrElse(numUsableCores)) | ||
conf.get(RPC_IO_THREADS).getOrElse(numUsableCores), | ||
role) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine (one arg per line).
core/src/test/scala/org/apache/spark/network/netty/SparkTransportConfSuite.scala
Outdated
Show resolved
Hide resolved
Test build #105180 has finished for PR 23560 at commit
|
retest this please |
Test build #105299 has finished for PR 23560 at commit
|
Merging to master. |
What changes were proposed in this pull request?
For the below three thread configuration items applied to both driver and executor,
spark.rpc.io.serverThreads
spark.rpc.io.clientThreads
spark.rpc.netty.dispatcher.numThreads,
we separate them to driver specifics and executor specifics.
spark.driver.rpc.io.serverThreads < - > spark.executor.rpc.io.serverThreads
spark.driver.rpc.io.clientThreads < - > spark.executor.rpc.io.clientThreads
spark.driver.rpc.netty.dispatcher.numThreads < - > spark.executor.rpc.netty.dispatcher.numThreads
Spark reads these specifics first and fall back to the common configurations.
How was this patch tested?
We ran the SimpleMap app without shuffle for benchmark purpose to test Spark's scalability in HPC with omini-path NIC which has higher bandwidth than normal ethernet NIC.
Spark's base version is 2.4.0.
Spark ran in the Standalone mode. Driver was in a standalone node.
After the separation, the performance is improved a lot in 256 nodes and 512 nodes. see below test results of SimpleMapTask before and after the enhancement. You can view the tables in the JIRA too.
ds: spark.driver.rpc.io.serverThreads
dc: spark.driver.rpc.io.clientThreads
dd: spark.driver.rpc.netty.dispatcher.numThreads
ed: spark.executor.rpc.netty.dispatcher.numThreads
time: Overall Time (s)
old time: Overall Time without Separation (s)
Before:
After: