
[SPARK-34828][YARN] Make shuffle service name configurable on client side and allow for classpath-based config override on server side #31936

Conversation

@xkrogen (Contributor) commented Mar 22, 2021

What changes were proposed in this pull request?

Add a new config, spark.shuffle.service.name, which allows Spark applications to look for a YARN shuffle service registered under a name other than the default spark_shuffle.
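For illustration, a minimal client-side sketch; the spark_shuffle_320 service name below is an arbitrary example value, not something defined by this PR:

```properties
# spark-defaults.conf (or equivalent --conf flags on spark-submit);
# the service name here is purely illustrative.
spark.shuffle.service.enabled  true
spark.shuffle.service.name     spark_shuffle_320
```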

Add a new config, spark.yarn.shuffle.service.metrics.namespace, which allows for configuring the namespace used when emitting metrics from the shuffle service into the NodeManager's metrics2 system.

Add a new mechanism for overriding shuffle service configurations independently of the configurations in the NodeManager. When a resource named spark-shuffle-site.xml is present on the classpath of the shuffle service, the configs it contains override the configs coming from yarn-site.xml (via the NodeManager).
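As a rough sketch of what such an overlay could look like (the particular configs and values are illustrative only; the file just needs to be a Hadoop-style configuration resource on the shuffle service's classpath):

```xml
<?xml version="1.0"?>
<!-- spark-shuffle-site.xml (illustrative): placed on the shuffle service
     classpath, these values override whatever comes from yarn-site.xml. -->
<configuration>
  <property>
    <name>spark.shuffle.service.port</name>
    <value>7337</value>
  </property>
  <property>
    <name>spark.yarn.shuffle.service.metrics.namespace</name>
    <value>sparkShuffleService</value>
  </property>
</configuration>
```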

Why are the changes needed?

There are two use cases which can benefit from these changes.

One use case is to run multiple instances of the shuffle service side-by-side in the same NodeManager. This can be helpful, for example, when running a YARN cluster with a mixed workload of applications running multiple Spark versions, since a given version of the shuffle service is not always compatible with other versions of Spark (e.g. see SPARK-27780). With this PR, it is possible to run two shuffle services like spark_shuffle and spark_shuffle_3.2.0, one of which is "legacy" and one of which is for new applications. This is possible because YARN versions since 2.9.0 support the ability to run shuffle services within an isolated classloader (see YARN-4577), meaning multiple Spark versions can coexist.
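As a sketch, the NodeManager side of such a setup could look roughly like the following. The property names follow the YARN-4577 isolated-classloader mechanism; the jar path is a placeholder, and spark_shuffle_320 stands in for a versioned name like spark_shuffle_3.2.0, using only letters, digits, and underscores since aux-service names are typically restricted to those characters:

```xml
<!-- yarn-site.xml (illustrative): a "legacy" spark_shuffle service on the
     NodeManager classpath, plus a second service loaded from its own jar
     via the YARN 2.9+ isolated classloader. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark_shuffle,spark_shuffle_320</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle_320.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle_320.classpath</name>
  <value>/path/to/spark-3.2.0-yarn-shuffle.jar</value>
</property>
```

Applications targeting the newer service would then set spark.shuffle.service.name to match (as in the client-side sketch above), while applications keeping the default continue to use spark_shuffle.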

Besides this, the separation of shuffle service configs into spark-shuffle-site.xml can be useful for administrators who want to change and/or deploy Spark shuffle service configurations independently of the configurations for the NodeManager (e.g., perhaps they are owned by two different teams).

Does this PR introduce any user-facing change?

Yes. There are two new configurations related to the external shuffle service, and a new mechanism which can optionally be used to configure the shuffle service. docs/running-on-yarn.md has been updated with user instructions; see that guide for more details.

How was this patch tested?

In addition to the new unit tests added, I have deployed this to a live YARN cluster and successfully deployed two Spark shuffle services simultaneously, one running a modified version of Spark 2.3.0 (which supports some of the newer shuffle protocols) and one running Spark 3.1.1. Spark applications of both versions are able to communicate with their respective shuffle services without issue.

@xkrogen (Contributor, Author) commented Mar 22, 2021

@SparkQA commented Mar 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40950/

@SparkQA commented Mar 23, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40950/

@SparkQA commented Mar 23, 2021

Test build #136366 has finished for PR 31936 at commit 37b9d14.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

<td><code>sparkShuffleService</code></td>
<td>
The namespace to use when emitting shuffle service metrics into Hadoop metrics2 system of the
NodeManager.
Member

Could you add some description about the limitation with old Hadoop versions (like 2.7.x)? Here, or in the section "Running multiple versions of the Spark Shuffle Service"?

Contributor Author

For this section, everything will work as expected on Hadoop 2.7.x. The "Running multiple versions" section won't work on 2.7, but I already called out the supported YARN versions there. Can you let me know if there's anything else you think I should call out?

Member

I'm worried about the situation where users try to use an Apache Spark distribution (built with Hadoop 2.7) on a YARN 2.9+ cluster. Does it work?

Contributor

It looks like the name referenced by the NodeManager works with the Hadoop 2.9+ custom classloader, but I assume with Hadoop 2.7 it requires the spark_shuffle name? Hence spark.shuffle.service.name won't work unless you have recompiled the code and manually changed it.
Perhaps we just need to be more explicit in the spark.shuffle.service.name config description, either referencing the section on running multiple versions of the Spark Shuffle Service or explicitly stating that it is supported on YARN 2.9+. I assume the metrics config doesn't matter as far as the Hadoop version is concerned.
Also, did we explicitly test with Hadoop 2.7 and the case @dongjoon-hyun brings up?

Contributor Author

> It looks like the name referenced by the NodeManager works with the Hadoop 2.9+ custom classloader, but I assume with Hadoop 2.7 it requires the spark_shuffle name? Hence spark.shuffle.service.name won't work unless you have recompiled the code and manually changed it.

No, this is not correct; the hard-coded name is ignored on all versions of YARN. Take a look at AuxServices on the 2.7.0 branch:
https://github.com/apache/hadoop/blob/f95b390df2ca7d599f0ad82cf6e8d980469e7abb/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java#L129-L136

spark.shuffle.service.name works fine on Hadoop 2.7; it is only the isolated classloader that won't work on older versions.
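To make the distinction concrete, a hypothetical NodeManager config that only changes the service name (no isolated classloader; the shuffle service jar stays on the NodeManager classpath), which is the part that should work on Hadoop 2.7. The my_spark_shuffle name is an arbitrary example:

```xml
<!-- yarn-site.xml (illustrative, no isolated classloader): the service is
     registered under a non-default name, which clients select via
     spark.shuffle.service.name. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>my_spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.my_spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```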

> I'm worried about the situation where users try to use an Apache Spark distribution (built with Hadoop 2.7) on a YARN 2.9+ cluster. Does it work?

I don't quite understand the concern here. Does my explanation above address your question? We haven't changed any of the interfaces used to interact with YARN, so there should be no binary compatibility issues or anything of that sort. I can test whichever combination of Spark version + Hadoop distribution version running on top of whichever YARN version you like, but I am failing to see where the concern is / what you'd like me to look for.

Contributor

Great, I'm glad it works with 2.7 as well; thanks for clarifying. Yeah, the concern was whether it would work in 2.7, so I think you answered that.

@dongjoon-hyun (Member)

Thank you for pinging me, @xkrogen.

@tgravescs (Contributor)

This was tried once before under #26000, so I'm just putting it here for reference. I think the main thing is that if we are going to support this, we make sure it's truly configurable and not just half done. It sounds like you are trying to address the concerns brought up there, but I need to do a more detailed review.

One thing from a quick skim: it looks like YarnShuffleService.java is still registering the AuxiliaryService with the name spark_shuffle. Is that not really used by the YARN auxiliary service, and it uses what you configure in the NodeManager instead? At the very least that needs to be documented in the code.

@xkrogen (Contributor, Author) commented Mar 23, 2021

Working on addressing test failures, will update the PR soon ...

@tgravescs thanks for the insightful comments!

> This was tried once before under #26000, so I'm just putting it here for reference. I think the main thing is that if we are going to support this, we make sure it's truly configurable and not just half done. It sounds like you are trying to address the concerns brought up there, but I need to do a more detailed review.

Thanks a lot for sharing this; I did not find it in my previous search. I agree that the concerns raised there around supporting different configurations should be addressed by this PR. Just making the shuffle service name configurable was the easy part :)

> One thing from a quick skim: it looks like YarnShuffleService.java is still registering the AuxiliaryService with the name spark_shuffle. Is that not really used by the YARN auxiliary service, and it uses what you configure in the NodeManager instead? At the very least that needs to be documented in the code.

Great callout. I mentioned this in the JIRA description but it looks like I forgot to include it in the PR description. The name in the NodeManager config (yarn.nodemanager.aux-services) is the source of truth for the name. The name provided by the service (hard-coded to spark_shuffle in our case) is compared to the configured one, and the NodeManager will log a warning if they don't match, since a mismatch indicates a possible misconfiguration, but the service-provided name is never used again. Actually, when using the isolated classloader, the hard-coded name is completely ignored, and the name of the class is used instead:

https://github.com/apache/hadoop/blob/a89ca56a1b0eb949f56e7c6c5c25fdf87914a02f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxiliaryServiceWithCustomClassLoader.java#L170-L172

I would use the value of spark.shuffle.service.name from the configuration instead of hard-coding it, just to keep things clean, but we need the name at the time of class instantiation, and we don't get a Configuration object until serviceInit is called, at which point it's too late. I will update the code comments accordingly.
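A simplified, hypothetical sketch (not Spark's actual implementation) of why the hard-coded name has to stay: the AuxiliaryService base class takes its name in the constructor, while the Configuration only shows up later in serviceInit.

```java
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;

// Hypothetical sketch, not the real YarnShuffleService.
public class SketchShuffleService extends AuxiliaryService {
  public SketchShuffleService() {
    // The aux-service name must be supplied at construction time, before any
    // Configuration is available, so it cannot come from
    // spark.shuffle.service.name.
    super("spark_shuffle");
  }

  @Override
  protected void serviceInit(Configuration conf) {
    // Only here do we see the NodeManager configuration (plus any
    // spark-shuffle-site.xml overlay) -- too late to change the name above.
  }

  @Override
  public void initializeApplication(ApplicationInitializationContext ctx) {
    // No-op in this sketch.
  }

  @Override
  public void stopApplication(ApplicationTerminationContext ctx) {
    // No-op in this sketch.
  }

  @Override
  public ByteBuffer getMetaData() {
    // Real implementations return a metadata/secret payload for executors.
    return ByteBuffer.allocate(0);
  }
}
```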

@xkrogen (Contributor, Author) commented Mar 23, 2021

Fixed the test in YarnShuffleServiceSuite. I can't reproduce the failures in YarnClusterSuite, and they seem potentially unrelated -- the specific tests that failed don't seem correlated with my changes -- so I'm going to let the tests re-run and see if it was just a fluke.

@SparkQA commented Mar 23, 2021

Test build #136417 has started for PR 31936 at commit 5e242f0.


@xkrogen (Contributor, Author) commented Mar 24, 2021

Updated documentation to make it a bit more clear which features need YARN >= 2.9.0 (only the isolated classloader / multiple shuffle service support, none of the other new additions).

@xkrogen force-pushed the xkrogen-SPARK-34828-shufflecompat-config-from-classpath branch from 5e242f0 to 0427a2f on March 24, 2021 17:15
@SparkQA commented Mar 24, 2021

Test build #136477 has started for PR 31936 at commit 0427a2f.

@xkrogen (Contributor, Author) commented Mar 24, 2021

The only test failure from the last run was in org.apache.spark.scheduler.DAGSchedulerSuite; I don't think it is related to my changes. That suite also succeeded on a previous run, and the only change since then was documentation.

@SparkQA commented Mar 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41061/

@SparkQA commented Mar 24, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41061/

@tgravescs (Contributor)

Overall looks good. Just to clarify, did you test this on both a Hadoop < 2.9 and a > 2.9 version? It sounds like you probably did, but double checking.

@xkrogen (Contributor, Author) commented Mar 25, 2021

I went through the logs from the most recently reported test failure and couldn't find any failing tests:

> grep -E -e "completed.+aborted" -e "Passed.+Failed" 6_Build\ modules\ sql\ -\ slow\ tests\ \(JDK\ 8,\ hadoop3.2,\ hive2.3\).txt
2021-03-25T15:45:53.5726513Z [info] Suites: completed 423, aborted 0
2021-03-25T15:45:53.5834071Z [info] Passed: Total 1062, Failed 0, Errors 0, Passed 1062, Ignored 1

(I had to download the logs since they were too big for the GitHub web viewer).

Not sure what's wrong, but this same suite passed on my previous build, with only documentation changes since then (the previous failure was in a different module).

> Just to clarify, did you test this on both a Hadoop < 2.9 and a > 2.9 version? It sounds like you probably did, but double checking.

@tgravescs -- I think I did test on a Hadoop 2.7 cluster previously, but just to be sure, I tried just now. Works great using a custom shuffle service name and conf overlay via spark-shuffle-site.xml. And I have tested on a Hadoop 2.10 cluster which is where I was doing the multiple-ESS testing.

@tgravescs (Contributor)

thanks, rekicking the build.

@xkrogen (Contributor, Author) commented Mar 26, 2021

Okay, so the tests are still passing just fine in the module that is being reported as failed:

> tail -n 50 6_Build\ modules\ sql\ -\ slow\ tests\ \(JDK\ 8,\ hadoop3.2,\ hive2.3\).txt | head
2021-03-26T17:28:34.3588275Z [info] ScalaTest
2021-03-26T17:28:34.3593568Z [info] Run completed in 38 minutes, 4 seconds.
2021-03-26T17:28:34.3596067Z [info] Total number of tests run: 1062
2021-03-26T17:28:34.3598550Z [info] Suites: completed 423, aborted 0
2021-03-26T17:28:34.3601246Z [info] Tests: succeeded 1062, failed 0, canceled 0, ignored 1, pending 0
2021-03-26T17:28:34.3603891Z [info] All tests passed.
2021-03-26T17:28:34.3660549Z [info] Passed: Total 1062, Failed 0, Errors 0, Passed 1062, Ignored 1
2021-03-26T17:28:45.5978270Z [warn] In the last 2325 seconds, 9.286 (0.4%) were spent in GC. [Heap: 2.41GB free of 3.00GB, max 3.56GB] Consider increasing the JVM heap using `-Xmx` or try a different collector, e.g. `-XX:+UseG1GC`, for better performance.
2021-03-26T17:28:45.6688003Z [success] Total time: 2326 s (38:46), completed Mar 26, 2021 5:28:45 PM
2021-03-26T17:28:46.6311617Z

But I did notice that there are compilation errors:

[error] /home/runner/work/spark/spark/mllib/target/java/org/apache/spark/mllib/util/MLlibTestSparkContext.java:10:1:  error: illegal combination of modifiers: public and protected
[error]   protected  class testImplicits {
[error]              ^
[error] /home/runner/work/spark/spark/sql/core/target/java/org/apache/spark/sql/UDFSuite.java:4:1:  error: modifier static not allowed here
[error]     static public  class MalformedNonPrimitiveFunction implements scala.Function1<java.lang.String, java.lang.Object>, scala.Serializable {
[error]                    ^
[error] /home/runner/work/spark/spark/sql/core/target/java/org/apache/spark/sql/test/SQLTestUtilsBase.java:19:1:  error: illegal combination of modifiers: public and protected
[error]   protected  class testImplicits {
[error]              ^

(there are about 100 of these)

But none of them are related to my changes. I'm pretty confused about why they're showing up here; do you have any insight?

@tgravescs (Contributor)

I'm not sure why they are failing, but I think it's these:

2021-03-26T17:53:41.9387410Z [info] *** 2 TESTS FAILED ***
2021-03-26T17:53:41.9700291Z [error] Failed: Total 8676, Failed 2, Errors 0, Passed 8674, Ignored 34
2021-03-26T17:53:42.0008332Z [error] Failed tests:
2021-03-26T17:53:42.0010101Z [error] 	org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite
2021-03-26T17:53:42.0011919Z [error] 	org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite

@xkrogen force-pushed the xkrogen-SPARK-34828-shufflecompat-config-from-classpath branch from 0427a2f to ef88052 on March 29, 2021 17:15
@xkrogen (Contributor, Author) commented Mar 29, 2021

Thanks for the pointer, @tgravescs! Looks like a new build was triggered (but not yet run), and I can't find a way to get the old logs, so I can't check in more detail why those tests failed.

I tried to reproduce those failures locally after a rebase and haven't been able to. I'm going to push again with the new rebase and see if picking up recent changes from master resolves things. If not I'll check the logs and dig in more detail.

@SparkQA commented Mar 29, 2021

Test build #136669 has finished for PR 31936 at commit ef88052.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xkrogen (Contributor, Author) commented Mar 29, 2021

The GitHub Actions checks have been queued all day with no progress -- so I can't confirm whether those are working -- but from the SparkPullRequestBuilder output, the TPCDS tests are succeeding:

| Class | Duration | Fail | (diff) | Skip | (diff) | Pass | (diff) | Total | (diff) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ... | | | | | | | | | |
| TPCDSV1_4_PlanStabilitySuite | 47 sec | 0 | | 0 | | 103 | | 103 | |
| TPCDSV1_4_PlanStabilityWithStatsSuite | 38 sec | 0 | | 0 | | 103 | | 103 | |

@SparkQA commented Mar 29, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41251/

@SparkQA commented Mar 29, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41251/

@shaneknapp (Contributor)

test this please

@SparkQA commented Mar 29, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41252/

@SparkQA commented Mar 30, 2021

Test build #136670 has finished for PR 31936 at commit ef88052.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xkrogen (Contributor, Author) commented Mar 30, 2021

Looks like it is clean now 🎉
@tgravescs any more concerns or are we good to go?

@asfgit closed this in 9f065ff on Mar 30, 2021
@tgravescs (Contributor)

merged to master, thanks @xkrogen

@xkrogen (Contributor, Author) commented Mar 30, 2021

Fantastic, many thanks @tgravescs!
Also thanks to @dongjoon-hyun for some early feedback!

@xkrogen deleted the xkrogen-SPARK-34828-shufflecompat-config-from-classpath branch on March 30, 2021 15:16
@dongjoon-hyun (Member)

Thank you, @xkrogen and @tgravescs!
