
[SPARK-34828][YARN] Make shuffle service name configurable on client side and allow for classpath-based config override on server side #31936

Conversation

@xkrogen (Contributor) commented Mar 22, 2021

What changes were proposed in this pull request?

Add a new config, spark.shuffle.service.name, which allows Spark applications to look for a YARN shuffle service registered under a name other than the default spark_shuffle.
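For illustration, a minimal client-side sketch; the spark_shuffle_320 service name below is an arbitrary example value, not something defined by this PR:

```properties
# spark-defaults.conf (or equivalent --conf flags on spark-submit);
# the service name here is purely illustrative.
spark.shuffle.service.enabled  true
spark.shuffle.service.name     spark_shuffle_320
```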

Add a new config, spark.yarn.shuffle.service.metrics.namespace, which allows for configuring the namespace used when emitting metrics from the shuffle service into the NodeManager's metrics2 system.

Add a new mechanism for overriding shuffle service configurations independently of the configurations in the NodeManager. When a resource named spark-shuffle-site.xml is present on the classpath of the shuffle service, the configs it contains override the configs coming from yarn-site.xml (via the NodeManager).
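As a rough sketch of what such an overlay could look like (the particular configs and values are illustrative only; the file just needs to be a Hadoop-style configuration resource on the shuffle service's classpath):

```xml
<?xml version="1.0"?>
<!-- spark-shuffle-site.xml (illustrative): placed on the shuffle service
     classpath, these values override whatever comes from yarn-site.xml. -->
<configuration>
  <property>
    <name>spark.shuffle.service.port</name>
    <value>7337</value>
  </property>
  <property>
    <name>spark.yarn.shuffle.service.metrics.namespace</name>
    <value>sparkShuffleService</value>
  </property>
</configuration>
```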

Why are the changes needed?

There are two use cases which can benefit from these changes.

One use case is to run multiple instances of the shuffle service side-by-side in the same NodeManager. This can be helpful, for example, when running a YARN cluster with a mixed workload of applications running multiple Spark versions, since a given version of the shuffle service is not always compatible with other versions of Spark (e.g. see SPARK-27780). With this PR, it is possible to run two shuffle services like spark_shuffle and spark_shuffle_3.2.0, one of which is "legacy" and one of which is for new applications. This is possible because YARN versions since 2.9.0 support the ability to run shuffle services within an isolated classloader (see YARN-4577), meaning multiple Spark versions can coexist.
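As a sketch, the NodeManager side of such a setup could look roughly like the following. The property names follow the YARN-4577 isolated-classloader mechanism; the jar path is a placeholder, and spark_shuffle_320 stands in for a versioned name like spark_shuffle_3.2.0, using only letters, digits, and underscores since aux-service names are typically restricted to those characters:

```xml
<!-- yarn-site.xml (illustrative): a "legacy" spark_shuffle service on the
     NodeManager classpath, plus a second service loaded from its own jar
     via the YARN 2.9+ isolated classloader. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark_shuffle,spark_shuffle_320</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle_320.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle_320.classpath</name>
  <value>/path/to/spark-3.2.0-yarn-shuffle.jar</value>
</property>
```

Applications targeting the newer service would then set spark.shuffle.service.name to match (as in the client-side sketch above), while applications keeping the default continue to use spark_shuffle.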

Besides this, the separation of shuffle service configs into spark-shuffle-site.xml can be useful for administrators who want to change and/or deploy Spark shuffle service configurations independently of the configurations for the NodeManager (e.g., perhaps they are owned by two different teams).

Does this PR introduce any user-facing change?

Yes. There are two new configurations related to the external shuffle service, and a new mechanism which can optionally be used to configure the shuffle service. docs/running-on-yarn.md has been updated with user instructions; see that guide for more details.

How was this patch tested?

In addition to the new unit tests added, I have deployed this to a live YARN cluster and successfully deployed two Spark shuffle services simultaneously, one running a modified version of Spark 2.3.0 (which supports some of the newer shuffle protocols) and one running Spark 3.1.1. Spark applications of both versions are able to communicate with their respective shuffle services without issue.

@xkrogen (Contributor, Author) commented Mar 22, 2021

@SparkQA commented Mar 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40950/

@SparkQA commented Mar 23, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40950/

@SparkQA commented Mar 23, 2021

Test build #136366 has finished for PR 31936 at commit 37b9d14.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

<td><code>sparkShuffleService</code></td>
<td>
The namespace to use when emitting shuffle service metrics into Hadoop metrics2 system of the
NodeManager.
Member

Could you add some description about the limitation with old Hadoop versions (like 2.7.x)? Here, or in the section "Running multiple versions of the Spark Shuffle Service"?

Contributor Author

For this section, everything will work as expected on Hadoop 2.7.x. The "Running multiple versions" section won't work on 2.7, but I already called out the supported YARN versions there. Can you let me know if there's anything else you think I should call out?

Member

I'm worried about the situation where users try to use an Apache Spark distribution (built with Hadoop 2.7) on a YARN 2.9+ cluster. Does it work?

Contributor

It looks like the name referenced by the NodeManager works with the Hadoop 2.9+ custom classloader, but I assume with Hadoop 2.7 it requires the spark_shuffle name? Hence spark.shuffle.service.name won't work unless you have recompiled the code and manually changed it.
Perhaps we just need to be more explicit in the spark.shuffle.service.name config description, either referencing the section on running multiple versions of the Spark Shuffle Service or explicitly stating that it is supported on YARN 2.9+. I assume the metrics config doesn't matter as far as the Hadoop version is concerned.
Also, did we explicitly test with Hadoop 2.7 and the case @dongjoon-hyun brings up?

Contributor Author

> It looks like the name referenced by the NodeManager works with the Hadoop 2.9+ custom classloader, but I assume with Hadoop 2.7 it requires the spark_shuffle name? Hence spark.shuffle.service.name won't work unless you have recompiled the code and manually changed it.

No, this is not correct; the hard-coded name is ignored on all versions of YARN. Take a look at AuxServices on the 2.7.0 branch:
https://github.com/apache/hadoop/blob/f95b390df2ca7d599f0ad82cf6e8d980469e7abb/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java#L129-L136

spark.shuffle.service.name works fine on Hadoop 2.7; it is only the isolated classloader that won't work on older versions.
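To make the distinction concrete, a hypothetical NodeManager config that only changes the service name (no isolated classloader; the shuffle service jar stays on the NodeManager classpath), which is the part that should work on Hadoop 2.7. The my_spark_shuffle name is an arbitrary example:

```xml
<!-- yarn-site.xml (illustrative, no isolated classloader): the service is
     registered under a non-default name, which clients select via
     spark.shuffle.service.name. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>my_spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.my_spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```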

> I'm worried about the situation where users try to use an Apache Spark distribution (built with Hadoop 2.7) on a YARN 2.9+ cluster. Does it work?

I don't quite understand the concern here. Does my explanation above address your question? We haven't changed any of the interfaces used to interact with YARN, so there should be no binary compatibility issues or anything of that sort. I can test whichever combination of Spark version + Hadoop distribution version running on top of whichever YARN version you like, but I am failing to see where the concern is / what you'd like me to look for.

Contributor

Great, I'm glad it works with 2.7 as well; thanks for clarifying. Yeah, the concern was whether it would work in 2.7, so I think you answered that.

@dongjoon-hyun (Member)

Thank you for pinging me, @xkrogen.

@tgravescs (Contributor)

This was tried once before under #26000, so I'm just putting it here for reference. I think the main thing is that if we are going to support this, we make sure it's truly configurable and not just half done. It sounds like you are trying to address the concerns brought up there, but I need to do a more detailed review.

One thing from a quick skim: it looks like YarnShuffleService.java is still registering the AuxiliaryService with the name spark_shuffle. Is that not really used by the YARN auxiliary service, and it uses what you configure in the NodeManager instead? At the very least that needs to be documented in the code.

@xkrogen (Contributor, Author) commented Mar 23, 2021

Working on addressing test failures, will update the PR soon ...

@tgravescs thanks for the insightful comments!

> This was tried once before under #26000, so I'm just putting it here for reference. I think the main thing is that if we are going to support this, we make sure it's truly configurable and not just half done. It sounds like you are trying to address the concerns brought up there, but I need to do a more detailed review.

Thanks a lot for sharing this; I did not find it in my previous search. I agree that the concerns raised there around supporting different configurations should be addressed by this PR. Just making the shuffle service name configurable was the easy part :)

> One thing from a quick skim: it looks like YarnShuffleService.java is still registering the AuxiliaryService with the name spark_shuffle. Is that not really used by the YARN auxiliary service, and it uses what you configure in the NodeManager instead? At the very least that needs to be documented in the code.

Great callout. I mentioned this in the JIRA description but it looks like I forgot to include it in the PR description. The name in the NodeManager config (yarn.nodemanager.aux-services) is the source of truth for the name. The name provided by the service (hard-coded to spark_shuffle in our case) is compared to the configured one, and the NodeManager will log a warning if they don't match, since a mismatch indicates a possible misconfiguration, but the service-provided name is never used again. Actually, when using the isolated classloader, the hard-coded name is completely ignored, and the name of the class is used instead:

https://github.com/apache/hadoop/blob/a89ca56a1b0eb949f56e7c6c5c25fdf87914a02f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxiliaryServiceWithCustomClassLoader.java#L170-L172

I would use the value of spark.shuffle.service.name from the configuration instead of hard-coding it, just to keep things clean, but we need the name at the time of class instantiation, and we don't get a Configuration object until serviceInit is called, at which point it's too late. I will update the code comments accordingly.
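A simplified, hypothetical sketch (not Spark's actual implementation) of why the hard-coded name has to stay: the AuxiliaryService base class takes its name in the constructor, while the Configuration only shows up later in serviceInit.

```java
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;

// Hypothetical sketch, not the real YarnShuffleService.
public class SketchShuffleService extends AuxiliaryService {
  public SketchShuffleService() {
    // The aux-service name must be supplied at construction time, before any
    // Configuration is available, so it cannot come from
    // spark.shuffle.service.name.
    super("spark_shuffle");
  }

  @Override
  protected void serviceInit(Configuration conf) {
    // Only here do we see the NodeManager configuration (plus any
    // spark-shuffle-site.xml overlay) -- too late to change the name above.
  }

  @Override
  public void initializeApplication(ApplicationInitializationContext ctx) {
    // No-op in this sketch.
  }

  @Override
  public void stopApplication(ApplicationTerminationContext ctx) {
    // No-op in this sketch.
  }

  @Override
  public ByteBuffer getMetaData() {
    // Real implementations return a metadata/secret payload for executors.
    return ByteBuffer.allocate(0);
  }
}
```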

@xkrogen (Contributor, Author) commented Mar 23, 2021

Fixed the test in YarnShuffleServiceSuite. I can't reproduce the failures in YarnClusterSuite, and they seem potentially unrelated -- the specific tests that failed don't seem correlated with my changes -- so I'm going to let the tests re-run and see if it was just a fluke.

@SparkQA commented Mar 23, 2021

Test build #136417 has started for PR 31936 at commit 5e242f0.


@xkrogen (Contributor, Author) commented Mar 24, 2021

Updated documentation to make it a bit more clear which features need YARN >= 2.9.0 (only the isolated classloader / multiple shuffle service support, none of the other new additions).

@xkrogen force-pushed the xkrogen-SPARK-34828-shufflecompat-config-from-classpath branch from 5e242f0 to 0427a2f on March 24, 2021 17:15
@SparkQA commented Mar 24, 2021

Test build #136477 has started for PR 31936 at commit 0427a2f.

@xkrogen (Contributor, Author) commented Mar 24, 2021

The only test failure from the last run was in org.apache.spark.scheduler.DAGSchedulerSuite; I don't think it is related to my changes. That suite also succeeded on a previous run, and the only change since then was documentation.

@SparkQA commented Mar 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41061/

@SparkQA commented Mar 24, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41061/

@tgravescs (Contributor)

Overall looks good. Just to clarify, did you test this on both a Hadoop < 2.9 and a > 2.9 version? It sounds like you probably did, but double checking.

@xkrogen (Contributor, Author) commented Mar 25, 2021

I went through the logs from the most recently reported test failure and couldn't find any failing tests:

> grep -E -e "completed.+aborted" -e "Passed.+Failed" 6_Build\ modules\ sql\ -\ slow\ tests\ \(JDK\ 8,\ hadoop3.2,\ hive2.3\).txt
2021-03-25T15:45:53.5726513Z [info] Suites: completed 423, aborted 0
2021-03-25T15:45:53.5834071Z [info] Passed: Total 1062, Failed 0, Errors 0, Passed 1062, Ignored 1

(I had to download the logs since they were too big for the GitHub web viewer).

Not sure what's wrong, but this same suite passed on my previous build, with only documentation changes since then (the previous failure was in a different module).

> Just to clarify, did you test this on both a Hadoop < 2.9 and a > 2.9 version? It sounds like you probably did, but double checking.

@tgravescs -- I think I did test on a Hadoop 2.7 cluster previously, but just to be sure, I tried just now. Works great using a custom shuffle service name and conf overlay via spark-shuffle-site.xml. And I have tested on a Hadoop 2.10 cluster which is where I was doing the multiple-ESS testing.

@tgravescs (Contributor)

thanks, rekicking the build.

@xkrogen (Contributor, Author) commented Mar 26, 2021

Okay, so the tests are still passing just fine in the module that is being reported as failed:

> tail -n 50 6_Build\ modules\ sql\ -\ slow\ tests\ \(JDK\ 8,\ hadoop3.2,\ hive2.3\).txt | head
2021-03-26T17:28:34.3588275Z [info] ScalaTest
2021-03-26T17:28:34.3593568Z [info] Run completed in 38 minutes, 4 seconds.
2021-03-26T17:28:34.3596067Z [info] Total number of tests run: 1062
2021-03-26T17:28:34.3598550Z [info] Suites: completed 423, aborted 0
2021-03-26T17:28:34.3601246Z [info] Tests: succeeded 1062, failed 0, canceled 0, ignored 1, pending 0
2021-03-26T17:28:34.3603891Z [info] All tests passed.
2021-03-26T17:28:34.3660549Z [info] Passed: Total 1062, Failed 0, Errors 0, Passed 1062, Ignored 1
2021-03-26T17:28:45.5978270Z [warn] In the last 2325 seconds, 9.286 (0.4%) were spent in GC. [Heap: 2.41GB free of 3.00GB, max 3.56GB] Consider increasing the JVM heap using `-Xmx` or try a different collector, e.g. `-XX:+UseG1GC`, for better performance.
2021-03-26T17:28:45.6688003Z [success] Total time: 2326 s (38:46), completed Mar 26, 2021 5:28:45 PM
2021-03-26T17:28:46.6311617Z

But I did notice that there are compilation errors:

[error] /home/runner/work/spark/spark/mllib/target/java/org/apache/spark/mllib/util/MLlibTestSparkContext.java:10:1:  error: illegal combination of modifiers: public and protected
[error]   protected  class testImplicits {
[error]              ^
[error] /home/runner/work/spark/spark/sql/core/target/java/org/apache/spark/sql/UDFSuite.java:4:1:  error: modifier static not allowed here
[error]     static public  class MalformedNonPrimitiveFunction implements scala.Function1<java.lang.String, java.lang.Object>, scala.Serializable {
[error]                    ^
[error] /home/runner/work/spark/spark/sql/core/target/java/org/apache/spark/sql/test/SQLTestUtilsBase.java:19:1:  error: illegal combination of modifiers: public and protected
[error]   protected  class testImplicits {
[error]              ^

(there are about 100 of these)

But none of them are related to my changes. I'm pretty confused about why they're showing up here; do you have any insight?

@tgravescs (Contributor)

I'm not sure why they are failing, but I think it's these:

2021-03-26T17:53:41.9387410Z [info] *** 2 TESTS FAILED ***
2021-03-26T17:53:41.9700291Z [error] Failed: Total 8676, Failed 2, Errors 0, Passed 8674, Ignored 34
2021-03-26T17:53:42.0008332Z [error] Failed tests:
2021-03-26T17:53:42.0010101Z [error] 	org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite
2021-03-26T17:53:42.0011919Z [error] 	org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite

@xkrogen force-pushed the xkrogen-SPARK-34828-shufflecompat-config-from-classpath branch from 0427a2f to ef88052 on March 29, 2021 17:15
@xkrogen (Contributor, Author) commented Mar 29, 2021

Thanks for the pointer, @tgravescs! Looks like a new build was triggered (but not yet run), and I can't find a way to get the old logs, so I can't check in more detail why those tests failed.

I tried to reproduce those failures locally after a rebase and haven't been able to. I'm going to push again with the new rebase and see if picking up recent changes from master resolves things. If not I'll check the logs and dig in more detail.

@SparkQA commented Mar 29, 2021

Test build #136669 has finished for PR 31936 at commit ef88052.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xkrogen (Contributor, Author) commented Mar 29, 2021

The GitHub Actions checks have been queued all day with no progress -- so I can't confirm whether those are working -- but from the SparkPullRequestBuilder output, the TPCDS tests are succeeding:

| Class | Duration | Fail | (diff) | Skip | (diff) | Pass | (diff) | Total | (diff) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ... | | | | | | | | | |
| TPCDSV1_4_PlanStabilitySuite | 47 sec | 0 | | 0 | | 103 | | 103 | |
| TPCDSV1_4_PlanStabilityWithStatsSuite | 38 sec | 0 | | 0 | | 103 | | 103 | |

@SparkQA commented Mar 29, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41251/

@SparkQA commented Mar 29, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41251/

@shaneknapp (Contributor)

test this please

@SparkQA commented Mar 29, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41252/

@SparkQA commented Mar 30, 2021

Test build #136670 has finished for PR 31936 at commit ef88052.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xkrogen (Contributor, Author) commented Mar 30, 2021

Looks like it is clean now 🎉
@tgravescs any more concerns or are we good to go?

@asfgit closed this in 9f065ff on Mar 30, 2021
@tgravescs (Contributor)

merged to master, thanks @xkrogen

@xkrogen (Contributor, Author) commented Mar 30, 2021

Fantastic, many thanks @tgravescs!
Also thanks to @dongjoon-hyun for some early feedback!

@xkrogen deleted the xkrogen-SPARK-34828-shufflecompat-config-from-classpath branch on March 30, 2021 15:16
@dongjoon-hyun (Member)

Thank you, @xkrogen and @tgravescs!
