[SPARK-34828][YARN] Make shuffle service name configurable on client side and allow for classpath-based config override on server side

### What changes were proposed in this pull request?

- Add a new config, `spark.shuffle.service.name`, which allows for Spark applications to look for a YARN shuffle service which is defined at a name other than the default `spark_shuffle`.
- Add a new config, `spark.yarn.shuffle.service.metrics.namespace`, which allows for configuring the namespace used when emitting metrics from the shuffle service into the NodeManager's `metrics2` system.
- Add a new mechanism by which to override shuffle service configurations independently of the configurations in the NodeManager. When a resource `spark-shuffle-site.xml` is present on the classpath of the shuffle service, the configs present within it will be used to override the configs coming from `yarn-site.xml` (via the NodeManager).

### Why are the changes needed?

There are two use cases which can benefit from these changes.

One use case is to run multiple instances of the shuffle service side-by-side in the same NodeManager. This can be helpful, for example, when running a YARN cluster with a mixed workload of applications running multiple Spark versions, since a given version of the shuffle service is not always compatible with other versions of Spark (e.g. see SPARK-27780). With this PR, it is possible to run two shuffle services like `spark_shuffle` and `spark_shuffle_3.2.0`, one of which is "legacy" and one of which is for new applications. This is possible because YARN versions since 2.9.0 support the ability to run shuffle services within an isolated classloader (see YARN-4577), meaning multiple Spark versions can coexist.

Besides this, the separation of shuffle service configs into `spark-shuffle-site.xml` can be useful for administrators who want to change and/or deploy Spark shuffle service configurations independently of the configurations for the NodeManager (e.g., perhaps they are owned by two different teams).

### Does this PR introduce _any_ user-facing change?

Yes. There are two new configurations related to the external shuffle service, and a new mechanism which can optionally be used to configure the shuffle service. `docs/running-on-yarn.md` has been updated to provide user instructions; please see this guide for more details.

### How was this patch tested?

In addition to the new unit tests added, I have deployed this to a live YARN cluster and successfully deployed two Spark shuffle services simultaneously, one running a modified version of Spark 2.3.0 (which supports some of the newer shuffle protocols) and one running Spark 3.1.1. Spark applications of both versions are able to communicate with their respective shuffle services without issue.

Closes #31936 from xkrogen/xkrogen-SPARK-34828-shufflecompat-config-from-classpath.

Authored-by: Erik Krogen <xkrogen@apache.org>
Signed-off-by: Thomas Graves <tgraves@apache.org>
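The override precedence described above (values from `spark-shuffle-site.xml` win over values arriving from `yarn-site.xml`) can be modeled in a few lines of plain Scala. This is only a sketch of the precedence rule using immutable maps; it is not the shuffle service's code and does not use Hadoop's `Configuration` class. The object name `ConfOverlaySketch`, the helper `effectiveConf`, and the sample keys and values are assumptions chosen for illustration.

```scala
object ConfOverlaySketch {

  /**
   * Hypothetical helper: merges two key/value views so that overlay entries
   * replace base entries with the same key, and all other base entries pass
   * through unchanged.
   */
  def effectiveConf(
      base: Map[String, String],
      overlay: Map[String, String]): Map[String, String] =
    base ++ overlay // for duplicate keys, the right-hand operand wins

  def main(args: Array[String]): Unit = {
    // Stand-in for what the NodeManager passes in from yarn-site.xml
    // (illustrative values only).
    val fromYarnSite = Map(
      "spark.authenticate" -> "true",
      "spark.shuffle.service.port" -> "7337")

    // Stand-in for the contents of a spark-shuffle-site.xml on the classpath.
    val fromShuffleSite = Map("spark.authenticate" -> "false")

    effectiveConf(fromYarnSite, fromShuffleSite).toSeq.sorted.foreach {
      case (key, value) => println(s"$key=$value")
    }
  }
}
```

In this model, a key set only in the base view is still visible to the shuffle service, so an administrator would only need to list the settings that differ from the NodeManager's.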
Showing 8 changed files with 240 additions and 12 deletions.
...arn/src/test/scala/org/apache/spark/deploy/yarn/YarnShuffleAlternateNameConfigSuite.scala
79 changes: 79 additions & 0 deletions

@@ -0,0 +1,79 @@

```scala
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.deploy.yarn

import java.net.URLClassLoader

import org.apache.hadoop.yarn.conf.YarnConfiguration

import org.apache.spark._
import org.apache.spark.internal.config._
import org.apache.spark.network.yarn.{YarnShuffleService, YarnTestAccessor}
import org.apache.spark.tags.ExtendedYarnTest

/**
 * SPARK-34828: Integration test for the external shuffle service with an alternate name and
 * configs (by using a configuration overlay)
 */
@ExtendedYarnTest
class YarnShuffleAlternateNameConfigSuite extends YarnShuffleIntegrationSuite {

  private[this] val shuffleServiceName = "custom_shuffle_service_name"

  override def newYarnConfig(): YarnConfiguration = {
    val yarnConfig = super.newYarnConfig()
    yarnConfig.set(YarnConfiguration.NM_AUX_SERVICES, shuffleServiceName)
    yarnConfig.set(YarnConfiguration.NM_AUX_SERVICE_FMT.format(shuffleServiceName),
      classOf[YarnShuffleService].getCanonicalName)
    val overlayConf = new YarnConfiguration()
    // Enable authentication in the base NodeManager conf but not in the client. This would break
    // shuffle, unless the shuffle service conf overlay overrides to turn off authentication.
    overlayConf.setBoolean(NETWORK_AUTH_ENABLED.key, true)
    // Add the authentication conf to a separate config object used as an overlay rather than
    // setting it directly. This is necessary because a config overlay will override previous
    // config overlays, but not configs which were set directly on the config object.
    yarnConfig.addResource(overlayConf)
    yarnConfig
  }

  override protected def extraSparkConf(): Map[String, String] =
    super.extraSparkConf() ++ Map(SHUFFLE_SERVICE_NAME.key -> shuffleServiceName)

  override def beforeAll(): Unit = {
    val configFileContent =
      s"""<?xml version="1.0" encoding="UTF-8"?>
         |<configuration>
         |  <property>
         |    <name>${NETWORK_AUTH_ENABLED.key}</name>
         |    <value>false</value>
         |  </property>
         |</configuration>
         |""".stripMargin
    val jarFile = TestUtils.createJarWithFiles(Map(
      YarnTestAccessor.getShuffleServiceConfOverlayResourceName -> configFileContent
    ))
    // Configure a custom classloader which includes the conf overlay as a resource
    val oldClassLoader = Thread.currentThread().getContextClassLoader
    Thread.currentThread().setContextClassLoader(new URLClassLoader(Array(jarFile)))
    try {
      super.beforeAll()
    } finally {
      Thread.currentThread().setContextClassLoader(oldClassLoader)
    }
  }
}
```
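The `beforeAll()` override in this suite swaps the thread's context classloader for the duration of the parent setup and restores it in a `finally` block. A minimal standalone sketch of that pattern, using only the JDK, might look like the following. The object name `ContextClassLoaderSketch` and the helper `withContextClassLoader` are hypothetical and not part of this commit, and a temporary directory stands in for the jar that `TestUtils.createJarWithFiles` builds. The resource name `spark-shuffle-site.xml` is taken from the commit message; that it matches the value returned by `YarnTestAccessor.getShuffleServiceConfOverlayResourceName` is an assumption.

```scala
import java.net.{URL, URLClassLoader}
import java.nio.charset.StandardCharsets
import java.nio.file.Files

object ContextClassLoaderSketch {

  /**
   * Hypothetical helper: runs `body` with `loader` installed as the current
   * thread's context classloader, then restores the previous loader even if
   * `body` throws.
   */
  def withContextClassLoader[T](loader: ClassLoader)(body: => T): T = {
    val thread = Thread.currentThread()
    val previous = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body finally thread.setContextClassLoader(previous)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the test jar: a temp directory holding a single resource.
    val dir = Files.createTempDirectory("overlay-sketch")
    Files.write(
      dir.resolve("spark-shuffle-site.xml"),
      "<configuration/>".getBytes(StandardCharsets.UTF_8))

    // Path.toUri ends with a slash for an existing directory, which is what
    // URLClassLoader needs in order to treat the URL as a directory.
    val loader = new URLClassLoader(Array[URL](dir.toUri.toURL))
    val before = Thread.currentThread().getContextClassLoader

    val visible = withContextClassLoader(loader) {
      Thread.currentThread().getContextClassLoader
        .getResource("spark-shuffle-site.xml") != null
    }

    println(s"resource visible inside block: $visible")
    println(s"loader restored: ${Thread.currentThread().getContextClassLoader eq before}")
  }
}
```

Wrapping the swap in a helper keeps the restore step next to the install step, so a failure inside the block cannot leave a test-only classloader attached to the thread.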