Conversation

@sryza
Contributor

@sryza sryza commented Mar 28, 2015

This is difficult to write a test for because it relies on the latest version of YARN, but I verified manually that the patch does pass along the label expression on this version and containers are successfully launched.

@SparkQA

SparkQA commented Mar 28, 2015

Test build #29349 has started for PR 5242 at commit 7e9c424.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 28, 2015

Test build #29349 has finished for PR 5242 at commit 7e9c424.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29349/
Test FAILed.

Member

Nit, import ordering here.

This looks like a pretty straightforward change, yeah. Too bad this requires reflection, but it's clean. I look forward to being able to assume later versions of YARN / Hadoop to undo some of this. (Although requiring 2.6 probably won't happen too soon.)

The key use case this enables is, for example, running a Spark app on a subset of a cluster's machines that are labeled as having a GPU. I like this given the relatively low impact simple change and the upside to what it enables.
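For readers following along, here is a minimal sketch of the reflection pattern mentioned above: look up a newer, label-aware constructor at runtime and fall back to the older one when it is missing. It assumes a hypothetical stand-in class (`StandInRequest`) instead of the real YARN types, and `ReflectiveRequestFactory` is an illustrative name, not code from this patch.

```java
import java.lang.reflect.Constructor;
import java.util.Optional;

final class ReflectiveRequestFactory {
    // Looks up a constructor by parameter types; empty if this version lacks it.
    static Optional<Constructor<?>> findConstructor(Class<?> type, Class<?>... params) {
        try {
            Constructor<?> found = type.getDeclaredConstructor(params);
            return Optional.of(found);
        } catch (NoSuchMethodException e) {
            return Optional.empty();
        }
    }

    // Uses the label-aware constructor when it exists and a label was given,
    // otherwise falls back to the constructor every version provides.
    static StandInRequest create(int memoryMb, String label) {
        Optional<Constructor<?>> labelAware =
            findConstructor(StandInRequest.class, int.class, String.class);
        if (label != null && labelAware.isPresent()) {
            try {
                return (StandInRequest) labelAware.get().newInstance(memoryMb, label);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Label-aware constructor failed", e);
            }
        }
        return new StandInRequest(memoryMb);
    }

    public static void main(String[] args) {
        StandInRequest request = create(1024, "gpu");
        System.out.println(request.memoryMb + " MB, label=" + request.labelExpression);
    }
}

// Hypothetical stand-in for a request type that gained a label-aware
// constructor in a newer library version; this is not the YARN class.
final class StandInRequest {
    final int memoryMb;
    final String labelExpression; // null when no label was supplied

    StandInRequest(int memoryMb) {
        this(memoryMb, null);
    }

    StandInRequest(int memoryMb, String labelExpression) {
        this.memoryMb = memoryMb;
        this.labelExpression = labelExpression;
    }
}
```

The point of the lookup is that the code still compiles and runs against a version that lacks the newer constructor, which is why the reflection can be dropped once a later YARN / Hadoop version can be assumed.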

@SparkQA

SparkQA commented Mar 30, 2015

Test build #29408 has started for PR 5242 at commit ce82383.

@SparkQA

SparkQA commented Mar 30, 2015

Test build #29408 has finished for PR 5242 at commit ce82383.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29408/
Test PASSed.

@tgravescs
Contributor

Mostly looks good. It seems like we may also want to add this to the ApplicationSubmissionContext to control where the AM can go. The question there is whether we want the AM to be able to go to a different label than the executors. If so, we need two configs.
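To make the "two configs" idea concrete, here is a rough sketch of reading separate executor and AM label settings from a plain map. The key names are assumptions for illustration (the thread does not spell them out), and `LabelConfig` is a hypothetical helper, not part of the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class LabelConfig {
    // Assumed, illustrative key names; this thread does not name the properties.
    static final String EXECUTOR_KEY = "spark.yarn.executor.nodeLabelExpression";
    static final String AM_KEY = "spark.yarn.am.nodeLabelExpression";

    // Returns the trimmed value for a key, or empty if it is unset or blank.
    static Optional<String> label(Map<String, String> conf, String key) {
        String raw = conf.get(key);
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(EXECUTOR_KEY, "gpu");
        System.out.println("executor label: " + label(conf, EXECUTOR_KEY).orElse("<none>"));
        System.out.println("AM label: " + label(conf, AM_KEY).orElse("<none>"));
    }
}
```

Keeping the two keys independent means an unset AM label simply leaves the AM unconstrained, which is the behaviour the question above is weighing.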

@tgravescs
Contributor

  public static ApplicationSubmissionContext newInstance(
      ApplicationId applicationId, String applicationName, String queue,
      Priority priority, ContainerLaunchContext amContainer,
      boolean isUnmanagedAM, boolean cancelTokensWhenComplete,
      int maxAppAttempts, Resource resource, String applicationType,
      boolean keepContainers, String appLabelExpression,
      String amContainerLabelExpression)

I think we could potentially use the appLabelExpression for the executors also without having to set it on each ResourceRequest.

Contributor

@sryza, I think the constructor parameters here are not correct; it should be either

  public static ApplicationSubmissionContext newInstance(
      ApplicationId applicationId, String applicationName, String queue,
      Priority priority, ContainerLaunchContext amContainer,
      boolean isUnmanagedAM, boolean cancelTokensWhenComplete,
      int maxAppAttempts, Resource resource, String applicationType,
      boolean keepContainers, String appLabelExpression,
      String amContainerLabelExpression)

Or

  public static ApplicationSubmissionContext newInstance(
      ApplicationId applicationId, String applicationName, String queue,
      ContainerLaunchContext amContainer, boolean isUnmanagedAM,
      boolean cancelTokensWhenComplete, int maxAppAttempts,
      String applicationType, boolean keepContainers,
      String appLabelExpression, ResourceRequest resourceRequest) 

in the 2.6.0 release or later.

Contributor

Looks correct to me. From the hadoop code:

public ContainerRequest(Resource capability, String[] nodes,
    String[] racks, Priority priority, boolean relaxLocality,
    String nodeLabelsExpression)

@tgravescs
Contributor

@sryza could you update to do application master as well?

@srowen
Member

srowen commented Apr 14, 2015

@sryza are you still looking at this one? It's a good feature. This guy needs a rebase, and Thomas has a decent suggestion here too.

@sryza
Contributor Author

sryza commented Apr 14, 2015

@srowen @tgravescs sorry for the delay here. I have been working on a new patch that adds support for AM label expressions, but ran into some YARN API issues that make it difficult. Will try to post something soon.

@sryza sryza force-pushed the sandy-spark-6470 branch from ce82383 to a892793 Compare April 14, 2015 16:29
@sryza
Contributor Author

sryza commented Apr 14, 2015

Updated patch still only supports executor node labels, but renames the property so that adding driver/am node labels in the future won't look weird. I also switched to the API that sets the label expression upon app submission.

@SparkQA

SparkQA commented Apr 14, 2015

Test build #30252 has started for PR 5242 at commit a892793.

@SparkQA

SparkQA commented Apr 14, 2015

Test build #30252 has finished for PR 5242 at commit a892793.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30252/
Test PASSed.

Contributor

I wonder if this should error out more catastrophically if the user is trying to use a feature that doesn't exist.
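The two options on the table here (warn and carry on without the label, or fail outright) can be sketched as follows. This is an illustration only: `UnsupportedLabelPolicy` and its `Mode` enum are hypothetical names, and warnings are collected in a list instead of going through a real logger so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class UnsupportedLabelPolicy {
    enum Mode { WARN_AND_IGNORE, FAIL_FAST }

    // Collected instead of logged so the sketch has no logging dependency.
    static final List<String> WARNINGS = new ArrayList<>();

    // Returns the label to pass along, or null when it has to be dropped.
    static String resolve(String requestedLabel, boolean apiSupportsLabels, Mode mode) {
        if (requestedLabel == null || apiSupportsLabels) {
            return requestedLabel;
        }
        String message = "Node label expression '" + requestedLabel
            + "' was requested, but this YARN version does not support it";
        if (mode == Mode.FAIL_FAST) {
            throw new IllegalStateException(message);
        }
        WARNINGS.add(message);
        return null;
    }

    public static void main(String[] args) {
        System.out.println("label used: " + resolve("gpu", false, Mode.WARN_AND_IGNORE));
        System.out.println("warnings: " + WARNINGS);
    }
}
```

Failing fast surfaces a misconfiguration immediately, while warning keeps the app running on any node, at the risk of the user not noticing that the label was ignored.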

@sryza sryza force-pushed the sandy-spark-6470 branch from a892793 to 4a1b837 Compare April 15, 2015 00:09
@sryza
Contributor Author

sryza commented Apr 15, 2015

It turns out that setting the label expression when creating the ApplicationSubmissionContext sets it for the AM as well, which is undesirable. Updated patch reverts to the original approach of setting the label expression on every container request.
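As a rough illustration of the per-request approach described above, the sketch below attaches the executor label to each executor request individually, so nothing is set at the application level. `Request` is a hypothetical stand-in, not YARN's `ContainerRequest`, and the method name is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class PerRequestLabels {
    // Hypothetical stand-in for a container request; not the YARN class.
    static final class Request {
        final int memoryMb;
        final String labelExpression;

        Request(int memoryMb, String labelExpression) {
            this.memoryMb = memoryMb;
            this.labelExpression = labelExpression;
        }
    }

    // Builds one request per missing executor, each carrying the executor label.
    static List<Request> executorRequests(int missing, int memoryMb, String executorLabel) {
        List<Request> requests = new ArrayList<>();
        for (int i = 0; i < missing; i++) {
            requests.add(new Request(memoryMb, executorLabel));
        }
        return requests;
    }

    public static void main(String[] args) {
        for (Request request : executorRequests(3, 2048, "gpu")) {
            System.out.println(request.memoryMb + " MB on label " + request.labelExpression);
        }
    }
}
```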

@SparkQA

SparkQA commented Apr 15, 2015

Test build #30284 has started for PR 5242 at commit 4a1b837.

@SparkQA

SparkQA commented Apr 15, 2015

Test build #30284 has finished for PR 5242 at commit 4a1b837.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.
  • This patch adds the following new dependencies:
    • commons-math3-3.1.1.jar
    • snappy-java-1.1.1.6.jar
  • This patch removes the following dependencies:
    • commons-math3-3.4.1.jar
    • snappy-java-1.1.1.7.jar

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30284/
Test PASSed.

@sryza sryza force-pushed the sandy-spark-6470 branch from 4a1b837 to e377ed6 Compare April 15, 2015 06:48
@SparkQA

SparkQA commented Apr 15, 2015

Test build #30323 has started for PR 5242 at commit e377ed6.

@SparkQA

SparkQA commented Apr 15, 2015

Test build #30323 has finished for PR 5242 at commit e377ed6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch adds the following new dependencies:
    • commons-math3-3.4.1.jar
    • snappy-java-1.1.1.7.jar
  • This patch removes the following dependencies:
    • commons-math3-3.1.1.jar
    • snappy-java-1.1.1.6.jar

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30323/
Test FAILed.

@sryza
Contributor Author

sryza commented Apr 15, 2015

retest this please

@sryza
Contributor Author

sryza commented Apr 15, 2015

@tgravescs the first will apply to the AM as well if there's not one for the AM specifically set:
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java#L380

@vanzin yes, I came across that as well. My concern was mainly about corner cases with how these get instantiated: do we need to check getAMContainerResourceRequest for null, or does it get lazily instantiated? Not a big deal at all, but my bandwidth is low right now, and I'm not convinced that we should add AM node label expressions until we hear of a use case for them, so I wanted to put that off for a separate JIRA.
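The fallback described in this comment can be modeled in a few lines: the AM-specific label wins when it is set, otherwise the app-level label applies to the AM too. This is a sketch of the behaviour as stated here, not the ResourceManager code; `Submission` is a hypothetical stand-in whose field names are borrowed from the `newInstance` signature quoted earlier in the thread.

```java
final class AmLabelFallback {
    // Hypothetical stand-in for the submission data; not the YARN API.
    static final class Submission {
        final String appLabelExpression;         // app-level label, may be null
        final String amContainerLabelExpression; // AM-specific label, may be null

        Submission(String appLabelExpression, String amContainerLabelExpression) {
            this.appLabelExpression = appLabelExpression;
            this.amContainerLabelExpression = amContainerLabelExpression;
        }
    }

    // AM-specific label if present, otherwise whatever the app-level label is.
    static String effectiveAmLabel(Submission submission) {
        if (submission.amContainerLabelExpression != null) {
            return submission.amContainerLabelExpression;
        }
        return submission.appLabelExpression;
    }

    public static void main(String[] args) {
        System.out.println(effectiveAmLabel(new Submission("gpu", null)));
        System.out.println(effectiveAmLabel(new Submission("gpu", "cpu")));
    }
}
```

Under this model, setting only the app-level label also constrains the AM, which is the side effect that led the patch back to per-request labels.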

@tgravescs
Contributor

@sryza so to clarify, this PR is only going to do this for executors? Is there a JIRA for doing it for the application master? Based on the above, is this ready for review?

@sryza
Contributor Author

sryza commented Apr 27, 2015

That's correct. Only for executor, but I updated the config naming after your earlier comment so that we have room to add it for the AM in the future. I just filed SPARK-7173 for this.

This is ready for review.

@SparkQA

SparkQA commented Apr 27, 2015

Test build #31037 has started for PR 5242 at commit e377ed6.

@tgravescs
Contributor

@sryza I think the only remaining comment is to change logInfo to logWarn, otherwise looks good.

@tgravescs
Contributor

Jenkins, test this please

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 11, 2015

Test build #32395 has started for PR 5242 at commit e377ed6.

@SparkQA

SparkQA commented May 11, 2015

Test build #32395 has finished for PR 5242 at commit e377ed6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32395/
Test PASSed.

@sryza sryza force-pushed the sandy-spark-6470 branch from e377ed6 to 6af87b9 Compare May 11, 2015 17:09
@sryza
Contributor Author

sryza commented May 11, 2015

Sorry, missed that. Just updated the patch.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@srowen
Member

srowen commented May 11, 2015

LGTM pending tests. You can tap it in when you're ready.

@SparkQA

SparkQA commented May 11, 2015

Test build #32408 has started for PR 5242 at commit 6af87b9.

@SparkQA

SparkQA commented May 11, 2015

Test build #32408 has finished for PR 5242 at commit 6af87b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32408/
Test PASSed.

@asfgit asfgit closed this in 82fee9d May 11, 2015
@sryza
Contributor Author

sryza commented May 11, 2015

Checked this in.

jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
This is difficult to write a test for because it relies on the latest version of YARN, but I verified manually that the patch does pass along the label expression on this version and containers are successfully launched.

Author: Sandy Ryza <sandy@cloudera.com>

Closes apache#5242 from sryza/sandy-spark-6470 and squashes the following commits:

6af87b9 [Sandy Ryza] Change info to warning
6e22d99 [Sandy Ryza] [YARN] SPARK-6470.  Add support for YARN node labels.
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015

7 participants