
[SPARK-6050] [yarn] Relax matching of vcore count in received containers. #4818

Closed
wants to merge 3 commits

Conversation

@vanzin (Contributor) commented Feb 27, 2015

Some YARN configurations return a vcore count for allocated
containers that does not match the requested resource. That means
Spark would always ignore those containers. So relax the matching
of the vcore count to allow the Spark jobs to run.

Some YARN configurations return a resource structure for allocated
containers that does not match the requested resource. That means
Spark would always ignore those containers. So add an option to relax
the matching of resources, so that users can still run Spark apps in
those situations.
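
To make the mismatch concrete, here is a minimal, hypothetical illustration (the numbers are made up and not taken from the patch): a scheduler that ignores CPU reports 1 vcore on the allocated container, so a strict equality comparison against the requested resource never succeeds.

import org.apache.hadoop.yarn.api.records.Resource

// Hypothetical values, only to show the failure mode described above.
object VcoreMismatchExample {
  def main(args: Array[String]): Unit = {
    val requested = Resource.newInstance(1408, 2) // executorMemory + memoryOverhead MB, --executor-cores 2
    val allocated = Resource.newInstance(1408, 1) // what e.g. CapacityScheduler + DefaultResourceCalculator reports
    // Resource equality compares both memory and vcores, so this prints "false" and a
    // container like this would never be matched against the outstanding request.
    println(requested == allocated)
  }
}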
@vanzin (Contributor, Author) commented Feb 27, 2015

@tgravescs @mridulm

Tested:

  • --executor-cores 1, no conf = passed
  • --executor-cores 2, no conf = cannot allocate resources, job waits forever
  • --executor-cores 2, new conf enabled = passed

I chose to leave the default value as "false" because it seems more correct to be strict when matching, but can change it if others feel that's more appropriate.

@mridulm (Contributor) commented Feb 27, 2015

This is specific to vcores and not mem, iirc.
A solution might be to check the vcores returned and, if the flag is set and the value is found to be 1, modify it to what we requested (we lose the actual mem allocated and other info if we replace it with 'resource' all the time, no?)
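
A minimal sketch of that suggestion (names hypothetical; `requested` stands for the resource Spark asked for):

import org.apache.hadoop.yarn.api.records.Resource

// Only touch the vcore field, and only when the scheduler reported a single vcore for a
// multi-core request; the memory actually allocated is kept as-is.
def normalizeVcores(allocated: Resource, requested: Resource, relaxMatching: Boolean): Resource = {
  if (relaxMatching && allocated.getVirtualCores == 1 && requested.getVirtualCores > 1) {
    Resource.newInstance(allocated.getMemory, requested.getVirtualCores)
  } else {
    allocated
  }
}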


@vanzin (Contributor, Author) commented Feb 27, 2015

We don't really lose anything, as far as I can tell. That information is only used to make sure that the allocated containers match those that were requested, not to do any other scheduling within Spark.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28087/

@vanzin (Contributor, Author) commented Feb 27, 2015

Jenkins, retest this please.

@SparkQA commented Feb 27, 2015

Test build #28090 has started for PR 4818 at commit 3359692.

  • This patch merges cleanly.

// the request; for example, capacity scheduler + DefaultResourceCalculator. Allow users in
// those situations to disable resource matching.
val matchingResource =
  if (sparkConf.getBoolean("spark.yarn.container.matchAnyResource", false)) {
Contributor

seems like the config is named backwards? If I want to match any then I want to use allocatedContainer.getResource.

Perhaps matchExactResource, or keep the name and switch what you get. I would expect to match any by default so it's backwards compatible for now.

Contributor

Also, as @mridulm mentioned, we are basically just matching what we got back, which disables all checks. Before, we were checking that memory was at least big enough.

private def isResourceConstraintSatisfied(container: Container): Boolean = {
  container.getResource.getMemory >= (executorMemory + memoryOverhead)
}

Contributor Author

Huh, no, if you want to match any you want to use the resource field, not the value returned by yarn (allocatedContainer.getResource). That means the comparison will effectively be resource == resource which will always be true.

I can change the config name if you think that would be clearer.

Contributor Author

re: memory, can yarn really allocate a container with less resources than you asked for?

The resource in this class is pretty much static during the Spark app's lifetime.

Contributor

Sorry I'm still not seeing it. resource is:

Resource.newInstance(executorMemory + memoryOverhead, executorCores)

which is going to be executorCores passed in, which would be for instance 8 if I request 8.

That resource is then passed to amClient.getMatchingRequests which is going to find requests that have 8 vcores, which isn't what we want because the RM without cpu scheduling returns ones with 1.

Contributor Author

Take a look at the latest version to see if it's any clearer.

But the gist is that amClient.getMatchingRequests() is matching the resources you asked for (which is resource) against the parameter you're passing. What the option controls is whether you're passing resource also as the resource to match against - so basically, the exact same structure that is already in the outstanding list of requests.

So I think you're reading the condition backwards. Or something.
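
A rough sketch of the behavior under discussion (reconstructed for illustration, not the exact diff; sparkConf, resource, allocatedContainer, amClient and location are the allocator's surrounding fields):

// With the flag on, the requested `resource` is passed back as the capability to look up, so
// getMatchingRequests trivially finds the outstanding request (effectively resource == resource).
// With the flag off, the container only matches if YARN reported exactly what was requested.
val matchingResource =
  if (sparkConf.getBoolean("spark.yarn.container.matchAnyResource", false)) {
    resource
  } else {
    allocatedContainer.getResource
  }
val matchingRequests = amClient.getMatchingRequests(
  allocatedContainer.getPriority, location, matchingResource)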

Contributor

Oh, sorry I see now. I was reading it backwards and thinking it was matching what was actually allocated.

So I guess the question is whether we want the default of this to be true so that it's backwards compatible. Otherwise the behavior changes for anyone running now who upgrades.

@SparkQA commented Feb 27, 2015

Test build #28095 has started for PR 4818 at commit 8c9c346.

  • This patch merges cleanly.

@SparkQA commented Feb 27, 2015

Test build #28090 has finished for PR 4818 at commit 3359692.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • logError("User class threw exception: " + cause.getMessage, cause)

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28090/

@mridulm (Contributor) commented Feb 27, 2015

Looks good to me - pending addressing Tom's comment about what the default should be.

@vanzin (Contributor, Author) commented Feb 27, 2015

No strong opinion from me on the default. Whatever people prefer.

@SparkQA commented Feb 27, 2015

Test build #28095 has finished for PR 4818 at commit 8c9c346.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28095/

@pwendell (Contributor)

It seems reasonable to me to have the default of "false" and make a comment in the release notes. No strong feelings here though. It depends a lot on how many users are hit by this issue.

// situations to disable matching of the core count.
val matchingResource =
  if (sparkConf.getBoolean("spark.yarn.container.disableCpuMatching", false)) {
    Resource.newInstance(allocatedContainer.getResource().getMemory(),
Contributor

Nit: take out parens for consistency.

@sryza (Contributor) commented Feb 28, 2015

My opinion is that we should make the default true, as the vanilla YARN default of FIFOScheduler will run into this issue (though most vendor distributions have a better default). There are no versions of YARN that will return containers smaller than were requested, except in this weird situation where the scheduler doesn't support CPU scheduling. I actually think it might be better to avoid a config at all and always just avoid matching on CPU. It's really hard to imagine any situation where it would actually benefit someone to set the config to false. The only one I can think of is debugging incorrect behavior in YARN, and, if we care about that, it would be better to just log something.

@mridulm (Contributor) commented Feb 28, 2015

@sryza When cpu scheduling is enabled (ref @tgravescs's comment here and in the jira), it must be validated, just as we validate memory, and while prioritizing based on the locality of the returned resource.

It is a current implementation detail of YARN that it tries to ensure the response contains a resource with adequate cpu and memory (or that cpu scheduling is disabled) - but this could easily change in the future.
Ideally, YARN should have returned the requested vCores in the response when cpu scheduling is disabled ... but that is a different issue.

@sryza (Contributor) commented Feb 28, 2015

I wouldn't really agree that this is a YARN implementation detail. This is of course somewhat subjective given that YARN doesn't really document this behavior, but, speaking with my YARN committer hat on, I'd consider a change that makes YARN return smaller containers than requested when CPU scheduling is on to be a break in compatibility.

Not strongly opposed to adding a config, but, if we do, I think it's better for the default to optimize for ease of use over the remote possibility that a future version of YARN might change behavior in this way.

Also, whatever behavior we decide on, it would be good (and should be straightforward) to add a test for it in YarnAllocatorSuite.

@mridulm (Contributor) commented Feb 28, 2015

AFAIK this is not documented or part of the YARN interfaces/public contract: I would prefer that Spark depend on defined interfaces which are reasonably stable.
As and when YARN stabilizes and documents its interfaces, we can modify our codebase if need be - but until then let us be defensive about relying on implementation details.

@tgravescs (Contributor)

I think having the default true would be better so that it's backwards compatible. As @sryza mentioned, YARN shouldn't really be giving you containers smaller than you requested anyway.

@pwendell (Contributor) commented Mar 2, 2015

@vanzin okay so maybe set this to true then? I don't have any opinion, but would love to get this in as it's one of the only release blockers.

@vanzin (Contributor, Author) commented Mar 2, 2015

I'll just remove the option then, since it doesn't seem very useful to have an option to enable more strict matching given the discussion.
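
For reference, a minimal sketch of what that amounts to (an approximation of the behavior, not the verbatim patch): always build the capability used for the lookup from the allocated container's memory and the originally requested vcore count, so the vcore value YARN reports can no longer prevent a match.

import org.apache.hadoop.yarn.api.records.{Container, Resource}

// Approximate sketch: ignore the reported vcores, match on memory plus the requested vcores.
def resourceForMatching(allocatedContainer: Container, requested: Resource): Resource = {
  Resource.newInstance(allocatedContainer.getResource.getMemory, requested.getVirtualCores)
}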

@SparkQA commented Mar 2, 2015

Test build #28177 has started for PR 4818 at commit 991c803.

  • This patch merges cleanly.

@vanzin changed the title from "[SPARK-6050] [yarn] Add config option to do lax resource matching" to "[SPARK-6050] [yarn] Relax matching of vcore count in received containers." Mar 2, 2015
@SparkQA commented Mar 2, 2015

Test build #28177 has finished for PR 4818 at commit 991c803.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28177/

@tgravescs (Contributor)

this looks good to me. +1.

asfgit pushed a commit that referenced this pull request Mar 2, 2015

[SPARK-6050] [yarn] Relax matching of vcore count in received containers.

Some YARN configurations return a vcore count for allocated
containers that does not match the requested resource. That means
Spark would always ignore those containers. So relax the matching
of the vcore count to allow the Spark jobs to run.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #4818 from vanzin/SPARK-6050 and squashes the following commits:

991c803 [Marcelo Vanzin] Remove config option, standardize on legacy behavior (no vcore matching).
8c9c346 [Marcelo Vanzin] Restrict lax matching to vcores only.
3359692 [Marcelo Vanzin] [SPARK-6050] [yarn] Add config option to do lax resource matching.

(cherry picked from commit 6b348d9)
Signed-off-by: Thomas Graves <tgraves@apache.org>
@asfgit closed this in 6b348d9 Mar 2, 2015
@vanzin deleted the SPARK-6050 branch March 6, 2015 00:42