
[SPARK-6050] [yarn] Relax matching of vcore count in received containers. #4818

Closed
wants to merge 3 commits

Conversation

@vanzin (Contributor) commented Feb 27, 2015

Some YARN configurations return a vcore count for allocated
containers that does not match the requested resource. That means
Spark would always ignore those containers. So relax the matching
of the vcore count to allow the Spark jobs to run.

Some YARN configurations return a resource structure for allocated
containers that does not match the requested resource. That means
Spark would always ignore those containers. So add an option to relax
the matching of resources, so that users can still run Spark apps in
those situations.
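
To make the mismatch concrete, here is a minimal, hypothetical illustration (the numbers are made up and not taken from the patch): a scheduler that ignores CPU reports 1 vcore on the allocated container, so a strict equality comparison against the requested resource never succeeds.

import org.apache.hadoop.yarn.api.records.Resource

// Hypothetical values, only to show the failure mode described above.
object VcoreMismatchExample {
  def main(args: Array[String]): Unit = {
    val requested = Resource.newInstance(1408, 2) // executorMemory + memoryOverhead MB, --executor-cores 2
    val allocated = Resource.newInstance(1408, 1) // what e.g. CapacityScheduler + DefaultResourceCalculator reports
    // Resource equality compares both memory and vcores, so this prints "false" and a
    // container like this would never be matched against the outstanding request.
    println(requested == allocated)
  }
}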
@vanzin (Contributor, Author) commented Feb 27, 2015

@tgravescs @mridulm

Tested:

  • --executor-cores 1, no conf = passed
  • --executor-cores 2, no conf = cannot allocate resources, job waits forever
  • --executor-cores 2, new conf enabled = passed

I chose to leave the default value as "false" because it seems more correct to be strict when matching, but can change it if others feel that's more appropriate.

@mridulm (Contributor) commented Feb 27, 2015

This is specific to vcores and not mem, iirc.
A solution might be to check the vcores returned and, if the flag is set and the value is found to be 1, modify it to what we requested (we lose the actual mem allocated and other info if we replace it with 'resource' all the time, no?)
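
A minimal sketch of that suggestion (names hypothetical; `requested` stands for the resource Spark asked for):

import org.apache.hadoop.yarn.api.records.Resource

// Only touch the vcore field, and only when the scheduler reported a single vcore for a
// multi-core request; the memory actually allocated is kept as-is.
def normalizeVcores(allocated: Resource, requested: Resource, relaxMatching: Boolean): Resource = {
  if (relaxMatching && allocated.getVirtualCores == 1 && requested.getVirtualCores > 1) {
    Resource.newInstance(allocated.getMemory, requested.getVirtualCores)
  } else {
    allocated
  }
}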


@vanzin (Contributor, Author) commented Feb 27, 2015

We don't really lose anything, as far as I can tell. That information is only used to make sure that the allocated containers match those that were requested, not to do any other scheduling within Spark.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28087/

@vanzin (Contributor, Author) commented Feb 27, 2015

Jenkins, retest this please.

@SparkQA commented Feb 27, 2015

Test build #28090 has started for PR 4818 at commit 3359692.

  • This patch merges cleanly.

// the request; for example, capacity scheduler + DefaultResourceCalculator. Allow users in
// those situations to disable resource matching.
val matchingResource =
  if (sparkConf.getBoolean("spark.yarn.container.matchAnyResource", false)) {
Contributor

seems like the config is named backwards? If I want to match any then I want to use allocatedContainer.getResource.

Perhaps matchExactResource, or keep the name and switch what you get. I would expect to match any by default so it's backwards compatible for now.

Contributor

Also, as @mridulm mentioned, we are basically just matching what we got back, which disables all checks. Before, we were checking that memory was at least big enough.

private def isResourceConstraintSatisfied(container: Container): Boolean = {
  container.getResource.getMemory >= (executorMemory + memoryOverhead)
}

Contributor Author

Huh, no, if you want to match any you want to use the resource field, not the value returned by yarn (allocatedContainer.getResource). That means the comparison will effectively be resource == resource which will always be true.

I can change the config name if you think that would be clearer.

Contributor Author

re: memory, can yarn really allocate a container with less resources than you asked for?

The resource in this class is pretty much static during the Spark app's lifetime.

Contributor

Sorry I'm still not seeing it. resource is:

Resource.newInstance(executorMemory + memoryOverhead, executorCores)

which is going to be executorCores passed in, which would be for instance 8 if I request 8.

That resource is then passed to amClient.getMatchingRequests which is going to find requests that have 8 vcores, which isn't what we want because the RM without cpu scheduling returns ones with 1.

Contributor Author

Take a look at the latest version to see if it's any clearer.

But the gist is that amClient.getMatchingRequests() is matching the resources you asked for (which is resource) against the parameter you're passing. What the option controls is whether you're passing resource also as the resource to match against - so basically, the exact same structure that is already in the outstanding list of requests.

So I think you're reading the condition backwards. Or something.
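
A rough sketch of the behavior under discussion (reconstructed for illustration, not the exact diff; sparkConf, resource, allocatedContainer, amClient and location are the allocator's surrounding fields):

// With the flag on, the requested `resource` is passed back as the capability to look up, so
// getMatchingRequests trivially finds the outstanding request (effectively resource == resource).
// With the flag off, the container only matches if YARN reported exactly what was requested.
val matchingResource =
  if (sparkConf.getBoolean("spark.yarn.container.matchAnyResource", false)) {
    resource
  } else {
    allocatedContainer.getResource
  }
val matchingRequests = amClient.getMatchingRequests(
  allocatedContainer.getPriority, location, matchingResource)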

Contributor

Oh, sorry I see now. I was reading it backwards and thinking it was matching what was actually allocated.

So I guess the question is whether we want the default of this to be true so that it's backwards compatible. Otherwise the behavior changes for anyone running now who upgrades.

@SparkQA commented Feb 27, 2015

Test build #28095 has started for PR 4818 at commit 8c9c346.

  • This patch merges cleanly.

@SparkQA commented Feb 27, 2015

Test build #28090 has finished for PR 4818 at commit 3359692.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • logError("User class threw exception: " + cause.getMessage, cause)

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28090/

@mridulm (Contributor) commented Feb 27, 2015

Looks good to me - pending addressing Tom's comment about what the default should be.

@vanzin (Contributor, Author) commented Feb 27, 2015

No strong opinion from me on the default. Whatever people prefer.

@SparkQA commented Feb 27, 2015

Test build #28095 has finished for PR 4818 at commit 8c9c346.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28095/

@pwendell (Contributor)

It seems reasonable to me to have the default of "false" and make a comment in the release notes. No strong feelings here though. It depends a lot on how many users are hit by this issue.

// situations to disable matching of the core count.
val matchingResource =
  if (sparkConf.getBoolean("spark.yarn.container.disableCpuMatching", false)) {
    Resource.newInstance(allocatedContainer.getResource().getMemory(),
Contributor

Nit: take out parens for consistency.

@sryza (Contributor) commented Feb 28, 2015

My opinion is that we should make the default true, as the vanilla YARN default of FIFOScheduler will run into this issue (though most vendor distributions have a better default). There are no versions of YARN that will return containers smaller than were requested, except in this weird situation where the scheduler doesn't support CPU scheduling. I actually think it might be better to avoid a config at all and always just avoid matching on CPU. It's really hard to imagine any situation where it would actually benefit someone to set the config to false. The only one I can think of is debugging incorrect behavior in YARN, and, if we care about that, it would be better to just log something.

@mridulm (Contributor) commented Feb 28, 2015

@sryza When cpu scheduling is enabled (ref @tgravescs's comment here and in the jira), it must be validated, just as we validate memory, and while prioritizing based on the locality of the returned resource.

It is a current implementation detail of YARN that it tries to ensure the response contains a resource with adequate cpu and memory (or that cpu scheduling is disabled) - but this could easily change in the future.
Ideally, YARN should have returned the requested vCores in the response when cpu scheduling is disabled ... but that is a different issue.

@sryza (Contributor) commented Feb 28, 2015

I wouldn't really agree that this is a YARN implementation detail. This is of course somewhat subjective given that YARN doesn't really document this behavior, but, speaking with my YARN committer hat on, I'd consider a change that makes YARN return smaller containers than requested when CPU scheduling is on to be a break in compatibility.

Not strongly opposed to adding a config, but, if we do, I think it's better for the default to optimize for ease of use over the remote possibility that a future version of YARN might change behavior in this way.

Also, whatever behavior we decide on, it would be good (and should be straightforward) to add a test for it in YarnAllocatorSuite.

@mridulm (Contributor) commented Feb 28, 2015

AFAIK this is not documented or part of the YARN interfaces/public contract: I would prefer that Spark depend on defined interfaces which are reasonably stable.
As and when YARN stabilizes and documents its interfaces, we can modify our codebase if need be - but until then let us be defensive about relying on implementation details.

@tgravescs (Contributor)

I think having the default true would be better so that it's backwards compatible. As @sryza mentioned, YARN shouldn't really be giving you containers smaller than you requested anyway.

@pwendell (Contributor) commented Mar 2, 2015

@vanzin okay so maybe set this to true then? I don't have any opinion, but would love to get this in as it's one of the only release blockers.

@vanzin (Contributor, Author) commented Mar 2, 2015

I'll just remove the option then, since it doesn't seem very useful to have an option to enable more strict matching given the discussion.
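
For reference, a minimal sketch of what that amounts to (an approximation of the behavior, not the verbatim patch): always build the capability used for the lookup from the allocated container's memory and the originally requested vcore count, so the vcore value YARN reports can no longer prevent a match.

import org.apache.hadoop.yarn.api.records.{Container, Resource}

// Approximate sketch: ignore the reported vcores, match on memory plus the requested vcores.
def resourceForMatching(allocatedContainer: Container, requested: Resource): Resource = {
  Resource.newInstance(allocatedContainer.getResource.getMemory, requested.getVirtualCores)
}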

@SparkQA commented Mar 2, 2015

Test build #28177 has started for PR 4818 at commit 991c803.

  • This patch merges cleanly.

@vanzin changed the title from "[SPARK-6050] [yarn] Add config option to do lax resource matching" to "[SPARK-6050] [yarn] Relax matching of vcore count in received containers." Mar 2, 2015
@SparkQA commented Mar 2, 2015

Test build #28177 has finished for PR 4818 at commit 991c803.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28177/

@tgravescs (Contributor)

this looks good to me. +1.

asfgit pushed a commit that referenced this pull request Mar 2, 2015

[SPARK-6050] [yarn] Relax matching of vcore count in received containers.

Some YARN configurations return a vcore count for allocated
containers that does not match the requested resource. That means
Spark would always ignore those containers. So relax the matching
of the vcore count to allow the Spark jobs to run.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #4818 from vanzin/SPARK-6050 and squashes the following commits:

991c803 [Marcelo Vanzin] Remove config option, standardize on legacy behavior (no vcore matching).
8c9c346 [Marcelo Vanzin] Restrict lax matching to vcores only.
3359692 [Marcelo Vanzin] [SPARK-6050] [yarn] Add config option to do lax resource matching.

(cherry picked from commit 6b348d9)
Signed-off-by: Thomas Graves <tgraves@apache.org>
@asfgit closed this in 6b348d9 Mar 2, 2015
@vanzin deleted the SPARK-6050 branch March 6, 2015 00:42