
[SPARK-13232][YARN] Fix executor node label #11129

Closed
wants to merge 1 commit

Conversation

AtkinsChang

Specifying a node label for executors causes Spark not to work on YARN, because YARN does not support container requests with both a label and locality (racks/nodes).

This adds a test to YarnAllocatorSuite to reproduce the situation and changes YarnAllocator#createContainerRequest to apply the node label only to locality-free requests.
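
Roughly, the change amounts to something like the sketch below. This is a sketch only: it assumes the six-argument ContainerRequest constructor available in Hadoop 2.6+ (the real code reaches it via reflection, as the reviewed hunk below shows), and newContainerRequest is an illustrative helper rather than the actual diff.

import org.apache.hadoop.yarn.api.records.{Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

// Illustrative helper: attach the node label expression only to
// locality-free requests, since YARN rejects requests that combine a
// label with explicit nodes/racks.
def newContainerRequest(
    resource: Resource,
    nodes: Array[String],
    racks: Array[String],
    priority: Priority,
    labelExpression: Option[String]): ContainerRequest = {
  val localityFree =
    (nodes == null || nodes.isEmpty) && (racks == null || racks.isEmpty)
  val label = if (localityFree) labelExpression.orNull else null
  new ContainerRequest(resource, nodes, racks, priority, true, label)
}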

@tgravescs
Contributor

Jenkins, test this please

@SparkQA

SparkQA commented Feb 16, 2016

Test build #51369 has finished for PR 11129 at commit 28f1491.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

YarnAllocator#createContainerRequest (the reviewed diff hunk):

  constructor.newInstance(resource, nodes, racks, RM_REQUEST_PRIORITY, true: java.lang.Boolean,
-   labelExpression.orNull)
+   labelExp)
Contributor

From my understanding, in your current implementation, if nodes or racks is not empty the label expression will not take effect even when it is explicitly set through configuration.

IMO I would choose to set nodes and racks to null if a label expression is configured; otherwise users will be confused about why the label expression they explicitly set has no effect.

Author

Yeah, I had this idea before, but IMO totally disregarding data locality may cause a lot of network overhead.

Contributor

Normally, if users set this label expression configuration, they obviously want label-based scheduling. But your implementation silently disables it, which will confuse users. Also, since label-based scheduling cannot work together with locality preferences on the YARN side (I think that is intentional), it would be better to ignore the locality information here.
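
Loosely, the alternative suggested here might look like the sketch below (again hypothetical, assuming the same Hadoop 2.6+ constructor; newContainerRequest is an illustrative helper, not real code): when a label expression is configured, the locality hints are what gets dropped, so the label is never silently ignored.

import org.apache.hadoop.yarn.api.records.{Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

// Illustrative alternative: a configured label expression always wins,
// and the nodes/racks hints are dropped instead.
def newContainerRequest(
    resource: Resource,
    nodes: Array[String],
    racks: Array[String],
    priority: Priority,
    labelExpression: Option[String]): ContainerRequest = labelExpression match {
  case Some(label) => new ContainerRequest(resource, null, null, priority, true, label)
  case None => new ContainerRequest(resource, nodes, racks, priority, true, null)
}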

@jerryshao
Contributor

Any further updates on it? CC @sryza about this.

@srowen
Member

srowen commented Mar 7, 2016

CC @sryza or maybe @steveloughran or @vanzin. I don't speak YARN.

@steveloughran
Contributor

Yarn and labels, joy.

  • Currently, a node can have exactly one label. That may change at a time in the future, a time called, provisionally "the patch that broke all the code trying to be clever" :)
  • you can ask for work anywhere (resourceName = ANY), in which case you also need to set relaxLocality=true; otherwise some validation rejects the request
  • you can do explicit nodes/racks requests, with relax=true or relax=false. relax=true means relax placement after a couple of heartbeats, 10-15s. I've seen bug reports implying the fair scheduler relaxes more aggressively; I haven't looked at it.
  • if you want labels, you don't get to ask for locations (see the sketch at the end of this comment)
  • if you want to mix labels with locality, you need to be clever and grab the nodemap, parse the labels, and make your own decisions. That's not easy given there's nothing in the AMRM API to query that nodemap; you need to do it on the client or in the AM by passing in the relevant client/AM delegation token, do it as a one-off, then subscribe to events. Then you need to scan the labelled nodes, find those that match, and use them as your placement subset.

Yes, I have done this. No, I would not recommend it. I only did it for anti-affinity placement, where we needed a guarantee that there'd be only one instance per node, labelled or not (i.e. we wanted containers to be away from each other, rather than near the data).

The request validation logic, I think, is what failed for a while (SLIDER-1051) until I turned off some of the checks.

There's one more corner case: the app doesn't ask for labels, but the queue is bound to a label. I actually don't know what happens to located requests here.
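
To make the accepted combinations above concrete, here is a sketch using the Hadoop 2.6+ AMRMClient.ContainerRequest constructor; the resource size, host, rack and label values are arbitrary examples.

import org.apache.hadoop.yarn.api.records.{Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

val resource = Resource.newInstance(2048, 2)   // 2 GB, 2 vcores (arbitrary)
val priority = Priority.newInstance(1)

// Accepted: locality-free ("anywhere") request carrying a label expression,
// with relaxLocality = true.
val labelled = new ContainerRequest(resource, null, null, priority, true, "gpu")

// Accepted: explicit nodes/racks and no label; relaxLocality may be true
// (relax after a few heartbeats) or false (hard placement).
val placed =
  new ContainerRequest(resource, Array("host1"), Array("/rack1"), priority, false, null)

// Rejected (a label combined with nodes/racks): by the client-side check in
// AMRMClient#addContainerRequest on pre-2.8 Hadoop, and by the RM-side
// validation quoted later in this thread.
// val invalid = new ContainerRequest(resource, Array("host1"), null, priority, true, "gpu")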

@srowen
Member

srowen commented May 6, 2016

@steveloughran do you think this is a good or not-good change overall?

@steveloughran
Contributor

It's actually being fixed right now in Hadoop 2.8, which will take a while to surface.

YARN-4925.

Looking at the patch, all Bibin is doing is cutting that validation check out from the container request submission...it's up to the scheduler how it handles things at that point. Presumably (hopefully) it does the right thing.

If the patch goes into Spark, then eventually there may be benefits in pulling it. If it goes in now, it stops things breaking today.

This is a tough call... nobody wants to add another switch for this, do they?

(now, maximally devious would be to catch the exception and downgrade, but that's both hard to test and inevitably brittle in some way)

@jerryshao
Contributor

now, maximally devious would be to catch the exception and downgrade

Maybe we could do this on the Spark side; it's a little complicated but doable.

Yes, it is hard to test label-related things on the Spark side, but at least we could manually verify it locally.
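
For what it's worth, the catch-and-downgrade idea might look roughly like the sketch below. It is entirely hypothetical: addRequestWithFallback is a made-up helper, and where the rejection actually surfaces differs by Hadoop version (the client-side check removed by YARN-4925 versus the RM-side validation quoted later in this thread), so the catch here is deliberately broad.

import scala.util.control.NonFatal
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

// Hypothetical downgrade path: try the labelled, locality-aware request
// first; if this YARN version rejects the label + locality combination,
// fall back to a locality-free request that keeps the label.
def addRequestWithFallback(
    amClient: AMRMClient[ContainerRequest],
    withLabelAndLocality: ContainerRequest,
    labelOnly: ContainerRequest): Unit = {
  try {
    amClient.addContainerRequest(withLabelAndLocality)
  } catch {
    // Assumption: the client-side check rejects the request here as a
    // runtime exception; an RM-side rejection would only show up later,
    // from the allocate() call, and is not handled by this sketch.
    case NonFatal(_) =>
      amClient.addContainerRequest(labelOnly)
  }
}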

@vanzin
Contributor

vanzin commented Dec 7, 2016

It's been a while since discussion died down here; I don't like the current version because it will break things with a fixed YARN. So the options are either to implement Steve's suggestion, or to do nothing and tell users not to use labels on broken versions of YARN. I'm kinda leaning towards the latter but don't really feel strongly.

@HyukjinKwon
Member

Hi @AtkinsChang, is this still active?

@movwei

movwei commented Mar 21, 2018

But in Hadoop 2.8.2 this problem still exists when requesting a container with both a node label and locality (rack/node). The ResourceManager also checks for it; the related code is as follows:

RMAppManager#validateAndCreateResourceRequest -> SchedulerUtils#validateResourceRequest

org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.java
private static void validateResourceRequest(ResourceRequest resReq,
      Resource maximumResource, QueueInfo queueInfo, RMContext rmContext)
      throws InvalidResourceRequestException {
   .......
    String labelExp = resReq.getNodeLabelExpression();
    // we don't allow specify label expression other than resourceName=ANY now
    if (!ResourceRequest.ANY.equals(resReq.getResourceName())
        && labelExp != null && !labelExp.trim().isEmpty()) {
      throw new InvalidLabelResourceRequestException(
          "Invalid resource request, queue=" + queueInfo.getQueueName()
              + " specified node label expression in a "
              + "resource request has resource name = "
              + resReq.getResourceName());
    }
    
    // we don't allow specify label expression with more than one node labels now
    if (labelExp != null && labelExp.contains("&&")) {
      throw new InvalidLabelResourceRequestException(
          "Invailid resource request, queue=" + queueInfo.getQueueName()
              + " specified more than one node label "
              + "in a node label expression, node label expression = "
              + labelExp);
    }
   ......
    }
  }

It will cause the Spark ApplicationMaster to fail with InvalidLabelResourceRequestException, so we still need this patch.
