
[SPARK-14957][Yarn] Adopt healthy dir to store executor meta #12735

Closed
suyanNone wants to merge 1 commit into apache:master from suyanNone:fix-0dir

Conversation

@suyanNone
Contributor

What changes were proposed in this pull request?

Adopt a healthy dir to store executor meta.

How was this patch tested?

Unit tests

@suyanNone suyanNone changed the title [SPARK-14957][Spark] Adopt healthy dir to store executor meta [SPARK-14957][Yarn] Adopt healthy dir to store executor meta Apr 27, 2016
@SparkQA

SparkQA commented Apr 27, 2016

Test build #57126 has finished for PR 12735 at commit 2527391.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Apr 27, 2016

I don't understand the problem or your change based on your JIRA. Are you trying to test for writeability? I don't think we want to arbitrarily require certain free space. This patch looks like it deletes directories though, which seems wrong.

@tgravescs
Contributor

I agree with @srowen. Can you please describe what you are trying to do in more detail? Why are you looking for multiple db files and deleting the older ones; when would there be more than one? Any sort of minimum disk space requirement would need to be configurable.

Note that in Hadoop 2.5 they added a getRecoveryPath() routine to the AuxiliaryService that returns a path to put the leveldb in. I don't know if we decided on what Hadoop version we support in 2.x; @srowen, do you remember if that was decided?

@srowen
Member

srowen commented Apr 27, 2016

Hadoop 2.2 is still the base build for 2.x. If you think this is a small additional good reason to up the requirement, let's highlight that.

@tgravescs
Contributor

Thanks, this is pretty minor and wouldn't warrant changing that.

@suyanNone
Contributor Author

suyanNone commented Apr 28, 2016

    registeredExecutorFile =
      findRegisteredExecutorFile(conf.getTrimmedStrings("yarn.nodemanager.local-dirs"));

We get the local dirs from the YARN config, and there are a lot of dirs, but currently we always adopt the first one, right?

First, assume no meta file exists yet: if the first disk has been removed, or has disk errors (read-only filesystem, input/output errors, no space left), ExternalShuffleBlockResolver will try to create a new leveldb file there, but it will fail and throw an IOException, and that IOException is swallowed by YarnShuffleService:

try {
      blockHandler = new ExternalShuffleBlockHandler(transportConf, registeredExecutorFile);
    } catch (Exception e) {
      logger.error("Failed to initialize external shuffle service", e);
    }

So maybe we can choose a healthier disk dir for storing the meta file, to avoid this unnecessary exception.
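In code, the idea would look roughly like this (a sketch only, not the actual diff in this PR; the helper name findHealthyExecutorFile is made up): scan the configured local dirs and place the leveldb on the first dir that exists (or can be created) and is writable, instead of always taking the first entry.

    import java.io.File;

    // Sketch only: return a location for the registered-executor leveldb on the first
    // local dir that exists (or can be created) and is writable; null if none is usable.
    static File findHealthyExecutorFile(String[] localDirs, String dbName) {
      for (String dir : localDirs) {
        File d = new File(dir);
        if ((d.isDirectory() || d.mkdirs()) && d.canWrite()) {
          return new File(d, dbName);
        }
      }
      return null;  // no healthy dir; caller falls back to the current behavior
    }

YarnShuffleService could then pass the returned file to ExternalShuffleBlockHandler as it does today, or keep the current behavior when null comes back.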

@suyanNone
Contributor Author

suyanNone commented Apr 28, 2016

As for why there can be multiple meta files: assume we find one, but its disk is a read-only filesystem. Should we still use it, or choose another healthy dir and create a new one?

If we choose the second option, we can't delete the old one (the filesystem is read-only), so two files will exist...

@suyanNone
Contributor Author

To be honest, I haven't walked through the whole YARN shuffle server process; I just fixed a problem our users reported, where they couldn't connect to the shuffle server because it tried to create the leveldb in a non-existent folder. I will take some time to reproduce the problem and understand this better.

@tgravescs I will look into the getRecoveryPath API...

@tgravescs
Contributor

If the disk is bad or missing there is nothing else you can do but create a new db, since, as you say, deleting wouldn't work.

Note I think all it does is log a message because we didn't want it to kill the entire nodemanager, but I think we should change that. We should throw an exception if registration doesn't work, because the way the nodemanager currently works, it doesn't fail until someone goes to fetch the data. If it failed quickly when the executor registered with it, that would be preferable, but that is a YARN bug.

If you are going to look at the getRecoveryPath API, I think we can do it without reflection by defining our own setRecoveryPath function in YarnShuffleService (leave @Override off so it compiles against older versions of Hadoop). Have it default to some invalid value; on Hadoop 2.5 or greater it will get called and set to a real value. Then in our code we check whether it is set and, if so, use it; if not, fall back to the current implementation. Note that setRecoveryPath is the only one we really need to define, since getRecoveryPath is protected, but to be safe we might also implement that. We can store our own path.
The only other thing here is that we may want to handle upgrading. If you are currently running the shuffle service, the ldb will be in the local dirs, but when you upgrade it will go to the new path and the service wouldn't find the old one. To handle this we could just look for it in the new path first and, if it isn't there, look in the old locations and, if found, move it to the new location.

I think between the throw and using the new API we shouldn't need to check the disks. The recovery path that is supposed to be set by administrators should be something very resilient, such that if it's down nothing on the node would work.
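Roughly, that approach could look like this (an illustrative sketch only, not committed code; the recoveryPath field and the initRegisteredExecutorsFile helper are made-up names, while findRegisteredExecutorFile is the existing lookup over the NM local dirs):

    import java.io.File;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;

    // Sketch only. No @Override: pre-2.5 Hadoop has no setRecoveryPath on AuxiliaryService,
    // so this must compile as a plain method; on 2.5+ the NodeManager calls it for us.
    private Path recoveryPath = null;

    public void setRecoveryPath(Path recoveryPath) {
      this.recoveryPath = recoveryPath;
    }

    private File initRegisteredExecutorsFile(Configuration conf) {
      String dbName = "registeredExecutors.ldb";
      String[] localDirs = conf.getTrimmedStrings("yarn.nodemanager.local-dirs");
      if (recoveryPath == null) {
        // Hadoop < 2.5: setRecoveryPath was never called, keep the current behavior.
        return findRegisteredExecutorFile(localDirs);
      }
      File newFile = new File(recoveryPath.toUri().getPath(), dbName);
      if (!newFile.exists()) {
        // Upgrade case: an older shuffle service kept the db in the NM local dirs,
        // so look there and move it into the recovery path if found.
        File oldFile = findRegisteredExecutorFile(localDirs);
        if (oldFile != null && oldFile.exists()) {
          oldFile.renameTo(newFile);
        }
      }
      return newFile;
    }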

@tgravescs
Contributor

Actually, after looking a bit more, Spark does fail fast if the shuffle service isn't there, because very soon after startup the BlockManager registers with the shuffle service, so if it didn't come up the executors should fail quickly. Is this what you were seeing?
This to me isn't so bad; at least it isn't wasting a bunch of work. Yes, new executors could get scheduled there, but they should fail very quickly without wasting work.

@jerryshao
Contributor

Yes, I agree with @tgravescs: setRecoveryPath and getRecoveryPath are a good way for 2.5+. But as mentioned above, the executor will fail fast if the external shuffle service doesn't exist, so leaving it as it is isn't a big problem.

@srowen
Member

srowen commented May 5, 2016

OK, sounds like we should close this PR.

@suyanNone
Contributor Author

@tgravescs, yes, the executor will fail fast...
I vaguely remember an application failure caused by an unhealthy shuffle server dir; I didn't have time to dig deeper at that time, and now I can't reproduce it.
So I will close this.

@suyanNone suyanNone closed this May 11, 2016
