Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-10901] [YARN] spark.yarn.user.classpath.first doesn't work #8959

Closed
wants to merge 5 commits into from

Conversation

tgravescs
Copy link
Contributor

This should go into 1.5.2 also.

The issue is we were no longer adding the app.jar to the system classpath.

@vanzin
Copy link
Contributor

vanzin commented Oct 1, 2015

Hi @tgravescs ,

This fixes the problem but I think it's actually just masking a subtle bug elsewhere. In getUserClasspath, there's this code:

val mainUri = mainJar.orElse(Some(APP_JAR)).map(new URI(_))

That is not actually doing the right thing in certain cases. When invoked from a Client instance, the mainJar argument comes from ClientArguments.userJar, so it's never going to be None (and thus always return the name of the original jar instead of APP_JAR).

The "cleanest" thing would be to have just a single version of getUserClasspath that gets things from SparkConf, but that runs into the problem that the conf has not yet been updated when populateClasspath is called.

I think changing that map call to something like the following would fix the source of the problem:

.map { path =>
  val uri = new URI(path)
  if (uri.getScheme == LOCAL_SCHEME) new URI(uri.getPath()) else uri
}

(Not tested.) Does that make sense? Could you try that out?

@tgravescs
Copy link
Contributor Author

where are you suggesting putting this .map so its clear? in getUserPath on mainUri? I'm not seeing how your map call fixes anything so I"m guessing I'm missing the context.

@vanzin
Copy link
Contributor

vanzin commented Oct 1, 2015

Sorry, my suggestion wasn't very clear and maybe not completely correct.

Here's the possible values for mainJar when that method is called:

  • Some("local:/path"): in Client instance when spark.yarn.user.classpath.first is true
  • Some("/path"): in Client instance when spark.yarn.user.classpath.first is true
  • None: in cluster when user jar is not local:
  • Some("local:/path"): in cluster when user jar is local:

The return value for that method for these cases should be URIs pointing to:

  • "/path"
  • APP_JAR
  • APP_JAR
  • "/path"

So I think the following code should handle that:

val mainUri = mainJar.flatMap { path =>
    val uri = new URI(path)
    if (uri.getScheme == LOCAL_SCHEME) Some(new URI(uri.getPath())) else None
  }.getOrElse(new URI(APP_JAR))

Hope that makes better sense now.

@SparkQA
Copy link

SparkQA commented Oct 1, 2015

Test build #43157 has finished for PR 8959 at commit 08f9382.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor Author

At first glance that sounds good, but on further looking it doesn't work because of the way addFileToClasspath is called. It passes them as the uri and not filename.

That is why originally I did this patch as just passing it in. I'll look to see if I can clean it up more.

@vanzin
Copy link
Contributor

vanzin commented Oct 2, 2015

Hmm, I think I see what you mean. Maybe there should be separate handling for app jar vs. other jars when spark.yarn.user.classpath.first is set.

Man this code is overdue for a cleanup / sanity check...

If it starts to get too nasty it might just be easier to add APP_JAR to the classpath like you're doing, and hope the bug doesn't affect anything else.

@SparkQA
Copy link

SparkQA commented Oct 5, 2015

Test build #43242 has finished for PR 8959 at commit b860226.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -1183,11 +1194,22 @@ object Client extends Logging {
private def getUserClasspath(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is not needed anymore, is it? Only the one that takes a SparkConf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I see the other one calls this; both could be merged into a single method, though.)

@vanzin
Copy link
Contributor

vanzin commented Oct 5, 2015

LGTM pending tests and a minor cleanup.

@tgravescs
Copy link
Contributor Author

sorry typo, running scalastyle manually and will post updated patch shortly.

@SparkQA
Copy link

SparkQA commented Oct 5, 2015

Test build #43245 has finished for PR 8959 at commit e627adc.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 5, 2015

Test build #43246 has finished for PR 8959 at commit ac3ffcf.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor Author

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Oct 6, 2015

Test build #43251 has finished for PR 8959 at commit ac3ffcf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Oct 6, 2015

Merging this, master and branch-1.5.

@asfgit asfgit closed this in e978360 Oct 6, 2015
asfgit pushed a commit that referenced this pull request Oct 6, 2015
This should go into 1.5.2 also.

The issue is we were no longer adding the __app__.jar to the system classpath.

Author: Thomas Graves <tgraves@staydecay.corp.gq1.yahoo.com>
Author: Tom Graves <tgraves@yahoo-inc.com>

Closes #8959 from tgravescs/SPARK-10901.

(cherry picked from commit e978360)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants