Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-21339] [CORE] spark-shell --packages option does not add jars to classpath on windows #18708

Closed
wants to merge 2 commits into from

Conversation

devaraj-kavali
Copy link

What changes were proposed in this pull request?

The --packages option jars are getting added to the classpath with the scheme as "file:///", in Unix it doesn't have problem with this since the scheme contains the Unix Path separator which separates the jar name with location in the classpath. In Windows, the jar file is not getting resolved from the classpath because of the scheme.

Windows : file:///C:/Users//.ivy2/jars/.jar
Unix : file:///home//.ivy2/jars/.jar

With this PR, we are avoiding the 'file://' scheme to get added to the packages jar files.

How was this patch tested?

I have verified manually in Windows and Unix environments, with the change it adds the jar to classpath like below,

Windows : C:\Users<user>.ivy2\jars<jar-name>.jar
Unix : /home//.ivy2/jars/.jar

@cloud-fan
Copy link
Contributor

ok to test

@jiangxb1987
Copy link
Contributor

cc @HyukjinKwon Could you look at this PR please?

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Jul 23, 2017

Thanks for cc'ing me @jiangxb1987.

@devaraj-kavali,

In Windows, the jar file is not getting resolved from the classpath because of the scheme.

Are you saying "file:///C:/Users//.ivy2/jars/.jar" is not the correct form of URI on Windows?

scala> new java.io.File("C:\\Temp\\text.txt").exists
res0: Boolean = true

scala> new java.io.File(new java.net.URI("file:///C:/Temp/text.txt")).exists
res1: Boolean = true

I think that is also a correct form of URI on Windows - https://blogs.msdn.microsoft.com/ie/2006/12/06/file-uris-in-windows/

I think the root cause sounds a different encoding resolution in Hadoop library or somewhere.

@HyukjinKwon
Copy link
Member

It's quite late in my local time so will take a closer look within coming few days but I hope @devaraj-kavali double checks this.

@SparkQA
Copy link

SparkQA commented Jul 23, 2017

Test build #79888 has finished for PR 18708 at commit 3242532.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@devaraj-kavali
Copy link
Author

Thanks @HyukjinKwon for checking this and for the link.

Are you saying "file:///C:/Users//.ivy2/jars/.jar" is not the correct form of URI on Windows?

This URI form is correct for java.io.File and also for ClassLoader. I see that that the jar is getting loaded with this URI form from the ClassLoader during my inspection.

SparkSubmit.scala

  private def addJarToClasspath(localJar: String, loader: MutableURLClassLoader) {
    val uri = Utils.resolveURI(localJar)
    uri.getScheme match {
      case "file" | "local" =>
        val file = new File(uri.getPath)
        if (file.exists()) {
          loader.addURL(file.toURI.toURL)
        } else {

But this URI form is not supported by the java -classpath. In this below code, the jars are having the file:/// scheme which is not supported and causing the issue.

Main.scala

  private[repl] def doMain(args: Array[String], _interp: SparkILoop): Unit = {
    interp = _interp
    val jars = Utils.getUserJars(conf, isShell = true).mkString(File.pathSeparator)
    val interpArguments = List(
      "-Yrepl-class-based",
      "-Yrepl-outdir", s"${outputDir.getAbsolutePath}",
      "-classpath", jars

And also we can see the difference with these java commands,

In Windows,

C:\>java -classpath file:///C:/work/test/a1.jar com.test.second.ClassZ
Error: Could not find or load main class com.test.second.ClassZ

C:\>java -classpath C:/work/test/a1.jar com.test.second.ClassZ
From main method

In Unix,

[devaraj@devaraj-work-pc ~]$ java -classpath /home/devaraj/work/install/test/a1.jar com.test.second.ClassZ
From main method

[devaraj@devaraj-work-pc ~]$ java -classpath file:///home/devaraj/work/install/test/a1.jar com.test.second.ClassZ
From main method

@vanzin
Copy link
Contributor

vanzin commented Jul 24, 2017

But this URI form is not supported by the java -classpath

Then shouldn't the fix be in the code that transforms the URI list into an argument for -classpath?

I'm pretty sure the code you're modifying in SparkSubmit will end up breaking other use cases.

@devaraj-kavali
Copy link
Author

Thanks @vanzin for checking this.

Then shouldn't the fix be in the code that transforms the URI list into an argument for -classpath?

I'm pretty sure the code you're modifying in SparkSubmit will end up breaking other use cases.

I thought this would be clean instead of adding the scheme file:/// for each jar and removing it before setting into the -classpath. I have verified the change in all deployment modes and I didn't see any issue. If you are sure that it breaks the other user cases, I will update the PR to remove the file:/// scheme from jars before setting into classpath.

@vanzin
Copy link
Contributor

vanzin commented Jul 24, 2017

If you are sure that it breaks the other user cases,

For one, --jars file:/tmp/jars/* will break with your change, since globs won't be resolved for file URLs.

@SparkQA
Copy link

SparkQA commented Jul 26, 2017

Test build #79954 has finished for PR 18708 at commit 025e484.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@devaraj-kavali
Copy link
Author

@vanzin I have updated the changes, can you check and validate the change? Thanks

@vanzin
Copy link
Contributor

vanzin commented Aug 1, 2017

Merging to master / 2.2. I'll fix a style nit while merging.

asfgit pushed a commit that referenced this pull request Aug 1, 2017
…o classpath on windows

The --packages option jars are getting added to the classpath with the scheme as "file:///", in Unix it doesn't have problem with this since the scheme contains the Unix Path separator which separates the jar name with location in the classpath. In Windows, the jar file is not getting resolved from the classpath because of the scheme.

Windows : file:///C:/Users/<user>/.ivy2/jars/<jar-name>.jar
Unix : file:///home/<user>/.ivy2/jars/<jar-name>.jar

With this PR, we are avoiding the 'file://' scheme to get added to the packages jar files.

I have verified manually in Windows and Unix environments, with the change it adds the jar to classpath like below,

Windows : C:\Users\<user>\.ivy2\jars\<jar-name>.jar
Unix : /home/<user>/.ivy2/jars/<jar-name>.jar

Author: Devaraj K <devaraj@apache.org>

Closes #18708 from devaraj-kavali/SPARK-21339.

(cherry picked from commit 58da1a2)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
@asfgit asfgit closed this in 58da1a2 Aug 1, 2017
@quantkeyvis
Copy link

quantkeyvis commented Dec 8, 2017

Hi, has this fix been implemented? (In any of the currently released versions)

MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
…o classpath on windows

The --packages option jars are getting added to the classpath with the scheme as "file:///", in Unix it doesn't have problem with this since the scheme contains the Unix Path separator which separates the jar name with location in the classpath. In Windows, the jar file is not getting resolved from the classpath because of the scheme.

Windows : file:///C:/Users/<user>/.ivy2/jars/<jar-name>.jar
Unix : file:///home/<user>/.ivy2/jars/<jar-name>.jar

With this PR, we are avoiding the 'file://' scheme to get added to the packages jar files.

I have verified manually in Windows and Unix environments, with the change it adds the jar to classpath like below,

Windows : C:\Users\<user>\.ivy2\jars\<jar-name>.jar
Unix : /home/<user>/.ivy2/jars/<jar-name>.jar

Author: Devaraj K <devaraj@apache.org>

Closes apache#18708 from devaraj-kavali/SPARK-21339.

(cherry picked from commit 58da1a2)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants