
[SPARK-6144] When in cluster mode using ADD JAR with a hdfs:// sourced ja... #4880

Closed

Conversation

trystanleftwich

...r will fail
While in cluster mode, if you use ADD JAR with an HDFS-sourced jar, it will fail trying to source that jar on the worker nodes with the following error:

@AmplabJenkins

Can one of the admins verify this patch?

@marmbrus
Contributor

marmbrus commented Mar 3, 2015

ok to test

@SparkQA

SparkQA commented Mar 3, 2015

Test build #28246 has started for PR 4880 at commit 5931cc9.

  • This patch merges cleanly.

-      fileOverwrite: Boolean): Unit = {
+      fileOverwrite: Boolean,
+      filename: String = ""): Unit = {
     if (!targetDir.mkdir()) {
Contributor

I'd use Option[String] = None. Then in L648 you can do val targetFile = new File(targetDir, filename.getOrElse(innerPath.getName)).
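A minimal sketch of that suggestion (the helper name and standalone signature are illustrative, not the actual Utils.scala code):

```scala
import java.io.File

// Illustrative only: mirrors the reviewer's Option[String] idea.
// Callers that don't need to rename the file simply omit the argument,
// and the source entry's own name is used as the default.
def resolveTarget(targetDir: File, innerName: String,
                  filename: Option[String] = None): File =
  new File(targetDir, filename.getOrElse(innerName))
```

With `None` the fetched file keeps its source name; passing `Some("renamed.jar")` overrides it, which avoids the sentinel empty string entirely.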

Contributor

+1 to adding some kind of type safety here

@vanzin
Contributor

vanzin commented Mar 3, 2015

LGTM aside from a minor style issue. I also think this should really go into 1.3...

@vanzin
Contributor

vanzin commented Mar 3, 2015

@pwendell adding to your radar.

new File(targetDir, filename)
} else {
new File(targetDir, innerPath.getName)
}
Contributor

Is this correct? If path refers to a directory with multiple files in it, then this will fetch all of those files using the same name, overwriting all but the last one fetched. IIUC we need to distinguish between path being a directory and it being a file at the beginning of this method:

// L641, before the listStatus logic
if (fs.isFile(path)) {
  val targetFile = new File(targetDir, filename.getOrElse(path.getName))
  val in = fs.open(path)
  downloadFile(path.toString, in, targetFile, fileOverwrite)
} else {
  ... // do the listStatus thing we've been doing before
}

where filename should be set if and only if the path refers to a file.
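A local-filesystem analogue of that rule (helper name is illustrative): the filename override applies only when the source is a file, while a directory always keeps its own name.

```scala
import java.io.File

// Illustrative only: the rename applies to files, never to directories,
// matching the "filename should be set iff path refers to a file" rule.
def targetFor(src: File, targetDir: File, filename: Option[String]): File =
  if (src.isFile) new File(targetDir, filename.getOrElse(src.getName))
  else new File(targetDir, src.getName) // a directory keeps its own name
```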

Contributor

I think it's a little weird, but it works. Only the very first call to fetchHcfsFile defines filename. If the path passed to it is a directory, it will recursively call itself without setting filename. If it's a file, it will write the file using the given filename. So even though it could be clearer, the code as is should work.

But I'm ok with making it clearer, exactly to avoid this kind of discussion. :-)

Contributor

If the path passed to it is a directory, it will recursively call itself without setting filename

That's not actually true. If you call listStatus on a directory it will list the directory's contents but not include the directory itself (I just verified this). So if the directory contains multiple files they will all go into the else case in L646 and be renamed to the same thing.
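A quick local-filesystem analogue of that listing behavior (java.io.File standing in for the Hadoop FileSystem API): listing a directory yields its children, never the directory itself.

```scala
import java.io.File
import java.nio.file.Files

// Listing a directory returns only its children; the directory's own
// name is absent, so any filename override would hit every child.
val dir = Files.createTempDirectory("lsdemo").toFile
new File(dir, "a.txt").createNewFile()
new File(dir, "b.txt").createNewFile()
val names = dir.listFiles().map(_.getName).sorted.toSeq
```

This is the same contract Hadoop's listStatus follows, which is why every child would take the else branch and be renamed to the same thing.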

Contributor

Yes, the problem here is that now targetDir is the parent directory of where path should be, and the children are being written directly to that parent path. It needs some code to create this directory corresponding to path before downloading the children.
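A hedged sketch of that fix (helper name is illustrative): when the source path is a directory, first create the matching directory under targetDir, then recurse with it as the new target so children land inside it rather than in the parent.

```scala
import java.io.File

// Illustrative only: create the directory corresponding to the source
// path before downloading its children, then recurse into it.
def ensureChildDir(targetDir: File, dirName: String): File = {
  val newDir = new File(targetDir, dirName)
  if (!newDir.isDirectory && !newDir.mkdirs()) {
    throw new java.io.IOException(s"Failed to create directory $newDir")
  }
  newDir // children are then fetched into newDir, not its parent
}
```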

@andrewor14
Contributor

@trystanleftwich thanks for fixing this. Given the current way we call fetchHcfsFile, I believe your existing patch is sufficient to fix the problem. However, at the risk of being pedantic, I believe it is technically not correct if the path refers to a directory, for the reason I described above. This is only a concern when fetching directories through fetchHcfsFile, though, which is something I don't think we support anyway.

In other words I think this patch in its current state is probably fine to merge, but I'd be interested to hear what others think.

@vanzin
Contributor

vanzin commented Mar 4, 2015

I tried this patch locally, and while it works for addFile(String), it does not seem to work for addFile(String, boolean) (i.e. the version that supports directories). Here's the error I got:

Exception in thread "Driver" org.apache.spark.SparkException: File /dataroot/local/yarn/nm/usercache/systest/appcache/application_1425400850634_0015/userFiles-ed754418-8d53-4e86-a324-738b70fab5cd/spark-files.23921
exists and does not match  contents of hdfs://vanzin-st1-1.vpc.cloudera.com:8020/tmp/spark-files.23921/core-site.xml
        at org.apache.spark.util.Utils$.copyFile(Utils.scala:519)
        at org.apache.spark.util.Utils$.org$apache$spark$util$Utils$$downloadFile(Utils.scala:471)
        at org.apache.spark.util.Utils$$anonfun$fetchHcfsFile$1.apply(Utils.scala:654)
        at org.apache.spark.util.Utils$$anonfun$fetchHcfsFile$1.apply(Utils.scala:641)

Let me take a look to see if I figure out what's missing.

@andrewor14
Contributor

Ah, I didn't realize addFile also supports directories for Hadoop file systems. Then this does seem to be a correctness problem.

@SparkQA

SparkQA commented Mar 4, 2015

Test build #28246 has finished for PR 4880 at commit 5931cc9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28246/

@trystanleftwich
Author

I fat-fingered and accidentally closed this ticket, and for some reason it's not picking up that the branch has changes in it. I reopened here:
#4881
