[SPARK-1876] Windows fixes to deal with latest distribution layout changes #819
Conversation
mateiz commented on May 19, 2014
- Look for JARs in the right place
- Launch examples the same way as on Unix
- Load datanucleus JARs if they exist
- Don't attempt to parse local paths as URIs in SparkSubmit, since paths with C:\ are not valid URIs
- Also fixed POM exclusion rules for datanucleus (it wasn't properly excluding it, whereas SBT was)
Also fixed an issue where SparkSubmit was trying to parse local files as URLs, which fails on Windows because they contain backslashes. We didn't need to treat those as URLs to check whether a file exists.
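The failure mode is easy to reproduce with `java.net.URI` directly. A minimal sketch (the class name, helper, and file paths below are illustrative, not Spark code): a Unix path parses as a relative URI, while a Windows path fails because `C:` is taken as a URI scheme and the backslashes are illegal URI characters.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class WindowsPathUri {
    // Hypothetical helper: does the string parse as a URI at all?
    static boolean isValidUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            // Backslashes are not legal URI characters, so Windows
            // paths like C:\spark\... land here.
            return false;
        }
    }

    public static void main(String[] args) {
        // A Unix path parses fine (as a relative URI reference)...
        System.out.println(isValidUri("/opt/spark/lib/example.jar"));
        // ...but a Windows path with backslashes does not.
        System.out.println(isValidUri("C:\\spark\\lib\\example.jar"));
    }
}
```

This is why checking file existence via URL/URI parsing breaks on Windows, and why a plain filesystem check is the safer route.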
The datanucleus files are excluded in SBT, but the rule added in Maven didn't actually remove them from the JAR. The JARs built still worked despite this, but it's better to remove them than to have two copies on the classpath.
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
```sh
  shift
else
  echo "Usage: ./bin/run-example <example-class> [example-args]"
  echo "  - set MASTER=XX to use a specific master"
```
Isn't this deprecated? I thought in general we want people to use --master, since this goes through Spark submit.
Ah, I guess that was already there before
Did you mean to also remove the hive check here? https://github.com/apache/spark/blob/master/bin/compute-classpath.sh#L93
Yes, I don't want to rely on "jar" being installed. It's not installed by default when you grab a JRE (as far as I can tell). I'd like to eventually do that on Unix too, but it's okay to do it in a later release.
My worry with Windows is people downloading pre-built Spark and getting bizarre behavior. I'm assuming most people will work with pre-built Spark (since you'd mostly use Windows for local development), so those who build it by hand can handle a bit more complexity.
```bat
)
set "datanucleus_jars="
for %%d in ("%datanucleus_dir%\datanucleus-*.jar") do (
  set datanucleus_jars=!datanucleus_jars!;%%d
```
Hey @mateiz, I just tried this on Windows 7 and my classpath includes the literal string !datanucleus_jars!. It should probably be %datanucleus_jars% instead? (Or did you mean to also setlocal enabledelayedexpansion here?)
Ah, how did you try it? Did you just run compute-classpath? I set this in spark-class.cmd but not in compute-classpath, but I guess both need to work.
Thanks for the review, @andrewor14! I think I've dealt with all the comments (modulo a few I replied to above). The enabledelayedexpansion behavior was very subtle: if you call compute-classpath from spark-shell2, you should not set it again in compute-classpath, otherwise its changes to variables will not propagate out. But if you run compute-classpath by itself in a new process (as we do to launch executors), you should set it.
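The two halves of the pitfall can be sketched in a few lines of batch (a minimal illustration with hypothetical jar names, not the actual compute-classpath.cmd code). Inside a parenthesized `for` block, `%var%` is expanded once when the whole block is parsed, so an accumulator written as `%jars%` would never grow; `!var!` is re-expanded on every iteration, but only if delayed expansion is enabled in the current scope:

```bat
@echo off
rem Sketch only: real code globs "%datanucleus_dir%\datanucleus-*.jar".
setlocal enabledelayedexpansion
set "jars="
for %%d in (datanucleus-core.jar datanucleus-rdbms.jar) do (
  rem !jars! is expanded per iteration, so the list accumulates;
  rem %jars% here would have been frozen at its pre-loop (empty) value.
  set "jars=!jars!;%%d"
)
echo %jars%
endlocal
```

The flip side is the scoping issue described above: `setlocal` opens a nested variable scope that is discarded at `endlocal` (or at script exit). A script whose job is to hand a variable back to its caller must therefore not call `setlocal` again when the caller has already done so, which is why the setting lives in spark-shell2/spark-class rather than unconditionally in compute-classpath.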
I tested this again on Windows, building with and without Hive, and I can verify that it works as we expect. I think this is ready to go.
@mateiz Is this good for merging?
@tdas sure, go for it.
Great! Thanks! Merging it.
…anges

- Look for JARs in the right place
- Launch examples the same way as on Unix
- Load datanucleus JARs if they exist
- Don't attempt to parse local paths as URIs in SparkSubmit, since paths with C:\ are not valid URIs
- Also fixed POM exclusion rules for datanucleus (it wasn't properly excluding it, whereas SBT was)

Author: Matei Zaharia <matei@databricks.com>

Closes #819 from mateiz/win-fixes and squashes the following commits:

d558f96 [Matei Zaharia] Fix comment
228577b [Matei Zaharia] Review comments
d3b71c7 [Matei Zaharia] Properly exclude datanucleus files in Maven assembly
144af84 [Matei Zaharia] Update Windows scripts to match latest binary package layout

(cherry picked from commit 7b70a70)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>