[SPARK-1681] Include datanucleus jars in Spark Hive distribution #610
Conversation
Merged build triggered.
Merged build started.
```shell
if [ -n "$datanucleus_jars" ]; then
  an_assembly_jar=${ASSEMBLY_JAR:-$DEPS_ASSEMBLY_JAR}
  hive_files=$(jar tvf "$an_assembly_jar" org/apache/hadoop/hive/ql/exec)
```
should probably still pipe 2>/dev/null here as before, I believe this throws an error if it doesn't exist
forgot, thanks
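The agreed-upon fix can be sketched as follows. Variable names mirror the diff above; the jar path value is a hypothetical, nonexistent file used purely to demonstrate the failure mode being discussed.

```shell
# Sketch of the Hive-class probe with stderr suppressed, per the review
# suggestion. ASSEMBLY_JAR / DEPS_ASSEMBLY_JAR mirror the diff; the values
# below are hypothetical, set so the jar does not exist in this demo.
ASSEMBLY_JAR=""
DEPS_ASSEMBLY_JAR="missing-deps-assembly.jar"
an_assembly_jar=${ASSEMBLY_JAR:-$DEPS_ASSEMBLY_JAR}
# Without 2>/dev/null, `jar tvf` on a missing jar prints an error to stderr;
# with it, a missing jar simply yields an empty $hive_files.
hive_files=$(jar tvf "$an_assembly_jar" org/apache/hadoop/hive/ql/exec 2>/dev/null)
if [ -n "$hive_files" ]; then
  echo "Found Hive classes in $an_assembly_jar"
fi
```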
Merged build triggered.
Merged build started.
```shell
@@ -29,6 +29,7 @@ FWDIR="$(cd `dirname $0`/..; pwd)"

# Build up classpath
CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH:$FWDIR/conf"
CLASSPATH=$(echo "$CLASSPATH" | sed s/::/:/g)
```
Is this just for aesthetics or is there a correctness issue here?
aesthetics... @aarondav also pointed out that this is a little confusing. Maybe I should remove it.
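For context, the double colon shows up whenever one of the joined variables is empty; a minimal demo, with all values hypothetical:

```shell
# Demo of where "::" comes from: joining with ':' around an empty variable
# leaves an empty entry behind. All values here are hypothetical.
SPARK_CLASSPATH="/opt/spark/extra.jar"
SPARK_SUBMIT_CLASSPATH=""        # commonly unset outside spark-submit
FWDIR="/opt/spark"
CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH:$FWDIR/conf"
echo "$CLASSPATH"                # /opt/spark/extra.jar::/opt/spark/conf
# The sed pass collapses the empty entry; per the thread this is cosmetic.
CLASSPATH=$(echo "$CLASSPATH" | sed s/::/:/g)
echo "$CLASSPATH"                # /opt/spark/extra.jar:/opt/spark/conf
```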
@andrewor14 looks good! Just a few small questions.
Merged build finished. All automated tests passed.
Sorry for a dumb question, but is this the place to configure what goes into artifacts? The Maven build is where the rest of that action happens, right? Or is this a necessary special case?
There is another solution: #598
Right now we actually build distributions with … I think for 1.1 the goal should be both to consolidate the builds and to potentially migrate this entire thing to Maven using release profiles. But for now I want a surgical fix that we can pull into the 1.0 branch.
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
@andrewor14 @aarondav I tried this locally and ran into a bug. The use of …
Regarding question 1, it was just to avoid having to nest another check for the existence of the dependency jar. More of a hack than anything.
As a more general solution for the problem I encountered, I've created: … With that in mind, I think it's fine to just address (2) here, since we can assume this error message will be found elsewhere.
By both cases, I mean building the uber assembly jar and building the deps assembly jar.
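One way to make the jar command track the JVM that will actually run Spark is to prefer the copy under `JAVA_HOME`. A hedged sketch of that idea, not necessarily the exact fix that landed in the PR:

```shell
# Sketch: pick the `jar` that matches $JAVA_HOME when one is set, falling
# back to whatever `jar` is on the PATH. This mirrors the "have jar command
# respect JAVA_HOME" idea, but is an assumption, not the merged code.
if [ -n "$JAVA_HOME" ] && [ -x "$JAVA_HOME/bin/jar" ]; then
  JAR_CMD="$JAVA_HOME/bin/jar"
else
  JAR_CMD="jar"
fi
echo "Using jar command: $JAR_CMD"
```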
Merged build triggered.
Merged build started.
```shell
fi

# Verify that versions of java used to build the jars and run Spark are compatible
jar_error_check=$("$JAR_CMD" -tf "$ASSEMBLY_JAR" scala/AnyVal 2>&1)
```
As per our offline discussion, let's just change this to unused/class/path or something.
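With that rename, the check might look like the sketch below. The `JAR_CMD` and `ASSEMBLY_JAR` defaults are hypothetical, and the `invalid CEN header` grep pattern is an assumption about the error an older JVM prints when reading a jar built by a newer JDK, not a quote from the merged script:

```shell
# Sketch of the version-compatibility check after renaming the probe path
# to unused/class/path. Defaults below are hypothetical stand-ins.
JAR_CMD="${JAR_CMD:-jar}"
ASSEMBLY_JAR="${ASSEMBLY_JAR:-spark-assembly.jar}"
# Listing a path that matches nothing succeeds on a healthy jar, but a jar
# built with an incompatible Java version still surfaces a read error.
jar_error_check=$("$JAR_CMD" -tf "$ASSEMBLY_JAR" unused/class/path 2>&1)
if echo "$jar_error_check" | grep -q "invalid CEN header"; then
  echo "Loading Spark jar failed: likely built with an incompatible Java version" >&2
  exit 1
fi
```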
Merged build finished. All automated tests passed.
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
This copies the datanucleus jars over from `lib_managed` into `dist/lib`, if any. The `CLASSPATH` must also be updated to reflect this change.

Author: Andrew Or <andrewor14@gmail.com>

Closes #610 from andrewor14/hive-distribution and squashes the following commits:

a4bc96f [Andrew Or] Rename search path in jar error check
fa205e1 [Andrew Or] Merge branch 'master' of github.com:apache/spark into hive-distribution
7855f58 [Andrew Or] Have jar command respect JAVA_HOME + check for jar errors both cases
c16bbfd [Andrew Or] Merge branch 'master' of github.com:apache/spark into hive-distribution
32f6826 [Andrew Or] Leave the double colons
940a1bb [Andrew Or] Add back 2>/dev/null
58357cc [Andrew Or] Include datanucleus jars in Spark distribution built with Hive support

(cherry picked from commit cf0a8f0)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
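The copy step the commit message describes might look roughly like this; the paths are assumptions inferred from the description, not the exact contents of the distribution script:

```shell
# Rough sketch of copying datanucleus jars from lib_managed into dist/lib,
# as the commit message describes. Paths are assumptions, not the real
# make-distribution.sh contents.
mkdir -p dist/lib
for jar in lib_managed/jars/datanucleus-*.jar; do
  # An unmatched glob is returned literally in plain sh; skip that case.
  [ -e "$jar" ] && cp "$jar" dist/lib/
done
```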