
[SPARK-13579][build][test-maven] Stop building the main Spark assembly. #11796

Closed · wants to merge 20 commits

Conversation

vanzin (Contributor) commented Mar 17, 2016

This change modifies the "assembly/" module to just copy needed
dependencies to its build directory, and modifies the packaging
script to pick those up (and remove duplicate jars packaged in the
examples module).

I also made some minor adjustments to dependencies to remove some
test jars from the final packaging, and remove jars that conflict with each
other when packaged separately (e.g. servlet api).

Also note that this change restores guava in applications' classpaths, even
though it's still shaded inside Spark. This is now needed for the Hadoop
libraries that are packaged with Spark, which are no longer processed by
the shade plugin.
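The copy-and-deduplicate step described above can be sketched in shell. This is a hypothetical simplification, not the real packaging script: `copy_jars` and the directory layout are illustrative only.

```shell
# Hypothetical helper sketching the new packaging approach: copy a
# module's dependency jars into a flat jars/ directory, skipping any
# jar whose name is already shipped in the main directory (the
# "duplicate jars in the examples module" case).
copy_jars() {
  src="$1"; dest="$2"; skip="$3"
  mkdir -p "$dest"
  for jar in "$src"/*.jar; do
    [ -e "$jar" ] || continue
    name=$(basename "$jar")
    # drop duplicates already present in the main jars directory
    if [ -f "$skip/$name" ]; then
      continue
    fi
    cp "$jar" "$dest/"
  done
}
```

For example, copying the main jars first and then the examples jars with the main directory as the skip list reproduces the deduplication the description mentions.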

vanzin (Contributor, Author) commented Mar 17, 2016

This is probably not ready yet, although it works for me locally. I want to run the patch through jenkins (sbt and maven) to see how well it works out.

SparkQA commented Mar 17, 2016

Test build #53471 has finished for PR 11796 at commit f679e14.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -238,36 +238,13 @@
 <configuration>
   <sources>
     <source>v${hive.version.short}/src/main/scala</source>
-    <source>${project.build.directory/generated-sources/antlr</source>
+    <source>${project.build.directory}/generated-sources/antlr</source>
Contributor:
Nice catch :)

Contributor (Author):
It didn't seem to be affecting the build, though. I'm actually wondering if this line is needed - the parser seems to be part of the catalyst module, not this one.

Contributor:
Yeah you are right about that. So this could actually be removed. Lemme know, I can also submit a PR for this.

Contributor (Author):
I'll remove it, but since this patch may take some more iterations before it's ready, feel free to do it as part of another patch too.

SparkQA commented Mar 17, 2016

Test build #53473 has finished for PR 11796 at commit 5d1e259.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 18, 2016

Test build #53472 has finished for PR 11796 at commit 54336b6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 18, 2016

Test build #53549 has finished for PR 11796 at commit 5ec6722.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 19, 2016

Test build #53576 has finished for PR 11796 at commit a83afeb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor, Author) commented Mar 21, 2016

Hmm, log files are gone and tests pass locally... sigh. retest this please

SparkQA commented Mar 21, 2016

Test build #53683 has finished for PR 11796 at commit a83afeb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 22, 2016

Test build #53735 has finished for PR 11796 at commit 5d517f0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor, Author) commented Mar 22, 2016

These tests fail sporadically; I think it's some weird sbt dependency resolution issue that's causing different hadoop versions to get mixed up. e.g.

16/03/21 19:43:11.830 redirect stderr for command ./bin/spark-submit INFO Utils: Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.addDeprecations([Lorg/apache/hadoop/conf/Configuration$DeprecationDelta;)V
16/03/21 19:43:11.830 redirect stderr for command ./bin/spark-submit INFO Utils:    at org.apache.hadoop.mapreduce.util.ConfigUtil.addDeprecatedKeys(ConfigUtil.java:54)
16/03/21 19:43:11.830 redirect stderr for command ./bin/spark-submit INFO Utils:    at org.apache.hadoop.mapreduce.util.ConfigUtil.loadResources(ConfigUtil.java:42)
16/03/21 19:43:11.830 redirect stderr for command ./bin/spark-submit INFO Utils:    at org.apache.hadoop.mapred.JobConf.<clinit>(JobConf.java:118)

That's not the whole stack trace, but there's only hadoop stuff in there, so it's not Spark calling some method that was removed.

vanzin (Contributor, Author) commented Mar 22, 2016

The offending method (void addDeprecations(DeprecationDelta[])) seems to have been added in Hadoop 2.3 and seems to exist in all versions since. Somehow we're getting hadoop 2.2 jars in the classpath?
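One way to check which jar on the classpath actually provides a class (and therefore which Hadoop version wins) is to list each jar's entries. A hedged sketch: `find_class_jar` is a hypothetical diagnostic helper, and it uses python3's `zipfile` module to list jar contents portably (`unzip -l` or `jar tf` would work the same way).

```shell
# Hypothetical diagnostic: print every jar on a colon-separated
# classpath that contains a given class file entry.
find_class_jar() {
  classfile="$1"; classpath="$2"
  echo "$classpath" | tr ':' '\n' | while IFS= read -r jar; do
    [ -f "$jar" ] || continue
    # list the jar's entries and check for the class file path
    if python3 -m zipfile -l "$jar" 2>/dev/null | grep -q "$classfile"; then
      echo "$jar"
    fi
  done
}
```

For the failure above, something like `find_class_jar 'org/apache/hadoop/conf/Configuration.class' "$CLASSPATH"` would show every jar supplying `Configuration`; each hit could then be inspected with `javap` for the missing `addDeprecations` method.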

SparkQA commented Mar 22, 2016

Test build #53800 has finished for PR 11796 at commit 83fa3a4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor, Author) commented Mar 22, 2016

Weird jline error, trying to see if I can figure it out... but the tests I want to fail passed. retest this please.

vanzin (Contributor, Author) commented Mar 22, 2016

jline error looks weirdly similar. Old version of jline (0.9) has class jline.Terminal, while newer versions have an interface.

2016-03-22 13:00:57.304 - stderr> java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
2016-03-22 13:00:57.306 - stderr>   at jline.TerminalFactory.create(TerminalFactory.java:101)
2016-03-22 13:00:57.306 - stderr>   at jline.TerminalFactory.get(TerminalFactory.java:158)
2016-03-22 13:00:57.306 - stderr>   at jline.console.ConsoleReader.<init>(ConsoleReader.java:229)
2016-03-22 13:00:57.306 - stderr>   at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)

SparkQA commented Mar 23, 2016

Test build #53818 has finished for PR 11796 at commit 83fa3a4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor, Author) commented Mar 23, 2016

I'm a little at a loss as to what's wrong. Copy & pasting the commands from the logs and running them on the same jenkins machine works (I don't get that exception). Running javap for the Configuration class with the command's classpath shows the new class, with the method that the logs complain about...

SparkQA commented Mar 23, 2016

Test build #53874 has finished for PR 11796 at commit b5b8dd1.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 23, 2016

Test build #53871 has finished for PR 11796 at commit 0339263.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 23, 2016

Test build #53875 has finished for PR 11796 at commit 6c4ed0a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 23, 2016

Test build #53948 has finished for PR 11796 at commit e37886b.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 23, 2016

Test build #53949 has finished for PR 11796 at commit 55aec21.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor, Author) commented Mar 23, 2016

yay, but I'm still worried about the previous failures. will test on maven now.

@vanzin vanzin changed the title [SPARK-13579][build][wip] Stop building the main Spark assembly. [SPARK-13579][build][wip][test-maven] Stop building the main Spark assembly. Mar 23, 2016
vanzin (Contributor, Author) commented Mar 23, 2016

retest this please

vanzin (Contributor, Author) commented Mar 23, 2016

the maven build seems stuck... :-/ I'll revert to sbt and add some patches since I want to try to hit the previous failures.

@vanzin vanzin changed the title [SPARK-13579][build][wip][test-maven] Stop building the main Spark assembly. [SPARK-13579][build][wip] Stop building the main Spark assembly. Mar 23, 2016
@@ -201,24 +201,29 @@ class FileAppenderSuite extends SparkFunSuite with BeforeAndAfter with Logging {

// Make sure only logging errors
Contributor:
What's the deal with these logging changes? Got a short summary?

Contributor (Author):
For some reason these tests failed in one of my runs. And when they failed, they used to leave the error level at "ERROR", which basically means all tests that ran after these didn't write any logs.

JoshRosen (Contributor):
Took a quick pass and have a few clarifying questions. This looks like it's in good shape overall, though.

SparkQA commented Mar 31, 2016

Test build #54652 has finished for PR 11796 at commit 9275ea6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor, Author) commented Mar 31, 2016

@JoshRosen if you want to take a look at the above failure, it seems related to running the build with jdk8. retest this please

JoshRosen (Contributor):
Hmm, that's unfortunate. Let me flip the switch to move the PR builder back to Java 7...

JoshRosen (Contributor):
Jenkins, retest this please.

SparkQA commented Mar 31, 2016

Test build #54656 has finished for PR 11796 at commit 9275ea6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 31, 2016

Test build #54657 has finished for PR 11796 at commit 9275ea6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin (Contributor, Author) commented Apr 4, 2016

Hi @JoshRosen , any more comments?

SparkQA commented Apr 4, 2016

Test build #54865 has finished for PR 11796 at commit b5d4292.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

 if [ -f "${SPARK_HOME}/RELEASE" ]; then
-  SPARK_JARS_DIR="${SPARK_HOME}/lib"
+  SPARK_JARS_DIR="${SPARK_HOME}/jars"
 else
   SPARK_JARS_DIR="${SPARK_HOME}/assembly/target/scala-$SPARK_SCALA_VERSION"
Contributor:
I think you also have to append /jars here?
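With the reviewer's suggested fix applied, the branch logic would look roughly like this. A sketch only: `spark_jars_dir` is a hypothetical helper written for illustration, not part of the actual script.

```shell
# Sketch of the corrected branching: a release layout keeps jars
# under $SPARK_HOME/jars, while a development checkout keeps them
# under assembly/target/scala-<version>/jars.
spark_jars_dir() {
  home="$1"; scala_ver="$2"
  if [ -f "$home/RELEASE" ]; then
    echo "$home/jars"
  else
    echo "$home/assembly/target/scala-$scala_ver/jars"
  fi
}
```

This matches the error path reported later in the thread (`.../assembly/target/scala-2.10/jars`), which only exists once the jars have actually been copied there by the build.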

JoshRosen (Contributor):
LGTM pending Jenkins. I'll merge this after tests pass and after I wrap up a final local test with the make-distribution.sh script.

I'd like to get these core changes in now in order to unblock other build patches and so that they get more widespread community testing. If necessary, we can address small issues in followup patches.

SparkQA commented Apr 4, 2016

Test build #54892 has finished for PR 11796 at commit a419707.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

JoshRosen (Contributor):
I also manually ran DistributedSuite in SBT to make sure that the SBT build still works; everything looked fine, so I'm merging now. Thanks!

asfgit closed this in 24d7d2e on Apr 4, 2016
zsxwing (Member) commented Apr 5, 2016

How do you build Spark after this patch? I just ran build/sbt clean assembly, but bin/spark-shell doesn't work any more:

Failed to find Spark jars directory (/Users/zsx/workspace/spark/assembly/target/scala-2.10/jars).
You need to build Spark before running this program.

JoshRosen (Contributor):
@zsxwing, try build/sbt clean package?

vanzin (Contributor, Author) commented Apr 5, 2016

Yeah it should be "package" now. I'll send a message to the dev list.

zsxwing (Member) commented Apr 5, 2016

yeah, package works. Thanks!

vanzin deleted the SPARK-13579 branch on April 5, 2016