
[SPARK-6152] Use shaded ASM5 to support closure cleaning of Java 8 compiled classes #9512

Closed
wants to merge 8 commits
Conversation

JoshRosen
Contributor

This patch modifies Spark's closure cleaner (and a few other places) to use ASM 5, which is necessary in order to support cleaning of closures that were compiled by Java 8.

In order to avoid ASM dependency conflicts, Spark excludes ASM from all of its dependencies and uses a shaded version of ASM 4 that comes from reflectasm (see SPARK-782 and #232). This patch updates Spark to use a shaded version of ASM 5.0.4 that was published by the Apache XBean project; the POM used to create the shaded artifact can be found at https://github.com/apache/geronimo-xbean/blob/xbean-4.4/xbean-asm5-shaded/pom.xml.
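For reference, the coordinates of the shaded artifact, as inferred from the linked xbean-4.4 POM (a sketch of the Maven dependency; the exact scope Spark uses may differ):

```xml
<!-- Shaded ASM 5 published by Apache XBean; coordinates taken from the
     xbean-4.4/xbean-asm5-shaded path in the POM linked above. -->
<dependency>
  <groupId>org.apache.xbean</groupId>
  <artifactId>xbean-asm5-shaded</artifactId>
  <version>4.4</version>
</dependency>
```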

http://movingfulcrum.tumblr.com/post/80826553604/asm-framework-50-the-missing-migration-guide was a useful resource while upgrading the code to use the new ASM5 opcodes.
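The underlying failure mode is visible in the class-file format itself: classes compiled for Java 8 carry major version 52, while ASM 4's ClassReader only accepts versions up to 51 (Java 7) and rejects newer files with an IllegalArgumentException (the exception the closure cleaner was catching). A self-contained sketch of reading that version field; the helper class and its name are illustrative, not part of this patch:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ClassVersionCheck {
    // Reads the major class-file version from raw bytecode.
    // Java 7 classes are version 51; Java 8 classes are version 52,
    // which is the version an ASM 4 ClassReader refuses to parse.
    public static int majorVersion(byte[] classBytes) {
        try {
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(classBytes));
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file");
            }
            in.readUnsignedShort();         // minor version
            return in.readUnsignedShort();  // major version
        } catch (IOException e) {
            throw new IllegalArgumentException("Truncated class file", e);
        }
    }

    public static void main(String[] args) {
        // Minimal class-file header: magic, minor = 0, major = 52 (Java 8).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                         0, 0, 0, 52};
        System.out.println(majorVersion(header)); // prints 52
    }
}
```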

I also added a new regression test in the java8-tests subproject; the existing tests were insufficient to catch this bug, which only affected Scala 2.11 user code compiled targeting Java 8.

@JoshRosen
Contributor Author

/cc @pwendell @srowen

@JoshRosen
Contributor Author

Apparently ASM is a common source of Java 8 upgrade pain: http://product.hubspot.com/blog/upgrading-to-java-8-at-scale

@SparkQA

SparkQA commented Nov 6, 2015

Test build #45165 has finished for PR 9512 at commit e5e27ce.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class DecisionTreeRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • final class GBTRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class IsotonicRegression @Since("1.5.0") (@Since("1.5.0") override val uid: String)
    • class LinearRegression @Since("1.3.0") (@Since("1.3.0") override val uid: String)
    • final class RandomForestRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)

@SparkQA

SparkQA commented Nov 6, 2015

Test build #45169 has finished for PR 9512 at commit c3e7f4a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class DecisionTreeRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • final class GBTRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class IsotonicRegression @Since("1.5.0") (@Since("1.5.0") override val uid: String)
    • class LinearRegression @Since("1.3.0") (@Since("1.3.0") override val uid: String)
    • final class RandomForestRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)

@JoshRosen
Contributor Author

Jenkins retest this please.

@SparkQA

SparkQA commented Nov 6, 2015

Test build #45190 has finished for PR 9512 at commit c3e7f4a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

I suppose it would be nice to actually run the Java 8 tests to see whether they would have caught this.

@pwendell
Contributor

This looks fine (i.e. LGTM). However, we could also look into shading ASM ourselves in our published artifacts, similar to how we now shade Jetty and other things. I'm fine with just using this already-shaded one, though. It seems to do no harm and will allow us to work with Java 8.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@JoshRosen
Contributor Author

I added a regression test to the java8-tests subproject and verified that it fails before this patch's changes.

@SparkQA

SparkQA commented Nov 10, 2015

Test build #45535 has finished for PR 9512 at commit c3e7f4a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

import BuildCommons._

lazy val settings = Seq(
  compilerJVMVersion := "1.8"
Contributor Author

Subtlety: this setting will break the java8 tests when running on Scala 2.10.x, which does not support emitting Java 8 bytecode. For this reason, we need separate settings for configuring the scalac and javac target JVM versions, and we should only set scalac's target to 1.8 when running Scala 2.11.4 or higher.
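A sketch of the kind of split this implies in the sbt build (the setting names `javacJVMVersion` and `scalacJVMVersion` are illustrative, not necessarily what the patch ended up using):

```scala
// Hypothetical sbt settings: always target Java 8 with javac, but only ask
// scalac for Java 8 bytecode on Scala 2.11+, since 2.10 cannot emit it.
javacJVMVersion := "1.8",
scalacJVMVersion := {
  if (scalaBinaryVersion.value == "2.10") "1.7" else "1.8"
}
```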

@JoshRosen
Contributor Author

@marmbrus, it turns out that our java8-tests Maven profile disables the Scala compiler plugin (presumably because it was being used to compile Java sources and was hard to configure properly). I could spend some time to make it so that the regression test for this patch can run in Maven, but I'd rather work on other things given how much time I've already sunk into this. Therefore, I just amended the "Building Spark" guide to contain the one-liner for running this in SBT. I'll defer proper configuration / execution of these tests to a followup / someone else.

@marmbrus
Contributor

I think the most important thing is that it will be run by the PR builder.

@JoshRosen
Contributor Author

The Java 8 tests currently aren't run at all (in fact, some of the streaming ones appear to be broken). In order to test this in the PR builder, I'll have to configure it to use JDK 8 and will have to supply the right profiles. I could probably auto-set the right profile based on the detected JDK version.

@SparkQA

SparkQA commented Nov 10, 2015

Test build #45546 has finished for PR 9512 at commit 38191a1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 10, 2015

Test build #45549 has finished for PR 9512 at commit fd96788.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45569 has finished for PR 9512 at commit 01f5cca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class JavaLBFGSExample
    • class SlidingRDDPartition[T](val idx: Int, val prev: Partition, val tail: Seq[T], val offset: Int)
    • class SlidingRDD[T: ClassTag](@transient val parent: RDD[T], val windowSize: Int, val step: Int)

} catch {
  case _: IllegalArgumentException => None
}
Some(new ClassReader(new ByteArrayInputStream(baos.toByteArray)))
Contributor Author

Just realized that this no longer needs to return an Option; will update now to keep the code clean.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45598 has finished for PR 9512 at commit 9833667.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45609 has finished for PR 9512 at commit 9833667.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class JavaLBFGSExample

@marmbrus
Contributor

Thanks, I'm going to merge this to master and 1.6. We should look at turning the Java 8 tests on in Jenkins as a follow-up.

@asfgit asfgit closed this in 529a1d3 Nov 11, 2015
asfgit pushed a commit that referenced this pull request Nov 11, 2015
…mpiled classes

This patch modifies Spark's closure cleaner (and a few other places) to use ASM 5, which is necessary in order to support cleaning of closures that were compiled by Java 8.

In order to avoid ASM dependency conflicts, Spark excludes ASM from all of its dependencies and uses a shaded version of ASM 4 that comes from `reflectasm` (see [SPARK-782](https://issues.apache.org/jira/browse/SPARK-782) and #232). This patch updates Spark to use a shaded version of ASM 5.0.4 that was published by the Apache XBean project; the POM used to create the shaded artifact can be found at https://github.com/apache/geronimo-xbean/blob/xbean-4.4/xbean-asm5-shaded/pom.xml.

http://movingfulcrum.tumblr.com/post/80826553604/asm-framework-50-the-missing-migration-guide was a useful resource while upgrading the code to use the new ASM5 opcodes.

I also added a new regression test in the `java8-tests` subproject; the existing tests were insufficient to catch this bug, which only affected Scala 2.11 user code compiled targeting Java 8.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #9512 from JoshRosen/SPARK-6152.

(cherry picked from commit 529a1d3)
Signed-off-by: Michael Armbrust <michael@databricks.com>
dskrvk pushed a commit to dskrvk/spark that referenced this pull request Nov 13, 2015
…mpiled classes

This patch modifies Spark's closure cleaner (and a few other places) to use ASM 5, which is necessary in order to support cleaning of closures that were compiled by Java 8.

In order to avoid ASM dependency conflicts, Spark excludes ASM from all of its dependencies and uses a shaded version of ASM 4 that comes from `reflectasm` (see [SPARK-782](https://issues.apache.org/jira/browse/SPARK-782) and apache#232). This patch updates Spark to use a shaded version of ASM 5.0.4 that was published by the Apache XBean project; the POM used to create the shaded artifact can be found at https://github.com/apache/geronimo-xbean/blob/xbean-4.4/xbean-asm5-shaded/pom.xml.

http://movingfulcrum.tumblr.com/post/80826553604/asm-framework-50-the-missing-migration-guide was a useful resource while upgrading the code to use the new ASM5 opcodes.

I also added a new regression test in the `java8-tests` subproject; the existing tests were insufficient to catch this bug, which only affected Scala 2.11 user code compiled targeting Java 8.

Author: Josh Rosen <joshrosen@databricks.com>

Closes apache#9512 from JoshRosen/SPARK-6152.
johnynek pushed a commit to twitter/chill that referenced this pull request Nov 10, 2016
* WIP

* Use Kryo 3.1.0-SNAPSHOT; register ClosureSerializer

* Update ClosureCleaner to use ASM 5 APIs:

References:

    http://movingfulcrum.tumblr.com/post/80826553604/asm-framework-50-the-missing-migration-guide
    apache/spark#9512

* Use sbt-doge to exclude certain subprojects from the 2.12 build.

* Remove testOptions change.

* Only register ClosureSerializer when running on Java 8 JDK.

* Update to use sbt-doge's +++ "strict aggregation" operator.

* Workaround to fix PriorityQueueTest (there's probably a better way to do this).

* Replace Unit.box() call by scala.runtime.BoxedUnit.UNIT

See https://issues.scala-lang.org/browse/SI-6710 for context.
This is necessary for Scala 2.12.0-M5.

* Update to 2.12.0 final.

* Undo BoxedUnit workaround since it's no longer necessary.

* Create Java8ClosureRegistrar

* Fix Java8ClosureRegistrar; standardize formatting.