[FLINK-7811] Add support for Scala 2.12 #6784

Closed
wants to merge 25 commits

Conversation

@aljoscha (Contributor) commented Oct 2, 2018

This is the first batch of changes on the way to Scala 2.12 support. All of them are necessary to make the Scala 2.12 compiler happy, and I also updated the ClosureCleaner to work with Scala 2.12.

This does not yet change the Scala version to 2.12, but I have a work-in-progress branch that does. The changes here are valid for Scala 2.11 and necessary for Scala 2.12.

This adds a dependency:

<!-- This artifact is a shaded version of ASM 6.x. The POM that was used to produce this
is at https://github.com/apache/geronimo-xbean/tree/trunk/xbean-asm6-shaded
For context on why we shade ASM, see SPARK-782 and SPARK-6152. -->
<dependency>
    <groupId>org.apache.xbean</groupId>
    <artifactId>xbean-asm6-shaded</artifactId>
    <version>4.8</version>
</dependency>

It's required to make the ClosureCleaner work with Scala 2.12 lambdas. It's convenient because it's already there, but we could also make our own flink-shaded release of ASM 6.
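
For reference, this is roughly the kind of bytecode inspection the ClosureCleaner performs with these classes; AsmProbe and its names are illustrative, not the actual ClosureCleaner code:

    import org.apache.xbean.asm6.{ClassReader, ClassVisitor, MethodVisitor, Opcodes}

    // Lists the methods of a class file, the same traversal style the
    // ClosureCleaner uses when scanning closure bytecode.
    object AsmProbe {
      class MethodLister extends ClassVisitor(Opcodes.ASM6) {
        override def visitMethod(access: Int, name: String, descriptor: String,
                                 signature: String, exceptions: Array[String]): MethodVisitor = {
          println(s"method: $name$descriptor")
          super.visitMethod(access, name, descriptor, signature, exceptions)
        }
      }

      def inspect(cls: Class[_]): Unit = {
        val in = cls.getResourceAsStream("/" + cls.getName.replace('.', '/') + ".class")
        try new ClassReader(in).accept(new MethodLister, 0)
        finally in.close()
      }
    }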

@zentol (Contributor) commented Oct 2, 2018

I'm currently preparing an asm6 version for flink-shaded, since we also need it for Java 9 compatibility.

@zentol (Contributor) commented Oct 2, 2018

Note that the GitHub link for the version you're using results in a 404.

@zentol (Contributor) commented Oct 2, 2018

For flink-shaded-asm it would be useful to know which components of ASM are actually required.

@zentol (Contributor) commented Oct 2, 2018

API compatibility failed:

11:55:19.619 [ERROR] Failed to execute goal com.github.siom79.japicmp:japicmp-maven-plugin:0.11.0:cmp (default) on project flink-streaming-scala_2.11: Breaking the build because there is at least one incompatibility: 
org.apache.flink.streaming.api.scala.DataStream$$anon$3.DataStream$$anon$3(org.apache.flink.streaming.api.scala.DataStream,org.apache.flink.api.common.typeinfo.TypeInformation,scala.Function1):CONSTRUCTOR_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$4.map(java.lang.Object):METHOD_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$4.DataStream$$anon$4(org.apache.flink.streaming.api.scala.DataStream,scala.Function1):CONSTRUCTOR_REMOVED,
org.apache.flink.api.common.functions.MapFunction[org.apache.flink.api.common.functions.MapFunction]:INTERFACE_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$5.flatMap(java.lang.Object,org.apache.flink.util.Collector):METHOD_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$5.DataStream$$anon$5(org.apache.flink.streaming.api.scala.DataStream,scala.Function2):CONSTRUCTOR_REMOVED,
org.apache.flink.api.common.functions.FlatMapFunction[org.apache.flink.api.common.functions.FlatMapFunction]:INTERFACE_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$6.DataStream$$anon$6(org.apache.flink.streaming.api.scala.DataStream,scala.Function1):CONSTRUCTOR_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$7.filter(java.lang.Object):METHOD_REMOVED,
org.apache.flink.api.common.functions.FilterFunction[org.apache.flink.api.common.functions.FilterFunction]:INTERFACE_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$8.invoke(java.lang.Object):METHOD_REMOVED,
org.apache.flink.streaming.api.functions.sink.SinkFunction[org.apache.flink.streaming.api.functions.sink.SinkFunction]:INTERFACE_REMOVED,
org.apache.flink.streaming.api.scala.DataStream$$anon$9.select(java.lang.Object):METHOD_REMOVED,
org.apache.flink.streaming.api.collector.selector.OutputSelector[org.apache.flink.streaming.api.collector.selector.OutputSelector]:INTERFACE_REMOVED -> [Help 1]

@aljoscha (Contributor, Author) commented Oct 3, 2018

> API compatibility failed: [same japicmp output as quoted above]

@zentol that was resolved by one of the last commits. The problem was that our exclusion pattern for anon$$... classes was not correct.
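
For context, Scala compiles anonymous classes to names with a literal $$anon$ infix, which is what the exclusion pattern has to match. A small illustrative snippet, not from the PR:

    object Example {
      def main(args: Array[String]): Unit = {
        val r = new Runnable { override def run(): Unit = () }
        println(r.getClass.getName) // prints something like "Example$$anon$1"
      }
    }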

@aljoscha (Contributor, Author) commented Oct 3, 2018

@zentol Also, the only ASM-related imports are

import org.apache.xbean.asm6.{ClassReader, ClassVisitor, MethodVisitor, Type}
import org.apache.xbean.asm6.Opcodes._

which seem to be exactly the same as for our earlier asm5 dependency.

@aljoscha force-pushed the scala-2-12 branch 4 times, most recently from 5da50f2 to ba043d0 on October 12, 2018
@aljoscha force-pushed the scala-2-12 branch 4 times, most recently from 5c6bc57 to a4ab90d on October 20, 2018
@aljoscha (Contributor, Author) commented:

@zentol The last 5 commits (minus the DO NOT MERGE commit) touch the Travis and release infrastructure, if you want to have a look.

@aljoscha (Contributor, Author) commented:

Also, we won't merge it like this; we have to think about how to integrate the Scala 2.12 build. This PR only builds Scala 2.12, for testing, without building Scala 2.11.

@aljoscha force-pushed the scala-2-12 branch 2 times, most recently from 8af1873 to e8d5110 on October 22, 2018
@zentol (Contributor) commented Oct 22, 2018

Can we split this PR so we don't change 30 different things at once? It's impossible to gauge how/whether these changes affect the current build with 2.11.

  • replace hard-coded scala versions
  • exclude kafka-clients dependency in kafka connector
  • remove scala 2.10 snapshots
  • bump breeze dependency version
  • changes to scala code
  • addition of scala 2.12 build

*/
def where[K: TypeInformation](fun: KeySelector[L, K]) = {
  val keyType = implicitly[TypeInformation[K]]
  val keyExtractor = new KeySelector[L, K] {

Contributor:
This looks like redundant wrapping of a KeySelector into another KeySelector. I would suggest removing it.
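
A sketch of the suggested simplification, assuming clean() is the enclosing closure-cleaning helper:

    // fun is already a KeySelector, so after cleaning it can be used directly
    // instead of being wrapped in a second, identical KeySelector
    val keyType = implicitly[TypeInformation[K]]
    val keyExtractor = clean(fun)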

*/
def equalTo[K: TypeInformation](fun: KeySelector[R, K]): O = {
  val keyType = implicitly[TypeInformation[K]]
  val keyExtractor = new KeySelector[R, K] {

Contributor:
Redundant wrapping, see above.

val cleanFun = clean(fun)
val keyType: TypeInformation[K] = implicitly[TypeInformation[K]]

val keyExtractor = new KeySelector[T, K] with ResultTypeQueryable[K] {

Contributor:
Do we need to wrap this in a ResultTypeQueryable if we explicitly pass the key type info below?

aljoscha (author):
I don't think so, I'm changing it.
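
A sketch of the simplified selector; keyType is the implicitly summoned TypeInformation, passed explicitly to the downstream operation instead of being advertised via ResultTypeQueryable:

    val keyExtractor = new KeySelector[T, K] {
      val cleanFun = clean(fun)
      override def getKey(value: T): K = cleanFun(value)
    }
    // keyType is then handed to the operation as an explicit argument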

@@ -227,32 +227,35 @@ object KNN {

// join input and training set
val crossed = crossTuned.mapPartition {
  (iter, out: Collector[(FlinkVector, FlinkVector, Long, Double)]) => {

Contributor:
Minor: Can this be fixed without changing the indentation level of the function body?
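
A hedged sketch of one way to satisfy this: drop the extra brace after the arrow so only the parameter list changes and the body keeps its indentation:

    val crossed = crossTuned.mapPartition {
      (iter, out: Collector[(FlinkVector, FlinkVector, Long, Double)]) =>
        for ((training, testing) <- iter) {
          // ... body unchanged at its original indentation level
        }
    }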

val rightKey = new Keys.SelectorFunctionKeys[R, K](
  keyExtractor,
  unfinished.leftInput.clean(fun),

Contributor:
Should this be unfinished.rightInput.clean(fun)?

aljoscha (author):
It doesn't matter, I just need to somehow clean the function.

@@ -64,8 +64,7 @@ under the License.

<dependency>
    <groupId>org.apache.flink</groupId>
    <!-- use a dedicated Scala version to not depend on it -->

Contributor:
I think this is dangerous, because it means that the artifacts we release have an undefined Scala version, effectively whatever version we uploaded first (or last?).

That means flink-avro 1.7 depends on Scala 2.11, flink-avro 1.8 might depend on Scala 2.12, and flink-avro 1.9 might depend again on Scala 2.11.

@@ -53,8 +53,7 @@ under the License.

<dependency>
    <groupId>org.apache.flink</groupId>
    <!-- use a dedicated Scala version to not depend on it -->

Contributor:
Same here, see above.

…eCleaner

This updates the ClosureCleaner with recent changes from SPARK-14540.
Some places in the code didn't have override; this wasn't a problem for
Scala 2.11, but Scala 2.12 seems to be more strict.
…pport

Scala 2.12 doesn't have ForkJoinPool there anymore.
It seems this wasn't a problem with Scala 2.11 but with 2.12 we go into
infinite recursion.

The fix should still be good for 2.11.
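
For illustration, assuming the fix swaps the Scala-bundled pool for the JDK one (a sketch, not the actual diff):

    // Scala 2.11 code could import scala.concurrent.forkjoin.ForkJoinPool;
    // the JDK class below works on both 2.11 and 2.12
    import java.util.concurrent.ForkJoinPool

    val pool = new ForkJoinPool() // parallelism defaults to available processors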
Our previous grizzled dependency is too old for Scala 2.12
The new Scala 2.12 typechecker didn't like this.
These changes are required for 2.12 support and also work on Scala 2.11.
It seems \$ doesn't work but $$ does. I noticed this when changing
DataStream.scala.
Previously we only allowed a lambda here, which was an omission.

This also adds support for KeySelector on other operations that need a
key.
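
Illustrative usage of the two variants, where stream is a DataStream[Event], Event is a placeholder type, and the usual org.apache.flink.streaming.api.scala._ implicits are in scope:

    import org.apache.flink.api.java.functions.KeySelector

    case class Event(id: String, payload: String)

    // lambda variant, which was already supported
    val byLambda = stream.keyBy(e => e.id)

    // explicit KeySelector variant enabled by this change
    val bySelector = stream.keyBy(new KeySelector[Event, String] {
      override def getKey(e: Event): String = e.id
    })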
… 0.7.6

We have to do this in order to be able to update our Chill dependency
without changing the Kryo serializers that are registered by default.

The problem is that snapshots of our Kryo serializer only contain the
user-registered serializers, not the serializers that are registered by
default by Chill. If we had that, we could probably get by without this
change.

The reason we have to update is that there is no Chill 0.7.4 dependency
for Scala 2.12.
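
For context, a user-level registration looks roughly like this (MyType and MyKryoSerializer are placeholders; Flink's serializer snapshots record such registrations, but not the ones Chill adds by default):

    env.getConfig.registerTypeWithKryoSerializer(
      classOf[MyType], classOf[MyKryoSerializer])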
…rk on Scala 2.12

They didn't have a SerialVersionUID before, and it seems Scala 2.12
assigns different UIDs than Scala 2.11, so we have to pin them to the
values that Scala 2.11 assigned automatically.
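
The fix follows this pattern (the UID value here is illustrative; the commits pin each class to the UID the 2.11 compiler had generated):

    import org.apache.flink.api.common.functions.MapFunction

    // pin the UID so 2.11- and 2.12-compiled builds stay serialization-compatible
    @SerialVersionUID(1L)
    class MyMapFunction extends MapFunction[String, String] {
      override def map(value: String): String = value
    }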
…sionUID

It seems that the SerialVersionUID of this changed between 2.11 and
2.12. We need to ignore it so that we can restore savepoints taken on a
2.11 build with a 2.12 build.
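
A standard way to tolerate a serialVersionUID mismatch on restore is to substitute the locally loaded class descriptor in a custom ObjectInputStream; a sketch of the general technique, not necessarily Flink's actual implementation:

    import java.io.{InputStream, ObjectInputStream, ObjectStreamClass}

    class UidTolerantObjectInputStream(in: InputStream) extends ObjectInputStream(in) {
      override protected def readClassDescriptor(): ObjectStreamClass = {
        val read = super.readClassDescriptor()
        // prefer the local descriptor so a differing UID is accepted
        try {
          val local = ObjectStreamClass.lookup(Class.forName(read.getName))
          if (local != null) local else read
        } catch {
          case _: ClassNotFoundException => read
        }
      }
    }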
…hinBound()

With Scala 2.12, the ClosureCleaner will complain that the filter
function that uses withinBound() is not serializable. The reason is that
when "Bound" is not final, it is serialized with the closure, which
includes IterateExample.
This changes some modules that had a _2.11 dependency but didn't expose
it in their module name to instead depend on the ${scala.binary.version}
dependency.

The main reason for this is to make the build self-contained. Before,
with the hard dependency on 2.11, when building for 2.12 it would not be
clear where the dependency would come from, because it is not created as
part of the build. This could lead to inconsistencies. For example, when
adding a new class in flink-runtime and recompiling only on 2.12 but not
on 2.11, the 2.12 tests would fail when using that new class, because
they would use 2.11 dependencies that weren't rebuilt with the new
class.

We also don't build flink-scala-shell and flink-connector-kafka-0.8
because they don't work with Scala 2.12.

This also includes $PROFILE in the dependency convergence check script.
This is in preparation for building with "-Dscala-212", where we have to
exclude certain things.

We also exclude Kafka 0.8 from stages when building for Scala 2.12, and
add Scala 2.12 to the release scripts.
@aljoscha (Contributor, Author) commented:

Merged

@aljoscha closed this on Oct 31, 2018
@aljoscha deleted the scala-2-12 branch on October 31, 2018