[ZEPPELIN-3810] Support Spark 2.4 #3206
This is a WIP. We should wait for Spark 2.4.0. cc @zjffdu and @felixcheung
Thanks @HyukjinKwon. Have you checked PR #3034, which adds support for Scala 2.12?
Oops, I haven't. I will check that too while I am here. BTW, my understanding is that we need this one as well, since Spark can still be compiled against Scala 2.11.x. Am I on the right track?
Yes, we need to support Scala 2.11 for Spark 2.4 first.
@@ -192,6 +192,15 @@
  <profiles>
    <profile>
      <id>spark-2.4</id>
Hm, I thought this profile was not meant to be used for building. I referred to #2880.
BTW, I am verifying this in my personal Travis CI as well (https://travis-ci.org/HyukjinKwon/zeppelin/builds/442994923)
This is not necessary for building, but it is necessary for running the unit tests against Spark 2.4 in Travis.
@zjffdu, which Travis update should I make? Should I build this against spark-2.2, make it download Spark 2.4.0, and test it after setting SPARK_HOME, like I did?
@HyukjinKwon Could you update .travis.yml to add a test matrix entry for Spark 2.4?
I tried, but it looks like I don't know how to. Spark 2.3 doesn't seem to be tested either.
Force-pushed from e2d224a to 4b9ca42.
Is this going to break, say, Spark 2.3 with Scala 2.11.8?
Nope, it will work for both 2.11.8 and 2.11.12. I checked manually. This change only uses methods that exist in both Scala 2.11.8 and Scala 2.11.12.
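The compatibility claim above rests on looking methods up by name at run time instead of linking against them at compile time. Here is a minimal Java sketch of that idea. It is not Zeppelin's actual change, which lives in its Scala interpreter code; `ReflectiveCalls`, `Repl`, and `banner` are hypothetical names used only for illustration.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: call a method by name so the caller has no compile-time link to it. */
final class ReflectiveCalls {

    /** Stand-in for a library class whose method is not publicly accessible. */
    static class Repl {
        private String banner(String version) {
            return "Welcome to Scala " + version;
        }
    }

    /**
     * Finds a method called `name` on the target's class or any superclass,
     * makes it accessible, and invokes it. The lookup matches on name and
     * argument count only, which is enough for a sketch but would be
     * ambiguous for overloaded methods.
     */
    static Object call(Object target, String name, Object... args) {
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                    try {
                        m.setAccessible(true);
                        return m.invoke(target, args);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException("Failed to invoke " + name, e);
                    }
                }
            }
        }
        throw new IllegalStateException("No method named " + name + " on " + target.getClass());
    }

    public static void main(String[] argv) {
        // The private method is reached by name, not by a compiled reference.
        System.out.println(call(new Repl(), "banner", "2.11.12"));
    }
}
```

Because the method is resolved by name, the same compiled caller keeps working as long as a method with that name and arity exists in whichever library version is on the classpath. If the method is missing, the failure surfaces at run time rather than at build time.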
Hello, I was trying to run Zeppelin with Spark 2.4 and I have pulled your code.
Can you please help?
Does that happen only with these code changes? The change here does not touch the signature at
The error message:
complains there's no
Yes. With the original 0.8.0 source code I'm able to build.
Thanks for the reply.
It should be usable if the changes are cherry-picked properly. This PR basically just replaces one line: to a private function.
Force-pushed from 4b9ca42 to 9ac1797.
BTW, I tested this via my Travis CI: https://travis-ci.org/HyukjinKwon/zeppelin/builds/448215776. The tests seem to have passed.
I also tested this patch locally against Spark RC5.
Awesome, @HyukjinKwon. Let's wait for the Spark 2.4 release.
I was able to get this working by cherry-picking. We needed some changes in our environment that were not related to Zeppelin, but to our use of spark-images.
Thanks for the confirmation.
Hey all, could this get in by any chance?
Hi, all. It's finally announced.
Thanks, everyone. The only remaining thing is to update
@HyukjinKwon I created a PR for you to add a test for Spark 2.4. Would you mind merging it? HyukjinKwon#1
CI has passed; I will merge it if there are no more comments.
Looking forward to seeing it merged and Zeppelin 0.9.0 released. Spark 2.4 fixes some very nasty bugs.
Thank you, @HyukjinKwon, @zjffdu and ALL!
Thank you all!!
Spark 2.4 changed its Scala version from 2.11.8 to 2.11.12 (see SPARK-24418).

There are two problems with this upgrade on the Zeppelin side:

1. Some private methods that are used via reflection, for instance `loopPostInit`, became inaccessible. See:
   - https://github.com/scala/scala/blob/v2.11.8/src/repl/scala/tools/nsc/interpreter/ILoop.scala
   - https://github.com/scala/scala/blob/v2.11.12/src/repl/scala/tools/nsc/interpreter/ILoop.scala

   To work around this, I manually ported `loopPostInit` from 2.11.8 to retain the behaviour. Functions that exist in both Scala 2.11.8 and Scala 2.11.12 are called by reflection inside the new `loopPostInit`.

2. The upgrade from 2.11.8 to 2.11.12 requires a `jline.version` upgrade. Otherwise, we will hit:

   ```
   Caused by: java.lang.NoSuchMethodError: jline.console.completer.CandidateListCompletionHandler.setPrintSpaceAfterFullCompletion(Z)V
     at scala.tools.nsc.interpreter.jline.JLineConsoleReader.initCompletion(JLineReader.scala:139)
   ```

   To work around this, I upgraded jline from `2.12.1` to `2.14.3`.

[Improvement]

* [x] - Wait until Spark 2.4.0 is officially released.
* https://issues.apache.org/jira/browse/ZEPPELIN-3810

Verified manually against Spark 2.4.0 RC3

* Do the license files need an update? Yes
* Are there breaking changes for older versions? No
* Does this need documentation? No

Author: hyukjinkwon <gurwls223@apache.org>
Author: Hyukjin Kwon <gurwls223@apache.org>
Author: Jeff Zhang <zjffdu@gmail.com>

Closes apache#3206 from HyukjinKwon/ZEPPELIN-3810 and squashes the following commits:

c2456c9 [Hyukjin Kwon] Py4J 0.10.6 to 0.10.7
573f07d [Jeff Zhang] add test for spark 2.4 (#1)
9ac1797 [hyukjinkwon] Support Spark 2.4

(cherry picked from commit 4f73272)
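The `NoSuchMethodError` quoted in item 2 above means that the jline on the classpath lacks a method that the Scala 2.11.12 REPL calls. When diagnosing a similar mismatch, one option is to probe for the method by reflection before starting the interpreter. The following is a hedged sketch, not part of this PR: `MethodProbe` and `hasMethod` are hypothetical names, and only the jline class and method names are taken from the stack trace.

```java
/** Hypothetical diagnostic: does the class on the classpath expose a given public method? */
final class MethodProbe {

    /** Returns true only if the class can be loaded and declares or inherits the public method. */
    static boolean hasMethod(String className, String methodName, Class<?>... parameterTypes) {
        try {
            Class.forName(className).getMethod(methodName, parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Names come from the stack trace; the descriptor "(Z)V" means one
        // boolean parameter and a void return type.
        boolean ok = hasMethod(
            "jline.console.completer.CandidateListCompletionHandler",
            "setPrintSpaceAfterFullCompletion",
            boolean.class);
        System.out.println("jline has setPrintSpaceAfterFullCompletion(boolean): " + ok);
    }
}
```

Going by the description above, the probe would be expected to report `false` with jline `2.12.1` on the classpath and `true` after the upgrade to `2.14.3`. It also reports `false` when jline is not on the classpath at all, so a negative result needs to be read together with the dependency tree.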
Hello,
It is merged into branch-0.8 and the master branch. So yes, it will be released in Zeppelin 0.9.0, but I believe 0.8.1 will be released before Zeppelin 0.9.0.
Thank you for the quick response!
Does 0.8.1 have a release date?
Thanks again.
On Sun, 18 Nov 2018 at 15:10, Jeff Zhang <notifications@github.com> wrote:
… It is merged to branch-0.8 and master branch. So yes it will be released in zeppelin 0.9.0, but I believe 0.8.1 will be released before zeppelin 0.9.0
I will try to do it by the end of 2018.
Thank you very much!
On Sun, 18 Nov 2018 at 15:17, Jeff Zhang <notifications@github.com> wrote:
… I will try to do it by the end of 2018
Hi all, I'm using Zeppelin with the official Docker image and I'm still getting
This fix is not released yet. This PR fixes exactly the problem you faced. The fix will be available in the next release of Zeppelin.
Happy new year, guys! I was waiting for the release of 0.8.1, but I made a mistake and upgraded my Cloudera to 6.1, which comes with Hadoop 3.0 and Spark 2.4! I am trying to compile Zeppelin myself, but I have a question. I have cloned the repo and checked out branch-0.8, which I can see already has this merged. I built it as follows, but it still tells me Spark 2.4 is not supported:
Shall I use
To answer my own basic question: yes, it has to be -Pspark-2.4! But it didn't work on my macOS:
I did it again on my Ubuntu machine, which is part of the Cloudera cluster, and it went perfectly. (Obviously, Hadoop 3.0 is still not available in the profiles, but that didn't change anything.)
Can't wait for this release and 0.9.0! Great work and thank you.
I have some bad news related to Spark 2.4 and https://issues.apache.org/jira/browse/ZEPPELIN-3939. Unfortunately, it is not possible to test everything in a new Spark/Hadoop/CDH at once. I successfully ran an ML pipeline and read from Parquet, but after a day I realized I can't read JSON and CSV files.
@HyukjinKwon @zjffdu @Leemoonsoo is it possible to shade Spark- and Hadoop-related dependencies when Spark and Hadoop already exist? To avoid situations like this, I was thinking of a type of build that shades all the deps for users who already have Spark and Hadoop.
Hello everyone, and in the log zeppelin-interpreter-spark-hduser-ubuntu.log: INFO [2019-06-01 14:17:09,785] ({pool-2-thread-2} NewSparkInterpreter.java[open]:83) - Using Scala Version: 2.11. Scala version: 2.11.12. Can someone help me, please?