
[SPARK-7944][SPARK-8013] Remove most of the Spark REPL fork for Scala 2.11 #6903

Closed · wants to merge 11 commits

Conversation

dragos
Contributor

@dragos dragos commented Jun 19, 2015

This PR removes most of the code in the Spark REPL for Scala 2.11 and leaves just a couple of overridden methods in SparkILoop in order to:

  • change welcome message
  • restrict available commands (like :power)
  • initialize Spark context
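
The three bullet points above can be sketched in pure Scala (no Spark dependency; the `ILoop` base class here is a simplified, hypothetical stand-in for `scala.tools.nsc.interpreter.ILoop`, not the real API):

```scala
// Simplified stand-in for the upstream REPL loop; the real SparkILoop
// overrides the analogous hooks on scala.tools.nsc.interpreter.ILoop.
abstract class ILoop {
  def printWelcome(): Unit = println("Welcome to Scala")
  def commands: List[String] =
    List("help", "implicits", "javap", "power", "quit", "type")
  def loadFiles(): Unit = ()  // hook that runs once the interpreter is ready
}

class SparkILoop extends ILoop {
  // 1. Change the welcome message.
  override def printWelcome(): Unit = println("Welcome to Spark")

  // 2. Restrict available commands (same blocked set as in the patch).
  private val blockedCommands = Set("implicits", "javap", "power", "type", "kind")
  override def commands: List[String] =
    super.commands.filterNot(blockedCommands)

  // 3. Initialize the Spark context before handing control to the user.
  override def loadFiles(): Unit = {
    initializeSpark()
    super.loadFiles()
  }
  private def initializeSpark(): Unit =
    println("Spark context available as sc.")
}
```

Keeping the fork down to a handful of such overrides is what lets upstream REPL fixes flow in without backporting.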

The two codebases have diverged and it's extremely hard to backport fixes from the upstream REPL. This somewhat radical step is absolutely necessary in order to fix other REPL tickets (like SPARK-8013 - Hive Thrift server for 2.11). BTW, the Scala REPL has fixed the serialization-unfriendly wrappers thanks to @ScrapCodes's work in #4522

All tests pass and I tried the spark-shell on our Mesos cluster with some simple jobs (including with additional jars), everything looked good.

As soon as Scala 2.11.7 is out we need to upgrade and get a shaded jline dependency, clearing the way for SPARK-8013.

/cc @pwendell

Make sure the class path is correctly passed to the underlying REPL.
@srowen
Member

srowen commented Jun 19, 2015

Sounds like a good step to avoid future headaches, and it would surely be great to avoid continuing to fork. CC @mateiz as well, to recall any particular things that were important to do differently in the fork, and to help make sure nothing is being overlooked. Of course, this is just for 2.11.

@dragos
Contributor Author

dragos commented Jun 19, 2015

See scala/scala#4563 for shaded jline.

@SparkQA

SparkQA commented Jun 19, 2015

Test build #35268 has finished for PR 6903 at commit 9d46d85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

else
intp.lastWarnings foreach { case (pos, msg) => intp.reporter.warning(pos, msg) }
}
private val blockedCommands = Set("implicits", "javap", "power", "type", "kind")
Member

Why did you block these commands, may be to retain the original behavior or something ?

Contributor Author

Yes, I just followed the original behaviour.

Member

The reason for blocking them earlier was that they were non-functional, or supporting them needed a more complex change. Maybe that is not the case now; even if we want to retain the original behavior, we could hide them behind a flag like -Dspark.repl.experimental.commands or something, if you think these commands work as expected with Spark.

Contributor Author

I don't see any reason they wouldn't (I'll make sure), but I would add them in a separate pull request, so we can point to a JIRA in release notes. What do you think?

Member

Sounds good.
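
The flag idea discussed above could look something like this (a pure-Scala sketch; the property name is the reviewer's suggestion, not an actual Spark setting):

```scala
// Gate the blocked REPL commands behind a system property, as suggested
// in review. "spark.repl.experimental.commands" is a hypothetical name.
val blockedCommands = Set("implicits", "javap", "power", "type", "kind")

def availableCommands(all: List[String]): List[String] = {
  val experimental =
    sys.props.get("spark.repl.experimental.commands").contains("true")
  if (experimental) all else all.filterNot(blockedCommands)
}
```

This keeps the default behavior identical to the old fork while leaving an escape hatch for users who want the upstream commands back.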

@ScrapCodes
Member

Thanks for putting this together; I am going to take a closer look. Finally we will be relieved of the burden of maintaining and porting the REPL for each release.

@dragos
Contributor Author

dragos commented Jun 21, 2015

@ScrapCodes thanks for having a look!

@mateiz
Contributor

mateiz commented Jun 21, 2015

Hey Dragos, so what Scala 2.11 versions will this work with? It won't work with the very first one, right, only with versions after some of your other fixes? We need to document that so that people running older Scala 2.11s won't be confused.

@dragos
Contributor Author

dragos commented Jun 22, 2015

@mateiz Indeed. I tested this with Scala 2.11.6, but I would prefer to move to 2.11.7 (should be out this week) ASAP. 2.11.7 has a shaded version of jline, which means we can build hive-thriftserver and spark-sql would work too.

Currently, Spark depends on Scala 2.11.6, and that's the version that gets bundled in the assembly jar (and the distribution). I didn't find any mention in the docs on how to build with a different Scala version, so my assumption was that Spark fixes the Scala version you're going to use in the spark-shell.

(I'm pretty sure a determined dev would find a way to pass scala.version from the outside, but it doesn't seem to be a supported scenario).

@mateiz
Contributor

mateiz commented Jun 22, 2015

Alright, please also send a PR to update the version to 2.11.7 when that comes out, then. Both of these things should be able to make it into Spark 1.5.

@ScrapCodes
Member

We can change the LICENSE file too, and add this to the RAT checks.

@@ -274,7 +268,7 @@ class ReplSuite extends SparkFunSuite {

test("SPARK-2632 importing a method from non serializable class and not using it.") {
val output = runInterpreter("local",
"""
"""
Member

Probably a stray change.

Contributor Author

Actually no, arguments should be indented 2 spaces, right?

@ScrapCodes
Member

IMHO this cannot be merged before 2.11.7 is released.

@dragos
Contributor Author

dragos commented Jun 23, 2015

@ScrapCodes, @mateiz Let me know if there's anything else.

@SparkQA

SparkQA commented Jun 23, 2015

Test build #35557 has finished for PR 6903 at commit 66eec11.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dragos
Contributor Author

dragos commented Jun 23, 2015

I think this is a spurious failure:

[error] Failed: Total 1409, Failed 1, Errors 0, Passed 1408, Ignored 5
[error] Failed tests:
[error]     org.apache.spark.network.nio.ConnectionManagerSuite

@SparkQA

SparkQA commented Jun 24, 2015

Test build #957 has finished for PR 6903 at commit 66eec11.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dragos
Contributor Author

dragos commented Jun 24, 2015

Still spurious? This time it's a PySpark test.

@SparkQA

SparkQA commented Jun 24, 2015

Test build #958 has finished for PR 6903 at commit 66eec11.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ScrapCodes
Member

Can you update to Scala 2.11.7 as part of this? It is just a flip of the version number in pom.xml.

@ScrapCodes
Member

@dragos
Copy link
Contributor Author

dragos commented Jun 25, 2015

Yes, but it's a bit more than that. I'd like to make sure the thriftserver also builds, and that requires some jline version tinkering as well (it needs to pull in the one from the Hive dependency now).

rat-excludes and removed `-Yrepl-sync` per reviewer’s request.
@dragos
Contributor Author

dragos commented Jun 25, 2015

We are in fork-hell. Who can build and publish (or at least locate) this?

org.spark-project#genjavadoc-plugin_2.11.7;0.9-spark0: not found

@ScrapCodes
Member

I am sure @pwendell or @andrewor14 can help.

@mengxr
Contributor

mengxr commented Jul 7, 2015

@dragos @srowen Sorry I missed this thread! The official genjavadoc doesn't support 2.11.7 yet (https://github.com/typesafehub/genjavadoc/blob/master/project/Build.scala#L20). I can try to publish one for Scala 2.11.7 (if it compiles under Scala 2.11.7).

@dragos
Contributor Author

dragos commented Jul 7, 2015

@mengxr I think (0 to 7) is inclusive, so probably it would work. Thanks! Looking forward to closing this one :)
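
For reference, Scala's `to` does build an inclusive range, so a version list generated from `0 to 7` covers 2.11.7:

```scala
// `to` is inclusive, `until` is exclusive.
val versions = (0 to 7).map(i => s"2.11.$i")
assert(versions.last == "2.11.7")  // 2.11.7 is covered
assert((0 until 7).last == 6)      // `until` would have stopped at 2.11.6
```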

@brkyvz
Contributor

brkyvz commented Jul 7, 2015

Hey, sorry for missing this thread. I'll publish the 2.11.7 version today and ping you all.

@mengxr
Contributor

mengxr commented Jul 7, 2015

@dragos Sorry, I was looking at my local master. They added 2.11.7 12 days ago :)

@brkyvz
Contributor

brkyvz commented Jul 7, 2015

@dragos Just published the genjavadoc 2.11.7!


import org.apache.commons.lang3.StringEscapeUtils
import org.apache.spark.{SparkContext, SparkFunSuite}
import org.apache.spark.{ SparkContext, SparkFunSuite }
Contributor

Could you remove the spaces here?

@dragos
Contributor Author

dragos commented Jul 8, 2015

Thanks @brkyvz. In principle there's nothing I need to do for this PR (except the whitespace changes you pointed out), but I'll run tests locally again.

Could you point to the source of your fork? This would make things easier in the future.

@SparkQA

SparkQA commented Jul 8, 2015

Test build #1006 has finished for PR 6903 at commit 0ce67a6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dragos
Contributor Author

dragos commented Jul 8, 2015

I removed spaces around imports (can't this be checked by ScalaStyle?).

I noticed a few tests are looping on 2.11 (both on 2.11.6 and 2.11.7). I'll investigate them, but I don't think they have anything to do with these changes.

@ScrapCodes
Member

Yes, it can be. But enforcing it codebase-wide would be a huge (and unjustified, in the sense that it affects code history) effort.
See: https://github.com/scalastyle/scalastyle/blob/master/src/main/scala/org/scalastyle/scalariform/SpaceAroundTokenChecker.scala

I think maybe we should not block on those tests, but it would be good to open individual issues for everything we have sort of postponed for later?

@SparkQA

SparkQA commented Jul 8, 2015

Test build #1008 has finished for PR 6903 at commit 0ce67a6.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jul 8, 2015

I think this is a legit failure?

Scalastyle checks failed at following occurrences:
[error] sbt.IncompatiblePluginsException: Binary incompatibility in plugins detected.
[error] Note that conflicts were resolved for some dependencies:
[error]     org.codehaus.plexus:plexus-utils
[error]     org.apache.maven.wagon:wagon-provider-api
[error]     org.sonatype.sisu:sisu-inject-plexus
[error]     org.sonatype.sisu:sisu-inject-bean
[error]     org.sonatype.sisu:sisu-guice
[error]     cglib:cglib
[error]     asm:asm
[error]     com.typesafe:config
[error]     commons-lang:commons-lang
[error]     com.thoughtworks.paranamer:paranamer
[error] Use 'last' for the full log.
[error] running /home/jenkins/workspace/NewSparkPullRequestBuilder/dev/lint-scala ; received return code 1

@dragos
Contributor Author

dragos commented Jul 8, 2015

Looks like that... I have no idea what it means, though. I'll have a look.

@dragos
Contributor Author

dragos commented Jul 8, 2015

I can't reproduce it. How did you run it?

 $ dev/lint-scala 
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Scalastyle checks passed.

(Apache RAT takes an hour on my system, no idea what file trips it so badly, so I just ran lint tests).

@SparkQA

SparkQA commented Jul 8, 2015

Test build #36793 has finished for PR 6903 at commit 2b1a305.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jul 8, 2015

Have a look at http://stackoverflow.com/questions/22628547/binary-incompatibility-in-plugins-detected-in-play-2-1-3-project-after-upgrading which makes me think this is somehow related to jline. Why you don't reproduce it, I don't know; does it depend on the profile that's set? I have no idea about jline but I imagine you do; does that ring any bells?

@dragos
Contributor Author

dragos commented Jul 8, 2015

But the latest Jenkins build succeeded (at least, style checks did). Now it fails a test in catalyst. So this is a spurious error, it seems?

Note that there is no jline in the Jenkins log. That happened to be the case in the stack overflow question, but I doubt it's the same root cause.

Since I can't reproduce it, I also can't run `last` to see the full error message...

@SparkQA

SparkQA commented Jul 8, 2015

Test build #1012 has finished for PR 6903 at commit 2b1a305.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jul 8, 2015

Yeah these are different failures each time. I'm not sure what's going on -- still am not sure why these changes would make things merely flakier. Let me spin the wheel again.

@squito
Contributor

squito commented Jul 8, 2015

I've seen that failure randomly as well from changes that didn't change anything remotely related. It doesn't mean that the error is always spurious, but I wouldn't worry too much if it passes later on ...

@SparkQA

SparkQA commented Jul 8, 2015

Test build #1014 has finished for PR 6903 at commit 2b1a305.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dragos
Contributor Author

dragos commented Jul 9, 2015

Spinning the wheel "fixed" it :)

@srowen
Member

srowen commented Jul 9, 2015

I think this is good to go. Any last comments anyone?

@srowen
Member

srowen commented Jul 10, 2015

@dragos ah, sorry we seem to have merge conflicts now. I think we can merge after that's resolved and re-tested.

@dragos
Contributor Author

dragos commented Jul 10, 2015

Oh, that's great...

@SparkQA

SparkQA commented Jul 10, 2015

Test build #37034 has finished for PR 6903 at commit c596c6f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

9 participants