
[2.12] Improve -Yrepl-class-based #8712

Merged: 15 commits into scala:2.12.x on Feb 20, 2020

Conversation

dwijnand
Member

@dwijnand dwijnand commented Feb 12, 2020

These changes improve the REPL under -Yrepl-class-based, so that it can eventually become the new default (in 2.13.2).

For background: -Yrepl-class-based is an experimental (-Y) compiler option that switches the REPL to use "class-based" (instead of "object-based") wrappers as an implementation detail. This mode originated in Spark's fork of the Scala REPL (and was later contributed upstream to scala/scala) because the object-based wrappers were problematic for the code serialisation that the Spark REPL does.
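As a rough picture of the difference, here is a hedged, heavily simplified sketch of the two encodings (hypothetical names; the real REPL generates names like $line2.$read and $iw, and nests further wrappers per import, as shown in the generated code later in this thread):

```scala
// Object-based: each line's state is static, reachable through object
// initialization.
class Line1Read extends Serializable {
  class iw extends Serializable { val x = 1 }
  val iw = new iw
}
object Line1Read extends Serializable { val INSTANCE = new Line1Read }

// Class-based: each line's state lives in an instance, and later lines
// hold vals pointing back at earlier lines' instances; that chain of
// instances is what Spark serialises when shipping closures.
class Line2Read extends Serializable {
  val line1 = Line1Read.INSTANCE // back-reference to the prior line's state
  class iw extends Serializable { val y = line1.iw.x + 1 }
  val iw = new iw
}
```

The unused back-references discussed below are exactly vals like `line1` when the current line's code never touches that state.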

Additionally, the change to the encoding of Scala's lambdas in Scala 2.12 (i.e. since targeting JDK 8+) made it much, much easier to cause a deadlock in the REPL, aka scala/bug#9076. Using class-based wrappers resolves this too.
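A hedged sketch of the deadlock shape (simplified; in the REPL the trigger is any line that runs one of its own lambdas on another thread, e.g. via a Future): in 2.12 a lambda's body compiles to a method on the enclosing template. If that template is an object, its vals are initialised inside the class's static initialiser, which holds the JVM class-initialisation lock; a second thread invoking the lambda blocks on that lock while the first thread joins on the second, so neither makes progress. With a class wrapper, initialisation happens in a constructor with no class-init lock held, so the same code completes:

```scala
// With a class wrapper (as under -Yrepl-class-based), running one of
// our own lambdas on another thread during initialization is fine:
class ClassWrapper {
  val result: Int = {
    var out = 0
    val t = new Thread(() => { out = 1 }) // lambda body: a method on ClassWrapper
    t.start()
    t.join()
    out
  }
}
// The analogous `object Wrapper { ... }` version can deadlock: the
// spawned thread must wait for Wrapper's static initializer (which is
// still running) before it can invoke the lambda's method.
```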

However, there were some limitations to the class-based wrappers' implementation, which have been addressed in this PR.

  1. The first is the most involved part of this change. The mode contained a heuristic whereby an import from previous lines was ignored if the current line contained a class definition C and the import didn't contribute a name that was among the identifiers on this line. The goal of this heuristic was to avoid the outer pointer of class C transitively capturing the iw$ wrappers with state from previous lines that wasn't actually referred to. However, these imports could contribute implicits needed during typechecking of the class, so the heuristic changed semantics w.r.t. -Yrepl-class-based:false (reported as SPARK-5150). This PR always considers the imports, but adds a post-processing step to prune out unused references to the state of prior lines.

  2. The mode can now handle value classes, by switching the evaluation to use the old, object-based wrappers for the snippet that defines a value class.

  3. The mode now mitigates any attempts to define a macro in the REPL, by giving the user some more information in the error message.
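To illustrate limitation 1 (hypothetical names, plain Scala outside the REPL): an import can matter for typechecking without contributing any identifier that appears textually on the current line.

```scala
// State from a "previous line":
object PriorLine {
  implicit class IntOps(val i: Int) {
    def doubled: Int = i * 2
  }
}

import PriorLine._

// Neither `PriorLine` nor `IntOps` appears below; only the member name
// `doubled` does, and that is not a name the import itself contributes.
// The old heuristic therefore dropped the import, and typechecking of C
// failed. The new approach keeps the import and prunes unused
// back-references in a later cleanup step instead.
class C {
  val four: Int = 2.doubled
}
```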

Refs scala/bug#9076
Refs SPARK-5150

TODOs:

  • review the changes bearing in mind Spark uses -Yrepl-class-based already

@scala-jenkins scala-jenkins added this to the 2.12.12 milestone Feb 12, 2020
@dwijnand dwijnand force-pushed the 2.12/favour-class-based branch 2 times, most recently from ee22df6 to ef71699 Compare February 12, 2020 19:11
Contributor

@som-snytt som-snytt left a comment


Gave it a glance.

src/repl/scala/tools/nsc/interpreter/ReplGlobal.scala Outdated Show resolved Hide resolved
src/repl/scala/tools/nsc/interpreter/ReplGlobal.scala Outdated Show resolved Hide resolved
test/files/run/macro-bundle-repl.scala Show resolved Hide resolved
test/files/run/repl-colon-type.check Show resolved Hide resolved
test/files/run/repl-trim-stack-trace.check Outdated Show resolved Hide resolved
@retronym
Member

In retronym/spark#1, I've set up the Spark Maven build to be able to test against the Scala release from this (or any) PR, figured out how to run the Spark REPL test suite, and found that we need to add -Yrepl-use-magic-imports to the Spark REPL setup to keep the test green.

@retronym
Member

retronym commented Feb 13, 2020

I'd prefer if we didn't require Spark to pass -Yrepl-use-magic-imports. (That would require a new Spark release.)

We could enable it by default if -Yrepl-class-based is set, expanding the help of that option to say we'll do this. We do this sort of thing elsewhere using Setting#withPostSetHook.
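A minimal, self-contained model of the Setting#withPostSetHook pattern mentioned here (hypothetical API shape; the real one lives in scala.tools.nsc.settings.MutableSettings):

```scala
// Toy boolean setting with post-set hooks, mimicking the nsc pattern.
final class BoolSetting(val name: String) {
  private var hooks: List[BoolSetting => Unit] = Nil
  private var v: Boolean = false
  def value: Boolean = v
  def value_=(b: Boolean): Unit = { v = b; hooks.foreach(_(this)) }
  def withPostSetHook(f: BoolSetting => Unit): this.type = { hooks ::= f; this }
}

object ReplSettings {
  val YreplMagicImport = new BoolSetting("-Yrepl-use-magic-imports")
  // Enabling -Yrepl-class-based implies magic imports:
  val Yreplclassbased = new BoolSetting("-Yrepl-class-based")
    .withPostSetHook(_ => YreplMagicImport.value = true)
}
```

The point of the hook is that existing Spark releases which pass only -Yrepl-class-based would transparently get the behaviour they need, without a new Spark release.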

@retronym
Member

/cc @srowen (from the Spark team)

@lrytz
Member

lrytz commented Feb 13, 2020

we need to add -Yrepl-use-magic-imports to the Spark REPL setup to keep the test green.

Do you know why? What change "broke" plain -Yrepl-class-based for Spark?

@dwijnand dwijnand modified the milestones: 2.12.12, 2.12.11 Feb 13, 2020
@dwijnand
Member Author

dwijnand modified the milestones: 2.12.12, 2.12.11 now

... optimistically.

retronym and others added 5 commits February 13, 2020 12:18
…ores some imports

Co-authored-by: Dale Wijnand <dale.wijnand@gmail.com>
Always generate a `val INSTANCE` field, whether it's for class-based or
object-based, so that the access code branches needn't branch on
isClassBased.

This greatly simplifies the next commit, for handling value classes.
... by making any code snippet that defines (or seems to define) a value
class use the original "object-based" wrappers instead.
Co-authored-by: Dale Wijnand <dale.wijnand@gmail.com>
This avoids problems like

    private value read escapes its defining scope as part of type u.Type

when running `test/files/run/class-symbol-contravariant.scala` (and a
few others) with -Yrepl-class-based.

Also add a test specific for type aliases, extracted from
`test/files/jvm/interpreter.scala`, to run in class-based mode.
@dwijnand dwijnand force-pushed the 2.12/favour-class-based branch 2 times, most recently from e77dbd9 to 5835734 Compare February 13, 2020 12:20
The Spark REPL uses -Yrepl-class-based and the change around
ImportHandler in interpreter/Imports (and that introduced the custom
wrapperCleanup phase for the REPL) requires -Yrepl-use-magic-imports be
set in order for the Spark REPL to continue to work correctly.  Rather
than affect all existing Spark releases that choose to upgrade Scala, we
preserve the functioning of the Spark REPL by enabling "magic imports"
when using the class-based wrappers.
... and thus -Yrepl-use-magic-imports too.
The intent is for these to be true, but we'll do that for 2.13.x,
not 2.12.x.
@retronym
Member

Do you know why? What change "broke" plain -Yrepl-class-based for Spark?

It will be the removal of the heuristic that ignored imports introduced by previous lines when the current line contains a class definition that doesn't include an identifier contributed by that import.

That was supposed to be okay for us to remove because the new dead back-reference elimination (DBRE) should detect that it's unused and kill it.

The use case is covered in run/repl-serialization.scala. I see now that I added -Yrepl-use-magic-imports to that test, so there must have been an issue with making the DBRE work with classic wrapper nesting.

Indeed, that test fails with -Yrepl-class-based -Yrepl-use-magic-imports:false:

-	at scala.tools.partest.nest.StreamCapture$.$anonfun$capturingOutErr$2(StreamCapture.scala:32)
diff --git a/test/files/run/repl-serialization.scala b/test/files/run/repl-serialization.scala
index a0fed0b6f4..2f201146c2 100644
--- a/test/files/run/repl-serialization.scala
+++ b/test/files/run/repl-serialization.scala
@@ -14,7 +14,7 @@ object Test {
   def run(): Unit = {
     val settings = new Settings()
     settings.Yreplclassbased.value = true
-    settings.YreplMagicImport.value = true
+    settings.YreplMagicImport.value = false
     //settings.Xprint.value = List("refchecks","wrapper-cleanup")
     settings.usejavacp.value = true
# partest /Users/jz/code/scala/test/files/run/repl-serialization.scala
% scalac repl-serialization.scala
% <in process execution of run/repl-serialization.scala> > repl-serialization-run.log
% diff /Users/jz/code/scala/test/files/run/repl-serialization-run.log /Users/jz/code/scala/test/files/run/repl-serialization.check
@@ -23,75 +23,5 @@ u: U = U
   evaluating O
   constructing A
 == reconstituting into a fresh classloader
-java.io.NotSerializableException: $line9.$read$$iw$$iw$TestClass
-	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)

I'll take a quick look and see if it's an easy or hard problem to solve.

@retronym
Member

retronym commented Feb 14, 2020

-Yrepl-use-magic-imports:true

package $line17 {
  sealed class $read extends _root_.java.io.Serializable {
    val $line2$read: $line2.$read.INSTANCE.type = $line2.$read.INSTANCE();
    val $line3$read: $line3.$read.INSTANCE.type = $line3.$read.INSTANCE();
    val $line4$read: $line4.$read.INSTANCE.type = $line4.$read.INSTANCE();
    val $line6$read: $line6.$read.INSTANCE.type = $line6.$read.INSTANCE();
    val $line7$read: $line7.$read.INSTANCE.type = $line7.$read.INSTANCE();
    val $line8$read: $line8.$read.INSTANCE.type = $line8.$read.INSTANCE();
    val $line9$read: $line9.$read.INSTANCE.type = $line9.$read.INSTANCE();
    val $line15$read: $line15.$read.INSTANCE.type = $line15.$read.INSTANCE();
    val $line16$read: $line16.$read.INSTANCE.type = $line16.$read.INSTANCE();
    sealed class $iw extends _root_.java.io.Serializable {
      val res0 = $read.this.$line2$read().$iw().extract().apply({
        def $anonfun$res0(): A = new A($read.this.$line3$read().$iw().x().+($read.this.$line4$read().$iw().getX()).+($read.this.$line6$read().$iw().y()).+($read.this.$line7$read().$iw().z()).+($read.this.$line7$read().$iw().zz()).+($read.this.$line8$read().$iw().O().apply()).+($read.this.$line16$read().$iw().u().x()));
        (() => $anonfun$res0())
      })
    };
    val $iw = new $iw()
  }

  object $read extends Serializable {
    val INSTANCE = new $line17.$read()
  }
}
unused = [$line15$read,$line15$read ,$line9$read,$line9$read]

-Yrepl-use-magic-imports:false

package $line17 {
  sealed class $read extends _root_.java.io.Serializable {
    sealed class $iw extends _root_.java.io.Serializable {
      val $line2$read: $line2.$read.INSTANCE.type = $line2.$read.INSTANCE();
      val $line3$read: $line3.$read.INSTANCE.type = $line3.$read.INSTANCE();
      val $line4$read: $line4.$read.INSTANCE.type = $line4.$read.INSTANCE();
      val $line6$read: $line6.$read.INSTANCE.type = $line6.$read.INSTANCE();
      val $line7$read: $line7.$read.INSTANCE.type = $line7.$read.INSTANCE();
      val $line8$read: $line8.$read.INSTANCE.type = $line8.$read.INSTANCE();
      val $line9$read: $line9.$read.INSTANCE.type = $line9.$read.INSTANCE();
      sealed class $iw extends _root_.java.io.Serializable {
        sealed class $iw extends _root_.java.io.Serializable {
          val $line15$read: $line15.$read.INSTANCE.type = $line15.$read.INSTANCE();
          val $line16$read: $line16.$read.INSTANCE.type = $line16.$read.INSTANCE();
          sealed class $iw extends _root_.java.io.Serializable {
            val res0 = $iw.this.$line2$read().$iw().$iw().extract().apply({
              def $anonfun$res0(): A = new A($iw.this.$line3$read().$iw().$iw().x().+($iw.this.$line4$read().$iw().$iw().getX()).+($iw.this.$line6$read().$iw().$iw().y()).+($iw.this.$line7$read().$iw().$iw().z()).+($iw.this.$line7$read().$iw().$iw().zz()).+($iw.this.$line8$read().$iw().$iw().O().apply()).+($iw.this.$line16$read().$iw().$iw().$iw().$iw().u().x()));
              (() => $anonfun$res0())
            })
          };
          val $iw = new $iw()
        };
        val $iw = new $iw()
      };
      val $iw = new $iw()
    };
    val $iw = new $iw()
  }

  object $read extends Serializable {
    val INSTANCE = new $line17.$read()
  }
}
unused = []
java.io.NotSerializableException: $line9.$read$$iw$$iw$TestClass

@retronym
Member

retronym commented Feb 14, 2020

I pushed a fix. The main problem was that the wrapper-cleanup phase was not recursing beyond the outermost ClassDef.

We no longer need to auto-enable -Yrepl-use-magic-imports for Spark compatibility. Maybe it's best to decouple them, and we can decide in a separate PR whether we attempt to change the default for -Yrepl-use-magic-imports in 2.13 only or in 2.12 as well.
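A toy model of the recursion bug just described (hypothetical tree shapes, not the real compiler Trees): without magic imports each import nests a further $iw class, so a pruning transform that stops at the outermost ClassDef leaves inner back-references alive. The fix is the recursive call:

```scala
// Simplified wrapper-cleanup: drop unused val back-references, and
// recurse into nested ClassDefs rather than stopping at the outermost.
sealed trait Tree
final case class ClassDef(name: String, body: List[Tree]) extends Tree
final case class ValDef(name: String, used: Boolean) extends Tree

def prune(t: Tree): Tree = t match {
  case ClassDef(n, body) =>
    ClassDef(n, body.collect {
      case v @ ValDef(_, true) => v       // keep used vals
      case c: ClassDef         => prune(c) // the recursion that was missing
      // unused ValDefs are dropped
    })
  case other => other
}
```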


@retronym
Member

I've pushed a small refinement: we should not run the unused analysis on each ClassDef/ModuleDef defined in a line separately; we need to first run it on the innermost class $iw.

@retronym
Member

I also pushed a commit to limit the unused private removal to -Yrepl-class-based. I have some further WIP that I think we also need to add (I'll get it working and submit by Monday): https://github.com/retronym/scala/tree/repl/tighten-unused-wip

That last part aims to centralize the logic for creating the REPL temp val names and checking if a symbol has such a name. It then aims to limit the unused private removal to such vals, as we don't have any business removing unused privates from user code.

  - Only run the transform under `-Yrepl-class-based`
  - Only eliminate the `$lineXY$read` fields and accessors to be sure
    we don't accidentally eliminate statically unused user code,
    which could be intended for reflective use or the like.
  - Colocate the logic that creates and tests for `$lineXY$read`
    val names.
@retronym
Member

retronym commented Feb 16, 2020

The Spark REPL test suite is happy with this PR now. I dropped my commit that explicitly enabled -Yrepl-use-magic-imports; it now works with the postSetHook in here.

./build/mvn -P scala-pr-validation-snapshots -Dscala.version=2.12.11-bin-1dc6d84-SNAPSHOT -pl :spark-repl_2.12 -Dtest=none -DwildcardSuites='org.apache.spark.repl.ReplSuite,org.apache.spark.repl.SingletonReplSuite,org.apache.spark.repl.ExecutorClassLoaderSuite' test
...
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  03:09 min
[INFO] Finished at: 2020-02-17T07:28:33+10:00
[INFO] ------------------------------------------------------------------------

@retronym
Member

retronym commented Feb 16, 2020

Also passes with:

diff --git a/repl/src/main/scala/org/apache/spark/repl/Main.scala b/repl/src/main/scala/org/apache/spark/repl/Main.scala
index a68b112ed2..8f90d2b07e 100644
--- a/repl/src/main/scala/org/apache/spark/repl/Main.scala
+++ b/repl/src/main/scala/org/apache/spark/repl/Main.scala
@@ -67,20 +67,22 @@ object Main extends Logging {
       .mkString(File.pathSeparator)
     val interpArguments = List(
       "-Yrepl-class-based",
       "-Yrepl-outdir", s"${outputDir.getAbsolutePath}",
       "-classpath", jars
     ) ++ args.toList

     val settings = new GenericRunnerSettings(scalaOptionError)
     settings.processArguments(interpArguments, true)

+    settings.processArguments("-Yrepl-use-magic-imports:false" :: Nil, processAll = false)
+

I'm pretty sure we no longer need to customise `ignoreNames` now
that we're customizing `isEffectivelyPrivate` and we're restricting
the unused decl dropping to trees of the form `val $lineXY$read = ...`
Member

@lrytz lrytz left a comment


Looks great! Should we disable the postSetHook that enables magic imports?

@lrytz lrytz self-assigned this Feb 19, 2020
Member

@retronym retronym left a comment


:shipit:

This should help ensure the order is as deterministic as possible.
@lrytz lrytz merged commit 4c726db into scala:2.12.x Feb 20, 2020
@dwijnand dwijnand deleted the 2.12/favour-class-based branch February 20, 2020 12:23
@SethTisue SethTisue added the release-notes worth highlighting in next release notes label Feb 20, 2020
@SethTisue
Member

SethTisue commented Feb 20, 2020

@dwijnand tentatively labeled with "release-notes", but actually I'm unsure what the takeaway here is for users. I'd suggest either removing the label, or improving the PR description to begin with a user-focused description that puts this change in context, with the contributor-focused info afterwards.

@srowen

srowen commented Feb 20, 2020

For Spark, is the takeaway that spark-shell needs to set -Yrepl-use-magic-imports if Spark updates the Scala 2.12 version?

@dwijnand
Member Author

@srowen No, that's no longer true, thankfully (it's why we were able to drop the postSetHook).

The Spark-specific takeaway is that we tweaked how references are minimised, so that Spark's serialisation continues to work (i.e. not regressing on https://issues.apache.org/jira/browse/SPARK-2576 and https://issues.apache.org/jira/browse/SPARK-2632), while fixing https://issues.apache.org/jira/browse/SPARK-5150. Tested with our test/files/run/repl-serialization.scala, as well as running org.apache.spark.repl.ReplSuite, org.apache.spark.repl.SingletonReplSuite, and org.apache.spark.repl.ExecutorClassLoaderSuite in the Spark build (though, admittedly, not on the latest sha).

If you could double-check that for us, using the https://scala-ci.typesafe.com/artifactory/scala-integration/org/scala-lang/scala-compiler/2.12.11-bin-4c726db/ that would be great!

@dwijnand
Member Author

@SethTisue I'll update the description. It won't be interesting for non-Spark users until 2.13.2, but we can still describe it here and reference it.

Labels
release-notes worth highlighting in next release notes tool:REPL Changes in the Scala REPL Interpreter