new bytecode emitter, GenBCode (11th attempt) #2620
GenBCode is a drop-in replacement for GenASM with several advantages:
- faster: ICode isn't necessary anymore.
Instead, the ASTs delivered by CleanUp (an expression language)
are translated directly into a stack-language (ASM Tree nodes)
- future-proofing for Java 8 (MethodHandles, invokedynamic).
- documentation included, shared mutable state kept to a minimum,
all contributing to making GenBCode more maintainable
than its counterpart (its counterpart being GenICode + GenASM).
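To picture what "translating an expression language directly into a stack language" means, here is a minimal sketch. It is not GenBCode's code: the `Expr` and `Insn` types and the `emit` and `run` functions are hypothetical stand-ins for the CleanUp trees and ASM Tree nodes mentioned above. The idea shown is only that a post-order walk of an expression tree yields stack instructions, with no intermediate representation in between.

```scala
// Hypothetical miniature of "expression language -> stack language".
// None of these names exist in scalac; they only illustrate the idea.
object ExprToStack {
  // A tiny expression language (stand-in for compiler ASTs).
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(lhs: Expr, rhs: Expr) extends Expr
  final case class Mul(lhs: Expr, rhs: Expr) extends Expr

  // A tiny stack language (stand-in for bytecode instructions).
  sealed trait Insn
  final case class Push(value: Int) extends Insn
  case object IAdd extends Insn
  case object IMul extends Insn

  // Post-order walk: emit code for both operands, then the operator.
  def emit(e: Expr): List[Insn] = e match {
    case Lit(v)    => List(Push(v))
    case Add(l, r) => emit(l) ::: emit(r) ::: List(IAdd)
    case Mul(l, r) => emit(l) ::: emit(r) ::: List(IMul)
  }

  // A small interpreter, used only to check that the emitted code is sensible.
  def run(code: List[Insn]): Int = {
    val stack = code.foldLeft(List.empty[Int]) {
      case (st, Push(v))          => v :: st
      case (b :: a :: rest, IAdd) => (a + b) :: rest
      case (b :: a :: rest, IMul) => (a * b) :: rest
      case (st, insn)             => sys.error(s"stack underflow at $insn with $st")
    }
    stack.head
  }
}
```

For instance, `ExprToStack.emit` applied to `Add(Lit(1), Mul(Lit(2), Lit(3)))` produces the pushes for 1, 2 and 3 followed by the multiply and then the add.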
A few tests are modified in this commit, for reasons given below.
(1) files/neg/case-collision
Just like GenASM, GenBCode also detects output classfiles
differing only in case. However the error message differs
from that of GenASM (collisions may be shown in a different order).
Thus the original test now has a flags file containing -neo:GenASM
and a new test (files/neg/case-collision2) has been added
for GenBCode. The .check files in each case show expected output.
(2) files/pos/t5031_3
Currently the test above doesn't work with GenBCode
(try with -neo:GenBCode in the flags file)
The root cause lies in the fix to
https://issues.scala-lang.org/browse/SI-5031
which weakened an assertion in GenASM
(GenBCode keeps the original assertion).
Actually that ticket mentions the fix is a "workaround"
(3) files/run/t7008-scala-defined
This test also passes only under GenASM and not GenBCode,
thus the flags file. GenASM turns a blind eye to:
An AbstractTypeSymbol (SI-7122) has reached the bytecode emitter,
for which no JVM-level internal name can be found:
ScalaClassWithCheckedExceptions_1.E1
The error message above (shown by GenBCode) highlights
there's no ScalaClassWithCheckedExceptions_1.E1 class,
so it shouldn't show up in the emitted bytecode
(GenASM emits bytecode that mentions the nonexistent class).
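As an illustration of item (1), the check for classfiles "differing only in case" can be sketched as below. This is an assumption-laden sketch, not the logic of either emitter: `CaseCollision` and `collisions` are hypothetical names, and the input is simply a list of class names. Sorting the result gives a deterministic order, which is one way to avoid the ordering differences mentioned above.

```scala
// Hypothetical sketch: find class names that would map to the same file
// on a case-insensitive file system. Not taken from GenASM or GenBCode.
object CaseCollision {
  def collisions(classNames: Seq[String]): List[List[String]] =
    classNames.distinct
      .groupBy(_.toLowerCase(java.util.Locale.ROOT)) // names equal up to case share a key
      .values
      .filter(_.size > 1)                            // keep only real collisions
      .map(_.sorted.toList)                          // deterministic order within a group
      .toList
      .sortBy(_.head)                                // deterministic order across groups
}
```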
|
Good try, |
|
@cunei: could we have a dbuild job that not only builds akka but also executes its tests? |
|
I actually created that job, so we should be able to do so. Trying to build all of akka. |
|
Technically, dbuild will test all the sbt-based projects, unless an explicit "run-tests: false" is added. In this case, it will try to run "test" of the only Akka subproject that was specified ("akka-actor"), which may or may not do anything at this time. Compiling (and testing) the entire Akka is another matter, however: Akka depends on a number of other libraries, including stuff like "zeromq-scala-binding" and "scalabuff". I am currently trying to track down and hammer into shape all of the necessary pieces; the full dbuild task is at https://jenkins-dbuild.typesafe.com:8499/job/Community1. I am still working on it, though. |
|
For testing in real-world conditions, the real-world GenBCode should be used instead (ie this PR + dead-code elimination + jump chains collapsing + removal of dangling exception handlers, as described at magarciaEPFL@b0b012e ). That functionality was originally left out to simplify reviewing. |
|
@magarciaEPFL: the features you mentioned are all optimizations, so whether tests pass shouldn't be affected by the lack of those features, right? |
|
Dead-code elimination is something ICode + GenASM do nowadays by default, same goes for jump-chains collapsing (it's called "removal of JUMP-only blocks" there). If I remember correctly @JamesIry added dce by default because otherwise a Before writing classfiles with "optimization level zero" That artifact results from the requirement by the Java 6 split verifier |
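For readers unfamiliar with jump-chain collapsing ("removal of JUMP-only blocks"), the following is a minimal sketch of the idea under simplifying assumptions: blocks are identified by string labels, and a hypothetical map `jumpOnly` records which blocks consist of nothing but a jump to another label. It is not the code of either backend.

```scala
// Hypothetical sketch of jump-chain collapsing: follow chains of blocks
// that only jump elsewhere, so a jump can be retargeted to the final label.
object JumpChains {
  // jumpOnly(l) == t means: the block labelled l consists solely of "goto t".
  def finalTarget(label: String, jumpOnly: Map[String, String]): String = {
    @annotation.tailrec
    def loop(current: String, seen: Set[String]): String =
      jumpOnly.get(current) match {
        // Keep following the chain, but stop on a cycle (e.g. an empty infinite loop).
        case Some(next) if !seen(next) => loop(next, seen + next)
        case _                         => current
      }
    loop(label, Set(label))
  }
}
```

With `Map("L1" -> "L2", "L2" -> "L3")`, a jump to `L1` can be rewritten as a jump to `L3`, after which `L1` and `L2` may become unreachable.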
|
One of the tickets showing why DCE is necessary: https://issues.scala-lang.org/browse/SI-7006 |
|
Thanks for the pointers. I wasn't aware of that issue. |
|
SI-7006 does not affect correctness, right? "Just" worse performance. |
|
Without DCE in GenBCode, we must trust the code-patching approach of http://asm.ow2.org/doc/developer-guide.html#deadcode to emit verifiable code. By doing that, we may well end up debugging that code-patching approach rather than GenBCode. For running tests, we can't compare apples (GenASM with built-in DCE + removal of jump-only basic blocks) and oranges (GenBCode lobotomized not to perform DCE, nor jump-chains collapsing). |
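The kind of dead-code elimination under discussion can be pictured as a reachability computation over the control-flow graph: blocks not reachable from the entry block are dropped before classfiles are written. The sketch below assumes a hypothetical CFG encoding (block ids mapped to successor ids) and is not the algorithm of GenASM, GenBCode, or ASM.

```scala
// Hypothetical sketch of unreachable-block elimination on a CFG given as
// "block id -> successor block ids". Names are illustrative only.
object Reachability {
  def reachable(entry: Int, succ: Map[Int, List[Int]]): Set[Int] = {
    @annotation.tailrec
    def loop(work: List[Int], seen: Set[Int]): Set[Int] = work match {
      case Nil => seen
      case b :: rest =>
        // Only successors not seen so far need to be visited.
        val next = succ.getOrElse(b, Nil).filterNot(seen)
        loop(next ::: rest, seen ++ next)
    }
    loop(List(entry), Set(entry))
  }

  // Keep only the blocks that can be reached from the entry block.
  def eliminateDead(entry: Int, succ: Map[Int, List[Int]]): Map[Int, List[Int]] = {
    val live = reachable(entry, succ)
    succ.filter { case (block, _) => live(block) }
  }
}
```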
|
Is there a concrete reason for not trusting ASM? (E.g., evidence of bugs in this area?) We're not going to reject the PR that introduces the non-optimizing core of the new back-end because it generates slower code than the status quo. The simpler the PR, the likelier its success. Also, correctness and (evidence of) testing. Follow-up PRs can introduce optimizations. |
The code-patching approach in question is ASM's last-ditch effort to deal with a job left unfinished by the compiler. I can't recall an example where that code-patching approach fails. I haven't tested it either. The testing I've done is of GenBCode with DCE in place. I can't predict how GenBCode will work (say, compiling akka) without those features. |
|
Just to clarify the status of this PR: Done: Pending: Any items you might want to add / change? |
|
As far as I can see, the net impact of this PR on our test suite is 4 tests that now only run under "-neo:GenASM", so I guess they don't work under the new back-end. No new regression tests specific to the new back-end are introduced. I agree being able to bootstrap and compile akka-actors inspires confidence. As discussed, I'd like to see some evidence of investigation regarding the differences between the new back-end and the old, preferably as automated tests. One test I can imagine is to compile the same code under the old and the new back-end and compare their output. We have infrastructure for that: https://github.com/scala/scala/blob/master/src/partest/scala/tools/partest/BytecodeTest.scala And people are using it: 386a5bd, 69109c0, b50a0d8, 3f0224c, c8fbba0, 5f3cd86, e9f6511, d8ba6af, 13caa49, b2117cf, b47bb0fe1a, 494ba94518, 71ea3e8278 |
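The "compile under both back-ends and compare" idea could look roughly like the sketch below. It deliberately does not use the BytecodeTest API; instead it assumes, hypothetically, that each back-end's output has already been dumped into a map from class name to a textual instruction list (`Dump`), and it reports where the two dumps disagree.

```scala
// Hypothetical sketch of an old-vs-new back-end comparison. The Dump type
// and the diff function are illustrative; they are not part of partest.
object BackendDiff {
  // class name -> textual instructions emitted for that class
  type Dump = Map[String, List[String]]

  def diff(oldBackend: Dump, newBackend: Dump): List[String] = {
    val names = (oldBackend.keySet ++ newBackend.keySet).toList.sorted
    names.flatMap { name =>
      (oldBackend.get(name), newBackend.get(name)) match {
        case (Some(a), Some(b)) if a == b => Nil
        case (Some(_), Some(_))           => List(s"$name: instructions differ")
        case (Some(_), None)              => List(s"$name: missing from new back-end")
        case (None, Some(_))              => List(s"$name: missing from old back-end")
        case (None, None)                 => Nil
      }
    }
  }
}
```

An empty result would mean both back-ends emitted the same classes with the same instructions; expected differences (such as fewer jumps) would have to be whitelisted per test.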
|
@magarciaEPFL One area of the current backend that isn't particularly well tested is positions. Here's an example of one I fixed: #1754; it included a bug in the Do you consider this sort of bug to be high/medium/low risk under GenBCode? Are there any aspects of the new design that lend themselves to getting positions right? Should we invest in old-vs-new backend testing of this? |
|
The fix to SI-6288 mentions the root cause of the problem:
With GenBCode, whatever line number information appears in bytecode is the result of explicit invocations to Once we're dealing with ICode, all we can diagnose (as in the example) is that some JUMPs and their targets don't have the right positions. At the level of GenBCode we still have at our disposal the original Trees, whose structure still preserves lexical scoping, thus allowing more precise line numbers to be emitted.
That would have been definitely less time consuming with GenBCode! |
You can also see that (quoting from the commit message):
@adriaanm , as explained above, GenBCode is doing the right thing in all cases. |
|
@adriaanm , one of the tests you mentioned that you'd like GenBCode to pass is the following (quoting from the commit message). Are you saying the current behavior by GenASM is right? Could you elaborate, please? |
|
@magarciaEPFL: the |
|
My point was that you're not adding new tests (while we'd agreed there would be some kind of testing), but rather changing/disabling old ones. That tends to be an opportunity to learn something about the new or the old code. Maybe the old was to blame, maybe the new implementation. Maybe both. Regarding pos/t5031_3, it seems this should compile. What's the difference under GenBCode? |
|
Oh well, it's all my fault, I was using a better optimizer too soon, before it's been merged into trunk. This is the source program, aka
class Foo_1 {
  def foo(x: AnyRef): Int = {
    val bool = x == null
    if (x != null)
      1
    else
      0
  }
}
With GenASM and BTW, the textual representation above was obtained with With the new optimizer (just intra-method optimizations suffice) two useful things happen:
In other words, there's one less conditional jump than what BTW, the idea about "control flow that mirrors lexical structure" is not mine, I just read a paper about it and thought it was cool: The Program Structure Tree: Computing Control Regions in Linear Time (1994) |
|
@magarciaEPFL: Yeah, I was guessing that you were running DCE and eliminating that one unnecessary null check. The test is just an example and it looks like I wasn't paying full attention when I was writing it. We probably should refer to Also, is "control flow that mirrors lexical structure" relevant in this example? Do we get more concise bytecode as a result? I can attribute shorter bytecode you posted above only to DCE. |
Not directly because of it, but that structure improves readability; without hurting any optimizations nor increasing code size. Compare with the CFG that GenICode produces, which appears to make DCE too difficult for (useless because, whether the conditional jump is taken or not, |
|
Ok. Thanks for confirming that the second optimization changes only the order of blocks without collapsing anything. I understand everything now. |
Don't we need a lineNumber call here?
In case any questions remain, please let me know so that I can answer them. |
|
@magarciaEPFL: I finally had enough time to go through your code in more detail. I really like it overall. Could you please address questions/suggestions I made? I think at this point it will be better if you address those points by adding new commits (not amending/rebasing) so I can verify that problems got resolved. If you rebase/amend/squash commits then I have to review the whole code base from scratch again. That would be pleasant given the amount of code you are submitting :) Once my questions/suggestions are addressed we should merge this in my opinion. |
|
@gkossakowski , thanks for the review. Comments addressing the points raised by the review can be found below. The review comprises three areas:
To simplify the discussion, I've included candidate commits for the above areas for now in a separate branch. The candidate commits in question are listed again below for summary: The single commit for item 1 ("clarifications") just adds comments to the source code. The commits for item 3 are also well-focused, and correspond to a GenASM fix ( magarciaEPFL@334bc76 ) and a performance improvement ( magarciaEPFL@af6d1ec ). |
|
Item 2 ("phase assembly", ie what a. GenJVM only Now comes the question how to assemble phases in the age of "co-existence GenASM + GenBCode". There are two approaches: (i) and (ii) below. i. The approach I've followed so far, which scales all the way to the new optimizer with all its optimization levels; keeps phase assembly as it is now. Please notice that this "status quo" also includes phases that see no actual use for some runs (the optimizer phases). In this sense, the approach of having the GenICode phase (doing nothing) when using GenBCode as bytecode emitter is no different from the current situation. ii. A "more clean" approach consists in assembling only those phases that are going to be actually used. In case of using GenBCode, that means leaving out GenICode, the old optimizer, and GenASM. Other scenarios include using (GenICode + GenASM only), or using (GenICode + some-subset-of-optimization-phases + GenASM). Approach (ii) has a fair amount of ripple effects and additional complexity, as detailed below. In my view, that additional complexity (useful only during the age of "co-existence of GenASM + GenICode") isn't justified. Instead, if all goes well, once we get rid of GenICode + old optimizer + GenASM, then we can also simplify phase assembly. What is the additional complexity of approach (ii) ?
It's only five tests for the compiler, but it's safe to assume the IDE, sbt, other frameworks, are all showing phases in some of their tests. All things considered, my suggestion is to live during the transition period with what has worked so well under GenJVM, the transition to GenASM, and GenASM; and change phase assembly only after GenASM is gone (by which time there will be again no need for separate test sets depending on bytecode emitter). |
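To make approach (ii) concrete, here is a minimal sketch of "assembling only those phases that are going to be actually used". Everything in it is hypothetical: the `Emitter` type, the `assemble` function, and the phase labels are illustrative placeholders rather than scalac's actual phase names or phase-assembly code.

```scala
// Hypothetical sketch of approach (ii): the phase list depends on which
// bytecode emitter is selected. Phase labels are placeholders.
object PhaseAssembly {
  sealed trait Emitter
  case object GenASM extends Emitter
  case object GenBCode extends Emitter

  // Phases shared by every configuration (heavily abbreviated).
  val frontEnd: List[String] = List("parser", "typer", "cleanup")

  def backEnd(emitter: Emitter, optimize: Boolean): List[String] = emitter match {
    // The new emitter consumes trees directly: no ICode, no old optimizer.
    case GenBCode => List("bcode-emitter")
    case GenASM =>
      val opt = if (optimize) List("old-optimizer") else Nil
      List("icode") ::: opt ::: List("asm-emitter")
  }

  def assemble(emitter: Emitter, optimize: Boolean): List[String] =
    frontEnd ::: backEnd(emitter, optimize)
}
```

The ripple effect described above follows directly: any test or tool that prints the assembled phase list would see a different list per emitter.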
This could be handled by adding support for it as partest filters ... whether it scales is a different question, though. |
|
Regarding SI-5604, the defensive code that used to rescue compilation has been made "unreachable" after the fix for the root cause (fix in magarciaEPFL@e6f10b0 ) as pointed out by @gkossakowski . In order to test the waters without that defensive code, commit magarciaEPFL@2cba077 turns that defensive code into an Running the nightly (for each of GenASM, GenBCode) doesn't trigger the It seems clearer to me to merge this PR as is (defensive code for SI-5604 included) and only afterwards rephrase that defensive code (as in magarciaEPFL@2cba077). @gkossakowski , if you agree then I'll open a ticket (and assign it to myself) to keep track of this. |
|
Hi Miguel,
That age will span at least 2 years because even if we switch to the new backend in 2.12 we won't be able to remove the old one immediately but only deprecate it. Therefore the effort to clean up the phase assembly is worth it. The tests dumping all phases are brittle by definition and should be fixed. I don't think either Sbt or IDE will break because of the changed phases. Also, by default the assembled phases will stay intact for now. I understand that it might feel unfortunate that you have to clean up code that is not really related to the new backend but that's reality: whenever you do something significant you discover some unexpected problems that you have to fix. Also, please push magarciaEPFL/scala@e8d1612, magarciaEPFL/scala@af6d1ec and magarciaEPFL/scala@334bc76 to this branch. I think all of them look fine and should be included as they address some of my review questions. To sum it up, I think the phase situation is the last remaining todo item on your list. Once this is addressed I think we should merge this PR. |
Sounds good to me. |
|
The ticket for the fallout from SI-5604 is https://issues.scala-lang.org/browse/SI-7621 . I'll fix that after this PR is merged. |
SI-7151 Emit final in bytecode for final inner classes was fixed in scala@b49b6cf
|
Actually, I'll fix the phase business. I created the ticket for that and I assigned it to myself: https://issues.scala-lang.org/browse/SI-7622 Please push magarciaEPFL/scala@e8d1612, magarciaEPFL/scala@af6d1ec and magarciaEPFL/scala@334bc76 to this PR and we are ready to go! :-) |
This commit brings GenBCode in line with "Eliminate needless Options" as promoted by scala@45d6177
|
As requested, I've added commits. Please notice magarciaEPFL@ed7f488 will compile only against |
|
Let's! |
|
+1 |
|
Yay! 🍰 |
|
Good work! 👏 |
|
Yes, thank you @magarciaEPFL and all reviewers. Good stuff! |
|
Here's a 🍰 from user-land. Great work all! |
@magarciaEPFL Could you please elaborate on the interaction between BCode and the name table? I find it somewhat disturbing that chrs is not a private member of Names and that we allow outsiders like this code to peer into it directly.
@retronym In the new backend (ie bytecode emitter + optimizer, GenBCode + BCodeOpt for short) all uses of Names' chrs are read-only.
Besides that use, the new backend enters via newTypeName the JVM-level internal name of a bytecode-level type (eg java/lang/Object or [Ljava/lang/String; ). This in turn is done for interning: the resulting index into chrs is in one-to-one correspondence with the "bytecode-level type" in question (a BType instance represents one such "bytecode-level type").
In turn, having one-to-one indices allows BType to become a value class (that's part of upcoming commits, not yet merged from http://magarciaepfl.github.io/scala/ ).
If not Names, any other table providing interning would do. The current write-use of Names (via newTypeName) by the new backend is single-threaded (there's only one typer-dependent thread in all of GenBCode + BCodeOpt, and newTypeName is only invoked from there).
Both GenBCode and BCodeOpt perform concurrent read-only accesses to Names' chrs.
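The interning argument can be sketched in isolation: if every distinct internal name maps to exactly one index, a type can be represented by that index alone, wrapped in a value class. The sketch below is hypothetical (`InternTable` and `TypeRef` are illustrative names, not the compiler's name table or BType) and, like the write path described above, it assumes interning happens from a single thread.

```scala
// Hypothetical sketch of interning with one-to-one indices. Not scalac's
// name table; writes are assumed to come from a single thread.
object Interning {
  // A value class over the index: no wrapper object is needed at runtime
  // in most usages, and equality is equality of indices.
  final class TypeRef(val index: Int) extends AnyVal

  final class InternTable {
    private val indexOf = scala.collection.mutable.HashMap.empty[String, Int]
    private val names   = scala.collection.mutable.ArrayBuffer.empty[String]

    // Returns the same index every time the same internal name is entered.
    def intern(internalName: String): TypeRef = {
      val idx = indexOf.getOrElseUpdate(internalName, {
        names += internalName
        names.size - 1
      })
      new TypeRef(idx)
    }

    // Read-only lookup from index back to the internal name.
    def nameOf(ref: TypeRef): String = names(ref.index)
  }
}
```

Interning `java/lang/Object` twice yields the same index, so comparing two types reduces to comparing two integers.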
This PR re-submission supersedes #2603 and takes into account review comments so far.
Compared to the previous PR, pipelining is left for a future PR (same goes for "built-in" dead-code elimination and jumps-chain collapsing). This PR includes just the bytecode-emitter.
Let's see what dbuild thinks about GenBCode (after changing in ScalaSettings the default for the neo option from GenASM, as in this PR, to GenBCode).
review by @gkossakowski or @JamesIry or @retronym or @paulp