-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backend Refactoring #6012
Backend Refactoring #6012
Conversation
Removes the three work queues in the backend. Splits up the backend in two main components - CodeGen, which has a Global - PostProcessor, which has a BTypes (but no Global) CodeGen generates asm.ClassNodes and stores them in postProcessor.generatedClasses. The code generator is invoketd through BCodePhase.apply. The postProcessor then runs the optimizer, computes the InnerClass table and adds the lambdaDeserialize method if necessary. It finally serializes the classes into a byte array and writes them to disk. The implementation of classfile writing still depends on Global. It is passed in as an argument to the postProcessor. A later commit will move it to a context without Global and make it thread-safe.
36a5cbd
to
cd861fd
Compare
BTypes is the component that's shared between CodeGen and PostProcessor.
Remove implicit conversion from LazyVar[T] to T
d468f87
to
9dd1933
Compare
This is ready for review, I'm not planning to add more right now. There's more that could be done, but that doesn't interfere with @mkeskells's plans started in #5815. Review by @retronym. Start looking at The The I started a benchmark for the last commit, should be coming in here: https://scala-ci.typesafe.com/grafana/dashboard/db/pr-validation?var-scalaSha=6ac6da8b61%7C58cfe9e76c&var-source=All&var-bench=HotScalacBenchmark.compile&var-host=scalabench@scalabench@&from=1500736130730&to=1502458163200 Some notes for the TODOs on making local optimizations / classfile writing work in parallel:
Data structures to make concurrent
|
Performance is flat, build is green :) |
The jardiff is non-empty (https://gist.github.com/lrytz/2079b2f9f933da06a3a55beb48c7b5fa): classes with a By the way, @mkeskells, if you don't want to wait until this PR is merged you can also start working on top of my branch (https://github.com/lrytz/scala/tree/backendRefactor). |
import bTypes._ | ||
import callGraph.ClosureInstantiation | ||
import coreBTypes._ | ||
import frontendAccess.compilerSettings | ||
|
||
// unused objects created by these constructors are eliminated by pushPop | ||
private[this] lazy val sideEffectFreeConstructors: LazyVar[Set[(String, String)]] = perRunLazy { | ||
private[this] lazy val sideEffectFreeConstructors: LazyVar[Set[(String, String)]] = perRunLazy(this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a good idea for this to be both a lazy val
and a LazyVar
? The implementation of PerRunInit
is not currently threasafe (no sync around mutation/access of inits
), so I guess the invariant is that we expect that sideEffectFreeConstructors
will have been called prior to initialize
and any multi-threading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, the blend of locks based on lazy vals and frontEndLock
needs to be considered carefully for potential deadlocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's probably too many implicit invariants, but here they are:
initialize
is called in each compiler run at the beginning when the GenBCode phase starts running. There's no multithreading at this point.- I kept the values here as lazy vals so that only those actually accessed are ever added to the
PerRunInit.this.inits
buffer (and therefore re-initializes in the next run). - I think there can't be a deadlock here. Initializing a lazy val here will create the
LazyVar
and add it to thethis.inits
buffer, and this happens synchronized onthis
. - Completing a
LazyVar
synchronizes on thefrontendLock
.
trait PerRunInit { | ||
private val inits = ListBuffer.empty[() => Unit] | ||
|
||
def perRunInit(init: => Unit): Unit = inits += (() => init) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it an error to call this after initialize
has been called? Can we add an assertion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking was that it's OK to call this after initialize
, like the lazy vals in CoreBTypes
.
|
||
def perRunInit(init: => Unit): Unit = inits += (() => init) | ||
|
||
def initialize(): Unit = inits.foreach(_.apply()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we discard the closures in init
after forcing them, perhaps to allow some references to be GC-ed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to keep them around to re-initialize the LazyVar
s in the beginning of the next run.
(More) data structures to make concurrent:
|
plain: SubItem3, | ||
bean: SubItem3, | ||
outFolder: scala.tools.nsc.io.AbstractFile) { | ||
postProcessor.postProcessAndSendToDisk() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The architecture of generating all the ClassNodes
in memory is well suited to compiling with global optimization enabled, but for non-optimized compile runs imposes an unnecessary memory burden. In that mode, I'd like to see code gen and classfile writing happen within the loop over compilation units, and make sure that we release the ClassNode
instances in between.
I've sketched out what I'd like in a commit in https://github.com/scala/scala/compare/2.12.x...retronym:review/6012?expand=1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good suggestion. We can also run local optimizations in this mode. Can we do that in a follow-up PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, so long as we don't forget before 2.12.4.
I'd prefer to have explicit types for the component wiring, as in https://github.com/scala/scala/compare/2.12.x...retronym:review/6012?expand=1 |
Could you please update the PR description with the important parts of your comment above and the motivation for the change? |
It's read-only after code gen, but maybe that's not safe for a ListBuffer? Probably better be safe.
I think in a first iteration the inliner will only run single-threaded, so these two should be OK. Inlining in parallel requires synchronization. |
Will take in your explicit type annotations and update the PR descriptions, probably also improve the commit messages. |
Depending how we implement interspersed code gen and postprocessing we might need it. Probably best to encapsulate access to it rather than exposing the buffer and synchronize defensively. |
/synch |
In 2.12.3, if `$deserializeLambda$` has the only reference to `MethodHandles$Lookup`, the corresponding entry in the InnerClass table is missing due to an ordering issue (the method is added only after the inner classes are visited). This was fixed in the recent refactoring.
Added a test: 5f83d78 In 2.12.3, we do it in the wrong order (https://github.com/scala/scala/blob/v2.12.3/src/compiler/scala/tools/nsc/backend/jvm/GenBCode.scala#L268-L271) |
cf6adac
to
a4842aa
Compare
This PR ports JVM backend refactor from Scala 2 as part of the #14912 thread. It squashes changes based on the PRs: - scala/scala#6012 - scala/scala#6057 The last refactor introducing backend parallelism scala/scala#6124 is left for later.
Isolate backend components to run parallel from the compiler frontend
Refactor the compiler backend to isolate components that are planned to
run in parallel (local optimizations, classfile writing) from the
compiler frontend (i.e., no access to the Global instance).
The long inheritance stack (
BCodeIdiomatic
-BCodeHelpers
- ... -BCodeSyncAndTry
) is now isolated and simply a component of the newCodeGen
class.CodeGen
itself is one component ofGenBCode
, thebackend class.
GenBCode
has some other components-
BTypes
, used inCodeGen
andPostProcessor
. Many of the backendcomponents that used to be in
BTypes
(inliner
,inlinerHeuristics
,localOpt
) now moved to thePostProcessor
.PostProcessor
implements global optimization (inlining), localoptimizations and classfile writing. It has itself quite a number
of components (
inliner
,callGraph
,classfileWriter
, ...)PostProcessorFrontendAccess
is an interface of functionalityrequired in the
PostProcessor
that is implemented by accessingthe frontend (
Global
). It synchronizes on the singlefrontendLock
object.
The
PostProcessor
doesn't have aGlobal
. It can access certain operations that depend on the frontend throughPostProcessorFrontendAccess
, which synchronizes on thefrontendLock
. Also the types inCoreBTypes
are computed lazily by looking at symbols, so access is synchronized on the same lock.The
BTypes
component is passed intoCodeGen
and thePostProcessor
. Note thatGodeGen
has aglobal
, so it can (and does) access various parts of thePostProcessor
throughglobal.genBCode.postProcessor
, for example it updatesgenBCode.postProcessor.callGraph.inlineAnnotatedCallsites
during code gen.Quite a few cleanups all around.