Expose hooks to instrument compiler phases #340

jvican · 2017-03-21T20:47:27Z

The problem

Scalac compiler phases carry out several type and tree transformations at once.
These transformations are aggressive and can be performed in parts of the
pipeline that are not related to its original purpose.

An example of a Scalac phase that produces lots of transformations is typer,
which aside from typing trees it performs an extensive array of transformations
such as desugarings, constant folding, and default/named arguments handling.

Projects that wish to control these transformations, like Scala Meta and Zinc,
have to find tricks to work around them. Scala Meta fallbacks to the only solution at hand: it hijacks analyzer using reflection, which means
reimplementing several typer methods by copy-pasting code from scalac, and then
enriching it with the desired operations. Here's the code.

There are two problems with this approach:

It's painful for the maintainers to keep up with new changes in Typers.
Maintenance is hard.
It's not composable -- Zinc would not be able to instrument typer in projects
that are using the Scala Meta compiler plugin and viceversa.

Why the instrumentation?

It happens that Scala Meta and Zinc need concrete information happened during
Typer, but we may want to envision this need along the whole compiler
pipeline. On the one hand, Zinc wants to know when constant folding has occured
-- currently we miss changes to final vals because they are constant folded
by Typer. This means that Zinc cannot analyze the use sites of these
constants. On the other hand, Scala Meta performs tasks along the same lines,
it does "remember" constant folding and several other tree transformations.

We may think these use cases could be facilitated on a one-by-one basis. For
some of these cases, this strategy may be a good idea but it has the following
disadvantages:

It requires a lot of time to implement and merge it in Scalac -- as expected
in a production-ready compiler.
The implemented fix may be rejected due to, for instance, performance
regressions.
The changes are only available to concrete versions of the compiler. The tool
won't work for versions prior to that change.

Searching for a solution

The rationale for the status quo is performance. The less passes the compiler
does, the better -- memory locality is better exploited. Dotty proves that
there can a more principled division of tree transformation along phases while
still exploiting memory locality (via miniphases), but this requires
fundamental changes to the compiler and it's another story.

I predict that the accumulation of type transformations in Scalac phases is
not likely to change any time soon -- it would require major changes to the
compiler and countless man-hours.

Therefore, I'd like to explore potential solutions to this problem that do not
focus on fixing the original problem. The solution I'm aiming for should:

Expose generic hooks to any tool that wishes to instrument the compiler.
Be conceptually simpler -- requiring no reflection at all.
Be easy to maintain -- no copy-pasting nor necessity of keeping up with main
changes to the compiler.

Below I explained one solution with the previous properties. I'm open to
reconsider other options that the Scalac team considers better.

Hook: allow tools to set their own tree copier

The tree copier is the class responsible of copying trees if they have changed.
It's a nice abstraction that represents tree transformations. For instance,
Zinc needs to instrument adaptConstant to handle constant folding. If we
allow tools to wrap the existing tree copy class and override solely the
methods the use case requires, Zinc would do something like:

class WrappedTreeCopier(treeCopier: TreeCopier) extends TreeCopier {
  // REDACTED: Forward all the operations to `treeCopier`
  ...

  // Override the `Literal` method that `adaptConstant` invokes
  def Literal(tree0: Tree, value: Literal): Literal = {
    val tree = treeCopier.Literal(tree0, value)
    // Add attachment to tree so that it can be queried later down the road
    val enrichedTree = ???
    enrichedTree
  }
}

If Scalac provides the WrappedTreeCopier, even better: tools won't need to
forward the entirety of the TreeCopier operations and would just subclass it
and override the bits they are interested in.

Overriding the tree copier limits the amount of things that the hooks can do:
instrumented code cannot access the context of these transformations, only the
final product: the old and new tree. I think this situation strikes a good
balance between customization and reasonable behaviour since they are not
intrusive by design.

I think that overriding treeCopy in Traversers and global could be
beneficial and cover at least all the use cases of Zinc (at the moment, we only
know constant folding, but there may be more use cases in the future). I'll let
@xeno-by clarify whether Scala Meta could jump directly to this solution or if
there's another scenario that would not be covered by this proposal.

Note that existing solutions like creating a phase before typer or setting the
tree copier via reflection are far from ideal. The first one is not possible if
you wish to keep track of changes between analyzer phases (namer,
packageobjects and typer). The second one may only be possible with var
handles, that are only available in JDK 9.

The text was updated successfully, but these errors were encountered:

jvican · 2017-03-22T12:11:41Z

Just for clarity, this proposal does not try to make Scala Meta not to hijack the typer. Certainly all the macro infrastructure needs it. It only targets use cases for Zinc and the Scalahost analyzer, which are only interested in keeping track of the tree transformations.

My point with this proposal is that exposing such hooks will allow Zinc to use them instead of hijacking the typer and conflicting with Scala Meta's. I propose the hook because this is a pretty common case for tools levitating around the compiler and I think other tools could benefit from it down the road.

xeno-by · 2017-03-22T14:08:07Z

@jvican Thanks for starting the discussion! Indeed, for the scala.meta project it would be nice if there were public facilities to instrument parts of the typechecker.

Speaking of the tree copier solution, the question is how the interested parties are going to register their interest in particular parts of the typechecker. E.g. how would the scala.meta semantic engine let the typer know that it'd like to instrument adaptConstant?

jvican · 2017-03-22T14:30:04Z

This is the instrumentation process I envision:

Subclass the WrappedTreeCopier that I mentioned in the description and override the tree node transformations we're interested in.
Set the customized tree copier to the Global treeCopy. This functionality would be exposed by scalac in both Global and Traverser via a setTreeCopy method or a setter. This method would be the only required change to make this work.

If we wish to access the tree copy of a concrete traverser, we can get a reference to the traverser through reflection and set the tree copier.

Note that this solution is composable; you can wrap the underlying tree copier with WrappedTreeCopier as many times as desired -- WrappedTreeCopier always forwards to the methods of the underlying copier.

That being said, though, typer will not know it's been instrumented. This is the neat thing about my proposed solution -- Typer doesn't have to be changed. Are you interested in adaptConstant? Just override it in the wrapped tree copier and set it.

Does this sound reasonable?

DarkDimius · 2017-03-22T14:36:26Z

I know this is specific for scalac, but I'll give my feedback on the proposal from Dotty perspective. I assume most of it applies to both compilers.

You don't have guarantees that tree copier will be used to reconstruct all parts of the tree. Tree copier is an optimization that allows to reuse the tree if it's not changed and no copying is needed. If you know for sure that tree is changed, nothing stops you from not using tree copier.
Tree copier is one of biggest hot-spots in the compiler. I'd be very careful when exposing it as it's a very easy way to slow-down the entire compiler.
Some parts of compiler rely on reference equality on trees. Exposing tree-copier to outside may break assumptions about parts of compiler.

DarkDimius · 2017-03-22T14:40:29Z

For Dotty, a more reliable and already existing place to intercept for you would be Type Assigner.
You already can provide a custom one. You may consider an alternative to split Typer in scalac into Typer & TypeAssigner, as already done in Dotty.

jvican · 2017-03-22T14:45:59Z

You don't have guarantees that tree copier will be used to reconstruct all parts of the tree. Tree copier is an optimization that allows to reuse the tree if it's not changed and no copying is needed. If you know for sure that tree is changed, nothing stops you from not using tree copier.

Fair point. However, I think it's a convention to use tree copier. It would be interesting to see if Typer is circumventing the use of treeCopy at some point. In any case, I think it would be interesting to force every desugaring to go through treeCopy. It does not sound like a harsh requirement.

Tree copier is one of biggest hot-spots in the compiler. I'd be very careful when exposing it as it's a very easy way to slow-down the entire compiler.

This is true. I'd like to see some benchmarks to see how it performs. It's not clear that an extra method invocation can slow it down, though it's theoretically possible.

Some parts of compiler rely on reference equality on trees. Exposing tree-copier to outside may break assumptions about parts of compiler.

IMO, wrapped tree copiers should:

Be idempotent.
Not touch the tree structure of the transformed tree. The only thing that tree copy methods can do is to attach information to trees, but not change referential equality.

We can enforce these properties. Those who don't abide by it are the responsible of the misbehaviour of the compiler, in the same way that macro authors and compiler plugin authors are responsible of the misuse of compiler APIs.

For Dotty, an easier and already existing place to intercept for you would be Type Assigner.

I would welcome this change. But this would take a lot of time to implement. The strength of my proposal is that it can be implemented in an "afternoon" and shipped in the next Scalac release.

adriaanm · 2017-03-25T16:54:43Z

For performance and maintainability, I'd much rather see these extension points designed bottom-up: what's the precise use case, what's need in addition to the current plugin mechanism, how do we ensure this doesn't slow down the compiler when not used (JITs are prickly beasts, it's hard to predict how they will react to an additional method call), how do we make sure it's used correctly,...

The bigger the API surface, the more we'll have to (at least) explain to users how to use it correctly. It's easy to say it's their responsibility, but that's not how it works in reality.

I doubt any extension mechanism can be implemented in an afternoon and be maintainable (code to keep correct/bug reports to answer/...), acceptable for performance,...

@lrytz

Adds an original tree attachment that allows external tools (compiler plugins like scalameta & zinc) to keep track of the previous, unadapted tree. The reason why this is required is explained in SD-340: scala/scala-dev#340. This change enables the incremental compiler to detect changes in `final val`s: sbt/zinc#227. It also enables a fix for scala/bug#10426 by allowing the scaladoc compiler to let the compiler adapt literals without losing the tree that will be shown in the Scaladoc UI. To maintainers: I was thinking of the best way to test this, but couldn't come up with an elegant one. Do you suggest a way I could write a test for this? Is there a precedent in testing information carried in the trees? I think @lrytz is the right person to review this, since he suggested this fix in the first place.

@lrytz

Adds an original tree attachment that allows external tools (compiler plugins like scalameta & zinc) to keep track of the previous, unadapted tree. The reason why this is required is explained in SD-340: scala/scala-dev#340. This change enables the incremental compiler to detect changes in `final val`s: sbt/zinc#227. It also enables a fix for scala/bug#10426 by allowing the scaladoc compiler to let the compiler adapt literals without losing the tree that will be shown in the Scaladoc UI. To maintainers: I was thinking of the best way to test this, but couldn't come up with an elegant one. Do you suggest a way I could write a test for this? Is there a precedent in testing information carried in the trees? I think @lrytz is the right person to review this, since he suggested this fix in the first place.

@lrytz

Adds an original tree attachment that allows external tools (compiler plugins like scalameta & zinc) to keep track of the previous, unadapted tree. The reason why this is required is explained in SD-340: scala/scala-dev#340. This change enables the incremental compiler to detect changes in `final val`s: sbt/zinc#227. It also enables a fix for scala/bug#10426 by allowing the scaladoc compiler to let the compiler adapt literals without losing the tree that will be shown in the Scaladoc UI. To maintainers: I was thinking of the best way to test this, but couldn't come up with an elegant one. Do you suggest a way I could write a test for this? Is there a precedent in testing information carried in the trees? I think @lrytz is the right person to review this, since he suggested this fix in the first place. Fixes scala/bug#7173.

@lrytz

Adds an original tree attachment that allows external tools (compiler plugins like scalameta & zinc) to keep track of the previous, unadapted tree. The reason why this is required is explained in SD-340: scala/scala-dev#340. This change enables the incremental compiler to detect changes in `final val`s: sbt/zinc#227. It also enables a fix for scala/bug#10426 by allowing the scaladoc compiler to let the compiler adapt literals without losing the tree that will be shown in the Scaladoc UI. To maintainers: I was thinking of the best way to test this, but couldn't come up with an elegant one. Do you suggest a way I could write a test for this? Is there a precedent in testing information carried in the trees? I think @lrytz is the right person to review this, since he suggested this fix in the first place. Fixes scala/bug#7173.

SethTisue · 2019-01-09T03:14:35Z

@jvican should this remain open?

SethTisue · 2019-01-09T03:14:50Z

I'm guessing not, reopen if you disagree

jvican mentioned this issue Mar 30, 2017

Zinc doesn't handle constants (final val) correctly sbt/zinc#227

Closed

jvican mentioned this issue Jul 29, 2017

annotation argument needs to be a constant; found: Bar.constant scala/bug#10426

Closed

jvican mentioned this issue Jul 29, 2017

Add original tree attachment scala/scala#6013

Merged

SethTisue closed this as completed Jan 9, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose hooks to instrument compiler phases #340

Expose hooks to instrument compiler phases #340

jvican commented Mar 21, 2017 •

edited

jvican commented Mar 22, 2017

xeno-by commented Mar 22, 2017

jvican commented Mar 22, 2017 •

edited

DarkDimius commented Mar 22, 2017

DarkDimius commented Mar 22, 2017 •

edited

jvican commented Mar 22, 2017

adriaanm commented Mar 25, 2017

SethTisue commented Jan 9, 2019

SethTisue commented Jan 9, 2019

Expose hooks to instrument compiler phases #340

Expose hooks to instrument compiler phases #340

Comments

jvican commented Mar 21, 2017 • edited

The problem

Why the instrumentation?

Searching for a solution

Hook: allow tools to set their own tree copier

jvican commented Mar 22, 2017

xeno-by commented Mar 22, 2017

jvican commented Mar 22, 2017 • edited

DarkDimius commented Mar 22, 2017

DarkDimius commented Mar 22, 2017 • edited

jvican commented Mar 22, 2017

adriaanm commented Mar 25, 2017

SethTisue commented Jan 9, 2019

SethTisue commented Jan 9, 2019

jvican commented Mar 21, 2017 •

edited

jvican commented Mar 22, 2017 •

edited

DarkDimius commented Mar 22, 2017 •

edited