Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make incremental compilation artifacts reusable (aka cached compilation) #2

Merged
merged 30 commits into from
Mar 17, 2017

Conversation

romanowski
Copy link
Owner

Goal is achieved by providing custom mappers to TextAnalysisFormat.

Based on client can create CacheProviders that using given mappers can migrate incremental compilation artifacts between workspaces.

This is can be done in two steps:

  • export cache to intermediate format (e.g. map path as a relative to abstract roots)
  • import cache from intermediate format (e.g. replace abstract roots with concrete values).

Zinc is not responsible for maintaining caches (e.g. there is no verification if correct cache provider is used) or searching caches for build, this is client responsibility (zinc only provides utilities).

This PR also include two basic cache implementation based on making all paths relative to project root. They are used to test whole mechanism internally and are base for any custom caches.

  • project rebased cache (copy artifacts and classes from different project)
  • exportable cache (first export cache as pair or zipped classes and incremental compilation metadata) that can be later imported to another workspace

Internally, we are just about to beta tests this with huge scala codebase (way over 100K lines of scala code). Results so far are promising:

  • currently we can observe ~ 4-5 times improvement on full build (when relatively close to cache)
  • most of time is spent on extracting zips with classfiles and with few fixes we should get speedups around x10 times.
    Of course workspace after cache import behaves as it was compiled locally.

Our integration is not open source but in future (hopefully close) we will release at least part of it to public domain.

I am working on open source plugins for sbt (both 0.13.x and 1.0.x versions) based on that PR (deep WIP so far): https://github.com/romanowski/hoarder

For certain reasons this time I have to paste the below disclaimer:
THIS PROGRAM IS SUBJECT TO THE TERMS OF THE BSD 3-CLAUSE LICENSE.

THE FOLLOWING DISCLAIMER APPLIES TO ALL SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE:
THIS SOFTWARE IS LICENSED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OF NON-INFRINGEMENT, ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THIS SOFTWARE MAY BE REDISTRIBUTED TO OTHERS ONLY BY EFFECTIVELY USING THIS OR
ANOTHER EQUIVALENT DISCLAIMER IN ADDITION TO ANY OTHER REQUIRED LICENSE TERMS.
ONLY THE SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE, IF ANY, THAT ARE ATTACHED TO (OR OTHERWISE ACCOMPANY) THIS SUBMISSION (AND ORDINARY
COURSE CONTRIBUTIONS OF FUTURES PATCHES THERETO) ARE TO BE CONSIDERED A CONTRIBUTION. NO OTHER SOFTWARE CODE OR MATERIALS ARE A CONTRIBUTION.

They are not relevant and their effect on performance is minimal, but
it's good to have them for consistency. We do check for `NoSymbol` as
well as `null` when it comes to symbols. It feels weird if we don't for
types.
Add more tests aside from local-class-inheritance to check that
declaring anonymous clases and local classes in Scala that depend on
Java artifacts works (even though the anonymous class is desugared into
a local class by Scala).

This just gives us a wider safety net, the only test that was checking
this key functionality was 'local-class-inheritance'.
Java local, inner and anonymous classes are not handled correctly by the
current algorithm. When they extend Scala classes and those are changed,
Zinc does not recompile the dependent Java code. See
sbt#192.

These changes introduce the first step to support recompilation of Scala
sources when Java classes depend on them via inheritance. Before, even if we
had member references to Scala class files, the invalidation logic would
not recompile Java sources. There needs to be an inheritance relation
for that to happen.

The approach taken in this commit fixes this issue, but in a non-ideal
way. Currently, the dependency is tracked in the enclosing class of the
local class as an inheritance dependency. This overcompiles sources in
several scenarios -- if the Java enclosing class is extended by other
Java classes, those will be recompiled as well because the invalidation
logic assigns this contract to inheritance dependencies.

What we want to express -- a local inheritance dependency -- was added in
gkossakowski/sbt@47416b3.
Local dependencies are special cased by the invalidation logic so that
subclasses of the enclosing class are not recompiled.

For now, we stay with this solution. In the upcoming commit, this normal
inheritance dependency will be turned into local and some of the new
logic in `Analyze` will be removed, `productsToSource` already maps
local class files to the sources of their enclosing classes.
The previous change added a global inheritance relationship between the
enclosing class of the local Java classes and the Scala sources. As we
discussed in scalacenter@f022f91,
this is far from ideal since Zinc overcompiles subclasses of the
enclosing class.

This commit turns the previous inheritance dependencies into local
dependencies, and therefore compile only the enclosing class but not
their subclasses.

As you see, classes that cannot be mapped to a source are supposed to be
either local classes (that are therefore mapped to the source of their
enclosing class) or stale classes that are present in the output
directory. In the latter case, we ignore such classes and not process
them.
This commit rewrites a part of `processDependency` to remove the
`return` clause and use become clearer to read. It also addresses
@romanowski's suggestions.
This commit is done separately in case maintainers prefer to remove it.
It renames a variables that was poorly named but also changes the git
history. I think this change is justified but I would understand
arguments against it, since it's only an aesthetic change.
eed3si9n and others added 23 commits February 3, 2017 23:43
Fix sbt#192: Handle local dependencies for Java classes
Compactify was never marked as a passing test after we got a regression
in sbt/sbt#1553.

For some reason, the scripted setup was not recognizing tasks defined
locally in the `build.sbt`, and so we add this functionality to
`IncHandler` which is the responsible of mapping task keys to their
actual implementation in sbt-scripted.

Aside from moving this functionality around and marking the test as
passing, I took the libery to modify the name of `output-empty` for
something clearer in spirit: `checkNoClassFiles`.
Mark compactify as a passing test
These helpers are only used in `Dependency` for now, but I feel that
they could become handy in other Zinc scalac phases and remove the
overplaty way of guarding against inexistent or useless symbols and
types.
Instead of gathering all symbols in a set and using `foreach`, reuse a
foreach symbol traverser that will go through all the symbols of a type
and execute an operation on them.

This finishes up similar work done by @romanowski. Apparently, this
instance of `symbolsInType` was forgotten. As this function is not
useful anymore, it is removed for the posterity.
The name `foreachSymbolInType` is misleading in the sense that gives the
impression to the reader that a certain operation will be applied to
every type symbol in a type, whereas it only does so for symbol that
don't correspond to packages.

This commit renames this method to a longer but more concise version of
this function. I haven't skimped on the length of the name because, IMO,
helpers should prefer correctness over shortness.
This commit is an aesthetic change to the code of `ExtractUsedNames`
in the following ways:

* It clarifies the purpose of the `termSymbol` attached to `Import`
  nodes and removes (now) unnecessary comments from the source.
* It adds curly braces around a long method for readability purposes.
* It removes `tpnme` which was introduced for 2.8 Scala support. Zinc
  does not anymore offer support for 2.8.x and 2.9.x, hence the removal.
* It moves `emptyName` to `GlobalHelpers` under `isEmptyName` and instead
  of guarding against `NoSymbol` it reuses the helper `ignoredSymbol`.
`SafeLazy` has been traditionally implemented in `zincApiInfo` because
it is part of the sbt API and is accessible to all the subprojects that
depend on it.

Before this commit, `SafeLazy` was a runtime dependency (using
reflection) of the compiler bridge. In this regard, Zinc was assuming
that the sbt API was accessible at runtime and therefore invoked it to
use an implementation of lazy that would remove references to the thunks
once they've been forced. This was done to free memory as soon as
possible since those thunks usually depend on classes of compiler
internals and would not be GC'ed otherwise.

However, the compiler bridge is not supposed to depend on sbt APIs since
its code is compiled by the Scala compiler that the user picks in SBT.
Its only dependency is the compiler interface, which is implemented in
Java and compiled beforehand with javac.

This commit removes the runtime dependency of the compiler bridge to the
sbt API and avoids the method invocations using reflection. This was
done for the following reasons:

* Simplicity. It is not obvious why `SafeLazy` is invoked reflectively.
  See sbt#113.
* Performance. Even though the JVM should make this simple use of
  reflection fast, there's a very small overhead of using reflection in
  the compiler bridge because `lzy` is (most likely) hot.

The fix consists of a Java implementation of `SafeLazy` that uses the
non-thread-safe lazy val implementation described [here](http://docs.scala-lang.org/sips/pending/improved-lazy-val-initialization.html).
It is complemented with a proxy written in Scala that will create an
indirection layer for things like by-name and strict evaluation.
This implementation of lazy val assumes that `SafeLazy` will never be
called asynchronously. If this is the case, it's up to the Zinc
maintainer to make sure that safe publishing is implemented at the
call-site or to change the implementation to avoid races and
uninitialized fields.
Future error reporting would be more sophisticated because ideally we
want to report concise error messages with good contextual information.

This commit takes the first step by putting common error messages in
an object `Feedback` stored into `GlobalHelpers`. Temporary error
messages have not been added since they will be removed in future
commits (things like `super` not being handled in `ExtractAPI`, for instance).
Let's make sure this class is not extended in future versions by
inheritance.
Fix sbt#113 and apply several other fixes
Fix scala checks for zinc-persist and add test for mapped Analysis.
Create cache aware store that can load Analysis from given cache provider.

Implement simplest (based on project 'rebase') cache.
Create intergration test for mechanism above.
Exportable cache can be exported as pair: zipped classfiles, zipped analysis.
Exportable cache exports paths as relative paths to project location.
Add cache verifier. Add classpath mappers. Add mapper for whole MiniSetup after setup os loaded.
Fixes small problems with dependencies phase (e.g. reduce numbers of NoSymbol checked)  and do treat refinement class as top-level class (since it does not have runtime representation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants