
Target-level incremental re-building when build.sc changes #2348

Closed
lihaoyi opened this issue Mar 8, 2023 · 15 comments

@lihaoyi
Member

lihaoyi commented Mar 8, 2023

The goal of this ticket would be to make it so that changing build.sc code does not aggressively invalidate all target caches, and instead can invalidate only the ones downstream of the code change. This would make Mill more suitable for larger projects and monorepos, where invalidating the whole world becomes an increasingly expensive operation as the size of the codebase grows.

This global invalidation is somewhat mitigated by Zinc's incremental compilation, but that only applies to Scala targets and not any of the other things Mill builds. Duplicating it for all target types, even ad-hoc user-defined targets, is probably unfeasible.

This is probably not as difficult as it seems:

  1. SBT already has this functionality for re-running tests on changes, making use of the analysis files generated by the Zinc incremental compiler.

  2. The com.lihaoyi:::acyclic plugin takes another approach, but also involves dumping the fine-grained code-level dependency graph during compilation.

It is unlikely we will be able to catch all dependencies, e.g. if someone is mucking around with shared mutable state we can't see that, but that's already a limitation of the current Mill approach. Assuming the code lacks shared mutable state, we should be able to invalidate things as precisely in response to code changes as we currently do in response to T.input changes.
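For a concrete (hypothetical) illustration, consider a `build.sc` along these lines, with made-up module names:

```scala
import mill._, scalalib._

// Helper used only by `core`
def commonScalacOptions = Seq("-deprecation", "-feature")

object core extends ScalaModule {
  def scalaVersion = "2.13.10"
  override def scalacOptions = T { super.scalacOptions() ++ commonScalacOptions }
}

object cli extends ScalaModule {
  def scalaVersion = "2.13.10"
  override def moduleDeps = Seq(core)
}
```

Editing `commonScalacOptions` should invalidate `core.scalacOptions`, `core.compile`, and downstream targets such as `cli.compile`, but leave unrelated targets like `cli.scalacOptions` untouched, whereas today any edit to `build.sc` invalidates every target defined in it.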

Roughly, that would involve the following steps:

  1. Replace Ammonite script running with a MillBuildModule, to compile the build.sc using Mill itself, to unlock access to Zinc and incremental compilation. This would also put everything into one compilation unit, simplifying things if we choose to go with the compiler plugin approach, although that is not strictly necessary.

  2. Using either approach above (SBT or Acyclic), make the classpath hash that we feed into the target cache keys depend only on the external classpath, and use the internal graph/hashes computed by Zinc or by our own compiler plugin to provide fine-grained invalidation for changes.

  3. (Optional) do bytecode analysis to allow fine grained invalidation of external classpath changes too (probably not worth the hassle, but it's an option if we choose to do so)

CC @lefou @lolgab, WDYT?

@lefou
Member

lefou commented Mar 8, 2023

Does that mean that a changed build.sc-wide variable will only affect the targets that use it (as well as their downstream users), while all other targets do not need to recompute? This would be great!

@lihaoyi
Member Author

lihaoyi commented Mar 8, 2023

That's right. Even if the change was in a helper function or something, we could crawl the call graph and figure it out. After all, that's what SBT does for testQuick, and what acyclic does to find inter-file circular dependencies.

@lolgab
Member

lolgab commented Mar 8, 2023

We already have caching which works at the build-file level.
What we do is get the class where a target is defined and derive an identifier for the build file from that class name.
Then we use the file import tree from Ammonite to build a hash for the script and its dependencies.

Since we don't know the dependencies between classes from Ammonite (I haven't investigated whether that is already possible), we remove any inner classes from the identifier so we can treat a script as an atomic thing.

Having said this, I hope that we can build on top of the existing functionality, maybe by increasing the granularity of the current working solution from file-level to class-level.
To do that we need to stop removing inner classes from identifiers and have a way to construct the importTree so that it keeps dependencies between classes instead of just dependencies between files.
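(Illustrative only: a rough sketch of the kind of file-level hashing described above, not Mill's actual implementation. `fileImports` stands in for the import tree obtained from Ammonite.)

```scala
import java.security.MessageDigest

// Hash a script together with everything it transitively imports, so any
// change anywhere in the import tree changes the hash for the whole script.
def scriptHash(script: os.Path, fileImports: Map[os.Path, Seq[os.Path]]): Array[Byte] = {
  val md = MessageDigest.getInstance("SHA-256")
  def visit(p: os.Path, seen: Set[os.Path]): Set[os.Path] =
    if (seen(p)) seen
    else {
      md.update(os.read.bytes(p))
      fileImports.getOrElse(p, Nil).foldLeft(seen + p)((s, dep) => visit(dep, s))
    }
  visit(script, Set.empty)
  md.digest()
}
```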

@lefou
Member

lefou commented Mar 8, 2023

@lolgab It's not clear to me whether you are arguing against this proposal or just trying to provide an implementation idea based on Ammonite.

I'd also like to use existing mechanisms to deduce code dependencies.

I haven't inspected acyclic in this context, so I can't tell much about it, but I assume Haoyi knows it very well. The idea to use Zinc analysis sounds even nicer to me, as there is already some API and tooling to produce it. But I have zero experience using that API to get the information we might want.

The use of Zinc analysis requires a different build process though, hence @lihaoyi suggested the MillBuildModule approach, which isn't ready to use yet. But a switch to building `build.sc` with Mill itself is kind of eating our own dog food: we would control what we need without much additional cost, since we maintain a ScalaModule anyway. We would immediately profit from incremental compilation. And once included files are disentangled from each other (in contrast to the current Ammonite script compiler approach), the whole foreign-module story would profit too, as we might even be able to use the cache locations of the foreign modules directly, which would reduce the amount of redundant build work and cache storage even further.

@lihaoyi
Member Author

lihaoyi commented Mar 8, 2023

I think long term Ammonite scripts are a dead end: they support a lot of dynamism nobody actually wants, and in exchange cannot support incremental-compilation/pre-compilation/packaging/etc. that people actually need. They also duplicate a lot of what Mill does: dependency analysis, file hashing, invalidation, etc. but in an ad-hoc manner.

Long term I believe we should replace them with Mill's own bootstrap script runner or with ScalaCLI. From a dependency-hygiene perspective, I'd try the Mill approach first. Not having to debug two different build servers, deal with two separate configuration frameworks, two sets of business-logic libraries, etc. would be a huge advantage over both the status quo and going with ScalaCLI.

@lolgab
Member

lolgab commented Mar 8, 2023

@lefou I'm all in for this proposal! I should have said so clearly!
What I wanted to say is that we don't need to start from scratch, but can provide a class-level dependency tree (which might come from the MillBuildModule and Zinc, though I don't know how that would work) instead of the file-level dependency tree taken from Ammonite today.
Once we do that, most of the existing implementation can probably be reused.

@lefou
Member

lefou commented Mar 9, 2023

I had trouble in the past with undercompilation in Zinc. This is especially relevant when macros are used, which is the case in Mill. Besides using Zinc in MillBuildModule, I think that if we can get some kind of source-dependency tracking in pure Scala code (like it is done in acyclic), I would have more confidence in its accuracy. This also means we would be less coupled to Zinc internals.

When building projects, it's easy to just drop incremental compilation results, but it may be a bit harder to even detect when Mill uses cached values based on incorrect code path dependencies.

@lihaoyi
Member Author

lihaoyi commented Mar 9, 2023

Doing this in userland Scala code is unlikely to work. Scala code is opaque unless lifted into some M[T] the way we do with Task[T], and lifting every single method call into a Task is unfeasible. It would also defeat the purpose of letting people write build logic in plain Scala.

We'd need to go one level deeper and do it at the language/runtime level: either a Scala compiler plugin (which is what Zinc does) or a JVM bytecode analyser. Not sure which one is easier to implement, but both should be doable

@lefou
Member

lefou commented Mar 9, 2023

Thanks for clarifying. I guess I triggered you by saying "pure Scala", while I had some macro or compiler code in mind, sorry. You are the expert here. IMHO, we should use whatever is the easiest path to success. I have no clear picture of how complex it is, but I hope I can understand it in principle once it arrives.

@lihaoyi
Member Author

lihaoyi commented Mar 9, 2023

@lefou you mentioned MillBuildModule isn't ready to use. What's the status of that? Is there a PR or something I can commandeer if I wish to push it forward?

@lefou
Member

lefou commented Mar 9, 2023

> @lefou you mentioned MillBuildModule isn't ready to use. What's the status of that? Is there a PR or something I can commandeer if I wish to push it forward?

The current version of MillBuildModule in-tree does know all included files and their dependencies. It's used in the BSP server to represent the synthetic mill-build module.

https://github.com/com-lihaoyi/mill/blob/c9220b402f22ab0d6d35ee55d0f4e3349314688c/scalalib/src/mill/scalalib/bsp/MillBuildModule.scala

What is missing is the Ammonite-specific wrapping of the .sc files.

I had some experiments in PR #1972, but never finished them. The wrapping was rather lame and probably incorrect. It could compile a build script, but I think the packages (a sub-package of ammonite) and other stuff weren't correct. I also think I messed something up in the latest rebase of that PR.

@lihaoyi
Member Author

lihaoyi commented Mar 9, 2023

Thanks, I think I understand.

We should be able to drop the Ammonite-specific idiosyncrasies, right? Since we're aiming to drop Ammonite, as long as we can compile and run the build file, it shouldn't matter if we don't e.g. follow the Ammonite package-prefix convention.

@lefou
Member

lefou commented Mar 9, 2023

> Thanks, I think I understand.
>
> We should be able to drop the Ammonite-specific idiosyncrasies, right? Since we're aiming to drop Ammonite, as long as we can compile and run the build file, it shouldn't matter if we don't e.g. follow the Ammonite package-prefix convention.

I think working with the default package (no package at all) has issues (you simply can't do it) once you try to import from another script which belongs to a package (because it is in a subdirectory). But choosing the same top-level package as the tool is probably not a good idea either. The build script does not have the same origin, and mechanisms like access restrictions with private[package] may not be effective.

@lefou
Member

lefou commented Mar 9, 2023

Best would be to come up with some cryptically generated package name, e.g. mill$$yolo.

lihaoyi added a commit that referenced this issue Jul 30, 2023
…2417)

This PR replaces the coarse-grained script-import based
target-invalidation-on-code-change (introduced in
#1663) with fine-grained
invalidation based on the JVM-level method callgraph ([previous
discussion](#2348)). This
callgraph is constructed by using [ASM](https://asm.ow2.io/) to parse
the bytecode generated by compiling `build.sc`: We compute a code-hash
for every method based on its bytecode, and propagate that hash
throughout the call graph Merkle-Tree-style, resulting in a
`methodCodeHashSignatures: Map[String, Int]` dictionary that associates
each method with a hash signature representing both its own code and
the code of all methods that it calls (transitively). A more limited
analysis is applied to upstream library code. This hash signature is
then used as part of the `inputsHash` for each target, causing a Target
to invalidate after a code change only if the code change affected the
Target's implementation method or some other method that it calls
(transitively)

Of the 7,000+ lines in this PR, only ~1,000 are implementation
code; the other ~6,000 are test cases. Most of the 100+ files added in
this PR are also just tiny standalone Java/Scala files serving as test
cases, with the main logic in `main/codesig/` being about a half-dozen
files and a handful more for the test utilities

## Algorithm

At a high level, the analysis logic living in the `main/codesig/` folder
does the following:

1. `LocalSummary` parses the compile output of `build.sc`. This extracts
the class hierarchy, computes method code hashes, and identifies method
callsites within method bodies (a simplified sketch of this step is shown after this list).
* Notably, JVM lambdas (InvokeDynamic bytecodes) are treated as if they
are called at-definition-site. Even though that is not strictly precise,
it is conservatively correct, and is the same thing the Scala.js
optimizer does according to @sjrd.

2. `ExternalSummary` parses any super-classes of locally-defined classes
that are in the upstream classpath - Java/Scala stdlib, Mill code, or
things that are `import $ivy`ed - and extracts the class hierarchy and
method signatures without analyzing method bodies.
* This is necessary so that calls against externally-defined interfaces,
e.g. `InputStream#read`, can be properly traced to local implementations
even if they do not extend the interface directly (e.g. a `MyInputStream
extends ByteArrayInputStream` class)

3. `ResolvedCalls` uses the class hierarchy to identify what potential
implementations each callsite will resolve to.
* For calls to virtual methods, we treat them as potentially calling any
implementation of any subclasses across the whole program. This is done
without regard to whether the sub-class would be instantiated or not
* For calls to external methods for which we did not analyze the
bytecode, we treat them conservatively and assume that they could
potentially call any other external methods defined in the receiver
class, the function parameter types, or any of their superclasses, and
thus any locally-defined overrides for those external methods. This is
similar to the approach taken in [AVERROES: Whole-Program Analysis
without the Whole
Program](https://plg.uwaterloo.ca/~olhotak/pubs/ecoop13.pdf)

4. `ReachabilityAnalysis` converts the method resolution results into a
heterogeneous call graph (comprising `LocalDef`s, `Call`s, and
`ExternalClsCall`s) that we feed into Tarjan's algorithm to generate topo-sorted
strongly-connected components. We then walk the SCCs in topological
order to do the Merkle-Tree hash propagation in one pass, with every
method within an SCC having the same hash (because they all call each
other circularly, and so they must have the same transitive call graph)

* Here, we do some special casing for Single-Abstract-Method types and
implementations, to make them work like InvokeDynamic lambdas (i.e.
count reachability via instantiation, rather than via interface method
invocation). This is the right thing to do in principle since SAM
methods and Indy lambdas have similar use cases, and is necessary in
practice because the compiler often chooses between the two styles of
generating code somewhat arbitrarily
(scala/scala#3616)

5. For every `Target`, we look up the hash of its `def` method, as well
as the `<init>` constructor methods for each of its enclosing `Module`
classes, and combine them to form the final hash of the `Target`

* Mill `Target`s must live on `Module`s which form a tree of static
`object`s. We walk the object tree from the `Target`'s directly
enclosing module to the root module using a new `enclosingModule`
property on the `mill.define.Ctx` value, and look up all their `<init>`
methods. They should all only have one `<init>` method since they're
static objects
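To make step 1 more concrete, here is a heavily simplified sketch of extracting per-method call sites and code hashes from compiled class files using ASM. It is illustrative only: the names, types, and hashing details are not the actual `LocalSummary` implementation.

```scala
import org.objectweb.asm.{ClassReader, ClassVisitor, MethodVisitor, Opcodes}

// Simplified stand-ins for what a per-class summary records: a hash of each
// method's own instructions plus its outgoing call sites.
case class MethodSig(owner: String, name: String, desc: String)
case class MethodInfo(codeHash: Int, calls: Set[MethodSig])

def summarizeClass(classBytes: Array[Byte]): Map[MethodSig, MethodInfo] = {
  val result = collection.mutable.Map.empty[MethodSig, MethodInfo]
  val reader = new ClassReader(classBytes)
  reader.accept(
    new ClassVisitor(Opcodes.ASM9) {
      override def visitMethod(access: Int, name: String, desc: String,
                               sig: String, exceptions: Array[String]): MethodVisitor = {
        val self = MethodSig(reader.getClassName, name, desc)
        new MethodVisitor(Opcodes.ASM9) {
          private var hash = 0
          private val calls = Set.newBuilder[MethodSig]
          override def visitMethodInsn(op: Int, owner: String, name: String,
                                       desc: String, itf: Boolean): Unit = {
            calls += MethodSig(owner, name, desc)
            hash = 31 * hash + (op, owner, name, desc).hashCode
          }
          override def visitLdcInsn(value: AnyRef): Unit = {
            hash = 31 * hash + value.hashCode // constants also affect the code hash
          }
          override def visitEnd(): Unit = {
            result(self) = MethodInfo(hash, calls.result())
          }
        }
      }
    },
    // Skip line numbers and other debug info so formatting-only edits hash identically
    ClassReader.SKIP_DEBUG
  )
  result.toMap
}
```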

From the Call Graph Analysis literature, this is a relatively
straightforward Class Hierarchy Analysis (CHA), with some tweaks called
out above. Notably we aren't able to generate callgraphs via more
precise algorithms such as Rapid Type Analysis (RTA), which would
generalize our lambda special-casing to all user-implemented
traits.
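And a minimal sketch of the Merkle-style hash propagation in step 4, reusing `MethodSig` from the previous sketch and assuming Tarjan's algorithm has already produced the strongly-connected components in topological order, with every callee belonging to an earlier component. Again, this is illustrative rather than the actual `ReachabilityAnalysis` code.

```scala
// sccsTopoOrder: strongly-connected components of the local call graph, ordered
//   so that every component appears after the components it calls into.
// localHash:     per-method code hash from the bytecode summary above.
// callees:       resolved outgoing edges of each method.
def propagateHashes(
    sccsTopoOrder: Seq[Set[MethodSig]],
    localHash: MethodSig => Int,
    callees: MethodSig => Set[MethodSig]
): Map[MethodSig, Int] = {
  val transitive = collection.mutable.Map.empty[MethodSig, Int]
  for (scc <- sccsTopoOrder) {
    // Methods in one SCC all call each other (directly or indirectly),
    // so they share a single transitive hash.
    val external = scc.flatMap(callees).diff(scc)
    val hash = (scc.toSeq.map(localHash).sorted ++
      external.toSeq.map(transitive).sorted).hashCode
    for (m <- scc) transitive(m) = hash
  }
  transitive.toMap
}
```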

## Limitations

1. We do not do any kind of dataflow analysis. 
* That means we do not know when a Target calls a method whether it is
passing values in, receiving return values out, causing side effects, or
some combination of these.
* We just assume that if a Target ends up transitively calling a method,
it is likely to affect it *somehow*, and we thus invalidate it to be
re-run
* This also means that `def`s and `val`s in the body of a class are
handled differently, despite looking superficially similar. `def`s are
full methods and participate in the call graph, so changes would only
invalidate downstream Targets that transitively end up calling that
method. `val`s are simply fields that are initialized in the class'
`<init>` constructor method or `<clinit>` class initialization, so
changing a `val` ends up invalidating any downstream Targets that
create a `new` instance of that class: probably a much broader set of
Targets

2. Both the handling of JVM lambdas and the handling of virtual methods
are approximations
* For JVM lambdas, we assume that they are "live" just because they are
instantiated, without checking if they get called
* This means that a lambda that is created but doesn't end up getting
called gets imprecisely marked as a callgraph dependency
* For virtual methods, we assume they are "live" just because they get
called, without checking if the relevant class gets instantiated
* That means that if a Target makes a virtual method call to e.g.
`InputStream#read`, we will assume that Target depends on all
`InputStream#read`s across the codebase, even if for subclasses of
`InputStream` that never get instantiated
* Both are conservative approximations that would cause some unnecessary
callgraph edges, but both approximations are well fit to actual usage
patterns (JVM lambdas have one method and tend to have tons of different
implementations, whereas virtual methods tend to have multiple methods
and a small number of implementations)
* Doing the "proper" thing and checking _both_ for calling and
instantiation is done by Rapid Type Analysis (and other more advanced
callgraph analyzers). These typically require a separate whole-program
analysis for every Target we may want to invalidate, which would be
unfeasible in a large build with 100s of Modules and 10,000s of Targets.
There are some approaches that may mitigate that (e.g. [Overlap-aware
rapid type analysis for constructing one-to-one matched call graphs in
regression test
selection](https://www.semanticscholar.org/paper/Overlap-aware-rapid-type-analysis-for-constructing-Kim-Jeong/68b3dfadb2a67a13a1ec78f90f8d8192f71be0e6?citedSort=relevance),
[Incrementalization of Analyses for Next Generation
IDEs](https://www.semanticscholar.org/paper/Incrementalization-of-analyses-for-next-generation-Kloppenburg/414fa60c65fc1aed05e4f522386de50066641321))
but for now I'm just leaving it for future work.


3. We avoid detailed callgraph analysis of external/upstream libraries
because the entire transitive codebase would be very large: e.g. all of
Mill, all of `com.lihaoyi`, all of Scala std lib, all of Java std lib.
Limiting our detailed analysis to the relatively-small `build.sc`
compile output simplifies things and greatly speeds things up, at the
expense of some precision in the callgraph


4. Apart from the question of CHA v.s. RTA, it is probably also possible
to get more precise callgraphs by performing the analysis at the Scala
level rather than at the JVM bytecode level ([Constructing Call Graphs
of Scala Programs](https://plg.uwaterloo.ca/~olhotak/pubs/ecoop14.pdf)).
I chose to work at the JVM bytecode level because it's
simple/stable/familiar to me, but trying to do the analysis at the
Scala-level is potential future work.

* We are able to ignore the Scala-level program information because at
the end of the day what runs on the JVM is bytecode. Thus the bytecode
fully defines the runtime semantics of the program, and changes in
bytecode are sufficient to tell us if a target needs to be invalidated,
even if it's not as precise as what the Scala type information can give
us

## Notes

1. I special-case handling of `new sourcecode.Line(_)` and explicitly
ignore the argument (~always a literal number). This is because we
explicitly do *not* want line number changes to force things to
re-compile: even if they can in theory affect the output of a target,
the assumption is that it's just there for debugging/inspection purposes
and won't. The way I do this is a bit hacky and relies on the bytecode
pattern generated by the Scala compiler, but it seems to work.

2. This analysis ignores fields entirely. In general, any change to
instance or static fields that matters would be accompanied by
corresponding changes to method bodies, even if just changes to the
`<init>` constructor method.

3. The approximations we use to handle InvokeDynamic lambdas and virtual
method calls are necessary for this analysis to be precise enough to be
useful.
* If we treat lambdas like any other set of trait/implementations, the
callgraph would end up including an edge from every `def foo = T{}`
Target to every other Target in the `build.sc`.
* Conversely, if we treated virtual methods the same way we treat
lambdas, then every `mill.Module` someone instantiates (e.g. by
referencing the module name, which invokes the class static initializer,
which instantiates the class itself) would have all its methods and
Targets depended-on by default by the instantiating method.
* Both of these would cause enough additional callgraph edges to make
everything depend on everything else, rendering callgraph analysis
useless

4. We ignore calls to `Target`-returning methods as part of the static
callgraph analysis: `Target`s, `Input`s, `Worker`s, etc. Those
`NamedTask`s will themselves get invalidated if necessary if the
non-`NamedTask` code _they_ depend on changes, and invalidation will
already get propagated through the runtime `Task` graph without needing
to also model it in the static callgraph. This would improve precision
while also greatly shrinking the static callgraphs

* This relies on an additional assumption over the current analysis: it
assumes that methods returning `NamedTask`s are pure and do not generate
side effects. This is probably an OK assumption to make, because proper
operation of the runtime `Task` graph also relies on that assumption

* This greatly improves precision for our use case. Virtual methods
implemented in many different places are a big weakness of CHA since it
has to assume any implementation could be called, especially combined
with calls to methods on external library traits that are also treated
conservatively and have to assume that any virtual method defined in
that library (and thus any local implementation) can be called. However,
in a Mill build, the vast majority of "virtual method implementations"
and "external library method calls" are the
`mill.define.Target[A]`/`T[A]` methods we override and call. Skipping
static callgraph analysis for these and relying on the exact runtime
build graph is thus a huge improvement to the precision of target
invalidation.

5. I wire up JSON logging (to disk in the `T.dest` folder) to the
`--debug` flag. It's too expensive to generate all the time (~doubles
the time taken for `methodCodeHashSignatures`), but we definitely want
it to be available when things are misbehaving and we need to debug
things

## Automated Testing

1. `main/codesig/test/cases/callgraph/` contains unit tests that assert
the (simplified) call graphs are the right shape for a variety of
minimal scenarios. This also includes a few less trivial scenarios in
the `realistic/` folder: some of my old [Java
games](https://github.com/lihaoyi/Java-Games), a [parallel merge-sort
algorithm](https://github.com/handsonscala/handsonscala/blob/ebc0367144513fc181281a024f8071a6153be424/examples/13.7%20-%20ParallelMergeSort/MergeSort.sc)
and some [Castor actor
code](https://github.com/handsonscala/handsonscala/blob/ebc0367144513fc181281a024f8071a6153be424/examples/16.8%20-%20LoggingRearrangedPipeline2/LoggingPipeline.sc)
from my book Hands-on Scala, and a Fastparse parser. Both Java and Scala
code work because we perform the analysis at the bytecode level.

2. We also exercise the transitive hash-propagation logic in some of the
`callgraph/` tests by replacing the "hash sum" logic with "Set union",
using it to compute transitive closures of the graph, and asserting that
the transitive closures are what we expect. This adds test coverage for
most of the logic around the hash propagation, without the expense of
performing runtime code-changes and re-compilation that we need to
perform in the integration tests.

3. `main/codesig/test/cases/methodhash/` contains tests that assert that
the method code hash signature has the correct properties: that it
changes when it should change and is un-changed when it should be
un-changed (e.g. when there are only formatting/line-number/comment
changes)

4. `integration/feature/codesig{simple,trait,scalamodule}` contain
integration tests that exercise the workflow end-to-end, including the
integration into Mill's evaluator. These ensure that when you edit a
file and re-run Mill, the correct targets are re-evaluated. Test cases
include a single Target, Targets inside module `object`s and `trait`s, a
single `ScalaModule`, and a bunch of dependent/unrelated `ScalaModule`s

I had to update a bunch of existing tests - e.g. those in
`integration/feature/invalidation/` - because now random code changes no
longer invalidate targets within that file unless you actually change
the target or method bodies that the target calls.

This is a binary compatible change. I had to add some MIMA exclusions
due to MIMA not correctly considering things nested within a `private
object` as `private` (lightbend-labs/mima#771).

I added an opt-out flag `--disable-callgraph-invalidation` to swap back
to the previous script-import-graph-based invalidation system, as a
change-management measure to mitigate the risk of the novel invalidation
algorithm

## Manual Testing

I did some end-to-end tests on this PR, running `__.compile` on a few
different projects:

1. On the com-lihaoyi/fansi build, the incrementality seems to work as
expected: e.g. changing the `fansi.JsFansiModule#scalaJSVersion` from
`1.10.1` to `1.10.0` invalidates 87/796 targets which all seem Scala.js
related, while changing the `FansiTestModule#ivyDeps` uTest version from
0.8.1 to 0.8.0 invalidates 67/875 targets that are all test suite
related. This is working as expected

2. On the `com-lihaoyi/upickle` build, changing `bench.scalaJSVersion`
from `1.13.0` to `1.13.1` invalidates 28/3695 targets, while changing
`CommonJsModule#scalaJSVersion` invalidates 412/3695 targets

3. On the `com-lihaoyi/mill` codebase:

   1. Adding a whitespace at the top of `build.sc`, to offset everyone's
line numbers without changing their logic, invalidates 16/8154 targets,
all downstream of `millVersion`, which is invalidated by the dirty hash
of the repo checkout changing

   2. Changing `scalajslib.worker[1].ivyDeps` by re-ordering them
invalidates 17/8154 targets (just the ones above, plus the one edited)

   3. Changing `MillScalaModule#scalacOptions` by removing `-feature`
invalidates 836/8154 targets

   4. Changing `contrib.playlib.WorkerModule#sources` by generating a
temporary file invalidates 37/8154 targets, mostly stuff in
`contrib.playlib.worker[_]`

   5. Changing a constant `Deps.scalaVersion` ends up invalidating
~4343/8154 targets. I skimmed through the results and they seem
reasonable: all `.scalaVersion` and `.compile` targets end up
invalidated, while many external targets aren't invalidated because they
don't have any local code to be affected by the callgraph analysis and
their upstream build graph is not affected by `Deps.scalaVersion` (e.g.
`integration.test.javacOptions`,
`contrib.codeartifact.mandatoryScalacOptions`)

   6. Adding a new `val abc = 123` to the root module invalidates
everything, which is expected since it changes the constructor of
`build`, which could potentially affect any module nested within it.
      1. One solution for this is to turn them into `lazy val`s. Maybe we can
do it automatically via a compiler plugin.
      2. Dataflow/Purity analysis is another alternative that would let us
keep them as `val`s while still distinguishing which `val` is used in
what method, but that is a much more complicated analysis than the
callgraph analysis this PR implements

The Fansi and uPickle results show that this approach does work for
non-trivial builds, and the Mill results show that it works even on
pretty large builds like Mill's own `build.sc` codebase.

## Debugging

The easiest way to understand what is going on with the callgraph
analysis is to run Mill with the `--debug` flag. This generates extra
output JSON files in `out/mill-build/methodCodeHashSignatures.dest/`
which give you an insight into what is going on inside the callgraph
analyzer: the `localSummary.json`, `externalSummary.json`,
`resolvedMethodCalls.json`, etc. These are opt-in with `--debug` due to
the added overhead it takes to generate them (0.3-0.5s for the Mill
codebase, would be larger for larger builds)

One item of interest is the `spanningInvalidationForest.json` that is
generated the second time you run Mill with `--debug` enabled. This file
contains a tree of nested JSON dictionaries containing every method
definition or call that was invalidated, with the roots of the tree
being the invalidation "roots" (i.e. methods which were invalidated
without any parents being invalidated), and the tree structure
indicating an arbitrary path from an invalidation root to each specific
invalidated method. This is very useful for answering the question "why
was this method/target/etc. invalidated in response to code changes"
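For example, a small hypothetical helper (assuming the nested-dictionary structure described above, with method signatures as keys and empty objects at the leaves) to print the invalidation forest as an indented tree:

```scala
import ujson.Value

// Walk the nested dictionaries and print one line per invalidated method,
// indented by its depth below an invalidation root.
// Assumes every node is a JSON object, with empty objects at the leaves.
def printInvalidationForest(path: os.Path): Unit = {
  def walk(node: Value, depth: Int): Unit =
    for ((method, children) <- node.obj) {
      println(("  " * depth) + method)
      walk(children, depth + 1)
    }
  walk(ujson.read(os.read(path)), 0)
}

// e.g. printInvalidationForest(os.pwd / "out" / "mill-build" /
//   "methodCodeHashSignatures.dest" / "spanningInvalidationForest.json")
```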

## Performance

Performance-wise, ad-hoc benchmarks on `com-lihaoyi/mill`'s own build
show a ~5% increase in `build.sc` compilation times due to this. Not
nothing, but probably acceptable: the cost is only paid when the
`build.sc` is re-compiled, and it will likely end up saving much more
time in tasks that we can avoid running (e.g. a single no-op Zinc
incremental compile may be 100s of milliseconds)

| | methodCodeHashSignatures | compile |
|-|---|---|
| Cold |  685ms | 12,148ms |
| Hot | 253ms | 4,143ms | 

It's taken some amount of optimization to reach this point. There are
definitely further optimizations that can be done, e.g. replacing the
various `Map`s we pass around with `Array` or parallelizing parts of the
analysis (It's mostly pure functional code and should be easy to
parallelize)

## What Users can do to improve incrementality

1. `Module` `val`s should generally be avoided in favor of `def`s or
`lazy val`s. `val`s are all bundled together in the `Module`
constructor, so any change to any `val` invalidates any downstream code
that depends on anything in that `Module`

2. Abstract members of `Module`s should be `Target`s - `T[V]`s -
whenever possible, rather than plain `V`s. We make a stronger assumption
that `T[V]` methods are pure, one that we cannot make in general for any
method returning an un-wrapped `V`, and so can rely fully on runtime
build-graph analysis, which is a lot more precise than the static
class-hierarchy analysis we do on non-`Target` abstract methods.
* e.g. a change to the code in `BuildInfo#buildInfoMembers` invalidates
the code in only that target, because we assume that `Target` method
dependencies and invalidation will be handled by the runtime callgraph
* But a change to `BuildInfo#buildInfoPackageName` invalidates the code
in almost everything on every `JavaModule extends BuildInfo`, because we
cannot guarantee that the `<init>` method of those `Module`s does not
depend on `buildInfoPackageName`.
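A small hypothetical module illustrating both guidelines (a sketch, not code from this PR):

```scala
import mill._, scalalib._

trait ExampleModule extends ScalaModule {
  // (1) Avoid plain `val`s: they are initialized in the Module's constructor, so
  // editing any of them invalidates everything that instantiates this Module.
  // A `lazy val` or `def` compiles to its own method and is tracked per-caller.
  lazy val sharedFlags = Seq("-deprecation")

  // (2) Prefer abstract Targets over plain abstract values: `T[...]` members are
  // assumed pure and tracked precisely through the runtime build graph, while
  // plain abstract `def`s are handled by the conservative static analysis.
  def organization: T[String]
  def organizationName: String

  override def scalacOptions = T { super.scalacOptions() ++ sharedFlags }
}
```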

These restrictions are fundamental given the simple
reachability/class-hierarchy analysis that we do in this PR, and lifting
them would involve a more complicated dataflow analysis or purity
analysis that would be considerably harder to implement

---------

Co-authored-by: Tobias Roeser <le.petit.fou@web.de>
@lihaoyi lihaoyi closed this as completed Jul 30, 2023
@lefou lefou added this to the 0.11.2 milestone Jul 30, 2023