
RFC: Support Incremental Compilation #594

Closed

Conversation

michaelwoerister (Contributor) commented Jan 18, 2015

This RFC proposes an incremental compilation strategy for rustc that allows for translation, codegen, and parts of static analysis to be done in an incremental fashion, without precluding the option of later expanding incrementality to parsing, macro expansion, and resolution.

This RFC is purely about the architecture and implementation of the Rust compiler; it does not propose any changes to the language. I also don't expect it to be acted on any time before 1.0 is out of the door, but I wanted to get this out into the open, so that it can be discussed as part of the RustC Architecture Improvement Initiative (that's right, RAII) that I invented just now, and that will begin to discuss how the Rust compiler can get as good as possible once the language has become a more stable target.

Rendered

Gankra commented on text/0000-incremental-compilation.md in 3f55ba6 Jan 18, 2015

A subtle point here, not sure if it's important: there are things that affect the semantics of the caller but cannot be explicitly stated by the programmer in the function signature. In particular, if the body of the function stores any references it is given, this will affect borrow checking, even though the signature makes no reference to this.

e.g.

```rust
fn foo1(data: &Data) {
  vec1.push(data);
}

fn foo2(data: &Data) {
  vec2.push(data.clone());
}

fn bar() {
  let mut data = get_data();
  foo2(&data);
  data.update(); // ok, foo2 dropped the ref to the data
  foo1(&data);
  data.update(); // not ok, foo1 stored a ref to the data
}
```

michaelwoerister (Author) replied Jan 18, 2015

I don't think that this would affect borrow checking. In both cases `data` is usable after the call. In other words, everything relevant to borrow checking is part of the function signature, and the borrow checker explicitly does not look at the implementation of a function `f` when it checks a call to that function `f`. But I'm not an expert on the topic, please correct me if I'm wrong.

whataloadofwhat replied Jan 18, 2015

Isn't that just down to lifetime elision? Like those functions would have to have lifetimes attached, and they'd have to be something like:

```rust
fn foo1(data: &'v1 Data)    // where the type of `vec1` is `Vec<&'v1 Data>`
fn foo2<'a>(data: &'a Data)
```
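To make this point concrete, here is a small compilable sketch (the names `Store` and `foo1` are invented for illustration, replacing the free-standing `vec1` of the original example): for a function to store a reference it receives, the borrow has to be tied to a lifetime in the signature, which is exactly the information the caller's borrow checker consults.

```rust
// Illustrative sketch: the container holding the references is made
// explicit, so the signature of `foo1` must name the lifetime `'v`.
struct Store<'v> {
    vec1: Vec<&'v String>,
}

impl<'v> Store<'v> {
    // `data: &'v String` tells the borrow checker that the reference
    // outlives the call; the function body never needs to be inspected.
    fn foo1(&mut self, data: &'v String) {
        self.vec1.push(data);
    }
}

fn main() {
    let data = String::from("hello");
    let mut store = Store { vec1: Vec::new() };
    store.foo1(&data);
    println!("stored {} reference(s)", store.vec1.len());
}
```

With this signature, mutating `data` while `store` is still live is rejected by the caller's borrow check, using only the signature of `foo1`.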

comex replied Jan 19, 2015

If it's true that type checking a call to a function can depend on the function's body, it sounds like an urgent bug to address before 1.0... (I don't think it is, though?)

pnkfelix replied Jan 19, 2015

@gankro isn't there an obvious counter-argument to your claim? Namely, all information for borrow checking does have to be encoded in the function's signature, because you need to be able to borrow-check code like this:

```rust
fn bar(foo1: fn(data: &Data), foo2: fn(data: &Data)) { ... }
```

(Your example is only a sketch; if you continue to make this claim about the language, I think it behooves you to make an example that one can actually feed to the compiler; i.e. you need the lifetimes etc. that one needs to be able to do things like `vec1.push(data)` where `data: &T` for some `T`.)

Gankra replied Jan 19, 2015

@pnkfelix @whataloadofwhat Yes you're right, there's no way to give these functions equal non-elided lifetimes and have the behaviour I suggested. I had a bad mental model. 🐫

It would be possible to make dependency tracking aware of the kind of reference one item makes to another. If an item `A` mentions another item `B` only via some reference type (e.g. `&T`), then item `A` only needs to be updated if `B` is removed or `B` changes its 'sized-ness'. This is comparable to how forward declarations in C are handled. In the dependency graph this would mean that there are different kinds of edges that trigger for different kinds of changes to items.
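A hedged sketch of such typed dependency edges (all names invented for illustration): each edge kind fires only for the kinds of change it is sensitive to, so a mention of `B` behind a reference survives changes to `B`'s fields, much like a C forward declaration.

```rust
// Toy model of edge kinds in the dependency graph.
#[derive(Clone, Copy)]
enum Change { Removed, SizeChanged, FieldsChanged }

#[derive(Clone, Copy)]
enum EdgeKind {
    FullDependency,  // `A` contains `B` by value: any change to `B` matters
    ByReferenceOnly, // `A` mentions `B` only as `&B`: like a C forward decl
}

// Decide whether a change to the target item invalidates the source item.
fn edge_triggers(kind: EdgeKind, change: Change) -> bool {
    match kind {
        EdgeKind::FullDependency => true,
        EdgeKind::ByReferenceOnly =>
            matches!(change, Change::Removed | Change::SizeChanged),
    }
}

fn main() {
    // Changing `B`'s fields does not invalidate an `&B`-only mention of it.
    println!("{}", edge_triggers(EdgeKind::ByReferenceOnly, Change::FieldsChanged));
}
```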

### Global Switches Influencing Codegen
There are many compiler flags that change what the generated code looks like, e.g. optimization and debuginfo levels. A simple strategy for dealing with this would be to store the set of compiler flags used for building the cache and to clear the cache completely if a different set of flags is used. Another option is to keep multiple caches, one per set of compiler flags (e.g. keeping both a 'debug build cache' and a 'release build cache' on disk).

ghost commented Jan 18, 2015

Hash the relevant flags for the subdir name? I'd expect a lot of -C options affect the cache, and only storing one set wouldn't help at all for some usage patterns.
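A minimal sketch of that suggestion (the helper name `cache_subdir` is invented): hash the sorted set of codegen-relevant flags into a cache subdirectory name, so each flag combination gets its own cache.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Map a set of compiler flags to a cache subdirectory name.
fn cache_subdir(flags: &[&str]) -> String {
    let mut sorted: Vec<&str> = flags.to_vec();
    sorted.sort(); // flag order should not change the cache location
    let mut hasher = DefaultHasher::new();
    sorted.hash(&mut hasher);
    format!("incr-cache-{:016x}", hasher.finish())
}

fn main() {
    // '-C opt-level=3 -g' and '-g -C opt-level=3' map to the same subdir.
    println!("{}", cache_subdir(&["-C opt-level=3", "-g"]));
}
```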

michaelwoerister (Author) replied Jan 19, 2015

Yeah, something like that. I'd like to see how big such a cache gets.

nrc (Member) commented Jan 18, 2015

cc @epdtry who implemented most of a very similar scheme last summer.

We talked about this as incremental codegen (as opposed to proper incremental compilation). He only kept around object files, not LLVM IR as well.

It would be great if @epdtry could link to his WIP branch and explain the concepts, etc. here.

It should not be too hard to let the compiler keep track of which parts of the program change infrequently and then let it speculatively build object files with more than one function in them. For these aggregate object files inter-function LLVM optimizations could then be enabled, yielding faster object code at little additional cost. Other strategies for controlling cache granularity can be implemented in a similar fashion.

### Parallelization
If some care is taken in implementing the above concepts it should be rather easy to do translation and codegen in parallel for all items, since by design we already have (or can deterministically compute) all the information we need.

nrc (Member) commented Jan 18, 2015

We already can do codegen in parallel, although there is a bug preventing most use atm.

# Unresolved questions

## Dependency Graph Construction before Type Inference
I'm not sure whether it would be possible to construct valid dependency graphs *before* type inference, or whether that would miss some dependency edges. Or, more generally, how much per-item work can be deferred until after the cache has been consulted.

nrc (Member) commented Jan 18, 2015

@epdtry found that he needed type information for constructing the dependency graph, although I don't recall why, exactly.

spernsteiner commented Jan 26, 2015

As I recall, type information wasn't strictly required, but it let the analysis obtain more precise dependency information for calls to trait methods. If a function contains `x + y`, knowing the types of `x` and `y` lets you find the precise implementation of `add` that's being called. Without type information, you must conservatively assume that it could be a call to any `add` implementation in scope.

I think this design would have less trouble operating without type information because (correct me if I'm wrong) the + would constitute a reference to the generic Add::add interface, not a reference to any specific implementation body. My design did not distinguish bodies from interfaces because inlining can happen anywhere, causing the body of one function to depend on another.
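The precision trade-off can be sketched as follows (a toy model with string type names, not the actual analysis): with type information the dependency set for `x + y` collapses to one impl, without it every `Add` impl in scope must be assumed.

```rust
// `impls` lists the self types of every `Add` impl in scope (illustrative).
fn add_impl_deps(known_type: Option<&str>, impls: &[&str]) -> Vec<String> {
    match known_type {
        // Type info available: `x + y` depends on exactly one impl body.
        Some(t) => impls
            .iter()
            .filter(|i| **i == t)
            .map(|i| format!("impl Add for {}", i))
            .collect(),
        // No type info: conservatively depend on every impl in scope.
        None => impls
            .iter()
            .map(|i| format!("impl Add for {}", i))
            .collect(),
    }
}

fn main() {
    println!("{:?}", add_impl_deps(Some("i32"), &["i32", "f64"]));
}
```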

michaelwoerister (Author) replied Jan 29, 2015

Thanks for the comment, @epdtry!

It occurred to me that the node template graph, together with the set of visible traits, forms the dependency graph for type inference. So, while it may not be possible to accurately compute the dependencies of a function body without doing type inference, it should be possible to cache type inference results just like other compilation artefacts.


# Alternatives

I'd definitely like to hear about them.

nrc (Member) commented Jan 18, 2015

An alternative I have been thinking about as a long term solution is full incremental compilation (as opposed to incremental codegen) where we could compile a single span, from parsing all the way to codegen. This is more useful for IDEs and similar tools, but would also give us more scope to incrementalise normal compilation. I envisaged generating much more thorough metadata for crates, enough that, for example, when a function is modified, that function could be compiled independently of the rest of the crate. This would require having all the type information we currently use for type checking in the metadata. (We would of course also need the object files and so forth used for incremental codegen).

The only way for this to be sane to implement would be if we had better representations of the compiler's intermediate representations, and these could be serialised to make the metadata, rather than the current ad hoc approach (but this seems like a win from an architectural point of view too).

michaelwoerister (Author) replied Jan 19, 2015

I definitely think of this RFC as just a first step towards an architecture that does even more things incrementally. For example, the concept of 'interfaces' that I use in the RFC seems very fruitful to me in terms of thinking about what is really needed where, and using that understanding to improve the whole compilation process. Once you have extracted these 'interfaces', almost everything else should be doable at a per-item level and thus in parallel (not just codegen, but also type checking and many other parts of static analysis).

Anyway, I'd regard a fully incremental solution as the long-term goal too.

Ericson2314 (Contributor) commented Jan 25, 2015

In the long run, both rustc and cargo will benefit from a general purpose caching and dependency management framework. I'm inclined to go big or go home on these things, so perhaps that could be developed separately as a library from the get-go? Between this proposal, and the new IR ones, sounds like rustc is basically going to be rewritten.


In this example `transmogrify<Kid, Tiger>` will have a different dependency graph than `transmogrify<Dinosaur, Gastropod>`. In other words, the monomorphized implementation of `transmogrify<Kid, Tiger>` is not affected if the definition of `Dinosaur` or `Gastropod` changes and the dependency graph should reflect this.

One way to model this behavior is to create, for generic `program items`, not dependency graph nodes but `node templates`, which---like generic items---have type parameters and yield a concrete, monomorphic dependency graph node once all type parameters are substituted with concrete arguments. When the need arises to check whether a particular monomorphized function implementation from the cache can be re-used, the dependency graph for the function can be constructed on demand from the given `node template` and the parameter substitutions.
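The `node template` idea could be sketched like this (a toy model with string IDs standing in for real item IDs, not the proposed implementation): instantiation substitutes concrete type arguments into the template's dependency list, yielding the dependencies of one monomorphic instance.

```rust
use std::collections::HashMap;

// A dependency-graph template for a generic item such as `transmogrify<S, T>`.
struct NodeTemplate {
    type_params: Vec<&'static str>,
    deps: Vec<&'static str>, // dependencies, possibly naming a type param
}

impl NodeTemplate {
    // Substitute concrete arguments for the type parameters, yielding the
    // dependency list of one monomorphic instance.
    fn instantiate(&self, args: &[&'static str]) -> Vec<String> {
        let subst: HashMap<&str, &str> = self
            .type_params
            .iter()
            .copied()
            .zip(args.iter().copied())
            .collect();
        self.deps
            .iter()
            .map(|d| subst.get(d).copied().unwrap_or(*d).to_string())
            .collect()
    }
}

fn main() {
    // Sketch of `transmogrify<S, T>` whose body mentions `S`, `T`, and `Helper`.
    let template = NodeTemplate {
        type_params: vec!["S", "T"],
        deps: vec!["S", "T", "Helper"],
    };
    println!("{:?}", template.instantiate(&["Kid", "Tiger"]));
}
```

Under this model, `transmogrify<Kid, Tiger>` depends on `Kid` and `Tiger` but not on `Dinosaur` or `Gastropod`, matching the behavior described above.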

comex commented Jan 19, 2015

If you explicitly stored the list of dependencies of each cache item in the cache, then it wouldn't be necessary to construct anything on demand - whether generic or not, you wouldn't have to go through the function to resolve types and such to determine what it actually depends on, which sounds like an improvement.

In any case, if you're caching object code, you need to store the list of functions that were inlined or otherwise had their behavior consulted by LLVM for codegen of the current function, or else you have to recompile a function whenever any of its transitive dependencies change in any way. (Unless, based on one of the notes below, you're giving up on combining optimization and incrementality? I guess it doesn't have to work in the initial version.)

Other than that I don't have much to say, but I will be eagerly watching any work on this, because I really hate slow compiles. :p

michaelwoerister (Author) replied Jan 19, 2015

Re: storing explicit lists of dependencies:
I've thought about that too and in the end I think it doesn't make that much of a difference. It might well be that an actual implementation would do it that way. However, I for one have learned a lot about the problem at hand by trying to come up with a formal model to describe dependencies within a Rust program.

Re: inlining:
Yes, that is a problem. From a theoretical point of view it's not that hard: you just add an edge to the body-node of the inlined function in the dependency graph. The hard part is finding out what will be inlined (or what was inlined, if you are "recording" dependencies), because that's all LLVM turf.
But maybe it's not a big problem, because you only cache unoptimized LLVM IR anyway, and for object code it won't be possible to do any inlining, since implementations are not available. Special support could be added for `#[inline(always)]` functions. If you want a more optimized build, you would have to use the granularity optimization from the 'Miscellaneous' section, and there you could be more conservative with adding dependency edges, so you would catch inlining occurrences anyway. But that's definitely an interesting point that would need a lot of testing to get right, I think.

dhardy (Contributor) commented Jan 19, 2015

Please don't make the acronym RAII more confusing by adding a competing definition. Arguably Resource Allocation Is Initialisation is the wrong name anyway, but that's no reason to make the term more difficult to explain to new-comers.

bstrie (Contributor) commented Jan 19, 2015

I agree with @dhardy. As an alternative, may I propose "So Far, Incrementalism Necessitates An Exegesis"?

michaelwoerister (Author) commented Jan 19, 2015

@dhardy Sorry for the confusion :) That suggestion wasn't meant entirely seriously.

@bstrie I am in awe. Now do TANSTAAFL !

Diggsey (Contributor) commented Jan 20, 2015

@dhardy
You're right that is the wrong name :P, it's "Resource Acquisition Is Initialisation"
(also we have the same first name...)

nrc self-assigned this Jan 22, 2015

The dependency tracking system as described above contains `node templates` for `program item` definitions on a syntactic level; that is, for each `struct`, `enum`, `type`, and `trait` there is one `node template`, and for each `fn`, `static`, and `const` there are two (one for the interface, one for the body). However, as seen in the section on generics, the codebase can refer to monomorphized instances of program items that cannot be identified by a single identifier as described above. A reference like `Option<String>` is a composite of multiple `program item` IDs, a tree of program item IDs in the general case:
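Such a tree of item IDs could be modeled as follows (an illustrative toy, with string IDs standing in for real `program item` IDs): the root names the outer item and its children name the type arguments, and collecting the IDs in the tree yields the dependency edges of the item containing the reference.

```rust
// A composite reference like `Option<String>` as a tree of item IDs.
#[derive(Debug)]
struct ItemRef {
    id: &'static str,
    args: Vec<ItemRef>,
}

// Collect every item ID the reference mentions; each one becomes a
// dependency edge of the item containing the reference.
fn mentioned_ids(r: &ItemRef, out: &mut Vec<&'static str>) {
    out.push(r.id);
    for arg in &r.args {
        mentioned_ids(arg, out);
    }
}

fn main() {
    // `Option<String>` -> root `Option` with one child `String`.
    let option_string = ItemRef {
        id: "Option",
        args: vec![ItemRef { id: "String", args: vec![] }],
    };
    let mut ids = Vec::new();
    mentioned_ids(&option_string, &mut ids);
    println!("{:?}", ids); // ["Option", "String"]
}
```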

spernsteiner commented Jan 26, 2015

On the subject of monomorphized identifiers: you'll probably need to do something about symbol naming for monomorphizations of functions. Right now the name includes the hash of the pointers to the `Ty`s representing the type arguments (which is random, thanks to ASLR). This does fine at preventing collisions, but it means you'll need to either record the mapping of (polymorphic function, type arguments) -> (symbol name) for use in later incremental builds, or fix symbol naming to produce something consistent. I tried to do the latter, but it wound up being a little more complicated than I expected (ADT `Ty`s reference the struct/enum definition by its `DefId`, which is not stable) and I don't remember if I ever got it working.
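A hedged sketch of the "consistent naming" option (the helper name and inputs are invented for illustration): hash stable inputs, here the function's path and the printed type arguments, rather than pointer values, so the same monomorphization always gets the same symbol.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive a monomorphization's symbol suffix from stable, printable inputs
// instead of `Ty` pointer addresses, so it is reproducible across runs.
fn stable_symbol(fn_path: &str, type_args: &[&str]) -> String {
    let mut hasher = DefaultHasher::new();
    fn_path.hash(&mut hasher);
    type_args.hash(&mut hasher);
    format!("{}::h{:016x}", fn_path, hasher.finish())
}

fn main() {
    println!("{}", stable_symbol("transmogrify", &["Kid", "Tiger"]));
}
```

(`DefaultHasher::new()` is deterministic within one std version, which suffices to illustrate the idea; a real scheme would need a hash that is stable across compiler versions too.)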

michaelwoerister (Author) replied Jan 29, 2015

Yes, that's a problem. I'd probably try to find a more stable symbol naming scheme.

nrc added the T-compiler label May 15, 2015
nikomatsakis (Contributor) commented Aug 16, 2015

I am expanding and adapting this RFC. After some discussion with @michaelwoerister we decided to close this existing PR for the time being.
