Incremental recompilation #2369
Comments
ghost assigned catamorphism May 8, 2012
There is code in trans that tries to determine dependencies for the purposes of metadata export; this seems like a reasonable starting point for such a graph.
I think it would be nice if this were done in such a way that incremental compilation appears to the programmer no differently than full compilation. For example, I mean that:
@Dretch -- I totally agree; the only thing the programmer should notice is that recompilation will usually be much faster :-) (1) and (2) are great goals to have written down; if we don't achieve those (especially (2)), I won't consider this issue to be completed.
If we do this at the level of "caching bitcode for distinguished subtrees of a crate", it's probably not too hard (though I'm not sure it'll make things a lot faster). If we do this at the level of "trying to cache target-machine object files", things get a fair bit weirder.

We (or, well, I) chose the existing compilation model for crates based on the need to "eventually" do cross-module optimization (in particular, single-instantiation of monomorphic code and inlining). Crates were the optimization boundary, the compilation-unit boundary. We have subsequently grown cross-crate inlining, so this distinction is a bit less meaningful, at least in terms of optimization. There is still a meaningful (in the sense of "not easy to eliminate or blur") linkage boundary at work in two terms: monomorphization instantiations and version-insensitivity (the "anything called 1.0 with the same types is the same crate" rule discussed last week).

Overall, this is one of a few bugs pointing in a similar direction: #2176, #1980, #2238, #558, #2166, #456 and even #552 to some extent. I am not saying these are all wrong; they are all pointing to similar sets of semantic weaknesses .. or "surprises" .. in the existing compilation model. I would like to have a conversation at some point (probably in public, or videoconf, or both) where we approach this problem as a design problem, and try to work out a new set of agreeable principles and a plan of action for future work that spans the whole set of related bugs. I do want to fix them, but do not want to go much further down this road without having a map of where we're going. As an example: it could be that we wind up treating all inter-module functionality uniformly via multiple dimensions of a single kind of
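To make the monomorphization part of that linkage boundary concrete, here is a minimal sketch (made-up names, written as a single file for brevity): a generic definition lives in one crate, but each concrete instantiation is generated in the downstream crate that uses it, so cached object code for the defining crate alone is not enough to link against.

```rust
// Illustration of the linkage boundary: the generic is only a template in the
// crate that defines it; instantiations are emitted where concrete types are known.

// --- conceptually in crate `util` ---
pub fn largest<T: PartialOrd>(items: &[T]) -> Option<&T> {
    items.iter().fold(None, |best, x| match best {
        Some(b) if b >= x => Some(b),
        _ => Some(x),
    })
}

// --- conceptually in a downstream crate ---
fn main() {
    // These two calls force two separate instantiations, largest::<i32> and
    // largest::<f64>, generated in *this* crate, not in `util`.
    println!("{:?}", largest(&[3, 1, 4]));
    println!("{:?}", largest(&[2.5, 9.1]));
}
```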
Yes, don't worry, I won't jump into this without some serious design discussions.
catamorphism referenced this issue Jul 16, 2012
Closed: Tool to detect/display function dependencies in Rust crates #558
catamorphism referenced this issue May 16, 2013
Closed: Don't rebuild everything on test-or-comment-only changes #6522
cmr referenced this issue Jul 7, 2013
Closed: Implement dependency information output like GCC to rustc #7633
Issue appears to be properly classified.
High, not 1.0.
Do we still want to attempt this, or is splitting into crates viewed as enough now that we have static linking? I guess the missing feature would be combining multiple static libraries into a dynamic library.
catamorphism removed their assignment Jun 16, 2014
Right... Ideally this should take into account whether compiler flags and/or environment variables have changed. Compiler version would perhaps be another thing to take care of. So a simple solution is to store output in directories whose names contain a checksum of all the parameters we care about.
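A minimal sketch of that checksum-named-directory idea (the parameter names below are illustrative, and a real implementation would use a hash that is stable across compiler releases, unlike `DefaultHasher`):

```rust
// Derive a cache directory name from a checksum of the parameters that should
// invalidate cached output: compiler version, flags, and relevant env vars.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::PathBuf;

fn cache_dir(base: &str, compiler_version: &str, flags: &[&str], env: &[(&str, &str)]) -> PathBuf {
    let mut h = DefaultHasher::new();
    compiler_version.hash(&mut h);
    flags.hash(&mut h);
    env.hash(&mut h);
    // Different flags/env/compiler -> different directory, so stale artifacts
    // from another configuration are never picked up by mistake.
    PathBuf::from(base).join(format!("{:016x}", h.finish()))
}

fn main() {
    let dir = cache_dir(
        "target/incremental",
        "rustc 1.0.0-nightly",
        &["-O", "--cfg", "debug"],
        &[("RUSTFLAGS", "-C target-cpu=native")],
    );
    println!("{}", dir.display());
}
```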
Personally, I do still believe that compilers shouldn't be too clever about how to build projects, and attempting to replace tools like make likely wouldn't end well. I know that clang has those JSON files, which I haven't quite looked into... Also wanted to point out that Qt's QBS certainly looks very appealing.
Distributed build systems for Rust: http://discuss.rust-lang.org/t/distributed-build-systems-for-rust/400
Or maybe Rust should use Ninja…
Shake is another interesting build system: https://github.com/ndmitchell/shake
A comparison with Ninja: http://neilmitchell.blogspot.fr/2014/05/build-system-performance-shake-vs-ninja.html
l0kod referenced this issue Sep 6, 2014
Merged: parallelize LLVM optimization and codegen passes #16367
suhr commented Feb 7, 2015
nh2 commented Aug 7, 2015
Hi, let me give some pointers to the Haskell world. GHC has this solved, and probably has the best working incremental recompilation engine on the planet. It can do:
GHC has documented its approach in great detail and highlighted what the problems are. See here: https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/RecompilationAvoidance
I believe that Rust is in perfect shape to reach the same level of incremental compilation, and can likely apply the same techniques. It's all there, we just have to copy it ;)
I once also thought so, but this is not the case. External build tools like
ebassi commented Aug 11, 2015
An important side effect of incremental recompilation that was not mentioned, and which I think is more important than enabling build environments, is enabling tooling (such as IDEs) to perform on-the-fly code analysis and prompt the user with warnings, code completion, etc.
The problem with 99% of build tools out there is that they duplicate dependency management: they either let the user specify dependencies before target execution, or scan for them heuristically using prior knowledge of the type of target being executed, and they often get this wrong, failing to reproduce the compiler's intrinsic logic and producing broken or failed builds. Builds are also sometimes slower than they could be, because the developer gives up on specifying precise dependencies in favor of safe but slow re-execution, to avoid the aforementioned broken builds.

I suggest that you look into a newer approach, where dependencies are detected reliably for any kind of intermediate and the developer only needs to worry about the target invocation itself. So, if you neatly split the build process into a DAG of separate process invocations, even with extremely complicated dependencies, this tool can track them easily.
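A toy sketch of that DAG-of-invocations idea (not any particular tool; step names and inputs are made up): fingerprint each step's inputs together with its dependencies' fingerprints, and re-run a step only when the fingerprint differs from the previous build.

```rust
// Model the build as a DAG of steps; a step is re-run only when its recorded
// fingerprint no longer matches.
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

struct Step {
    name: &'static str,
    deps: Vec<&'static str>, // names of steps this one consumes
    input: &'static str,     // stand-in for real inputs (files, flags, ...)
}

fn fingerprint(step: &Step, dep_prints: &[u64]) -> u64 {
    let mut h = DefaultHasher::new();
    step.input.hash(&mut h);
    dep_prints.hash(&mut h); // a step is dirty if any dependency changed
    h.finish()
}

fn main() {
    // Steps listed in topological order for simplicity.
    let steps = vec![
        Step { name: "parse",     deps: vec![],            input: "src contents" },
        Step { name: "typecheck", deps: vec!["parse"],     input: "" },
        Step { name: "codegen",   deps: vec!["typecheck"], input: "-O" },
    ];
    let previous: HashMap<&str, u64> = HashMap::new(); // loaded from disk in a real tool
    let mut current: HashMap<&str, u64> = HashMap::new();

    for step in &steps {
        let dep_prints: Vec<u64> = step.deps.iter().map(|d| current[d]).collect();
        let print = fingerprint(step, &dep_prints);
        if previous.get(step.name) != Some(&print) {
            println!("re-running {}", step.name); // would spawn the process here
        } else {
            println!("up to date: {}", step.name);
        }
        current.insert(step.name, print);
    }
}
```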
nikomatsakis self-assigned this Jan 6, 2016
nh2 commented Jan 23, 2016
@huonw Is this issue the right place to discuss the RFC?
nh2 commented Jan 23, 2016
Even if not, I'll ask some questions here:
Why was this chosen? Wouldn't it make sense to include source files in the dependency graph as well, so that you can skip parsing, and even reading the file contents, if the file modification time suggests that the file has not changed?
I'm not familiar with codegen units, and where their boundaries would be set, but if it's typically at the library/crate level, that could make some inlining problematic. Take for example Haskell's
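A minimal sketch of the mtime-based skip raised in the first question above (purely illustrative; rustc's dependency tracking ended up hashing content rather than trusting timestamps):

```rust
// Re-parse a file only if its mtime moved past the one recorded for the cached AST.
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

fn needs_reparse(src: &Path, cached_mtime: Option<SystemTime>) -> io::Result<bool> {
    let current = fs::metadata(src)?.modified()?;
    Ok(match cached_mtime {
        Some(recorded) => current > recorded, // unchanged timestamp -> reuse cached parse
        None => true,                         // nothing cached yet
    })
}

fn main() -> io::Result<()> {
    // Hypothetical file and cache state, just to exercise the check.
    let src = Path::new("src/lib.rs");
    if needs_reparse(src, None)? {
        println!("re-parsing {}", src.display());
    } else {
        println!("reusing cached parse of {}", src.display());
    }
    Ok(())
}
```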
Eventually, perhaps, yes. But for the initial versions, we're targeting the things in compilation that are most expensive: LLVM and type-checking. Hashing the HIR also means that we can avoid doing recompilation for smaller, trivial changes, like tweaking a comment -- at least in some cases (it turns out that because such a change affects the line/column numbers of all subsequent statements, we would need to regenerate at least the debuginfo, but we can hopefully isolate the effects of that in the future). There are also just practical concerns: it's much easier to reduce the amount of the compiler we have to instrument.
Users can always add
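A rough sketch of that HIR-hashing idea (the `Item` type and its fields below are stand-ins, not rustc's real data structures): fingerprint a span-free representation of each item and reuse cached results when the fingerprint matches the previous build.

```rust
// Skip re-checking/re-codegen for an item whose fingerprint is unchanged.
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Stand-in for a lowered item; crucially it omits spans and comments, so
// editing a comment or whitespace does not change the hash.
#[derive(Hash)]
struct Item {
    name: String,
    signature: String,
    body: String,
}

fn item_hash(item: &Item) -> u64 {
    let mut h = DefaultHasher::new();
    item.hash(&mut h);
    h.finish()
}

fn main() {
    // Fingerprints recorded by the previous compilation (made-up value).
    let previous: HashMap<String, u64> =
        [("foo".to_string(), 0xdead_beef)].into_iter().collect();

    let item = Item {
        name: "foo".to_string(),
        signature: "fn foo() -> u32".to_string(),
        body: "{ 42 }".to_string(),
    };

    let dirty = previous.get(&item.name) != Some(&item_hash(&item));
    println!(
        "{}: {}",
        item.name,
        if dirty { "re-check + re-codegen" } else { "reuse cached results" }
    );
}
```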
bluss added the A-incr-comp label Apr 30, 2016
#34956
nh2 commented Aug 19, 2016
@nikomatsakis Thanks for your explanation.
@michaelwoerister is this still the best tracking issue for incremental? What's the current status?
I forgot this issue existed. The preferred tracker is rust-lang/rust-roadmap-2017#4. In fact, I'm just going to close this issue.
nikomatsakis closed this Jun 5, 2017
@nh2: It's frustrating (and hardly uncommon) for people to show up and say we "just need to copy Haskell" while ignoring the very real differences between the languages. In this case, the most important difference is that a whole Rust crate is semantically a single compilation unit, in fact a single syntax tree. Any

The only way around this is the tedious and error-prone approach of writing an hs-boot file, equivalent to writing header files in C. In practice almost nobody does this; they simply structure their programs so the module dependency graph is acyclic. In terms of the build system (not in terms of packaging / versioning), this is like placing every

So, most of the features you highlight in
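For a concrete sense of why the crate is the unit here, a minimal sketch: within one crate, modules may reference each other cyclically with no forward declarations, so there is no module-level boundary the compiler could stop at.

```rust
// Two mutually recursive modules in one crate; no hs-boot-style forward
// declarations are needed, which is part of why the crate, not the module,
// is Rust's natural compilation unit.
mod even {
    pub fn is_even(n: u32) -> bool {
        if n == 0 { true } else { crate::odd::is_odd(n - 1) }
    }
}

mod odd {
    pub fn is_odd(n: u32) -> bool {
        if n == 0 { false } else { crate::even::is_even(n - 1) }
    }
}

fn main() {
    // Splitting these modules into separate crates would require breaking the
    // cycle by hand, much like the hs-boot / C header workaround above.
    println!("{}", even::is_even(10));
}
```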
catamorphism commented May 8, 2012
I want to be able to recompile only dependencies (rather than everything in a crate), like SML's compilation manager or GHC's --make mode. I don't really care about the approach so long as it works and the outcome is that adding one #debug call in one file doesn't trigger recompilation of 100 other files in the same crate :-) I am volunteering to work on this after 0.3, but suggestions are welcome. Patrick suggested a good place to start would be to generate a (visualizable) graph of item dependencies, which makes sense to me.
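A minimal sketch of that visualizable-graph starting point (item names are made up): given a map from each item to the items it uses, emit Graphviz DOT that can be piped into `dot -Tpng` for inspection.

```rust
// Emit a DOT graph of item dependencies so a change's blast radius is visible.
use std::collections::BTreeMap;

fn to_dot(deps: &BTreeMap<&str, Vec<&str>>) -> String {
    let mut out = String::from("digraph items {\n");
    for (item, uses) in deps {
        for used in uses {
            out.push_str(&format!("    \"{}\" -> \"{}\";\n", item, used));
        }
    }
    out.push_str("}\n");
    out
}

fn main() {
    let mut deps = BTreeMap::new();
    deps.insert("main", vec!["parse", "report"]);
    deps.insert("parse", vec!["lexer"]);
    deps.insert("report", vec![]);
    deps.insert("lexer", vec![]);
    // Rendering this shows which items a given edit would force to rebuild.
    print!("{}", to_dot(&deps));
}
```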