New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable reuse of `.o` files if nothing has changed #34956

Merged
merged 9 commits into from Jul 29, 2016

Conversation

Projects
None yet
6 participants
@nikomatsakis
Contributor

nikomatsakis commented Jul 21, 2016

This PR completes a first "spike" for incremental compilation by enabling us to reuse .o files when nothing has changed. When in incr. mode, we will save .o files into the temporary directory, then copy them back out again if they are still valid. The code is still a bit rough but it does seem to work. =)

r? @michaelwoerister

Fixes #34036
Fixes #34037
Fixes #34038

@michaelwoerister

This comment has been minimized.

Contributor

michaelwoerister commented Jul 21, 2016

Excellent :) I'll take a look at it first thing tomorrow.

@jonas-schievink

This comment has been minimized.

Contributor

jonas-schievink commented Jul 21, 2016

I just tried to build glium with this and I'm hitting multiple ICEs about local/foreign items:

https://gist.github.com/jonas-schievink/536a1ad7aa065193e4954242c9885cbd

(I've set RUSTFLAGS=-Zincremental=target/INCREMENTAL before running cargo)

/// them even in the absence of a tcx.)
#[derive(Clone, Debug, PartialEq, Eq, Hash, RustcEncodable, RustcDecodable)]
pub enum WorkProductId {
PartitionObjectFile(String), // see (*TransPartition) below

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Nit: Our nomenclature around "partitions", "codegen units", "translation units", etc is a bit inconsistent. We are using these names more or less interchangeably and it would probably be good to decide on the canonical term.

/// this map. We can later look for and extract that data.
previous_work_products: RefCell<FnvHashMap<Arc<WorkProductId>, WorkProduct>>,
/// work-products that we generate in this run

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Nit: These doc-comments should start with an uppercase letter.

/// Each work product is associated with a dep-node, representing the
/// process that produced the work-product. If that dep-node is found
/// to be dirty when we load up, then we will delete the work-product
/// at load time. If the work-product is found to be clean, the we

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

s/the/then

@@ -88,7 +88,7 @@ pub fn compile_input(sess: &Session,
// We need nested scopes here, because the intermediate results can keep
// large chunks of memory alive and we want to free them as soon as
// possible to keep the peak memory usage low
let (outputs, trans) = {
let (outputs, trans, id) = {

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

crate_name would be a better variable name than id

match decode_dep_graph(tcx, &dep_graph_data, &work_products_data) {
Ok(()) => return,
Err(err) => bug!("decoding error in dep-graph from `{}` and `{}`: {}",

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Is this necessarily a bug? At least it seems like something the compiler can recover from by just ignoring it.

}
}
pub fn load_dep_graph_if_exists<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, path: &Path) {
fn load_data(sess: &Session, path: &Path) -> Option<Vec<u8>> {

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

At some point we'll want to use memory maps for things like these. (just an observation, not related to this PR specifically)

/// Decode the dep graph and load the edges/nodes that are still clean
/// into `tcx.dep_graph`. On success, returns a hashset containing all
/// the paths of work-products from clean nodes (any work-products not
/// in this set can be deleted).

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

It looks like there is no hashset returned and the function takes care of deleting invalidated work-products itself.

@@ -913,13 +929,46 @@ fn build_work_item(sess: &Session,
}
}
fn link_or_copy<P: AsRef<Path>, Q: AsRef<Path>>(p: P, q: Q) -> io::Result<()> {

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

This looks like something that could go into some utils module.

@@ -2279,7 +2283,14 @@ fn internalize_symbols(cx: &CrateContextList, reachable: &HashSet<&str>) {
// Examine each external definition. If the definition is not used in
// any other compilation unit, and is not reachable from other crates,
// then give it internal linkage.
for ccx in cx.iter() {
for ccx in cx.iter_need_trans() {
if ccx.sess().opts.debugging_opts.incremental.is_some() {

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Why not just exit at the beginning of the function?

items: FnvHashMap(),
});
.or_insert_with(|| CodegenUnit::new(codegen_unit_name.clone(),
FnvHashMap()));

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Could use CodegenUnit::empty(codegen_unit_name.clone()) here too.

@@ -88,6 +99,13 @@ impl<'a, 'tcx> TransItem<'tcx> {
}
}
TransItem::Fn(instance) => {
let _task;
if instance.def.is_local() {

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Why only for local items?

This comment has been minimized.

@nikomatsakis

nikomatsakis Jul 22, 2016

Contributor

Why only for local items?

Yeah, hmm, a good question. :) That might be wrong.

//!
//! ```
//! #![rustc_partition_reused(module="spike", cfg="rpass2")]
//! #![rustc_partition_translated(module="spike-x", cfg="rpass2")]

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

I personally would find rustc_partition_reused vs rustc_partition_not_reused be clearer than rustc_partition_reused vs rustc_partition_translated.

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

I just figured that there is the third case of a partition being neither re-used nor re-translated but removed entirely. If this third deleted case was added to the list then reused vs translated makes more sense.

This comment has been minimized.

@nikomatsakis

nikomatsakis Jul 22, 2016

Contributor

I just figured that there is the third case of a partition being neither re-used nor re-translated but removed entirely.

I was anticipating a third-case when I did the names, but I'm not sure now if that makes sense. The third case I had in mind was one where we had saved the LLVM IR pre-optimization -- but in that case, I realize now you might still want Translated. Or at least the backend, optimizing code doesn't care where the LLVM IR came from (though we might use another variant just for unit testing).

I hadn't thought about the case of deleting a partition -- I don't know if we would need that for any reason? Just to cleanup the saved file, are you thinking? I had imagined we would do some sort of "GC" where, on startup--or maybe termination--we just crawl the incr-comp directory and delete random files that don't correspond to known work-products. In fact, I had such code, but I remember now I wound up deleting it because in the newer scheme I didn't need it to get things working.

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

I don't know if we would need that for any reason? Just to cleanup the saved file, are you thinking?

Yes that, but also logically it's another case, right? There'd be something wrong if I removed a module from the source code but there'd still be a WorkProduct listed for it.

I had imagined we would do some sort of "GC"

Isn't this covered by reconcile_work_products() anyway?

This comment has been minimized.

@nikomatsakis

nikomatsakis Jul 22, 2016

Contributor

Yes that, but also logically it's another case, right?

Hmm, well, logically yes, though (at least now) we don't create a ModuleTranslation for modules that used to exist in some previous compilation but no longer do, since there is no work to do in such a case -- but I guess we could just assert that (indeed) no module exists with that name.

Isn't this covered by reconcile_work_products() anyway?

Yeah, I forgot about that. Indeed, for the most part it is, which is why I delete the other code I had that just scraped up files with "suspicious looking" names.

//!
//! The reason that we use `cfg=...` and not `#[cfg_attr]` is so that
//! the HIR doesn't change as a result of the annotations, which might
//! perturb the reuse results.

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Good thinking :)

}
#[rustc_dirty(label="TypeckItemBody", cfg="rpass2")]
#[rustc_clean(label="ItemSignature", cfg="rpass2")]

This comment has been minimized.

@michaelwoerister

michaelwoerister Jul 22, 2016

Contributor

Does it make sense to have these tests here? If they failed for some reason, they might hide that the rustc_partition_reused framework is broken.

This comment has been minimized.

@nikomatsakis

nikomatsakis Jul 22, 2016

Contributor

Does it make sense to have these tests here? If they failed for some reason, they might hide that the rustc_partition_reused framework is broken.

True. Removed.

@michaelwoerister

This comment has been minimized.

Contributor

michaelwoerister commented Jul 22, 2016

OK, I've done a first pass over this. It's so nice to see all the work from the last 10 months finally coming together! (I mean, not that we are anywhere near being done yet, of course :))

The PR looks very good. The only thing I'm not so sure about is the factoring of the WorkProduct functionality. It seems a bit strange to have the partitioning-hashes in there, something which will only be used for object files (and maybe .bc files), but not for something like MIR. At the same time the WorkProductId does not scale to things with a DefId. It seems to me that there is room for improvement. However, I won't block the PR on this.

Another thing that isn't quite clear to me is the relationship between the "temp" directory and the "incr. comp.". Are they the same when incr. comp. is enabled? Or do we copy back and forth between them? I guess, we'll clarify these questions when we tackle #34957, #34955, and #32754.

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Jul 22, 2016

The only thing I'm not so sure about is the factoring of the WorkProduct functionality. It seems a bit strange to have the partitioning-hashes in there, something which will only be used for object files (and maybe .bc files), but not for something like MIR. At the same time the WorkProductId does not scale to things with a DefId. It seems to me that there is room for improvement.

As I said on IRC, I also felt like the WorkProduct thing wasn't as general as I originally thought it would be. I suspect it might be better to just rip it out and change all that stuff to be specific to object files -- though I guess what is there could work just as well for bc files as it does for o files, so it's at least general enough for that.

(OTOH, it won't work for MIR, since we'd want something with a def-id as key for that; they also probably don't need the auxiliary hash info.)

I'm inclined to leave it now but just expect it to evolve, but that's mostly because I'm lazy. I'd also be happy to rip it out.

@michaelwoerister

This comment has been minimized.

Contributor

michaelwoerister commented Jul 22, 2016

I'm inclined to leave it now but just expect it to evolve, but that's mostly because I'm lazy. I'd also be happy to rip it out.

I'm fine with leaving it as it is in the PR for now. If we had a clear vision of what it ought to look like, that would be different.

@bors

This comment has been minimized.

Contributor

bors commented Jul 22, 2016

☔️ The latest upstream changes (presumably #34917) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis nikomatsakis force-pushed the nikomatsakis:incr-comp-o-files branch from 3293bac to 5c56fca Jul 22, 2016

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Jul 22, 2016

@bors r=mw

@bors

This comment has been minimized.

Contributor

bors commented Jul 22, 2016

📌 Commit 5c56fca has been approved by mw

@bors

This comment has been minimized.

Contributor

bors commented Jul 23, 2016

⌛️ Testing commit 5c56fca with merge f62c12b...

bors added a commit that referenced this pull request Jul 23, 2016

Auto merge of #34956 - nikomatsakis:incr-comp-o-files, r=mw
Enable reuse of `.o` files if nothing has changed

This PR completes a first "spike" for incremental compilation by enabling us to reuse `.o` files when nothing has changed. When in incr. mode, we will save `.o` files into the temporary directory, then copy them back out again if they are still valid. The code is still a bit rough but it does seem to work. =)

r? @michaelwoerister

Fixes #34036
Fixes #34037
@bors

This comment has been minimized.

Contributor

bors commented Jul 24, 2016

💔 Test failed - auto-linux-musl-64-opt

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Jul 25, 2016

@michaelwoerister care to review the final commit?

bors added a commit that referenced this pull request Jul 26, 2016

Auto merge of #34956 - nikomatsakis:incr-comp-o-files, r=michaelwoeri…
…ster

Enable reuse of `.o` files if nothing has changed

This PR completes a first "spike" for incremental compilation by enabling us to reuse `.o` files when nothing has changed. When in incr. mode, we will save `.o` files into the temporary directory, then copy them back out again if they are still valid. The code is still a bit rough but it does seem to work. =)

r? @michaelwoerister

Fixes #34036
Fixes #34037
@bors

This comment has been minimized.

Contributor

bors commented Jul 26, 2016

💔 Test failed - auto-win-msvc-64-opt-rustbuild

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Jul 26, 2016

@bors: retry

On Tue, Jul 26, 2016 at 5:23 AM, bors notifications@github.com wrote:

💔 Test failed - auto-win-msvc-64-opt-rustbuild
https://buildbot.rust-lang.org/builders/auto-win-msvc-64-opt-rustbuild/builds/1979


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#34956 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAD95JZkQsxuMpO50qTI2ocLXsY7R55Nks5qZfw2gaJpZM4JSAbm
.

@bors

This comment has been minimized.

Contributor

bors commented Jul 28, 2016

☔️ The latest upstream changes (presumably #34907) made this pull request unmergeable. Please resolve the merge conflicts.

nikomatsakis added some commits Jul 21, 2016

Extend DepGraph so it can track "work-products"
A work product right now is just a `.o` file. In the future it probably
includes other kinds of files, such as `.bc` files saving the
unoptimized LLVM IR.

However, because WorkProductIds must be independent of DefIds, so that
they don't need translation, this system may not be suitable *as is* for
storing fine-grained information (such as the MIR for individual defs),
as it was originally intended. We will want to refactor some for that.
Store `crate_disambiguator` as an `InternedString`
We used to use `Name`, but the session outlives the tokenizer, which
means that attempts to read this field after trans has complete
otherwise panic. All reads want an `InternedString` anyhow.
Code to save/load the work-products map from disk
Work products are deleted if any of their inputs are dirty.
Modify trans to skip generating `.o` files
This checks the `previous_work_products` data from the dep-graph and
tries to simply copy a `.o` file if possible.  We also add new
work-products into the dep-graph, and create edges to/from the dep-node
for a work-product.
Keep multiple files per work-product
In the older version, a `.o` and ` .bc` file were separate
work-products.  This newer version keeps, for each codegen-unit, a set
of files of different kinds. We assume that if any kinds are available
then all the kinds we need are available, since the precise set of
switches will depend on attributes and command-line switches.

Should probably test this: the effect of changing attributes in
particular might not be successfully tracked?

@nikomatsakis nikomatsakis force-pushed the nikomatsakis:incr-comp-o-files branch from ea13edb to 2f9fff2 Jul 28, 2016

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Jul 28, 2016

@bors r=mw

@bors

This comment has been minimized.

Contributor

bors commented Jul 28, 2016

📌 Commit 2f9fff2 has been approved by mw

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Jul 28, 2016

@bors r-

seeing make check errors locally :(

nikomatsakis added some commits Jul 28, 2016

hash def-path's better
actually we shouldn't even hash nested items at all, but that is
addressed in a followup PR
@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Jul 28, 2016

@bors r=mw

@bors

This comment has been minimized.

Contributor

bors commented Jul 28, 2016

📌 Commit 42cd5d4 has been approved by mw

@bors

This comment has been minimized.

Contributor

bors commented Jul 28, 2016

⌛️ Testing commit 42cd5d4 with merge 54c0dcf...

bors added a commit that referenced this pull request Jul 28, 2016

Auto merge of #34956 - nikomatsakis:incr-comp-o-files, r=mw
Enable reuse of `.o` files if nothing has changed

This PR completes a first "spike" for incremental compilation by enabling us to reuse `.o` files when nothing has changed. When in incr. mode, we will save `.o` files into the temporary directory, then copy them back out again if they are still valid. The code is still a bit rough but it does seem to work. =)

r? @michaelwoerister

Fixes #34036
Fixes #34037
Fixes #34038

@bors bors merged commit 42cd5d4 into rust-lang:master Jul 29, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@WiSaGaN

This comment has been minimized.

Contributor

WiSaGaN commented Aug 5, 2016

Will this reduce the time of cargo check if only one of several files changes?

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Aug 10, 2016

@WiSaGaN it should, yes, but we are still fixing some known shortcomings there.

@nikomatsakis nikomatsakis deleted the nikomatsakis:incr-comp-o-files branch Oct 3, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment