Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load only the crate header for locator::crate_matches #111329

Merged
merged 2 commits into from
May 29, 2023

Conversation

jyn514
Copy link
Member

@jyn514 jyn514 commented May 8, 2023

Previously, we used the following info to determine whether to load the crate:

  1. The METADATA_HEADER, which includes a METADATA_VERSION constant
  2. The embedded rustc version
  3. Various metadata in the CrateRoot, including the SVH

This worked ok most of the time. Unfortunately, when building locally the rustc version is always
the same because omit-git-hash is on by default. That meant that we depended only on 1 and 3, and
we are not very good about bumping METADATA_VERSION (it's currently at 7) so in practice we were
only depending on 3. CrateRoot is a very large struct and changes somewhat regularly, so this led
to a steady stream of crashes from trying to load it.

Change the logic to add an intermediate step between 2 and 3: introduce a new CrateHeader struct
that contains only the minimum info needed to decide whether the crate should be loaded or not. That
avoids having to load all of CrateRoot, which in practice means we should crash much less often.

Note that this works because the SVH should be different between any two dependencies, even if the
compiler has changed, because we use -Zbinary-dep-depinfo in bootstrap. See
#111329 (comment) for more details about how the
original crash happened.

Helps with #76720.

@rustbot
Copy link
Collaborator

rustbot commented May 8, 2023

r? @ozkanonur

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels May 8, 2023
@rust-log-analyzer

This comment has been minimized.

@Mark-Simulacrum
Copy link
Member

Depdep is binary dep info yeah, the flag is -Zbinary-dep-depinfo IIRC.

My sense is that this has a chance of causing extra invalidation where it's not technically needed... maybe due to e.g. doc clearing out the full build directory or something? I don't have a great test case in mind though, I'm not sure why I(?) limited that originally.

Looking at the past PRs in this area, I had cited this case:

also avoid rebuilding the whole compiler on changes to libtest (with keep-stage to limit to a single stage build) since there isn't a dependency edge between libtest and most compiler crates. That seems like a good feature to keep.

Can you check if that breaks here? My guess is yes.

In general since I wrote that on #86641 (comment), I feel like this case is less important, but would be good to have a clear sense of tradeoffs. Working builds are probably more important than slightly faster builds...

That said, my sense is that something about the depinfo changing is still a better fix here, at least in the long run. Maybe now that we have a test case that's worth taking another look at.

@jyn514
Copy link
Member Author

jyn514 commented May 8, 2023

I'm not sure why I(?) limited that originally.

looks like the original PR adding this was #67760, and you actually had the same code from this PR until you found in #67760 (comment) that it broke the tests (same as in this PR 😄). I think that test failure is fixable while still keeping this change (maybe we're deleting the sysroot too aggressively?) - I'll look into it.

also avoid rebuilding the whole compiler on changes to libtest (with keep-stage to limit to a single stage build) since there isn't a dependency edge between libtest and most compiler crates. That seems like a good feature to keep.

Can you check if that breaks here? My guess is yes.

It does not :)

; x b rustc_span --stage 0
Building stage0 library artifacts (x86_64-unknown-linux-gnu)
    Finished release [optimized] target(s) in 0.07s
Building compiler artifacts {rustc_span} (stage0 -> stage1, x86_64-unknown-linux-gnu)
   Compiling rustc_span v0.0.0 (/home/jyn/src/rust/compiler/rustc_span)
    Finished release [optimized + debuginfo] target(s) in 0.87s
; touch library/test/src/lib.rs 
; x b rustc_span --stage 0
Building stage0 library artifacts (x86_64-unknown-linux-gnu)
   Compiling test v0.0.0 (/home/jyn/src/rust/library/test)
   Compiling sysroot v0.0.0 (/home/jyn/src/rust/library/sysroot)
    Finished release [optimized] target(s) in 0.30s
Building compiler artifacts {rustc_span} (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Finished release [optimized + debuginfo] target(s) in 0.09s

That said, my sense is that something about the depinfo changing is still a better fix here, at least in the long run. Maybe now that we have a test case that's worth taking another look at.

Hmm, ok - I thought bindep-depinfo was only for rustbuild, but looking at #63012 (comment) I see @jonhoo and others are using it for toolchain tracking too. I'll see if I can minimize the bug.

@jyn514
Copy link
Member Author

jyn514 commented May 8, 2023

Actually, it looks like binary-dep-depinfo doesn't even try to address this use case? Am I missing something? https://github.com/rust-lang/rust/blob/1fa1f5932b487a2ac4105deca26493bb8013a9a6/compiler/rustc_interface/src/passes.rs#L490-L511

@bjorn3
Copy link
Member

bjorn3 commented May 8, 2023

-Zbinary-dep-depinfo does include the standard library parts used as dependencies by the respective crate, just not rustc itself.

@jyn514
Copy link
Member Author

jyn514 commented May 8, 2023

I tried adding the toolchain to the dep-info file (i.e. build/x86_64-unknown-linux-gnu/stage1/bin/rustc: gets added to the end of every file) and it did not solve this problem. I think the issue is that -Zbinary-depinfo only tells cargo to rerun rustc, not that it needs to also clear the incremental and metadata caches.

diff --git a/compiler/rustc_interface/src/passes.rs b/compiler/rustc_interface/src/passes.rs
index 6a94d19001e..a5ca209b2bf 100644
--- a/compiler/rustc_interface/src/passes.rs
+++ b/compiler/rustc_interface/src/passes.rs
@@ -517,6 +517,14 @@ fn write_out_deps(
                     files.push(escape_dep_filename(&path.display().to_string()));
                 }
             }
+
+            match std::env::current_exe() {
+                Ok(driver_path) => files.push(escape_dep_filename(&driver_path.display().to_string())),
+                Err(e) => {
+                    #[allow(rustc::untranslatable_diagnostic, rustc::diagnostic_outside_of_impl)] // old enough this changes a lot in later commits
+                    sess.err(format!("failed to determine path to current process: {e}"));
+                }
+            }
         }
 
         let mut file = BufWriter::new(fs::File::create(&deps_filename)?);

@Mark-Simulacrum
Copy link
Member

I think that test is actually misleading? You're not rebuilding the compiler in the first step, so we're not running the code here at all. You need to build through to the bin/rustc binary (I forget what Step does that). After that I still expect it to cause more re-compilations than needed.

I thought bindep-depinfo was only for rustbuild [...]
Actually, it looks like binary-dep-depinfo doesn't even try to address this use case? Am I missing something?

I'm not sure I understand what this use case is. The expectation in my mind is that if you change rustc in a way that changes metadata format, you should get a new std. The binary depdep info will include the dependency on std, and so we will cause a rebuild of the compiler.

I think that isn't true with keep-stage1-std or whatever that flag is, but I'm not sure that the fully right fix is trying to fix that... I guess we can make this conditional on that flag? Essentially if we've artificially kept any dependency Step the next one needs this special cleaning logic. (For example, keeping rustc would mean needing this logic for rustdoc, if I'm connecting the dots right?)

Essentially keep-stage bypasses this logic by avoiding an update to the Std/Library binaries.

But with #76720 (comment) I guess we're seeing this without keep-stage? I'm not sure what is particular about that test case...

it needs to also clear the incremental and metadata caches.

I'm not sure about incremental, but why would metadata caches be a problem? rustc should never be loading metadata it's not compatible with because dependencies are built by Cargo before the next rustc runs.


Looking at the test case in #76720 (comment) we're failing to recompile libserde_derive in stage1-tools for some reason, that's the file we fail to load:

INFO rustc_metadata::locator dylib reading metadata from: /home/mark/Build/rust-3/build/x86_64-unknown-linux-gnu/stage1-tools/release/deps/libserde_derive-5970f6e0b3790395.so
DEBUG rustc_metadata::locator checking 8 bytes of metadata-version stamp
DEBUG rustc_metadata::locator inflating 25684 bytes of compressed metadata

I'm going to try to figure out why we're not recompiling libserde_derive -- the .d file rustc emits for it points to a non-existent libstd file: /home/mark/Build/rust-3/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-c38e6f867e4a21ae.rlib, so I would expect Cargo to do the right thing here without further changes to rustbuild, but that doesn't seem to be happening.

@Mark-Simulacrum
Copy link
Member

https://gist.githubusercontent.com/Mark-Simulacrum/3e5d4dc8d659da79376d1181617d121a/raw/3b5176553de7ec6d0b7649bc1e0552c84dfee0f2/log is the log file including Cargo's fingerprint and rustc's metadata loading. I'm still trying to dig through that, no clear leads just yet.

My latest suspicion is that Cargo actually does rebuild the serde_derive and serde crates (in an earlier invocation), but that when rustc goes to look for the libserde_derive crate it's failing to load that as it's loading an earlier version of that crate. (Cargo doesn't delete stale files in the target directory).

I'm not quite sure yet why we see this only with serde-derive and not other (non proc macro?) dependencies...

@Mark-Simulacrum
Copy link
Member

OK, so I think I have at least a theoretical understanding of what causes this. You need to go from version A to version B of rust-lang/rust where both of these are true (plus having built A):

  • Some dependency version changed (e.g., serde_derive 1.0.x to 1.0.y).
  • The compiler's metadata format changed, without bumping the metadata version constant.
    • And you're using ignore-git or similar, so the rustc version information isn't enough to reject the crate either, causing us to go load the full CrateRoot.

The first condition is needed so that when rustc is run, it outputs a distinct/separate file (e.g., in this case we have version B with libserde_derive-67ce481370f50db2.so and version A with libserde_derive-5970f6e0b3790395.so). The second condition is needed so that when we go to load the metadata from each of these to get the full SVH and confirm which one is correct, we fail to deserialize the metadata at that time (this is what causes the ICE).

I think a better fix than the one in this PR is to change our metadata format so that we can get at the SVH, crate name, etc. (i.e., everything accessed here:

let root = metadata.get_root();
if root.is_proc_macro_crate() != self.is_proc_macro {
info!(
"Rejecting via proc macro: expected {} got {}",
self.is_proc_macro,
root.is_proc_macro_crate(),
);
return None;
}
if self.exact_paths.is_empty() && self.crate_name != root.name() {
info!("Rejecting via crate name");
return None;
}
if root.triple() != &self.triple {
info!("Rejecting via crate triple: expected {} got {}", self.triple, root.triple());
self.crate_rejections.via_triple.push(CrateMismatch {
path: libpath.to_path_buf(),
got: root.triple().to_string(),
});
return None;
}
let hash = root.hash();
if let Some(expected_hash) = self.hash {
if hash != expected_hash {
info!("Rejecting via hash: expected {} got {}", expected_hash, hash);
self.crate_rejections
.via_hash
.push(CrateMismatch { path: libpath.to_path_buf(), got: hash.to_string() });
return None;
}
}
) without needing to deserialize a bunch of the other information in the root (even just at Lazy level). This should be pretty easy to do, and would mean that the metadata constant only needs to be bumped when a small set of information changes, rather than abstractly when "something" changes, which we're bad at getting right.

(I think it should even be possible to make the metadata version something we compute dynamically, e.g., by serializing a known set of information in this "CratePrefix" and then hashing that, which may be an alternative fix here).

Either of these approaches would mean that the ICE hit in #76720 (comment) isn't hit anymore, while preserving the cache behavior we want (around libtest -> rustc_span not being a valid edge, for example). This should also make it much more rare that this happens outside of rustc tree, which I think is sometimes the case today, particularly where others also rely on -Zbinary-dep-depinfo, which we're hoping to stabilize "soon".

@petrochenkov -- do you have thoughts on the metadata format change here? Preference for whether to change the way we compute the version constant or moving some information into a crateprefix struct or similar with more "stable" encoding (e.g., no optimization around laziness, etc.)?

@jyn514 jyn514 added the A-metadata Area: Crate metadata label May 10, 2023
@petrochenkov
Copy link
Contributor

@Mark-Simulacrum

do you have thoughts on the metadata format change here?

I barely remember what is this all about.
I also never changed the version constant and don't know when it's supposed to be changed.
@bjorn3 also tried to make some changes that may be related to this issue in #93945.

@petrochenkov
Copy link
Contributor

r? @bjorn3

@rustbot rustbot assigned bjorn3 and unassigned petrochenkov May 10, 2023
@bjorn3
Copy link
Member

bjorn3 commented May 12, 2023

I think changing the crate metadata format to include all information used by the CrateLocator::crate_matches function (with the exception of if it is a proc macro or not I think) in the metadata header next to the rustc version makes sense. It could be useful if I revive #93945 too.

That being said, even with that change we will still need to manually clear the incremental cache, but at least that shouldn't cause any unnecessary recompilations. Just slower recompilations when a recompilation is actually necessary.

@jyn514
Copy link
Member Author

jyn514 commented May 12, 2023

I think an easier solution for incremental is to just turn it off altogether for stage 1+ builds; it seems unlikely in practice that we'll ever actually be able to reuse the cache.

@bjorn3
Copy link
Member

bjorn3 commented May 12, 2023

With --keep-stage enabling incremental for stage > 0 should help. But --keep-stage can cause this kind of compiler crashes anyway, so maybe not worth optimizing for that.

@jyn514 jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 20, 2023
@jyn514
Copy link
Member Author

jyn514 commented May 20, 2023

OK, so I think I have at least a theoretical understanding of what causes this. You need to go from version A to version B of rust-lang/rust where both of these are true (plus having built A):

this was exactly it!!

here's a script to reproduce it (assuming you have a checkout at rust2 and rust3, and r2-stage1/r3-stage linked):

#!/bin/sh
unset CARGO_TARGET_DIR

cargo new repro-76720
cd repro-76720
cargo add serde_derive@1.0.162

build() {
    worktree=$1
    commit=$2
    serde_version=$3
    toolchain=$4

    # We can't use RTIM because this only repros with `omit-git-hash`.
    (cd ../$worktree && git checkout $commit && x build std proc_macro)
    cargo update -p serde_derive --precise $serde_version
    cargo clean -p repro-76720 -p serde_derive
    cargo +$toolchain-stage1 build
}

build rust2 6874f4e3fc2a16be7c78e702d068bbc1daa90e16 1.0.162 r2
build rust3 999ac5f7770bff68bd65f490990d32c3ec1faaa6 1.0.163 r3

going to work tomorrow on actually fixing the bug now that there's an MCVE :) ty again!

@jyn514
Copy link
Member Author

jyn514 commented May 26, 2023

I'm going to reuse METADATA_VERSION instead of adding a second version constant for now. I think it will be very unlikely in practice to need to change CrateHeader.

@rustbot
Copy link
Collaborator

rustbot commented May 26, 2023

Some changes occurred in src/tools/rust-analyzer

cc @rust-lang/rust-analyzer

@bjorn3
Copy link
Member

bjorn3 commented May 26, 2023

Do note that this will break rust-analyzer proc macro loading on nightly until this change is synced back and I believe there was some issue preventing syncs back right now. The rust-analyzer server attempts to read the rustc version from proc macros before spawning proc-macro-srv from the sysroot.

@Veykril
Copy link
Member

Veykril commented May 26, 2023

No that should be fine. Those checks are only done at the proc-macro server side now, the rust-analyzer server doesn't check the dylibs itself anymore.

@jyn514
Copy link
Member Author

jyn514 commented May 26, 2023

@Veykril who handles subtree syncs? Will you or another RA member make a sync after this lands or should I do the sync myself?

@Veykril
Copy link
Member

Veykril commented May 26, 2023

@lnicola usually does them, but we currently have a problem with syncing because it would break our CI atm. I forgot to look into that, I'll look into that again in a bit.

@bjorn3
Copy link
Member

bjorn3 commented May 26, 2023

MacroDylib::new calls version::read_dylib_info. MacroDylib::new is called at

let dylib = MacroDylib::new(path.to_path_buf())
which is part of the rust-analyzer binary, not the rust-analyzer-proc-macro-srv binary. Or am I missing something?

@jyn514
Copy link
Member Author

jyn514 commented May 26, 2023

I wonder if RA should get the rustc version from proc-macro-srv instead of trying to read it directly. Then it wouldn't be broken by these updates. The only time it needs the version is if it failed to load the proc-macro anyway, which should be very rare now that it's using the server in the sysroot.

@Veykril
Copy link
Member

Veykril commented May 26, 2023

MacroDylib::new calls version::read_dylib_info. MacroDylib::new is called at

let dylib = MacroDylib::new(path.to_path_buf())

which is part of the rust-analyzer binary, not the rust-analyzer-proc-macro-srv binary. Or am I missing something?

Well, it doesn't do that anymore on rust-lang/rust-analyzer, the subtree is just that outdated 😅 see rust-lang/rust-analyzer@3ae9bfe

@bjorn3
Copy link
Member

bjorn3 commented May 29, 2023

@bors r+

@bors
Copy link
Contributor

bors commented May 29, 2023

📌 Commit 60e95e7 has been approved by bjorn3

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 29, 2023
@bors
Copy link
Contributor

bors commented May 29, 2023

⌛ Testing commit 60e95e7 with merge 99ff5af...

@bors
Copy link
Contributor

bors commented May 29, 2023

☀️ Test successful - checks-actions
Approved by: bjorn3
Pushing 99ff5af to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 29, 2023
@bors bors merged commit 99ff5af into rust-lang:master May 29, 2023
11 checks passed
@rustbot rustbot added this to the 1.72.0 milestone May 29, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (99ff5af): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.2% [1.2%, 1.2%] 1
Regressions ❌
(secondary)
2.0% [1.1%, 4.7%] 6
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.2% [1.2%, 1.2%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 643.87s -> 644.761s (0.14%)

@jyn514 jyn514 deleted the metadata-ice branch May 29, 2023 15:25
saethlin pushed a commit to saethlin/miri that referenced this pull request May 30, 2023
Load only the crate header for `locator::crate_matches`

Previously, we used the following info to determine whether to load the crate:
1. The METADATA_HEADER, which includes a METADATA_VERSION constant
2. The embedded rustc version
3. Various metadata in the `CrateRoot`, including the SVH

This worked ok most of the time. Unfortunately, when building locally the rustc version is always
the same because `omit-git-hash` is on by default. That meant that we depended only on 1 and 3, and
we are not very good about bumping METADATA_VERSION (it's currently at 7) so in practice we were
only depending on 3. `CrateRoot` is a very large struct and changes somewhat regularly, so this led
to a steady stream of crashes from trying to load it.

Change the logic to add an intermediate step between 2 and 3: introduce a new `CrateHeader` struct
that contains only the minimum info needed to decide whether the crate should be loaded or not. That
avoids having to load all of `CrateRoot`, which in practice means we should crash much less often.

Note that this works because the SVH should be different between any two dependencies, even if the
compiler has changed, because we use `-Zbinary-dep-depinfo` in bootstrap. See
rust-lang/rust#111329 (comment) for more details about how the
original crash happened.
@jyn514 jyn514 added beta-nominated Nominated for backporting to the compiler in the beta channel. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 30, 2023
@apiraino
Copy link
Contributor

apiraino commented Jun 1, 2023

Beta backport declined as per compiler team on Zulip

@rustbot label -beta-nominated

@rustbot rustbot removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jun 1, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jun 4, 2023
…-Simulacrum

Revert "Enable incremental independent of stage"

This reverts commit 827f656.

Incremental is not sound to use across stages. Arbitrary changes to the compiler can invalidate the incremental cache - even changes to normal queries, not incremental itself! - and we do not currently enable `incremental-verify-ich` in bootstrap. Since 2018, we highly recommend and nudge users towards stage 1 builds instead of stage 2, and using `keep-stage` for anything other than libstd is very rare.

I don't think the risk of unsoundness is worth the very minor speedup when building libstd. Disable incremental to avoid spurious panics and miscompilations when building with the stage 1 and 2 sysroot.

Combined with rust-lang#111329, this should fix rust-lang#76720.

r? `@Mark-Simulacrum`
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 20, 2024
Load only the crate header for `locator::crate_matches`

Previously, we used the following info to determine whether to load the crate:
1. The METADATA_HEADER, which includes a METADATA_VERSION constant
2. The embedded rustc version
3. Various metadata in the `CrateRoot`, including the SVH

This worked ok most of the time. Unfortunately, when building locally the rustc version is always
the same because `omit-git-hash` is on by default. That meant that we depended only on 1 and 3, and
we are not very good about bumping METADATA_VERSION (it's currently at 7) so in practice we were
only depending on 3. `CrateRoot` is a very large struct and changes somewhat regularly, so this led
to a steady stream of crashes from trying to load it.

Change the logic to add an intermediate step between 2 and 3: introduce a new `CrateHeader` struct
that contains only the minimum info needed to decide whether the crate should be loaded or not. That
avoids having to load all of `CrateRoot`, which in practice means we should crash much less often.

Note that this works because the SVH should be different between any two dependencies, even if the
compiler has changed, because we use `-Zbinary-dep-depinfo` in bootstrap. See
rust-lang/rust#111329 (comment) for more details about how the
original crash happened.
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
Load only the crate header for `locator::crate_matches`

Previously, we used the following info to determine whether to load the crate:
1. The METADATA_HEADER, which includes a METADATA_VERSION constant
2. The embedded rustc version
3. Various metadata in the `CrateRoot`, including the SVH

This worked ok most of the time. Unfortunately, when building locally the rustc version is always
the same because `omit-git-hash` is on by default. That meant that we depended only on 1 and 3, and
we are not very good about bumping METADATA_VERSION (it's currently at 7) so in practice we were
only depending on 3. `CrateRoot` is a very large struct and changes somewhat regularly, so this led
to a steady stream of crashes from trying to load it.

Change the logic to add an intermediate step between 2 and 3: introduce a new `CrateHeader` struct
that contains only the minimum info needed to decide whether the crate should be loaded or not. That
avoids having to load all of `CrateRoot`, which in practice means we should crash much less often.

Note that this works because the SVH should be different between any two dependencies, even if the
compiler has changed, because we use `-Zbinary-dep-depinfo` in bootstrap. See
rust-lang/rust#111329 (comment) for more details about how the
original crash happened.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-metadata Area: Crate metadata merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet