Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework rustc output file tracking. #8210

Merged
merged 5 commits into from
May 7, 2020
Merged

Conversation

ehuss
Copy link
Contributor

@ehuss ehuss commented May 5, 2020

This is some cleanup around how Cargo computes the outputs from rustc, and changes how cargo clean -p works. This also includes some minor observable differences detailed below.

clean -p changes

Previously cargo clean -p would build the unit graph and attempt to guess what all the filename hashes are. This has several drawbacks. It incorrectly guesses the hashes in many cases (such as different features). It also tends to cause bugs because it passes every permutation of every setting into Cargo's internals, which may not be prepared to accept unsupported combinations like "test a build-script".

This new implementation uses a strategy of querying rustc for the filename prefix/suffix, and then deletes the files using globs.

cargo clean -p should now be more complete in deleting a package's artifacts, for example:

  • Now deletes incremental files.
  • Deletes dep-info files (both rustc and cargo).
  • Handles changes in profiles, features, (anything in the hash).
  • Covers a few more files (for example, dSYM in some cases, etc.). Should delete almost all files for most targets.

Internal changes

There are a bunch of internal changes to make Cargo do a better job of tracking the outputs from rustc, and to make the code easier to understand:

  • The list of output files is now solely computed in TargetInfo. The files which are uplifted are solely computed in CompilationFiles::uplift_to. Previously both responsibilities were awkwardly spread in both locations.
  • Whether or not a file should have a hyphen or underscore is now determined in one place (FileType::should_replace_hyphens).
  • Added CrateType to replace LibKind, to avoid usage of strings, and to use the same structure for all of the target kinds.
  • Added FileFlavor::Rmeta to be more explicit as to which output files are ".rmeta". (Previously the Linkable{rmeta} flag was specific for pipelining, and rmeta was false for things like cargo check, which was a bit hard to deal with.)
  • Removed hyphen/underscore renaming in rustc. This was mostly unused, because it did not consider hashes in the filename, so it only applied to binaries without hashes, which is essentially just wasm32 and msvc. This hyphen/underscore translation still happens during "uplift".
  • Changed it so that Metadata is always computed for every unit. The logic of whether or not something should use it is moved to should_use_metadata. I didn't realize that multiple units were sharing the same fingerprint directory (when they don't have a hash), which caused some bugs (like bad output caching).

Behavioral changes

Cargo now does more complete tracking of the files generated by rustc (and the linker). This means that some files will now be uplifted that previously weren't. It also means they will show up in the artifact JSON notifications. The newly tracked files are:

  • .exp export files for Windows MSVC dynamic libraries. I don't know if these are actually useful (nobody has asked for them AFAIK). Presumably the linker creates them for some reason. Note that lld doesn't generate these files, this is only link.exe.
  • Proc-macros on Windows track import/export files.
  • All targets (like tests, etc.) that generate separate debug files (pdb/dSYM) are tracked.
  • Added .map files for wasm32-unknown-emscripten.

Some other subtle changes:

  • A binary example with a hyphen on Windows MSVC will now show up as examples/foo_bar.exe and examples/foo-bar.exe. Previously Cargo would just rename it to contain the hyphen. This is a consequence of simplifying the code, and I was reluctant to add a special case for this very narrow situation.
  • Example libs now follow the same rules for hyphen/underscore translation as normal libs (they will now use underscores).
  • Fingerprint changes:
    • Fingerprint files no longer have a hash in them. Their parent directory already contained the hash.
    • The fingerprint filename now uses hyphens instead of converting to underscores.
    • The fingerprint directory is now separated even if the unit doesn't use Metadata, to fix a caching bug.
  • macOS: dSYM is uplifted for all dynamic libraries (dylib/cdylib/proc-macro) and for build-script-build (in case someone wants to debug a build script?).

Notes

  • I suspect that the implementation of clean -p may change again in the future. If and when Cargo implements some kind of on-disk database that tracks artifacts (for the purpose of garbage collection), then cargo clean -p can be rewritten to use that mechanism if appropriate.
  • The build_script_build incremental directory isn't deleted because Cargo doesn't know which one belongs to which package. I'm uncertain if that's reasonably fixable. The only option I've thought of is to place each package's incremental output in a separate directory.
  • Should Cargo use globset to handle non-utf-8 filenames? I suspect that Cargo's support for non-utf-8 names is pretty poor, so I'm uncertain how important that is.

Closes #8149
Closes #6937
Closes #5788
Closes #5375
Closes #3530

@rust-highfive
Copy link

r? @Eh2406

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 5, 2020
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing on this @ehuss! I only sort of lightly skimmed the internal refactorings. It all sounds fine at a high-level and everything internally looked "correct enough" that it should either be covered by tests or we'll figure it out on nightly pretty soon anyway.

I'm left with really only one minor comment, but otherwise this I think is a great improvement to cargo clean -p, so r=me when ready

if matches.is_empty() {
anyhow::bail!("package ID specification `{}` matched no packages", spec);
}
packages.extend(pkg_set.get_many(matches)?);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW the get_many here starts a new download session each time, so maybe these could all be collected from the opts.spec.iter() loop and one get_many could be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

When a unit does not have Metadata, the computation of fingerprints
depends on reusing the same fingerprint file to detect if the
output needs to be rebuilt. The previous change that put each unit's
fingerprint into a separate directory was wrong, and broke change
detection in this case (for example, executables on MSVC).

Instead, use a different approach to deal with compiler output caching
by using the same naming convention as the fingerprint file itself.
bors added a commit that referenced this pull request May 6, 2020
[beta] Fix `clean -p` with a build dependency.

This is a temporary and simple fix for #8149 where `cargo clean -p foo` would fail if `foo` has a build dependency.  The full fix is in #8210, but I think that PR is way too risky to backport.
@ehuss
Copy link
Contributor Author

ehuss commented May 6, 2020

I pushed an update that reverted the "always compute Metadata" change. It doesn't work for targets like MSVC in the following scenario:

  1. cargo build --bin foo --features f1 <- generates deps/foo.exe
  2. cargo build --bin foo --features f2 <- Overwrites deps/foo.exe
  3. cargo build --bin foo --features f1 <- If there is a separate fingerprint, this is "fresh", and doesn't regenerate the exe when it should.

In the case where there is no metadata hash in the filename, the units must reuse the same fingerprint file.

I changed the fix for the compiler output cache by changing the output filename to use the same format as fingerprints, like output-bin-foo instead of just output.

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Collaborator

bors commented May 7, 2020

📌 Commit 7438770 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 7, 2020
@bors
Copy link
Collaborator

bors commented May 7, 2020

⌛ Testing commit 7438770 with merge c719a05...

@bors
Copy link
Collaborator

bors commented May 7, 2020

☀️ Test successful - checks-azure
Approved by: alexcrichton
Pushing c719a05 to master...

@bors bors merged commit c719a05 into rust-lang:master May 7, 2020
bors added a commit that referenced this pull request May 27, 2020
Fix fingerprinting for lld on Windows with dylib.

This fixes an issue where if `lld` is used on Windows, dynamic libraries will never be treated as "fresh". This is a regression from #8210 where Cargo is expecting export files to be created, but lld does not create these.

The solution is to ignore "Auxiliary" files in fingerprinting, which AFAIK aren't really needed (only the primary output files really matter).

Fixes #8284
@not-matthias not-matthias mentioned this pull request Jun 2, 2020
ehuss pushed a commit to ehuss/cargo that referenced this pull request Jun 5, 2020
Fix fingerprinting for lld on Windows with dylib.

This fixes an issue where if `lld` is used on Windows, dynamic libraries will never be treated as "fresh". This is a regression from rust-lang#8210 where Cargo is expecting export files to be created, but lld does not create these.

The solution is to ignore "Auxiliary" files in fingerprinting, which AFAIK aren't really needed (only the primary output files really matter).

Fixes rust-lang#8284
bors added a commit that referenced this pull request Jun 22, 2020
Fix overzealous `clean -p` for reserved names.

#8210 changed the way `clean -p` worked, but in some ways it is a little too sloppy.  If a package has a test named `build`, then it would delete the `build` directory thinking an executable named "build" exists.  This changes it so that it does not attempt to delete tests/benches from the uplift directory.
@ehuss ehuss added this to the 1.45.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
5 participants