Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fingerprint of dependency in workspace changes when running cargo build and cargo build -p <crate> #12345

Closed
Kevin-Reactor opened this issue Jul 10, 2023 · 14 comments
Labels
C-bug Category: bug S-needs-info Status: Needs more info, such as a reproduction or more background for a feature request.

Comments

@Kevin-Reactor
Copy link

Problem

Context: I have a project that is a workspace that contains multiple library crates that are outputting cdylib types.

Bug/Expectation: When I build in the root of the workspace, I would expect running cargo build -p <crate> to be a noop as the workspace cargo build should have built all of it. If running cargo build -p <crate> outputs to a different cache, then I would expect that the cargo build and cargo build -p to not affect each other at all. Instead, whenever I build cargo build, it invalidates the individual cargo build of the crate's dependencies and forces a rebuild...

Steps

https://github.com/Kevin-Reactor/Fingerprint-Repro

Here is a public repo of a minimal reproduction of the issue.

Step 1: Download the repo
Step 2: run cargo build in the root directory
Step 3: run cargo build -p test_2
Step 4: Alternate between running cargo build and CARGO_LOG=cargo::core::compiler::fingerprint=info cargo build -p test_2 and you can see it complain that the fingerprint for serde_yaml has changed

Possible Solution(s)

No response

Notes

I have tested a few configurations, this only seems to happen when the libraries output cdylib, and only happens when there are two crates (i.e. a workspace with only one crate doesn't have this issue)

It also seems to happen with specific crates. quiche and serde_yaml specifically but I have no idea why.

I know that there's an issue with the features of the crates potentially being different between building in root and building with -p, but it makes no sense that this configuration doesn't require rebuilds with lib instead of cdylib and that they don't have any shared features...

Version

cargo 1.70.0 (ec8a8a0ca 2023-04-25)
release: 1.70.0
commit-hash: ec8a8a0cabb0e0cadef58902470f6c7ee7868bdc
commit-date: 2023-04-25
host: x86_64-unknown-linux-gnu
libgit2: 1.6.3 (sys:0.17.0 vendored)
libcurl: 8.0.1-DEV (sys:0.4.61+curl-8.0.1 vendored ssl:OpenSSL/1.1.1t)
os: Ubuntu 22.04 (jammy) [64-bit]
@Kevin-Reactor Kevin-Reactor added C-bug Category: bug S-triage Status: This issue is waiting on initial triage. labels Jul 10, 2023
@epage
Copy link
Contributor

epage commented Jul 11, 2023

Can you check whether this is being affected by cargo activating different sets of features based on what packages are selected? See #5210 as an example of the different issues we have on the topic.

@weihanglo
Copy link
Member

In your case

  • test_1 -> quiche -> smallvec -> serde without default features
  • test_2 -> serde with derive

That led to a feature unification when running cargo build for the entire workspace. You can use cargo tree -e features -p <crate> to verify it.

This is at this moment a expected behavior. And feel like this may be a duplicate of #4463. Going to close this. If I am closing it wrong, please let us know.

@weihanglo weihanglo closed this as not planned Won't fix, can't repro, duplicate, stale Jul 11, 2023
@Kevin-Reactor
Copy link
Author

Kevin-Reactor commented Jul 11, 2023

Hi, appreciate the quick response! Just for my own understanding, why does the feature unification not happen when the libraries output lib instead of cdylib? It makes no sense that the output type of the library would affect the feature unification? and why does the fingerprint show that serde_yaml is the one that changed not serde itself?

@Kevin-Reactor
Copy link
Author

Kevin-Reactor commented Jul 11, 2023

I am also a bit confused as it seems like in the thread linked, cargo build --all should build each crate as if it was workspace aware but i dont see that fix the issue i am seeing. What's the recommended way to build the crate such that its built the same way as the workspace intended it? the thread linked doesn't seem to have a conclusion (unless i misread it haha)

@weihanglo
Copy link
Member

why does the feature unification not happen when the libraries output lib instead of cdylib? It makes no sense that the output type of the library would affect the feature unification?

Just verified it again. I believe it happens with and without cdylib specified.

The only difference is that, when your final artifact is cdylib you need to rebuild the final test_2 cdylib when switching back and forth between carg build and cargo build -p test_2, as cdylib doesn't contain a hash on the final artifact name.

why does the fingerprint show that serde_yaml is the one that changed not serde itself?

Not only serde_yaml but the entire subtree of the dependency tree down from that. If you use cargo build --verbose you will see a Dirty status like this

Dirty test_2 v0.1.0 (/projects/Fingerprint-Repro/test_2): dependency info changed

@weihanglo
Copy link
Member

I'd say the current fingerpint log is not really a thing a normal user ought to understand. There are rooms making the recompile reason more friendly 😆.

What's the recommended way to build the crate such that its built the same way as the workspace intended it?

I am not aware of any, and people usually want the opposite behaivor — not unify at all. I would suggest cargo-hack or something at this time to manipulate feature like testing feature combinations for member crates.

@Tomcc
Copy link

Tomcc commented Jul 11, 2023

Hi, @weihanglo , thanks for the quick reply! I work with @Kevin-Reactor and I wanted to chime in as well.

So, I understand that the feature unification leads to different sets of features depending on which leaf package is being built. I agree that is definitely working as intended.

To me, the problem here is that cargo's dependency cache seems to be (from an user's perspective) almost always append only except than in obscure cases such as this one.
Eg. adding and updating dependencies will create new built artifacts without affecting the old ones, so eg. checking out different versions of a cargo.toml with different versions of serde will not cause dependency rebuilds once a specific commit is built. Both of serde's built versions can exist next to each other.

In this case instead, serdes build output is replaced by the second invocation, which causes the rebuilds.
Conceptually, this is does seem like a bug to me: if cargo was able to include the feature set in the build artifact name, then "serde 1.0.100 with derive" and "serve 1.0.100 without derive" would be able to be built and cached individually without causing rebuilds.
I'm sure there's a lot of technical issues with getting there, but I do think it would make sense? Is there work being done on something like this?
Either way, I don't entirely agree with closing the issue due to this - from my perspective it does look like a bug that the build cache can only store things by version and not by version+features.

Plus, this behavior does slow down our iteration speed by quite a lot (we have a fairly big workspace in a monorepo) so whether it's technically correct or not, it would just be nice if it didn't work this way.

tl;dr: me and Kevin both though that cargo's dependency cache was immutable and append only but it's not when it comes to features, which is very confusing if you aren't familiar with cargo's resolver internals.

@weihanglo
Copy link
Member

Sure. I can reopen this if we want to figure out what's going on. I had a hard time reproducing what you've talked about — serde continues recompiling. To me, this went like this:

  1. Run cargo build. It compiles all stuff including test_1 and test_2
  2. Run cargo build -p test_2. Feature set has changed. It recompiles syn, serde_derive, serde, serde_yaml, and test_2 itself.
  3. Run cargo build again. Only test_2 recompiles.
  4. Run cargo build -p test_2 again. Only test_2 recompiles.

Is there something I missed or reproduced it wrong?

@weihanglo weihanglo reopened this Jul 11, 2023
@weihanglo weihanglo added S-needs-info Status: Needs more info, such as a reproduction or more background for a feature request. and removed S-triage Status: This issue is waiting on initial triage. labels Jul 11, 2023
@weihanglo
Copy link
Member

"serde 1.0.100 with derive" and "serve 1.0.100 without derive" would be able to be built and cached individually without causing rebuilds.

They are supposed to be cached separately with different hash suffixes. See https://doc.rust-lang.org/nightly/nightly-rustc/cargo/core/compiler/fingerprint/index.html#fingerprints-and-metadata.

If you run ls target/debug/deps/serde-*.d target/debug/deps/libserde-*.rlib, you should find two different copies of .d and .rlib file. If you want to check the on-disk fingerprint files, run ls target/debug/.fingerprint/serde-* and you shall find four directories. Each serde fingerprint contains one for serde itself and the other for its build script.

If you can't find something mentioned above, that is probably a real bug.

@Kevin-Reactor
Copy link
Author

if it is being separated on the cache, why is anything even being rebuilt? here is how tommaso and I would expect it to go:

  1. Run cargo build. It compiles all stuff including test_1 and test_2
  2. Run cargo build -p test_2. Feature set has changed. It recompiles syn, serde_derive, serde, serde_yaml, and test_2 itself.
  3. Run cargo build again. NO recompile because it should use the cache from step 1
  4. Run cargo build -p test_2 again. NO recompile because it uses the cache from step 2

we do correctly see the correct amount of .d and .rlib and the four serde directories, but if that was working as intended, why are there any recompiles at all?

@weihanglo
Copy link
Member

I see. Thanks for the prompt response!

So we are on the same page that there is no serde being rebuilt. Only the final artifact test_2 is rebuilt.

From #12345 (comment):

The only difference is that, when your final artifact is cdylib you need to rebuild the final test_2 cdylib when switching back and forth between carg build and cargo build -p test_2, as cdylib doesn't contain a hash on the final artifact name.

I guess I've already talked about it but in a less obvious way. That's my fault. It basically means that you have the only target/debug/deps/libtest_2.dylib without a suffix. Thus when switching between different feature sets, Cargo always produces a new libtest_2.dylib for each cargo build, overwriting the old one, since the file doesn't have a suffix to tell which version it is.

Sorry for my bad communication skill. cdylib is indeed the root cause of that as I stated earlier. See this piece of comments, and this duplicate artifact tracking issue #6313 as well.

@weihanglo
Copy link
Member

I don't really know about the workflow you have. If you are going to copy dylib to somewhere outside Cargo project, this tip might help if applicable: remove crate-type = ["cdylib"] from Cargo.toml. Use cargo rustc --crate-type cdylib when you really need a cdylib.

@Kevin-Reactor
Copy link
Author

ok, i guess that makes sense considering the cdylib bug. Is there work being done to output the dylib to a different directory? I assume that its not quite that simple though haha

Either way, thank you so much for your help, it was very much appreciated. We'll end up probably emitting the cdylib only when its needed, and remove it from the Cargo.toml for now.

@weihanglo
Copy link
Member

Is there work being done to output the dylib to a different directory?

There is an --out-dir unstable option you can use, but I doubt it will fix the build cache issue.

Personally I feel like the solution in the code comment might be doable but not easy:

Maybe use -install-name on macOS or -soname on other UNIX systems to specify the dylib name to be used by the linker instead of the filename.

See #5045 for more on -soname stuff.

I'ved linked to this issue from the meta tracking issue of duplicate artifact. Going to close this in favor of that. Thank you all for the discussion!

@weihanglo weihanglo closed this as not planned Won't fix, can't repro, duplicate, stale Jul 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug S-needs-info Status: Needs more info, such as a reproduction or more background for a feature request.
Projects
None yet
Development

No branches or pull requests

4 participants