Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new feature resolver. #7820

Merged
merged 22 commits into from Feb 20, 2020
Merged

Add new feature resolver. #7820

merged 22 commits into from Feb 20, 2020

Conversation

@ehuss
Copy link
Contributor

ehuss commented Jan 22, 2020

This adds a new resolver which handles feature unification independently of the main resolver. This can be enabled with the -Zfeatures flag which takes a comma-separated list of options to enable new behaviors. See unstable.md docs for details.

There are two significant behavior changes:

  1. Ignore targets that are not enabled.
  2. Do not unify features between build_deps, dev_deps, and normal deps.

The "forks" in the unit graph are handled by adding DepKind to UnitFor. The feature resolver tracks features independently for the different dependency kinds.

Unfortunately this currently does not support decoupling proc_macro dependencies. This is because at resolve time it does not know which dependencies are proc_macros. Moving feature resolution to after the packages are downloaded would require massive changes, and would make the unit computation much more complex. Nobody to my knowledge has requested this capability, presumably because proc_macros are relatively new, and they tend not to have very many dependencies, and those dependencies tend to be proc-macro specific (like syn/quote). I'd like to lean towards adding proc-macro to the index so that it can be known during resolve time, which would be much easier to implement, but with the downside of needing to add a new field to the index.

I did not update cargo metadata, yet. It's not really clear how it should behave. I think I'll need to investigate how people are currently using the feature information and figure out how it should work. Perhaps adding features to "dep_kinds" will be the solution, but I'm not sure.

The goal is to try to gather feedback about how well this new resolver works. There are two important things to check: whether it breaks a project, and how much does it increases compile time (since packages can be built multiple times with different features). I'd like to stabilize it one piece at a time assuming the disruption is not too great. If a project breaks or builds slower, the user can implement a backwards-compatible workaround of sprinkling additional features into Cargo.toml dependencies. I think itarget is a good candidate to try to stabilize first, since it is less likely to break things or change how things are built. If it does cause too much disruption, then I think we'll need to consider making it optional, enabled somehow.

There is an environment variable that can be set which forces Cargo to use the new feature resolver. This can be used in Cargo's own testsuite to explore which tests behave differently with the different features set.

@rust-highfive

This comment has been minimized.

Copy link

rust-highfive commented Jan 22, 2020

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 22, 2020

☔️ The latest upstream changes (presumably #7731) made this pull request unmergeable. Please resolve the merge conflicts.

@ehuss ehuss force-pushed the ehuss:features2-split branch from 3eeda09 to 68304c2 Jan 23, 2020
Copy link
Member

alexcrichton left a comment

Ok I've left a bunch of initial comments below, although I didn't get a chance to finish today unfortunately. Wanted to give you some time to respond though since I may be busy for a day or so. The major thing I didn't get finished reviewing is the internals of the new implementation of feature resolution, which I suspect would lift some confusion I have about DepKind, so feel free to defer to those internals if you'd like.

src/cargo/core/compiler/unit_dependencies.rs Outdated Show resolved Hide resolved
src/cargo/core/compiler/unit_dependencies.rs Outdated Show resolved Hide resolved
src/cargo/core/compiler/unit_dependencies.rs Outdated Show resolved Hide resolved
src/cargo/core/profiles.rs Outdated Show resolved Hide resolved
src/cargo/core/resolver/features.rs Outdated Show resolved Hide resolved
src/cargo/ops/cargo_compile.rs Show resolved Hide resolved
src/cargo/ops/cargo_clean.rs Outdated Show resolved Hide resolved
src/cargo/core/resolver/features.rs Outdated Show resolved Hide resolved
src/cargo/core/resolver/features.rs Outdated Show resolved Hide resolved
src/cargo/core/resolver/features.rs Outdated Show resolved Hide resolved
@Eh2406 Eh2406 mentioned this pull request Jan 24, 2020
0 of 4 tasks complete
@ehuss

This comment has been minimized.

Copy link
Contributor Author

ehuss commented Jan 24, 2020

While looking through some of your questions, I found a bug with cargo test and dev-dependency handling. It might require some non-trivial changes to fix. The problem is that in some cases the only difference between two lib.rs units is what it links against, and the current unit-deduplication logic can't handle that.

I'll try to fix that, and then respond to your questions (and try to add some comments to clarify confusing sections).

@ehuss

This comment has been minimized.

Copy link
Contributor Author

ehuss commented Jan 24, 2020

So I think there may need to be significant changes, and I wanted to confirm that my assumptions make sense. I'll illustrate with an example project with these dependencies:

[dependencies]
dep = {version="1.0", features=["f1"]}

[dev-dependencies]
dep = {version="1.0", features=["f2"]}

With:

  • src/lib.rs
  • src/main.rs
  • tests/test.rs
  • examples/example.rs

cargo build will create a unit graph like the following:

unit-graph-build

(Hopefully that's pretty straightforward.)

The tricky part comes with cargo test. I would assume with decoupled dev-dependencies, we would want a unit graph like the following:

unit-graph-test

This causes lib.rs to be built 3 times: once as a test, once as a library linked with dev-dependencies (for other tests), and once as a library for examples and binaries (without dev-dependencies). This is done with the assumption that you want main.rs to be built the same as it does with cargo build.

If you agree with that, then the tricky bit comes with how to build that graph. The only difference for the lib.rs unit is what it links against. However, the Unit structure as it exists doesn't differentiate that. I'm thinking of adding DepKind to Unit to handle this properly. I'll try to avoid the extra build of lib.rs if you don't have any dev-dependencies, which will make it a little tricky (same with shared build-dependencies where there are no differences). I'm also wondering if I can change (or remove) UnitFor, which is essentially doing the same thing (the reason UnitFor exists is to encode these differences while still handling de-duplication). I'll try to put a fix together that does this, and see where it goes.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jan 24, 2020

So that makes sense to me if we take into account DepKind. I'm not entirely convinced though (for now at least) that we should be doing that. Our current feature resolution is to unify based entirely on just the structure of the crate graph, but I'm not sure how much further we need to go other than unifying only based on target. For example I'm not sure if it's a bad thing where, in your example, cargo build works the same but cargo test only builds dep with both features enabled.

@ehuss

This comment has been minimized.

Copy link
Contributor Author

ehuss commented Jan 27, 2020

Yea, I'm not really clear on what the best behavior here is. How about this: have some sort of internal boolean (CompileFilter::need_dev_deps) that indicates whether or not dev-dependency features should be included. It would be enabled if any test/bench/example is being built. That means that cargo test would behave the same as it does today. It would also mean something like cargo build --all-targets would also unify (the same as it does today). But if you do a plain cargo build, it will build without unifying dev-dependencies.

I'm not sure if that might be too confusing or surprising. Perhaps with documentation it can be made clear?

I'm inclined to go that direction because it will be much simpler to implement. If we decide that dependencies should be "forked" and built multiple times with different features, we can add that in the future.

How does that sound?

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jan 28, 2020

That sounds relatively reasonable to me yeah, the difference in --all-targets behavior is a little unfortunate but I think that basically mirrors what happens today (albeit probably in a slightly different place)

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 31, 2020

☔️ The latest upstream changes (presumably #7851) made this pull request unmergeable. Please resolve the merge conflicts.

@ehuss

This comment has been minimized.

Copy link
Contributor Author

ehuss commented Feb 2, 2020

I did a fairly major rewrite to try to simplify things. Now that dev-dependency unification is a global setting, it all really boils down to whether or not something is a build dependency, so I just made that a bool. It's not as flexible, but I can always switch it back to something more complex if needed.

I also add -Zfeatures=all to try all options at once.

BTW, I'd like to squash this before merging, so don't merge as-is. This also now includes #7857 to fix some issues, but that PR should probably go first to simplify this.

@shamatar

This comment has been minimized.

Copy link

shamatar commented Feb 3, 2020

Does this PR also address #2589 and #2524 ? If not I'd try to look deeper into the 2589 myself as it would be quite convenient to have something like

[target.'cfg(not(target_os = "macos"))'.dependencies.a]
version = "0.5"

[target.'cfg(target_os = "macos")'.dependencies.a]
git = "https://github.com/foo/a_patched.git"

This example is actually much simpler than what you try to solve cause dependency graph just need a replacement of the node (actually just a path/version/git) based on target, not a feature, but may be it's also fixed as a by-product?

@ehuss

This comment has been minimized.

Copy link
Contributor Author

ehuss commented Feb 3, 2020

@shamatar yes, it should close those issues (and many others). 2589 relates to features, not multiple sources. Multiple sources is #7753, and I believe that will be very difficult to fix.

@shamatar

This comment has been minimized.

Copy link

shamatar commented Feb 3, 2020

@ehuss

Ok, thanks for an answer. Another short question: if paths are not easy to fix, may be it would be able to introduce a syntax like

[target.'cfg(not(target_os = "macos"))'.features]
default = []

[target.'cfg(target_os = "macos")'.features]
default = ["macos_patch"]

or something similar. If it ever makes sense. At least it's a short-term fix for different paths problem.

@ehuss

This comment has been minimized.

Copy link
Contributor Author

ehuss commented Feb 3, 2020

That is discussed in #1197.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 5, 2020

☔️ The latest upstream changes (presumably #7855) made this pull request unmergeable. Please resolve the merge conflicts.

@ehuss ehuss force-pushed the ehuss:features2-split branch from f07d254 to ae57a25 Feb 5, 2020
bors added a commit that referenced this pull request Feb 7, 2020
Fix BuildScriptOutput when a build script is run multiple times.

When I implemented profile build overrides, I created a scenario where a build script could run more than once for different scenarios.  See the test for an example.

However, the `BuildScriptOutputs` map did not take this into consideration.  This caused multiple build script runs to stomp on the output of previous runs.  This is further exacerbated by the new feature resolver in #7820 where build scripts can run with different features enabled.

The solution is to make the map key unique for the Unit it is running for.  Since this map must be updated on different threads, `Unit` cannot be used as a key, so I chose just using the metadata hash which is hopefully unique.  Most of this patch is involved with the fussiness of getting the correct metadata hash (we want the RunCustomBuild unit's hash).  I also added some checks to avoid collisions and assert assumptions.
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 7, 2020

Ok this is taking me longer to get to reviewing than I thought it might. I'll be gone most of next week as well. From my previous review and given the changes you've described implementing, I'd be fine landing this in the meantime and I can comment on this when I get a chance to. Or if you're ok waiting a week or so that's ok too!

@matklad

This comment has been minimized.

Copy link
Member

matklad commented Feb 12, 2020

I did not update cargo metadata, yet. It's not really clear how it should behave. I think I'll need to investigate how people are currently using the feature information and figure out how it should work.

I guess I can say a thing or two about IntelliJ and rust-analyzer, who are perhaps the two biggest users of cargo metadata at the moment. In short, the current Resolve in cargo metadata doesn't really make sense; it works good enough, but it is on the wrong level of abstraction.

Specifically, Resolve is on the level of packages, but what rust-analyzer & IntelliJ want is more close to units. For example, we would like to see "crate ./test/foo.rs depends on crate ./src/lib.rs with these cfg flags set".

I think we can make two steps to make sure that things "make sense".

  1. Recast existing Resolve as an exact reflection of Cargo.lock. Ie, make it fully independent of the set of feature flags and the target, such that it is basically the flat list of packages from a lockfile + deps which explain why those packages are in lockfile. I am not sure if this would be actually useful, but at least that would be a clear defined thing which makes sense :-)

  2. Add a separate thing (perhaps a new command even?) which outputs a format close to the UnitMap, which speaks not about Cargo packages, but rather crates in rust-reference sense, and dependencies between them. This would depend on the precise set of flags passed to Cargo (we'll get different outptus for cargo build vs cargo test) and would actually resemble a sort-of static build plan.

@ehuss

This comment has been minimized.

Copy link
Contributor Author

ehuss commented Feb 12, 2020

@matklad I've been thinking the exact same thing. I've also been wondering if we could just extend cargo metadata or need some other command or flag (like how --build-plan works).

I'd really like a tool to visualize the unit graph, but that is difficult for any non-trivial project.

@sunshowers

This comment has been minimized.

Copy link
Contributor

sunshowers commented Feb 12, 2020

FYI cargo-guppy relies on metadata as well: https://github.com/calibra/cargo-guppy

@ehuss ehuss force-pushed the ehuss:features2-split branch from ae57a25 to 57867e7 Feb 17, 2020
Copy link
Member

alexcrichton left a comment

Ok apologies again for the delay in getting to review this.

This time around though I was able to process it a whole lot faster! The current logic makes sense to me, especially how dev-dependencies and their features are handled. I think what's implemented probably won't last until the end of time but it's certainly a helluva lot better than what we have now and is worth getting stabilized (IMO)

src/cargo/core/compiler/standard_lib.rs Outdated Show resolved Hide resolved
src/cargo/core/resolver/features.rs Outdated Show resolved Hide resolved
src/cargo/core/resolver/features.rs Outdated Show resolved Hide resolved
src/cargo/core/resolver/features.rs Outdated Show resolved Hide resolved
for fv in fvs {
self.activate_fv(pkg_id, fv, for_build)?;
}
if !self.processed_deps.insert((pkg_id, for_build)) {

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 19, 2020

Member

This guard seems a bit suspicious to me. Just because we visited a package before here doesn't mean we didn't later add features which themselves add dependencies, right?

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 19, 2020

Member

I'm thinking of something like cargo build -p deep-dependency -p root-dependency where it activates deep-dependency quickly but by activating root-dependency we transitively enable more features of deep-dependency

This comment has been minimized.

Copy link
@ehuss

ehuss Feb 20, 2020

Author Contributor

I think it is fine. After resolving deep-dependency once, other packages that add features to that deep-dependency will do so via FeatureValue::CrateFeature in activate_fv. This in turn checks if it is activating an optional dependency, and will recurse into that if necessary.

This is kinda fundamental to the way this is organized. If you notice, a few lines below it always skips optional dependencies. Whenever a dependency is encountered that enables an optional feature, it will enable it and recurse right away.

I believe this is structured somewhat differently from how the dependency resolver works. Presumably because it is figuring out the features as it goes along, and doesn't have the full graph built, yet.

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 20, 2020

Member

Ah ok the skipping of optional deps below actually does make this sense yeah, sorry I missed that! It still feels a bit funky to me but because it's doing the skip I'm fine deferring to tests to verify the correctness of this. Mind adding a comment here clarifying what sort of recursion this is guarding against, and how more recursion is actually allowed via feature activation below in a limited way?

/// features of the non-host package (whereas `host` is true because the
/// build script is being built for the host). `build_dep` becomes `true`
/// for build-dependencies, or any of their dependencies.
build_dep: bool,

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 19, 2020

Member

I'm thinking about this for a bit now. I understand (I think?) all the words in this comment, but I'm having trouble piecing together a situation about where this becomes relevant. Can you give a small example about where this subtle distinction is necessary?

Additionally, since host=false, build_dep=true probably doesn't make much sense, would it make more sense for this to be some sort of tri-value thing like:

enum Kind {
    Target,
    Host { build_dep: bool },
}

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 19, 2020

Member

(please don't use Kind as I'm sure you're already thinking when reading that)

This comment has been minimized.

Copy link
@ehuss

ehuss Feb 20, 2020

Author Contributor

I added an example.

I'm reluctant to add an enum to represent the 3 states. All the code is oriented around simple booleans, and I think it would make it harder to follow if they were unified into one field. I usually agree that it is good to prevent impossible scenarios, but in this case I think it might be harder to understand.

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 20, 2020

Member

Ok the example looks good to me, thanks! I'm also fine deferring to your thoughts about booleans!

One final question here, though. Technically we have two sets of fetures to deal with, one when the build script is compiled (the --feature flags) and one when it's executed (CARGO_FEATURE_*). Given that is_custom_build is conflating both executing and compiling the build script, I think we may want to be a little more nuanced/precise? I would expect that the compilation of the build script would use the host: true and build_dep: false set of features, but the execution of the build script would use host: false and build_dep: false as well, right? (ok maybe this doesn't have to do with build_dep?)

Also to clarify, in the example you listed here, shared_dep build.rs is only compiled once regardless of feature settings, right? And then it's executed twice maybe depending on feature settings/targets?

This comment has been minimized.

Copy link
@ehuss

ehuss Feb 20, 2020

Author Contributor

Hm, so it is working correctly, but you do point out that host isn't accurate for the RunCustomBuild unit. host only matters for the profile, and the RunCustomBuild unit goes through a separate code path for computing the profile. That is here where it manually fetches the profile (ignoring unit_for). All other units go through this path where it inspects unit_for to determine the profile. However, it is necessary for host to be true for RunCustomBuild so that all dependencies below it are marked as "host:true". Added some comments explaining this.

shared_dep build.rs is only compiled once regardless of feature settings, right? And then it's executed twice maybe depending on feature settings/targets?

It depends. With -Zfeatures=build_dep, and the features listed are different between the normal and build dep, then it will get built twice. The reason for this is that many scripts out there use #[cfg] and cfg! to detect features, instead of inspecting "CARGO_FEATURE_*". Only building it once would break a large number of build scripts. I don't think it would be feasible to avoid that.

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 20, 2020

Member

Ah ok that makes sense, I know I've "used and abused" that CARGO_FEATURE_* is the "same" as #[cfg] in the past myself, so having a special case for this makes sense.

///
/// This is part of the machinery responsible for handling feature
/// decoupling for build dependencies in the new feature resolver.
pub fn with_build_dep(mut self, build_dep: bool) -> UnitFor {

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 19, 2020

Member

As a minor nuance (and this was already present, just something I'm realizing now) the terminology of with_* doesn't quite fit here in my mind since I'd expect a function of that name to unconditionally configure build_dep with the argument provided, but here it's more of a unioning function. I'm not sure of a better name though!

This comment has been minimized.

Copy link
@ehuss

ehuss Feb 20, 2020

Author Contributor

Yea, I struggled a bit with the naming convention and wasn't happy with it. But I'm unable to think of anything particularly better. If you think of something, let me know!

I don't know if there is a common naming convention for this idiom (creating a copy and modifying a field in one step)? Maybe that's a sign it may be better to be explicit about creating a copy, and then call set_build_dep?

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Feb 20, 2020

Member

Eh it's a minor thing, I wouldn't stress out much over it. I definitely know of no naming convention for this, so it may be good to just highlight it in the documentation and move on. (the current call-sites all "look right" which is what matters)

@ehuss ehuss force-pushed the ehuss:features2-split branch from ff68a7c to 349d664 Feb 20, 2020
@ehuss ehuss force-pushed the ehuss:features2-split branch from 349d664 to 4d0fda7 Feb 20, 2020
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 20, 2020

Alright I'm definitely happy with everything here, so I'm going to r+

I think it's safe to say this has languished long enough (sorry!) and we can continue to iterate in-tree. Thanks again so much for pushing on this @ehuss!

@bors: r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 20, 2020

📌 Commit 4d0fda7 has been approved by alexcrichton

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 20, 2020

⌛️ Testing commit 4d0fda7 with merge d6fa260...

//!
//! This is a new feature resolver that runs independently of the main
//! dependency resolver. It is intended to make it easier to experiment with
//! new behaviors. When `-Zfeatures` is not used, it will fall back to using

This comment has been minimized.

Copy link
@Eh2406

Eh2406 Feb 20, 2020

Contributor

This would be a good place to explain that it is not going to replace the dependency resolver but that it acts as a second pass. By not considering changing which versions are selected it avoids solving the NP-Hard problem faced by the dependency resolver. And it is focused on what should be built for this system not all possible systems that the dependency resolver considers. It therefore has complexity budget to consider ... (the list of really cool things your system achieves.)

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 20, 2020

☀️ Test successful - checks-azure
Approved by: alexcrichton
Pushing d6fa260 to master...

@bors bors merged commit d6fa260 into rust-lang:master Feb 20, 2020
11 checks passed
11 checks passed
homu Test successful
Details
rust-lang.cargo Build #20200220.8 succeeded
Details
rust-lang.cargo (Linux beta) Linux beta succeeded
Details
rust-lang.cargo (Linux nightly) Linux nightly succeeded
Details
rust-lang.cargo (Linux stable) Linux stable succeeded
Details
rust-lang.cargo (Windows x86_64-msvc) Windows x86_64-msvc succeeded
Details
rust-lang.cargo (build_std) build_std succeeded
Details
rust-lang.cargo (docs) docs succeeded
Details
rust-lang.cargo (macOS) macOS succeeded
Details
rust-lang.cargo (resolver) resolver succeeded
Details
rust-lang.cargo (rustfmt) rustfmt succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

8 participants
You can’t perform that action at this time.