Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Improve Cargo target-specific dependencies #1361

Merged
merged 3 commits into from Jan 29, 2016

Conversation

Projects
None yet
@alexcrichton
Copy link
Member

alexcrichton commented Nov 10, 2015

Improve the target-specific dependency experience in Cargo by leveraging the
same #[cfg] syntax that Rust has.

Rendered

RFC: Improve Cargo target-specific dependencies
Improve the target-specific dependency experience in Cargo by leveraging the
same `#[cfg]` syntax that Rust has.

[Rendered](https://github.com/alexcrichton/rfcs/blob/cargo-cfg-dependencies/text/0000-cargo-cfg-dependencies.md)
@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Nov 10, 2015

Tagging as T-compiler for the new flag and T-tools for cargo

  • (new rustc flag is --print cfg)
information Cargo will query the compiler via a new command line flag:

```
$ rustc --print cfg

This comment has been minimized.

@sfackler

sfackler Nov 10, 2015

Member

This can be done without any rustc changes by having Cargo and compile and run a little shim program e.g. fn main() { println!("{}", cfg!(target_os = "macos")) }

This comment has been minimized.

@alexcrichton

alexcrichton Nov 10, 2015

Author Member

Oh wow that's an excellent point! I hadn't even considered that part. This would have the nice benefits of:

  • Doesn't require changes to the compiler
  • Allows newer Cargo releases to work with older compilers

This does, however, have a drawback that Cargo has to have a whitelist of all possible #[cfg] annotations that can possibly be defined. This means that if the compiler adds new ones over time (e.g. all the simd-related ones) or simply adds new values (e.g. a new OS or a new target_pointer_width, god forbid), it wouldn't automatically be picked up by Cargo.

In terms of forward compatibility I think I'd prefer to stick with --print cfg to be sure that Cargo stays as close to the compiler for as long as possible (although it is likely to inevitably diverge). What do you think though?

This comment has been minimized.

@jethrogb

jethrogb Nov 10, 2015

Contributor

Why does there need to be a whitelist? Can't Cargo just try if it works?

This comment has been minimized.

@alexcrichton

alexcrichton Nov 10, 2015

Author Member

Ah of course, I shall clarify! So in theory given a set of packages you've got a list of #[cfg] annotations to match against. We can generate a program, figure out answers to all these, but the process is then iterative. Each time a dependency is resolved it may bring in more dependencies, each of which may have #[cfg] annotations. Although Cargo can build a cache of matched cfg directives, this cache will be iteratively built and and is very expensive to create as each iteration involves compiling a program and then running it.

Basically, we don't understand the full set of cfg annotations we need to match against at the start of dependency resolution, so we'd have to invoke the compiler multiple times. In the limit we have to invoke the compiler many times! Note that this also happens on every single invocation of cargo build.

To ensure that we don't regress too much in speed I wanted to opt for a solution where Cargo can quickly learn about the entire set of cfg and then do all the processing in-memory (much faster than spawning a process).

Does that make sense in terms of the constraints here? I can also be sure to add some of this to the RFC as well!

This comment has been minimized.

@jethrogb

jethrogb Nov 10, 2015

Contributor

An alternative would be to implement a cfg_str! macro or such that returns the value of such configuration options as target_vendor and others defined in rustc_back::target. This would alleviate the need to check each value seperately, and would also not require the addition of a command-line option to rustc.

This comment has been minimized.

@wthrowe

wthrowe Nov 10, 2015

The list of needed cfg key-value pairs could be cached between runs so the iterative generation would only be needed on a clean build or when something in the dependency graph changes what values it requests. In those cases you're almost certainly (re)compiling crates anyway, so a bit of extra time in dependency resolution shouldn't matter.

This comment has been minimized.

@alexcrichton

alexcrichton Nov 11, 2015

Author Member

@jethrogb

That's actually being proposed in #1258 as well, and although it doesn't suffer the forward compatibility problem quite as much it still does to some degree because Cargo may not know the keys to query for.

@wthrowe

That's a good point! I guess I'd have to time both strategies and see what came out on top, but I think my preference would still be to push on --print cfg if possible, but this sounds to me like at least a reasonable fallback!

This comment has been minimized.

@alexcrichton

alexcrichton Nov 11, 2015

Author Member

Ah actually @arcnmx pointed out below that this won't work for cross compiles as you can't tell the compiler "compile as if you were this target but actually generate a binary for the host" which unfortunately I think kills the idea of generating a binary with println! + cfg! and then running it.

This comment has been minimized.

@sfackler

sfackler Dec 16, 2015

Member

We could in theory have cargo try to parse the output of -Z pretty-expanded output, but that seems like a nightmare in comparison to just adding a new rustc flag :)

@briansmith

This comment has been minimized.

Copy link

briansmith commented Nov 10, 2015

This looks great. I would rather not have it bundled with a bunch of other unrelated changes just because they all happen to not be 100% backward compatible.

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Nov 10, 2015

Looks good to me.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Nov 10, 2015

@briansmith

Ah yeah I should clarify that the other changes I have in mind are all backwards compatible, just not necessarily forwards compatible. An example of this is that I have a change in mind to add more entries to Cargo.lock. This means that when you build with an older Cargo it will remove these entries, but when building with a newer Cargo it will re-add these entries. Not a huge problem, but something at least to consider!

Regardless though I do personally agree that this should happen independent of other changes. Always good to have a nice pipeline of features on the way!

`--target` (Cargo's current behavior).

> **Note**: There's an [issue open against TOML][toml-issue] to support
> single-quoted keys allowing more ergonomic syntax in some cases like:

This comment has been minimized.

@jethrogb

jethrogb Nov 10, 2015

Contributor

I have no feeling for how TOML would respond to changes like this. Would they implement such a change? If so, when will they implement such a change? The backslash syntax is pretty ugly for us in the mean time. How long would we have to live with it? An alternative is to support the inverted syntax from Rust: [target."cfg(target_os = 'macos')".dependencies] although that might be confusing.

This comment has been minimized.

@alexcrichton

alexcrichton Nov 10, 2015

Author Member

That's a good question! I'll cc @BurntSushi, our resident TOML expert.

In terms of implementation, as soon as this is merged into the TOML spec I can update toml-rs to implement the parsing and merge it into Cargo ASAP. If that happened before this RFC and/or implementation landed then it'd all happen at the same time :)

This comment has been minimized.

@sfackler

sfackler Dec 16, 2015

Member

Any update on this, by the way?

This comment has been minimized.

@BurntSushi

BurntSushi Jan 19, 2016

Member

Sorry, I completely missed this ping. AFAIK, keys are either bare or quoted. If they're quoted, then they follow the same rules as strings themselves:

Keys may be either bare or quoted. Bare keys may only contain letters, numbers, underscores, and dashes (A-Za-z0-9_-). Note that bare keys are allowed to be composed of only digits, e.g. 1234. Quoted keys follow the exact same rules as basic strings and allow you to use a much broader set of key names. Best practice is to use bare keys except when absolutely necessary.

This means that all of the string variants listed here are valid as keys (including multi-line keys, by my read of the spec): https://github.com/toml-lang/toml#user-content-string

This comment has been minimized.

@alexcrichton

alexcrichton Jan 19, 2016

Author Member

In reading the spec, however, there's a difference between a string literal and a "basic string", e.g. the term "basic string" is used to reference strings that are surrounded by double quotes, and the term for strings surrounded by single quotes is "literal string". This interpretation is also confirmed by the proposed ABNF where the key production only allows for quotation-mark-surrounded strings.

Now this may all just be a misunderstanding, though, as it may be intended that literal (single-quote) strings are allowed in keys as well

This comment has been minimized.

@jethrogb

jethrogb Jan 19, 2016

Contributor

What about literal strings? That looks like what we'd need here...

This comment has been minimized.

@BurntSushi

BurntSushi Jan 19, 2016

Member

@alexcrichton Interesting. You might be right about that!

This comment has been minimized.

@BurntSushi

BurntSushi Jan 19, 2016

Member

Here's the issue for those following along at home: toml-lang/toml#354

@arcnmx

This comment has been minimized.

Copy link

arcnmx commented Nov 11, 2015

Can we also discuss how this itegrates with build scripts?

  • Will cfg items exposed through build scripts like println!("cargo:rustc-cfg=etc") be available or will these conditions be evaluated beforehand?
  • It will be really nice if build scripts can take advantage of this functionality too! I like the --print option, or cargo could provide another environment variable with its results.

The proposed alternative around compiling and running a program won't work well in cross-compiling environments or those where a linker, libstd, or libcore aren't available even if you could execute binaries produced by rustc. And cross-compiling is where this RFC's functionality is very useful!

@Ericson2314 Ericson2314 referenced this pull request Nov 11, 2015

Closed

Tracking issue for libcore + no_std stabilization #27701

4 of 4 tasks complete
@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Nov 11, 2015

Right now the resolution and compilation steps in Cargo are very separate, and this is done to ensure that resolution does not require taking time to compile dependencies just to discover it may not have needed to compile those dependencies, for example. As a result I don't belief that any cfg annotations exported by build scripts will be available for use here (they're just too dynamic), but it's a good point and should be documented!

Also note that running a binary should be fine for Cargo to do as it always compiles for the host architecture in that case and then informs the binary about the target (e.g. how build scripts work today).

@arcnmx

This comment has been minimized.

Copy link

arcnmx commented Nov 11, 2015

@alexcrichton

Also note that running a binary should be fine for Cargo to do as it always compiles for the host architecture in that case and then informs the binary about the target (e.g. how build scripts work today).

Sure, and then your resulting cfg!() values will be completely wrong if you were to omit --target when building it!

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Nov 11, 2015

Oh right of course, I didn't connect those dots (thought you were talking about custom build scripts)

@arcnmx

This comment has been minimized.

Copy link

arcnmx commented Nov 11, 2015

Sorry, yeah, those were separate. In any case it would be very useful for build scripts to be able to access this information in some form: either by using rustc --print cfg or cargo sending along the info in an environment variable!

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Nov 11, 2015

Oh I see what you meant now! Yeah I could see Cargo passing along something like TARGET_CFG_<val> down to build scripts for what was discovered from --print cfg, although I'd personally prefer to punt on that for now. Build scripts at least are much more amenable to pulling in a library which parses target strings and perhaps gleans information from them much more readily.

@Stebalien

This comment has been minimized.

Copy link
Contributor

Stebalien commented Nov 13, 2015

Alternatively, allow users to specify custom targets and predefine a few useful targets.

Predefined targets:

[target.x86_64-pc-windows-gnu]
cfg = 'all(target_os = "windows", target_arch = "x86_64", target_env = "gnu")'

# for all triples...

[target.macos]
cfg = 'target_os = "macos"'

# for all OSs...

[target.unix]
cfg = 'unix'

Example usage:

[target.unix.dev-dependencies]
libc = "*"

On compile, cargo would include all targets with matching cfgs.

The primary benefit is that targets get descriptive names which can be used multiple times. That is,

[target.sane]
# no one wants to repeat this...
cfg = 'any(target_os = "macos", target_os = "linux", feature = "sane")'

[target.sane.dev-dependencies]
quickcheck = "*"

[target.sane.dependencies]
some_other_lib = "*"

The secondary benefit is that you don't have the weird [target.cfg(....).something] syntax.

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Nov 13, 2015

@Stebalien What would windows be? cfg(windows) or cfg(target_os = "windows")?

@Stebalien

This comment has been minimized.

Copy link
Contributor

Stebalien commented Nov 13, 2015

@retep998 Is there a difference between the two?

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Nov 13, 2015

@Stebalien I hope there isn't. It would have to be a really weird obscure platform for it to be both Windows and yet not Windows.

@jethrogb

This comment has been minimized.

Copy link
Contributor

jethrogb commented Nov 13, 2015

cfg(windows) means that Target.options.is_like_windows is true. cfg(target_os = "windows") means that Target.target_os is "windows". One could imagine target_os being something different like "wince" or "winphone".

@Stebalien

This comment has been minimized.

Copy link
Contributor

Stebalien commented Nov 13, 2015

wine versus windows would probably be a better comparison. I guess we'd need os_windows, os_linux, os_macos, unix, and windows.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Nov 16, 2015

@Stebalien that's an interesting suggestion! Sounds kinda like #831 and it's a good point that the #[cfg] syntax today can get quite wordy, even if it's less wordy than explicitly listing each triple today

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Nov 22, 2015

@Daggerbot That example would actually be incorrect for cross compiling scenarios where cfg!(windows) for the host triple could be different than for the target triple.

@Stebalien

This comment has been minimized.

Copy link
Contributor

Stebalien commented Nov 22, 2015

@Daggerbot this would be a usability/tooling nightmare.

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Dec 10, 2015

@rust-lang/tools any chance we can start moving on this? Looks like conversation has died down.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Dec 11, 2015

@sfackler yes I hope to place this into FCP during our next triage meeting, although we meet every other week instead of every week.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Dec 16, 2015

🔔 This RFC is now entering its final comment period 🔔

@Stebalien

This comment has been minimized.

Copy link
Contributor

Stebalien commented Dec 16, 2015

I still feel very strongly that my modified version (#1361 (comment)) is much more readable/maintainable.

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Dec 17, 2015

I think @Stebalien 's idea looks nicer than what is originally proposed in this RFC.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Dec 18, 2015

@Stebalien

I definitely agree that some form of aliasing would be nice, but I'm hesitant to move on it just yet. I think that there's a pretty nice backwards-compatibility story where we can support the syntax in this RFC but then later add the ability to alias later on in the future. Some downsides of pushing harder on having these aliases today are:

  • It's less consistent with Rust itself, which this syntax is targeted at mirroring. Rust may one day gain the ability to alias a bunch of #[cfg] options, and we'd probably want to make sure the Cargo and Rust syntaxes are the same.
  • New platforms don't gain the benefit of target.platform.dependencies immediately, they'll have to wait for new versions of Cargo to be released down the road which includes these aliases. Whereas with scraping rustc --print cfg syntax means that Cargo will automatically support the new platform with normal syntax as soon as a compiler is found that supports it.

I think that in any world we're likely to want the more exhaustive syntax as an option at least, so for now I'd prefer to stick to the more conservative route that mirrors Rust closesly and isn't too complicated.

How does that sound?

@Stebalien

This comment has been minimized.

Copy link
Contributor

Stebalien commented Dec 18, 2015

It's less consistent with Rust itself, which this syntax is targeted at mirroring. Rust may one day gain the ability to alias a bunch of #[cfg] options, and we'd probably want to make sure the Cargo and Rust syntaxes are the same.

Most cross platform libraries I write look like this:

// platform dependent code
mod impl {
    #[cfg(any(target_os = "macos", target_os = "linux", feature = "sane"))]
    mod sane;
    #[cfg(any(target_os = "macos", target_os = "linux", feature = "sane"))]
    use sane::*;
}

Which is exactly what I'm trying to achieve here. I actively avoid putting cfgs inline in the body of my code to avoid copying them around and ifdef hell.

New platforms don't gain the benefit of target.platform.dependencies immediately, they'll have to wait for new versions of Cargo to be released down the road which includes these aliases. Whereas with scraping rustc --print cfg syntax means that Cargo will automatically support the new platform with normal syntax as soon as a compiler is found that supports it.

I doubt the core set of platforms will change much over time. Regardless, we can simply ship a platforms.toml with rust.

I think that in any world we're likely to want the more exhaustive syntax as an option at least.

I disagree. Putting cfgs in toml keys looks really hacky to me and is no more flexible.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Dec 18, 2015

I would personally like to keep the scope of this RFC to be relatively small, so if we start diving into the realm of shipping TOML configuration with the compiler it's straying pretty far from the intentions. I also think it's important to consider what the typical usage of this will look like.

For example, I suspect most target specific dependencies fall under cfg(windows) or cfg(unix), in which case the difference between the two I think is subjective

[target."cfg(unix)".dependencies]

# ...

[target.unix.dependencies]

If you start going outside that realm, I suspect that most dependencies are sort of one-off configured. I would consider there to be a drastic difference between

[target."cfg(target_arch = \"x86_64\")".dependencies]

# ...

[target.x86_64]
cfg = 'cfg(target_arch = "x86_64")'

[target.x86_64.dependencies]

If you note that we shouldn't support the explicit syntax, then I think that if we have to provide a built-in configuration for all possible "common permutations" to make the usage ergonomic is a bit of a non-starter. There is also the question of when Cargo sees target.<string>.dependencies, how does it know whether string is an alias for more cfgs or whether it's an actual custom target spec? Today it just treats it entirely opaquely.

@Stebalien

This comment has been minimized.

Copy link
Contributor

Stebalien commented Dec 18, 2015

For example, I suspect most target specific dependencies fall under cfg(windows) or cfg(unix), in which case the difference between the two I think is subjective

I completely agree. I just subjectively find [target."cfg(unix)".dependencies] to be ugly (specifically, the internal quotes). 😉

If you start going outside that realm, I suspect that most dependencies are sort of one-off configured ... If you note that we shouldn't support the explicit syntax, then I think that if we have to provide a built-in configuration for all possible "common permutations" to make the usage ergonomic is a bit of a non-starter.

You're right. This is a significant problem with my design.

There is also the question of when Cargo sees target..dependencies, how does it know whether string is an alias for more cfgs or whether it's an actual custom target spec? Today it just treats it entirely opaquely.

It's always an alias for a cfg:

let mut config = parse();
let targets = config.remote("target");
for (_, target) in targets {
    if eval_cfg(target["cfg"]) {
        config.merge(target);
    }
}

One off configs are a good point. I'm still not happy with the syntax but I'm also no longer very happy with mine so, no more objections.

However, it would be nice if we could get TOML to treat parentheses as a form of quotes (token tree quotes?). That is, allow target.cfg(not(target_os = "linux")).dependencies but not target.cfg()).dependencies.

# Detailed design
[design]: #detailed-design

The target-specific dependency syntax in Cargo will be expanded to to include

This comment has been minimized.

@nrc

nrc Dec 20, 2015

Member

typo: "to to"

[target."cfg(unix)".dependencies]
unix-socket = "0.4"
[target."cfg(target_os = \"macos\")".dependencies]

This comment has been minimized.

@nrc

nrc Dec 20, 2015

Member

this syntax is really ugly - is it a TOML restriction that we have to wrap everything in quotes? It'd be great if we could have [target.cfg(unix).dependencies]

This comment has been minimized.

@alexcrichton

alexcrichton Dec 21, 2015

Author Member

Unfortunately, yeah, this is a TOML restriction that keys with "odd characters" are required to be in quotes. It's also a restriction that the key needs to be in double quotes instead of single quotes currently.

# Unresolved questions
[unresolved]: #unresolved-questions

* This is no the only change that's known to Cargo which is known to not be

This comment has been minimized.

@nrc

nrc Dec 20, 2015

Member

typo: no -> not

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Dec 20, 2015

lgtm. I'm in favour of the version proposed in the RFC - the symmetry with rustc is nice - the less Cargo is a special world and more like Rust, the better, IMO. Sounds like target aliases are nice to have too, but (IIUC) implementing the RFC as proposed and later adding a generic target alias mechanism would subsume the alternative proposed above.

I would really like to drop the "" if possible.

@tbu-

This comment has been minimized.

Copy link
Contributor

tbu- commented Dec 23, 2015

The Rust syntax doesn't really look good in TOML in my opinion.

@michaelwoerister

This comment has been minimized.

Copy link

michaelwoerister commented Jan 12, 2016

The version proposed in the RFC looks good to me .

@brson

This comment has been minimized.

Copy link
Contributor

brson commented Jan 13, 2016

I'm torn but also kind of ambivalent and want this to get done. The aesthetics of @Stebalien's is a lot better to me. The indirection it adds with the target 'aliases' is good and bad: it could avoid lots of cfg duplication, but requires more code to set up than the OP RFC. I'm not sure how I feel about predefining cfg target aliases. @Stebalien's looks to me like sugar over the original proposal, but adding both is pretty bloaty.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 16, 2016

We discussed this in the tools triage meeting this past Wednesday, and the conclusion was that there wasn't enough consensus to move forward with the proposed syntax yet.

Syntax such as @Stebalien's proposal is easier to read, but unfortunately also has downsides. It was decided that it was probably too complicated to implement one with the intent that if it didn't work another would be implemented, so we wanted to narrow it down ahead of time (e.g. now).

One part of the discussion I believe we should assume is that TOML will not change. I suspect that it's pretty unlikely the spec will change much at this point, especially for features such as "this is ok so long as there are balanced delimeters" (although it would be quite nice...)

@jethrogb

This comment has been minimized.

Copy link
Contributor

jethrogb commented Jan 16, 2016

One part of the discussion I believe we should assume is that TOML will not change.

Not even the " vs. ' internal quotes?

cc @BurntSushi

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 19, 2016

That's something we might be able to modify TOML with, but even that may be a bit of a stretch and I wouldn't want to hinge acceptance of this RFC on a modification like that.

@BurntSushi

This comment has been minimized.

Copy link
Member

BurntSushi commented Jan 19, 2016

@jethrogb Sorry, I hadn't seen that question. I responded.

@jethrogb

This comment has been minimized.

Copy link
Contributor

jethrogb commented Jan 23, 2016

I read through this entire PR again, and aside from @Stebalien's proposal which he withdrew, I don't really see any alternative proposals. The remaining complaints are mostly "doesn't look good". With today's change in the TOML spec allowing us to get rid of the backslashes in all cases I think that means that the currently proposed RFC is the best and only proposal out there right now and should be our way forward.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 23, 2016

@jethrogb ah yes thanks for the reminder! To clarify the syntax for strings-in-table-keys in TOML has been tweaked to allow, for example:

[target.'cfg(foo = "bar")'.dependencies]
# ...

I've updated the RFC to reflect this.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 26, 2016

For those interested, I've implemented this in a branch, currently in this commit.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 29, 2016

The tools team discussed this in triage this past week, and the decision was to merge. The recent change in TOML syntax should allow for much more ergonomic usage of string-like #[cfg] annotations, and it doesn't seems that any one syntax is clearly better than another, so we're going to move forward with what's proposed here.

Thanks regardless again for the discussion everyone!

@alexcrichton alexcrichton merged commit 60e6d04 into rust-lang:master Jan 29, 2016

@ruuda

This comment has been minimized.

Copy link

ruuda commented Jan 31, 2017

For future reference: the implementation of this RFC became available with Rust and Cargo version 1.8.0, released 2016-04-14.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.