Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow "artifact dependencies" on bin, cdylib, and staticlib crates #3028

Merged
merged 34 commits into from Jan 23, 2021

Conversation

jyn514
Copy link
Member

@jyn514 jyn514 commented Nov 30, 2020

Rendered

This RFC was co-authored by @jyn514 and @joshtriplett.

See rust-lang/cargo#4316 and rust-lang/cargo#7804 for previous discussions of this. Also discussed on Zulip.

Some motivating use cases (taken from the RFC):

@joshtriplett joshtriplett added I-nominated T-cargo Relevant to the Cargo team, which will review and decide on the RFC. labels Dec 1, 2020
@programmerjake
Copy link
Member

Awesome! Additional usecase: building wasm32-wasi executable that is then loaded by an example binary that is a simple demo webserver, like is used in power-cpu-sim (hosted version, in case you like cpu design).

Copy link

@matu3ba matu3ba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions from me, although its unclear to me what a lib exactly can be. I guess it is either an rlib (static with compiler linking) or staticlib (static without compiler linking) or dylib (dynamic library).

I am not sure at all, if proc-macro has a special treatment within cargo or only from rustc.

text/0000-cargo-binary-dependencies.md Outdated Show resolved Hide resolved
text/0000-cargo-binary-dependencies.md Outdated Show resolved Hide resolved
text/0000-cargo-binary-dependencies.md Outdated Show resolved Hide resolved
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up! I feel like a feature along these lines is good to add to Cargo, but I do have some concerns about the motivation and some of the specifics below.

One concern I have is related to depending on binaries. I'm personally not convinced that what's proposed in this RFC is necessary or necessarily something that Cargo needs. One of the major drawback I see of "simply depend on a binary" is that running binaries is often quite involved and nontrivial. Fiddling with Command, having good error reporting, and dealing with subprocess I/O is not always trivial and if a widely-used package simply provides a binary it means that all consumers are duplicating all this logic for handling the subprocess management. The alternative, however, having a library, provides an author an opportunity to make a much easier to use Rust API, document it, provide examples, etc. An author providing the binary would still be able to build the binary itself in the build script and reference it from the crate that runs the binary.

An additional concern I would have about binaries is that it sort of feels like we're pushing Cargo towards an entire package manager. The example given in the RFC is related to cmake, but as the maintainer of the cmake crate I would not want to be responsible for shipping cmake's source code and building it if it's not already in the environment (or somehow handling "the environment is too old a version from what you requested). In general I can't ever really see myself managing build tools like that, especially for ubiquitous ones like make, cmake, clang, etc.

Overall I feel that the most useful part of this RFC is being able to build multiple artifacts for possibly different targets, but I'm not sure that it's best served with a depencency-level directive rather than a package-level directive (commented more below). Otherwise I can't personally think of a concrete use case where this would be an improvement to a build that's otherwise quite complicated to do today (e.g. would presumably be multiple cargo invocations driven by some sort of other build system)


There are many different possible use cases.

- [Testing the behavior of a binary](https://github.com/rust-lang/cargo/issues/4316#issuecomment-361641639). Currently, this requires invoking `cargo build` recursively, or running `cargo build` before running `cargo test`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this RFC is necessary in this case for binaries defined in the local package. All integration test targets implicitly depend on all binary targets in a package today.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @Twey might have been talking about using binaries in unit tests. I'll let them describe their use case, I don't know anything other than the linked comment.

There are many different possible use cases.

- [Testing the behavior of a binary](https://github.com/rust-lang/cargo/issues/4316#issuecomment-361641639). Currently, this requires invoking `cargo build` recursively, or running `cargo build` before running `cargo test`.
- [Running a binary that depends on another](https://github.com/rust-lang/rustc-perf/tree/master/collector#how-to-benchmark-a-change-on-your-own-machine). Currently, this requires running `cargo build`, making it difficult to keep track of when the binary was rebuilt. The use case for `rustc-perf` is to have a main binary that acts as an 'executor', which executes `rustc` many times, and a smaller 'shim' which wraps `rustc` with additional environment variables and arguments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this perhaps be expanded a bit? I'm not really sure what this is referring to, because cargo build builds all the binaries today in a package. Does rustc-perf have different packages with binaries that depend on each other?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the binaries depend on each other, so you can't use cargo run because that requires a specific binary. cargo build does work, but means you have to run the binary as target/release/collector, so it's easy to forget to recompile after a change.

There are four `type`s available:
1. `"lib"`, the default
2. `"bin"`, a crate building one or more binaries
3. `"cdylib"`, a C-compatible dynamic library
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about cdylib/staticlib here in the sense that these often have canonical meanings which Cargo would otherwise not be interpreting. For example if you depend on a lib Cargo will pass all the right --extern flags and make sure that it makes its way to the final binary. With a cdylib and/or staticlib you may want that as well.

For example you might want to consume a crate's C API and want to have that automatically linked into the final format. By only supporting cdylib and staticlib here and nothing else about these crate types I'd fear that we'd get into a situation where depending on one of these crate types involves a lot of boilerplate that Cargo could otherwise handle if we took some time to game out what it meant.

Is it possible to game out here what the main use cases are here and see if we can improve the Cargo experience?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton This isn't intended for the use case where you just want to link your binary (or library) to the produced library. For that use case, I absolutely agree that cargo should provide more integration. This is intended for the use case where the library serves some other function, such as a runtime library for an interpreter (linked into interpreted programs rather than just the interpreter), or a VDSO for a kernel (linked into userspace programs run on that kernel), or a preloaded library for use with LD_PRELOAD, or a remote debugging stub injected into another program or device. In those cases, you really do just want the compiled artifact.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the problem lies exactly within type="lib" providing some kind of integration (adding --extern to rustc invocations) and type="cdylib" or type="staticlib" not doing so. Or vice versa. This difference is not easily inferred by just reading a toml file that contains both or one of the settings.

And yet the options change enough about the build process that anybody who attempts to understand the dependency structure needs to be very aware of the difference in behaviour here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nagisa I don't think it's reasonably possible to have "default" handling for cdylib, because linking a shared library doesn't guarantee it'll be available at runtime.

That said, if we believe there's some value in having non-artifact dependencies on cdylib or staticlib, with automatic handling, we could flag artifact dependencies in some way. For instance, type="artifact:cdylib". We could then reserve type="cdylib" for something else. I don't think that's necessary, but it's a reasonable possibility, and this feature would still be suitable for the intended purposes.

Copy link
Member

@nagisa nagisa Dec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option that comes to mind is to just remove type="lib" for now – making the default integrated beaviour only happen when the type is absent; or perhaps make it behave much like other types, in that it would only provide the rlib as an artifact, without any integrated behaviour.

In that case people who want both the "integrated" behaviour as well as an artifact, they could specify the dependency multiple times, once with a type and once without.

- `[dependencies]`
- `[dev-dependencies]`

By default, `build-dependencies` are built for the host, while `dependencies` and `dev-dependencies` are built for the target. You can specify the `target` attribute to build for a specific target, such as `target = "wasm32-wasi"`; a literal `target = "target"` will build for the target even if specifing a build dependency. (If the target is not available, this will result in an error at build time, just as if building the specified crate with a `--target` option for an unavailable target.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One worry I would have about supplying a target feature is that this sort of attribute is often better placed on the package itself. For example if you're depending on a wasm blob the package that compiles to wasm is unlikely to ever not want to get compiled to wasm, but this means that you'd have to specify the target every single time you depend on it.

I think that this sort of attribute on a dependency would work but it isn't clearly the best solution available to us I think. I think it would be worthwhile to explore the use cases of this and see if they'd be better suited with a dependency-level or package-level target declaration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting something like [package] default-target = "wasm32-wasi"? I'd be interested in that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One worry I would have about supplying a target feature is that this sort of attribute is often better placed on the package itself. For example if you're depending on a wasm blob the package that compiles to wasm is unlikely to ever not want to get compiled to wasm, but this means that you'd have to specify the target every single time you depend on it.

I disagree: for power-cpu-sim (linked above), it is specifically designed to be able to compile to a command-line program for linux/etc. and the exact same crate is also designed to be compiled for the web using target wasm32-wasi and there is an example web server which runs on linux and requires the produced wasm file to run.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I'm not saying that a dependency-level directive isn't useful. Nor does one reason why it is useful mean it's more useful than a package-level directive. My point is that the package-level directive is probably useful in more cases.

@jyn514 yes that's what I'm thinking (subject to bikeshedding of course)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton A package-level directive for the default also makes sense, but that would be a separate proposal. If a package truly supports only one target, a package-level directive would suffice. The target attribute on a dependency is intended for cases where the dependency supports multiple targets.

For example, consider a crate that builds virtual machine firmware, and supports multiple architectures. If you're building a virtual machine for a specific architecture, you need firmware for that architecture.

Or, consider a kernel that supports both 64-bit and 32-bit userspace applications, and needs to build different versions of a VDSO for each case.

I also expect target = "target" on build dependencies to be quite common: "no, I don't want to run this as part of the build process, I want this to be runnable on the target system instead".

I would also expect to see cases like this:

[build-dependencies]
thing32 = { package = "thing", version = "...", target = "i686-unknown-linux-gnu" }
thing64 = { package = "thing", version = "...", target = "x86_64-unknown-linux-gnu" }

Copy link
Member Author

@jyn514 jyn514 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't personally think of a concrete use case where this would be an improvement to a build that's otherwise quite complicated to do today (e.g. would presumably be multiple cargo invocations driven by some sort of other build system)

The goal here is to avoid having a build system on top of cargo - rustc-perf for instance just asks its users to deal with manually running cargo build && target/release/collector.

Fiddling with Command, having good error reporting, and dealing with subprocess I/O is not always trivial and if a widely-used package simply provides a binary it means that all consumers are duplicating all this logic for handling the subprocess management. The alternative, however, having a library, provides an author an opportunity to make a much easier to use Rust API, document it, provide examples, etc. An author providing the binary would still be able to build the binary itself in the build script and reference it from the crate that runs the binary.

This requires the author to first be interested in providing a library. My original use case was for running compiletest in integration tests, which is currently a binary and doesn't have a natural programmatic API. There is https://github.com/laumann/compiletest-rs, but that's a fork and the story around it is ... complicated. It would be nice to be able to use compiletest in-tree without having to come up with a whole API no one will use but rustdoc.

There are many different possible use cases.

- [Testing the behavior of a binary](https://github.com/rust-lang/cargo/issues/4316#issuecomment-361641639). Currently, this requires invoking `cargo build` recursively, or running `cargo build` before running `cargo test`.
- [Running a binary that depends on another](https://github.com/rust-lang/rustc-perf/tree/master/collector#how-to-benchmark-a-change-on-your-own-machine). Currently, this requires running `cargo build`, making it difficult to keep track of when the binary was rebuilt. The use case for `rustc-perf` is to have a main binary that acts as an 'executor', which executes `rustc` many times, and a smaller 'shim' which wraps `rustc` with additional environment variables and arguments.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the binaries depend on each other, so you can't use cargo run because that requires a specific binary. cargo build does work, but means you have to run the binary as target/release/collector, so it's easy to forget to recompile after a change.


There are many different possible use cases.

- [Testing the behavior of a binary](https://github.com/rust-lang/cargo/issues/4316#issuecomment-361641639). Currently, this requires invoking `cargo build` recursively, or running `cargo build` before running `cargo test`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @Twey might have been talking about using binaries in unit tests. I'll let them describe their use case, I don't know anything other than the linked comment.

- `[dependencies]`
- `[dev-dependencies]`

By default, `build-dependencies` are built for the host, while `dependencies` and `dev-dependencies` are built for the target. You can specify the `target` attribute to build for a specific target, such as `target = "wasm32-wasi"`; a literal `target = "target"` will build for the target even if specifing a build dependency. (If the target is not available, this will result in an error at build time, just as if building the specified crate with a `--target` option for an unavailable target.)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting something like [package] default-target = "wasm32-wasi"? I'd be interested in that.

@joshtriplett
Copy link
Member

@alexcrichton, I responded inline to several of your comments, but I also wanted to respond to your top-level comment:

Thanks for writing this up! I feel like a feature along these lines is good to add to Cargo, but I do have some concerns about the motivation and some of the specifics below.

One concern I have is related to depending on binaries. I'm personally not convinced that what's proposed in this RFC is necessary or necessarily something that Cargo needs. One of the major drawback I see of "simply depend on a binary" is that running binaries is often quite involved and nontrivial. Fiddling with Command, having good error reporting, and dealing with subprocess I/O is not always trivial and if a widely-used package simply provides a binary it means that all consumers are duplicating all this logic for handling the subprocess management. The alternative, however, having a library, provides an author an opportunity to make a much easier to use Rust API, document it, provide examples, etc. An author providing the binary would still be able to build the binary itself in the build script and reference it from the crate that runs the binary.

Some crates may want to wrap the usage of the binary in an API, but that isn't always the case. That assumes your primary use case is to invoke the binary in some standardized way from Rust code via Command or similar. You may, instead, want to:

  • Invoke a different binary (such as make or ./configure or ./do-build.sh) that expects to find the binary you depend on via the $PATH.
  • Embed the binary, rather than running it: firmware, plugins, etc.
  • Run the binary in some non-standard way, for which wrapping it in an API would be non-trivial.
  • Have multiple different crates providing different APIs on top of the same binary, but only one crate with the knowledge of how to build and supply that binary.

An additional concern I would have about binaries is that it sort of feels like we're pushing Cargo towards an entire package manager. The example given in the RFC is related to cmake, but as the maintainer of the cmake crate I would not want to be responsible for shipping cmake's source code and building it if it's not already in the environment (or somehow handling "the environment is too old a version from what you requested). In general I can't ever really see myself managing build tools like that, especially for ubiquitous ones like make, cmake, clang, etc.

We can certainly change the example to a more hypothetical build tool, if you'd prefer. I'd like to talk to you at more length about the idea of building cmake from source, though. (I'd expect to put that in something like a cmake-sys or cmake-bin crate, rather than directly in the cmake crate; cmake could just have a vendored feature or similar, to optionally make use of that.) This would help many Windows users, for which it'd be substantially easier than installing cmake on the host system.

For something like clang, I'd love to provide a sysroot crate that supplies clang and related binaries from precisely the version of LLVM used to build Rust, which would make things like cross-language LTO trivial.

This would be even more helpful for less ubiquitous build tools that not everyone already has, particularly in environments where installing tools is not trivial.

Overall I feel that the most useful part of this RFC is being able to build multiple artifacts for possibly different targets, but I'm not sure that it's best served with a depencency-level directive rather than a package-level directive (commented more below). Otherwise I can't personally think of a concrete use case where this would be an improvement to a build that's otherwise quite complicated to do today (e.g. would presumably be multiple cargo invocations driven by some sort of other build system)

Several of the use cases proposed in the RFC are specifically cases where today someone would have to drive Cargo with some other build system, and this RFC would allow just using cargo build instead.

@programmerjake
Copy link
Member

For something like clang, I'd love to provide a sysroot crate that supplies clang and related binaries from precisely the version of LLVM used to build Rust, which would make things like cross-language LTO trivial.

That's exactly what I need for power-cpu-sim, and I have to take up half the readme with instructions for how to install clang-11 using LLVM's APT repos. Replacing that all with just cargo build or maybe rustup component add would be awesome!

@joshtriplett
Copy link
Member

I submitted a PR at jyn514#3 to expand the alternatives section. (I don't think the alternative is something we should consider, but it's worth documenting.)

@alexcrichton
Copy link
Member

I think that's a really good point about a binary-calling-a-binary as well as something like *-bin crates (ala cmake-bin). That makes a lot of sense to me, thanks for clarifying! I agree that it'd make sense for the cmake crate (and other build-system-like crates) to have an optional off-by-default dependency on "vendor the build support".

One question I would have though is how is it expected to hook up a foreign binary into Cargo? For example it looks like the env vars specified to downstream crates are based on Cargo-based-[[bin]] targets, so how would binaries produced by the build script (e.g. cmake) get specified?

(also FWIW I don't mind cmake as an example, I just didn't know what was expected of the literal cmake crate if this RFC were to get merged)

I do feel like this is a bit pie-in-the-sky for tools like cmake/clang/gcc though given that their build times is so long and ubiquity is so high. For smaller binaries and/or less-widely-known ones, however, I could imagine this being quite useful.

Several of the use cases proposed in the RFC are specifically cases where today someone would have to drive Cargo with some other build system, and this RFC would allow just using cargo build instead.

I think this makes sense as motivation, but after reading the RFC I was left with the impression that this is just a small feature in Cargo which may not be able to supplant other wrapper build systems. That being said this seems like a necessary step in any case. I'm sure this will unlock some use cases but it seems like it will only make progress on others (which is of course fine!).

@alexcrichton
Copy link
Member

One concern about this RFC with targets I'm realizing now is that "it's just cargo build" isn't quite right. If a crate has a variety of targets specified it's actually "run this set of rustup target add commands" (which is project-specific and perhaps not easily discoverable) and then "run cargo build". In terms of ease-of-use that may be a relatively high-priority item to tackle around this RFC as well?

@jyn514
Copy link
Member Author

jyn514 commented Dec 10, 2020

If a crate has a variety of targets specified it's actually "run this set of rustup target add commands" (which is project-specific and perhaps not easily discoverable) and then "run cargo build".

Those are targets that would already be built anyway, right? Like if cargo is building from host to target, it needs both standard libraries anyway (one for build scripts and one for the library being cross-compiled). In that case I don't think that needs to be addressed in the RFC - the only time this would break is when cargo build would already break.

@alexcrichton
Copy link
Member

My point is that one of the major motivations it seems for this RFC is "let's get more projects simply doing cargo build". Having to rustup target add a bunch of targets makes me feel that it's not quite achieving that goal, especially if it's encouraged to add target-specific builds to dependencies.

@bjorn3
Copy link
Member

bjorn3 commented Dec 10, 2020

With the new rust-toolchain format it is possible to specify which targets should be installed automatically.

@andrewdavidmackenzie
Copy link

I would like to add my use-case that makes this desirable, in case it helps folks working in it:

Overview

I have a workspace project with multiple packages. The two important ones are:

  • flowc - which is a lib and a binary
  • flowstdlib - which is a library of routines, to be built using flowc

flowc is a kind of compiler that I build as part of the project.

flowstdlib has a build script, that uses the flowc built binary to build that package, so I need the flowc compiler binary to be built when flowstdlib is to be built.

Current Situation

In cargo.toml of flowstdlib I define flowc as a build dependency:

[build-dependencies]
flowc = {path = "../flowc", version = "0.31.0" }`

The build compiles multiple packages at the same time in parallel:

   Compiling flowstdlib v0.31.0 (/Users/andrew/workspace/flow/flowstdlib)
   Compiling flowsamples v0.31.1 (/Users/andrew/workspace/flow/samples)
warning: Could not find `flowc` in $PATH or `target/debug`, so cannot build flowstdlib
error: failed to run custom build command for `flowsamples v0.31.1 (/Users/andrew/workspace/flow/samples)`

and the flowstdlib build fails as the flowc binary is not built yet.

Since the build continues and eventually finishes building flowc, if I re-run the build, it will work the second time around (as flowc binary is now found).

Desired State

Have the flowstdlib package build depend on the build of the the flowc binary having completed.
(without forcing a non-parallel build)

@demurgos
Copy link

Hi,
I've read the RFC and am very interested in it. I have two use cases for this, both are variants of the use-case "Installing a local command-line tool that is scoped to the project.".

  1. When building WASM modules, I use the lib wasm-bindgen and the binary wasm-bindgen-cli. I want to ensure that both the library and corresponding CLI have the same version for a given project. Without project-local binaries, I need to reinstall my global wasm-bindgen-cli when switching between projects using a different lib version. With this proposal, I could define a dependency in each project and invoke it through build.rs.

  2. Similarly, I want to use the cargo sqlx command provided by sqlx-cli without having to install it globally but having it bound to a project. I hope that a future RFC would enable cargo to load subcommands from the local project, but until then I will be able to use it from the build.rs script.

@jyn514
Copy link
Member Author

jyn514 commented Dec 11, 2020

My point is that one of the major motivations it seems for this RFC is "let's get more projects simply doing cargo build". Having to rustup target add a bunch of targets makes me feel that it's not quite achieving that goal, especially if it's encouraged to add target-specific builds to dependencies.

FWIW I would be ok with removing the target-specific parts of the RFC, or limiting it only to the host or --target. @joshtriplett what do you think?

@programmerjake
Copy link
Member

My point is that one of the major motivations it seems for this RFC is "let's get more projects simply doing cargo build". Having to rustup target add a bunch of targets makes me feel that it's not quite achieving that goal, especially if it's encouraged to add target-specific builds to dependencies.

FWIW I would be ok with removing the target-specific parts of the RFC, or limiting it only to the host or --target.

I think keeping target-specific parts is a major benefit of this rfc.
I'd anticipate cargo and rustup growing to be able to have cargo automatically install missing targets, though, even if that doesn't happen right away, having cargo build work after initial installation of required targets is itself still a major benefit.

@alexcrichton
Copy link
Member

@demurgos that's actually an example I'm worried about with this RFC, actually. It seems like it should work for wasm-bindgen but this would not be usable for wasm-bindgen. This RFC isn't intended for postprocessing artifacts, but only processing preexisting artifacts with external tools into the rest of the build. Nothing here would help wasm-bindgen's experience, I believe.

@jyn514 I'm not saying that the target-specific bits should be removed, what I'm trying to do is point out that this is sort of a half-design in a lot of ways. I think it's solving real problems but it's leaving a lot of other closely related problems unsolved. That's possible and reasonable to do since you don't want to solve anything at once, but at the same time I think it's important to acknowledge what's explicitly not being solved and what pain points will continue to be.

@XAMPPRocky
Copy link
Member

Hey, just wanted to add a use case for artifact dependencies. In rust-gpu we want to be able to build and include SPIR-V shader crates in the main binary which is actually responsible for running them. Currently this implemented in a very hacky way with having a spirv-builder crate which runs cargo inside a build script to build the SPIR-V crate and expose the path to the binary in an environment variable.

This isn't really ideal since there have been a number of issues with calling cargo inside of cargo (from it easily creating directories that are beyond MAX_PATH in Windows and failing to build, to environment variables from the top level cargo build affecting the shader build). It would be so much nicer and simpler if you could say shader = { path = "...", target = "spirv-unknown-unknown" } in your Cargo.toml.

There are four `type`s available:
1. `"lib"`, the default
2. `"bin"`, a crate building one or more binaries
3. `"cdylib"`, a C-compatible dynamic library
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the problem lies exactly within type="lib" providing some kind of integration (adding --extern to rustc invocations) and type="cdylib" or type="staticlib" not doing so. Or vice versa. This difference is not easily inferred by just reading a toml file that contains both or one of the settings.

And yet the options change enough about the build process that anybody who attempts to understand the dependency structure needs to be very aware of the difference in behaviour here.

text/0000-cargo-binary-dependencies.md Outdated Show resolved Hide resolved
@joshtriplett
Copy link
Member

@alexcrichton wrote:

One question I would have though is how is it expected to hook up a foreign binary into Cargo? For example it looks like the env vars specified to downstream crates are based on Cargo-based-[[bin]] targets, so how would binaries produced by the build script (e.g. cmake) get specified?

With the RFC as specified, if you want to produce and depend on a bin artifact, it needs to be a Cargo [[bin]] target. I can think of many use cases for other possibilities, such as passing through a binary produced by one of your dependencies, or building a binary in your build script. If you're open to the possibility, I'd love to have a way to declare a [[bin]] target that's produced by build.rs.

Perhaps something like this?

[[bin]]
name = "foo"
manual = true

A [[bin]] target with manual=true will not have any logic emitted by Cargo; Cargo will just give build.rs an environment variable specifying a directory to put it in, and then error if it doesn't get built by build.rs.

Thoughts?

I do feel like this is a bit pie-in-the-sky for tools like cmake/clang/gcc though given that their build times is so long and ubiquity is so high. For smaller binaries and/or less-widely-known ones, however, I could imagine this being quite useful.

It's a tradeoff, just like any vendoring is a tradeoff. Many -sys crates offer the option of building a vendored copy, with the same tradeoff in build time.

Also, if we have the above mechanism for supplying binaries via build.rs, we could consider supplying a sysroot crate (installable via rustup) containing the LLVM we built rustc with.

One concern about this RFC with targets I'm realizing now is that "it's just cargo build" isn't quite right. If a crate has a variety of targets specified it's actually "run this set of rustup target add commands" (which is project-specific and perhaps not easily discoverable) and then "run cargo build". In terms of ease-of-use that may be a relatively high-priority item to tackle around this RFC as well?

As @bjorn3 observed, that's supported in rustup today, by specifying a rust-toolchain toml file that requires the targets you want. I don't think we need to do anything more than that in this RFC. Any further support for that in Cargo would be a different RFC.

@demurgos that's actually an example I'm worried about with this RFC, actually. It seems like it should work for wasm-bindgen but this would not be usable for wasm-bindgen. This RFC isn't intended for postprocessing artifacts, but only processing preexisting artifacts with external tools into the rest of the build. Nothing here would help wasm-bindgen's experience, I believe.

One day I hope we have a way to run Rust code post-build to generate or post-process artifacts. I want to avoid rolling too much into this RFC, though, and I'd expect post-build code to invite more controversy.

In the meantime, though, if we add a mechanism for supplying binaries, people could technically handle this by building a workspace with two crates, one building the un-preprocessed binary, and the other depending on that and wasm-bindgen and emitting a processed binary. That's not ideal, but it works.

@jyn514 wrote:

FWIW I would be ok with removing the target-specific parts of the RFC, or limiting it only to the host or --target. @joshtriplett what do you think?

Given the ability to enable those targets via a rust-toolchain file, and the value of this for cases like wasm, I'd like to keep this in the RFC.

@XAMPPRocky wrote:

Hey, just wanted to add a use case for artifact dependencies. In rust-gpu we want to be able to build and include SPIR-V shader crates in the main binary which is actually responsible for running them. Currently this implemented in a very hacky way with having a spirv-builder crate which runs cargo inside a build script to build the SPIR-V crate and expose the path to the binary in an environment variable.

This isn't really ideal since there have been a number of issues with calling cargo inside of cargo (from it easily creating directories that are beyond MAX_PATH in Windows and failing to build, to environment variables from the top level cargo build affecting the shader build). It would be so much nicer and simpler if you could say shader = { path = "...", target = "spirv-unknown-unknown" } in your Cargo.toml.

That's an awesome example! I've added it to the examples in the RFC.

@rfcbot
Copy link
Collaborator

rfcbot commented Jan 12, 2021

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Jan 12, 2021
@rfcbot
Copy link
Collaborator

rfcbot commented Jan 12, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Jan 12, 2021
@rfcbot rfcbot added the finished-final-comment-period The final comment period is finished for this RFC. label Jan 22, 2021
@rfcbot
Copy link
Collaborator

rfcbot commented Jan 22, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@rfcbot rfcbot added to-announce and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Jan 22, 2021
@ehuss ehuss merged commit 513198d into rust-lang:master Jan 23, 2021
@ehuss
Copy link
Contributor

ehuss commented Jan 23, 2021

Huzzah! The @rust-lang/cargo team has decided to accept this RFC.

To track further discussion, subscribe to the tracking issue here:
rust-lang/cargo#9096

illicitonion added a commit to illicitonion/cargo that referenced this pull request Oct 13, 2021
A number of projects, such as https://github.com/google/cargo-raze
consume the output of `cargo metadata` when generating input to other
build systems, such as Bazel and Buck. Typically, these projects use a
`Cargo.toml` file as the input to drive their work, and it can be useful
to express to these systems that a binary is required as part of the
build.

If such a dependency happens to have a lib target, sufficient dependency
information happens to be exposed via the resolve field to recreate the
dependency structure adequately.

However, for binary-only targets, that entire branch of the dependency
tree is cut off, making it hard to recreate this information.

This PR adds an option to `cargo metadata` to allow rendering these
subtrees, describing these deps as of kind "binary". This isn't really a
concept in cargo-proper, but may become one via
rust-lang/rfcs#3168 and/or
rust-lang/rfcs#3028 - this output format is
forwards-compatible with these proposals.

In an RFC 3028 world, binary deps would appear as `dep_kind` "binary",
and `cdylib` or `staticlib` deps would appear as new `DepKind` variants.
We'd probably add a new value of the flag, `declared`, and set that to
be the default. We may also want to deprecate the
`include-if-no-library-dep` value, and make these ignored binary deps a
hard error (replacing the existing warning log).

In an RFC 3168 world, there's an interesting metadata question to have -
there are (at least) three reasonable options in terms of handling
metadata:
1. Require a direct dependency on any package whose binary deps are
   being used from the crate which uses them - this gives a nicely
   restricted search space, and allows for nicely curated metadata
   output.
2. Allow any transitive package dependencies to be used as binary deps -
   this may significantly expand the output produced, but would model
   the implementation of the feature.
3. Require explicitly tagging binary deps as being used as binary deps -
   this looks a lot like RFC 3028, and would tend in that direction.
@jyn514 jyn514 deleted the cargo-binaries branch March 3, 2022 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-cargo Relevant to the Cargo team, which will review and decide on the RFC. to-announce
Projects
No open projects
Status: Implemented
Development

Successfully merging this pull request may close these issues.

None yet