eRFC: Cargo build system integration #2136
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,13 +31,13 @@ After extensive discussion with stakeholders, there appear to be two distinct | |
kinds of use-cases (or "customers") involved here: | ||
|
||
- **Mixed build systems**, where building already involves a variety of | ||
language- or proeject-specific build systems. For this use case, the desire is | ||
language- or project-specific build systems. For this use case, the desire is | ||
to use Cargo as-is, except for some specific concerns. Those concerns take a | ||
variety of shapes: customizing caching, having a local crate registry, custom | ||
handling for native dependencies, and so on. Addressing these concerns well | ||
means adding new points of extensibility or control to Cargo. | ||
|
||
- **Homogenous build systems** like [Bazel], where there is a single prevailing | ||
- **Homogeneous build systems** like [Bazel], where there is a single prevailing | ||
build system and methodology that works across languages and projects and is | ||
expected to drive all aspects of the build. In such cases the goal of Cargo | ||
integration is largely *interoperability*, including easy use of the crates.io | ||
|
@@ -164,22 +164,32 @@ functionality into numerous small pieces that can be re-used when integrating | |
into a larger build system. This finer division is left as a question for | ||
experimentation. | ||
|
||
## Specifics for the homogenous build system case | ||
|
||
For homogenous build systems, there's a key question: how is the Rust code | ||
itself managed, through a crate registry or though some external system? Any | ||
integration has to handle the first case (to have access to crates.io or a | ||
mirror thereof), but organizations can choose whether to manage their own crates | ||
through a custom registry (more on that below) or some other means. | ||
|
||
### Using crates managed by a crate registry | ||
|
||
Using a crate registry implies using Cargo's dependency resolution, and, in | ||
particular, `Cargo.toml`. In this case, the external build system should invoke | ||
Cargo for *at least* the dependency resolution and build configuration steps, | ||
and likely the build lowering step as well. In such a world, Cargo is | ||
responsible for *planning* the build (which involves largely Rust-specific | ||
concerns), but the external build system is responsible for *executing* it. | ||
## Specifics for the homogeneous build system case | ||
|
||
For homogeneous build systems, there are two kinds of code that must be dealt | ||
with: code originally written using vanilla Cargo and a crate registry, and code | ||
written "natively" in the context of the external build system. Any integration | ||
has to handle the first case (to have access to crates.io or a vendored mirror | ||
thereof). | ||
|
||
### Using crates vendored from or managed by a crate registry | ||
|
||
Whether using a registry server or a vendored copy, if you're building Rust code | ||
that is written using vanilla Cargo, you will at some level need to use Cargo's | ||
dependency resolution and `Cargo.toml` files. In this case, the external build | ||
system should invoke Cargo for *at least* the dependency resolution and build | ||
configuration steps, and likely the build lowering step as well. In such a | ||
world, Cargo is responsible for *planning* the build (which involves largely | ||
Rust-specific concerns), but the external build system is responsible for | ||
*executing* it. | ||
|
||
A typical pattern of usage is to have a whitelist of "root dependencies" from an | ||
external registry which will be permitted as dependencies within the | ||
organization, often pinning to a specific version and set of Cargo | ||
features. This whitelist can be described as a single `Cargo.toml` file, which | ||
can then drive Cargo's dependency resolution just once for the entire registry. | ||
The resulting lockfile can be used to guide vendoring and construction of a | ||
build plan for consumption by the external build system. | ||
|
||
One important concern is: how do you depend on code from other languages, which | ||
is being managed by the external build system? That's a narrow version of a more | ||
|
@@ -189,64 +199,54 @@ separately in a later section. | |
#### Workflow and interop story | ||
|
||
On the external build system side, a rule or plugin will need to be written that | ||
knows how to invoke Cargo to produce a build plan, then translate that build | ||
plan back into appropriate rules for the build system. Thus, when doing normal | ||
builds, the external build system drives the entire process, but invokes Cargo | ||
for guidance during the planning stage. | ||
knows how to invoke Cargo to produce a build plan corresponding to a whitelisted | ||
(and potentially vendored) registry, then translate that build plan back into | ||
appropriate rules for the build system. Thus, when doing normal builds, the | ||
external build system drives the entire process, but invokes Cargo for guidance | ||
during the planning stage. | ||
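
As a rough illustration of the translation step described above, here is a minimal sketch. It assumes a build plan can be modelled as a list of `rustc` invocations with dependency edges. The `Invocation` type, its fields, and the `rule(...)` output syntax are hypothetical and are not Cargo's actual plan format or any real build system's rule language.

```rust
/// One step of a hypothetical build plan. The field names are illustrative
/// and do not reflect any format Cargo actually emits.
#[derive(Debug, Clone)]
struct Invocation {
    package: String,
    program: String,
    args: Vec<String>,
    /// Indices (into the plan) of invocations that must finish first.
    deps: Vec<usize>,
}

/// Translate each planned invocation into a rule for an imaginary external
/// build system. Assumes every index in `deps` is valid for `plan`.
fn to_rules(plan: &[Invocation]) -> Vec<String> {
    plan.iter()
        .map(|inv| {
            let deps: Vec<String> = inv
                .deps
                .iter()
                .map(|&i| format!("\":{}\"", plan[i].package))
                .collect();
            format!(
                "rule(name = \"{}\", cmd = \"{} {}\", deps = [{}])",
                inv.package,
                inv.program,
                inv.args.join(" "),
                deps.join(", ")
            )
        })
        .collect()
}

fn main() {
    let plan = vec![
        Invocation {
            package: "log".into(),
            program: "rustc".into(),
            args: vec!["--crate-name".into(), "log".into()],
            deps: vec![],
        },
        Invocation {
            package: "app".into(),
            program: "rustc".into(),
            args: vec!["--crate-name".into(), "app".into()],
            deps: vec![0],
        },
    ];
    for rule in to_rules(&plan) {
        println!("{}", rule);
    }
}
```

The point of the sketch is the division of labor: the plan carries the Rust-specific decisions, and the plugin only re-expresses them in the external system's terms.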
|
||
### Using crates managed by the build system | ||
|
||
Many organization want to employ a their own strategy for maintaining and | ||
versioning code, and for resolving dependencies. In this case, they may wish to | ||
entirely forgo producing a meaningful Cargo.toml for the code the write, instead | ||
having one that just forwards to a plugin. The description of dependencies is | ||
then written in the external build system's rule format. Here, Cargo acts | ||
primarily as a *workflow and tool orchestrator*, since it is not involved in | ||
either planning or executing the build. | ||
Review comment: An organization that chooses to have their own versioning and dependency system (such as the facebooks and googles of the world) are most likely not going to use Cargo at all. Instead they will have their own tooling that calls rustc directly.

Review comment: A significant motivation for this RFC -- which was designed in coordination with FB and Google build engineers -- is precisely to allow them to manage the build process while still (1) getting access to crates.io and (2) integrating with Rust tooling. Indeed, the sentence you attached this to is specifically talking about how, in these cases, all that Cargo is doing is providing a common way for Rust tooling to get information about a Rust project, even if that information is just being provided by an external build system.

Review comment: @aturon while I know that that's what "workflow and tool orchestrator" means, I'm not sure a casual reader would. It would be useful to make this clearer -- specifically that Cargo in this situation will just be the API that RLS/rustfmt etc use.
||
|
||
#### Workflow and interop story | ||
|
||
Even though the external build system is entirely handling both dependency | ||
resolution and build execution for the crates under its management, it may still | ||
use Cargo for *lowering*, i.e. to produce the actual `rustc` invocations from a | ||
higher-level configuration. Cargo will provide a way to do this. | ||
|
||
When *developing* a crate, it should be possible to invoke Cargo commands as | ||
usual. We do this via a plugin. When invoking, for example, `cargo build`, the | ||
Review comment: Would it be possible to invoke the native build system directly instead of using
I'm not sure if this is too low-level for this RFC, but it might be worth talking a bit about where build artifacts go. With Buck, build artifacts always go into
Would tools like the RLS know to read build artifacts from the
||
plugin will translate that to a request to the external build system, which will | ||
in turn re-invoke Cargo to request a build plan (the exact mechanics here are | ||
TBD), and then execute the build. For `cargo run`, the same steps are followed | ||
by putting the resulting build artifact in an appropriate location, and then | ||
following Cargo's usual logic. And so on. | ||
then execute the build (possibly re-invoking Cargo for lowering). For `cargo | ||
run`, the same steps are followed by putting the resulting build artifact in an | ||
appropriate location, and then following Cargo's usual logic. And so on. | ||
|
||
A similar story plays out when using, for example, the RLS or rustfmt. Ideally, | ||
these tools will have no idea that a Cargo plugin is in play; the information | ||
and artifacts they need can be obtained by using Cargo in the appropriate way, | ||
transparently. | ||
and artifacts they need can be obtained by using Cargo in a standard way, | ||
transparently -- but the underlying information will be coming from the external | ||
build system, via the plugin. Thus the plugin for the external build system must | ||
be able to translate its dependencies back into something equivalent to a | ||
lockfile, at least. | ||
|
||
While the details here are quite hazy, the overall point is that control swaps | ||
back and forth between Cargo and the external build system, depending on the | ||
concerns at play. We set things up so that the Rust-specific pieces (including | ||
Cargo workflows) continue to be handled by Cargo whenever possible. | ||
### The complete picture | ||
|
||
### Using "unmanaged" crates | ||
In general, any integration with a homogeneous build system needs to be able to | ||
handle (vendored) crate registries, because access to crates.io is a hard constraint. | ||
|
||
In some cases, an organization may want to employ a their own strategy for | ||
maintaining and versioning code, and for resolving dependencies. In this case, | ||
they may wish to entirely forgo writing a meaningful Cargo.toml, instead having | ||
one that simply forwards to a plugin. The description of dependencies is then | ||
written in the external build system's rule format. Here, Cargo acts primarily | ||
as a *workflow and tool orchestrator*, since it is not involved in either | ||
planning or executing the build. | ||
|
||
#### Workflow and interop story | ||
|
||
As with the workflow for crate registries, a plugin is used to manage control | ||
passing back-and-forth between Cargo and the external build system. The main | ||
difference is that Cargo is used for fewer steps. So, for example, when running | ||
`cargo test`, the external build system is invoked directly (in an appropriate | ||
mode for building tests) and performs the build without consulting Cargo at all | ||
(or, perhaps it uses Cargo strictly for lowering, i.e. to determine `rustc` | ||
invocations from higher-level configuration). Once the build is complete, Cargo | ||
takes over, actually executing the resulting test binary. | ||
|
||
For the RLS or other tools that need to explore the dependency structure of the | ||
crate, again they should work with a clear Cargo interface that hides any use of | ||
plugins. The plugin for the external build system must be able to translate its | ||
dependencies back into something equivalent to a lockfile, at least. | ||
|
||
### A hybrid | ||
|
||
In general, any integration with a homogenous build system needs to be able to | ||
handle crate registries, because access to crates.io is a hard constraint. | ||
|
||
However, it's possible to *mix* this model for crates.io with the model for | ||
unmanaged crates. All that's needed is a distinction within the external build | ||
system between these two kinds of dependencies, which then drives the plugin | ||
interactions accordingly. | ||
Usually, you'll want to combine the handling of these external registries with | ||
crates managed purely by the external build system, meaning that there are | ||
effectively *two* modes of building crates at play overall. All that's needed to | ||
do this is a distinction within the external build system between these two | ||
kinds of dependencies, which then drives the plugin interactions accordingly. | ||
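
To make the two modes concrete, the following is a minimal sketch of how a plugin might route each dependency to the component that plans its build. All names here (`DepKind`, `Planner`, `planner_for`, `label`) are hypothetical and chosen only for illustration; the RFC does not specify a plugin interface.

```rust
/// Where a dependency's source of truth lives. Hypothetical names.
#[derive(Debug, Clone, PartialEq)]
enum DepKind {
    /// Comes from crates.io or a vendored mirror.
    Registry { name: String, version: String },
    /// Declared in the external build system's own rule format.
    BuildSystemManaged { target: String },
}

/// Which component is asked to plan the build of a dependency.
#[derive(Debug, PartialEq)]
enum Planner {
    Cargo,
    ExternalBuildSystem,
}

/// Registry crates are planned by Cargo; everything else is planned by the
/// external build system.
fn planner_for(dep: &DepKind) -> Planner {
    match dep {
        DepKind::Registry { .. } => Planner::Cargo,
        DepKind::BuildSystemManaged { .. } => Planner::ExternalBuildSystem,
    }
}

/// A human-readable identifier for a dependency, used here only for printing.
fn label(dep: &DepKind) -> String {
    match dep {
        DepKind::Registry { name, version } => format!("{} {}", name, version),
        DepKind::BuildSystemManaged { target } => target.clone(),
    }
}

fn main() {
    let deps = vec![
        DepKind::Registry { name: "serde".into(), version: "1.0.0".into() },
        DepKind::BuildSystemManaged { target: "//lib/auth:auth".into() },
    ];
    for dep in &deps {
        println!("{} -> {:?}", label(dep), planner_for(dep));
    }
}
```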
|
||
## Cross-cutting concern: native dependencies | ||
|
||
|
@@ -285,13 +285,13 @@ in the first place. | |
|
||
Reliably building native dependencies in a cross-platform way | ||
is... challenging. Today, Rust offers some help with this through crates like | ||
[`gcc`] and `[pkgconfig]`, which provide building blocks for writing build | ||
[`gcc`] and [`pkgconfig`], which provide building blocks for writing build | ||
scripts that discover or build native dependencies. But still, today, each build | ||
script is a bespoke affair, customizing the use of these crates in arbitrary | ||
ways. It's difficult, error-prone work. | ||
|
||
[`gcc`]: https://docs.rs/gcc | ||
`[pkgconfig`]: https://docs.rs/pkg-config | ||
[`pkgconfig`]: https://docs.rs/pkg-config | ||
|
||
This RFC proposes to start a *long term* effort to provide a more first-class | ||
way of specifying native dependencies. The hope is that we can get coverage of, | ||
|
@@ -311,6 +311,15 @@ Needless to say, this approach will need significant experimentation. But if | |
successful, it would have benefits not just for build system integration, but | ||
for using external dependencies *anywhere*. | ||
|
||
### The story for externally-managed native dependencies | ||
|
||
Finally, in the case where the external build system is the one specifying and | ||
providing a native dependency, all we need is for that to result in the | ||
appropriate flags to the lowered `rustc` invocations. If the external build | ||
system is producing those lowered calls itself, it can completely manage this | ||
concern. Otherwise, we will need for the plugin interface to provide a way to | ||
plumb this information through to Cargo. | ||
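
As a sketch of what "the appropriate flags" could look like, the example below turns a description of a native library into `rustc` search-path and link flags. The `-L native=PATH` and `-l KIND=NAME` flags are real `rustc` options; the `NativeDep` type and `rustc_link_flags` function are hypothetical, and the library name and path are made up.

```rust
/// How a native library should be linked. These map onto kinds accepted by
/// rustc's `-l` flag.
#[derive(Debug, Clone, Copy)]
enum LinkKind {
    Static,
    Dylib,
}

/// A native dependency as an external build system might describe it.
/// Hypothetical shape, for illustration only.
struct NativeDep {
    name: String,
    search_dir: String,
    kind: LinkKind,
}

/// Produce the rustc arguments that make the library visible to the linker.
fn rustc_link_flags(dep: &NativeDep) -> Vec<String> {
    let kind = match dep.kind {
        LinkKind::Static => "static",
        LinkKind::Dylib => "dylib",
    };
    vec![
        "-L".to_string(),
        format!("native={}", dep.search_dir),
        "-l".to_string(),
        format!("{}={}", kind, dep.name),
    ]
}

fn main() {
    let zlib = NativeDep {
        name: "z".into(),
        search_dir: "/opt/zlib/lib".into(),
        kind: LinkKind::Static,
    };
    let shared = NativeDep {
        name: "ssl".into(),
        search_dir: "/opt/openssl/lib".into(),
        kind: LinkKind::Dylib,
    };
    println!("{}", rustc_link_flags(&zlib).join(" "));
    println!("{}", rustc_link_flags(&shared).join(" "));
}
```

If the external build system emits the lowered `rustc` calls itself, it can append flags like these directly; otherwise the same information would have to travel through whatever plugin interface is eventually designed.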
|
||
## Specifics for the mixed build system case | ||
|
||
Switching gears, let's look at mixed build systems. Here, we generally don't | ||
|
@@ -353,7 +362,7 @@ work. For example: | |
- **Profiles**. Putting the idea of the "build configuration" step on firmer | ||
footing will require clarifying the precise role of profiles, which today blur | ||
the line somewhat between *workflows* (e.g. `test` vs `bench`) and flags | ||
(e.g. `--release`). Moreover, integration with a homogenous build system | ||
(e.g. `--release`). Moreover, integration with a homogeneous build system | ||
effectively requires that we can translate profiles on the Cargo side back and | ||
forth to *something* meaningful to the external build system, so that for | ||
example we can make `cargo test` invoke the external build system in a | ||
|
@@ -362,7 +371,7 @@ work. For example: | |
possible to control enough about the `rustc` invocation for at least some | ||
integration cases, and the answer may in part lie in improvements to profiles. | ||
|
||
- **Build scripts**. Especially for homogenous build systems, build scripts can | ||
- **Build scripts**. Especially for homogeneous build systems, build scripts can | ||
pose some serious pain, because in general they may depend on numerous | ||
environmental factors invisibly. It may be useful to grow some ways of telling | ||
Cargo the precise inputs and outputs of the build script, declaratively. | ||
|
@@ -392,7 +401,7 @@ follow-up RFCs after experimentation has concluded. | |
It's somewhat difficult to state drawbacks for such a high-level plan; they're | ||
more likely to arise through the particulars. | ||
|
||
That said, it's unquestionable that following the plan in this RFC will result | ||
That said, it's plausible that following the plan in this RFC will result | ||
in greater overall complexity for Cargo. The key to managing this complexity | ||
will be ensuring that it's surfaced only on an as-needed basis. That is, uses of | ||
Cargo in the pure crates.io ecosystem should not become more complex -- if | ||
|
Review comment: nit: "employ a their own strategy"