RFC: Add workspaces to Cargo #1525
alexcrichton added the T-dev-tools label Mar 3, 2016
alexcrichton self-assigned this Mar 3, 2016
alexcrichton referenced this pull request Mar 3, 2016
Closed: Improve multi-crate project management #2122
If I have a manually configured workspace, how will cargo figure out that a crate that I'm compiling belongs to a workspace? Will I have had to have built the workspace root first? |
When Cargo constructs a workspace it's built upon the condition that all crates in a workspace can transitively reach all others. This means that for any workspace (manually configured or not) any crate can find its way to the root quickly. In that sense it won't matter if you've built the workspace root first or not, Cargo will be able to build any crate in the workspace at any point in time into the workspace root's output folder. Does that make sense? Not sure if that answers your question... |
gkoz
commented
Mar 3, 2016
Are multi-repo workspaces too niche to be considered? I've been abusing local path overrides for a while now. It would be nice if "local" workspaces could incorporate that. Imagine two repos
I'd be happy if the "local" workspace could transparently change that to |
Let's say that I have a project structure like this:
That's an interesting idea! This doesn't currently consider multiple disjoint repositories beyond submodules. Using Ah yes for the workspace to be constructed there you'd need two bits of configuration, one as |
Ok cool, that makes sense. |
What happens if people don't set this up correctly? It seems a bit disappointing for such a common setup to require explicit configuration; I'm not sure what the alternative would be, though, except for having cargo trawl the git repo to look for things that depend on the current directory. (which...might be ok) Still, I think I might prefer to have a
I asked:
But I think I get it now. If you don't include all the links, things just aren't considered to be in 1 workspace. |
tomaka
commented
Mar 4, 2016
I don't like the convention that this RFC suggests:
netvl
commented
Mar 4, 2016
How does the proposed design work in a scenario when there are multiple equally important binaries and some common library? For example, client-server programs like my wcd may be structured like this. In wcd I have the following structure:
I think I would prefer both binaries to be built together (now it is not very convenient to test how client and server interact when something inside |
Man, I must have been tired when reading the RFC last night. I didn't really grok that it was suggesting that we should structure our repos in a nested fashion. So, e.g., for LALRPOP, where the main lalrpop crate (
and in this way, lalrpop would implicitly be the "workspace root"? This makes the name 'root' make more sense to me. :) I also never thought of structuring things in this way, but I can see that if I did so, then this RFC would allow for a lightweight (no annotation) setup. |
Ah yes, as you surmised the end result is that workspaces just continue to be disjoint (as they are today). If manual configuration is incorrect (e.g. a
One of the major motivational factors for this RFC is to have zero configuration in the conventional use case. A scheme such as you and @sfackler are proposing is certainly covered by this RFC (with some configuration), but it's unclear to me how it could be the default. The requirement is to be able to go from any crate to all of its workspace crates quickly (e.g. can't walk the whole filesystem). Do you have something in mind, however, to alleviate this? To reiterate, this RFC doesn't preclude any source layout, it's just blessing one as the conventional way to organize crates where features work without extra configuration.
That's correct, but it's also why it's just a heuristic as part of this RFC. Any and all project layouts are supported, it's just a question of which require less configuration. Most projects usually have some sort of source repo at the top which is a good heuristic for probing (but that's it).
You'd basically choose one of your crates to be the root, and then you'd have to configure a few Do you think the configuration necessary to select one of the binaries as the root is too much, though? Is that a case in favor of some form of "virtual package"?
Yeah that sort of layout would require no extra configuration to benefit, but as @tomaka mentions it may not always make sense for projects. There may be another heuristic we could use to find a "root crate" without any extra configuration, but I'm drawing blanks... |
tomaka
commented
Mar 4, 2016
An alternative could be to separate workspaces from packages, and create another configuration file named for example If you want to have one crate per directory, you can just put a EDIT: Oops, that was already suggested :-/ |
netvl
commented
Mar 4, 2016
Well, I'm not sure whether it is too much or not, but the whole concept, when applied to "equally important" binaries, seems wrong to me. I just dislike the need to choose one of the crates as a root arbitrarily. Maybe if a client-server application is not a convincing example, then something like coreutils is? It is a collection of lots of different binaries using a single common library, and I doubt that it makes any sense to select one of them as a workspace root. Something like cargo-edit also qualifies, I believe. So yes, I certainly think that virtual packages or some equivalent solution is needed here. |
I'm a little unclear on the implications of the 'workspace inference' rules, which are described abstractly but don't have any examples of what organizations they can infer. I think the standard way to organize a multi-crate project is this:
My understanding is that this structure will be inferred under the proposal and require no annotation. That sounds great. But will other structures be inferred? Like @tomaka mentioned, I think the practice of putting multiple crates inside of a |
Yes! Has anyone looked at Haskell's Stack http://docs.haskellstack.org? IMO whenever I develop multiple crates at once, the idea is almost always that while the libraries are easiest to develop together, they are still full fledged independent libraries:
As such a view If the crates are released/uploaded to crates.io separately, they should NOT be nested in the filesystem, otherwise we need to prune sub trees when uploading so one library doesn't contain others' source. |
The main problem I have stack's |
I think @netvl's concerns basically boil down to "dependencies form a DAG, but nested crates form a tree". IMO that right there is the simplest reason that nesting crates is going to suck---simpler and better than the problem I described 2 comments previous :).
Are you saying that the directory structure itself is confusing, or that the annotations one would need (under this RFC) to support it are confusing? I definitely think the RFC would be improved by specifying the expected directory layout. It makes the rules make much more sense. Do you know if this "nested" structure is in common use today? |
tomaka
commented
Mar 7, 2016
I don't know for the nested structure, but the non-nested structure is not uncommon:
The former. I expect everything inside |
Yeah this is certainly always possible (and is one of the alternatives in the RFC), but one of the key points here is that an idiomatic project structure should require zero extra configuration (including this extra manifest).
Thanks for the examples! I can certainly see the desire to have a manually selected root location for the workspace, but I'm pretty hesitant to so easily add the concept of virtual crates or packages. It seems like a pretty significant feature that may not quite pull its weight if it's just used for workspaces. For now this could always be worked around with a "dummy package" that ends up being the root, which although not quite as elegant should at least serve as a vector in the interim. @wycats may have more thoughts on the "virtual package" idea, though. I believe he's thought about it more than I have.
Could you clarify what you're finding hard to understand about how workspaces are constructed? It's described concretely because that's what'll end up being implemented (I tend to prefer that over a somewhat hand-wavy description). I can certainly add some examples though!
I've seen lots of examples with a nested layout and also lots with a nothing-at-the-root layout. I wouldn't necessarily say one is more common than the other. As I mentioned in my reply to @tomaka, however, there's a fundamental problem that child crates still somehow need to quickly discover their parent. That tips the scales (to me) much more in favor of a something-should-be-at-the-root structure. I've tried to add some examples though which may help.
gkoz
commented
Mar 7, 2016
Perhaps alternatives should include using |
An empty |
It's necessary that a concrete implementation be provided in the RFC for implementation, but some examples of the kinds of directory trees these rules infer and the kinds these rules cannot infer would help to understand how it impacts me as a user. |
Ok, the tools team discussed this RFC during its triage meeting the other day, and the conclusion was somewhat mixed. We did not decide to merge just yet, but were somewhat hesitant to do so. There was some pushback about this being an "instantly stable" feature in Cargo because Cargo does not have unstable features like the compiler, and there was also discussion about perhaps blocking this until that existed or somehow ensuring it doesn't reach the stable channel just yet. The question of unstable features in Cargo, however, is pretty weighty, so I'd prefer to not tackle it at this time if possible. I fear that this RFC was perhaps not well written which led to confusion in how it was interpreted. I've rewritten and reorganized the RFC in a format which I hope is much more clear and much simpler than before. Note, however, that there is no semantic change from the previous draft, just hopefully more understandable! cc @rust-lang/tools
We discussed this in the tools meeting, but I think that the current draft is more clear on this as well. There's only one way to put a package in a workspace,
To clarify, no workspace will exist unless
As pointed out by @Ericson2314, it's not possible for a crate to be a member of more than one workspace (it can only point to one root).
As discussed in the tools triage meeting, the "we don't walk past VCS roots" part has been removed. There is no other VCS related portion of this RFC (to clarify), which I believe addresses your concerns!
The RFC has been changed slightly to account for this (which I think is simpler as well) to say that
Hm I think that the current draft may be a bit more understandable, does it help clear things up? I think it covers many of your points, but I'm not 100% certain. |
jnicklas
commented
Apr 23, 2016
@alexcrichton the new revision of this RFC is awesome! Really clarifies things and the added explicitness will make things much easier to understand, imo. Definitely addresses my concerns at least! |
matthiasbeyer
reviewed
Apr 23, 2016
| of crates pointing back to the root may not. If, however, this restriction were | ||
| not in place then the set of crates in a workspace may differ depending on | ||
| which crate it was viewed from. For example if workspace root A includes B then | ||
| it will think B is in A's workspace. If, however, B does ont point back to A, |
matthiasbeyer
Apr 23, 2016
Typo: ont should be not.
(Please tell me if I'm not allowed to point out typos here)
dcuddeback
reviewed
Apr 23, 2016
|
|
||
| The root of a workspace, indicated by the presence of `[workspace]`, is | ||
| responsible for defining the entire workspace (listing all members). | ||
| This example here means that two extra crates will members of the workspace |
dcuddeback
reviewed
Apr 23, 2016
| These keys are mutually exclusive when applied in `Cargo.toml`. A crate may | ||
| *either* specify `package.workspace` or specify `[workspace]`. That is, a | ||
| crate cannot both be a root in a workspace (contain `[workspace]`) and also be | ||
| member of another workspace (contain `package.workspace`). |
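For illustration, the two keys in the excerpt above might be used as follows. This is only a sketch: the directory names `root` and `member` and the package names are hypothetical, and it assumes `package.workspace` holds a relative path to the root's directory, as the later excerpts about `../` suggest.

```toml
# root/Cargo.toml (hypothetical workspace root)
[package]
name = "root"
version = "0.1.0"

# The presence of this table marks the crate as a workspace root.
[workspace]
members = ["../member"]
```

```toml
# member/Cargo.toml (hypothetical member outside the root's directory)
[package]
name = "member"
version = "0.1.0"
# Points back at the root; this manifest must not also contain [workspace].
workspace = "../root"
```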
dcuddeback
reviewed
Apr 23, 2016
| which crate it was viewed from. For example if workspace root A includes B then | ||
| it will think B is in A's workspace. If, however, B does ont point back to A, | ||
| then B would not think that A was in its workspace. This would in turn cause the | ||
| set of crates in each workspace to be different, futher causing `Cargo.lock` to |
dcuddeback
reviewed
Apr 23, 2016
| builds no matter where they're executed in the workspace. | ||
|
|
||
| To alleviate misconfiguration Cargo will emit an error if the two properties | ||
| above hold for any crate attempting to be part of a workspace. For example, if |
dcuddeback
Apr 23, 2016
Is this supposed to say that Cargo will emit an error if the two properties do not hold?
nodakai
Apr 24, 2016
I think it says no crates other than "the" root should satisfy the conditions in a workspace.
dcuddeback
reviewed
Apr 23, 2016
|
|
||
| * Test all crates within a workspace (run all unit tests, doc tests, etc) | ||
| * Build all binaries for a set of crates within a workspace | ||
| * Publish all crates in a workspace if necessary to crates.io |
dcuddeback
Apr 23, 2016
Related to this use case, another possible extension is to share a version across all the crates in a workspace. Some projects act as a collection of crates that are published at the same time (e.g., foo and foo-sys) with the same version. It'd be a nice convenience to not have to update the version property in every Cargo.toml.
dcuddeback
Apr 23, 2016
Actually, this could apply to authors, homepage, repository, license, and keywords properties, too, as those are likely to be the same for crates in a single-repo workspace. version just changes more often than the other properties. It would be nice if member crates could inherit these properties from the workspace root.
nodakai
reviewed
Apr 24, 2016
| the output of a workspace to be configured regardless of where crates are | ||
| located, Cargo will now allow for "virtual manifest" files. These manifests will | ||
| currently **only** contains the `[workspace]` table and will notably be lacking | ||
| a `[project]` or `[package]` top level key. |
alexcrichton
Apr 24, 2016
Author
Member
An alias for [package] with planned different semantics that never got off the ground.
nodakai
reviewed
Apr 24, 2016
| Cargo will for the time being disallow many commands against a virtual manifest, | ||
| for example `cargo build` will be rejected. Arguments that take a package, | ||
| however, such as `cargo test -p foo` will be allowed. Workspaces can eventually | ||
| get extended with `--all` flags so in a workspace root you could execute |
nodakai
Apr 24, 2016
What are we achieving by requiring --all for "virtual" projects? A user will have to first look inside the top Cargo.toml and see if it's "virtual" or not for a successful build of a given Cargo project.
alexcrichton
Apr 24, 2016
Author
Member
I don't think I understand the question here. The point of this is to say that cargo build doesn't work, not that --all is being added.
nodakai
Apr 24, 2016
In other words, why should cargo build not work for a "virtual workspace" ? My point is, you are suggesting to have two types of Cargo.toml; one accepts cargo build and one rejects cargo build. This can be a source of confusion.
alexcrichton
Apr 25, 2016
Author
Member
Mostly erring on the side of being conservative, we can always alter the meaning of cargo build on a virtual crate to be an alias for cargo build --all later.
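As a rough illustration of what is being discussed in this sub-thread, a virtual manifest as described in the quoted RFC text might look like the sketch below. The member names are hypothetical (loosely following the client/server example raised earlier in the conversation).

```toml
# Cargo.toml at the top of the tree (hypothetical layout)
# A virtual manifest: it contains only [workspace] and no [package] or [project].
[workspace]
members = ["client", "server", "common"]
```

Under the draft as quoted, a plain `cargo build` against this manifest would be rejected for the time being, while a command that names a package, such as `cargo test -p client`, would be allowed.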
nodakai
reviewed
Apr 24, 2016
| ### Workspaces in practice | ||
|
|
||
| Many Rust projects today already have `Cargo.toml` at the root of a repository, | ||
| and with the small addition of `[workspace]` in the root a workspace will be |
nodakai
Apr 24, 2016
If I didn't misread the section "Validating a workspace," the workspace is not valid unless we add [package] workspace = ... to children crates. Update: This comment is invalid. I misread the "Implicit relations" section.
nodakai
reviewed
Apr 24, 2016
| ``` | ||
|
|
||
| The two workspaces here can be configured by placing the following in the | ||
| manifests: |
nodakai
Apr 24, 2016
Is it correct to understand that the two workspaces just happen to be in the same "tree" (undefined word) and shouldn't share the common .lock file? If that is the case, this example seems to me to serve more for confusion rather than explanation and I'd rather not include it in the RFC.
alexcrichton
Apr 24, 2016
Author
Member
Yes this is intended to showcase two workspaces as part of the same development tree, which is what the compiler will have, for example.
nodakai
Apr 24, 2016
My point is such a situation can arise too often, for example, when I check out two unrelated (but workspace-aware) Cargo packages from GitHub under the common directory, say, ~/dev/. I'd naturally take their independence for granted.
alexcrichton
Apr 25, 2016
Author
Member
I'm not sure I understand what change you'd like as a result of these comments then? The intention is that of course two independent trees can have workspaces that don't mess with one another, and the point of this example is simply to show that it can happen in one repo.
nodakai
reviewed
Apr 24, 2016
|
|
||
| ```toml | ||
| # ws2/Cargo.toml | ||
| [workspace] |
nodakai
Apr 24, 2016
Shouldn't members be here? Update: again, the "Implicit relations" obviates them.
nodakai
reviewed
Apr 24, 2016
| * A workspace can contain multiple local crates. | ||
| * Each workspace will have a root. | ||
| * Whenever any crate in the workspace is compiled, output will be placed in the | ||
| `target` directory next to the root. |
nodakai
reviewed
Apr 24, 2016
| ``` | ||
|
|
||
| Projects like the compiler will likely need exhaustively explicit configuration. | ||
| The `rust` repo conceptually has two workspaces, the standard library and the |
nodakai
Apr 24, 2016
I understood a "workspace" as a unit for sharing the common lock file. Could you elaborate on why the compiler and the stdlib should not share the common lock file?
alexcrichton
Apr 24, 2016
Author
Member
It's somewhat unrelated to this RFC, unfortunately, but the gist of it is that we want crates.io deps to be part of the compiler but they do not explicitly depend on the standard library, so they need to be built in two phases.
nodakai
Apr 24, 2016
OK, then I'd suggest you omit the sentence. You are simply saying "a complex project is likely to require a complex hand-written configuration".
nodakai
reviewed
Apr 24, 2016
| Cargo will grow the concept of a **workspace** for managing repositories of | ||
| multiple crates. Workspaces will then have the properties: | ||
|
|
||
| * A workspace can contain multiple local crates. |
nodakai
Apr 24, 2016
Could "multiple" perhaps better be replaced with "zero or more" (if a workspace with zero local crates is not forbidden)?
nodakai
reviewed
Apr 24, 2016
| ```toml | ||
| # ws1/Cargo.toml | ||
| [workspace] | ||
| members = ["crate1", "crate2"] |
alexcrichton
Apr 25, 2016
Author
Member
It is required here as the newly created manifest does not otherwise depend on the crates.
nodakai
reviewed
Apr 24, 2016
|
|
||
| 1. A workspace has only one root crate (that with `[workspace]` in | ||
| `Cargo.toml`). | ||
| 2. All workspace crates defined in `workspace.members` point back to the |
nodakai
Apr 24, 2016
I'd prefer to have "explicitly or implicitly" for workspace.members and package.workspace for the sake of clarification.
nodakai
reviewed
Apr 24, 2016
| ``` | ||
|
|
||
| The root of a workspace, indicated by the presence of `[workspace]`, is | ||
| responsible for defining the entire workspace (listing all members). |
nodakai
reviewed
Apr 24, 2016
| be implicitly defined in some situations. | ||
|
|
||
| The `package.workspace` can be omitted if it would only contain `../` (or some | ||
| repetition of it). That is, if the root of a workspace is hierarchically the |
nodakai
Apr 24, 2016
Consider this case:
ws0
+- Cargo.toml // [workspace] members = ["ws1/util"]
+- src
+- ws1
+- Cargo.toml // [workspace] members = []
+- src
+- util
+- Cargo.toml // package.workspace is omitted
+- src
So, although util/../.. does point back to ws0 and the workspace ws0 contains a single root, it is invalid due to the "that is" sentence; is my understanding correct? I mean, the "that is" sentence is actually a stronger restriction than what it tries to rephrase.
alexcrichton
Apr 25, 2016
Author
Member
If ws0/ws1/Cargo.toml depends on ws0/ws1/util/Cargo.toml, then yes this is an invalid workspace.
nodakai
Apr 27, 2016
Actually I found "hierarchically the first" confusing; did it mean the closest to the root directory of the system (/) or the furthest from it?
I interpreted it as "the furthest" (because it would be the "first" to be found by a repeated application of ../); then the root of the workspace ws0 is not "the first" Cargo.toml with [workspace] regardless of ws0/ws1's (lack of) dependency on ws0/ws1/util.
My question could be rephrased as: can an unrelated Cargo.toml (ws0/ws1/Cargo.toml in this case) break the validity of a workspace (ws0) that depends on implicit package.workspace ?
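One way the ambiguity in this example could be avoided, sketched here as an assumption rather than something the thread settles, is to stop relying on the implicit lookup and state the root explicitly in `util`'s manifest. The package name and version are hypothetical; the paths follow the tree above.

```toml
# ws0/ws1/util/Cargo.toml
[package]
name = "util"
version = "0.1.0"
# Explicitly names ws0 as the workspace root instead of relying on the
# implicit search, which would reach ws0/ws1/Cargo.toml first.
workspace = "../.."
```

Whether the intervening `ws0/ws1/Cargo.toml` would still invalidate the workspace under the draft's validation rules is exactly the open question raised in the comment above.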
Ericson2314
referenced this pull request
Apr 25, 2016
Closed
Make Cargo aware of standard library dependencies #1133
This was referenced Apr 25, 2016
The tools team discussed this RFC during triage yesterday and the decision was to merge. Once this is implemented we're likely to reevaluate right before the next release to see if we want to let it ride the trains to stable, and we may, for one release, temporarily prevent that particular released version of Cargo from having workspaces. This'll all be based on our experience with workspaces up to that point, however! Thanks again for the discussion everyone! |
alexcrichton commented Mar 3, 2016
edited by mbrubeck
Improve Cargo's story around multi-crate single-repo project management by
introducing the concept of workspaces. All packages in a workspace will share
Cargo.lock and an output directory for artifacts.

Cargo will infer workspaces where possible, but it will also have knobs for
explicitly controlling what crates belong to which workspace.
Rendered