Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow path = "${FOO}/bar" dependencies #9855

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

sivadeilra
Copy link

This allows dependencies that uses path = ... to contain references to environment variables. Cargo will expand these references. For example:

[dependencies]
foo = { path = "${MY_PROJECT}/utils/foo" }

The syntax used for variable references is ${name} or ${name?default} if
a default value is provided. The curly braces are always required;
$FOO will not be interpreted as a variable reference (and will be copied
to the output).

If a variable is referenced, then it must have a value or the variable reference
must provide a default value (using the ...?default syntax). If a variable
does not have a value and the variable reference does not provide a default
value, then the expansion of the entire string will fail.

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 30, 2021
@Eh2406
Copy link
Contributor

Eh2406 commented Aug 30, 2021

Is there an RFC for this feature?

@sivadeilra
Copy link
Author

No, there isn't. I created this as a draft PR only, to serve for discussion. I didn't know that draft PRs would pull in official reviewers.

@alexcrichton
Copy link
Member

Thanks for the PR, but Cargo generally shies away from env-var interpolation historically because we don't really handle it anywhere. I would personally expect a feature like this to go through an RFC since it's not necessarily a minor extension to Cargo.

@sivadeilra
Copy link
Author

Thanks for the PR, but Cargo generally shies away from env-var interpolation historically because we don't really handle it anywhere. I would personally expect a feature like this to go through an RFC since it's not necessarily a minor extension to Cargo.

My goal is to enable Cargo to be used in the Windows code base. (As in, Windows itself, not just the Windows ecosystem.) As you might imagine, we have a large code base, with a large directory structure. Policy forbids us from using ../../../foo walk-up paths, because our experience has shown that they are very difficult to maintain. All of our paths to projects are required to have the form $(SOME_WELL_KNOWN_ROOT)/path/to/a/component.

This seems like a reasonable thing to do in Cargo, right? The argument that we don't do it because we historically haven't done it seems a bit circular.

I had previously discussed this with Nick Cameron, who suggested that this was a small enough change that it didn't require an RFC, but I'm fine with going through the RFC process, too.

@alexcrichton
Copy link
Member

Ah ok, interesting!

On the RFC bit other team members may have other thoughts, but I personally feel that env-vars-in-strings, if added, would want to get added in a relatively general way since if users see it in one place they might expect that it works in lots of other places too.

As an alternative, though, are you using workspaces? If a workspace can be a the root of all your Rust code then #9684 would allow you to define all dependencies in your [workspace] section without ../.. directives in them, and then all workspace members could reference those dependencies directly.

@sivadeilra
Copy link
Author

On the RFC bit other team members may have other thoughts, but I personally feel that env-vars-in-strings, if added, would want to get added in a relatively general way since if users see it in one place they might expect that it works in lots of other places too.

Agreed. My thinking here was that an unstable feature provides just the right way to experiment with that and learn from the experience. The code as written is quite general, and it would be easy to extend this to any other fields, or to apply it to all fields. I wanted to start with only a small, focused scenario (one that we really need), rather than risking an adverse reaction from a change that had much broader impact.

As an alternative, though, are you using workspaces?

Yes, many. Workspaces are great, but they're not going to scale to the size of the systems that we want to build. We have pretty solid evidence that build systems where any single process has to reason about the structure of the dependency graph of all of Windows are not scalable.

Workspaces also don't solve the problem, from what I understand. Even within a workspace, we would still need to use path-based dependencies, right? My understanding is that workspaces don't create an implicit namespace. (I could have missed something, of course.)

#9684 looks interesting, but I don't think it would meet our needs. First, we have many separate workspaces, some of which have mutually-incompatible settings in their [profile] sections. Second, our experience has shown that every time we create a namespace for components that is not simply based on the repo-relative path of the component, it creates serious maintenance problems. First, because namespace collisions become inevitable, since the path is no longer part of how you identify components, so you end up re-encoding some part of the path in the component name. Second, because our developers already have to deal with the directory paths, they don't want to deal with another, separate namespace. (Google has reached a similar conclusion, and uses purely path-based dependencies in their mono-repo.)

It's not practical for a single workspace to be the root of all of our Rust code. Our "success" scenario is that Rust code is eventually as wide-spread as C/C++ code. Even just loading a single dependency graph that covers the entire repo requires a significant amount of time and resources; it's not an option for us. We also want our Rust code to look reasonably similar to our C/C++ code, where it makes sense, and using ${FOO}/bar/baz for dependencies is one place where I think the consistency would be beneficial.

@alexcrichton
Copy link
Member

Sorry I don't mean to presume what's best for your codebase. I wanted to point out that if your only restriction is you can't have ../ paths then #9684, an implementation of #9684, may be able to resolve that. If all members of a workspace are beneath the workspace's root manifest then the workspace can list all its members as foo = { path = 'path/to/foo' } and everything else in the workspace simply does foo = { workspace = true }. If you want to depend on members outside of your workspace then no, this won't work, because then you'd still need paths with ../.

It's worth pointing out that by adding something like this it's the third string format syntax that Cargo would support, the others being:

Ideally we'd avoid a third format, but if it's necessary it's necessary.

I know some folks at Amazon though recently have done some integration work to get Cargo working with their build system and #9269 ended up being the feature for them. That allows the build system to write configuration files which work for [patch] to globally inform where packages are that the build system knows about that Cargo doesn't necessarily.

Would you be up for experimenting a bit with that feature or describing more the requirements of your use case? I do think it's useful to get Cargo working for your use case but if you're ok with it I wouldn't mind trying to poke around and see if there are other possible solutions or if this is the only thing that works for your case.

@sivadeilra
Copy link
Author

sivadeilra commented Sep 1, 2021

I honestly appreciate the willingness to work with me, on figuring out what the best thing is. I'm willing to experiment with some of these approaches, and to describe in more detail what our needs are. I'll need to omit some details, but that shouldn't be too much of an impediment.

Part of my motivation here is to avoid locking in patterns that work well for small codebases, but don't scale to large codebases. It's my responsibility either to find ways to meet our needs with existing Cargo features, or to articulate what those patterns are and why Cargo's existing features don't work (for us). It's a little difficult to make those arguments now, when we don't yet have a huge Rust codebase, but I'm trying to extrapolate from what we've learned from managing an extremely large C/C++ code base.

One lesson: Any time we have unwanted or accidental coupling of state, settings, etc. across different projects, we suffer. That's part of my reluctance to embrace root-level workspaces. If there's any state that's shared at all (dependency versions, profile settings, etc.), by using a root workspace, then it's going to cause serious problems for us. So that just leaves the mapping from crate name to crate/component path, which I view as an anti-pattern (component paths are sufficient for us, and are what we currently use). It's really common to have components that have the same name, but different paths. I don't think I can "sell" a system where all components have to have a unique crate name, i.e. a flat component namespace.

Another anti-pattern is any file that needs to be frequently edited by a large group of people. That causes us endless merge conflicts, so we work really hard to avoid that. If crate A depends on crate B, then I shouldn't need to edit a third file just to manage that dependency. Having a root workspace seems to invite merge conflicts, when you consider that our ecosystem is on the order of many 10,000s of libraries.

Would you be up for chatting about this in Zulip, or some other synchronous medium?

@jonhoo
Copy link
Contributor

jonhoo commented Sep 1, 2021

I can definitely see the appeal of something like this, and even though I don't personally have a use-case for this, it would also have worked as a solution for me instead of patch-in-config (even though patch-in-config is much nicer for my use-case). That said, I also agree that string interpolation is... a recipe for unexpected complexity. That's not to say it may not be worthwhile, but I'd like to propose another alternative that came up for me a while back — config-based base paths: rust-lang/rfcs#3074. The basic idea is that you define named paths in any Cargo configuration file that affects the current build (could be ~/.cargo/config.toml, but could also be /deep/in/source/tree/a/b/c/foo/.cargo/config.toml), and then you can specify that any path declarations should be relative to such a named path. In your case from the first post, this would look like:

# in crate-being-built/.cargo/config.toml
[base]
my_project = "/path/to/my/project/root/"
# in crate-being-built/Cargo.toml
[dependencies]
foo = { path = "utils/foo", base = "my_project" }

You would then have your build system generate .cargo/config.toml using values from environment variables, and then you're off to the races. It does mean that you can't directly use your environment variables, but the upshot is that Cargo can give better error messages since it knows more about the configuration structure. With string interpolation, Cargo can't really give any warnings — foo = { path = "${MY_PROJECT}/utils/foo" } when MY_PROJECT is empty, for example, shouldn't generate a warning from Cargo (maybe it's okay for it to be empty sometime), so users will just get the generic error "/utils/foo does not exist". Cargo can't even assume that env vars used this way should be valid paths, and warn based on that! With path bases as described above, however, Cargo can give very detailed errors. Something like:

crate-being-built requires that the named path my_project be defined in you Cargo configuration. my_project is defined in ~/.cargo/config.toml, but points to a path that does not exist. Please double-check the path in your configuration.

@alexcrichton
Copy link
Member

@sivadeilra yeah I'd be happy to chat on Zulip! Feel free to ping me

@sivadeilra
Copy link
Author

Ooooh, [base] paths are pretty much the right concept for, like, 98% of what I need. I like it that it's more explicit, too; I can imagine writing refactoring tools that can work with these dependencies much more easily than dealing with string interpolation. I'm going to go read that RFC now.

@bors
Copy link
Collaborator

bors commented Dec 6, 2021

☔ The latest upstream changes (presumably #10165) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton
Copy link
Member

@sivadeilra wanted to ping on this, I don't think I heard from you on Zulip? Are you still interested in pursuing this?

@sivadeilra
Copy link
Author

Actually, yes. Other priorities came up, but this has bubbled near the top of my priority queue.

@alexcrichton
Copy link
Member

Thanks for the PR, but I'm going to be stepping down from the Cargo team so I'm going to un-assign myself from this. The Cargo team will help review this when they get a chance.

@alexcrichton alexcrichton removed their assignment Mar 8, 2022
@weihanglo weihanglo added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants