Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow [patch] (and [replace]) sections in Cargo config. #5539

Closed
vext01 opened this issue May 15, 2018 · 12 comments · Fixed by #9204
Closed

Allow [patch] (and [replace]) sections in Cargo config. #5539

vext01 opened this issue May 15, 2018 · 12 comments · Fixed by #9204
Labels
A-patch Area: [patch] table override C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted`

Comments

@vext01
Copy link
Contributor

vext01 commented May 15, 2018

Scenario:

  • You have several smaller libraries you are developing together.
  • You don't want to, or can't, use a workspace.
  • You don't want to have to keep changing and git stashing [patch] sections in the scattered Cargo.toml files (you are bound to commit them accidentally at some point).

Wouldn't it be super useful if we could put [patch] (and maybe [replace] too) sections into our ~/.cargo/config? I'd certainly use that.

Thanks

@vext01
Copy link
Contributor Author

vext01 commented May 15, 2018

Just to add, I'm aware of paths = in the Cargo config. This would be perfect, but Cargo warns me that it's a buggy feature that I shouldn't be using (why don't you delete it or fix it?):

warning: path override for crate `blah` has altered the original list of
dependencies; the dependency on `mydep` was either added or
modified to not match the previously resolved version

This is currently allowed but is known to produce buggy behavior with spurious
recompiles and changes to the crate graph. Path overrides unfortunately were
never intended to support this feature, so for now this message is just a
warning. In the future, however, this message will become a hard error.

To change the dependency graph via an override it's recommended to use the
`[replace]` feature of Cargo instead of the path override feature. This is
documented online at the url below for more information.

http://doc.crates.io/specifying-dependencies.html#overriding-dependencies

@alexcrichton alexcrichton added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label May 15, 2018
@ehuss ehuss added the A-patch Area: [patch] table override label Dec 18, 2018
@sehz
Copy link

sehz commented Mar 11, 2020

Any progress in this issue? config path documentation are not clear. it would be great to support patch in the config

@jonhoo
Copy link
Contributor

jonhoo commented Feb 5, 2021

There's a decent more discussion over in #7199. It sounds like Alex is primarily worried about including this feature since it would mean that Cargo.lock may reflect decisions that aren't check into version control (.config/cargo), and thus aren't reproduceable on a different host.

To me, that argument seems a little weak, since users can already do the same thing by using source replacement with a local or directory registry. But that approach is fairly complex to get working compared to a simple [patch] in .cargo/config. Maybe we could just issue a big bold warning if a [patch] is present in .cargo/config saying that this will lead to a Cargo.lock that cannot be used on other hosts, along with some kind of option to hide the warning?

Nested workspaces (#5042) might help alleviate this problem somewhat by constructing a temporary workspace around all the crates you want to build and specifying a [patch] in the workspace config. However, I'm not sure that that presents a nicer solution? It requires some some pretty heavy-handed re-organization of the code you want to build, and you'd then end up ignoring the entire lockfile for the package you're building since the temporary workspace would have no lockfile initially.

@alexcrichton It would be nice to get some clarity on whether this feature might conceivably be supported given the closure of #7199, or whether it's dead-on-arrival (in which case I think this issue should be closed).

@alexcrichton
Copy link
Member

Personally I don't think that "simply add [patch] to .cargo/config" is viable. The patch table was designed to be in manifests, and having it be in global configuration radically changes the feature and breaks some of the design of the feature in the first place:

  • Configuration would now add patch.unused to all lock files with projects using the configuration.
  • Lock files in existing projects will no longer work because patch.unused needs to be inserted.
  • The same issue I mentioned before where when you check something in it may not work elsewhere. (Note that source replacement is not an exception to this rule, source replacement is heavily documented as "you must reflect the contents of the original source", so if you get a lock file you can't use elsewhere then you're mis-using the source replacement feature)

Resolving a dependency graph is intended to be a project-local decision, not a global system-configuration decision. That's sort of what Cargo has always trended towards.

All that being said there may still be room for a patch-like feature to get added to .cargo/config.toml. I think it would be good to enumerate requirements to see if patch is perhaps overpowered or if a subset of patch should be allowed in .cargo/config.toml

@jonhoo
Copy link
Contributor

jonhoo commented Feb 16, 2021

I have two proposed paths forward for this, and would be curious to hear your takes. But first, let me talk quickly through the use-case I have in mind.

I give a lot more detail in this i.r-l.o post, but basically I have an external build system that manages first-party dependencies. That build system has a lot of bells and whistles, and one of them is that it sort of has its own patching mechanism. Essentially, the build system has a way to shadow first-party dependencies with replacements. This feature is used for a number of things:

  • "Test releases" of internal packages where all consumers are re-built on top of an updated package before it is actually released.
  • Using a particular git branch/commit for a dependency instead of the internally published version.
  • Using a local checkout of a dependency for coordinated development if the build system detects that such a checkout exists.

The net result of this is that the build system wants to inject dependency patches for certain packages. It could inject them directly into Cargo.toml, but this is likely to make users check them in (even inadvertently), which would be unfortunate. The paths are unlikely to match across hosts, so it would just lead to a lot of unnecessary git diff noise.

In my case, the need for [patch] in .cargo/config is solely to have a way to keep auto-generated patch tables that should not generally be checked into version control. The "global" aspect of .cargo/config is not necessary in my case. So, to that end, here are two proposals:

  1. Add support for a Cargo.patch file that lives next to Cargo.toml. It may only contain [patch] sections, and is not merged from parent directories (as .cargo/config is). This would ensure that the configuration is truly project-local, and not a global system-configuration dependent decision. For your third point (ensuring that if you check something in in one place, it'll work in another place), this decision will now fall to the developers — they can choose to check in their patch file, or they can choose not to if that makes more sense for their use-case. I think this trade-off is fundamental for the use-case I'm considering.
  2. Generalize 1) to support including other toml files into Cargo.toml. Since TOML isn't extensible, I'm imagining a "magical" key like include = "file path" that can be placed anywhere, and that cargo will interpret to mean "read the named TOML file and replace the include with that TOML's contents. This would obviously enable the use-case above, but would be more explicit (it'll be clear from Cargo.toml that a Cargo.patch file is expected), would require that the given files exist by the time cargo runs, and would provide a more widely applicable mechanism than just for patching. The downside of course is that it complicates the configuration format and makes it so that more parts of the Cargo.toml can be excluded from the repository (whether intentionally or not).

I'd be curious to hear your thoughts on these two approaches @alexcrichton. This is a pretty big pain point for me at the moment, and one I'd happily put some time into solving.

@jonhoo
Copy link
Contributor

jonhoo commented Feb 16, 2021

Ah, and a third option inspired by UNIX configuration scripts:

  1. Add support for Config.toml.d, a directory that can contain any number of .toml files which will be combined to make up the Config.toml for the project. This would make it trivial to programmatically manage parts of the configuration, while still ensuring that the entire configuration is contained inside the project.

@alexcrichton
Copy link
Member

Ok thanks for the info, that all definitely makes sense. This definitely strikes a chord with me in the "build system integration" category of Cargo, which is always something that Cargo has basically languished at. Along these lines though at an abstract level this is a problem we've solved many times before in Cargo. A recent example is profile configuration which historically was only located in Cargo.toml but when integrating with other build systems you also want to change it on the fly. The solution for that was, surprise surprise, to add support to .cargo/config.toml which enables build-system-like configuration via generated files or environment variables.

My worries above where mainly about how a features like this could be misused. In my opinion this stance towards new features has hampered Cargo historically to the point that no progress is made out of the fear that something could be misused. Basically I know I don't always personally strike the right balance between adding things which can be misused and also designing things which are impossible or hard to misuse.

That's all basically a way of saying that given your problem statement I really think the best solution is .cargo/config.toml. That's just how we idiomatically do things in Cargo where an external build system wants to configure behavior of Cargo itself. That doesn't alleviate or solve my worries, of course, but just because something has downsides doesn't mean it shouldn't get implemented.

In thinking about this one thing I'm reminded of is the historical paths feature of Cargo. This is a feature of Cargo which was basically the first thing implemented to depend on crates but persists to this day. The downside of this feature is that it's only useful for system-level overrides if the dependency list is not altered. This ends up having pretty limited use cases, basically only around changing one or two lines of code. It sounds like you want to more generally try out release candidates and things which I would reasonably expect to be able to change the list of dependencies.

So basically I think that a feature like this is a balance between the motivation and the downsides. The motivation depends on who wants it and who's willing to implement/design/advocate for it. The downsides are dependent on the same design and ability to mitigate where possible (or acknowledging that the benefit outweighs the downside if mitigation only goes but so far).


Along the path of discussing possible mitigations for the downsides I mentioned above, could you describe a bit how things are set up for you? Many build system integration scenarios, for example, are large monorepos where first-party dependencies have one and will always only have one singular version to choose from. In these cases Cargo.lock is only useful for pulling in crates.io crates. Ideally Cargo.lock would completely eschew the dependencies that are first-party and known to only ever have one version.

Is a situation like this applicable to you? Or are your first-party dependencies, for example, coming from an internal registry where there's multiple versions and Cargo's actually doing version resolution between them?

@jonhoo
Copy link
Contributor

jonhoo commented Feb 19, 2021

Thank you for the context, that helps me understand where you're coming from a lot better. I agree this is ultimately a matter of balancing concerns, and think you're right that .cargo/config feels like the appropriate solution even if it also opens up some avenues for suboptimal behavior.


I think I've documented how things work for me in a couple of places, but it's scattered so let me try to bring it all together here. There are two designs at play — one that is currently in use, and one that is the "next generation" design I'm going for.

Current design: The build system generates a local registry (local-registry) that holds an index that's a merge between a) a stripped-down mirror of crates.io, and b) first-party dependencies built by the build system as part of the same build. The latter set is determined on the fly based on the build system dependency closure — there is no "internal registry". Multiple major versions may exist for a given first-party dependency, though that is rare in practice. Each major version has only a single minor+patch version. The lockfile here only provides value for the third-party dependencies pulled from the mirror of crates.io. For the first-party dependencies, the lockfile, and cargo's general assumption that the version number dictates the source creates a lot of pain; if a first-party dependency changes, cargo will not re-build that dependency since the version number in the registry is the same. And if the user wipes the cargo caches to force a re-build, cargo complains (rightfully) that the checksum has changed, and fails the build.

Lots of internal packages are already built using this pattern, so it's not feasible to change its design overnight, such as telling everyone to now use path dependencies. And even if I could, the paths are generated by the build system, and so we'd need something like named path bases to really make it work. If, on the other hand, the build system could inject a [patch] for each first-party dependency, it could patch them all to use path dependencies without user involvement, and everything would "just work".

Future design: Going forward, what I want to move to is a single "normal" registry (an http registry if possible) that serves both crates.io crates and internal crates. First-party packages can then be published to said registry, and thus become available to cargo without needing to cobble together a local registry for each build. First-party crates that are published to this internal registry would follow cargo's source == version assumption, and thus would "just work". However, that only holds for packages that consume the (internally) published versions. Some internal packages may wish to depend on the very latest version of a first-party dependency without waiting for the next release. The most obvious example of this is someone who owns, say, two internal packages A and B where A depends on B may want to have them both checked out together and be able to have modifications to B be immediately available in A. Since they are built together, the build system knows that the locally-built B should "shadow" the B in the internal registries — this is best achieved with a local build system generated [patch] that redirects B to a path dependency (*).

There is another class of dependencies in this design, which is if A depends on B and B is not published to the internal registry. There are a couple of reasons why this might be the case that we don't need to get into. If that's the case, A would list B directly in its dependency closure in the main build system, and it would take care of fetching A locally before trying to build B. The question then of course is how A lists its dependency on B in Cargo.toml. A path dependency might be the "right" choice (with named path bases again), but would also look a little out of place, I'm not sure. Another alternative is that the build system (which already proxies access to the internal registry) "injects" an index entry (by intercepting the relevant HTTP index GET) and [patch] for B in the registry — this way, developers could list B as any other internal dependency in their Cargo.toml, and if B were to be published later, the migration would be trivial. I haven't settled on which of these paths to go down yet.

(*) A question arises here about lockfiles. Say my lockfile for A is generated with a path patch because in my environment I pull in B as a direct dependency of A and thus re-build it locally. I check in that lockfile. Then someone who uses B straight from the registry tries to build A — will cargo complain? I don't think it should (unless the compile fails), but not sure.


Separately, I wonder if there might be room for a compromise where [patch] in .cargo/config is only accepted in the directory that holds the Cargo.toml, and nowhere else. So if we have a crate at /a/b and the user invokes cargo at /a/b/c, only the [patch] in /a/b/.config/cargo.toml is considered, and any patch statements in /a/.cargo/config.toml or /a/b/c/.cargo/config.toml are ignored (and produce warnings to that effect)?

@alexcrichton
Copy link
Member

Hm ok thanks for explaining things! I was hoping I'd be struck by some sort of inspiration that would lend itself to a solution to your issues while also not being a fully-generalized [patch]-in-config. Unfortunately though that didn't happen. I don't think I fully grok all the intricacies of your use case, but that's probably ok too. I'm also not sure learning about your build system in more precise detail will help me too too much with this.

Perhaps the next steps here are to implement this in Cargo? I don't think it would be too too large of an implementation. We'd want to gate it to nightly for now, of course, but that should hopefully at least let you explore internally whether it would work for your intended use case?

(*) A question arises here about lockfiles. Say my lockfile for A is generated with a path patch because in my environment I pull in B as a direct dependency of A and thus re-build it locally. I check in that lockfile. Then someone who uses B straight from the registry tries to build A — will cargo complain? I don't think it should (unless the compile fails), but not sure.

This sort of depends I think. I'm not entirely sure what you're thinking about how the lockfile is shared, but in general Cargo doesn't complain about an "invalid" lock file unless you pass something like --locked. Othewise Cargo will just silently update the lock file for you.

@jonhoo
Copy link
Contributor

jonhoo commented Feb 24, 2021

Hm ok thanks for explaining things! I was hoping I'd be struck by some sort of inspiration that would lend itself to a solution to your issues while also not being a fully-generalized [patch]-in-config. Unfortunately though that didn't happen. I don't think I fully grok all the intricacies of your use case, but that's probably ok too. I'm also not sure learning about your build system in more precise detail will help me too too much with this.

Yeah, I've also been going over this many times over, and I keep coming back to either needing to patch in .cargo/config or being able to auto-generate parts of Cargo.toml (hence the Cargo.patch/Cargo.toml.d proposals). Another option might be to allow patching through environment variables, but that would get unwieldy very quickly, and may be a box we don't want to open.

Perhaps the next steps here are to implement this in Cargo? I don't think it would be too too large of an implementation. We'd want to gate it to nightly for now, of course, but that should hopefully at least let you explore internally whether it would work for your intended use case?

I'll give that a stab then! I'm imagining that I'll be modifying the part of the code that reads out the [patch] from Cargo.toml and have it augment it with [patch] from Config if it exists. Does that sound about right?

We don't currently use nightly internally, but that's a "me problem" not a cargo problem :) Landing it under a nightly-only flag for now seems like the only responsible way to go about it.

(*) A question arises here about lockfiles. Say my lockfile for A is generated with a path patch because in my environment I pull in B as a direct dependency of A and thus re-build it locally. I check in that lockfile. Then someone who uses B straight from the registry tries to build A — will cargo complain? I don't think it should (unless the compile fails), but not sure.

This sort of depends I think. I'm not entirely sure what you're thinking about how the lockfile is shared, but in general Cargo doesn't complain about an "invalid" lock file unless you pass something like --locked. Othewise Cargo will just silently update the lock file for you.

Okay, yeah, I think that should be good enough. Thanks!

Separately, I wonder if there might be room for a compromise where [patch] in .cargo/config is only accepted in the directory that holds the Cargo.toml, and nowhere else. So if we have a crate at /a/b and the user invokes cargo at /a/b/c, only the [patch] in /a/b/.config/cargo.toml is considered, and any patch statements in /a/.cargo/config.toml or /a/b/c/.cargo/config.toml are ignored (and produce warnings to that effect)?

What do you think of this idea @alexcrichton? It'd limit the blast radius of this kind of config patching to be project-local, and is unlikely to have quite the same chance of hard-to-debug problems as the recursive-search approach.

@alexcrichton
Copy link
Member

I'll give that a stab then! I'm imagining that I'll be modifying the part of the code that reads out the [patch] from Cargo.toml and have it augment it with [patch] from Config if it exists. Does that sound about right?

I think so yeah, although I probably wouldn't try to agument the [patch] table stored in a Manifest but rather only take into account more entries from Config wherever we read the [patch] table (which I think is only one location in src/cargo/ops/resolve.rs.

What do you think of this idea @alexcrichton? It'd limit the blast radius of this kind of config patching to be project-local, and is unlikely to have quite the same chance of hard-to-debug problems as the recursive-search approach.

I probably wouldn't bother with extra restrictions for now. I think it makes more sense to implement something easy-ish and flexible in nightly. This is something we'd want to revisit before stabilization.

@jonhoo
Copy link
Contributor

jonhoo commented Feb 24, 2021

I submitted a preliminary nightly-only implementation in #9204.

bors added a commit that referenced this issue Mar 15, 2021
Support [patch] in .cargo/config files

This patch adds support for `[patch]` sections in `.cargo/config.toml`
files. Patches from config files defer to `[patch]` in `Cargo.toml` if
both provide a patch for the same crate.

The current implementation merge config patches into the workspace
manifest patches. It's unclear if that's the right long-term plan, or
whether these patches should be stored separately (though likely still
in the manifest). Regardless, they _should_ likely continue to be
parsed when the manifest is parsed so that errors and such occur in the
same place regardless of where a patch is specified.

Fixes #5539.
@bors bors closed this as completed in 8178f22 Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-patch Area: [patch] table override C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted`
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants