cmd/go: standard-library vendoring in module mode #30241

Open
bcmills opened this issue Feb 14, 2019 · 53 comments
Labels: early-in-cycle (a change that should be done early in the 3 month dev cycle), modules, Proposal, Proposal-Accepted

@bcmills
Contributor

bcmills commented Feb 14, 2019

This proposal is both a fix for #26924, and a means for users to explicitly upgrade the golang.org/x dependencies of the standard library (such as x/crypto and x/net).

Proposal

The standard library will ship with three go.mod files: src/go.mod (with module path std), src/cmd/go.mod (with module path cmd), and misc/go.mod (with module path misc).

The std and cmd modules will have vendor directories, containing modules.txt files in the usual format and managed using the standard Go 1.13 vendoring semantics (hopefully #30240).

The std and cmd modules differ somewhat from ordinary modules, in that they do not appear in the build list.

  • No module outside GOROOT/src may declare its own module path to begin with std or cmd.
  • No module may explicitly require, replace, or exclude any module whose path begins with std or cmd. (Their versions are fixed to the Go release in use.)
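The two restrictions above could be checked roughly as follows. This is a minimal sketch, not cmd/go's actual code: the function names and the insideGOROOT flag are hypothetical, and it assumes that "begins with std or cmd" means the first path element is exactly std or cmd.

```go
package main

import (
	"fmt"
	"strings"
)

// isReservedPath reports whether modPath falls under std or cmd.
// Assumption: "begins with" is read as "first path element equals".
func isReservedPath(modPath string) bool {
	first, _, _ := strings.Cut(modPath, "/")
	return first == "std" || first == "cmd"
}

// checkDeclared validates a module's own path. insideGOROOT is a
// hypothetical flag saying the go.mod file lives under GOROOT/src.
func checkDeclared(modPath string, insideGOROOT bool) error {
	if isReservedPath(modPath) && !insideGOROOT {
		return fmt.Errorf("module path %q is reserved for the Go distribution", modPath)
	}
	return nil
}

// checkDirective validates the target of a require, replace, or exclude
// directive: std and cmd are pinned to the Go release in use.
func checkDirective(directive, target string) error {
	if isReservedPath(target) {
		return fmt.Errorf("%s %s: version is fixed to the Go release in use", directive, target)
	}
	return nil
}

func main() {
	fmt.Println(checkDeclared("std", false))
	fmt.Println(checkDeclared("example.com/stdlib", false))
	fmt.Println(checkDirective("replace", "cmd"))
}
```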

For files within GOROOT/src only, when we resolve an import of a package provided by an external module, we will first check the build list for that module.

  • If the build list does not contain that module, or if the version in the build list is strictly older than the one required by the standard-library module containing the file, then we will resolve the import to a directory in the standard library, and the effective import path will include the explicit prefix vendor or cmd/vendor.
    • The explicit prefix ensures that no two packages in the build will have the same import path but different source code.
    • It also ensures that any code loaded from a valid module path has a corresponding module in the build list.
  • If the version in the build list is equal to the one required by std or cmd, that version is not replaced in the main module, and the package exists in the corresponding vendor directory, then we will load the package from vendor or cmd/vendor with its original import path.
    • This eliminates network lookups if the main module is std or cmd itself (that is, if the user is working within GOROOT/src).
    • We assume that the code in GOROOT/src/vendor and GOROOT/src/cmd/vendor will remain pristine and unmodified.
  • If the version in the build list is greater than the version required by std or cmd, or if the version is equal and the module is replaced, then we will load the package from the module cache in the usual way.
    • This may cause substantial portions of the library to be rebuilt, but will also reduce code duplication — which should result in smaller binaries.
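The three cases above, which apply only to imports from files within GOROOT/src, can be sketched as a single decision function. This is an illustration under stated assumptions rather than the real loader: the types and names are hypothetical, versions are simplified to integers, and the case the proposal leaves unspecified (equal version, not replaced, but package missing from the vendor directory) falls through to the module cache here.

```go
package main

import "fmt"

// Source says where a package imported from GOROOT/src is loaded from.
type Source int

const (
	VendorPrefixed Source = iota // vendor directory, import path gains a vendor/ prefix
	VendorOriginal               // vendor directory, original import path
	ModuleCache                  // module cache, original import path
)

func (s Source) String() string {
	return [...]string{"vendor (prefixed path)", "vendor (original path)", "module cache"}[s]
}

// buildEntry is a hypothetical view of one module in the build list.
type buildEntry struct {
	Version  int  // simplified stand-in for a semantic version
	Replaced bool // true if the main module replaces this module
}

// resolve applies the proposal's three cases. stdVersion is the version
// required by std or cmd; inVendor says whether the package exists in the
// corresponding vendor directory.
func resolve(buildList map[string]buildEntry, modPath string, stdVersion int, inVendor bool) Source {
	e, ok := buildList[modPath]
	if !ok || e.Version < stdVersion {
		return VendorPrefixed // case 1: module absent or strictly older
	}
	if e.Version == stdVersion && !e.Replaced && inVendor {
		return VendorOriginal // case 2: same version, pristine vendored copy
	}
	return ModuleCache // case 3: newer, or equal but replaced
}

func main() {
	buildList := map[string]buildEntry{
		"golang.org/x/net":  {Version: 5},
		"golang.org/x/text": {Version: 3, Replaced: true},
	}
	fmt.Println(resolve(buildList, "golang.org/x/crypto", 2, true))
	fmt.Println(resolve(buildList, "golang.org/x/net", 5, true))
	fmt.Println(resolve(buildList, "golang.org/x/text", 3, true))
}
```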
@gopherbot gopherbot added this to the Proposal milestone Feb 14, 2019
@bcmills
Contributor Author

bcmills commented Feb 14, 2019

(CC @rsc @bradfitz @jayconrod)

@bcmills bcmills modified the milestones: Proposal, Go1.13 Feb 14, 2019
@bcmills bcmills added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Feb 14, 2019
@bcmills bcmills changed the title proposal: standard-library vendoring in module mode proposal: cmd/go: standard-library vendoring in module mode Feb 14, 2019
@FiloSottile
Contributor

I am strongly in favor of using modules for vendoring in the standard library, instead of ad-hoc copies, but I don't see the reason to let applications override standard library dependencies. Was this widely requested, or something we wished we had often?

Speaking for x/crypto, I actually don't want to have to figure out if a crypto/tls internal failure is due to a replaced chacha20poly1305.

Also, I don't see why we'd let users override vendored dependencies but not standard library packages. The difference between golang.org/x/crypto/chacha20poly1305 and crypto/aes should be an implementation detail of crypto/tls. Then why can one be overridden, and the other not? Feels arbitrary.

@jayconrod
Contributor

cc @ianthehat

I share some of @FiloSottile's concerns. Allowing users to override std dependencies, especially crypto dependencies, is scary. But not allowing this would be against the spirit of modules.

There are some safeguards here, but I'd love to solve this without making the module resolution and package loading logic more complicated.

Possibly dumb, low-tech idea: what if we vendor these modules without using vendoring? We could run go mod vendor, then rename the vendor directory to something else (internal/vendor_) and fix up the imports? That would fix #26924 and move complexity out of cmd/go, into tools.

@bcmills
Contributor Author

bcmills commented Feb 15, 2019

what if we vendor these modules without using vendoring? We could run go mod vendor, then rename the vendor directory to something else (internal/vendor_) and fix up the imports?

We have that today, in internal/x. We could fix #26924 using the same strategy in cmd, if we're ok with code duplication for the x repos. (The goal of this proposal is to reduce that duplication.)

@bcmills
Contributor Author

bcmills commented Feb 15, 2019

@FiloSottile

I don't see the reason to let applications override standard library dependencies. Was this widely requested, or something we wished we had often?

It has come up at Go team summits from time to time, and while we don't see many complaints about upgrading x repos explicitly, we do see fairly frequent complaints about binary sizes (#6853 is canonical).

The two are closely related, in that one way you can end up with an overly-large binary is by pulling in a redundant copy of an x repo from the standard library.

Per my favorite quote from C. A. R. Hoare's Hints on Programming Language Design:

[L]isten carefully to what language users say they want, until you
have an understanding of what they really want. Then find some way of
achieving the latter at a small fraction of the cost of the former. This is
the test of success in language design, and of progress in programming
methodology.

It's much easier for us to reduce code duplication in the standard library than to reduce the outputs generated from duplicated code.

@bcmills
Contributor Author

bcmills commented Feb 15, 2019

Speaking for x/crypto, I actually don't want to have to figure out if a crypto/tls internal failure is due to a replaced chacha20poly1305.

We probably should start requesting the output of go list -m all in the issue template, for precisely that reason. (The same problem can occur when someone reports an issue against x/net, and that has ~nothing to do with the standard library.)

But consider the opposite direction, too: wouldn't it be nice for folks to be able to, say, test the interaction of TLS 1.3 with their dependencies written against net/http, without having to also run a bleeding-edge version of the compiler and runtime (with any associated runtime bugs)?

@bcmills
Contributor Author

bcmills commented Feb 15, 2019

Also, I don't see why we'd let users override vendored dependencies but not standard library packages. The difference between golang.org/x/crypto/chacha20poly1305 and crypto/aes should be an implementation detail of crypto/tls. Then why can one be overridden, and the other not? Feels arbitrary.

That seems like a great argument for making many of the standard-library packages thin forwarding shims around x/ packages. I may file that proposal, in fact, but it's separate from this one. 🙂

@jayconrod
Contributor

I'm definitely sympathetic to reducing code duplication and binary size, but I'm also concerned about complexity and special cases.

I think we should resolve #26924 using the internal/x approach in order to unblock other module work for 1.13, and we should talk further about modularizing the standard library after 1.13. Just to confirm my understanding, converting cmd/vendor to cmd/internal/x would not significantly increase binary or SDK download size, right?

@kardianos
Contributor

@bcmills @FiloSottile

We probably should start requesting the output of go list -m all in the issue template, for precisely that reason. (The same problem can occur when someone reports an issue against x/net, and that has ~nothing to do with the standard library.)

This is the same reaction I had when I saw the objection to updating the std lib versions. Make it easy and well known how to list module versions, and request that list in issues. I also remember when Azure had to vendor a patched version of crypto/tls in their SDK to reach their TLS endpoints, then fork the world to make it reference the fork.
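Besides go list -m all at build time, a program can report the module versions recorded in its own binary through runtime/debug.ReadBuildInfo, which could be pasted into an issue report. The sketch below is only an illustration; formatLine is a hypothetical helper, and the output format merely imitates the "path version => replacement" style.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// formatLine renders one dependency, including its replacement if any.
// An empty replPath means the module is not replaced.
func formatLine(path, version, replPath, replVersion string) string {
	line := path + " " + version
	if replPath != "" {
		line += " => " + replPath
		if replVersion != "" {
			line += " " + replVersion
		}
	}
	return line
}

func main() {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		fmt.Println("no build info available (binary not built in module mode)")
		return
	}
	for _, m := range info.Deps {
		var replPath, replVersion string
		if m.Replace != nil {
			replPath, replVersion = m.Replace.Path, m.Replace.Version
		}
		fmt.Println(formatLine(m.Path, m.Version, replPath, replVersion))
	}
}
```

A replaced module shows up with its replacement, which is exactly the detail a maintainer would want when triaging a report.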

@bcmills
Contributor Author

bcmills commented Feb 15, 2019

@jayconrod, in looking at the details of this, it seems to be by far the simplest way to make module-mode vendoring work for vendoring in the standard library.

go mod vendor uses the import graph to figure out which packages to copy, so it's much simpler to have the module-provided packages in the import graph than to try to build some other analysis (and vendoring tool) specifically for the standard library.

@FiloSottile
Contributor

I don't disagree with including go list -m all in issue reports, but it still does not make me feel good about letting users swap chacha20poly1305 backends to crypto/tls. TLS already has too many joints, and joints in cryptography are where things break. I'd rather not have to keep in my head an extra dimension of the compatibility matrix.

@jayconrod
Contributor

@bcmills I agree that this is probably the simplest way to make module-mode vendoring work in the standard library in terms of new code that needs to be written. My concern is that this makes package loading harder to reason about. In particular, this proposal means that you may get either one or two copies of some packages, depending on what version the main module requires or whether it has a replacement. We have the same risk of conflict with exported types or global state as we do today, but now the risk is more subtle.

Here's a possible alternative (with some drawbacks at the end):

  • The Go SDK ships with go.mod files for std, cmd, and misc, as described above.
    • Imports starting with std/, cmd/, or misc/ are resolved to those modules. This will require new logic, but it would replace existing GOROOT / GOPATH package resolution.
    • These modules may have normal requirements on x/tools, x/net, etc.
  • The Go SDK ships with a pkg/mod directory, which would contain dependencies of the standard modules. It would be in the normal module cache format. It would act as a fallback read-only cache before going to the network.
  • For packages that should not be user-replaceable (i.e., chacha20poly1305), we would create private copies in internal/x. I'm hoping there wouldn't be many of these, so we could still mostly avoid code duplication in binaries.
    • I think writing a tool to do this would be fairly straightforward. Maybe this is something gorelease should do.

I'm realizing some drawbacks to this approach though, as I'm typing this out.

  • This would increase SDK size, since pkg/mod would include sources for packages we don't depend on. Not sure how much this would add.
  • Our dependencies on x/tools, x/net would effectively set global minimum versions for those modules. MVS would override user requirements on older versions.
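The second drawback follows from how minimal version selection picks the highest of the required minimums. The sketch below is a heavily simplified illustration: it is flattened (direct requirements only, no graph traversal), it ignores pre-release suffixes, and the version numbers are made up.

```go
package main

import "fmt"

// parse extracts major, minor, and patch from a "vX.Y.Z" string.
// Pre-release and build suffixes are ignored in this sketch.
func parse(v string) [3]int {
	var p [3]int
	_, _ = fmt.Sscanf(v, "v%d.%d.%d", &p[0], &p[1], &p[2])
	return p
}

// newer reports whether version a is strictly newer than version b.
func newer(a, b string) bool {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] > pb[i]
		}
	}
	return false
}

// selectVersions keeps, for each module, the highest version that any
// requirement set asks for.
func selectVersions(reqs ...map[string]string) map[string]string {
	selected := map[string]string{}
	for _, r := range reqs {
		for mod, v := range r {
			if cur, ok := selected[mod]; !ok || newer(v, cur) {
				selected[mod] = v
			}
		}
	}
	return selected
}

func main() {
	// Hypothetical requirements: std asks for a newer x/net than the user does.
	std := map[string]string{"golang.org/x/net": "v0.5.0"}
	user := map[string]string{"golang.org/x/net": "v0.2.0", "golang.org/x/text": "v0.3.8"}
	fmt.Println(selectVersions(std, user))
}
```

In this example the user's requirement on the older x/net is overridden by the requirement from std, which is the "global minimum" effect described above.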

@bcmills
Contributor Author

bcmills commented Feb 19, 2019

  • Our dependencies on x/tools, x/net would effectively set global minimum versions for those modules.

Not only that, but all of their transitive dependencies too: for example, various parts of x/net and std currently depend on x/text, and x/text has a dependency on x/tools via golang.org/x/text/cmd/gotext and golang.org/x/text/message/pipeline. x/tools has a fairly broad dependency graph (see #29981), so we would be setting minimums for a lot of arbitrary modules.

Note that everything in cmd should be a package main, so nothing can import it. That makes the cmd module a bit easier to handle than std, since it mitigates the code bloat from using a vendor directory: we know affirmatively that no binary that links against cmd will also link against a redundant x repo.

misc is similar to cmd: it does not have a globally-addressable import path, so the only time you can build anything in misc is if it's the main module.

@bcmills
Contributor Author

bcmills commented Feb 19, 2019

@FiloSottile

TLS already has too many joints, and joints in cryptography are where things break. I'd rather not have to keep in my head an extra dimension of the compatibility matrix.

I can certainly sympathize with that, but in that case, is there a fundamental reason for the implementation of crypto/tls to live in the standard library at all?

That is: if it's that tightly coupled to x/crypto, why not make it a forwarding façade over an implementation in x/crypto? That would preserve the advantages of decoupling the release cycle and reducing code bloat, while keeping the compatibility dimensions roughly equivalent to x/crypto today.

@gopherbot

Change https://golang.org/cl/162989 mentions this issue: go/analysis: allow overriding V flag without code patches

gopherbot pushed a commit to golang/tools that referenced this issue Feb 20, 2019
In CL 149609, a file was added to
src/cmd/vendor/golang.org/x/tools/go/analysis/internal/analysisflags/patch.go
to override the behavior of the V flag for cmd/vet.

That modification causes the behavior of cmd/vet to change when a
pristine copy of x/tools is vendored in, and module-mode vendoring
will only support pristine copies (see golang/go#30240).

Instead, allow cmd/vet to override the V flag by defining its own V
flag before it invokes unitchecker.Main.

Tested manually (by patching into cmd/vendor).

Updates golang/go#30240
Updates golang/go#30241
Updates golang/go#26924
Updates golang/go#30228

Change-Id: I10e4523e1f4ede94fbfc745012dadeefef48e927
Reviewed-on: https://go-review.googlesource.com/c/162989
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@bcmills
Contributor Author

bcmills commented Feb 20, 2019

To summarize so far: it seems that there are some concerns about the module-upgrade behavior, but no objections to the remainder of the proposal.

Since a lot of this work needs to happen early in the cycle, I'm going to go ahead and implement a more limited form: for now, we'll always resolve dependencies of the standard library to the packages in the vendor directories, and we'll only treat them as part of the external module if the working directory is within the std or cmd module proper.

@FiloSottile
Contributor

TLS already has too many joints, and joints in cryptography are where things break. I'd rather not have to keep in my head an extra dimension of the compatibility matrix.

I can certainly sympathize with that, but in that case, is there a fundamental reason for the implementation of crypto/tls to live in the standard library at all?

That is: if it's that tightly coupled to x/crypto, why not make it a forwarding façade over an implementation in x/crypto? That would preserve the advantages of decoupling the release cycle and reducing code bloat, while keeping the compatibility dimensions roughly equivalent to x/crypto today.

Making the implementation of crypto/tls swappable is interesting, and I had explored a different approach in #21753. However, I'm not sure we want to move something so core to most applications to x/crypto. For example, the security release process of x/crypto is far, far weaker.

Maybe we could still make main security releases which just bump the x/crypto dependency, but there's a lot of moving parts to this all, and I'd like to see it as a separate proposal.

(Also, I'm not convinced by the argument that an x/crypto dependency that I don't want to be a joint makes crypto/tls substantially different from any other standard library package.)

@bcmills
Contributor Author

bcmills commented Feb 21, 2019

I'm not convinced by the argument that an x/crypto dependency that I don't want to be a joint makes crypto/tls substantially different from any other standard library package.

Agreed, but I think the point generalizes: any package in std that is tightly coupled to one in x could be replaced with a forwarding shim that isolates the coupling within the module.

@dmitshur
Contributor

dmitshur commented Feb 21, 2019

I have two questions about this proposal.

The standard library will ship with three go.mod files: src/go.mod (with module path std), src/cmd/go.mod (with module path cmd), and misc/go.mod (with module path misc).

The standard library has a policy on which packages are allowed to import which others. It's codified in:

// pkgDeps defines the expected dependencies between packages in
// the Go source tree. It is a statement of policy.

Have you considered following the L0/L1/L2/L3/(L4+everything else) tiers there to guide std lib module boundaries (e.g., having stdl0, stdl1, stdl2, stdl3, stdl4 modules rather than a single std one)? Would there be benefits in doing so?

The std and cmd modules differ somewhat from ordinary modules, in that they do not appear in the build list.

What is the motivation to make these modules differ from ordinary modules?

@gopherbot

Change https://golang.org/cl/163207 mentions this issue: misc: add go.mod file

@hunjixin

Any progress on this proposal?

@andig
Contributor

andig commented Oct 3, 2022

We've recently had a case that touches on @FiloSottile's comment. While working with a broken device that contains a borked certificate, we needed to modify cryptobyte (https://groups.google.com/u/1/g/golang-nuts/c/wlhj5RFXh9g). Unfortunately, it was a huge endeavour to apply this patch consistently across local, CI, and Docker builds. We are aware that this patch is not suitable for upstream or for any other purpose, but it would have been convenient to have a simpler way of applying it.

Is there anything that could move this proposal into the active column of the proposals project?

@rsc
Contributor

rsc commented Oct 6, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Contributor

rsc commented Oct 12, 2022

/cc @matloob @bcmills

@rsc
Contributor

rsc commented Oct 26, 2022

It sounds like people are generally OK with this. There is some question about whether there are problems with x/crypto being too tied to the main repo, but if so we should find and fix those.

@rsc
Contributor

rsc commented Oct 26, 2022

Are there any objections to making this change?

@FiloSottile
Contributor

FiloSottile commented Oct 26, 2022

Over the past few years we actually brought a lot from x/crypto into std, so these days only chacha20poly1305, hkdf, and cryptobyte are vendored into the standard library. This makes upgrading x/crypto both less troublesome and less useful.

I find it a bit weird to expose upgrading these specific packages but not anything else in the stdlib to applications. Why can they upgrade ChaCha20Poly1305 but not AES-GCM? Why can an X.509 parsing bug be fixed with a replace if it's in cryptobyte but not if it's in the crypto/x509 caller? There is no answer because we never designed that boundary to be meaningful: vendoring a package was an implementation detail, not a visible decision. This change would expose it.

Looking at the full list of vendored packages, this doesn't feel arbitrary only for x/crypto.

# golang.org/x/crypto v0.0.0-20220722155217-630584e8d5aa
## explicit; go 1.17
golang.org/x/crypto/chacha20
golang.org/x/crypto/chacha20poly1305
golang.org/x/crypto/cryptobyte
golang.org/x/crypto/cryptobyte/asn1
golang.org/x/crypto/hkdf
golang.org/x/crypto/internal/poly1305
golang.org/x/crypto/internal/subtle
# golang.org/x/net v0.0.0-20220920203100-d0c6ba3f52d9
## explicit; go 1.17
golang.org/x/net/dns/dnsmessage
golang.org/x/net/http/httpguts
golang.org/x/net/http/httpproxy
golang.org/x/net/http2/hpack
golang.org/x/net/idna
golang.org/x/net/lif
golang.org/x/net/nettest
golang.org/x/net/route
# golang.org/x/sys v0.0.0-20220804214406-8e32c043e418
## explicit; go 1.17
golang.org/x/sys/cpu
# golang.org/x/text v0.3.8-0.20220722155301-d03b41800055
## explicit; go 1.17
golang.org/x/text/secure/bidirule
golang.org/x/text/transform
golang.org/x/text/unicode/bidi
golang.org/x/text/unicode/norm

In other words, it's nice if by luck a bug is fixable by replacing cryptobyte, but I think it would be more useful to talk about this in the broader context of a plan that involves carving out the runtime- and compiler-independent pieces of std that folks might want to upgrade, like the whole crypto/tls tree, net/http, etc. That would consistently relieve pressure on backport choices and make it easier to adapt to unsupported cases without forking the whole toolchain, which feels like the actual need here.
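The listing above is in the vendor/modules.txt format: a "# module version" line, an optional "## ..." annotation line, then one package path per line. A minimal parser sketch follows, assuming only those three line kinds (replacement lines of the form "# path version => ..." are not handled specially); parseModulesTxt is a hypothetical name, not a cmd/go function.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseModulesTxt groups vendored package paths under their
// "module version" header line.
func parseModulesTxt(text string) map[string][]string {
	pkgs := map[string][]string{}
	current := ""
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "## "):
			// Blank line or annotation such as "## explicit; go 1.17": skip.
		case strings.HasPrefix(line, "# "):
			// Module header: remember it as the current group.
			current = strings.TrimPrefix(line, "# ")
			pkgs[current] = nil
		default:
			// Anything else is a package path inside the current module.
			pkgs[current] = append(pkgs[current], line)
		}
	}
	return pkgs
}

// sample is an excerpt of the listing quoted in the thread.
const sample = `# golang.org/x/sys v0.0.0-20220804214406-8e32c043e418
## explicit; go 1.17
golang.org/x/sys/cpu
# golang.org/x/text v0.3.8-0.20220722155301-d03b41800055
## explicit; go 1.17
golang.org/x/text/secure/bidirule
golang.org/x/text/transform
`

func main() {
	for mod, list := range parseModulesTxt(sample) {
		fmt.Printf("%s: %d package(s)\n", mod, len(list))
	}
}
```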

@mvdan
Member

mvdan commented Oct 26, 2022

  • If the version in the build list is equal to the one required by std or cmd, that version is not replaced in the main module, and the package exists in the corresponding vendor directory, then we will load the package from vendor or cmd/vendor with its original import path.

Out of curiosity, shouldn't we be able to deduplicate code and get smaller binaries in this case as well? The benefits are only mentioned in the third case (build list has a newer version), but not in this second case (build list has the same version).

@bcmills
Contributor Author

bcmills commented Nov 1, 2022

@FiloSottile, I agree that this proposal would be more useful if more of the standard library could be upgraded.

The process seems straightforward enough: for a given package in std, define a corresponding package in an x/ repo, and turn the std package into a thin forwarding shim around the x/ package. Then, upgrading the x/ package automatically upgrades the std package as well. We would just need a consensus as to which std libraries should become wrappers, and perhaps extend the main-repo API checks to also run for the relevant x/ packages.

@rsc
Contributor

rsc commented Nov 2, 2022

To be clear, this change is not really being made just for x/crypto. It's a step toward a more rational relationship between std and x. We'll see what the next step is after we take this one.

@rsc
Contributor

rsc commented Nov 2, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@martin-sucha
Contributor

No module outside GOROOT/src may declare its own module path to begin with std or cmd

Searching for "module std" filename:go.mod on Github yields 339 results, for "module cmd" filename:go.mod there are 485 results.

If I understand correctly, those would break.

If we want to avoid breaking them, we could use for example golang.org/std and golang.org/cmd as the module names instead, as those do not seem to be used now.

@ianlancetaylor
Contributor

std and cmd are already special names, as documented at https://pkg.go.dev/cmd/go#hdr-Package_lists_and_patterns. I'm a bit surprised that it works to use those as module names.

@seankhliao
Member

Looking at search result counts is inaccurate. Those are all copies of the Go source tree.

@martin-sucha
Contributor

std and cmd are already special names, as documented at https://pkg.go.dev/cmd/go#hdr-Package_lists_and_patterns

Ah, missed that, thanks for the link! I was checking https://go.dev/ref/mod#module-path before I posted my comment, and that page does not mention std or cmd as special names.

Looking at search result counts is inaccurate. Those are all copies of the Go source tree.

Fair point. I haven't counted how many are copies of the Go source tree. It could be that most of them are.

There are some examples that aren't copies of Go source tree:


To be clear - I'm fine with disallowing module names beginning with std and cmd, just wanted to point out that there are some existing modules like this in the wild.

@rsc
Contributor

rsc commented Nov 9, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: cmd/go: standard-library vendoring in module mode cmd/go: standard-library vendoring in module mode Nov 9, 2022
Projects
Status: Accepted