proposal: cmd/go: automatically check and use vendored packages #33848

Open
bcmills opened this issue Aug 26, 2019 · 12 comments

Comments

@bcmills
Member

commented Aug 26, 2019

Abstract

This is a proposal to enable the use of the main module's vendor directory, by default, in module mode. This proposal replaces #30240 and #29058, and subsumes #27227.

Background

The Go 1.5 release included support for a vendor subdirectory for each package subtree within GOPATH. A vendor directory, if present, implicitly changed the interpretation of import statements within that subtree to always refer to packages within the vendor directory if present.

In module mode, the meaning of each import is instead determined by the dependencies of the main module used for the build. However, from the first release of module mode (in Go 1.11), the go command has included a go mod vendor subcommand, which populates a vendor directory at the root of the main module with source code from those dependencies; and a -mod=vendor flag, which instructs the go command to use that source code for building.

In module mode, unlike GOPATH mode, the vendor subdirectories of other packages and modules are ignored — only the vendor directory at the root of the main module is used. This gives the author of the main module complete control over their dependencies via require and replace directives.

Go users have built a variety of workflows around vendored dependencies for a variety of use-cases — including self-contained builds, language-agnostic code review, and ephemeral patches — and since the launch of module mode in Go 1.11 they have made it clear that those workflows remain important in module mode.

However, vendoring in module mode today has a few rough edges:

  • A maintainer who introduces a new import statement may easily and accidentally introduce skew between the versions in the go.mod file and those in the vendor directory (#29058).

    • In CL 174528, we added a simple check for this skew in the std and cmd modules in the standard library, but it relies on some assumptions about the dependencies of std and cmd that do not hold in general.
  • The maintainers of projects that use vendoring expect the vendor directory to be used by default (#27227), but today in module mode that requires users to set -mod=vendor, either explicitly on the command line or in a GOFLAGS variable. For users working on multiple modules with different vendoring strategies, setting the variable may require repeated intervention over the course of the day.

This proposal attempts to address those rough edges.

Proposal

The concrete proposal is posted in a comment below. (Any updates will be linked from here.)

(CC @jayconrod @ianthehat)

@gopherbot gopherbot added this to the Proposal milestone Aug 26, 2019

@gopherbot gopherbot added the Proposal label Aug 26, 2019

@bcmills

Member Author

commented Aug 26, 2019

Proposal

I propose to enable -mod=vendor automatically, and to fail build operations when -mod=vendor is enabled if the vendor/ directory is out of sync with the go.mod file.

In order to detect when the vendor/ directory is out of sync with the go.mod file, we need to know the constraints imposed by the main module — that is, its require and replace directives. Given those, the remaining module requirements are computed deterministically, so we do not need to store or check the complete module graph, which may be much larger than the explicit requirements or even the final list of selected versions.

Concretely, the changes are:

  • In the vendor/modules.txt file, add an annotation indicating which modules were listed explicitly in a require or replace directive in the go.mod file as of the last invocation of go mod vendor. (This implies that every module path and version that appears in the main module's go.mod file must be listed in modules.txt, even if no packages are vendored from within that module or version.)

    • To maintain compatibility with the Go 1.11 modules.txt parser, this annotation must have the following properties:

      • Does not change the number of fields in the # <module-path> <version> lines, which must be equal to 3 in order for the existing parser to recognize the line.
      • Does not occur as a line with exactly one field (which would be interpreted as a package path within the preceding module).
      • Either does not have the prefix # , or does not occur in between the # <module-path> <version> lines and the associated <package-path> lines. (Otherwise, it would break the association between modules and packages.)
    • To comply with the above requirements, I propose that we use a line of the form:

      ## explicit
      

      or

      ## replaced by example.com/some/module vX.Y.Z
      

      or

      ## explicit; replaced by example.com/some/module vX.Y.Z
      
    • To allow for future expansion, the updated parser should check lines beginning with ## for any semicolon-delimited field matching the expected annotation, ignoring any whitespace surrounding the field. (All other tokens within such comments are reserved for future use.)

  • When a top-level vendor/modules.txt file exists and contains at least one ## annotation, then for all build commands except go get:

    • Change the default value of the -mod build flag to vendor.

    • Verify that the set of modules annotated as explicit in vendor/modules.txt is exactly equal to the set of modules (and versions) found in require directives in the main module's go.mod file.

    • Verify that the set of modules annotated as replaced by a module in vendor/modules.txt is exactly equal to the set of replace directives (including both path and version, if applicable) in the main module's go.mod file.

  • Add a new value for the -mod flag, -mod=mod, to explicitly request the non-vendor behavior: that is, to ignore the contents of the vendor directory and to automatically update the go.mod file as needed.

    • The go get command today treats an explicit -mod=vendor flag as an error. With this change, go get should instead always provide the behavior of -mod=mod, treating the -mod flag as unrecognized (see #26850 (comment)).

      In particular, go get should ignore any implicit -mod=vendor from the presence of a vendor/modules.txt file or from a -mod flag set via GOFLAGS (see also #32502), and treat an explicit -mod=vendor, -mod=readonly, or -mod=mod as an error (see #30345).
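The annotation format described above can be illustrated with a small parser. This is only a sketch under the rules stated in the proposal (lines beginning with `##`, semicolon-delimited fields, surrounding whitespace ignored, unknown tokens reserved); the `annotation` type and `parseAnnotation` function are hypothetical names and are not taken from the go command's source.

```go
package main

import (
	"fmt"
	"strings"
)

// annotation holds what was recovered from one "## ..." line.
// The type and its field names are hypothetical.
type annotation struct {
	explicit   bool   // the "explicit" field was present
	replacedBy string // text following "replaced by", empty if absent
}

// parseAnnotation reports whether line begins with "##" and, if so,
// extracts the recognized semicolon-delimited fields. Fields that are
// not recognized are skipped, since they are reserved for future use.
func parseAnnotation(line string) (annotation, bool) {
	if !strings.HasPrefix(line, "##") {
		return annotation{}, false
	}
	var a annotation
	for _, field := range strings.Split(strings.TrimPrefix(line, "##"), ";") {
		field = strings.TrimSpace(field)
		switch {
		case field == "explicit":
			a.explicit = true
		case strings.HasPrefix(field, "replaced by "):
			a.replacedBy = strings.TrimSpace(strings.TrimPrefix(field, "replaced by "))
		}
	}
	return a, true
}

func main() {
	lines := []string{
		"## explicit",
		"## replaced by example.com/some/module vX.Y.Z",
		"## explicit; replaced by example.com/some/module vX.Y.Z",
		"# example.com/some/module vX.Y.Z", // a module line, not an annotation
	}
	for _, line := range lines {
		a, ok := parseAnnotation(line)
		fmt.Printf("%q: annotation=%v explicit=%v replacedBy=%q\n", line, ok, a.explicit, a.replacedBy)
	}
}
```

Because the loop skips anything it does not recognize, a later release could add new fields after a semicolon without breaking a parser written this way.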

The proposed approach allows the following commands to work from a vendor directory with an empty module cache and no network access:

  • go list -deps ./...
  • go build ./...
  • go test ./...

Other commands may fail without an explicit -mod=mod override:

  • go list all
  • go test all
  • go list -test $(go list -deps ./...)
  • go list -m example.com/some/module

Note that the go mod subcommands do not have a -mod= flag and therefore never look at the vendor directory for information. They always require access to a module cache or module proxy. The check in this proposal ensures that the results produced by any go mod subcommand remain accurate even when the corresponding build commands use vendored sources.
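As a rough illustration of the consistency check in the proposal, the comparison of `require` directives against `## explicit` annotations amounts to checking that two sets are exactly equal. The sketch below assumes both sides have already been parsed into maps from module path to version; `checkExplicit`, the message wording, and the module paths and versions in `main` are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// checkExplicit compares the modules required by go.mod with the modules
// marked explicit in vendor/modules.txt. Both maps go from module path to
// version. It returns one message per inconsistency, sorted for stable
// output; an empty result means the two sets are exactly equal.
func checkExplicit(required, vendored map[string]string) []string {
	var problems []string
	for path, want := range required {
		got, ok := vendored[path]
		switch {
		case !ok:
			problems = append(problems, fmt.Sprintf(
				"%s@%s: required in go.mod, but not marked explicit in vendor/modules.txt", path, want))
		case got != want:
			problems = append(problems, fmt.Sprintf(
				"%s: go.mod requires %s, but vendor/modules.txt lists %s", path, want, got))
		}
	}
	for path, got := range vendored {
		if _, ok := required[path]; !ok {
			problems = append(problems, fmt.Sprintf(
				"%s@%s: marked explicit in vendor/modules.txt, but not required in go.mod", path, got))
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	// Illustrative data only: go.mod was upgraded, but go mod vendor
	// has not been re-run yet.
	required := map[string]string{"example.com/a": "v1.1.0"}
	vendored := map[string]string{"example.com/a": "v1.0.0"}
	for _, p := range checkExplicit(required, vendored) {
		fmt.Println(p)
	}
}
```

The check reads only two small inputs, so it needs neither the module cache nor network access. The same comparison would apply to `replace` directives and `replaced by` annotations.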

Example workflow

# Upgrade the golang.org/x/sys module and fetch it from $GOPROXY.
# Also ensure that the rest of the module dependencies are consistent
# with the requirements of the upgraded module.
# This step updates the go.mod file, but not the vendor directory.
$ go get -d golang.org/x/sys

# A plain 'go test ./...' at this point may fail,
# because vendor/modules.txt may be out of sync with the go.mod file.
# However, 'go test -mod=mod ./...' should succeed,
# because it explicitly ignores the vendor directory.
$ go test -mod=mod ./...

# Download any newly-upgraded dependencies to the module cache,
# and copy them from the module cache to the vendor directory.
$ go mod vendor

# Test the packages within the module
# using the newly-updated dependencies in the vendor directory.
$ go test ./...

Caveats

Intentional modifications

The above checks do not detect intentional modifications to the vendored sources. Users who want to ensure that no intentional modifications are present will need to re-run go mod vendor and look for diffs, or (after #27348 is addressed) run go mod verify.

Local filesystem changes

The above checks also do not detect skew due to changes in local files within a directory specified in a replace directive. I do not recommend the use of filesystem replace directives in general, but the potential for skew makes them particularly important to avoid when using a vendor directory.

If we believe that changes in the local filesystem will be a significant source of skew in practice, we could expand the replaced by annotation to include a hash of the replacement directory. However, computing that hash could significantly increase the cost of the consistency check, which in the current proposal only requires us to read the go.mod and vendor/modules.txt files.

Dependency analysis

Because the approach proposed here does not retain go.mod files for modules that do not contribute packages to the build (including older-than-selected versions of modules that do contribute packages), the vendor directory will not support analysis of the module graph (go list -m all, go mod why -m, and go mod graph).

That seems like an acceptable tradeoff: adding those go.mod files would require some scheme for encoding their versions within the vendor directory, which we do not have today, and would introduce diffs that are largely irrelevant to the build during code review.

@bcmills

Member Author

commented Aug 27, 2019

One point I'm a bit uncertain on: above I proposed to trigger the automatic -mod=vendor behavior based on the presence of a ## annotation in the vendor/modules.txt file.

Would it be better to instead trigger it based on a go 1.14 directive in the go.mod file (plus the existence of vendor/modules.txt)? (Why or why not?)

@thepudds


commented Aug 27, 2019

That seems like it would be simpler to understand, including because I think it has been previously stated that the format of the modules.txt file is an internal and undocumented implementation detail?

Could you even take it a step further to have it trigger off of the presence of the vendor directory itself plus go 1.14 in the go.mod, rather than vendor/modules.txt plus go 1.14?

It would then be triggering off of things that are both more visible and actually documented, and hence would likely be easier to understand.

@ardan-bkennedy


commented Sep 18, 2019

I propose to enable -mod=vendor automatically, and to fail build operations when -mod=vendor is enabled if the vendor/ directory is out of sync with the go.mod file.

When I build in a container, I will be copying the entire project's source tree into that container manually. This will cause things to be out of sync. So now, I will have to make sure that I don't copy the go.mod and go.sum files into the container.

This also means that anyone else on my team working with this project will need to run through the proxy server to validate that their vendor folder is "correct". I am concerned that in the end, the vendor folder is not really acting as a vendor folder. If the module system doesn't like what it sees in vendor, the build breaks. The deps need to be cached anyway. So I might as well not even use vendor.

@bcmills

Member Author

commented Sep 18, 2019

When I build in a container, I will be copying the entire project's source tree into that container manually. This will cause things to be out of sync. So now, I will have to make sure that I don't copy the go.mod and go.sum files into the container.

I think I'm missing something. How would copying the entire source tree into the container cause the vendor directory to get out of sync with the go.mod file? Presumably you'd be copying in the same go.mod file that you used to generate the vendor directory in the first place.

@ardan-bkennedy


commented Sep 18, 2019

How are you going to verify that the vendor folder and the go.mod file are in sync? I assume this is going to require network calls against the proxy server? I am wondering whether the deps are going to be downloaded as well during this process.

@bcmills

Member Author

commented Sep 18, 2019

@ardan-bkennedy, that's all covered in the text of the proposal above (#33848 (comment)).

@ardan-bkennedy


commented Sep 18, 2019

I read that several more times and I think it’s starting to make sense. What the above proposal is saying is, given a vendor folder with a modules.txt file, just verify that this file is in sync with the go.mod file? If they are, build against the vendor folder. If they are not, issue a build error? Therefore no calls to the proxy server are required. The original workflow is maintained?

@bcmills

Member Author

commented Sep 18, 2019

Yep. Being able to do that without hitting the network at all requires that we include some extra information in the vendor/modules.txt file compared to what we're writing today, but the amount of extra information seems reasonable to me, and the resulting file should be backward-compatible to earlier Go releases.

@ardan-bkennedy


commented Sep 18, 2019

That's brilliant and allows vendoring to be an option again. Which is much appreciated. So the next question is, will this make a point release for 1.13 or do we need to wait for 1.14?

My concern is that projects may just give up on vendoring because of the current breakage in workflows. Which is a shame because I think vendoring has strong advantages when it's reasonable to use it.

@bcmills

Member Author

commented Sep 18, 2019

This would not be backported to 1.13. (Per the minor release policy, we only backport “security issues, serious problems with no workaround, and documentation fixes”.)

In the interim, users who rely on vendoring workflows can enable vendoring for all build commands by setting GOFLAGS=-mod=vendor, explicitly disable module mode in Go 1.13 by setting GO111MODULE=off, or implicitly disable module mode within GOPATH/src by staying on Go 1.12 until they're ready to try a 1.14 pre-release.

@rsc

Contributor

commented Sep 19, 2019

It sounds like the general consensus here is that we should accept this proposal for Go 1.14.
This is probably important to do before we set GO111MODULE=on for people using the dev branch.

Am I reading this wrong? Does anyone object to accepting this proposal? Thanks.
