cmd/go: add modvendor sub-command #27618

Open
myitcv opened this issue Sep 11, 2018 · 18 comments
Labels
modules NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

@myitcv
Member

myitcv commented Sep 11, 2018

Creating this issue as a follow up to #26366 (and others).

go mod vendor is documented as follows:

Vendor resets the main module's vendor directory to include all packages
needed to build and test all the main module's packages.
It does not include test code for vendored packages.

Much of the surprise in #26366 comes about because people are expecting "other" files to also be included in vendor.

An alternative to the Go 1.5 vendor is to instead "vendor" the module download cache. A proof of concept of this approach is presented here:

https://github.com/myitcv/go-modules-by-example/blob/master/012_modvendor/README.md

Hence I propose go mod modvendor, which would be documented as follows:

Modvendor resets the main module's modvendor directory to include a 
copy of the module download cache required for the main module and its 
transitive dependencies.

The name and the documentation are clearly not final.
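As a rough illustration of what such a directory might hold, here is a minimal sketch that lists the files describing one module version. It assumes modvendor would mirror the layout the go command uses for its download cache and for GOPROXY (`<module>/@v/<version>.info`, `.mod` and `.zip`, with each upper-case letter escaped as `!` plus its lower-case form). The names `escape` and `cacheFiles` are hypothetical and not part of cmd/go, and a real cache may contain further files.

```go
package main

import (
	"fmt"
	"strings"
)

// escape applies the case-encoding used for module paths and versions in the
// download cache: each ASCII upper-case letter becomes '!' followed by its
// lower-case form, so paths stay distinct on case-insensitive file systems.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// cacheFiles lists the files, relative to the (hypothetical) modvendor root,
// that describe one module version in the download-cache layout.
func cacheFiles(modPath, version string) []string {
	base := escape(modPath) + "/@v/" + escape(version)
	return []string{base + ".info", base + ".mod", base + ".zip"}
}

func main() {
	for _, f := range cacheFiles("github.com/BurntSushi/toml", "v0.3.1") {
		fmt.Println(f)
	}
}
```

Because the `.zip` holds the whole module, every file of the dependency is present, which is the property the proposal relies on.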

Benefits (WIP)

  • Eliminates any potential confusion around what is in/not in vendor
  • Easier to contribute patches/fixes to upstream module authors (via something like gohack), because the entire module is available
  • The modules included in modvendor are an exact copy of the original modules. This makes it easier to check their fidelity at any point in time, with either the source or some other reference (e.g. Athens)
  • Makes clear the source of modules, via the use of GOPROXY=/path/to/modvendor. No potential for confusion like "will the modvendor of my dependencies be used?"
  • A single deliverable
  • Fully reproducible and high fidelity builds (modules in general gives us this, so just re-emphasising the point)
  • ...

Costs (WIP)

  • The above steps are currently manual; tooling (the go tool?) can fix this
  • Reviewing "vendored" dependencies is now more involved without further tooling. For example it's no longer possible to simply browse the source of a dependency via a GitHub PR when it is added. Again, tooling could help here. As could some central source of truth for trusted, reviewed modules (Athens? cc @bketelsen @arschles)
  • ...

Related discussion

Somewhat related to discussion in #27227 (cc @rasky) where it is suggested the existence of vendor should imply the -mod=vendor flag. The same argument could be applied here, namely the existence of modvendor implying the setting of GOPROXY=/path/to/modvendor. This presupposes, however, that the idea of modvendor makes sense in the first place.
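For illustration, the GOPROXY setting mentioned above can be sketched in Go. GOPROXY takes a URL, so a local directory is normally given in `file://` form. The helper name `proxyEnv` is hypothetical, and the sketch assumes a Unix-style absolute path.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// proxyEnv returns a GOPROXY environment entry that points at a local
// directory. Assumption: dir resolves to a Unix-style absolute path.
func proxyEnv(dir string) (string, error) {
	abs, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	u := url.URL{Scheme: "file", Path: filepath.ToSlash(abs)}
	return "GOPROXY=" + u.String(), nil
}

func main() {
	env, err := proxyEnv("/path/to/modvendor")
	if err != nil {
		panic(err)
	}
	fmt.Println(env)
}
```

The returned entry could be appended to the environment of a go command started from a build script, which is roughly what "the existence of modvendor implying GOPROXY" would automate.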

Background discussion:

https://twitter.com/_myitcv/status/1038885458950934528

cc @StabbyCutyou @fatih

cc @bcmills

@bcmills
Contributor

bcmills commented Sep 11, 2018

I don't think the proposed "resets the main module's modvendor directory" behavior is quite the right workflow.

One of the benefits of versioned modules over vendoring is that they can reduce redundancy globally: instead of N copies of the same code spread across N repos, we can have a single canonical copy shared by all builds of those repos. A per-module modvendor cache would revert that advantage.

@bcmills
Contributor

bcmills commented Sep 11, 2018

Instead, perhaps we should make it easier to maintain per-user or per-organization module proxies.

For example, we could add an optional argument to go mod download to tell it where to save the downloaded modules.

go mod download $path could copy all active modules to $path, and go mod verify $path could verify that the modules already stored in $path match the go.sum of the current module. Then, the modvendor operation would essentially be:

go mod download $GOPROXY
go mod verify $GOPROXY

Then the user could commit the contents of $GOPROXY to a separate (personal or org-wide) repository.
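Neither `go mod download $path` nor `go mod verify $path` exists in this form; both are suggestions. As a sketch of the first step such a verification would need, the following parses go.sum into the entries that would have to be checked against the files in $path. The type `sumEntry` and the function `parseGoSum` are hypothetical names, and the hashes in the example are placeholders rather than real checksums. The hash comparison itself is left out.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sumEntry is one go.sum line: "<module> <version>[/go.mod] <hash>".
type sumEntry struct {
	Module  string
	Version string
	GoMod   bool // true when the hash covers only the module's go.mod file
	Hash    string
}

// parseGoSum splits go.sum content into entries, skipping blank lines.
func parseGoSum(data string) ([]sumEntry, error) {
	var entries []sumEntry
	sc := bufio.NewScanner(strings.NewReader(data))
	for n := 1; sc.Scan(); n++ {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		if len(fields) != 3 {
			return nil, fmt.Errorf("line %d: want 3 fields, got %d", n, len(fields))
		}
		e := sumEntry{Module: fields[0], Version: fields[1], Hash: fields[2]}
		if v := strings.TrimSuffix(e.Version, "/go.mod"); v != e.Version {
			e.Version, e.GoMod = v, true
		}
		entries = append(entries, e)
	}
	return entries, sc.Err()
}

func main() {
	// Placeholder hashes, not real checksums.
	const goSum = "example.com/dep v1.2.3 h1:placeholder1=\n" +
		"example.com/dep v1.2.3/go.mod h1:placeholder2=\n"
	entries, err := parseGoSum(goSum)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		file := "zip"
		if e.GoMod {
			file = "mod"
		}
		fmt.Printf("%s@%s: check .%s against %s\n", e.Module, e.Version, file, e.Hash)
	}
}
```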

@bcmills bcmills added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Sep 11, 2018
@bcmills bcmills added this to the Go1.12 milestone Sep 11, 2018
@flibustenet

We should also do the opposite, to fill the cache from a downloaded directory.

$ go mod download -export $path 

Then somewhere else, maybe on another machine:
$ go mod download -import $path
That will fill the cache.

@bcmills
Contributor

bcmills commented Sep 11, 2018

@flibustenet GOPROXY already does the opposite: GOPROXY=$path go mod download populates the active modules into the user's module cache from an arbitrary directory.

We don't currently have a command that populates more than the active modules, but that seems like a job for rsync or git rather than go itself.

@myitcv
Member Author

myitcv commented Sep 11, 2018

@bcmills

I don't think the proposed "resets the main module's modvendor directory" behavior is quite the right workflow.

I think there are actually two use cases here:

  1. "vendoring" all dependencies within the same repo as the module(s) that depend on them
  2. a per-user/organisation module proxy repo, separate from the repo(s) that use it

I should update the description to make clear that this issue is trying to address point 1. Hence why I think the logic to "reset the main module's modvendor directory" is correct; because I don't want this directory to grow like a cache.

Point 2 is the approach I've taken with https://github.com/myitcv/cachex, which is the "organisation repo" for https://github.com/myitcv/x, my mono repo. In this case, https://github.com/myitcv/cachex is an append-only repo that is a cache, and hence grows over time. It's separate from (and a subset of) $GOPATH/pkg/mod/cache/download because that can (and does) include downloads of private repos that I don't want made public. As you say, this approach reduces redundancy. Your proposal of go mod download $path is effectively what I do via bash with a GOPROXY+GOPATH+rsync dance; in this situation, I agree, I don't want the reset semantics.

But I can see use cases (i.e. deploying code or similar) where there is real benefit in point 1, for everything to be "bundled" (in the same repo).

Assuming we want to address/support both use cases (and it seems sensible to my mind to do so), they could be solved by the same sub-command; I'm certainly not precious about that 😄. But I think there are separate use cases to cover here.
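The difference between the two use cases can be sketched as a single planning step with a switch for "reset" semantics. This is only an illustration under assumed names (`syncPlan` is hypothetical, and entries are written as plain `module@version` strings); it is not how cmd/go behaves.

```go
package main

import (
	"fmt"
	"sort"
)

// syncPlan compares the entries already in the directory with the entries the
// current build needs. With reset semantics (use case 1), entries that are no
// longer needed are removed. Without it (use case 2, an append-only cache),
// nothing is ever removed.
func syncPlan(existing, required []string, reset bool) (add, remove []string) {
	have := make(map[string]bool, len(existing))
	for _, e := range existing {
		have[e] = true
	}
	need := make(map[string]bool, len(required))
	for _, r := range required {
		if !have[r] && !need[r] {
			add = append(add, r)
		}
		need[r] = true
	}
	if reset {
		for _, e := range existing {
			if !need[e] {
				remove = append(remove, e)
			}
		}
	}
	sort.Strings(add)
	sort.Strings(remove)
	return add, remove
}

func main() {
	existing := []string{"example.com/a@v1.0.0", "example.com/b@v1.0.0"}
	required := []string{"example.com/a@v1.1.0", "example.com/b@v1.0.0"}

	add, remove := syncPlan(existing, required, true)
	fmt.Println("reset:      ", add, remove)

	add, remove = syncPlan(existing, required, false)
	fmt.Println("append-only:", add, remove)
}
```

With reset semantics the directory tracks exactly what the main module needs; with append-only semantics old versions accumulate, as in a shared cache.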

@bcmills
Contributor

bcmills commented Sep 11, 2018

I can see use cases (i.e. deploying code or similar) where there is real benefit in point 1, for everything to be "bundled" (in the same repo).

I'm not certain about those cases one way or the other. Given versioning, it seems like you can address all of the same use-cases — and more! — using a separate repository. If folks are doing the cost/benefit analysis and coming to a different conclusion, I'd like to see more of the details of the costs and benefits involved (beyond just “that's the way we've done things without versioning”).

@myitcv
Member Author

myitcv commented Sep 11, 2018

If folks are doing the cost/benefit analysis and coming to a different conclusion, I'd like to see more of the details of the costs and benefits involved (beyond just “that's the way we've done things without versioning”).

I'd second this request because, unless it wasn't clear already, I'm a fan of point 2.

I'm only putting up point 1 as a "better" alternative to go mod vendor (better in the sense that it doesn't suffer from the pitfalls associated with #26366 amongst other things). But, and I totally grant you this, I haven't articulated all (any?) of the costs associated with keeping workflows oriented around a single repo, a la vendor.

@bcmills
Contributor

bcmills commented Sep 12, 2018

Hmm. With the go mod download $path approach, it's still possible to put $path in the same repository (cutting it off from the modules in that repo using an explicit go.mod file, or perhaps with a well-known subdirectory such as vendor/mod/ or vendormod/), and you can even unpack it easily with a single command (GOPROXY=$path go mod vendor).

@myitcv
Member Author

myitcv commented Sep 12, 2018

Yes absolutely; I think the only difference between these two use cases is the use of "reset" semantics or not.

@sanguohot

Sharing modules is very important, but there will still be some cases where sharing is not wanted.
A little like npm without the -g flag.

@rsc
Contributor

rsc commented Oct 24, 2018

Replying to the original benefits:

  • Eliminates any potential confusion around what is in/not in vendor

Having two ways to populate vendor does not seem like it would eliminate confusion.

  • Easier to contribute patches/fixes to upstream module authors (via something like gohack, https://github.com/rogpeppe/gohack), because the entire module is available

We should address gohack, but modvendor does not seem like the right way to do it.

  • The modules included in modvendor are an exact copy of the original modules. This makes it easier to check their fidelity at any point in time, with either the source or some other reference (e.g. Athens)

It would be better to make go verify work with the pruned vendor directories, if that's a concern.

  • Makes clear the source of modules, via the use of GOPROXY=/path/to/modvendor. No potential for confusion like "will the modvendor of my dependencies be used?"

This is doubling down on vendor. We want to move in the opposite direction.

  • A single deliverable

I don't know what this means.

  • Fully reproducible and high fidelity builds (modules in general gives us this, so just re-emphasising the point)

No actual benefit here, right?

I don't see what the problem is here, really, and I think it's very important not to pull in the entire module just to get one package. Because you're not just pulling in that one module, you're pulling in (at least references to) its dependencies.

@myitcv
Member Author

myitcv commented Oct 28, 2018

Thanks for the reply @rsc. Taking your responses slightly out of order:

Easier to contribute patches/fixes to upstream module authors (via something like gohack, https://github.com/rogpeppe/gohack), because the entire module is available

We should address gohack, but modvendor does not seem like the right way to do it.

Agreed, this doesn't make sense to solve with modvendor; not sure what I was thinking here. gohack get has a -vcs flag for just this purpose.

This is doubling down on vendor. We want to move in the opposite direction.

Just to be clear, I'm also trying to move away from vendor (the vendor directory as in the Go 1.5 definition) and the concept of "vendoring" more generally (and modvendor falls into this bucket), because there are better solutions to the problems that vendor/"vendoring" try to solve.

My thinking was that something like modvendor could be a useful stepping stone away from the vendor directory to proxies etc.

Eliminates any potential confusion around what is in/not in vendor

Having two ways to populate vendor does not seem like it would eliminate confusion.

modvendor uses a modvendor directory, not the vendor directory. The thinking being that a differently named directory forces the user to ask "what can I expect to be in modvendor" as opposed to being confused on "what is in vendor."

A single deliverable

I don't know what this means.

Poorly worded. One of the main reasons people like the vendor directory is that it removes service/network dependencies beyond the initial clone, there is nothing else to configure, no second repository to commit etc. modvendor achieves a similar effect - there is just one thing in play.

Fully reproducible and high fidelity builds (modules in general gives us this, so just re-emphasising the point)

No actual benefit here, right?

Agreed, if we can get go verify to work on the contents of the vendor directory. The only minor point I was making here was that it's very easy to modify the contents of your vendor directory and not run go verify either locally or enforce it as part of CI. It's harder to modify the contents of modvendor in the first instance. Case in point being https://github.com/goware/modvendor et al which exists to copy additional files to the vendor directory, files that are already in the module.

I don't see what the problem is here, really, and I think it's very important not to pull in the entire module just to get one package. Because you're not just pulling in that one module, you're pulling in (at least references to) its dependencies.

At least the way I intended to implement my trial of modvendor was to only pull in the modules that are required, so hopefully I only pull in references to their dependencies.

But I'm quite prepared to accept that modvendor might not be the right or even a necessary stepping stone.

@thanasik

thanasik commented Apr 3, 2019

We have this issue as well: the .c and .h files from dependencies that use cgo are not vendored, so builds fail. We work around it by running modvendor right after go mod vendor, but this should be standard behavior, since builds that depend on vendored non-Go files fail without it.

@dryaf

dryaf commented Jun 13, 2019

The .c and .h files are missing, for example in gopkg.in/goracle.v2, after running go mod vendor, and then go build -mod vendor . fails.

go mod vendor should just copy the cache. In case somebody has a problem with that for some reason, a flag such as --ignore-tests could help. But some might also like to run the tests of their dependencies in CI.

@nomad-software

@dryaf @karysto Try https://github.com/nomad-software/vend which will vend everything.

@anjmao

anjmao commented Sep 22, 2019

@myitcv I suggest keeping it simpler and not introducing new subcommands. Maybe go mod vendor -a will do the job.

Regarding #26366: I tried to migrate one of our Go projects to Go modules. Since it uses some C dependencies and gomobile does not work with Go modules, I thought I would just use go mod vendor and it would do the same job that dep does, but... well... it doesn't. My initial thinking was that go mod vendor was introduced to ease migration to Go modules, but it looks like we still need to write custom tools like vend again and again.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@gunsluo

gunsluo commented Feb 28, 2020

@anjmao I agree with your suggestion and hope the problem can be solved in a simple way. Personally, I think go mod vendor -a is a good choice.
@myitcv My Go project uses some C dependencies and template files from other packages in its modules. When I ran go mod vendor, it ignored the .c, .h and .tpl files, and I had to copy them into my project manually. If go mod vendor -a $package copied all files of $package into vendor, that would be the result I expect.
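The manual copying step described in these comments could be sketched as a filter over a module's file list. The extensions are taken from the comment above, the function name `extraFiles` and the file names are hypothetical, and this is not how go mod vendor or the suggested -a flag is implemented.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// extraFiles returns the files whose extension is in exts, compared
// case-insensitively. Extensions include the leading dot, e.g. ".c".
func extraFiles(files []string, exts ...string) []string {
	want := make(map[string]bool, len(exts))
	for _, e := range exts {
		want[strings.ToLower(e)] = true
	}
	var out []string
	for _, f := range files {
		if want[strings.ToLower(path.Ext(f))] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	// Hypothetical contents of one dependency.
	files := []string{"driver.go", "odpi/src/conn.c", "odpi/include/dpi.h", "views/page.tpl", "LICENSE"}
	for _, f := range extraFiles(files, ".c", ".h", ".tpl") {
		fmt.Println("copy to vendor:", f)
	}
}
```

A tool built on this idea would then copy each selected file to the matching path under vendor after go mod vendor has run.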

@rezaalavi

We are facing the same problem. For legal reasons, we need to include the license files of some packages in the vendor directory.
