cmd/go: allow proxies to supply only some modules #26334

Open
arschles opened this Issue Jul 11, 2018 · 14 comments


arschles commented Jul 11, 2018

This is a proposal for extending the vgo download API to add a mechanism to allow proxies to redirect vgo to a VCS. This functionality would be useful if a proxy has only a subset of packages in a go.mod file.

Currently, if someone adds two modules "a" v1.0.0 and "b" v1.0.0 to their go.mod file and they then run GOPROXY=myprox.com vgo install, everything works as expected if the proxy has both modules at the given versions. If not, the command fails.

It would be helpful to allow the proxy to tell vgo to fetch one or both of the modules from the VCS if it doesn't have them in its cache. This would be useful for proxy implementations where the proxy will not or cannot cache the module in its own storage, or where it can but doesn't yet have that module/version in its cache. The Athens project is a present-day use case for the latter: it fills its caches asynchronously.

One implementation possibility is adding a $GOPROXY/a/@v/v1.0.0?go-get=1 network request that expects the same output that existing ?go-get=1 requests already define. This request could be made before starting the download protocol as it currently exists. The new mechanism would allow the proxy to choose one of the following for the given module and version identifier:

  1. Send vgo to the standard download API via the meta tag
  2. Send vgo to the VCS via the meta tag
  3. Return a 404

arschles commented Jul 11, 2018

Update after slack discussion with @myitcv and @zeebo:

  1. we could certainly do cache fills synchronously
  2. there could be some "gotchas" with hosting the proxy, like timeouts when doing a cache fill for a large module (e.g. github.com/kubernetes/kubernetes)
  3. assuming the proxy synchronously fills caches, the user experience would be about the same as if vgo get downloaded from VCS
  4. if the proxy has a partial failure and cannot serve code, GOPROXY=myprox.com vgo get will always fail with the current behavior

Another concern that was brought up is that, with GOPROXY set, the current behavior of GOPROXY=myprox.com vgo get github.com/kubernetes/kubernetes doesn't let the proxy redirect vgo to another location (e.g. a CDN); the proxy still has to serve the metadata and the zipped source code itself.

Contributor

marwan-at-work commented Jul 11, 2018

The proxy can redirect vgo to another URL as long as that URL implements the download protocol. That's because the net/http client handles redirect status codes internally (following up to 10 redirects by default).
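To illustrate the redirect point, here is a minimal sketch using only the standard library. The two test servers, the module path example.com/a, and the helper names fetchViaProxy and newServers are assumptions made for the example; this is not code from vgo. It shows that a plain http.Get against a proxy that answers with a 302 ends up with the body served by the redirect target.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchViaProxy requests path from the proxy at base and returns the body.
// The default http.Client follows redirects on its own, so the caller never
// sees the 302 issued by the proxy.
func fetchViaProxy(base, path string) (string, error) {
	resp, err := http.Get(base + path)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// newServers starts a hypothetical CDN that serves a version list and a
// hypothetical proxy that redirects every request to that CDN.
func newServers() (proxy, cdn *httptest.Server) {
	cdn = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "v1.0.0\n")
	}))
	proxy = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, cdn.URL+r.URL.Path, http.StatusFound)
	}))
	return proxy, cdn
}

func main() {
	proxy, cdn := newServers()
	defer proxy.Close()
	defer cdn.Close()

	list, err := fetchViaProxy(proxy.URL, "/example.com/a/@v/list")
	fmt.Printf("%q %v\n", list, err)
}
```

Note that this only covers a redirect to another server speaking the same download protocol; it does not address switching to a VCS.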

The big question here is: can the GOPROXY tell Vgo (during a build) to go use a VCS source instead of the proxy itself? Which is a little different than a simple redirect to another Download Protocol enabled URL.

rsc modified the milestones: vgo, Go1.11 Jul 12, 2018

rsc added the modules label Jul 12, 2018

rsc changed the title from "x/vgo: Allow proxies to send vgo to the VCS for a package" to "cmd/go: Allow proxies to send vgo to the VCS for a package" Jul 12, 2018

Contributor

marwan-at-work commented Jul 12, 2018

@rsc How do you envision the flow of the Proxy telling Vgo to switch to VCS?

I'm trying to follow the vgo code and it seems that on an initial build, the initial contact with the proxy's download protocol can be any of these endpoints: /@latest or /list or /@v/{certainRevision}.info.

This means that the proxy would need to return a consistent code (maybe a 404) on each of these endpoints to signal vgo to reconstruct the modfetch.Repo interface from a vcs source instead of the *proxyRepo one. Vgo would then have to back up a few steps and retry again.

Vgo can potentially always hit /list as the first entry point to the download protocol, and if the list is empty then switch to vcs. Or it can always hit @latest first and if it returns 404 then switch to vcs.

Another solution is for the Download Protocol to implement a @probe call with a couple of parameters (module path and revision), so that the proxy can tell vgo early on to just go to the VCS source for this particular module.

For efficiency, vgo can potentially send the entire list of modules it wants to probe to the proxy.

I haven't fully understood the last 2 days worth of changes to vgo so apologies if I'm a bit off.

Contributor

rsc commented Jul 17, 2018

I don't think it makes sense for a proxy to tell the go command "go to this VCS instead". We're trying to migrate to proxy by default and while VCS will probably always be with us, I'd rather not mix the two.

I do think it would probably be OK to let GOPROXY be a preference list and to also allow some setting like GOPROXY=direct as an explicit name for what the default behavior is. So you could say GOPROXY=https://myproxy/,direct and just let myproxy return a 404 for the things it doesn't know about. Then the proxy isn't in charge of the actual redirect; it's only in charge of "it's not me".

rsc changed the title from "cmd/go: Allow proxies to send vgo to the VCS for a package" to "cmd/go: allow proxies to supply only some modules" Jul 17, 2018

Contributor

marwan-at-work commented Jul 17, 2018

@rsc that's about what I had in mind. The proxy shouldn't say "go to this vcs", it can just say "I don't have this module"

Does that mean the /@probe endpoint makes sense? Since that means Go can just ask the proxy whether it can work with a specific module before it even asks for @latest or /list.

I've altered the Go code if you'd like to look at a reference of what I'm suggesting: marwan-at-work@3767be8

Contributor

rsc commented Jul 18, 2018

@marwan-at-work I'd rather not have newProxyRepo make any network calls. It turns out to be important to delay those as long as possible. I'd rather have the existing GET paths return 404s and then have the methods be able to return some kind of recognizable "not found error" (maybe satisfying os.IsNotExist is enough) and then something at a higher level will try the next repo method down the list. I think we should wait until Go 1.12 regardless.

rsc modified the milestones: Go1.11, Go1.12 Jul 18, 2018

Contributor

marwan-at-work commented Jul 18, 2018

@rsc That makes sense since cmd/go checks the cache before making network calls. I'm happy to take on this task if you'd like me to as I'll try to make it work for Athens in the near future.

Either way, I'm happy to know the proxy can dynamically delegate module fetching back to Go.

Thanks!

Contributor

marwan-at-work commented Jul 24, 2018

@rsc I've made another pass at making this work. This time, we won't hit the network until necessary, and we won't need a @probe endpoint either. I'm hoping this change is not too invasive for 1.11.

The idea is that a *proxyRepo can take an alternative Repo interface that it can switch to in case of 404 (or other future codes). Similar to how *cachingRepo works.

Feel free to take a look if you get the chance marwan-at-work@5117c8c

I see other ways of doing this, such as having a top level Repo interface that accepts a slice of Repos and just tries one at a time in order: (cache, proxy, vcs, etc)

So my solution above may still not be how you'd like to solve this problem, but I would love to hear your thoughts.

Thanks :)

Contributor

hyangah commented Nov 2, 2018

Since this issue discusses a change to GOPROXY to allow a list of proxies or the direct method:

A slightly different case I am thinking of is when the proxy server I use by default is temporarily unreachable or unavailable (possibly in the middle of fetching all dependencies) and we can't even get 404s. Rerunning go get with GOPROXY=direct upon network failure is an option once the failure is noticed, but I would be happier if I could specify a set of proxy servers, or even a 'direct' fetch option, as a fallback.

But @bcmills raised a concern about leaking private package paths in the event of a private proxy failure if we just blindly fall back to the next (direct) option.

Contributor

marwan-at-work commented Nov 2, 2018

@hyangah I'm concerned with blindly falling back to git for two reasons:

  1. The proxy won't be able to block certain modules from being downloaded (maybe a company doesn't want their employees to use a certain open source package)
  2. If a proxy is down, maybe a build shouldn't happen at all so that the user is aware of what's happening. And of course, the user can explicitly unset the GOPROXY environment variable.

marwan-at-work added a commit to marwan-at-work/go that referenced this issue Nov 2, 2018

cmd/go: fallback to VCS if GOPROXY 404s
This change makes it so that a Go Modules proxy can return a 404
http status code which would signal cmd/go to end up fetching the
module from VCS instead. Note, any other status besides 404 and 200
will still fail.

Fixes golang#26334

Change-Id: Idcb0409d7fa52c8dc2601aec7f8a4274550afe3a

Contributor

hyangah commented Nov 5, 2018

@marwan-at-work The code freeze date is today. Do you plan to mail in the change for review as described in the contribution guidelines? https://golang.org/doc/contribute.html#sending_a_change_github

gopherbot commented Nov 5, 2018

Change https://golang.org/cl/147177 mentions this issue: cmd/go: fallback to VCS if GOPROXY 404s

Member

bcmills commented Nov 5, 2018

https://golang.org/cl/147177 was mailed before the freeze, so I think this will make the cut.

gopherbot commented Nov 8, 2018

Change https://golang.org/cl/148377 mentions this issue: cmd/go: allow comma separated GOPROXY URLs.
