Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: support GOPROXY fallback on unexpected errors #37367

Open
heschik opened this issue Feb 21, 2020 · 17 comments
Open

cmd/go: support GOPROXY fallback on unexpected errors #37367

heschik opened this issue Feb 21, 2020 · 17 comments
Assignees
Milestone

Comments

@heschik
Copy link
Contributor

@heschik heschik commented Feb 21, 2020

In #26334, particularly #26334 (comment), we decided that the go command should only try the the next proxy in GOPROXY if it received a deliberate not found (404/410) response. This prevents unwanted leakage of private module paths in case a private proxy has an outage -- if the go command tried the next proxy on 500s, it would leak the request to proxy.golang.org or any other public proxy in the chain.

For public proxies like proxy.golang.org, this argument doesn't apply. Falling back from a public proxy, particularly to direct, should be much less risky. The only information leaked is that a particular IP address wants a public module, which must by definition have been public to be served by the public proxy in the first place. And that's not a new risk -- public proxies are free to serve a 404 whenever they want.

Therefore, allowing fallback on all errors would improve reliability of the ecosystem with only minimal costs. As a strawman, we could support | delimiters, which would be used like:

GOPROXY=goproxy.corp,proxy.golang.org|direct

meaning to require an affirmative response from goproxy.corp, then try proxy.golang.org and fall back to direct on any failure, expected or otherwise. Precisely, | after an entry means "if the prior entry fails in any way, continue to the next entry".

The default value of GOPROXY would presumably then change to GOPROXY=proxy.golang.org|direct.

@FiloSottile just in case there are security implications, but I'm pretty sure sum.golang.org covers this same as it does anything else.

cc @jayconrod @bcmills @katiehockman @hyangah

@heschik heschik added this to the Go1.15 milestone Feb 21, 2020
@FiloSottile

This comment has been minimized.

Copy link
Member

@FiloSottile FiloSottile commented Feb 21, 2020

This is not a security no-op, although it might be worth the tradeoff. Is it actually common for proxy.golang.org to fail while sum.golang.org is available?

The difference is that a network adversary becomes capable of causing a fallback to direct. While you're right that this does not impact the privacy of the module names themselves, nor the integrity guarantees of the checksum database, it makes it possible for the code host to observe who is using what. Moreover, list and latest results are not protected by the sumdb, and with this change an attacker would get to hide versions without causing errors by colluding with the host.

It also removes any fallback security provided by the TLS connection to the proxy should the checksum database fail to provide its intended security properties, or should it be disabled. For example, I think we disabled the sumdb in a fetch made by a test, because we are ok with trusting the proxy. If the fallback strategy changes, that would be a very different decision.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Feb 24, 2020

Looking at my notes, I think @rsc and I discussed this sort of fallback back in January and settled on essentially the same design.

We observed that, semantically, , indicates a “definitive” proxy (that is, one that should control all module paths), while | indicates a "non-definitive" proxy (that is, one that does not control module paths per se).

From that perspective, the last entry in the list lacks a token because it is always definitive: if the last entry does not or cannot provide the requested path, then that path is de facto unavailable.

@jayconrod

This comment has been minimized.

Copy link
Contributor

@jayconrod jayconrod commented Feb 24, 2020

Should we fall back after a timeout if a proxy doesn't respond? That seems like another important failure mode.

Also, should we fall back after non-HTTP errors, like if we fail to connect or if the proxy returns an invalid response? Assuming yes.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Feb 24, 2020

I would prefer not to hard-code any timeouts in the fallback algorithm proper. We can't in general assume that the user is on a fast, reliable network connection, and the decision about when a network connection has actually failed is best left to a lower level (such as the default net/http configuration, which may reasonably initialize itself from some system-specific configuration).

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Mar 5, 2020

It seems clear we should do this, because we need to allow CI and other automated systems to fall back if the proxy or the network to the proxy is unavailable for some reason.

@rsc rsc added the NeedsFix label Mar 5, 2020
@gopherbot gopherbot removed the NeedsDecision label Mar 5, 2020
@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Mar 5, 2020

re @FiloSottile:

This is not a security no-op, although it might be worth the tradeoff. Is it actually common for proxy.golang.org to fail while sum.golang.org is available?

It is common to need to talk to proxy.golang.org (you just checked out a repo and are trying to build it, or you are a CI system) without needing to talk to sum.golang.org (the repo has an up-to-date go.sum file).

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Mar 5, 2020

Also, re fallback strategy, it is important to note that there are two steps here:

  1. Have a syntax for "fallback", namely using a | after the name instead of a ,
  2. Change the default GOPROXY to do a fallback on proxy unavailability.

There are not changes in security posture from (1), just new possibilities (with security implications, but again, only if you use it).

The possible change in security possible is from (2). The argument is that reliability is more important than security here, because go.sum is providing security. (If it's not, we should fix that!)

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Mar 6, 2020

re @jayconrod:

Should we fall back after a timeout if a proxy doesn't respond? That seems like another important failure mode.

http.DefaultClient has no timeout at all. That's clearly wrong - if the server is blackholed somehow and packets are being dropped, we don't want to sit. I think we probably should have some timeout, but it can be a long one, like 60 seconds.

Also, should we fall back after non-HTTP errors, like if we fail to connect or if the proxy returns an invalid response? Assuming yes.

Yes. Even some 500 errors probably should fall back. 502 Bad Gateway for example. We fall back today (for comma) on 404/410. It seems like probably any HTTP error at all should fall back for pipe.

@FiloSottile

This comment has been minimized.

Copy link
Member

@FiloSottile FiloSottile commented Mar 6, 2020

The possible change in security possible is from (2). The argument is that reliability is more important than security here, because go.sum is providing security. (If it's not, we should fix that!)

There are at least two parts to the security of fetching modules: integrity, which is indeed handled by the checksum database and go.sum, and privacy.

The tradeoff in this case is a privacy tradeoff: a code host that can cause a network failure (easy) can get information it would not have had otherwise on who is using what.

Also, list and latest are not protected by sumdb and go.sum, so this makes it possible to keep a client on an old version for an attacker that has control over network (easy) and code host (hard). Unlike publishing a fake version, this is undetectable externally.

I'm not saying the tradeoff is not worth it, but it's not within the scope of go.sum.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 6, 2020

The tradeoff in this case is a privacy tradeoff: a code host that can cause a network failure (easy) can get information it would not have had otherwise on who is using what.

That is true, but they could also get that information if the proxy decided to stop mirroring their module (and instead served a 404 or 410).

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Mar 10, 2020

Discussing with @FiloSottile, we realized that while step (2) would improve cmd/go's reliability posture it would significantly weaken the security posture, because now cmd/go could be easily forced to invoke git and other VCS binaries, and the never-ending stream of git/other-VCS remote code execution vulnerabilities would be back in scope. Today the proxy completely removes that security problem from users in the default configuration. I need to think some more about this, but this might be a good enough reason not to take step (2) by default.

@marwan-at-work

This comment has been minimized.

Copy link
Contributor

@marwan-at-work marwan-at-work commented Mar 12, 2020

Personally, I don't think the rate of proxy.golang.org failing (which I have yet to experience) is worth the complexity because adding the new | directive will be yet another detail the Go programmer has to be cognizant about.

Therefore, maybe having a proxy.golang.org/status should be enough for people to know that the public proxy is down?

Or maybe even making a special case for proxy.golang.org to always fallback since proxy.golang.org is special to begin with as it is the default proxy that Go ships with.

@jayconrod

This comment has been minimized.

Copy link
Contributor

@jayconrod jayconrod commented Mar 12, 2020

@marwan-at-work As I understand, the use case is more about CI systems that use the go command in production. Changing GOPROXY if there's an outage on proxy.golang.org may be difficult or risky to do at scale. While adding a new fallback method here does increase complexity, I expect most people won't have to know or think about it.

@FiloSottile

This comment has been minimized.

Copy link
Member

@FiloSottile FiloSottile commented Mar 12, 2020

Of course, special casing proxy.golang.org would have the same security cost as adding | and using it by default, without giving an option to opt-out, so it's even worse in terms of vulnerability exposure.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Mar 12, 2020

Change https://golang.org/cl/223257 mentions this issue: cmd/go: add support for GOPROXY fallback on unexpected errors

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 27, 2020

I reverted this in CL 225757. We'll need to resubmit with a fix to the error-prioritization logic.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Mar 30, 2020

Change https://golang.org/cl/226460 mentions this issue: cmd/go: add support for GOPROXY fallback on unexpected errors

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants
You can’t perform that action at this time.