sum.golang.org: responding with a 410 fifteen minutes after publishing a new module version #42809

Open
mvdan opened this issue Nov 24, 2020 · 10 comments

Comments

@mvdan
Member

@mvdan mvdan commented Nov 24, 2020

$ go version
go version devel +f7a80af935 Mon Nov 23 05:47:51 2020 +0000 linux/amd64
$ go get -d github.com/multiformats/go-multicodec@v0.2.0
github.com/multiformats/go-multicodec@v0.2.0: verifying module: github.com/multiformats/go-multicodec@v0.2.0: reading https://sum.golang.org/lookup/github.com/multiformats/go-multicodec@v0.2.0: 410 Gone
	server response: not found: github.com/multiformats/go-multicodec@v0.2.0: invalid version: unknown revision v0.2.0
$ GONOSUMDB='*' go get -d github.com/multiformats/go-multicodec@v0.2.0
$ GOPROXY=direct GONOSUMDB='*' go get -v -d github.com/multiformats/go-multicodec@v0.2.0

By the time I ran these commands, I had already pushed the new version via git tag -a v0.2.0 && git push --tags more than fifteen minutes earlier. Why am I still getting a 410?

Note how skipping sum.golang.org alone, or skipping both the sumdb and the proxy, makes go get work. These commands were run outside of any module.

At first I assumed I was holding something wrong, but I've triple-checked everything. You can see the version at https://github.com/multiformats/go-multicodec/blob/v0.2.0/go.mod. I realise that the tag says "five hours ago", because that's when I tagged it - but I only pushed the tag fifteen minutes ago.

@mvdan
Member Author

@mvdan mvdan commented Nov 24, 2020

As an extra data point, https://pkg.go.dev/github.com/multiformats/go-multicodec@master renders properly and I can see a new enough version of master. The license is getting recognised, too.

cc @heschik @katiehockman

@mvdan

This comment was marked as off-topic.

@mvdan
Member Author

@mvdan mvdan commented Nov 24, 2020

Someone at work points out that this is because I ran go get -d github.com/multiformats/go-multicodec@v0.2.0 before the tag had been picked up by the proxy, so sum.golang.org must have cached the proxy's "version does not exist" answer for 30 minutes without retrying. This seems to be confirmed by how, all of a sudden, go get works now that roughly 35 minutes have passed since the tag was pushed.

Regardless of whether this is expected behavior, it seems fairly counter-intuitive and frustrating to the end user. Perhaps the proxy and sumdb should be taught to invalidate such a cached state when a new version is published.

@mvdan
Member Author

@mvdan mvdan commented Nov 24, 2020

To clarify, the timeline was:

  1. Run go get module@version
  2. Push the tag - I had forgotten to do this at the start, so step 1 errored as expected
  3. Re-run go get module@version, which fails with the sumdb 410
  4. Keep getting failures for 30m
  5. go get module@version finally works a good while later

So it almost seems like I should git push --tags, wait a good 30-60s to ensure that the proxy will have picked up the tag, and only then start using go get. Because otherwise I might get sumdb stuck for thirty minutes.
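One way to avoid step 1 of that timeline is to confirm that the tag is actually on the remote before asking the proxy or sumdb for it. The following is a minimal sketch, not part of the go command: hasTag is a hypothetical helper that scans the output of git ls-remote --tags, and the sample output and hash in main are made up for illustration. It only guards against the forgotten push; whether a further short wait is needed after pushing is not settled in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// hasTag reports whether `git ls-remote --tags` output lists the given tag.
// Each output line has the form "<sha>\trefs/tags/<name>". Annotated tags
// also produce a peeled "refs/tags/<name>^{}" line, which is accepted too.
// This is a hypothetical helper, not an API of git or the go command.
func hasTag(lsRemoteOutput, tag string) bool {
	want := "refs/tags/" + tag
	for _, line := range strings.Split(lsRemoteOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		if fields[1] == want || fields[1] == want+"^{}" {
			return true
		}
	}
	return false
}

func main() {
	// Made-up sample; in practice this string would be the output of
	// `git ls-remote --tags <repository URL>`.
	sample := "1111111111111111111111111111111111111111\trefs/tags/v0.1.0\n"

	version := "v0.2.0"
	if hasTag(sample, version) {
		fmt.Println("tag is on the remote; go get module@" + version + " can be tried")
	} else {
		fmt.Println("tag " + version + " is not on the remote yet; push it before running go get")
	}
}
```

Because the check talks only to the git remote, it cannot itself put a "not found" answer into the proxy or sumdb caches.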

@iand
Contributor

@iand iand commented Nov 24, 2020

See also #41986

410 seems a strong response to give for a tag that has never been seen by the sum server, since clients are free to interpret that as a permanent status. 404 would be more appropriate for unknown tags, leaving 410 for retracted ones.

@katiehockman
Member

@katiehockman katiehockman commented Nov 24, 2020

For the caching, this is working as intended. Though I do find it a bit odd that the request worked for proxy.golang.org but not for sum.golang.org. #34370 (comment) provides a bit of context about why we haven't lowered the negative cache time.

Perhaps the proxy and sumdb should be taught to invalidate such a cached state when a new version is published.

There is no way for the proxy or sumdb to know that a new version has been published without attempting to fetch it. Currently, the entire system works on a pull model, and there is no "push" model available for us to be informed of a new version.

from @iand :

404 would be more appropriate for unknown tags, leaving 410 for retracted ones.

This is purely an implementation detail of proxy.golang.org and sum.golang.org, in order to work with our internal caching. 410 can be treated as a 404 in this context.
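For a client that talks to these services directly, that advice could be sketched as follows. isNotFound is a hypothetical helper name; the status constants come from Go's net/http package.

```go
package main

import (
	"fmt"
	"net/http"
)

// isNotFound treats 404 Not Found and 410 Gone alike, as suggested above
// for responses from proxy.golang.org and sum.golang.org.
func isNotFound(status int) bool {
	return status == http.StatusNotFound || status == http.StatusGone
}

func main() {
	codes := []int{
		http.StatusOK,
		http.StatusNotFound,
		http.StatusGone,
		http.StatusInternalServerError,
	}
	for _, code := range codes {
		fmt.Printf("%d %s: not found = %v\n", code, http.StatusText(code), isNotFound(code))
	}
}
```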

@katiehockman
Member

@katiehockman katiehockman commented Nov 24, 2020

I'm going to close this as there is no further action to take at this time.

@mvdan
Member Author

@mvdan mvdan commented Nov 24, 2020

Thanks for your quick response.

For the caching, this is working as intended. [...] I'm going to close this as there is no further action to take at this time.

I think closing this issue on the technicality that the cache isn't buggy is a mistake. The cache is an implementation detail that I shouldn't need to worry about, in general. The steps I shared above are pretty common, I think - that is, doing the first go get just a second too early, before the version is picked up by the proxy. And the experience is pretty frustrating - the services keep telling me that no such version exists, and all I can see on the website FAQ is:

Then explicitly request that version via go get module@version. After one minute for caches to expire, the go command will see that tagged version. If this doesn't work for you, please file an issue.

So, it's clearly been longer than a minute, hence I file the bug.

At the very least, if the confusing UX is here to stay, I certainly believe it should be clearly documented. Because, right now, the "after one minute" goes directly against my experience of seeing nothing for over half an hour.

@marten-seemann
Contributor

@marten-seemann marten-seemann commented Nov 25, 2020

I've run into this problem so many times, so first of all, I'd like to thank @mvdan for raising this issue. It's been really annoying to have to wait for half an hour just to bubble up a version update, so my - not very sophisticated - workaround has been to wait for at least 5 minutes before go getting a new version I just released.

Keeping a stale cache entry for half an hour seems like a bad solution in general. Reducing that time to one minute wouldn't be a bad resolution of this issue. Even better would be reducing it to a few seconds with exponential backoff afterwards, but I'm not sure how much overhead that would add to the cache implementation.

@katiehockman
Member

@katiehockman katiehockman commented Nov 25, 2020

At the very least, if the confusing UX is here to stay, I certainly believe it should be clearly documented. Because, right now, the "after one minute" goes directly against my experience of seeing nothing for over half an hour.

That's fair, thanks for making this point about the docs. At a minimum, it may make sense for the documentation to be updated to more clearly state expectations. I'll re-open this so that we can determine the right course of action on this.

5 participants