
cmd/go: proposal: extend download protocol to allow return of transitive requirements implied by go.mod #29832

Closed

myitcv opened this issue Jan 19, 2019 · 6 comments · 4 participants

@myitcv (Member) commented Jan 19, 2019

Somewhat motivated by the discussion in https://groups.google.com/forum/#!topic/golang-nuts/5otdvVra0xg

The "Download Protocol" section of https://research.swtch.com/vgo-module describes the protocol as follows:

As we saw earlier, custom domains can specify that a module is hosted at a particular base URL. As implemented in vgo today (but, like all of vgo, subject to change), that module-hosting server must serve four request forms:

  • GET baseURL/module/@v/list fetches a list of all known versions, one per line.
  • GET baseURL/module/@v/version.info fetches JSON-formatted metadata about that version.
  • GET baseURL/module/@v/version.mod fetches the go.mod file for that version.
  • GET baseURL/module/@v/version.zip fetches the zip file for that version.
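The four request forms can be built mechanically from a base URL, module path, and version. The sketch below (Go) is illustrative only, not cmd/go's implementation; it uses proxy.golang.org as an example base URL and includes the '!'-escaping of uppercase letters that the protocol (as later standardized) applies so paths survive case-insensitive file systems.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the module proxy's case-escaping rule: each
// uppercase ASCII letter becomes '!' followed by its lowercase form.
func escapePath(path string) string {
	var b strings.Builder
	for _, r := range path {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r - 'A' + 'a')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// requestURLs returns the four download-protocol request forms for one
// module version.
func requestURLs(baseURL, module, version string) []string {
	esc := escapePath(module)
	return []string{
		baseURL + "/" + esc + "/@v/list",
		baseURL + "/" + esc + "/@v/" + version + ".info",
		baseURL + "/" + esc + "/@v/" + version + ".mod",
		baseURL + "/" + esc + "/@v/" + version + ".zip",
	}
}

func main() {
	// e.g. https://proxy.golang.org/github.com/!azure/azure-sdk-for-go/@v/list
	for _, u := range requestURLs("https://proxy.golang.org", "github.com/Azure/azure-sdk-for-go", "v1.0.0") {
		fmt.Println(u)
	}
}
```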

The transitive set of module requirements implied by, say, example.com at version v1.0.0 can, I think, be derived and then cached by a proxy implementation.

Would it therefore make sense to support returning that transitive set of go.mod files, as a stream of go.mod files, in response to GET baseURL/example.com/@v/v1.0.0.mod?

Edit: this proposal is an enhancement/optimisation of the existing protocol.

cc @bcmills @mvdan @rogpeppe

@bcmills (Member) commented Jan 22, 2019

It seems likely that the majority of the round-trip time today is spent fetching Git commits — that is, source code — rather than just go.mod files. If we assume a proxy in the middle, it can serve just the go.mod file instead, which is much smaller.

Given that — and given that we either can or will parallelize the requests to fetch go.mod files aggressively — how much impact would this additional optimization have?

@mvdan (Member) commented Jan 22, 2019

I actually brought up the same point to Paul in private. I imagine that with HTTP/2, cmd/go could download many go.mod files concurrently over the same TCP connection. The only slight problem is that it may not initially know all the go.mod files it needs, so multiple roundtrips might be required to advance through the levels of the dependency tree.
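The roundtrip pattern described above can be sketched as a breadth-first walk: fetch every go.mod in the current level concurrently, then advance to the next level. In the sketch below (Go), the in-memory graph and fetchRequirements are stand-ins for real proxy requests; the point it illustrates is that the number of sequential rounds equals the depth of the graph, not its size.

```go
package main

import (
	"fmt"
	"sync"
)

// graph is a stand-in for the module requirement graph; in reality each
// entry would be discovered by fetching and parsing that module's go.mod.
var graph = map[string][]string{
	"a@v1": {"b@v1", "c@v1"},
	"b@v1": {"d@v1"},
	"c@v1": {"d@v1"},
	"d@v1": {},
}

// fetchRequirements simulates one proxy roundtrip for a single module.
func fetchRequirements(mod string) []string { return graph[mod] }

// loadAll walks the graph breadth-first, fetching every module in the
// current level concurrently. The number of sequential rounds — and so
// the overall latency — is bounded by the depth of the graph, not the
// total number of modules.
func loadAll(root string) (rounds int, seen map[string]bool) {
	seen = map[string]bool{root: true}
	level := []string{root}
	for len(level) > 0 {
		rounds++
		var mu sync.Mutex
		var wg sync.WaitGroup
		var next []string
		for _, mod := range level {
			wg.Add(1)
			go func(mod string) {
				defer wg.Done()
				// One roundtrip per module, all in this level run in parallel.
				for _, req := range fetchRequirements(mod) {
					mu.Lock()
					if !seen[req] {
						seen[req] = true
						next = append(next, req)
					}
					mu.Unlock()
				}
			}(mod)
		}
		wg.Wait()
		level = next
	}
	return rounds, seen
}

func main() {
	rounds, seen := loadAll("a@v1")
	fmt.Printf("fetched %d modules in %d sequential rounds\n", len(seen), rounds)
}
```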

@bcmills (Member) commented Jan 22, 2019

Yep. There's still a potential for extra latency due to the round-trips, but the overall latency is proportional to the depth of the minimum spanning tree of the module graph — not the total number of modules.

We're obviously still learning what those depths are going to be in practice, but I would be surprised if it's a significant issue, especially if the module proxies have a reasonable geographic distribution.

@mvdan (Member) commented Jan 22, 2019

I believe Paul's point with this issue is a potential optimisation of the download protocol. Like you, I think the current protocol will be fast enough for nearly all use cases once module proxies are widespread. I think Paul means to consider this optimisation only if speed is still a problem at that point.

bcmills added the ToolSpeed label Jan 22, 2019

bcmills added this to the Unplanned milestone Jan 22, 2019

@rsc (Contributor) commented Jan 23, 2019

If the fetches are parallelized then the latency here is dominated by the depth of the dependency chain.
And if your local cache is mostly up-to-date, the server sending extras is useless.
It's hard to see what the benefit of all this complexity is.
It seems kind of like HTTP/2 PUSH, which servers don't know how to use well.

@myitcv (Member, Author) commented Jan 28, 2019

Thanks @bcmills @rsc

> If the fetches are parallelized then the latency here is dominated by the depth of the dependency chain.

That describes the situation much more precisely, thanks.

> And if your local cache is mostly up-to-date, the server sending extras is useless.

Yes, I have to admit this proposal was based entirely on a hunch rather than evidence.

I'll therefore close this for now and we can revisit if/as required.

myitcv closed this Jan 28, 2019
