cmd/go: go list has too many (more than zero) side effects #29452

Open
DisposaBoy opened this Issue Dec 29, 2018 · 28 comments

@DisposaBoy

DisposaBoy commented Dec 29, 2018

<rant>

This is quite honestly becoming (literally) rage-inducing so I'll keep it short...

Since the introduction of the package cache and modules, go list has gained some (IMO) nasty side-effects like downloading things from the internet and compiling (CGO) packages when all I did was ask it to print the import path of the current package.

Additionally, package querying shouldn't result in updating any files. This coupled with the fact that GOPATH/mod/... is readonly means that if you're in a package inside the mod cache and run go list it might fail because it can't write to go.mod (why is it updating the file?!?!?!).

This script that creates a go.mod file with an extra empty line at the end demonstrates the latter issue:

$ bash -c 'cd $(mktemp -d) && echo "package app" > app.go && echo -e "module app\n" > go.mod; go list -mod=readonly'
go: updates to go.mod needed, disabled by -mod=readonly

To make matters worse, go/build suddenly started calling go list, so a simple operation like .Import(...FindOnly) that used to take no more than a couple of milliseconds now takes several seconds for no good reason... all because the go tool decided it was going to download things from the internet, compile things and god knows what else... all manner of surprises I didn't ask for.

Usually I'd just code my way around it with the power of NIH, but the behavior of go/build and package lookups and querying in general is un(der)-documented and I don't want to have to keep track of whatever new magic it will gain in the future.

I doubt any of this is ever going to be fixed, so it'd be nice if these things were documented so I could answer questions like "given an import path, how do I go about finding it in GOPATH, vendor, module cache, build cache, etc.?" without having to rely on some broken black box.

@ALTree

Member

ALTree commented Dec 29, 2018

Thanks for the report.

At least some of the slowness you are observing may be attributable to outstanding go list bugs (see for example #29427 (comment)).

cc @heschik

@ALTree ALTree added this to the Go1.13 milestone Dec 29, 2018

@myitcv

Member

myitcv commented Dec 29, 2018

@DisposaBoy (in a modules world) you need to move to use go/packages. go/packages replaces go/build.

cc @matloob / @ianthehat - perhaps we should add some (temporary) documentation to go/packages to explain how it is a replacement for go/build et al?

@myitcv

Member

myitcv commented Dec 29, 2018

cc @bcmills - should we also add an advisory to go/build in Go 1.12 pointing (temporarily) to go/packages?

@DisposaBoy

Author

DisposaBoy commented Dec 29, 2018

@myitcv I don't see how go/packages solves any of these issues. Last time I looked at it (a couple of weeks ago) it still took ~200ms to answer the same simple query in GOPATH mode, and correct me if I'm wrong, but it also calls go list, which means it doesn't solve any of the issues I mentioned.

@myitcv

Member

myitcv commented Dec 29, 2018

@DisposaBoy previously, multiple calls to go/build were effectively zero-cost. In the new world, these are replaced by a single call to go/packages.Load.

If you continue to use go/build, in certain usage patterns you will end up making multiple calls to go list, which, even for relatively small projects, can become costly.

go/packages has come into existence to provide an abstraction layer atop various drivers. There is a driver for the go command, just as there is for build systems like Bazel, Blaze and others. All efforts for optimisation are therefore directed via go/packages.

So the first step is moving away from go/build to go/packages, which can do everything from simply resolving package patterns to loading full type and syntax information (see https://godoc.org/golang.org/x/tools/go/packages#LoadMode).

If you are still seeing issues after moving to go/packages, then we can certainly help to diagnose further. There are a number of things you might be running into, but narrowing this down to a single go/packages.Load call will help.

Issues that spring to mind include the aforementioned #29427, #28739. The latter is hopefully going to be addressed by an upcoming CL that works by caching directory/file operations where possible.

@matloob

Contributor

matloob commented Jan 8, 2019

cc @bcmills @ianthehat

@myitcv @DisposaBoy As much as I'd like to see everyone move to go/packages, I don't think doing so will help with these problems. The workings of modules, the build/list cache, etc., are complicated enough that we need go list as a black box to own the logic.

Even if we had documentation about all this stuff (which would certainly be a good thing), it's possible that changes across versions of Go might break you, and having the go list logic reimplemented I think would just lead to bugs and incompatibilities.

It doesn't seem likely to me that go list will change to do less work, but I don't know enough about it and the workings of modules to say why. I'll leave that to @bcmills.

Of course we'll work to get go/packages to use go list as efficiently as it can, but we'll be limited by the behavior of go list (which is in turn limited by the requirements of go modules).

@bcmills

Member

bcmills commented Jan 8, 2019

It doesn't seem likely to me that Go list will change to do less work,

To the contrary, I expect that it will change to do substantially less work, especially when the -e flag is present. (That's a discussion I need to have with @jayconrod and @rsc for Go 1.13.)

@bcmills

Member

bcmills commented Jan 8, 2019

See also #28692.

@bcmills

Member

bcmills commented Jan 8, 2019

At any rate, it is probably true that go list has too many side effects today, but zero is probably too few.

For example, if go list attempts to write the go.mod file to fix formatting or to remove redundant declarations, and it fails to do so because the directory is read-only, perhaps that just shouldn't be an error. That doesn't mean that it shouldn't try, though.

@josharian

Contributor

josharian commented Jan 8, 2019

Why should go list fix formatting or remove redundant declarations? Zero really does seem like the right number of side-effects for go list.

@bcmills bcmills added the modules label Jan 9, 2019

@bcmills

Member

bcmills commented Jan 9, 2019

@josharian

Why should go list fix formatting or remove redundant declarations?

Perhaps it shouldn't, but that still doesn't lead to “zero side effects”.

For example, we would like go list all to be idempotent and fast: if you run it twice, to the extent possible you should get exactly the same results, and any expensive operations (such as network lookups) from the first run should not be repeated for the second run.

If go list is not allowed to modify the go.mod file at all, we either lose idempotence, or we lose the property that you can (in general) edit code in module mode in the steady state without needing to explicitly modify your module definition.


For example: suppose that you add an import of golang.org/x/oauth2 in your program. You run go list all, and it resolves some set of transitive dependencies via oauth2, including golang.org/x/net — but since oauth2 doesn't currently have a go.mod file, you get whatever version of golang.org/x/net happens to be latest at the moment, and go list all includes the packages contained in that version.

If go list doesn't update the go.mod file, then the next run will need to re-resolve the latest version (incurring another network fetch), and if any packages were added in the interim that will change the output of go list: we would lose both speed and fidelity.

In contrast, if go list does update the go.mod file, then the next run will not only produce the same output, but will also avoid the network operation (since the active version of golang.org/x/net is now cached locally).

@bcmills

Member

bcmills commented Jan 9, 2019

So I could certainly buy the argument that go list shouldn't make cosmetic modifications, and perhaps it should not report an error if it failed to write updates (particularly to the go.sum file, since that doesn't affect reproducibility), but I don't at all buy the argument that it should not make any modifications at all.

@jimmyfrasche

Member

jimmyfrasche commented Jan 9, 2019

@bcmills that makes sense.

Is the fact that oauth2 does not have a go.mod relevant? If it did, wouldn't you still need to hit the network to find the latest minor/patch version since there isn't a specific version recorded in the local go.mod?

Could there be a fast mode that just prints a warning to stderr and skips unresolved modules for tools that need to run as fast as possible and might not necessarily care that everything is worked out?

@bcmills

Member

bcmills commented Jan 9, 2019

Is the fact that oauth2 does not have a go.mod relevant?

Sort of. What matters is whether the requirements in the (transitive) go.mod files are sufficient to resolve all of the packages and/or modules needed to answer the go list query. (If you've run go mod tidy, then go list should not make any further edits to go.mod, although it may still add entries to go.sum, depending on the exact query.)

@bcmills

Member

bcmills commented Jan 9, 2019

Could there be a fast mode that just prints a warning to stderr and skips unresolved modules for tools that need to run as fast as possible and might not necessarily care that everything is worked out?

go list -mod=readonly -e could perhaps do that.

@pwaller

Contributor

pwaller commented Feb 6, 2019

Chiming in from #30090 - I found it counter-intuitive that gopls - the language server - did network activity due to the underlying use of go list. In this case, while I was editing code, it resulted in my being prompted many times to unlock my SSH keyring, due to this git configuration which enables access to private repositories:

[url "ssh://git@github.com/"]
	insteadOf = https://github.com/
	insteadOf = git://github.com/

Unfortunately, entering my SSH password into the prompt was not sufficient; it still resulted in the secure password entry overlay being spammed repeatedly. Choosing not to unlock the keyring was likely the cause of other problems (#30090). So the user experience wasn't great there.

The circumstances under which it was doing network activity in #30090 came as a bit of a surprise to me, because the code in question didn't have any dependencies which weren't already present in the module cache. It did eventually turn out that go mod tidy added to go.mod a dependency introduced through a test of a transitive dependency. Thereafter gopls doesn't appear to need to access the network.

@thockin


thockin commented Mar 27, 2019

@myitcv

So the first step is moving away from go/build to go/packages

We are prepping to move Kubernetes to modules. I saw this thread and thought I would get a jump on one of the build tools, and follow your guidance.

I tried a very simple conversion of https://github.com/kubernetes/kubernetes/tree/master/hack/make-rules/helpers/go2make to use go/packages and it changed from taking 900ms with go/build to taking > 41s with go/packages for a single cmd/ dir. All of the time is under packages.Load().

We vendor EVERYTHING so there should be no need for any sort of network traffic at all.

Am I misunderstanding something? I thought the point of vendoring was to avoid any need for network callouts and to make builds totally hermetic, reproducible, and offline-safe.

@bcmills

Member

bcmills commented Mar 27, 2019

@thockin

I thought the point of vendoring was to avoid any need for network callouts and to make builds totally hermetic, reproducible, and offline-safe.

No, the main point of vendoring is to distribute proprietary code. (The root “vend” is right there in the name!) If you want to make builds hermetic, reproducible, and offline-safe, all you need in module mode is a go.mod file with a complete set of dependencies — that is, one for which go mod tidy is a no-op — and a module cache or proxy containing the relevant versions of those dependencies.

That said, #30240 would indeed avoid the need for most (but perhaps not all?) network access for fully-vendored code.

@heschik

Contributor

heschik commented Mar 27, 2019

@thockin

There are a lot of reasons that go/packages could be slow, and network access is only one of them. Please file a new bug.

@thockin


thockin commented Mar 27, 2019

Sorry, you've gone full pedantic and lost me.

From a practical POV, the vendor/ directory is my way of saying "use THIS code, not some random crap you find on the internet" and ensuring that everyone in my developer community gets the same result. If that requires me to set "replace" directives in go.mod (which seems to be the case), then OK (though, IMO, that is frankly a bit silly).

We have not enabled modules yet (working on it). Switching from go/build to go/packages caused a 50x slowdown. I'll open a new bug for that.

@DisposaBoy

Author

DisposaBoy commented Mar 27, 2019

@thockin

I thought the point of vendoring was to avoid any need for network callouts and to make builds totally hermetic, reproducible, and offline-safe.

No, the main point of vendoring is to distribute proprietary code. (The root “vend” is right there in the name!) If you want to make builds hermetic, reproducible, and offline-safe, all you need in module mode is a go.mod file with a complete set of dependencies — that is, one for which go mod tidy is a no-op — and a module cache or proxy containing the relevant versions of those dependencies.

That said, #30240 would indeed avoid the need for most (but perhaps not all?) network access for fully-vendored code.

Speaking as someone who's subscribed to this repo and sees every issue and comment, I must say this response epitomizes so well the tragedy that has become Go...

@tapir


tapir commented Apr 4, 2019

With the Qt project I'm working on, because the VSCode Go extension uses go list, lots of cgo/gcc processes are started in the background when I fire up the editor. Qt being a massive project, the compilation takes a lot of time and the computer's resources are depleted completely. It gets to the point where I can't even move the mouse anymore. So just to let everyone know, go list became a DoS attack.

Edit: Here is what I'm talking about
[Screenshot from 2019-04-08 16-51-43]

@djgilcrease


djgilcrease commented Apr 5, 2019

IMHO the only commands that should EVER attempt to download things from the internet, modify the go.mod or go.sum files are the go mod ... commands or go get ....

If the command fails because of a missing package, great: print out that you need to run go mod verify or some such. Do not just blindly pull things from the network and naively assume (you know the joke here) that it is safe to change the version info specified in go.mod.

Also ALL go ... commands except go mod ... or go get ... need to respect the -mod=... flags. Currently you cannot set GOFLAGS=-mod=vendor because many go commands (notably go list ... and go tool ...) do not understand that flag, then they go off and start downloading crap and mucking with my go.mod file.

Please go back to the drawing board with go modules so we do not need to keep adding hacks (like https://github.com/kubernetes/kubernetes/blob/master/go.mod needing to replace every single dependency) and other workarounds for a broken-by-design system.

@balasanjay

Contributor

balasanjay commented Apr 10, 2019

@djgilcrease In https://research.swtch.com/vgo-cmd, @rsc said that the following sequence of commands represents a suboptimal developer experience:

$ go build
go: rsc.io/sampler(v1.3.1) not installed
$ go get
go: installing rsc.io/sampler(v1.3.1)
$ go build
$ 

So it seems your proposal was considered, and explicitly rejected. (rsc's example also included a git clone, but the core of his argument doesn't seem specific to that)

@balasanjay

Contributor

balasanjay commented Apr 10, 2019

@thockin Does k8s ever modify the dependencies in its vendor directory? Or is the code in its vendor directory exact copies of upstream?

@crvv

Contributor

crvv commented Apr 10, 2019

@balasanjay

go build needs to download the code and it is OK.
But it is not OK that go tool, go fmt, go vet, go fix and go list all need to download something from the Internet.

Does k8s ever modify the dependencies in its vendor directory? Or is the code in its vendor directory exact copies of upstream?

If the code in vendor is modified, go should either use the modified code or reject it.
Why should go download anything in this situation?

@balasanjay

Contributor

balasanjay commented Apr 10, 2019

@crvv

Both go vet and go list answer questions about your dependencies (in the same way that go build builds your dependencies), so rsc's logic clearly applies there as well. As for the other two, I have no idea, but this bug doesn't appear to be about them.

And my question to @thockin was not related to code downloading, I was trying to understand k8s' use of vendoring.
