Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: go list has too many (more than zero) side effects #29452

Closed
DisposaBoy opened this issue Dec 29, 2018 · 48 comments
Closed

cmd/go: go list has too many (more than zero) side effects #29452

DisposaBoy opened this issue Dec 29, 2018 · 48 comments

Comments

@DisposaBoy
Copy link

@DisposaBoy DisposaBoy commented Dec 29, 2018

<rant>

This is quite honestly becoming (literally) rage-inducing so I'll keep it short...

Since the introduction of the package cache and modules, go list has gained some (IMO) nasty side-effects like downloading things from the internet and compiling (CGO) packages when all I did was ask it to print the import path of the current package.

Additionally, package querying shouldn't result in updating any files. This coupled with the fact that GOPATH/mod/... is readonly means that if you're in a package inside the mod cache and run go list it might fail because it can't write to go.mod (why is it updating the file?!?!?!).

This script that creates a go.mod file with an extra empty line at the end demonstrates the latter issue:

$ bash -c 'cd $(mktemp -d) && echo "package app" > app.go && echo -e "module app\n" > go.mod; go list -mod=readonly'
go: updates to go.mod needed, disabled by -mod=readonly

To make matters worse, go/build suddenly started calling go list so a simple operation like .Import(...FindOnly) that used to take no more than a couple milliseconds, now takes several seconds for no good reason... all because the go tool decided it was going download things from the internet, compile things and god knows what else... all manner of surprises I didn't ask for.

Usually I'd just code my way around it with the power of NIH, but the behavior of go/build and package lookups and querying in general is un(der)-documented and I don't want to have to keep track of whatever new magic it mightwill gain in the future.

I doubt any of this is ever going to be fixed, so it'd be nice if these things were documented so I could answer questions like "given an import path, how do I go about finding it in GOPATH, vendor, module cache, build cache, etc.?" without having to rely on some broken black box.

@ALTree
Copy link
Member

@ALTree ALTree commented Dec 29, 2018

Thanks for the report.

At least some of the slowness you are observing may be imputable to outstanding go list bugs (see for example #29427 (comment)).

cc @heschik

@ALTree ALTree added this to the Go1.13 milestone Dec 29, 2018
@myitcv
Copy link
Member

@myitcv myitcv commented Dec 29, 2018

@DisposaBoy (in a modules world) you need to move to use go/packages. go/packages replaces go/build.

cc @matloob / @ianthehat - perhaps we should add some (temporary) documentation to go/packages to explain how it is a replacement for go/build et al?

@myitcv
Copy link
Member

@myitcv myitcv commented Dec 29, 2018

cc @bcmills - should we also add an advisory to go/build in Go 1.12 pointing (temporarily) to go/packages?

@DisposaBoy
Copy link
Author

@DisposaBoy DisposaBoy commented Dec 29, 2018

@myitcv I don't see how go/packages solve any of these issues. Last time I looked at it (a couple weeks ago) it still takes ~200ms to answer the same simple query in GOPATH mode and correct me if I'm wrong, but it also calls go list which means it doesn't solve any of the issues I mentioned.

@myitcv
Copy link
Member

@myitcv myitcv commented Dec 29, 2018

@DisposaBoy previously, multiple calls to go/build were effectively zero-cost. In the new world, these are replaced by a single call to go/packages.Load.

If you continue to use go/build, in certain usage patterns you will end up making multiple calls to go list, which, even for relatively small projects, can become costly.

go/packages has come into existence to provide an abstraction layer atop various drivers. There is a driver for the go command, just as there is for build systems like Bazel, Blaze and others. All efforts for optimisation are therefore directed via go/packages.

So the first step is moving away from go/build to go/packages (which can do everything from simply resolving package patterns to loading fully type and syntax information (see https://godoc.org/golang.org/x/tools/go/packages#LoadMode).

If you are still seeing issues after moving to go/packages, then we can certainly help to diagnose further. There are a number of things you might be running into, but narrowing this down to a single go/packages.Load call will help.

Issues that spring to mind include the aforementioned #29427, #28739. The latter is hopefully going to be addressed by an upcoming CL that works by caching directory/file operations where possible.

@matloob
Copy link
Contributor

@matloob matloob commented Jan 8, 2019

cc @bcmills @ianthehat

@myitcv @DisposaBoy As much as I'd like to see everyone move to go/packages, I don't think doing so will help with these problems. The workings of modules, the build/list cache, etc., are complicated enough that we need go list as a black box to own the logic.

Even if we had documentation about all this stuff (which would certainly be a good thing), it's possible that changes across versions of Go might break you, and having the go list logic reimplemented I think would just lead to bugs and incompatibilities.

It doesn't seem likely to me that Go list will change to do less work, but I don't know enough about it and the workings of modules to say why. I'll leave that to @bcmills.

Of course we'll work to get go/packages to use go list as efficiently as it can, but we'll be limited by the behavior of go list (which is in turn limited by the requirements of go modules).

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 8, 2019

It doesn't seem likely to me that Go list will change to do less work,

To the contrary, I expect that it will change to do substantially less work, especially when the -e flag is present. (That's a discussion I need to have with @jayconrod and @rsc for Go 1.13.)

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 8, 2019

See also #28692.

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 8, 2019

At any rate, it is probably true that go list has too many side effects today, but zero is probably too few.

For example, if go list attempts to write the go.mod file to fix formatting or to remove redundant declarations, and it fails to do so because the directory is read-only, perhaps that just shouldn't be an error. That doesn't mean that it shouldn't try, though.

@josharian
Copy link
Contributor

@josharian josharian commented Jan 8, 2019

Why should go list fix formatting or remove redundant declarations? Zero really does seem like the right number of side-effects for go list.

@bcmills bcmills added the modules label Jan 9, 2019
@bcmills
Copy link
Member

@bcmills bcmills commented Jan 9, 2019

@josharian

Why should go list fix formatting or remove redundant declarations?

Perhaps it shouldn't, but that still doesn't lead to “zero side effects”.

For example, we would like go list all to be idempotent and fast: if you run it twice, to the extent possible you should get exactly the same results, and any expensive operations (such as network lookups) from the first run should not be repeated for the second run.

If go list is not allowed to modify the go.mod file at all, we either lose idempotence, or we lose the property that you can (in general) edit code in module mode in the steady state without needing to explicitly modify your module definition.


For example: suppose that you add an import of golang.org/x/oauth2 in your program. You run go list all, and it resolves some set of transitive dependencies via oauth2, including golang.org/x/net — but since oauth2 doesn't currently have a go.mod file, you get whatever version of golang.org/x/net happens to be latest at the moment, and go list all includes the packages contained in that version.

If go list doesn't update the go.mod file, then the next run will need to re-resolve the latest version (incurring another network fetch), and if any packages were added in the interim that will change the output of go list: we would lose both speed and fidelity.

In contrast, if go list does update the go.mod file, then the next run will not only produce the same output, but will also avoid the network operation (since the active version of golang.org/x/net is now cached locally).

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 9, 2019

So I could certainly buy the argument that go list shouldn't make cosmetic modifications, and perhaps it should not report an error if it failed to write updates (particularly to the go.sum file, since that doesn't affect reproducibility), but I don't at all buy the argument that it should not make any modifications at all.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Jan 9, 2019

@bcmills that makes sense.

Is the fact that oauth2 does not have a go.mod relevant? If it did, wouldn't you still need to hit the network to find the latest minor/patch version since there isn't a specific version recorded in the local go.mod?

Could there be a fast mode that just prints a warning to stderr and skips unresolved modules for tools that need to run as fast as possible and might not necessarily care that everything is worked out?

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 9, 2019

Is the fact that oauth2 does not have a go.mod relevant?

Sort of. What matters is whether the requirements in the (transitive) go.mod files are sufficient to resolve all of the packages and/or modules needed to answer the go list query. (If you've run go mod tidy, then go list should not make any further edits to go.mod, although it may still add entries to go.sum, depending on the exact query.)

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 9, 2019

Could there be a fast mode that just prints a warning to stderr and skips unresolved modules for tools that need to run as fast as possible and might not necessarily care that everything is worked out?

go list -mod=readonly -e could perhaps do that.

@pwaller
Copy link
Contributor

@pwaller pwaller commented Feb 6, 2019

Chiming in from #30090 - I found it counter-intuitive that gopls - the language server - did network activity due to the underlying use of go list. In this case, while I was editing code, it resulting in me being prompted many times to unlock my SSH keyring, due to this git configuration which enables access to private repositories:

[url "ssh://git@github.com/"]
	insteadOf = https://github.com/
	insteadOf = git://github.com/

Unfortunately, entering my SSH password into the prompt was not sufficient, it still resulted in the secure password entry overlay being spammed repeatedly. Choosing to not unlock the keyring was likely was the cause of other problems (#30090). So the user experience wasn't great there.

The circumstances under which it was doing network activity in #30090 came a bit of a surprise to me, because the code in question didn't have any dependencies which weren't already present in the module cache. It did eventually turn out that go mod tidy added to go.mod a dependency introduced through a test of a transitive dependency. Thereafter gopls doesn't appear to need to access the network.

@djgilcrease
Copy link

@djgilcrease djgilcrease commented Jul 9, 2019

@balasanjay fine add go build to the list of commands that will pull dependencies down. It should however NEVER modify the version of dependencies listed in go.mod only the go get .. or go mod ... commands should do that.

I cannot count the number of times running go build and forgetting the -mod=vendor flag has broken my build because it decided it wanted to update to a broken version of some lib I use. The go module system was supposed to make it so I do not need to vendor and checkin my code yet I still have to do that and I still have to ensure I am not calling things like go list, go fmt, etc in my ci pipeline since that will suddenly pull in and upgrade version of dependencies even though that is clearly not what I intended by calling those commands.

@balasanjay
Copy link
Contributor

@balasanjay balasanjay commented Jul 9, 2019

Caveat: I have no experience with these issues personally (my personal projects are small enough to not have CI, and my professional projects use a different build system entirely).

That said, from reading rsc's articles, the impression I got is that go build will make any modifications necessary to ensure that the next invocation of go build will yield the same result.

In other words, it will not "upgrade" your dependencies willy nilly, but if your dependencies are insufficiently specified such that some change by someone else could change the meaning of your build, then it will change your go.mod file such that it completely describes your build. Idempotence of build commands seems like an important property.

I think if you'd rather never let it touch your go.mod, then the incantation is -mod=readonly, which btw can be put in the GOFLAGS environment variable if you'd like that to be the default behaviour.

@djgilcrease
Copy link

@djgilcrease djgilcrease commented Jul 9, 2019

@balasanjay the -mod flags cannot be set in GOFLAGS since most of the go commands do not actually work with it even though they still modify the go.mod file.

Currently go build is not idempotent in any way shape or form. It relies 100% on the libraries you are using to follow it's definition of semver and assumes that any potentially breaking change properly does a major version bump, which is complicated for libraries of libraries that your project also uses.

Say Your project used libA(1.2.3) & libB(1.10.2) and libA also uses libB(1.10.2).

Monday libB updated to version 2.0.0 from 1.10.2, you are all good since the module system will not auto update to a new major version.

Tuesday libA updates to use version 2.0.0 of libB and bumps their version to 1.2.4 since this is just a performance improvement from their perspective. None of their public interfaces or APIs changes in any way.

Well now your build is broken because the module system will auto update to libA(1.2.4) which will pull in libB(2.0.0) which is incompatible with your own use of libB.

This scenario gets worse when teams do not understand they released a breaking change or are still releasing a v0.x.x library that is very widely used or do not use semver at all. Implicit modification of dependencies is never the right choice, it should always be an explicit decision on the developers part to update or modify their dependencies.

@xStrom
Copy link

@xStrom xStrom commented Jul 9, 2019

Well now your build is broken because the module system will auto update to libA(1.2.4) which will pull in libB(2.0.0) which is incompatible with your own use of libB.

There is no "auto update" to libA(1.2.4). What are you talking about? If your go.mod contains libA(1.2.3) then it will stay at that version. You have to manually update via go get -u to get libA(1.2.4).

On top of that, libB(2.0.0) must be able to run in parallel with libB(1.10.2). Most libraries have this property by default. The exceptions are libraries that have some sort of global state, like listening to a specific port. This is a known pitfall and there are solutions against it. Basically the library authors have to respect the Go module system rules, among which is that major versions have to be able to run at the same time.

@nim-nim
Copy link

@nim-nim nim-nim commented Jul 9, 2019

@xStrom if you start using another bit of code that requires at least 1.2.4 of course there will be an auto update. go.mod versions are not a lockfile, they are the minimal state the module engine will try to achieve. That's a try not a hard promise.

@djgilcrease
Copy link

@djgilcrease djgilcrease commented Jul 9, 2019

On top of that, libB(2.0.0) must be able to run in parallel with libB(1.10.2). Most libraries have this property by default. The exceptions are libraries that have some sort of global state

like almost all the logger, config, web framework libraries in existence.

@xStrom
Copy link

@xStrom xStrom commented Jul 9, 2019

@nim-nim if you update some dependency manually, then I think it requires some serious mental gymnastics to call this an auto update.

When you do a manual update and your code stops working, then you can investigate and/or downgrade. Whether this issue is caused by the direct dependency you updated or some sub-dependency of that dependency doesn't really matter. The situation is the same, you manually updated a dependency and then your code broke.

@nim-nim
Copy link

@nim-nim nim-nim commented Jul 9, 2019

@xStrom sure, no argument about that. However, people need to stop thinking about go.mod as a way to lock versions. The module system assumes everything uses semver, and everything can be upgraded within semver rules.

Therefore, if you publish something, that states in its module file it uses otherdep at version 1.2.3, you can not assume that all your downstream users will use otherdep, version 1.2.3. Just composing your code, with another module, that requests at least 1.2.4, will cause the state to be bumped to 1.2.4.

And you can not declare in your module file it works with 1.2.3 and nothing else. You can declare it works starting with 1.2.3, and that 1.2.4 it incompatible, that will just make the module engine upgrade to 1.2.5 as soon as it is published if something else requires at least 1.2.4

So by publishing something, that uses otherdep 1.2.3, you make an implicit promise to your downstream users, that you will check your code works with all releases past 1.2.3, and blacklist them one by one if they don't (at which point it's probably less work to get into touch with otherdep's maintainer and convince him not to break compat)

@xStrom
Copy link

@xStrom xStrom commented Jul 9, 2019

I get what you're saying and also I don't think you can mark anything as incompatible as a library. Last I checked only the top-level go.mod has the ability to exclude stuff.

It's definitely true that the go modules system depends on people actually respecting semver.

Personally I don't think it's that big of a problem. I've thus far successfully lived with the old Go compatibility guarantee where I just vendor the master tip of every library and rarely have I had issues.

I think the new go modules system is a big upgrade over the old master tip system. Is it the be-all end-all version management system? Probably not, but it's a start.

As far as reproducible builds go, they are achievable with the go modules system. If you have a fully defined go.mod and things work - then things will continue working. No new dependency update is ever going to creep into your project at a surprising moment. You can run go build 5 years down the line and it will work the same way as it did yesterday. This is a very strong property of the go modules system. You get to control when you want to deal with the drama of updating dependencies and figuring out which library failed to follow semver.

@thepudds
Copy link

@thepudds thepudds commented Sep 24, 2019

Here is a half-baked thought.

First, there is a new -mod=mod argument being proposed for Go 1.14 in #33848 to support an opt-in for looking outside the vendor directory. (This is part of the proposed behavior changes for Go 1.14 to use any vendor directory by default if it exists, and hence no longer requiring -mod=vendor to opt-in to looking inside the vendor directory).

The question then is -- should there be some interplay between the questions here (about whether or not go list updates go.mod) and -mod=mod?

Perhaps -mod=mod could be used here to say: "allow go list to update go.mod", for example?

Maybe the default for go list could then be -mod=readonly?

If so, maybe -mod=mod is the name with slightly different semantics, or maybe it is renamed to something else? (-mod=update, or -mod=write, ...?).

@thepudds
Copy link

@thepudds thepudds commented Sep 24, 2019

The immediately prior comment is the main thing I wanted to raise. This comment is a follow-up to that, but this is even less than 50% baked...

Currently, there are some differences in how the Go 1.13 commands handle -mod=readonly.

For example, go get ignores -mod=readonly, including because the intent is to allow someone to set GOFLAGS=-mod=readonly and go on about their day, including issuing go get commands (whose purpose is to update go.mod).

Maybe that could be expanded to have different defaults for -mod for maybe two buckets of commands?

Using -mod=update for now as a name, perhaps something like:

default to -mod=update

  go get
  go mod tidy
  go build
  go test
  ...

default to -mod=readonly

  go list -m
  go mod why -m
  go mod graph
  go mod download
  ...

If that was the approach, and let's say you run go list -m all and it fails while stating things are out of sync, then a user would have at least two options: you cause things to become in sync by running something like go mod tidy, and then re-run the same go list -m all, or alternatively you could just run go list -mod=update -m all.

That glosses over what happens when a vendor directory is present. Perhaps that could still be the behavior described in #33848, and the table above describes the default behavior when no vendor directory is present?

Also, maybe those two lists are not a good split of what commands would be in which list (e.g., maybe someone might argue go build should default to -mod=readonly rather than -mod=update), but the main point is it might be reasonable to have different behavior for different commands for -mod behavior (including there is already some precedent for that).

Or maybe rather than the difference being the default -mod value, maybe instead some commands do not support a -mod argument at all (e.g., maybe go get does not support -mod at all)?

Or maybe when the current code is out of sync with the current go.mod file, perhaps the semantics for -mod=readonly for something like go list -m all could be to report from the go.mod perspective, rather than failing? (And if so, maybe with a warning message that things are out of sync? Or maybe no warning message? Don't know.)

Sorry, as I said, this is not fully baked.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 3, 2019

@thepudds, note that the go mod subcommands do not accept the -mod flag at all.

Nor should they: I would argue that only go mod tidy and perhaps go mod vendor should resolve missing imports: mod tidy because that is literally its only purpose, and perhaps mod vendor because it would be pretty frustrating to have go build -mod=vendor ./... fail right after you had just run go mod vendor.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 10, 2019

Why should go list fix formatting or remove redundant declarations?

I agree. Filed separately as #34822.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 10, 2019

Could there be a fast mode that just prints a warning to stderr and skips unresolved modules for tools that need to run as fast as possible and might not necessarily care that everything is worked out?

Filed as #34829.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 10, 2019

We vendor EVERYTHING so there should be no need for any sort of network traffic at all.

That will hopefully be addressed by #33848. If it is not, please file a separate issue.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 10, 2019

As much as I'd like to see everyone move to go/packages, I don't think doing so will help with these problems.

I would expect #34829 and #34506 (separately or in combination) to help substantially for go/packages. If that turns out not to be the case, we can investigate its use-cases separately.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 10, 2019

I believe that the aforementioned issues address all of the actionable problems reported in this issue thread, but it's been a long thread so it's possible that I've missed one.

We try to keep open issues in the Go issue tracker actionable and focused on a single problem or decision, and apart from the above issues this thread is not, so I'm going to close it out.

Please do continue to file new issues for specific use-cases where the behavior of go list poses a problem.

@bcmills bcmills closed this Oct 10, 2019
@gopherbot
Copy link

@gopherbot gopherbot commented Nov 26, 2019

Change https://golang.org/cl/209037 mentions this issue: cmd/go: test that 'go list -e -mod=readonly' reports errors correctly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.