Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/cmd/stringer: stringer should not use go/types #25650

Open
robpike opened this issue May 31, 2018 · 26 comments
Open

x/tools/cmd/stringer: stringer should not use go/types #25650

robpike opened this issue May 31, 2018 · 26 comments
Assignees
Milestone

Comments

@robpike
Copy link
Contributor

@robpike robpike commented May 31, 2018

It takes much longer to run stringer on a program than to compile it. Stringer should be nearly instantaneous, but I suspect the new importer technology is killing performance. Something certainly is.

At tip, wiith a freshly written file, so the build cache has the dependencies but not the output of building list.go itself:

% go version
go version devel +47be3d49c7 Thu May 10 23:19:40 2018 +0000 darwin/amd64
% cat list.go
package main

import (
	"gopkg.in/src-d/go-license-detector.v2/licensedb/filer"
)

func main() {
}

type Type int

const (
	Other Type = iota
	A
	B
	C
	D
)

func getFiler(file string) filer.Filer {
	return nil
}
% time go build

real	0m1.617s
user	0m1.290s
sys	0m0.408s
% time stringer -type Type

real	0m4.197s
user	0m2.409s
sys	0m0.782s
%

Stringer is taking almost 3 times as long to run as it takes to compile and link the binary. That's crazy. If we remove the dependency on filer, which has a pretty big set of dependencies itself, stringer runs quickly again.

@gopherbot gopherbot added this to the Unreleased milestone May 31, 2018
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented May 31, 2018

My understanding is that the plan to make things better again is #14120 (comment) .

@griesemer
Copy link
Contributor

@griesemer griesemer commented May 31, 2018

I suspect the stringer is using the default setup of go/types which is using a source importer (rather than an importer based on gc-compiled export data). As a consequence, each time stringer is run, it type-checks not only the file containing the stringer directive, but also all imported packages, recursively, all the way down...

(That is, in fact it's surprisingly fast doing all this work... :-)

It may be sufficient to simply configure types.Config with the gc importer for now (if my hunch above is correct).

@griesemer
Copy link
Contributor

@griesemer griesemer commented May 31, 2018

@robpike Since you have a good test case, would you mind doing the following change and see what the difference is:

diff --git a/cmd/stringer/importer19.go b/cmd/stringer/importer19.go
index deddadb1..ed08eab4 100644
--- a/cmd/stringer/importer19.go
+++ b/cmd/stringer/importer19.go
@@ -12,5 +12,5 @@ import (
 )
 
 func defaultImporter() types.Importer {
-       return importer.For("source", nil)
+       return importer.For("gc", nil)
 }

I suspect this should fix it.

Maybe this should be settable with a flag?

(As an aside, we can probably get rid of the support for Go 1.8 now (importer18.go)).

@robpike
Copy link
Contributor Author

@robpike robpike commented May 31, 2018

That change gives me an error that it can't find the filer import.

I gave you a full reproducible example. It requires importing a big thing but it's not hard to reproduce.

@griesemer
Copy link
Contributor

@griesemer griesemer commented Jun 4, 2018

(@robpike Just changing the source string from "source" to "gc" (nothing else) should work w/o problem - except for a constant change there's not source change. )

Anyway, I've verified that switching from "source" to "gc" addresses the issue. On my machine, with the current version:

$ time stringer -type Type

real	0m5.002s
user	0m2.892s
sys	0m0.901s

After applying the above change suggested above and re-installing the stringer:

$ time stringer -type Type

real	0m0.037s
user	0m0.013s
sys	0m0.017s

which looks much more reasonable again.

This just needs a decision: Should stringer require that the package it operates on compiles (and all its imports are installed)? If so, we can just make the switch.

Alternatively, we could provide an additional stringer flag (say -sourceImports, or something like that) which would switch to the current behavior.

We should fix this for 1.11. It's embarasingly slow as is.

@robpike Comments?

@robpike
Copy link
Contributor Author

@robpike robpike commented Jun 4, 2018

I still get this bug when I use "gc":

stringer: checking package: x.go:4:2: could not import gopkg.in/src-d/go-license-detector.v2/licensedb/filer (can't find import: "gopkg.in/src-d/go-license-detector.v2/licensedb/filer")

but not when I use "source". What is my mistake? It's confusing.

Stringer should not care that the package compiles, only that the file parses, but changes people keep making to how things work are breaking all such assumptions.

@josharian
Copy link
Contributor

@josharian josharian commented Jun 5, 2018

@robpike I think you're effectively experiencing #10249. If you go install your packages first, then switching to the 'gc' importer should work. That problem--that go install is needed for the 'gc' importer to work--is a large part of why we switched to the 'src' importer in the first place.

Some other tools (e.g. vet) can handle this semi-magically, since they have direct support from cmd/go, and can do the install as needed, with appropriate caching. But stringer cannot, as it stands.

Presumably we don't want to maintain a 'src' importer cache, since we already have a cmd/go cache. Maybe handling all of this smoothly is something that the new (planned) 'go/packages' package could do? cc @alandonovan

Stringer should not care that the package compiles, only that the file parses

If that is the case, then stringer should not use go/types at all. But it has, at least as far back as #10249 (2015), so this is at least not new. It might be worth investigating to what extent we can use only go/ast.

@mvdan
Copy link
Member

@mvdan mvdan commented Jun 5, 2018

Also CC @myitcv, who has been thinking about stringer's use of go/types for a bit.

@myitcv
Copy link
Member

@myitcv myitcv commented Jun 5, 2018

Stringer should not care that the package compiles

I'd tend to agree with this. That's not to say therefore that stringer can't depend on go/types - I'll defer to others on whether stringer should depend on go/types and assume the status quo for now - rather that it should not fail if it doesn't type check.

Instead stringer could do as much as possible given the severity of error:

  • package doesn't parse - error
  • package parses, but type declaration not found - error
  • type declaration found, but no stringer-able const values found because of type-checking errors - no error, just output empty String() method
  • ...
  • package fully type checks, stringer behaves as it currently does

If we are comfortable that stringer should depend on go/types (to make its output more efficient, whatever the motivation was here...) then there is still the "speed" issue. Per @josharian, my understanding is that the speed issue will be solved by @alandonovan's go/packages work and reliance on the build cache (obviating the go install step).

@rsc
Copy link
Contributor

@rsc rsc commented Jun 11, 2018

Stringer is not in the standard repo. It could be changed for now to use go/importer.Default. The long-term (hopefully quite short-term) plan is to use golang.org/x/tools/go/packages. In fact maybe this should be one of the first programs converted.

/cc @alandonovan

@rsc
Copy link
Contributor

@rsc rsc commented Jun 25, 2018

Decision: @alandonovan's new go/packages will be used to fix this.

@rsc rsc added the NeedsFix label Jun 25, 2018
@rsc rsc added NeedsDecision and removed release-blocker labels Jun 25, 2018
@gopherbot gopherbot removed the NeedsFix label Jun 25, 2018
@rsc
Copy link
Contributor

@rsc rsc commented Jun 25, 2018

@robpike clarifies that the bigger problem is not how long this takes but that it assumes it can type-check the source code at all. His suggestion is to go back to stringer NOT type-checking the source code. (This happened in https://go-review.googlesource.com/c/tools/+/40403 for #10249.)

New decision: @robpike will update stringer NOT to use go/types.

@rsc rsc changed the title x/tools/cmd/stringer: stringer has become outrageously slow x/tools/cmd/stringer: stringer should not use go/types Jun 25, 2018
@rsc rsc assigned robpike and unassigned griesemer Jun 25, 2018
@rsc rsc modified the milestones: Go1.11, Unreleased Jun 25, 2018
@gopherbot
Copy link

@gopherbot gopherbot commented Jul 3, 2018

Change https://golang.org/cl/121884 mentions this issue: cmd/stringer: revert CL 40403

gopherbot pushed a commit to golang/tools that referenced this issue Jul 3, 2018
Revert "cmd/stringer: use source importer when available"

This reverts CL 40403.

The idea is to avoid type-checking and use just parsing, which should be
enough for stringer.

Separately reopening golang/go#10249 because the original change closed that issue,
but the change is itself causing other problems as described in the discussion
at golang/go#25650.

This reversion restores the old behavior of stringer and will be followed
with other fixes if they can be worked out.

Change-Id: I8404d78da08043ede1a36b0e135a3fc7fdf6728d
Reviewed-on: https://go-review.googlesource.com/121884
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@robpike
Copy link
Contributor Author

@robpike robpike commented Jul 3, 2018

Note that it must use go/types (I've paged the stack back into my small brain) so that it can evaluate constants, which requires type checking, but perhaps it can be done efficiently.

@mvdan
Copy link
Member

@mvdan mvdan commented Jul 3, 2018

@robpike what about using Alan's go/packages as Russ suggested above? The caching should help make it fast enough, I imagine.

@cznic
Copy link
Contributor

@cznic cznic commented Jul 3, 2018

Is the go/packages thing published somewhere to look at?

@mvdan
Copy link
Member

@mvdan mvdan commented Jul 3, 2018

@cznic https://go-review.googlesource.com/c/tools/+/116359, though there isn't any published code yet.

@alandonovan
Copy link
Contributor

@alandonovan alandonovan commented Jul 3, 2018

Hoping to send out working code today. It's been a fractal of details.

@gopherbot
Copy link

@gopherbot gopherbot commented Jul 6, 2018

Change https://golang.org/cl/122535 mentions this issue: cmd/stringer: revert back to source importer

gopherbot pushed a commit to golang/tools that referenced this issue Jul 7, 2018
Roll back my two recent changes. Stringer is now very slow again,
but works in most use cases.

My git foo is insufficient to do this as a revert, but it is a by-hand
reversion of CLs

	https://go-review.googlesource.com/121884
	https://go-review.googlesource.com/121995

See the issue for a long conversation about the general problem.

Update golang/go#10249
Update golang/go#25650

Change-Id: I7b6ce352a4c7ebf0977883509e9d7189aaac1251
Reviewed-on: https://go-review.googlesource.com/122535
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
SOF3 pushed a commit to SOF3/go-stringer-inverse that referenced this issue Aug 23, 2018
Revert "cmd/stringer: use source importer when available"

This reverts CL 40403.

The idea is to avoid type-checking and use just parsing, which should be
enough for stringer.

Separately reopening golang/go#10249 because the original change closed that issue,
but the change is itself causing other problems as described in the discussion
at golang/go#25650.

This reversion restores the old behavior of stringer and will be followed
with other fixes if they can be worked out.

Change-Id: I8404d78da08043ede1a36b0e135a3fc7fdf6728d
Reviewed-on: https://go-review.googlesource.com/121884
Reviewed-by: Ian Lance Taylor <iant@golang.org>
SOF3 pushed a commit to SOF3/go-stringer-inverse that referenced this issue Aug 23, 2018
Roll back my two recent changes. Stringer is now very slow again,
but works in most use cases.

My git foo is insufficient to do this as a revert, but it is a by-hand
reversion of CLs

	https://go-review.googlesource.com/121884
	https://go-review.googlesource.com/121995

See the issue for a long conversation about the general problem.

Update golang/go#10249
Update golang/go#25650

Change-Id: I7b6ce352a4c7ebf0977883509e9d7189aaac1251
Reviewed-on: https://go-review.googlesource.com/122535
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
SOF3 pushed a commit to SOF3/go-stringer-inverse that referenced this issue Aug 23, 2018
Revert "cmd/stringer: use source importer when available"

This reverts CL 40403.

The idea is to avoid type-checking and use just parsing, which should be
enough for stringer.

Separately reopening golang/go#10249 because the original change closed that issue,
but the change is itself causing other problems as described in the discussion
at golang/go#25650.

This reversion restores the old behavior of stringer and will be followed
with other fixes if they can be worked out.

Change-Id: I8404d78da08043ede1a36b0e135a3fc7fdf6728d
Reviewed-on: https://go-review.googlesource.com/121884
Reviewed-by: Ian Lance Taylor <iant@golang.org>
SOF3 pushed a commit to SOF3/go-stringer-inverse that referenced this issue Aug 23, 2018
Roll back my two recent changes. Stringer is now very slow again,
but works in most use cases.

My git foo is insufficient to do this as a revert, but it is a by-hand
reversion of CLs

	https://go-review.googlesource.com/121884
	https://go-review.googlesource.com/121995

See the issue for a long conversation about the general problem.

Update golang/go#10249
Update golang/go#25650

Change-Id: I7b6ce352a4c7ebf0977883509e9d7189aaac1251
Reviewed-on: https://go-review.googlesource.com/122535
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@golang golang deleted a comment from Wolfkid200444 Aug 23, 2018
@robpike
Copy link
Contributor Author

@robpike robpike commented Nov 11, 2019

@alandonovan Where are we with this issue? Stringer is still slow, although perhaps not unbearably so.

@alandonovan
Copy link
Contributor

@alandonovan alandonovan commented Nov 11, 2019

Stringer continues to use go/types, and uses go/packages to obtain package information and export data for dependencies. Just now it took about 700ms to execute the Pill example in its documentation (a single source file with no imports), compared to about 60ms to compile that package, so stringer is indeed much slower than compilation. Most of that time appears to be spent in go/packages, since the golang.org/x/tools/go/packages/gopackages -mode=types command takes about the same amount of time. I'm not sure why go/packages takes so long to do ~nothing. @matloob @ianthehat ?

@matloob
Copy link
Contributor

@matloob matloob commented Jan 24, 2020

I'm not sure exactly what's happening, but it seems like running "go list -json -compiled" is doing a fair amount of extra work, even after a build happens.

Here are commands I ran on the example code, in modules mode, (I had to change the license detector import to v3 since v2's go.mod seems to be broken):

$ go clean -cache && time go build && time go list -compiled >/dev/null && time go list -compiled >/dev/null && time go list -compiled >/dev/null && gopackages -mode types . >/dev/null

# initial build
real	0m2.381s
user	0m10.230s
sys	0m2.811s

# first run of go list -compiled
real	0m3.547s
user	0m3.132s
sys	0m2.248s

# second run of go list -compiled
real	0m0.326s
user	0m0.383s
sys	0m0.469s

# gopackages -mode types
real	0m0.317s
user	0m0.395s
sys	0m0.477s

And go/packages passes the -compiled flag to go list in most modes because it needs the compiled source most of the time to fall back to when export data is missing.

@golang golang deleted a comment from Wolfkid200444 Feb 3, 2020
@FrankReh
Copy link

@FrankReh FrankReh commented Oct 16, 2020

gopackages -mode types does in fact create the same cache files as stringer -type Type. I started with a clean build cache and compared the files left in the cache afterwards by each - identical file names.

What's really interesting is that gopackages -mode types generates a superset of the files of go build. When both start with an empty cache, go build is faster - it has created fewer cache files.

go build leaves 450 files in the build cache. gopackages -mode types leaves 722. An extra 272 files.

Below I show two sets of runs, each after clearing the cache. The first shows the cache count grow from 2 to 450 to 722 because the go build is run first. The second shows the cache count goes from 2 to 722 immediately and stays there because the gopackages command was run first.

What extra work is gopackages -mode types doing that go build doesn't need?

A second observation: the real vs user times of the four commands is interesting. After a cache clear, the first command to run has a lot of work to do and has been able to parallalize much of it. But in the first pair, where the gopackages came second, the work it had left to do to create those extra 272 files was not parallalizable.

go version go1.15.3 darwin/amd64
GOPATH=
GOROOT=
$ go clean -cache
$ find ~/Library/Caches/go-build -type f | wc -l
       2
$ time go build

real	0m1.878s
user	0m8.059s
sys	0m2.100s
$ find ~/Library/Caches/go-build -type f | wc -l
     450
$ time gopackages -mode=types . > /dev/null

real	0m2.509s
user	0m2.366s
sys	0m1.259s
$ find ~/Library/Caches/go-build -type f | wc -l
     722
$ go clean -cache
$ find ~/Library/Caches/go-build -type f | wc -l
       2
$ time gopackages -mode=types . > /dev/null

real	0m3.848s
user	0m10.023s
sys	0m3.259s
$ find ~/Library/Caches/go-build -type f | wc -l
     722
$ time go build

real	0m0.211s
user	0m0.268s
sys	0m0.220s
$ find ~/Library/Caches/go-build -type f | wc -l
     722
@FrankReh
Copy link

@FrankReh FrankReh commented Oct 18, 2020

Okay, testing with go list and tracking down where the work is done, or even just why some go-build cached files gets created can be confusing.

Following the lead from above that even the Pill example takes go list and stringer a long time, here are some new observations.

Here's the source we're using as the example.

package painkiller

//go:generate stringer -type=Pill

type Pill int

const (
	Placebo Pill = iota
	Aspirin
	Ibuprofen
	Paracetamol
	Acetaminophen = Paracetamol
)

Here is the script being used to count go-build cached files and related scripts for cleaning the cache, and running go build and go list repeatedly.

$ cat cache-count
find ~/Library/Caches/go-build/ -type f | wc -l

$ cat ./cache-clean
go clean -cache; cache-count

$ cat go-build
go build; cache-count

$ cat go-list-compiled-false
go list -e -json -compiled=false -export=true -deps=true -- . > /tmp/go.list.compiled.false; cache-count

$ cat go-list-compiled-true
go list -e -json -compiled=true  -export=true -deps=true -- . > /tmp/go.list.compiled.true; cache-count

First, a nice easy set of tests that work well. They show that when the only go file is pill.go (shown above), the go list -compiled=false version creates six go-build cache files and
the more complicated form of go list adds nothing, nor does calling go build. Nice and tidy - just as one would hope.

$ ls *.go
pill.go
$ ./cache-clean
       2
$ ./go-list-compiled-false
       8
$ ./go-list-compiled-true
       8
$ ./go-build
       8

The fact that pill.go compiles without having run the stringer can make things confusing because testing with and without pill_string.go changes the results dramatically.
See what happens to the file cache count after running stringer twice. The first time: no change. The second time: a huge change.

$ ls *.go
pill.go
$ stringer -type=Pill
$ ls *.go
pill.go		pill_string.go
$ cache-count
       8
$ stringer -type=Pill
$ ls *.go
pill.go		pill_string.go
$ cache-count
      39

By now, you see one of the causes for confusion. Once the directory has a pill_string.go file, there is an import of "strconv" and that brings a lot more into the equation if stringer is called again, or if go list is now called.

To put it another way, if you are testing with no pill_string.go yet in the directory, running stringer is very fast, as fast as go build.
But as soon as you've run stringer, if you run it again to see if your timing is the same, you will find it took significantly longer because now it has called go list -compiled=true ... on the existing *_string.go file as well.

One more thing I want to show and then a question about the stringer tool and the go/packages package.

go list -compiled=false -export -deps causes the same work as go build.

go list -compiled=true -export -deps causes more work than go build does. Why? -compiled=true is meant to be asking for the compiled list of go files. Why are any extra cache files created, let alone so many more?

$ ./cache-clean
       2
$ ls *.go
pill.go		pill_string.go
$ ./go-list-compiled-false
       8
$ ./go-build
       8
$ ./go-list-compiled-true
      34

Why does passing -compiled=true cause so many new go-build cache files to be created? I hope to answer that soon, but I suspect there's a rational reason and it will stay that way.

But here's a question for the stringer and go/packages team. Should stringer have to rely on the compiled files list? I modified stringer source and go/packages source to allow a set of needs to be specified that allow -compiled to be false when invoking go list and in the two trivial cases for stringer that I've run, the types and syntax tree created are sufficient for generating the same *_string.go files. I haven't found the documentation that would tie this all up, indicating either it should be possible or in the general case it should not be possible.

And one final aside. Counting file growth in the go-build cache directory is itself rife for confusion when gopls is running in the background and one is opening up new files to look around.
Cleaning out the cache often, or counting the cache right before and then right after an operation helps keep things sane.

@FrankReh
Copy link

@FrankReh FrankReh commented Oct 20, 2020

A little closer to an answer for the question posed last November.

For the small case of the following source file, after a call to go build, cgo is affecting the speed seen from go list -compiled and hence from stringer.

The -compiled flag to go list causes a lot of extra work to be done, but what is really time consuming, and perhaps odd, is that this flag causes four packages requiring cgo to be built when plain build doesn't seem to.

go list -compiled is causing cgo to be invoked on four packages, runtime/cgo, os/user, net and crypto/x509. Those four packages being rebuilt with clang called 28 times accounts for .5, .7, .8 and .9 seconds, respectively, on my machine.

Currently stringer's time is spent waiting for go/packages and go/packages is waiting for go list which is busy calling clang 28 times. Is this go list -compiled behavior expected/intentional?

/cc @rsc

The separation of responsibilities between go list, go/packages and stringer isn't clear to me. I won't try to propose anything.

Other conditions which may or may not be factors:
go1.15.3, in a $HOME/dir with no go.mod and not being under $GOPATH.

How to reproduce:

% cat ./reproduce
go version
go clean -cache 
time go build -debug-actiongraph tmp0.actiongraph
time go list  -debug-actiongraph tmp1.actiongraph -export -deps           -- . > tmp1.go-list
time go list  -debug-actiongraph tmp2.actiongraph -export -deps -compiled -- . > tmp2.go-list
grep "clang" tmp0.actiongraph | wc -l
grep "clang" tmp1.actiongraph | wc -l
grep "clang" tmp2.actiongraph | wc -l

% ./reproduce
go version go1.15.3 darwin/amd64

real	0m1.906s
user	0m8.020s
sys	0m2.193s

real	0m0.251s
user	0m0.292s
sys	0m0.238s

real	0m2.529s
user	0m2.327s
sys	0m1.316s
       0
       0
      28


% ls *.go
list.go

% cat list.go
package main

import (
	"gopkg.in/src-d/go-license-detector.v3/licensedb/filer"
)

func main() {
}

type Type int

const (
	Other Type = iota
	A
	B
	C
	D
)

func getFiler(file string) filer.Filer {
	return nil
}
@FrankReh
Copy link

@FrankReh FrankReh commented Oct 21, 2020

Not sure how to write this.

You can ignore essentially everything I've written above over the last few days.

The run time for stringer is good. The run time for gopackages is good. The run time for go list -compiled is good.

All will use the cached files that were created by the go build run ahead of them.

As long as you know to remove your deprecated GOROOT/pkg/darwin_amd64 (or similar) directory.

Move GOROOT/pkg/darwin_amd64 to /tmp and go build (taking a lot longer the first time) runs the clang steps it needs to and puts them into the global cache where go list -compiled will find them.

Move that directory back again, and go build uses the files from that directory, so does not do the work and put things in the new cache. But go list only seems to use the new cache location, not the old one under pkg/darwin_amd64.

So both commands work but because they use different rules about where to look for prebuilt archives, we (or at least I) have been confused about what is triggering go list -compiled to be redoing work that we would have expected already is done by go build.

My clue today that my understanding was amiss, after discovering the build's -a flag, was seeing that in fact go build would also trigger calling clang for some std package cgo needs when told to ignore the cache. But I had been clearing out the cache (with go clean -cache) and not seen those calls to clang before. go build was picking up the archives from the old pkg cache location.

So another reason for my confusion. go clean -cache was not cleaning out both caches that go build used, just the newer cache. The cache that has existed for many years was not being cleaned out.

So if you are seeing long times when calling stringer or gopackages or go list, see if you also have this older cache directory still hanging around, and move it out of the way.

I also found what appears to be another cache directory that is structured like the new one but was at GOROOT/pkg/obj/go-build. I don't know what caused my GOCACHE value to change last week, maybe a new version of go, maybe a new version of macOS. That one also was not being cleaned by the go clean command, I moved it out of the way, and it has not reappeared.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.