Description
There are two distinct formatting tools in a Go release: go fmt, which works on packages, and gofmt, which works on files. When go fmt ./... is called, it ends up spawning one gofmt process for each file to format in each package. The number of concurrent processes is limited to the number of available CPUs, so that multiple files are formatted at once and the whole run finishes sooner.
This makes sense, because gofmt is generally bound by CPU usage, not by I/O or some other factor. In the future, gofmt's CPU usage could be reduced a little, but I think it will always be bound by CPU and not the disk's read/write speed.
Unfortunately, this setup leaves gofmt itself as a sequential tool, to the point that it's measurably slow on large codebases, even though my CPU sits mostly idle. Using gofmt directly is also useful for two reasons:
- It has more options, like -s, -d, and -r
- It's more versatile; it can work on files which don't form part of any package, or format many Go modules at once
I think we should teach gofmt to use all available CPUs, using a similar mechanism to what go fmt does today. On a regular modern machine, this should make common operations like gofmt -l -w . many times faster.
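To make the idea concrete, here is a minimal sketch of CPU-bounded concurrent formatting. It is not the cmd/gofmt code: formatAll is a hypothetical helper, it uses the public go/format package on in-memory sources rather than files on disk, and it assumes runtime.GOMAXPROCS(0) is a reasonable limit on the number of workers.

```go
package main

import (
	"fmt"
	"go/format"
	"runtime"
	"sync"
)

// formatAll is a hypothetical helper, not part of cmd/gofmt.
// It formats every source in srcs, running at most
// runtime.GOMAXPROCS(0) formatters at the same time.
// Results and errors are returned in input order.
func formatAll(srcs [][]byte) ([][]byte, []error) {
	out := make([][]byte, len(srcs))
	errs := make([]error, len(srcs))

	// A buffered channel acts as a counting semaphore.
	sem := make(chan struct{}, runtime.GOMAXPROCS(0))
	var wg sync.WaitGroup

	for i, src := range srcs {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(i int, src []byte) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			// Each goroutine writes only to its own index,
			// so no extra locking is needed.
			out[i], errs[i] = format.Source(src)
		}(i, src)
	}
	wg.Wait()
	return out, errs
}

func main() {
	srcs := [][]byte{
		[]byte("package a\nfunc  F( ) { }\n"),
		[]byte("package b\nvar   x=1\n"),
	}
	out, errs := formatAll(srcs)
	for i := range out {
		if errs[i] != nil {
			fmt.Println("error:", errs[i])
			continue
		}
		fmt.Printf("%s", out[i])
	}
}
```

Keeping results indexed by input position means output can still be reported in a deterministic order, which likely matters for flags such as -l and -d.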
go fmt could also benefit from this change. For example, we could have it call gofmt on all of the files in a package at once, reducing the number of times it has to fork and exec the other tool. An alternative could be to call gofmt in batches of files, such as 100 at a time, to perform better on tiny packages or with huge files.
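The batching alternative could look roughly like the sketch below. The batches function is a hypothetical helper for illustration, not something cmd/go has today; each returned group would become the argument list of a single gofmt invocation, and the batch size (100 in the text above, 2 in this example) is a tunable assumption.

```go
package main

import "fmt"

// batches is a hypothetical helper, not part of cmd/go.
// It splits files into consecutive groups of at most size
// elements, preserving order.
func batches(files []string, size int) [][]string {
	if size <= 0 {
		size = 1 // guard against a non-positive batch size
	}
	var out [][]string
	for len(files) > 0 {
		n := size
		if len(files) < n {
			n = len(files)
		}
		// The full slice expression caps capacity so a later
		// append to one batch cannot overwrite the next one.
		out = append(out, files[:n:n])
		files = files[n:]
	}
	return out
}

func main() {
	files := []string{"a.go", "b.go", "c.go", "d.go", "e.go"}
	for _, b := range batches(files, 2) {
		// In go fmt, each batch would be passed to one gofmt process.
		fmt.Println(b)
	}
}
```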
I'm happy to work on this for Go 1.17, producing benchmarks for formatting all of $GOROOT/src using each tool. I did some similar work for go mod verify last year, which you can see here: https://go-review.googlesource.com/c/go/+/229817
cc @griesemer for cmd/gofmt, @bcmills @jayconrod @matloob for cmd/go