Go 1.18 compile times may be 15-18% slower than Go 1.17, largely due to the changes made to implement generics. The easiest comparison (which shows most of the difference) is between compilation times in -G=0 and -G=3 mode. -G=3 mode is the default, since it supports generics.
A comparison of -G=0 mode in Go 1.18 against Go 1.17 shows that the compiler may have slowed down ~1% because of non-generics changes (since -G=0 mode does not support generics).
So, for now, we can mostly compare the speeds of -G=0 and -G=3 mode in Go 1.18. Most of the difference is due to the new front-end processing, since the SSA backend doesn't change at all for generics. In -G=0 mode (used by all compilers before Go 1.18), there is a syntax parser, the noder phase to create the tree of ir.Node nodes, and the standard typechecker. In -G=3 mode, there is the same syntax parser, but the program is first typechecked by types2 (which supports generics), and then a noder2 phase creates the tree of ir.Node nodes using the syntax and type information from the types2 typechecker. The sum of noder + types1 typechecking is about 4% of a run, whereas the sum of types2 typechecking + noder2 is 14%. So we can see much of the slowdown is due to the change in front-end processing (not unexpectedly).
These are all rough numbers based on a small number of runs/inputs.
We will plan to reduce this extra overhead in Go 1.19.
A small but non-zero part of this is due to changes to the GC pacer; because it starts at a smaller heap size (512 KiB vs 4 MiB), the compiler initially spends more time in GC, and for smaller packages this can be noticeable. Across a bunch of benchmarks, the combination shows a geomean 7% increase in build user time; with the initial heap size restored, the geomean slowdown is 2%. Because this is skewed by the startup overhead for smaller packages, it makes some sense to also consider the plain arithmetic mean, which is 2% slower with the new pacer and smaller heap, and only 0.02% slower (i.e., noise) if the initial heap size is restored to its old value.
We can increase the initial heap size late in the 1.18 release cycle since it is simply restoring old behavior; the new pacer we probably want to keep because it does slightly better with GC pause latency in certain corner cases. But, it is good to know where some of the time went.
Relatedly, I've been building Go itself with GOGC=off for years, which speeds up make.bash by more than 20%. This is because the build consists of many short-lived compile processes, which allocate heavily and exit within seconds.
I know I have enough spare memory to not have to worry about GC, so letting the kernel reclaim memory at exit gives a significant speedup without noticeably increasing peak memory use.
Could we apply something similar to go build in general? I realize it has little to do with generics, but it is related to what @dr2chase brought up in regards to the pacer and heap size. For instance, if the OS has memory to spare, give each compile process a large target heap size so it wastes less time on GC.
Another alternative, which would presumably require less tweaking based on available memory, would be to set up a different GOGC default for compile/link/asm/etc processes. For instance, GOGC=200 would presumably increase memory usage slightly and save some CPU overhead, gaining us some of that 20% build time reduction without potentially using tons of memory.
Leaving GOGC at 200 would be bad for small-memory builds of large packages. What I understand would help is to start GOGC very high (800), periodically check the heap size, and once the heap is large enough (32m, for example) set GOGC back to its environment-specified value. Perhaps, also, we could set a finalizer on an intentionally dead object, so that "not too much time" elapses just in case our polling period is too large. The advantage of doing it this way is no-new-knobs, and (assuming we check heap size soon enough) not accidentally inflating the footprint for large heaps.
We've known for years that setting GOGC=off in the compiler makes it run faster.
We don't go down that road because it is the wrong fix.
If there is something wrong with the garbage collector,
it should be fixed in the garbage collector,
which will benefit everyone.
(The compiler is not a special snowflake.)