cmd/compile: under what circumstances does 1.18 generate significantly larger binaries than 1.17 when compiling the same source code? #52270
Comments
Possibly it is related to inlining changes, try compiling with Otherwise, I'm not aware of any changes that would cause this. Maybe try running |
Regarding Regarding dead function elimination, there may be something there:
So it does look like there are more functions at least. I will put something together diff-ish to dig into the differences and report back. |
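A rough sketch of the kind of "diff-ish" comparison mentioned above, assuming the symbol lists were captured from each binary in the address/size/type/name layout that `go tool nm -size` prints. The symbol names, addresses, and sizes in the sample data are hypothetical, as are the helper names `parseSizes` and `addedSymbols`.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseSizes reads text laid out as "address size type name" per line
// and returns a map from symbol name to size in bytes.
func parseSizes(nmOutput string) map[string]int64 {
	sizes := make(map[string]int64)
	sc := bufio.NewScanner(strings.NewReader(nmOutput))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 4 {
			continue // e.g. undefined symbols, which have no address column
		}
		size, err := strconv.ParseInt(f[1], 10, 64)
		if err != nil {
			continue // not a line in the expected layout
		}
		// Rejoin the remainder in case the symbol name contains spaces.
		sizes[strings.Join(f[3:], " ")] = size
	}
	return sizes
}

// addedSymbols returns the names present in newer but absent from older,
// largest first (ties broken by name), together with their total size.
func addedSymbols(older, newer map[string]int64) ([]string, int64) {
	var names []string
	var total int64
	for name, size := range newer {
		if _, ok := older[name]; !ok {
			names = append(names, name)
			total += size
		}
	}
	sort.Slice(names, func(i, j int) bool {
		if newer[names[i]] != newer[names[j]] {
			return newer[names[i]] > newer[names[j]]
		}
		return names[i] < names[j]
	})
	return names, total
}

func main() {
	// Hypothetical sample data standing in for the two symbol dumps.
	older := parseSizes("   11000        120 T example.Read\n")
	newer := parseSizes("   11000        128 T example.Read\n" +
		"   11080         64 T example.Read.func1\n" +
		"   110c0         40 T example.Read.func2\n")

	names, total := addedSymbols(older, newer)
	for _, n := range names {
		fmt.Printf("%6d %s\n", newer[n], n)
	}
	fmt.Printf("%6d bytes in symbols that exist only in the newer binary\n", total)
}
```

Sorting by size puts the biggest contributors first, which makes it easier to see whether a few large functions or many small ones account for the growth.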
@randall77 - I looked in depth at the diff of the For example in 1.18 I am seeing things like:
whereas in the same place for 1.17 it looks like:
My understanding is that .func1, .func2 etc. are anonymous functions, so I am confused why 1.18 should be emitting (a lot) more of these than 1.17 from the same source. A back-of-the-envelope calculation suggests this is responsible for most of the increase.
Update: comparing sizes of emitted functions in 1.17 vs 1.18, the code for 1.18 is also consistently larger than 1.17, but this is typically on the order of 1-3%. So most of the total size increase in the output binary in 1.18 does indeed seem to be due to the emission of the additional 'mystery' anonymous functions.
Update 2: looking at the source for |
Indeed, both At tip the names changed to This program, when compiled with
All 5 of those references are from inside the functions in question. I think this has to do with the defers in |
@randall77 - thanks for the very complete analysis of the issue. I am still wondering what changed in 1.18 to make this problem happen, when in 1.17 and below it did not. |
@randall77 this code size increase looks to be a side effect of the register ABI work. AIUI we now enable defer wrappers everywhere (even for architectures like arm and 386 where args are still passed entirely on the stack), since it greatly simplifies the runtime defer processing code. If we didn't do this, we'd have to continue to maintain the legacy version of You wrote: "Indeed, both crypto/rand.(*devReader).Read.func1 and crypto/rand.(*devReader).Read.func2 are unreferenced by any code in the binary except themselves." This part I don't think I understand -- when I build with "go build -ldflags=-dumpdep" I see this:
In other words, the *.func1 and *.func2 wrappers are being kept alive because they are referenced by crypto/rand.(*reader).Read (which is expected). |
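To make the "defer wrapper" idea concrete, here is a minimal sketch at the source level. It assumes, as described in the comment above, that a deferred call with arguments is handled roughly as if it had been written as an argument-less closure; the function names (`logClose`, `direct`, `wrapped`) are hypothetical and this is not the compiler's actual output.

```go
package main

import "fmt"

// logClose stands in for any function that is deferred with arguments.
func logClose(log *[]string, name string) {
	*log = append(*log, "close "+name)
}

// direct defers a call that takes arguments.
func direct(log *[]string) {
	defer logClose(log, "direct")
	*log = append(*log, "work")
}

// wrapped is a hand-written equivalent: the arguments are evaluated at the
// point of the defer statement, then captured by a closure that takes no
// arguments. A closure like this is normally compiled as its own function,
// which is where symbols with suffixes such as .func1 come from.
func wrapped(log *[]string) {
	l, name := log, "wrapped"
	defer func() { logClose(l, name) }()
	*log = append(*log, "work")
}

func main() {
	var a, b []string
	direct(&a)
	wrapped(&b)
	fmt.Println(a)
	fmt.Println(b)
}
```

Both functions behave the same way at run time. The difference is only in how many separate functions end up in the binary, which fits the extra .func1 and .func2 symbols reported earlier in the thread.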
@thanm - that's interesting, but I wonder if that is the full explanation here? From the Go 1.18 release notes, there is this, which presumably is talking about the change you are referring to:
But reading this makes it sound like the architectural change actually happened in 1.17, and 1.18 simply extends that to more platforms. Yet the effect I reported only occurs in 1.18, not 1.17. If this is the same effect, it's not clear to me how 1.17 was able to handle the register ABI changes for amd64 without requiring the defer wrappers. Since |
Never mind, the references are there, it just isn't obvious from the disassembly.
That the referenced instruction stream location is:
That address, 0x7f728 is the start of |
Has it been decided whether the register ABI will be ported to 32-bit ARM? This is still probably the most popular 'mid-range' embedded architecture out there, and I think Go is being used a lot here. If the register ABI won't be ported to 32-bit ARM, then this change in 1.18 makes me a little sad, because it means that a change to the compiler has been made (according to the release notes, for the sake of performance) which adds ~10% bloat to binaries on a platform that won't ever get that performance gain. Of course most changes involve trade-offs, including edge cases that suffer even if gains are net positive on average, but in this case for 32-bit ARM it would be 'all suffering' with no gain. |
I don't think anyone is currently working on, or planning to work on, regabi for 32-bit arm. We of course welcome contributions if anyone wants to work on it. |
@randall77 - thanks for the info, I fully understand it isn't possible to please everyone. I hope that, even in the absence of register ABI for 32-bit ARM, a way will be found to avoid passing the 'bloat' of an unsupported register ABI on to 32-bit ARM. But even if not, I still think the world is a better place with Golang. :-) |
What version of Go are you using (go version)?
1.17 vs 1.18
What operating system and processor architecture are you using (go env)?
linux/arm
Details
Regarding output binary size, the go 1.18 release notes only talk about changes (relative to 1.17) that potentially reduce size of output objects, due to linker optimisations etc. But under what circumstances would we expect 1.18 to produce significantly larger binaries than 1.17 for the same code?
Example: we have a medium-sized project that runs on an embedded linux/arm system. In this environment both performance (we do a lot of crypto stuff) and size are important. Output binary sizes for 1.17 vs 1.18 are as follows:
Both with no build flags and with -w -s we see a binary size increase of around 10% between 1.17 and 1.18, which is quite significant. (Binaries produced by 1.17 and 1.18 are similarly performant for us, so these extra bytes are 'for nothing'.)
I am not sure whether this is strictly a 'compiler bug' or if there are circumstances where 1.18 simply produces more bloated code than 1.17. If that is true, it would be extremely useful to know what those circumstances are, so we can work around them, not migrate to 1.18 for our build process, etc.
Any light and wisdom on this would be much appreciated!
-Adrian