all: binaries too big and growing #6853
Comments
Just for added reference, these are the sizes with Go 1.1.2 vs 1.2 on OS X 10.9 with Xcode 5 (darwin gcc llvm 5.0 x86_64):
$ ls -l *1.*
-rwxr-xr-x 1 j staff 1525984 2 Dez 21:44 hello_1.1.2
-rwxr-xr-x 1 j staff 2192672 2 Dez 21:40 hello_1.2
$ size *1.*
__TEXT  __DATA  __OBJC  others  dec  hex
1064960  94720  0  76000  1235680  12dae0  hello_1.1.2
1433600  147896  0  177440  1758936  1ad6d8  hello_1.2 |
|
This issue was updated by revision f238049. R=golang-dev, rsc CC=golang-dev https://golang.org/cl/35940047 |
In this context, please reevaluate whether the Go binaries can be changed to work with UPX (Ultimate Packer for eXecutables).¹² For a small amount of computing power, UPX can reduce the size of a binary to a quarter of its original size. You can verify this using an existing "fixer" program for Go binaries on linux/amd64.³ While this approach only treats the symptoms and does not fix the root of the problem, it would be nice to always have this possibility on hand. For the technical background of this problem (PT_LOAD[0].p_offset==0), please look at the UPX bug tracker⁴. ¹ http://upx.sourceforge.net/ ² https://en.wikipedia.org/wiki/UPX ³ https://github.com/pwaller/goupx ⁴ http://sourceforge.net/p/upx/bugs/195/ |
re #10, my #9 reply is to #8, which is about an entirely different problem. i'm not saying that the ever-growing binaries is not our problem, only that i don't believe that upx not accepting our binaries is our problem. it's clear that upx isn't able to handle all possible and correct ELF files (i.e. if the kernel can execute our binaries just fine, it's upx's problem to not be able to compress them). |
More detail. The Plan 9 symbol table is about to be deleted. Here is a reference point, adding one new entry to the list above:
$ ls -l *1.*
-rwxr-xr-x 1 r staff 1191952 Nov 30 10:25 x.1.0
-rwxr-xr-x 1 r staff 1525936 Nov 30 10:20 x.1.1
-rwxr-xr-x 1 r staff 2188576 Nov 30 10:18 x.1.2
-rwxr-xr-x 1 r staff 2474512 Feb 18 20:27 hello_1.2.x
$ size *1.*
__TEXT  __DATA  __OBJC  others  dec  hex
880640  33682096  0  4112  34566848  20f72c0  x.1.0
1064960  94656  0  75952  1235568  12da70  x.1.1
1429504  147896  0  177440  1754840  1ac6d8  x.1.2
1699840  160984  0  188944  2049768  1f46e8  x.1.2.x
Text has grown substantially, as has data. At least some of this is due to new annotations for the garbage collector. |
|
This issue was updated by revision 964f6d3. Nothing reads the Plan 9 symbol table anymore. The last holdout was 'go tool nm', but since being rewritten in Go it uses the standard symbol table for the binary format (ELF, Mach-O, PE) instead. Removing the Plan 9 symbol table saves ~15% disk space on most binaries. Two supporting changes included in this CL: debug/gosym: use Go 1.2 pclntab to synthesize func-only symbol table when there is no Plan 9 symbol table debug/elf, debug/macho, debug/pe: ignore final EOF from ReadAt LGTM=r R=r, bradfitz CC=golang-codereviews https://golang.org/cl/65740045 |
For my test case before/after deleting the Plan 9 symbol table:
% ls -l ...
-rwxr-xr-x 1 r staff 2474512 Feb 18 20:27 hello_1.2.x
-rwxr-xr-x 1 r staff 2150928 Feb 18 22:28 hello_1.2.y
% size ...
__TEXT  __DATA  __OBJC  others  dec  hex
1699840  160984  0  188944  2049768  1f46e8  hello_1.2.x
1376256  160984  0  188944  1726184  1a56e8  hello_1.2.y
%
So deleting the Plan 9 symbol table pretty close to exactly compensates for the GC information. We're back at Go 1.2 levels, still far too large but it's a start. |
|
This issue was updated by revision 2541cc8. Every function now has a gcargs and gclocals symbol holding associated garbage collection information. Put them all in the same meta-symbol as the go.func data and then drop individual entries from symbol table. Removing gcargs and gclocals reduces the size of a typical binary by 10%. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/65870044 |
|
This issue was updated by revision ae38b03. For an ephemeral binary - one created, run, and then deleted - there is no need to write dwarf debug information, since the binary will not be used with gdb. In this case, instruct the linker not to spend time and disk space generating the debug information by passing the -w flag to the linker. Omitting dwarf information reduces the size of most binaries by 25%. We may be more aggressive about this in the future. LGTM=bradfitz, r R=r, bradfitz CC=golang-codereviews https://golang.org/cl/65890043 |
After removing gcargs from the symbol table (stepping across CL 65870044)
% ls -l x.1.2.[yz]
-rwxr-xr-x 1 r staff 2150928 Feb 18 22:28 hello_1.2.y
-rwxr-xr-x 1 r staff 1932880 Feb 19 08:14 hello_1.2.z
% size x.1.2.[yz]
__TEXT  __DATA  __OBJC  others  dec  hex
1376256  160984  0  188944  1726184  1a56e8  hello_1.2.y
1376256  160984  0  110160  1647400  192328  hello_1.2.z
%
It's now smaller than at 1.2 but still much bigger than 1.1, let alone 1.0. |
Let's not use this issue as a discussion list. Please ask questions on golang-dev. Thanks. pprof not working is issue #7452. Labels changed: added restrict-addissuecomment-commit. |
This is as fixed as it is going to be for Go 1.3.
Right now at tip + CL 80370045 on darwin/amd64, compiling this program:
package main
import "fmt"
func main() {
fmt.Println("hello, world")
}
I get 1830352 bytes for the binary. Assuming this is the same case for which Rob's
numbers are reported, by this metric Go 1.3 will roll back more than half the size
increase caused by Go 1.2 (relative to Go 1.1). Will leave further improvement for Go
1.4.
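The "more than half" claim can be checked with a little arithmetic. The sketch below assumes, as the comment itself does, that the 1830352-byte figure is comparable to Rob's x.1.1 and x.1.2 numbers; the function name rollbackFraction is made up for this illustration.

```go
package main

import "fmt"

// rollbackFraction reports what fraction of the growth from base to peak
// has been undone once the size is current.
func rollbackFraction(base, peak, current int64) float64 {
	return float64(peak-current) / float64(peak-base)
}

func main() {
	const (
		go11 = 1525936 // x.1.1 in Rob's listing
		go12 = 2188576 // x.1.2 in Rob's listing
		tip  = 1830352 // size reported in this comment
	)
	fmt.Printf("growth 1.1 -> 1.2: %d bytes\n", go12-go11)
	fmt.Printf("recovered at tip:  %d bytes\n", go12-tip)
	fmt.Printf("rolled back:       %.1f%%\n", 100*rollbackFraction(go11, go12, tip))
}
```

Under that assumption, 358224 of the 662640 bytes added between 1.1 and 1.2 are recovered, which is a little over half.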
Labels changed: added release-go1.4, removed release-go1.3. |
|
This issue was updated by revision a26c01a. LGTM=khr R=khr CC=golang-codereviews https://golang.org/cl/80370045 |
|
One low-hanging fruit is probably how the dead-code pass of the linker deals with a use case of |
|
@typeless I'm not sure how much can be done about keeping deadcode alive in those situations. I suspect, maybe the solution in those cases is to either:
I suspect the reason your code still works is because the text/template doesn't hit the code-path that calls MethodByName on code that has been removed. |
|
@egonelbre |
|
I agree that the behavior is very conservative. The issue is that there's no easy way of knowing, which methods need to be kept. If it were low-hanging fruit, it would have been already picked :).
This change would break quite a lot of code. Any html/template that invokes a method could break. If it were built from the start that way, then sure, it could have been a better strategy. |
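To make the concern concrete, here is a minimal sketch of a method that is reachable only through a run-time name lookup, which is roughly what a template engine does when it evaluates a method reference. The Greeter type and the callByName helper are hypothetical and exist only for this illustration; the point is that nothing in the source calls Hello directly, so a linker has no static call edge telling it the method is needed.

```go
package main

import (
	"fmt"
	"reflect"
)

// Greeter is a hypothetical type used only for this illustration.
type Greeter struct{}

// Hello is never called directly anywhere in this program.
func (Greeter) Hello() string { return "hello" }

// callByName invokes a zero-argument, string-returning method whose name is
// only known at run time.
func callByName(v interface{}, name string) (string, bool) {
	m := reflect.ValueOf(v).MethodByName(name)
	if !m.IsValid() || m.Type().NumIn() != 0 || m.Type().NumOut() != 1 {
		return "", false
	}
	out := m.Call(nil)
	if out[0].Kind() != reflect.String {
		return "", false
	}
	return out[0].String(), true
}

func main() {
	for _, name := range []string{"Hello", "Missing"} {
		s, ok := callByName(Greeter{}, name)
		fmt.Printf("%s: %q %v\n", name, s, ok)
	}
}
```

If the method body were discarded as dead code, the lookup for "Hello" would fail at run time instead of at build time, which is why the conservative behavior is hard to relax.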
|
I thought at first that restricting dead-code elimination to only those packages that are not imported by other packages that invoke |
Seems fragile. Do all distros use the same string? Gets the wrong answer with a class of (absurdly named) executables. I guess the current approach is also arguably fragile in that you could lose an init vs delete race. Feel free to send a CL if you want. :) Note that you have to manually inline strings.TrimSuffix. |
|
Change https://golang.org/cl/311790 mentions this issue: |
|
@josharian ah, gotcha... I didn't read the comment near I'm not sure whether it'll get accepted or not, but I sent the CL anyways. |
Currently Readlink gets linked into the binary even when Executable is
not needed.
This reduces a simple "os.Stdout.Write([]byte("hello"))" by ~10KiB.
Previously the executable path was read during init time, because
deleting the executable would make "Readlink" return "(deleted)" suffix.
There's probably a slight chance that the init time reading would return
it anyways.
Updates #6853
Change-Id: Ic76190c5b64d9320ceb489cd6a553108614653d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/311790
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Tobias Klauser <tobias.klauser@gmail.com>
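As a rough sketch of the idea in this change, and not the actual standard library code, the path can be read on demand and the suffix stripped by hand. The names trimDeleted and executablePath are hypothetical, and the /proc/self/exe path and the exact " (deleted)" string are assumptions about Linux behavior. The suffix check is written out manually, in the spirit of the earlier remark about inlining strings.TrimSuffix.

```go
package main

import (
	"fmt"
	"os"
)

// deletedSuffix is what Linux is assumed to append to the link target
// once the executable file has been unlinked.
const deletedSuffix = " (deleted)"

// trimDeleted strips the suffix by hand so that this code does not need
// to import the strings package.
func trimDeleted(path string) string {
	if len(path) > len(deletedSuffix) && path[len(path)-len(deletedSuffix):] == deletedSuffix {
		return path[:len(path)-len(deletedSuffix)]
	}
	return path
}

// executablePath reads the link lazily, only when a caller asks for it,
// instead of during package initialization.
func executablePath() (string, error) {
	path, err := os.Readlink("/proc/self/exe")
	if err != nil {
		return "", err
	}
	return trimDeleted(path), nil
}

func main() {
	fmt.Println(trimDeleted("/tmp/hello (deleted)"))
	if p, err := executablePath(); err != nil {
		fmt.Println("readlink failed:", err)
	} else {
		fmt.Println(p)
	}
}
```

This also shows the fragility raised in review: an executable whose real name ends in " (deleted)" would have its path trimmed incorrectly.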
|
I've been following this issue for a long time. Unlike the general community, I use Golang for systems programming (GUI, games). Maybe a compiler parameter could be added to remove traceback information. |
|
@SeanTolstoyevski you may find this interesting: https://commaok.xyz/post/no-line-numbers/ |
|
@SeanTolstoyevski We're reluctant to provide a mechanism to remove traceback information because it will break |
Wouldn't it be possible to just remove symbol names, file, and line numbers, keeping the traceback data otherwise intact? This way consumers of |
|
I'm sorry, I don't understand the suggestion. The traceback information is exactly symbol names, file names, and line numbers (see https://golang.org/pkg/runtime/#Frame). How can we remove those and leave the traceback data intact? |
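For readers unfamiliar with runtime.Frame, the following small sketch prints the three pieces of information under discussion for one call site. The helper name currentFrame is made up for this example; runtime.Callers and runtime.CallersFrames are the standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentFrame returns the runtime.Frame describing its caller.
func currentFrame() runtime.Frame {
	pcs := make([]uintptr, 1)
	// Skip two frames: runtime.Callers itself and currentFrame.
	n := runtime.Callers(2, pcs)
	frames := runtime.CallersFrames(pcs[:n])
	frame, _ := frames.Next()
	return frame
}

func main() {
	f := currentFrame()
	fmt.Println("function:", f.Function)
	fmt.Println("file:    ", f.File)
	fmt.Println("line:    ", f.Line)
}
```

Each of these fields is backed by tables stored in the binary, which is the size cost the thread is weighing against working tracebacks.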
|
The Perhaps there was some sort of miscommunication. I assumed what you considered impossible is completely removing traceback information in the sense that |
|
Thanks everyone for the replies. Data such as traceback, import path information take up enough size in executable files. @ianlancetaylor In fact, there is no need to remove it completely. For example, for module paths consider something simple like Sorry for the insufficient english. I tried to explain this simply. I may have chosen the wrong words. |
|
That is available as the |
You might find https://github.com/burrowers/garble useful, which attempts to remove some source code information when building. It's particularly aggressive with the |
|
If we dropped the file/line/function information, then logging and profiling would break. |
I think breaking profiling is acceptable if the user desires smaller binaries. As for logging, I'm not sure what a good trade off would be. |
|
I use gomobile for my Android and iOS app. Android has the arm, arm64, 386, and amd64 architectures; if I support all of them, a libgojni.so is generated for each one, and the aar file size grows quickly. If the size can't be reduced, I think no one will use gomobile. |
This shrinks a binary that just does an http.ListenAndServe by about 90kb due to using less code for the static array.
syms
delta  name  old-size  new-size  pct-difference
-54206  vendor/golang_org/x/net/http2/hpack.newStaticTable  55233  1027  -98.14%
-4744  runtime.pclntab  1041055  1036311  -0.46%
-204  runtime.findfunctab  10675  10471  -1.91%
8  runtime.typelink  9852  9860  0.08%
41  runtime.gcbss  869  910  4.72%
11711  vendor/golang_org/x/net/http2/hpack.init  572  12283  2047.38%
sections
delta  name  old-size  new-size  pct-difference
-41888  .text  2185840  2143952  -1.92%
-37644  .rodata  842131  804487  -4.47%
-4744  .gopclntab  1041055  1036311  -0.46%
-3343  .debug_info  981995  978652  -0.34%
-2931  .debug_line  291295  288364  -1.01%
8  .typelink  9852  9860  0.08%
59  .debug_pubnames  81986  82045  0.07%
96  .symtab  186312  186408  0.05%
113  .debug_pubtypes  137500  137613  0.08%
128  .debug_frame  219140  219268  0.06%
220  .strtab  217109  217329  0.10%
2464  .bss  127752  130216  1.93%
Updates golang/go#6853
Change-Id: I3383e63300585539507b75faac1072264d8f37e7
Reviewed-on: https://go-review.googlesource.com/43090
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Inlining isn't performed on generated init functions. Removing the pair function and initializing the structs directly saves another 16k on a simple http.ListenAndServe() binary.
delta  name  old  new
-58  runtime.findfunctab  10471  10413  -0.55%
-41  runtime.gcbss  910  869  -4.51%
41  runtime.gcdata  612  653  6.70%
-408  runtime.pclntab  1036311  1035903  -0.04%
-11711  vendor/golang_org/x/net/http2/hpack.init  12283  572  -95.34%
Updates golang/go#6853
Change-Id: Ibccc796fe7403674cf4b4561acf9551d76ff11e8
Reviewed-on: https://go-review.googlesource.com/43190
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
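The two initialization styles these commits contrast can be sketched as follows. The headerField type, the pair helper, and the three entries are simplified stand-ins for illustration, not the real hpack source. Whether the call-based form still costs extra init code depends on the compiler version, so treat the size difference as a property of the toolchain at the time of these commits.

```go
package main

import "fmt"

// headerField is a simplified, hypothetical stand-in for the table entry type.
type headerField struct {
	Name, Value string
}

// pair builds one entry. When the table is written this way, every element
// is a function call that has to run in the package's generated init code.
func pair(name, value string) headerField {
	return headerField{Name: name, Value: value}
}

// viaCalls is built from helper calls at init time.
var viaCalls = []headerField{
	pair(":authority", ""),
	pair(":method", "GET"),
	pair(":method", "POST"),
}

// viaLiterals uses plain composite literals, which the compiler can lay out
// as static data without emitting per-element initialization code.
var viaLiterals = []headerField{
	{Name: ":authority", Value: ""},
	{Name: ":method", Value: "GET"},
	{Name: ":method", Value: "POST"},
}

func main() {
	for i := range viaCalls {
		fmt.Println(viaCalls[i] == viaLiterals[i], viaLiterals[i].Name, viaLiterals[i].Value)
	}
}
```

Both tables hold identical data; only the amount of code needed to construct them differs.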
|
Change https://go.dev/cl/462300 mentions this issue: |
This CL changes how the x86 compiler stores and loads the frame pointer in each function prologue and epilogue, with the goal of reducing the final binary size without affecting performance.
The compiler is currently using MOV instructions to load and store BP, which can take from 5 to 8 bytes each. This CL changes this approach so it emits PUSH/POP instructions instead, which always take only 1 byte each (when operating with BP). It can also avoid using SUBQ/ADDQ to grow the stack for functions that have a frame pointer but do not have local variables.
On Windows, this CL reduces the go toolchain size from 15,697,920 bytes to 15,584,768 bytes, a reduction of 0.7%.
Example of prologue and epilogue for a function with 0x10 bytes of local variables:
Before
===
SUBQ $0x18, SP
MOVQ BP, 0x10(SP)
LEAQ 0x10(SP), BP
... function body ...
MOVQ 0x10(SP), BP
ADDQ $0x18, SP
RET
===
After
===
PUSHQ BP
LEAQ 0(SP), BP
SUBQ $0x10, SP
... function body ...
ADDQ $0x10, SP
POPQ BP
RET
===
Updates #6853
Change-Id: Ice9e14bbf8dff083c5f69feb97e9a764c3ca7785
Reviewed-on: https://go-review.googlesource.com/c/go/+/462300
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
|
Change https://go.dev/cl/463845 mentions this issue: |
This CL instructs the Go x86 compiler to load the frame pointer address using a MOV instead of a LEA instruction, since the MOV is 1 byte shorter:
Before
55 PUSHQ BP
48 8d 2c 24 LEAQ 0(SP), BP
After
55 PUSHQ BP
48 89 e5 MOVQ SP, BP
This reduces the size of the Go toolchain by ~0.06%.
Updates #6853
Change-Id: I5557cf34c47e871d264ba0deda9b78338681a12c
Reviewed-on: https://go-review.googlesource.com/c/go/+/463845
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
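The one-byte saving follows directly from the encodings quoted in the commit message. The tiny sketch below just counts them; the variable and function names are made up for illustration, and the byte values are copied from the message, not independently assembled.

```go
package main

import "fmt"

// Prologue encodings as quoted in the commit message.
var (
	leaForm = []byte{0x55, 0x48, 0x8d, 0x2c, 0x24} // PUSHQ BP; LEAQ 0(SP), BP
	movForm = []byte{0x55, 0x48, 0x89, 0xe5}       // PUSHQ BP; MOVQ SP, BP
)

// bytesSaved returns how many bytes shorter the new encoding is.
func bytesSaved(before, after []byte) int {
	return len(before) - len(after)
}

func main() {
	fmt.Printf("before: %d bytes, after: %d bytes, saved per prologue: %d\n",
		len(leaForm), len(movForm), bytesSaved(leaForm, movForm))
}
```

One byte per function with a frame pointer is small, but it is paid in every such function, which is how it adds up to a measurable fraction of the toolchain size.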
As an experiment, I built "hello, world" at the release points for Go 1.0, 1.1, and 1.2. Here are the binary's sizes:
% ls -l x.1.?
-rwxr-xr-x 1 r staff 1191952 Nov 30 10:25 x.1.0
-rwxr-xr-x 1 r staff 1525936 Nov 30 10:20 x.1.1
-rwxr-xr-x 1 r staff 2188576 Nov 30 10:18 x.1.2
% size x.1.?
__TEXT  __DATA  __OBJC  others  dec  hex
880640  33682096  0  4112  34566848  20f72c0  x.1.0
1064960  94656  0  75952  1235568  12da70  x.1.1
1429504  147896  0  177440  1754840  1ac6d8  x.1.2
%
A near-doubling of the binary size in two releases is a bug of a kind. I will hold on to the files so they can be analyzed more, but am filing this issue to get the topic registered. We need to develop a better understanding of the problem and how to address it.
Marking this 1.3 (not maybe) because I consider it a priority.
A few months ago I exchanged mail with Russ about this topic regarding a different, much larger binary. To avoid him having to redo the analysis, here is what he said at the time:
====
i sent CL 13722046 to make the nm -S output a bit more useful. for the toy binary i now get
4a2280 1898528 D symtab
26f3a0 1405936 D type.*
671aa0 1058432 D pclntab
3c6790 598056 D go.string.*
4620c0 49600 D gcbss
7a7c20 45496 B runtime.mheap
46e280 21936 D gcdata
7a29e0 21056 b bufferList
1ed600 16480 T crypto/tls.(*Conn).clientHandshake
79eb20 16064 b semtable
1b3d90 14224 T net/http.init
that seems plausible to me. some notes:
symtab is the plan 9 symbol table. it is in the binary but never referenced at run time. it supports things like nm -S only. it needs to move into an unmapped section of the binary, but it is only costing at most 8k at run time right now due to fragmentation and it just wasn't worth the effort to try to move. the new linker will make this easier. of course, moving it in the file doesn't shrink the file.
the thing named pclntab is a reencoding of the original pclntab and the parts of the plan 9 symbol table that we did need at run time (mostly just a list of functions and their names and addresses). as you can see, it is much smaller than the old form (the symbol table dominates).
type.* is the reflect types and go.string.* is the static go string data. the * indicates that i coalesced many symbols into one, to avoid useless individual names bloating the symbol table. if we tried we could probably cut the reflect types by 2-4x. it would mean packing the data a bit more compactly than an ordinary go data structure would and then using unsafe to get it back out.
gcbss and gcdata are garbage collection bits for the bss and data segments. that's what atom symbol did, and it's not clear whether it will last (probably not) and whether what will replace it will be smaller. time will tell. i have a meeting with dmitriy, carl, and keith next week to figure out what the plan is.
runtime.mheap, bufferList, and semtable are bss.
you're not seeing the gdb dwarf debug information here, because it's not a runtime symbol.
g% otool -l $(which toy) | egrep '^ segname|filesize'
segname __PAGEZERO
filesize 0
segname __TEXT
filesize 7811072
segname __DATA
filesize 126560
segname __LINKEDIT
filesize 921772
segname __DWARF
filesize 2886943
g%
there's another 3 MB. you can build with -ldflags -w to get rid of that at least.
if you read the full otool -l output you will find
Load command 6
cmd LC_SYMTAB
cmdsize 24
symoff 10825728
nsyms 22559
stroff 11186924
strsize 560576
looks like another 1 MB or so (560576+11186924-10825728 or 22559*16+560576) for the mach-o symbol table.
when we do the new linker we can make recording this kind of information in a useful form a priority.
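The two estimates of the Mach-O symbol table size quoted in the mail can be worked through with the LC_SYMTAB fields from the same otool output. This is only a sketch of that arithmetic; the function names are made up, and the 16-byte entry size is the one the mail itself uses.

```go
package main

import "fmt"

// symtabBySpan measures from the start of the symbol entries to the end
// of the string table.
func symtabBySpan(symoff, stroff, strsize int64) int64 {
	return stroff + strsize - symoff
}

// symtabByCount adds the symbol entries (16 bytes each, as in the mail)
// to the string table size.
func symtabByCount(nsyms, strsize int64) int64 {
	return nsyms*16 + strsize
}

func main() {
	const (
		symoff  = 10825728
		nsyms   = 22559
		stroff  = 11186924
		strsize = 560576
	)
	span := symtabBySpan(symoff, stroff, strsize)
	count := symtabByCount(nsyms, strsize)
	fmt.Println("by span: ", span)
	fmt.Println("by count:", count)
	fmt.Println("gap:     ", span-count)
}
```

The span form gives 921772 bytes, which equals the __LINKEDIT filesize reported above; the count form gives 921520 bytes. Either way the symbol table accounts for a bit under 1 MB of the file.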