proposal: cmd/link: by default, do not write out DWARF #26074

Open
robpike opened this Issue Jun 27, 2018 · 25 comments

@robpike
Contributor

robpike commented Jun 27, 2018

This is not as radical as it sounds.

At the very least, we need to understand what the default should be when building the Go installation: write out DWARF, or not? The costs and benefits are more subtle than some realize.

Update: As demonstrated below, dropping DWARF also causes a significant improvement in build/install time.

% cat hello.go
package main

import "fmt"

func main() {
	fmt.Printf("hello world")
}

With this canonical, trivial, but also representative program as input, I used a sequence of Go versions to build the binary on my Mac (OSX 10.13.5, amd64). I have sorted the list into chronological order by version:

% ls -l hello*
-rw-r--r--+   1 r  staff          71 Apr  8  2014 hello.go
-rwxr-xr-x    1 r  staff     1919504 Jun 27 13:44 hello1.4 # built with Go 1.4
-rwxr-xr-x    1 r  staff     1616000 Jun 27 13:54 hello1.7 # built with Go 1.7
-rwxr-xr-x    1 r  staff     1632480 Jun 27 13:45 hello1.8 # built with Go 1.8
-rwxr-xr-x    1 r  staff     1941456 Jun 27 13:47 hello1.9 # built with Go 1.9
-rwxr-xr-x    1 r  staff     2106672 Jun 27 13:50 hello1.10 # built with Go 1.10
-rwxr-xr-x    1 r  staff     2964464 Jun 27 14:01 hello-21Jun-2018 # built with Go at tip on 21 June 2018 - this is just before DWARF compression went in
-rwxr-xr-x    1 r  staff     1970552 Jun 27 13:53 hello1.11beta1 # built with Go 1.11 beta 1, with DWARF compression

I believe the drop from 1.4 to 1.7 (1.5 and 1.6 won't run on my Mac any more) is due to various cleanups in the binary triggered by #6853.

The growth after that is pretty much all due to DWARF. Absent compression, DWARF debugging is now half the binary, as reported by @rsc's sizecmp:

% sizecmp hello1.7 hello-21Jun-2018 
__bss                               108784      116464       +7680
__data                                6144       26896      +20752
__debug_abbrev                         255         467        +212
__debug_aranges                         48           0         -48
__debug_frame                        68836       81036      +12200
__debug_gdb_scri                        40          40          +0
__debug_info                        245056      482436     +237380
__debug_line                        101426      146608      +45182
__debug_loc                              0      436989     +436989
__debug_pubnames                     61781       32854      -28927
__debug_pubtypes                     26794       44637      +17843
__debug_ranges                           0      153520     +153520
__gopclntab                         277492      478600     +201108
__gosymtab                               0           0          +0
__itablink                              64          96         +32
__nl_symbol_ptr                          0         144        +144
__noptrbss                           19520        9208      -10312
__noptrdata                           8264       52284      +44020
__rodata                            208087      290543      +82456
__symbol_stub1                           0         108        +108
__text                              508560      592476      +83916
__typelink                            2732        3032        +300
                         total     1643883     2948438    +1304555
%

That's 1.3MB of growth, almost all in debug info. Even the PC-to-line table grew massively, quite disproportionate to text size, which is inexplicable to me, but also a bit off topic.
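As a rough check of the "half the binary" figure, the `__debug_*` rows of the table above can be summed and compared with the reported total. The following is only a sketch: the names `sectionSizes`, `totalSize`, and `debugBytes` are made up for illustration, and the numbers are copied from the hello-21Jun-2018 column of the sizecmp output.

```go
package main

import (
	"fmt"
	"strings"
)

// sectionSizes holds the __debug_* rows from the hello-21Jun-2018 column
// of the sizecmp output (sizes in bytes). The variable name is illustrative.
var sectionSizes = map[string]int64{
	"__debug_abbrev":   467,
	"__debug_aranges":  0,
	"__debug_frame":    81036,
	"__debug_gdb_scri": 40,
	"__debug_info":     482436,
	"__debug_line":     146608,
	"__debug_loc":      436989,
	"__debug_pubnames": 32854,
	"__debug_pubtypes": 44637,
	"__debug_ranges":   153520,
}

// totalSize is the "total" row for the same binary.
const totalSize = 2948438

// debugBytes sums the sizes of all sections whose name starts with "__debug_".
func debugBytes(sizes map[string]int64) int64 {
	var sum int64
	for name, n := range sizes {
		if strings.HasPrefix(name, "__debug_") {
			sum += n
		}
	}
	return sum
}

func main() {
	d := debugBytes(sectionSizes)
	share := 100 * float64(d) / totalSize
	fmt.Printf("DWARF sections: %d of %d bytes (%.1f%%)\n", d, totalSize, share)
}
```

By this count the DWARF sections come to 1378587 bytes, a little under half of the uncompressed binary.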

So, DWARF is huge, but we need it, right?

I don't think we do, most of the time. Surely when we are using Delve or GDB or perhaps one day LLDB, yes, but mostly not.

The need for DWARF and other debugging support in Go programs is much less than the corresponding need in C programs, for which DWARF was designed. Go binaries already include basic type information (reflection), a simple symbol table, and PC-to-line data. These not only help the running program, they also provide valuable debugging aids as they stand.

Even without DWARF at all, stack traces caused by panic would be unchanged and would contain symbols and line numbers. Pprof, objdump, and many other tools would still work.

The DWARF tables are present only for the debuggers.
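A small sketch of what the built-in tables already provide: the program below asks the runtime for the name, file, and line of a caller. It uses only `runtime.Caller` and `runtime.FuncForPC`, which read the runtime's own function and PC-to-line data rather than DWARF, so it should behave the same when built with `-ldflags=-w`. The helper name `whereAmI` is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// whereAmI reports the function name, file, and line of its caller,
// using only the tables the runtime keeps in the binary (no DWARF).
func whereAmI() (string, string, int) {
	pc, file, line, ok := runtime.Caller(1) // 1 = the caller of whereAmI
	if !ok {
		return "unknown", "unknown", 0
	}
	name := "unknown"
	if f := runtime.FuncForPC(pc); f != nil {
		name = f.Name()
	}
	return name, file, line
}

func main() {
	fn, file, line := whereAmI()
	fmt.Printf("%s at %s:%d\n", fn, file, line)
}
```

Building this once normally and once with `go build -ldflags=-w` is a quick way to confirm that the output does not depend on DWARF being present.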

And useful though the debuggers are sometimes, they are not used often and often not used at all. I think Delve is a great tool, but I use it only once or twice a year because the existing, built-in debugging information is almost always all I need. Why then do we write bloated binaries, paying a cost in file I/O and DWARF write time (not to mention time to compress now) when half the data in the binary is almost never used?

If I look in my personal bin directory, it consists of a few shell scripts and many Go binaries, and the net size is in the gigabytes. Gigabytes of binaries! I could delete the DWARF data from all of them and get much of the space back at no cost.

Also, keep in mind that much of this data is redundant. Yes, the addresses change between binaries but the type information that we write out for the runtime, garbage collector, and so on is a megabyte or more of utter redundancy, unvarying yet unshared.

The counterargument to dropping DWARF is of course that people want to debug their programs. The recent Go developer survey reported much higher concern for good debugging support than for reducing binary size. But I stress, most programs are never shown to a debugger, and many programmers only rarely use a debugger on a Go binary.

The desire to have good debugging does not immediately translate into writing out full massive DWARF data every time we build and install a program. (Test binaries and such actually skip DWARF, by default.) I believe time might be better spent improving native debugging support in the binaries, such as more informative stack traces, but that is another topic.

So to the proposal itself:

I propose we change the Go build environment to suppress DWARF by default, saving lots of CPU time and disk space. Instead, a global shell environment variable, say GODWARF=1, could be set to cause it to be written out. Programmers that want DWARF can set that once, in their shell profile, and have full data available. Others could set it only occasionally, on bad days.

For the rest of us, the rest of the time, why bother with it?

If it is decided that DWARF is too valuable to disable by default, I would instead propose a variant of this proposal, where I could set GODWARF=0 and turn it off in perpetuity.

In other words, I am proposing two things.

  1. Provide a mechanism, such as a global shell environment variable, to control whether DWARF is written by the tool chain.

  2. Decide whether that setting should switch to "no DWARF" by default. I would like that, but would be almost as happy just to have part 1: a simple way to suppress it.

Note: It's not easy enough to use -ldflags=-w, since there is no mechanism to set LD flags globally in the Go toolchain. Perhaps that's another way to approach the problem.

@robpike robpike added the Proposal label Jun 27, 2018

@rsc rsc changed the title from propsal: by default, do not write out DWARF to propsal: cmd/link: by default, do not write out DWARF Jun 27, 2018

@ianlancetaylor ianlancetaylor changed the title from propsal: cmd/link: by default, do not write out DWARF to proposal: cmd/link: by default, do not write out DWARF Jun 27, 2018

@gopherbot gopherbot added this to the Proposal milestone Jun 27, 2018

@bronze1man

bronze1man commented Jun 27, 2018

I think this is a good idea.
Our team has used -ldflags="-s -w" by default for years to keep binaries small and builds fast.
I occasionally remove those flags so that I can use go tool objdump to see which functions are in the binary. I was surprised to see the size of a simple Windows exe go from 1.6MB to 2.6MB with Go 1.9.
After six years of Go development, I still do not know what Delve or GDB is.
It looks like this proposal would have no effect on me, because I already use -ldflags=-w in my build tool.

@mvdan

Member

mvdan commented Jun 27, 2018

I too have tried to disable DWARF globally before, without much success. This proposal sounds great, and I agree with the change in default behavior. If a developer wants to use a debugger, they should be able to set an environment variable.

I also assume this wouldn't affect cmd/link's -s, since it doesn't seem to involve DWARF.

@aarzilli

Contributor

aarzilli commented Jun 27, 2018

> This is not as radical as it sounds.

This is pretty much what every other compiler does, I wouldn't call it radical.

> I propose we change the Go build environment to suppress DWARF by default, saving lots of CPU time and disk space. Instead, a global shell environment variable, say GODWARF=1, could be set to cause it to be written out

My preference would be for a go build flag. For example -g.

@bronze1man

bronze1man commented Jun 27, 2018

> My preference would be for a go build flag. For example -g.

My preference would be a Go package with a function that can build a Go package, exposing all possible switches with good default values and clear field names such as GoOs, GoArch, GoPathList, TagList, GoRoot, EnableDataRace, EnableDwarf ...

With command-line flags or shell environment variables, I find it very difficult to know all the possible settings and the correct type of each value.
I use Go to replace all bash scripts and ugly command-line flags, including the scripts that build my Go programs.
Isn't that the reason we use Go in the first place?

I suggest that the go command should be able to run, for the host OS and architecture, a Go package that serves as my build script. Then I would write everything else in that build script.

@davecheney

Contributor

davecheney commented Jun 27, 2018

Yes please. As I understand it, Delve already suggests compiling with -N -l, so adding another flag to the “I want to build for the debugger” step shouldn’t be a large burden.

@mvdan

Member

mvdan commented Jun 27, 2018

@bronze1man custom Go build programs have been discussed and rejected in the past - please keep this thread on topic. You're welcome to create a new proposal for your idea, though.

@rsc

Contributor

rsc commented Jun 27, 2018

@davecheney, I agree with you that "recompile for debugging with Delve/gdb/lldb/etc" is not too onerous and well-established. I just want to point out for the record that the recent work is supposed to make it the case that you don't need -N -l (and the consequent performance loss) just to run those debuggers. While I don't mind training people that if you're going to use Delve you might need a different build, it would be good if we could get away from "if you're going to use Delve you have to give up significant performance", and I believe Go 1.11 is a decent step toward that.

@rsc

Contributor

rsc commented Jun 27, 2018

@thanm

Member

thanm commented Jun 27, 2018

> The DWARF tables are present only for the debuggers.

Profilers read DWARF as well, for what it is worth. It's nice to be able to collect a detailed profile without having to recompile. On the other hand (as with debuggers) very few people run profilers.

How about a tweak -- continue to emit DWARF, but when linking "user" programs, don't include the DWARF from the standard packages (people who use debuggers are rare, but people who debug the Go runtime are even fewer).

@aarzilli

Contributor

aarzilli commented Jun 27, 2018

> How about a tweak -- continue to emit DWARF, but when linking "user" programs, don't include the DWARF from the standard packages (people who use debuggers are rare, but people who debug the Go runtime are even fewer).

That's not that easy. First, you'd still want the full debug_frame (at minimum) to unwind the stack (and possibly skeleton DIEs for all functions too). Secondly, Delve checks the destination of all CALL instructions to filter out the calls to runtime functions inserted by the compiler. Thirdly, Delve needs a bunch of DIEs from the runtime (off the top of my head: runtime.allgs, runtime.g, runtime.firstmoduledata) for various reasons. Basically, you would need to do something like Microsoft's symbol files for system DLLs.

@thanm

Member

thanm commented Jun 27, 2018

@aarzilli
Everything you say makes sense, but I thought we had previously established that delve users would already be recompiling with special delve-friendly compiler flags?

@aarzilli

Contributor

aarzilli commented Jun 27, 2018

@thanm: I was under the impression that this would be a compromise so they wouldn't have to. Also I assume most other tools will have needs similar to delve.

@dr2chase

Contributor

dr2chase commented Jun 27, 2018

I am not the least bit happy that I'm expected to debug something different than what I run, and that a recompilation is expected. I don't run the debugger much because it's not easy to run the debugger much, and I'm trying to fix that ("not too many people swimming across the river, why do we need a bridge?"). I do think it is onerous; and I view it as one of the things that Java got relatively right.

There's the additional problem that the -N -l flag combo (treating it as a single flag) is not tested nearly as well as the default; if we honestly support it, we'll need to run all.bash in that mode too, and anyone with widely-used libraries will want to also test using that mode, just in case. Doubling the testing load is not going to speed up developer workflow. And of course, we must document the flags, and people learning to use the compiler will need to learn about them if they plan to create binaries that delve or gdb can debug.

The main point of adding all this debugging information to Go binaries was these two things; to remove speed bumps from Go debugging, and to make progress on the do-we/don't-we support problem for these flags.

Debugging information is also not just for debuggers. Years ago a friend of mine showed me what he called the "killer app for Python", which was just really good backtraces that did a nice job of reporting the values of all the local variables in the stack trace. Their app would wrap all that up in a neat little bundle, ask the customer if they were willing to send it in as a bug report (with certain assurances about information confidentiality that probably wouldn't fly now), and his next interaction with said customer was not to badger them for more information, but instead "here's the dot-release that fixes your problem, download it at your convenience". "And the customer thinks I'm a wizard!" We could/should do this for Go, at least as an option.

There are some alternatives, for instance, with reproducible builds we can generate debugging information on demand -- but when we mention that to nearby colleagues working on other languages and platforms, their reaction is "yeah, right". Their best practice is to build with debugging, make a copy (of either the debugging info alone, or the entire binary) and strip for shipping.

Is binary size causing some other problem that can be discussed publicly? Are we filling up disks? (not mine -- with 10 copies of the Go repo in various states of build, plus all the go-gotten binaries for tools, plus testing and benchmarking, about 2% of my "disk", counting both bin and pkg). Is there some quota that we're exceeding? Are builds taking too long, and if so, by how much, and have we explored other ways of saving that time that don't reduce functionality? (the linker is due for a rewrite, the register allocator is definitely a time hog, Josh has done some interesting work on moving code from the front end to SSA that seems to save a little time, and has the potential to expose more of the compiler to multithreading).

@robpike

Contributor

robpike commented Jun 27, 2018

"Debugging information" does not mean only "DWARF". As I said in my post, we can improve self-debugging support independently of improving debugger support, which means DWARF.

@dr2chase can always get the DWARF he wants all the time by just setting the flag once in his profile. I am not proposing to turn DWARF off, I am proposing to turn it off by default as few programmers need the support it provides every day. You are not expected to run without DWARF, you are expected just to set the flag once, for yourself, if you want DWARF on.

Binary size matters. We copy binaries around, we push binaries over networks, we put them in containers, we store them in the cloud, we fill our bin directories with many programs; in those operations, the accumulations are significant.

As to your performance problems, my experiments with building the go command (cmd/go) show that generating DWARF for a significant program is about 40% of the build CPU time. That is a major chunk.

% rm go
rm: go: No such file or directory
bismarck=% time go build

real	0m1.205s
user	0m1.235s
sys	0m0.238s
% rm go
% time go build -ldflags=-w

real	0m0.730s
user	0m0.703s
sys	0m0.223s
% hoc -e 730/1235
0.5910931174089069
%

@dr2chase

Contributor

dr2chase commented Jun 27, 2018

You propose we double our testing load to account for the presence/absence of this option?

@robpike

Contributor

robpike commented Jun 27, 2018

Some compiler and tools work may require more testing, but only modest amounts. The great majority of Go programmers would be unaffected that way.

@dr2chase

Contributor

dr2chase commented Jun 28, 2018

If the DWARF is redundant -- once we add the missing information to what we store in the binary -- we could generate it from built-in debugging information, thus allowing us to replace before deprecation, instead of vice-versa.

We'd also want to be careful about how any compatibility guarantees interact with the debugging information; what "variables" can be examined at which "statements" depends on all sorts of stuff.

Another worry is to what degree we'd allow a Go program to introspect on its own execution through this debugging facility.

@balasanjay

Contributor

balasanjay commented Jun 28, 2018

If the binary records an accurate list of all modules (with versions) used for the build and all other relevant flags, it seems like delve could transparently generate the necessary information by retrieving the source and invoking the compiler.

@junland

junland commented Jul 2, 2018

Would building two separate binaries, one with debugging information and one without (using go build, or perhaps a different command such as go release), help people who don't like the idea of having their final binaries without DWARF, or of having to set an environment variable right off the bat? I don't know if this is in scope or on topic for this proposal; I just wanted to throw out a thought.

@davecheney

Contributor

davecheney commented Jul 3, 2018

@dr2chase, respectfully, the “why build bridges” argument applies equally to binary size: “disks are hella big, why optimise for space?”

@rsc

Contributor

rsc commented Jul 10, 2018

Now that DWARF is compressed on Mac (and I assume also on Windows), it seems like we're back to a 20% or so space overhead, and the main issue is now link latency.

Even on Linux I see DWARF aggregation approximately doubling the time spent in the linker (and it's not like the non-DWARF parts of the linker are terribly fast). So this is not a Mac-specific problem. Of course, compression might be what's taking all the time.

Putting this on hold pending better understanding of where all the CPU time is going and that there aren't significant optimizations remaining that might make it faster. If 2X really is the cost of DWARF, then we should resume this conversation about whether it makes sense to pay that cost speculatively all the time when the probability of needing it is near zero.

@rsc rsc added the Proposal-Hold label Jul 10, 2018

@rsc

Contributor

rsc commented Jul 10, 2018

Filed #26318 for the linker time issue.

@gopherbot

gopherbot commented Jul 29, 2018

Change https://golang.org/cl/126656 mentions this issue: cmd/go: add $GOFLAGS environment variable

gopherbot pushed a commit that referenced this issue Aug 1, 2018

cmd/go: add $GOFLAGS environment variable
People sometimes want to turn on a particular go command flag by default.
In Go 1.11 we have at least two different cases where users may need this.

1. Linking can be noticeably slower on underpowered systems
due to DWARF, and users may want to set -ldflags=-w by default.

2. For modules, some users or CI systems will want vendoring always,
so they want -getmode=vendor (soon to be -mod=vendor) by default.

This CL generalizes the problem to “set default flags for the go command.”

$GOFLAGS can be a space-separated list of flag settings, but each
space-separated entry in the list must be a standalone flag.
That is, you must do 'GOFLAGS=-ldflags=-w' not 'GOFLAGS=-ldflags -w'.
The latter would mean to pass -w to go commands that understand it
(if any do; if not, it's an error to mention it).
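The splitting rule described above can be illustrated with a short sketch. This is not the go command's actual implementation; `splitGOFLAGS` is a hypothetical helper that only mimics the stated behavior: the value is split on whitespace, and each resulting entry is treated as a flag in its own right.

```go
package main

import (
	"fmt"
	"strings"
)

// splitGOFLAGS is a simplified illustration of the rule in the commit
// message, not the go command's real code: the value is split on
// whitespace, and every resulting entry must look like a flag by itself.
func splitGOFLAGS(value string) ([]string, error) {
	entries := strings.Fields(value)
	for _, e := range entries {
		if !strings.HasPrefix(e, "-") || e == "-" || e == "--" {
			return nil, fmt.Errorf("GOFLAGS entry %q is not a standalone flag", e)
		}
	}
	return entries, nil
}

func main() {
	for _, v := range []string{
		"-ldflags=-w",             // one entry: -w is passed to the linker
		"-ldflags=-w -mod=vendor", // two independent flags
		"-ldflags -w",             // two entries: -w is now a flag of its own
	} {
		entries, err := splitGOFLAGS(v)
		fmt.Printf("%q -> %q (err: %v)\n", v, entries, err)
	}
}
```

The third case shows the pitfall the commit message warns about: with a space instead of `=`, `-w` is no longer the value of `-ldflags` but a separate entry.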

For #26074.
For #26318.
Fixes #26585.

Change-Id: I428f79c1fbfb9e41e54d199c68746405aed2319c
Reviewed-on: https://go-review.googlesource.com/126656
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>

@docmerlin

docmerlin commented Aug 4, 2018

Almost all my production server code has the prof built in. As far as I know this is standard practice. I would hate for this to become harder to do. Also this would be a breaking change.

@stapelberg

Contributor

stapelberg commented Aug 5, 2018

> Almost all my production server code has the prof built in.

As mentioned in the very first post of this issue, profiling is not affected when DWARF information is not present:

> Even without DWARF at all, stack traces caused by panic would be unchanged and would contain symbols and line numbers. Pprof, objdump, and many other tools would still work.
