Description
What version of Go are you using (go version)?
Tip for Go:
go version devel +48c79734ff Fri Jul 20 20:08:15 2018 +0000 linux/amd64
Tip for Gccgo:
go version go1.10.3 gccgo (GCC) 9.0.0 20180724 (experimental) linux/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
Linux/amd64.
What did you do?
When collecting pprof CPU profiles using a gccgo-based toolchain I've noticed that there are unfriendly artifacts in the profile data, at least relative to what I see with the main Go compiler.
Here is a repro recipe. This uses "dwarf-check", a very small application that inspects the generated DWARF for ELF and Mach-O binaries, looking for a specific class of bugs/insanities.
Step 1: download and build the app to profile:
$ cd `go env GOPATH`
$ go get github.com/thanm/dwarf-check
$ cd github.com/thanm/dwarf-check
$ go build .
$ file dwarf-check
dwarf-check: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
statically linked, not stripped
Step 2: run the program to generate a profile. This invocation runs the program on itself, emitting a CPU profile to cpu.p:
$ ./dwarf-check -v 1 -iters 3 -cpuprofile cpu.p ./dwarf-check
initiating CPU profile
in main
loading ELF for ./dwarf-check
loading DWARF for ./dwarf-check
examining DWARF for ./dwarf-check
read 142948 DIEs, processed 41499 abstract origin refs
...
leaving main
finishing CPU profile
$
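For context, here is a minimal sketch (assumed for illustration, not the actual dwarf-check source) of how a -cpuprofile flag like the one above is typically wired up with runtime/pprof; the "initiating"/"finishing" messages above bracket the start and stop calls:

package main

import (
    "flag"
    "log"
    "os"
    "runtime/pprof"
)

var cpuprofile = flag.String("cpuprofile", "", "write a CPU profile to this file")

func main() {
    flag.Parse()
    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal(err)
        }
        // Start sampling: the runtime arranges for periodic SIGPROF delivery
        // and records the current call stack on each signal.
        if err := pprof.StartCPUProfile(f); err != nil {
            log.Fatal(err)
        }
        defer pprof.StopCPUProfile()
    }
    // ... open the ELF, walk the DWARF, etc. ...
}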
Step 3: inspect the profile using "pprof":
$ pprof ./dwarf-check cpu.p
File: dwarf-check
Type: cpu
...
(pprof) top15 > top15.txt
Generating report in top15.txt
(pprof) raw > raw.txt
Generating report in raw.txt
(pprof) tree > tree.txt
Generating report in tree.txt
(pprof) quit
Repeat the steps above with the main Go, then with gccgo.
For gccgo:
- in the build step I used "go build -gccgoflags -static-libgo .", mainly to create a binary with enough interesting DWARF in it
NB: I am using upstream 'pprof', in case that matters (e.g. "go get -u github.com/google/pprof").
What did you expect to see?
For the profile collected with the main Go compiler, the top functions in the profile look like:
Showing top 15 nodes out of 61
flat flat% sum% cum cum%
310ms 13.30% 13.30% 800ms 34.33% runtime.mallocgc
230ms 9.87% 23.18% 230ms 9.87% runtime.heapBitsSetType
200ms 8.58% 31.76% 1390ms 59.66% debug/dwarf.(*buf).entry
200ms 8.58% 40.34% 230ms 9.87% runtime.mapaccess2_fast32
120ms 5.15% 45.49% 230ms 9.87% runtime.mapassign_fast64
100ms 4.29% 49.79% 100ms 4.29% runtime.memclrNoHeapPointers
80ms 3.43% 53.22% 120ms 5.15% debug/dwarf.(*buf).string
60ms 2.58% 55.79% 1460ms 62.66% debug/dwarf.(*Reader).Next
60ms 2.58% 58.37% 60ms 2.58% runtime.nextFreeFast (inline)
...
What did you see instead?
For gccgo, I see the following top functions in the profile:
Showing top 15 nodes out of 86
flat flat% sum% cum cum%
2.73s 98.91% 98.91% 2.73s 98.91% runtime.sigprof.isra.164
0.01s 0.36% 99.28% 0.02s 0.72% [dwarf-check.gccgo]
0 0% 99.28% 2.73s 98.91% _IO_funlockfile
0 0% 99.28% 0.10s 3.62% __nss_passwd_lookup
0 0% 99.28% 0.02s 0.72% debug_dwarf.Data.parseAbbrev
0 0% 99.28% 0.02s 0.72% debug_dwarf.Data.parseUnits
0 0% 99.28% 0.02s 0.72% debug_dwarf.New
0 0% 99.28% 1.23s 44.57% debug_dwarf.Reader.Next
The first problem seems to be that the sigprof handler itself is showing up in the profile; based on what I see in the regular Go CPU profile, that should not be happening.
The second problem is that there are (apparently) bogus entries in the call stacks attributed to routines in libc.so, which I am pretty sure are not really being invoked (e.g. _IO_funlockfile).
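One way to see where the suspect frames land is to dump each location's address together with the mapping it falls into. Below is a minimal sketch (an illustration, not something from this report) using the upstream github.com/google/pprof/profile package; the address-minus-mapping-start arithmetic is the same computation done by hand further below:

package main

import (
    "fmt"
    "log"
    "os"

    "github.com/google/pprof/profile"
)

func main() {
    f, err := os.Open("cpu.p")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    p, err := profile.Parse(f)
    if err != nil {
        log.Fatal(err)
    }
    for _, loc := range p.Location {
        m := loc.Mapping
        if m == nil {
            fmt.Printf("loc %d: addr %#x (no mapping)\n", loc.ID, loc.Address)
            continue
        }
        // Report the sampled address as an offset into its mapping's file.
        fmt.Printf("loc %d: addr %#x -> %s+%#x\n",
            loc.ID, loc.Address, m.File, loc.Address-m.Start)
    }
}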
Here is an excerpt from the "raw" dump from the gccgo run:
PeriodType: cpu nanoseconds
Period: 10000000
Time: 2018-07-25 09:01:46.066810243 -0400 EDT
Duration: 2.51
Samples:
samples/count cpu/nanoseconds
1 10000000: 1 2 2 3 4 5 6 7 8 9 10 11 12
1 10000000: 1 2 2 3 4 13 14 15 16 17 18 11 12
1 10000000: 1 2 2 3 4 19 20 21 16 17 18 11 12
1 10000000: 1 2 2 3 4 22 16 17 18 11 12
9 90000000: 1 2 2 3 4 23 24 25 26 27 28 29
1 10000000: 1 2 2 3 4 30 30 31 32 16 17 18 11 12
and the corresponding locations ("1", "2" above, etc) from later in the dump:
Locations
1: 0x4c79ef M=1 runtime.sigprof.isra.164 /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/proc.go:3421 s=0
2: 0x4d95bc M=1 runtime.sighandler /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/signal_sighandler.go:35 s=0
runtime.sigtrampgo /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/signal_unix.go:312 s=0
3: 0x569c06 M=1 runtime.sigtramp /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/runtime/go-signal.c:131 s=0
4: 0x7fa77097e0bf M=2 _IO_funlockfile ??:?:0 s=0
5: 0x4df1f4 M=1 runtime.mapassign /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/hashmap.go:583 s=0
6: 0x42cdd8 M=1 debug_dwarf.Data.parseAbbrev /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/debug/dwarf/entry.go:89 s=0
7: 0x42d860 M=1 debug_dwarf.Data.parseUnits /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/debug/dwarf/unit.go:77 s=0
All of the entries in the profile start with the same sequence:
1 10000000: 1 2 2 3 4 <N> ...
where <N> is the actual function executing at the time SIGPROF was delivered. Location 4 is the suspicious one -- the IP is recorded as 0x7fa77097e0bf, which falls into the libc mapping:
1: 0x400000/0x74e000/0x0 ./dwarf-check.gccgo 52734d70535f365f564c596b3853416e645572522f5f56623832316f347a6f75636f6c6e4c6644454d2f346a4170633677515135414a714d724d4876542d2f5239433562436a50584756325f72356c45735a4a [FN][FL][LN][IN]
2: 0x7fa77096d000/0x7fa770985000/0x0 /lib/x86_64-linux-gnu/libpthread-2.24.so 29eacef9300ae49e471349173628703ccfeed6d7 [FN][FL][LN][IN]
Here 0x7fa77097e0bf - 0x7fa77096d000 is 0x110bf; disassembling the library in question I see:
...
0000000000011070 <_IO_funlockfile>:
__funlockfile():
/tmp/build-debs.Lfl5zt/glibc-2.24/nptl/../sysdeps/pthread/funlockfile.c:28
...
110b0: nop
110b1: nopl 0x0(%rax,%rax,1)
110b6: nopw %cs:0x0(%rax,%rax,1)
00000000000110c0 <__restore_rt>:
__restore_rt():
110c0: mov $0xf,%rax
110c7: syscall
110c9: nopl 0x0(%rax)
In other words, the IP value in question doesn't really make sense: offset 0x110bf lands in the nop padding just before __restore_rt, not in any instruction that _IO_funlockfile would actually be executing.
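For what it's worth, the manual check above can be automated. The sketch below (a hypothetical helper, not part of the original analysis) uses debug/elf to find the function symbol nearest below a given in-mapping offset and reports whether the offset actually falls inside that function's code:

package main

import (
    "debug/elf"
    "fmt"
    "log"
    "os"
    "strconv"
)

// Usage: symat <shared-object> <offset>
// e.g.:  symat /lib/x86_64-linux-gnu/libpthread-2.24.so 0x110bf
// For a shared object, symbol values are offsets from the load address,
// which is what IP minus mapping start gives us.
func main() {
    f, err := elf.Open(os.Args[1])
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    off, err := strconv.ParseUint(os.Args[2], 0, 64)
    if err != nil {
        log.Fatal(err)
    }
    syms, err := f.Symbols()
    if err != nil {
        // Fall back to the dynamic symbol table if .symtab is missing.
        if syms, err = f.DynamicSymbols(); err != nil {
            log.Fatal(err)
        }
    }
    // Find the function symbol starting at or below the offset.
    var best elf.Symbol
    for _, s := range syms {
        if elf.ST_TYPE(s.Info) != elf.STT_FUNC {
            continue
        }
        if s.Value <= off && s.Value >= best.Value {
            best = s
        }
    }
    fmt.Printf("offset %#x: nearest function %s at %#x (size %#x), inside it: %v\n",
        off, best.Name, best.Value, best.Size, off < best.Value+best.Size)
}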