gccgo problems with pprof CPU profile #26595

@thanm

Description

What version of Go are you using (go version)?

Tip for Go:
go version devel +48c79734ff Fri Jul 20 20:08:15 2018 +0000 linux/amd64

Tip for gccgo:
go version go1.10.3 gccgo (GCC) 9.0.0 20180724 (experimental) linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

Linux/amd64.

What did you do?

When collecting pprof CPU profiles using a gccgo-based toolchain I've noticed that there are unfriendly artifacts in the profile data, at least relative to what I see with the main Go compiler.

Here is a repro recipe. It uses "dwarf-check", a very small application that inspects the generated DWARF for ELF and Mach-O binaries, looking for a specific class of bugs/insanities.
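
For context, the core of what dwarf-check does is just a walk over the DWARF entries of a binary using the standard library. A rough sketch of that pattern (not the actual tool; the ELF-only handling and counting logic here are my own simplification) looks like:

 package main

 import (
     "debug/dwarf"
     "debug/elf"
     "fmt"
     "log"
     "os"
 )

 func main() {
     // Open the ELF binary and pull out its DWARF data.
     f, err := elf.Open(os.Args[1])
     if err != nil {
         log.Fatal(err)
     }
     defer f.Close()
     d, err := f.DWARF()
     if err != nil {
         log.Fatal(err)
     }
     // Walk every DIE, counting abstract-origin references along the way.
     r := d.Reader()
     dies, aorefs := 0, 0
     for {
         e, err := r.Next()
         if err != nil {
             log.Fatal(err)
         }
         if e == nil {
             break
         }
         dies++
         if e.AttrField(dwarf.AttrAbstractOrigin) != nil {
             aorefs++
         }
     }
     fmt.Printf("read %d DIEs, processed %d abstract origin refs\n", dies, aorefs)
 }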

Step 1: download and build the app to profile:

 $ cd `go env GOPATH`
 $ go get github.com/thanm/dwarf-check
 $ cd github.com/thanm/dwarf-check
 $ go build .
 $ file dwarf-check
 dwarf-check: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
   statically linked, not stripped

Step 2: run the program to generate a profile. This invocation runs the program on itself, emitting a CPU profile to cpu.p:

 $ ./dwarf-check -v 1 -iters 3 -cpuprofile cpu.p ./dwarf-check
 initiating CPU profile
 in main
 loading ELF for ./dwarf-check
 loading DWARF for ./dwarf-check
 examining DWARF for ./dwarf-check
 read 142948 DIEs, processed 41499 abstract origin refs
 ...
 leaving main
 finishing CPU profile
 $
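
(The -cpuprofile flag above is presumably just the standard runtime/pprof wiring; a minimal sketch of that pattern, with the flag name assumed to match the invocation above, is:)

 package main

 import (
     "flag"
     "log"
     "os"
     "runtime/pprof"
 )

 var cpuprofile = flag.String("cpuprofile", "", "write CPU profile to this file")

 func main() {
     flag.Parse()
     if *cpuprofile != "" {
         f, err := os.Create(*cpuprofile)
         if err != nil {
             log.Fatal(err)
         }
         // Samples are collected via SIGPROF at the default 100Hz rate,
         // which matches the 10ms period visible in the raw dump below.
         if err := pprof.StartCPUProfile(f); err != nil {
             log.Fatal(err)
         }
         defer pprof.StopCPUProfile()
     }
     // ... actual DWARF-checking work happens here ...
 }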

Step 3: inspect the profile using "pprof":

 $ pprof ./dwarf-check cpu.p
 File: dwarf-check
 Type: cpu
 ...
 (pprof) top15 > top15.txt
 Generating report in top15.txt
 (pprof) raw > raw.txt
 Generating report in raw.txt
 (pprof) tree > tree.txt
 Generating report in tree.txt
 (pprof) quit
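
(The same reports can also be produced non-interactively if that is easier to reproduce; with upstream pprof I believe the equivalent is something like:)

 $ pprof -top -nodecount=15 ./dwarf-check cpu.p > top15.txt
 $ pprof -raw ./dwarf-check cpu.p > raw.txt
 $ pprof -tree ./dwarf-check cpu.p > tree.txt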

Repeat the steps above, first with the main Go toolchain and then with gccgo.

For gccgo:

  • in the build step I used "go build -gccgoflags -static-libgo .", mainly to create a binary with enough interesting DWARF in it

NB: I am using upstream 'pprof', in case that matters (e.g. "go get -u github.com/google/pprof").

What did you expect to see?

For the profile collected with the main Go compiler, the top functions in the profile look like:

   Showing top 15 nodes out of 61
      flat  flat%   sum%        cum   cum%
     310ms 13.30% 13.30%      800ms 34.33%  runtime.mallocgc
     230ms  9.87% 23.18%      230ms  9.87%  runtime.heapBitsSetType
     200ms  8.58% 31.76%     1390ms 59.66%  debug/dwarf.(*buf).entry
     200ms  8.58% 40.34%      230ms  9.87%  runtime.mapaccess2_fast32
     120ms  5.15% 45.49%      230ms  9.87%  runtime.mapassign_fast64
     100ms  4.29% 49.79%      100ms  4.29%  runtime.memclrNoHeapPointers
      80ms  3.43% 53.22%      120ms  5.15%  debug/dwarf.(*buf).string
      60ms  2.58% 55.79%     1460ms 62.66%  debug/dwarf.(*Reader).Next
      60ms  2.58% 58.37%       60ms  2.58%  runtime.nextFreeFast (inline)
      ...

What did you see instead?

For gccgo, I see the following top functions in the profile:

Showing top 15 nodes out of 86
      flat  flat%   sum%        cum   cum%
     2.73s 98.91% 98.91%      2.73s 98.91%  runtime.sigprof.isra.164
     0.01s  0.36% 99.28%      0.02s  0.72%  [dwarf-check.gccgo]
         0     0% 99.28%      2.73s 98.91%  _IO_funlockfile
         0     0% 99.28%      0.10s  3.62%  __nss_passwd_lookup
         0     0% 99.28%      0.02s  0.72%  debug_dwarf.Data.parseAbbrev
         0     0% 99.28%      0.02s  0.72%  debug_dwarf.Data.parseUnits
         0     0% 99.28%      0.02s  0.72%  debug_dwarf.New
         0     0% 99.28%      1.23s 44.57%  debug_dwarf.Reader.Next

The first problem is that the sigprof handler itself is showing up in the profile; it seems to me that this should not be happening, based on what I see in the regular Go CPU profile.

The second problem is that there are (apparently) bogus entries in the call stacks attributed to routines in the C runtime libraries (e.g. _IO_funlockfile), which I am pretty sure are not really being invoked.

Here is an excerpt from the "raw" dump from the gccgo run:

PeriodType: cpu nanoseconds
Period: 10000000
Time: 2018-07-25 09:01:46.066810243 -0400 EDT
Duration: 2.51
Samples:
samples/count cpu/nanoseconds
          1   10000000: 1 2 2 3 4 5 6 7 8 9 10 11 12 
          1   10000000: 1 2 2 3 4 13 14 15 16 17 18 11 12 
          1   10000000: 1 2 2 3 4 19 20 21 16 17 18 11 12 
          1   10000000: 1 2 2 3 4 22 16 17 18 11 12 
          9   90000000: 1 2 2 3 4 23 24 25 26 27 28 29 
          1   10000000: 1 2 2 3 4 30 30 31 32 16 17 18 11 12 

and the corresponding locations ("1", "2" above, etc) from later in the dump:

Locations
     1: 0x4c79ef M=1 runtime.sigprof.isra.164 /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/proc.go:3421 s=0
     2: 0x4d95bc M=1 runtime.sighandler /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/signal_sighandler.go:35 s=0
             runtime.sigtrampgo /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/signal_unix.go:312 s=0
     3: 0x569c06 M=1 runtime.sigtramp /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/runtime/go-signal.c:131 s=0
     4: 0x7fa77097e0bf M=2 _IO_funlockfile ??:?:0 s=0
     5: 0x4df1f4 M=1 runtime.mapassign /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/runtime/hashmap.go:583 s=0
     6: 0x42cdd8 M=1 debug_dwarf.Data.parseAbbrev /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/debug/dwarf/entry.go:89 s=0
     7: 0x42d860 M=1 debug_dwarf.Data.parseUnits /ssd/gcc-trunk/build-gcc/x86_64-pc-linux-gnu/libgo/../../../gcc-trunk/libgo/go/debug/dwarf/unit.go:77 s=0

All of the entries in the profile start with the same sequence:

        1   10000000: 1 2 2 3 4 <N> ...

where <N> is the actual function executing at the time SIGPROF was delivered. Location 4 is the suspicious one -- the IP is recorded as 0x7fa77097e0bf, which falls into the libpthread mapping:

1: 0x400000/0x74e000/0x0 ./dwarf-check.gccgo 52734d70535f365f564c596b3853416e645572522f5f56623832316f347a6f75636f6c6e4c6644454d2f346a4170633677515135414a714d724d4876542d2f5239433562436a50584756325f72356c45735a4a [FN][FL][LN][IN]
2: 0x7fa77096d000/0x7fa770985000/0x0 /lib/x86_64-linux-gnu/libpthread-2.24.so 29eacef9300ae49e471349173628703ccfeed6d7 [FN][FL][LN][IN]

Here 0x7fa77097e0bf - 0x7fa77096d000 is 0x110bf; disassembling the library in question I see:

  ...
  0000000000011070 <_IO_funlockfile>:
  __funlockfile():
  /tmp/build-debs.Lfl5zt/glibc-2.24/nptl/../sysdeps/pthread/funlockfile.c:28
  ...
     110b0:	nop
     110b1:	nopl   0x0(%rax,%rax,1)
     110b6:	nopw   %cs:0x0(%rax,%rax,1)

  00000000000110c0 <__restore_rt>:
  __restore_rt():
     110c0:	mov    $0xf,%rax
     110c7:	syscall 
     110c9:	nopl   0x0(%rax)
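
(For reference, the excerpt above can be regenerated with something along the lines of:)

   $ objdump -d --start-address=0x11070 --stop-address=0x110d0 \
         /lib/x86_64-linux-gnu/libpthread-2.24.so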

In other words, the IP value in question doesn't really make sense as far as I can tell: it lands in the run of padding nops just before __restore_rt, not at any meaningful instruction in _IO_funlockfile.
