Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Windows: pprof output is broken in cgo packages for go test -cpuprofile #18416

Closed
smithjacobj opened this issue Dec 22, 2016 · 14 comments

Comments

Projects
None yet
6 participants
@smithjacobj
Copy link

commented Dec 22, 2016

Old issue: #16891

I'm concerned that even with embedded symbols in 1.8, this will still be an issue.

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    1.7 windows/386
  2. What operating system and processor architecture are you using (go env)?
    Windows, 386
  3. What did you do?
    If possible, provide a recipe for reproducing the error.
    A complete runnable program is good.
    A link on play.golang.org is best.
    run go test on a package with the -cpuprofile flag.
  4. What did you expect to see?
    Using output from 1.6.3, I get the following result from pprof top10:
(pprof) top10
2470ms of 2880ms total (85.76%)
Dropped 61 nodes (cum <= 14.40ms)
Showing top 10 nodes out of 91 (cum >= 40ms)
      flat  flat%   sum%        cum   cum%
    1440ms 50.00% 50.00%     1720ms 59.72%  <redacted>
     540ms 18.75% 68.75%      540ms 18.75%  runtime.cgocall
     210ms  7.29% 76.04%      210ms  7.29%  runtime.osyield
      50ms  1.74% 77.78%      190ms  6.60%  <redacted>
      50ms  1.74% 79.51%      100ms  3.47%  runtime.mallocgc
      40ms  1.39% 80.90%       40ms  1.39%  runtime.heapBitsForObject
      40ms  1.39% 82.29%       40ms  1.39%  runtime.heapBitsSetType
      40ms  1.39% 83.68%       80ms  2.78%  runtime.mapaccess2
      30ms  1.04% 84.72%       30ms  1.04%  runtime.scanobject
      30ms  1.04% 85.76%       40ms  1.39%  runtime.typedmemmove
  1. What did you see instead?
    Using output from 1.7, I get the following result from pprof top10
(pprof) top10
2.87s of 2.87s total (  100%)
      flat  flat%   sum%        cum   cum%
     2.87s   100%   100%      2.87s   100%  [traffic.test.exe]

The issue seems to be related to use of the github.com/google/gopacket/pcap package, which binds to libpcap via CGo

I tested with a bare CGo sample, and also a control.

Control results:

$ go tool pprof cgo_17_bug_control.test.exe cpu.prof
Entering interactive mode (type "help" for commands)
(pprof) top10
1650ms of 2040ms total (80.88%)
Dropped 4 nodes (cum <= 10.20ms)
Showing top 10 nodes out of 27 (cum >= 50ms)
      flat  flat%   sum%        cum   cum%
     370ms 18.14% 18.14%      370ms 18.14%  sync/atomic.AddUint32
     310ms 15.20% 33.33%      310ms 15.20%  sync/atomic.CompareAndSwapUint32
     210ms 10.29% 43.63%     1360ms 66.67%  math/rand.(*lockedSource).Int63
     210ms 10.29% 53.92%      210ms 10.29%  math/rand.(*rngSource).Int63
     130ms  6.37% 60.29%      440ms 21.57%  sync.(*Mutex).Lock
     120ms  5.88% 66.18%     1480ms 72.55%  math/rand.(*Rand).Int63
     120ms  5.88% 72.06%      500ms 24.51%  sync.(*Mutex).Unlock
      70ms  3.43% 75.49%      370ms 18.14%  runtime.newstack
      60ms  2.94% 78.43%     1610ms 78.92%  scratch/cgo_17_bug_control.grind
      50ms  2.45% 80.88%       50ms  2.45%  runtime.casgstatus
(pprof)

Bug results:

$ go tool pprof cgo_17_bug.test.exe cpu.prof
Entering interactive mode (type "help" for commands)
(pprof) top10
1.33s of 1.33s total (  100%)
      flat  flat%   sum%        cum   cum%
     1.33s   100%   100%      1.33s   100%  [cgo_17_bug.test.exe]
(pprof)

Control:

impl

package cgo_17_bug_control

import (
	"math/rand"
)

func grind(in int) (out int) {
	for i := 0; i < 1000; i++ {
		out += in + rand.Int()
	}
	return
}

test

package cgo_17_bug_control

import (
	"testing"
)

func BenchmarkGrind(b *testing.B) {
	for i := 0; i < b.N; i++ {
		grind(i)
	}
}

Bug

impl

package cgo_17_bug

// #include <stdlib.h>
//
// unsigned int grind(unsigned int in) {
// 	unsigned int out = 0;
// 	for (int i = 0; i < 1000; i++) {
// 		out += in + rand();
// 	}
// 	return out;
// }
import "C"

func grind(in uint) (out uint) {
	return uint(C.grind(C.uint(in)))
}

test

package cgo_17_bug

import (
	"testing"
)

func BenchmarkGrind(b *testing.B) {
	for i := 0; i < b.N; i++ {
		grind(uint(i))
	}
}
@ALTree

This comment has been minimized.

Copy link
Member

commented Dec 22, 2016

On the old issue, rsc wrote:

I'm going to close this since I believe Go 1.8 will fix it.

Now you write:

I'm concerned that even with embedded symbols in 1.8, this will still be an issue.

But there's no mention of go1.8 in this bug report(?). You only show results from 1.6 and 1.7.

Did you try tip (the current master, soon-to-be go1.8)? Can you reproduce this on tip?

@smithjacobj

This comment has been minimized.

Copy link
Author

commented Dec 22, 2016

I made sure to remove the existing binary and profile.

PS C:\Users\Jacob> go version
go version devel +1719034 Thu Dec 22 09:05:25 2016 +0000 windows/386

PS D:\dev\go\src\scratch\cgo_17_bug> go test -bench . -cpuprofile cpu.prof
BenchmarkGrind-12         200000             11915 ns/op
PASS
ok      scratch/cgo_17_bug      2.674s

PS D:\dev\go\src\scratch\cgo_17_bug> go tool pprof .\cgo_17_bug.test.exe .\cpu.prof
Entering interactive mode (type "help" for commands)
(pprof) top10
2.50s of 2.50s total (  100%)
      flat  flat%   sum%        cum   cum%
     2.50s   100%   100%      2.50s   100%  [cgo_17_bug.test.exe]
(pprof)
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 22, 2016

I copied the "bug" versions of your file but changed the package to issue18416.

> go1.7 test -c
> ./issue18416.test -test.bench=. -test.cpuprofile=y
testing: warning: no tests to run
BenchmarkGrind-4   	  100000	     16649 ns/op
PASS
> go1.7 tool pprof ./issue18416.test ./y
Entering interactive mode (type "help" for commands)
(pprof) top10
1.88s of 1.88s total (  100%)
Showing top 10 nodes out of 11 (cum >= 1.88s)
      flat  flat%   sum%        cum   cum%
     1.87s 99.47% 99.47%      1.88s   100%  runtime.cgocall
     0.01s  0.53%   100%      0.01s  0.53%  runtime.newdefer
         0     0%   100%      1.88s   100%  issue18416.BenchmarkGrind
         0     0%   100%      1.88s   100%  issue18416._Cfunc_grind
         0     0%   100%      1.88s   100%  issue18416.grind
         0     0%   100%      0.01s  0.53%  runtime.deferproc
         0     0%   100%      0.01s  0.53%  runtime.deferproc.func1
         0     0%   100%      1.88s   100%  runtime.goexit
         0     0%   100%      0.01s  0.53%  runtime.systemstack
         0     0%   100%      1.88s   100%  testing.(*B).launch

With current tip:

> go test -c
> ./issue18416.test -test.bench=. -test.cpuprofile=x
BenchmarkGrind-4   	  100000	     16572 ns/op
PASS
> go tool pprof ./issue18416.test ./x
Entering interactive mode (type "help" for commands)
(pprof) top10
1.88s of 1.88s total (  100%)
      flat  flat%   sum%        cum   cum%
     1.88s   100%   100%      1.88s   100%  runtime.cgocall
         0     0%   100%      1.88s   100%  issue18416.BenchmarkGrind
         0     0%   100%      1.88s   100%  issue18416._Cfunc_grind
         0     0%   100%      1.88s   100%  issue18416.grind
         0     0%   100%      1.88s   100%  runtime.goexit
         0     0%   100%      1.88s   100%  testing.(*B).launch
         0     0%   100%      1.88s   100%  testing.(*B).runN

I am not seeing the result you report, where you get no results.

However, I am running on GNU/Linux, so this may be somehow Windows-specific.

@ianlancetaylor ianlancetaylor added this to the Go1.9 milestone Dec 22, 2016

@smithjacobj

This comment has been minimized.

Copy link
Author

commented Dec 22, 2016

However, I am running on GNU/Linux, so this may be somehow Windows-specific.

I very much so have this feeling as well.

@ianlancetaylor ianlancetaylor changed the title runtime: pprof output is broken in cgo packages for go test -cpuprofile runtime: Windows: pprof output is broken in cgo packages for go test -cpuprofile Dec 22, 2016

@smithjacobj

This comment has been minimized.

Copy link
Author

commented Dec 22, 2016

git-bisect result:

ed8f0e5c33269de2f950d33ab7d50554b13f336e is the first bad commit
commit ed8f0e5c33269de2f950d33ab7d50554b13f336e
Author: Christopher Nelson <nadiasvertex@gmail.com>
Date:   Sun Dec 13 08:02:29 2015 -0500

    cmd/go: fix -buildmode=c-archive should work on windows

    Add supporting code for runtime initialization, including both
    32- and 64-bit x86 architectures.

    Add .ctors section on Windows to PE .o files, and INITENTRY to .ctors
    section to plug in to the GCC C/C++ startup initialization mechanism.
    This allows the Go runtime to initialize itself. Add .text section
    symbol for .ctor relocations. Note: This is unlikely to be useful for
    MSVC-based toolchains.

    Fixes #13494

    Change-Id: I4286a96f70e5f5228acae88eef46e2bed95813f3
    Reviewed-on: https://go-review.googlesource.com/18057
    Reviewed-by: Ian Lance Taylor <iant@golang.org>
    Run-TryBot: Ian Lance Taylor <iant@golang.org>

:040000 040000 e368e9998a86add3523c76cea45d9cd61cdf87eb be48c01f20dbab2f7fd633db                                        f461aa2c70f01a42 M      misc
:040000 040000 3a984374ad0ce1fef4f93648cf8dd48c1d431d00 65a4eb18ac7d7d58dab4ad22                                        207555a993cc722a M      src
@alexbrainman

This comment has been minimized.

Copy link
Member

commented Jan 3, 2017

I can reproduce this too. Only on windows/386, and, yes, it is broken by ed8f0e5. I have built issue18416.test.exe with go test -c on commit before ed8f0e5 (good.exe) and on the tip (bad.exe). It appears

C:\dev\src\issues\issue18416>objdump -t bad.exe | grep pcln
File STDIN:
[1158](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00126000 _runtime.pclntab
[1332](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0018465b _runtime.epclntab

C:\dev\src\issues\issue18416>objdump -t good.exe | grep pcln
File STDIN:
[1381](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0015b9e0 runtime.epclntab
[3519](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0010daa0 runtime.pclntab

C:\dev\src\issues\issue18416>

that all PE symbols in bad.exe have superfluous underscore in front of them.

So cmd/internal/objfile won't find any symbols in that executable. All correspondent go tools (nm, objdump, maybe others) would be broken too. This is for cgo only. We have a test for this in cmd/nm, but it does not test cgo.

I will let others decide what to do here. Please, let me know if I can help in any way. Thank you.

CC @nadiasvertex

Alex

@nadiasvertex

This comment has been minimized.

Copy link
Contributor

commented Jan 3, 2017

The underscores are necessary on that version of Windows. That's why they are there. They are not superfluous. We have had this discussion before.

@alexbrainman

This comment has been minimized.

Copy link
Member

commented Jan 3, 2017

The underscores are necessary on that version of Windows.

Why are they necessary?

We have had this discussion before.

Perhaps. I am an old man. I forget things.

Do you plan to fix broken pprof tool? And nm and objdump?

Thank you.

Alex

@nadiasvertex

This comment has been minimized.

Copy link
Contributor

commented Jan 5, 2017

The mangling scheme was established by Microsoft, and has been informally followed by other compilers including Digital Mars, Borland, and GNU GCC, when compiling code for the Windows platforms. The scheme even applies to other languages, such as Pascal, D, Delphi, Fortran, and C#. This allows subroutines written in those languages to call, or be called by, existing Windows libraries using a calling convention different from their default.

When compiling the following C examples:

int _cdecl    f (int x) { return 0; }
int _stdcall  g (int y) { return 0; }
int _fastcall h (int z) { return 0; }

32 bit compilers emit, respectively:

_f
_g@4
@h@4

In the stdcall and fastcall mangling schemes, the function is encoded as _name@X and @name@X respectively, where X is the number of bytes, in decimal, of the argument(s) in the parameter list (including those passed in registers, for fastcall). In the case of cdecl, the function name is merely prefixed by an underscore.

The 64-bit convention on Windows (Microsoft C) has no leading underscore.


I do not plan on fixing the broken tools, since that is the responsibility of the people that made them. I don't like the convention Microsoft established, but that is nonetheless what they did.

@alexbrainman

This comment has been minimized.

Copy link
Member

commented Jan 8, 2017

@nadiasvertex thanks for explaining. I will try fix this issue myself.

Alex

@gopherbot

This comment has been minimized.

Copy link

commented Jan 13, 2017

CL https://golang.org/cl/35076 mentions this issue.

@alexbrainman

This comment has been minimized.

Copy link
Member

commented Jan 13, 2017

@smithjacobj could you try this change https://go-review.googlesource.com/#/c/35076/ to see if it fixes your problem. Thank you.

Alex

@smithjacobj

This comment has been minimized.

Copy link
Author

commented Feb 1, 2017

@alexbrainman apologies for the delay, I didn't noticed the mention. I have tested with the patch and it appears to resolve the issue.

@alexbrainman

This comment has been minimized.

Copy link
Member

commented Feb 2, 2017

I have tested with the patch and it appears to resolve the issue.

Thank you for confirming.

Alex

@gopherbot gopherbot closed this in c7a7c5a Feb 4, 2017

@golang golang locked and limited conversation to collaborators Feb 4, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.