Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pprof: ignores symbols in symbolized profile with no mappings #120

Closed
aalexand opened this issue Mar 15, 2017 · 19 comments · Fixed by #135
Closed

pprof: ignores symbols in symbolized profile with no mappings #120

aalexand opened this issue Mar 15, 2017 · 19 comments · Fixed by #135
Assignees

Comments

@aalexand
Copy link
Collaborator

The issue was noticed in #111.

Test server:

package main

import (
  "log"
  "net/http"
  _ "net/http/pprof"
)

func run() {
  for {
  }
}

func main() {
  go func() {
    log.Println(http.ListenAndServe("localhost:6060", nil))
  }()
  run()
}

Running it as go run server.go and profiling as "pprof -symbolize=remote localhost:6060/debug/pprof/profile?seconds=5" produces unknown symbol on OSX with Go 1.8. Doing the same with Go 1.7.5 produces correctly symbolized output.

Looking at "-raw" output for the Go 1.8 vs. 1.7.5 profiles, there is a difference in the mapping section:

Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile?seconds=5
Saved profile in /Users/aalexand/pprof/pprof.samples.cpu.011.pb.gz
Type: cpu
Time: Mar 15, 2017 at 1:08pm (PDT)
Duration: 5s, Total samples = 4.29s (85.74%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) raw
PeriodType: cpu nanoseconds
Period: 10000000
Time: 2017-03-15 13:08:32.787861506 -0700 PDT
Duration: 5.00
Samples:
samples/count cpu/nanoseconds
         83  830000000: 1 2 3 4 
          1   10000000: 5 6 7 3 4 
        345 3450000000: 8 9 10 11 
Locations
     1: 0x105adc6 
     2: 0x1037612 
     3: 0x10309cd 
     4: 0x10308a3 
     5: 0x105abc6 
     6: 0x105abf4 
     7: 0x1037726 
     8: 0x1311630 
     9: 0x1311679 
    10: 0x102dad9 
    11: 0x1059af0 
Mappings

with Go 1.8 vs.

Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile?seconds=5
Saved profile in /Users/aalexand/pprof/pprof.samples.cpu.012.pb.gz
Type: cpu
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) raw
PeriodType: cpu nanoseconds
Period: 10000000
Samples:
samples/count cpu/nanoseconds
        388 3880000000: 1 2 3 4 
         37  370000000: 5 6 7 8 
Locations
     1: 0x2040 M=1 main.run :0 s=0
     2: 0x2089 M=1 main.main :0 s=0
     3: 0x2ddd3 M=1 runtime.main :0 s=0
     4: 0x5d930 M=1 runtime.goexit :0 s=0
     5: 0x5e8b6 M=1 runtime.usleep :0 s=0
     6: 0x377d1 M=1 runtime.sysmon :0 s=0
     7: 0x30c1d M=1 runtime.mstart1 :0 s=0
     8: 0x30af3 M=1 runtime.mstart :0 s=0
Mappings
1: 0x0/0xffffffffffffffff/0x0   [FN]

with Go 1.7.5.

Doing the same on Linux, produces

Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile?seconds=5
Saved profile in /usr/local/google/home/aalexand/pprof/pprof.server.samples.cpu.005.pb.gz
File: server
Type: cpu
Time: Mar 15, 2017 at 1:16pm (PDT)
Duration: 5s, Total samples = 4.99s (99.79%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) raw
PeriodType: cpu nanoseconds
Period: 10000000
Time: 2017-03-15 13:16:34.995513521 -0700 PDT
Duration: 5.00
Samples:
samples/count cpu/nanoseconds
        499 4990000000: 1 2 3 4 
Locations
     1: 0x70fb00 M=1 main.run :0 s=0
     2: 0x70fb49 M=1 main.main :0 s=0
     3: 0x42dab9 M=1 runtime.main :0 s=0
     4: 0x45a4f0 M=1 runtime.goexit :0 s=0
Mappings
1: 0x400000/0x710000/0x0 /tmp/go-build929989772/command-line-arguments/_obj/exe/server  [FN]

and

Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile?seconds=5
Saved profile in /usr/local/google/home/aalexand/pprof/pprof.server.samples.cpu.004.pb.gz
File: server
Type: cpu
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) raw
PeriodType: cpu nanoseconds
Period: 10000000
Samples:
samples/count cpu/nanoseconds
        500 5000000000: 1 2 
Locations
     1: 0x401000 M=1 main.run :0 s=0
     2: 0x45ddc0 M=1 runtime.goexit :0 s=0
Mappings
1: 0x400000/0x699000/0x0 /tmp/go-build092539107/command-line-arguments/_obj/exe/server  [FN]

respectively that are both correctly symbolized profiles.

@tamird
Copy link
Contributor

tamird commented Mar 15, 2017

@aalexand perhaps it's worthwhile to revert #114 since this isn't specific to the changes made there.

@rauls5382 rauls5382 self-assigned this Mar 15, 2017
aalexand added a commit to aalexand/pprof that referenced this issue Mar 15, 2017
This reverts the revert of google#111 relaxing the test to make it pass as the
issue reported in google#111 is caused by unrelated google#120.

Added a test, verified the test failed before the fix.

This fixes google#94.
aalexand added a commit that referenced this issue Mar 15, 2017
* Recognize (again) https+insecure scheme for symbolization POSTs.

This reverts the revert of #111 relaxing the test to make it pass as the
issue reported in #111 is caused by unrelated #120.

Added a test, verified the test failed before the fix.

This fixes #94.

* Fix the staticcheck failure.
@rauls5382
Copy link
Contributor

The profile coming from the 1.8 handler on OSX is unsymbolized and has no mappings.
I'd prefer all Go profiles to be symbolized online. Is online symbolization available on OSX? If not, we would need a mapping in the profile to enable offline symbolization.

Can somebody from the Go distro take a look?

@hyangah
Copy link
Contributor

hyangah commented Mar 30, 2017

After cherry-picking https://go-review.googlesource.com/34192
it seems to work for me.
I don't understand how it worked on Linux given that go1.8 release branch doesn't include the change.

Filed golang/go#19786

@rauls5382
Copy link
Contributor

I assume @aalexand tried this on a linux system that was using a patched version.
Closing as there is no work to do for this in pprof.

@tamird
Copy link
Contributor

tamird commented Mar 30, 2017

@rauls5382 I don't understand your comment. I originally reported this issue in #111 and I was definitely using stock toolchains in both Linux and Darwin.

@rauls5382
Copy link
Contributor

Yes, the issue is on all platforms. The comment from @aalexand at the beginning of this issue suggests that this only happens on OSX, but that appears not to be the case.

@tamird
Copy link
Contributor

tamird commented Mar 30, 2017

I mean, that's nice, but it's factually incorrect. The problem does not occur on Linux on Go 1.8.

I think this should be reopened and investigated more deeply.

@rauls5382
Copy link
Contributor

I see. I saw your comment on #111 thinking that this reproduced on both, but you were referring to the original issue, not the symbolization failure.

From @aalexand investigation seems the difference is that the OSX profile has no mappings, so pprof cannot symbolize it. Moving forward we want online symbolization in Go anyway, but it may be worth opening an issue against go about they not having mappings on OSX -- those will be useful to symbolize non-Go symbols offline.

@tamird
Copy link
Contributor

tamird commented Mar 30, 2017

Great, thanks for that.

Could you mention that issue when you've filed it?

@hyangah
Copy link
Contributor

hyangah commented Mar 30, 2017

I see. OSX is still missing the mappings section with Go tip.

(pprof) raw
PeriodType: cpu nanoseconds
Period: 100000
Time: 2017-03-30 12:10:20.33628219 -0400 EDT
Duration: 5.19
Samples:
samples/count cpu/nanoseconds
        436   43600000: 1 2 3 
          1     100000: 4 
          1     100000: 5 6 7 8 
Locations
     1: 0x12dc6e0 main.run /Users/hakim/scratch/server.go:10 s=0
     2: 0x12dc729 main.main /Users/hakim/scratch/server.go:18 s=0
     3: 0x102d6d4 runtime.main /Users/hakim/go/src/runtime/proc.go:185 s=0
     4: 0x12d42f8 runtime/pprof.profileWriter /Users/hakim/go/src/runtime/pprof/pprof.go:706 s=0
     5: 0x105c5c6 runtime.usleep /Users/hakim/go/src/runtime/sys_darwin_amd64.s:323 s=0
     6: 0x1037775 runtime.sysmon /Users/hakim/go/src/runtime/proc.go:3741 s=0
     7: 0x103090d runtime.mstart1 /Users/hakim/go/src/runtime/proc.go:1162 s=0
     8: 0x10307e3 runtime.mstart /Users/hakim/go/src/runtime/proc.go:1132 s=0
Mappings

@aalexand
Copy link
Collaborator Author

aalexand commented Mar 30, 2017

@hyangah can you file a Go bug for this? Or I can do it - LMK, want to make sure we don't race.

@hyangah
Copy link
Contributor

hyangah commented Mar 30, 2017

Yes, I can.
Can you provide a pointer to how pprof retrieved Mappings with go1.7? (pre-proto era, I guess)

Clearly, https://github.com/golang/go/blob/master/src/runtime/pprof/proto.go#L370 is not for osx.

@aalexand
Copy link
Collaborator Author

I think go 1.7 generated the profile in a legacy format - https://github.com/golang/go/blob/go1.7.4/src/runtime/pprof/pprof.go#L683, and that has special handling for when there aren't any mappings -

// This is important for the Go legacy handler, which produced
.

In #89 there was a workaround added by @rauls5382 to create a fake mapping for non-legacy (i.e. proto) profiles as well but only when there is a binary explicitly specified on the command line. We could probably add the fake mapping if the profile source is remote but this seems a fragile hack.

@tamird
Copy link
Contributor

tamird commented Mar 30, 2017

Indeed, 1.8 removed the legacy format: golang/go@76f12cd

@tamird
Copy link
Contributor

tamird commented Apr 3, 2017

So uh, time to reopen this?

@rsc
Copy link
Contributor

rsc commented Apr 3, 2017

@tamird and @rauls5382, I've confirmed that the current pprof (what you get from go get -u github.com/google/pprof) does not work properly with profiles that are symbolized but have no mappings, while the pprof we shipped in Go 1.8 does work fine.

So @tamird, you should use 'go tool pprof' instead of 'pprof', and @rauls5382, you may want to fix / reassign this.

First window:

$ cat /tmp/x.go
package main

import (
	"crypto/rand"
	"log"
	"net/http"
	_ "net/http/pprof"
)

func main() {
	go func() {
		for {
			rand.Prime(rand.Reader, 1024)
		}
	}()
	log.Fatal(http.ListenAndServe("localhost:8000", nil))
}
$ go run /tmp/x.go

Second window.

Go tool pprof works:

$ go tool pprof -symbolize=remote http://localhost:8000/debug/pprof/profile?seconds=5
Fetching profile from http://localhost:8000/debug/pprof/profile?seconds=5
Please wait... (5s)
Saved profile in /Users/rsc/pprof/pprof.localhost:8000.samples.cpu.002.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top10
3.94s of 3.97s total (99.24%)
Dropped 3 nodes (cum <= 0.02s)
Showing top 10 nodes out of 41 (cum >= 0.04s)
      flat  flat%   sum%        cum   cum%
     1.92s 48.36% 48.36%      1.92s 48.36%  math/big.addMulVVW
     0.63s 15.87% 64.23%      0.63s 15.87%  runtime.usleep
     0.45s 11.34% 75.57%      2.73s 68.77%  math/big.nat.montgomery
     0.45s 11.34% 86.90%      0.45s 11.34%  syscall.Syscall
     0.32s  8.06% 94.96%      0.32s  8.06%  runtime.memmove
     0.05s  1.26% 96.22%      0.05s  1.26%  runtime.kevent
     0.04s  1.01% 97.23%      2.79s 70.28%  math/big.nat.expNNMontgomery
     0.03s  0.76% 97.98%      0.03s  0.76%  math/rand.(*rngSource).Seed
     0.03s  0.76% 98.74%      0.03s  0.76%  runtime.memclrNoHeapPointers
     0.02s   0.5% 99.24%      0.04s  1.01%  math/big.nat.divLarge
(pprof)

but current github.com/google/pprof does not:

$ pprof -symbolize=remote http://localhost:8000/debug/pprof/profile?seconds=5
Fetching profile over HTTP from http://localhost:8000/debug/pprof/profile?seconds=5
Saved profile in /Users/rsc/pprof/pprof.samples.cpu.001.pb.gz
Type: cpu
Time: Apr 3, 2017 at 11:51am (EDT)
Duration: 5s, Total samples = 4.10s (81.94%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top10
Showing nodes accounting for 4.10s, 100% of 4.10s total
      flat  flat%   sum%        cum   cum%
     4.10s   100%   100%      4.10s   100%  <unknown>
(pprof) 

Dropping -symbolize=remote does not change the behavior.

@rsc rsc reopened this Apr 3, 2017
@rsc rsc changed the title Profiling debug/pprof/profile of a Go 1.8 server on OSX produces unsymbolized data pprof: ignores symbols in symbolized profile with no mappings Apr 3, 2017
@hyangah
Copy link
Contributor

hyangah commented Apr 3, 2017

https://github.com/golang/go/blob/release-branch.go1.8/src/cmd/pprof/pprof.go vs
https://github.com/google/pprof/blob/master/pprof.go

So, go1.8 pprof uses symbolz.Symbolize if --symbolize is remote, "", or force.

@rauls5382, is it not possible to fall back to symbolz?

@aalexand
Copy link
Collaborator Author

@rauls5382, do you plan to look at this issue? Feel free to unassign if not.

@rauls5382
Copy link
Contributor

Apologies for the long latency.

In profile.proto the samples that came from a single binary are grouped together through the mapping. It would be good if the Go runtime would generate a mapping for the samples and associate them to the binary (the result of os.Executable()). That would relieve the user from having to pass the name of the binary to pprof for offline symbolization.

More generally, the mappings would be useful for a handler returning profiles with samples from two different binaries, for example for cross-language support.

pprof relies on the mappings to drive symbolization. It creates a single mapping for the legacy Go profiles. I was hoping to avoid doing that when handling profile.proto profiles, but we already do it in some cases if there are no mappings. I'll send shortly a commit to do this all the time, which will resolve the problem.

@rsc, would you consider adding mappings to the profiles generated in Go? Even if some of the fields aren't meaningful on some platforms, the mapping still has value. This may be moot with online symbolization in Go, but it may still prove valuable for cross-language support.

rauls5382 added a commit to rauls5382/pprof that referenced this issue Apr 18, 2017
If a profile has mappings but no profiles, pprof may be unable to
symbolize it offline, as it uses the mappings to keep track of which
locations need symbolization.

This fixes google#120
aalexand added a commit to aalexand/pprof that referenced this issue Apr 18, 2017
If a profile has mappings but no profiles, pprof may be unable to
symbolize it offline, as it uses the mappings to keep track of which
locations need symbolization.

This fixes google#120.

Added the test, verified it fails on Mac with Go 1.7 before the fix, and
passes with the fix. The test is done by augmenting the existing test
for handling https+insecure:// schema in URLs. This is a bit vague but I
figured that this test needed an updated anyway since as we moved it
recently we stopped exercising the symbolization as part of the test
which was its original intention in fixing google#94. Can split the tests if
things do look too ugly.
rauls5382 pushed a commit that referenced this issue Apr 19, 2017
* Create a fake mapping for profile.proto profiles

If a profile has mappings but no profiles, pprof may be unable to
symbolize it offline, as it uses the mappings to keep track of which
locations need symbolization.

This fixes #120.

Added the test, verified it fails on Mac with Go 1.7 before the fix, and
passes with the fix. The test is done by augmenting the existing test
for handling https+insecure:// schema in URLs. This is a bit vague but I
figured that this test needed an updated anyway since as we moved it
recently we stopped exercising the symbolization as part of the test
which was its original intention in fixing #94. Can split the tests if
things do look too ugly.

* Fix the test to include the failed regex matching error in the message.
gmarin13 pushed a commit to gmarin13/pprof that referenced this issue Dec 17, 2020
* Create a fake mapping for profile.proto profiles

If a profile has mappings but no profiles, pprof may be unable to
symbolize it offline, as it uses the mappings to keep track of which
locations need symbolization.

This fixes google#120.

Added the test, verified it fails on Mac with Go 1.7 before the fix, and
passes with the fix. The test is done by augmenting the existing test
for handling https+insecure:// schema in URLs. This is a bit vague but I
figured that this test needed an updated anyway since as we moved it
recently we stopped exercising the symbolization as part of the test
which was its original intention in fixing google#94. Can split the tests if
things do look too ugly.

* Fix the test to include the failed regex matching error in the message.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants