
runtime/pprof: memory profiler over-accounts allocations of size >= sampling rate #26618

Closed
aalexand opened this Issue Jul 26, 2018 · 5 comments


aalexand commented Jul 26, 2018

$ go version
go version go1.10.3 linux/amd64

The test program is given below, along with its output. For the test that allocates the data in 500 KiB chunks, 95.367 GiB were allocated in total and pprof reports a matching "95.77GB". But for the test that uses 512 KiB chunks (which is equal to the default memory profiler sampling rate), 97.656 GiB were allocated while the pprof output reports "154.49GB", which is 58% greater.

The issue is likely caused by the fact that in malloc.go, allocations >= the sampling rate are sampled with probability 1, while the unsampling in pprof/protomem.go is applied unconditionally. And 1 / (1 - 1/e), with e ≈ 2.71828, is 1.58.

$ go test -v -run 500 -memprofile mprof.out mprof_test.go && pprof -alloc_space -top mprof.out
=== RUN   TestMprof500
total allocated: 95.367 GiB
--- PASS: TestMprof500 (7.09s)
PASS
ok      command-line-arguments  7.097s
File: main.test
Type: alloc_space
Time: Jul 25, 2018 at 9:14pm (PDT)
Showing nodes accounting for 95.77GB, 100% of 95.77GB total
      flat  flat%   sum%        cum   cum%
   95.77GB   100%   100%    95.77GB   100%  command-line-arguments.alloc
         0     0%   100%    95.77GB   100%  command-line-arguments.TestMprof500
         0     0%   100%    95.77GB   100%  testing.tRunner


$ go test -v -run 512 -memprofile mprof.out mprof_test.go && pprof -alloc_space -top mprof.out
=== RUN   TestMprof512
total allocated: 97.656 GiB
--- PASS: TestMprof512 (7.22s)
PASS
ok      command-line-arguments  7.223s
File: main.test
Type: alloc_space
Time: Jul 25, 2018 at 9:13pm (PDT)
Showing nodes accounting for 154.49GB, 100% of 154.49GB total
Dropped 8 nodes (cum <= 0.77GB)
      flat  flat%   sum%        cum   cum%
  154.49GB   100%   100%   154.49GB   100%  command-line-arguments.alloc
         0     0%   100%   154.49GB   100%  command-line-arguments.TestMprof512
         0     0%   100%   154.49GB   100%  testing.tRunner
package main

import (
        "fmt"
        "testing"
)

// alloc allocates iter slices of size bytes each, keeping only the
// last one reachable, and returns the total number of bytes requested.
func alloc(iter, size int) int {
        var a []byte
        for i := 0; i < iter; i++ {
                a = make([]byte, size)
        }
        return len(a) * iter
}

func TestMprof512(t *testing.T) {
        fmt.Printf("total allocated: %.3f GiB\n", float64(alloc(200000, 512*1024))/(1<<30))
}

func TestMprof500(t *testing.T) {
        fmt.Printf("total allocated: %.3f GiB\n", float64(alloc(200000, 500*1024))/(1<<30))
}

@rauls5382 @hyangah

aalexand changed the title Heap profiler over-accounts allocations of size >= sampling rate Memory profiler over-accounts allocations of size >= sampling rate Jul 26, 2018

ALTree changed the title Memory profiler over-accounts allocations of size >= sampling rate runtime/pprof: memory profiler over-accounts allocations of size >= sampling rate Jul 26, 2018

ALTree added this to the Go1.12 milestone Jul 26, 2018

rauls5382 commented Jul 27, 2018

Cool, so perhaps the fix is just to remove the rate check?

Last time I touched the Go heapz sampling I wrote a test case to validate the accuracy of the unsampling, but sadly it only covers allocations up to 64k :/

https://golang.org/test/heapsampling.go


hyangah commented Aug 2, 2018

I hoped to find the rationale behind this always-sample-large-objects condition but failed to find an explanation. The condition has been there since the malloc.cgo era (cc @rsc). Meanwhile @rauls5382 implemented a more robust sampling implementation based on a Poisson process, but this part of the code remained untouched, which I think is a bug.

Internally, the pprof team at Google verified that the C++ code which uses a similar sampling mechanism has no such rate check. I tried removing the rate check and got more reasonable pprof results.


rauls5382 commented Aug 3, 2018

Yes, I think the always-sample-large-objects condition is a bug and should be removed.
We should also update the test case to cover larger allocations.

Feel free to assign this to me if you want me to prepare a commit for it.


gopherbot commented Aug 12, 2018

Change https://golang.org/cl/129117 mentions this issue: runtime: sample large heap allocations correctly

rauls5382 added a commit to rauls5382/go that referenced this issue Jan 17, 2019

runtime: sample large heap allocations correctly
Remove an unnecessary check in the heap sampling code that forced sampling
of all heap allocations larger than the sampling rate. Samples need to follow
a Poisson process so that they can be correctly unsampled. Maintain a check
for MemProfileRate==1 to provide a mechanism for full sampling, as
documented in https://golang.org/pkg/runtime/#pkg-variables.

Additional testing for this change is on cl/129117.

Fixes golang#26618

gopherbot commented Jan 17, 2019

Change https://golang.org/cl/158337 mentions this issue: runtime: sample large heap allocations correctly

gopherbot closed this in dc88902 Jan 18, 2019
