
proposal: runtime/pprof: add “heaptime” bytes*GCs memory profile #55900

Open
@aclements


I propose we add a new “heaptime” profile type to runtime/pprof. This is equivalent to the existing allocs profile, except that allocations are scaled by the number of GC cycles they survive. Hence, the profile would measure “space” in bytes*GCs and “objects” in objects*GCs. This can also be viewed as the “inuse” profile integrated over time.
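To make the units concrete, here is a minimal sketch of how the two measures could be computed from a list of sampled allocations. The `alloc` and `heaptimeTotals` types and the `heaptimeBySite` function are hypothetical and exist only for illustration. The sketch also assumes that "cycles survived" is simply the difference between the GC cycle count at free time and at allocation time; the exact counting convention is not pinned down above.

```go
package main

import "fmt"

// alloc is a hypothetical record of one sampled allocation.
// None of these names exist in the runtime; they are for illustration only.
type alloc struct {
	site       string // allocation site label
	bytes      int64  // object size in bytes
	allocCycle uint32 // GC cycle count when the object was allocated
	freeCycle  uint32 // GC cycle count when the object was freed
}

// heaptimeTotals holds the two proposed measures for one allocation site.
type heaptimeTotals struct {
	byteGCs   int64 // "space" in bytes*GCs
	objectGCs int64 // "objects" in objects*GCs
}

// heaptimeBySite scales each allocation by the number of GC cycles it
// survived (assumed here to be freeCycle - allocCycle) and sums per site.
func heaptimeBySite(allocs []alloc) map[string]heaptimeTotals {
	out := make(map[string]heaptimeTotals)
	for _, a := range allocs {
		survived := int64(a.freeCycle - a.allocCycle)
		t := out[a.site]
		t.byteGCs += a.bytes * survived
		t.objectGCs += survived
		out[a.site] = t
	}
	return out
}

func main() {
	allocs := []alloc{
		{"parse", 1 << 20, 10, 10}, // large, but freed before surviving a cycle
		{"cache", 4096, 3, 103},    // small, but survives 100 cycles
	}
	for site, t := range heaptimeBySite(allocs) {
		fmt.Printf("%s: %d bytes*GCs, %d objects*GCs\n", site, t.byteGCs, t.objectGCs)
	}
}
```

In this made-up input, the 1 MiB `parse` allocation contributes nothing because it survives no cycles, while the 4 KiB `cache` allocation accumulates 409,600 bytes*GCs. An allocs profile would rank these two sites in the opposite order.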

This type of profile would help with understanding applications where a CPU profile shows significant time spent in GC scanning/marking by attributing the cause of the time. GC marking time scales with the number of GC cycles an object survives and its size, so an object that survives no cycles costs essentially nothing to GC, while an object that survives many cycles costs a lot in total. A heaptime profile would show users which allocations to reduce in order to reduce marking time.

For example, this came up in the context of improving compiler performance. The compiler spends a significant amount of CPU time in marking, but the existing allocs profile is a poor proxy for attributing mark time, because it counts every allocation the same regardless of how many GC cycles it survives.

One downside is that this profile would not be totally representative of marking costs, and hence may be misleading. For one, marking cost is more closely proportional to the “scan size” of an object, which is the size up to the object’s last pointer. Pointer-free objects are very cheap to mark, even if they are very large. The profile also wouldn’t capture time spent scanning non-heap objects like stacks and globals. Capturing these would require complicating the definition (and implementation) of the profile, but may be doable.

Taking this downside to the extreme, if we were to substantially change the GC algorithm, this type of profile may no longer reflect marking time at all. For example, in a generational garbage collector, it’s not entirely clear what this profile would mean (perhaps you would count the number of minor cycles an object survived while young plus the number of major cycles it survived while old, but that’s certainly harder to explain).

I believe this would be straightforward to implement. We would add the current GC cycle count to runtime.specialprofile and new counters to runtime.memRecordCycle. runtime.mProf_Free would subtract the allocating GC cycle counter stored in the special from the current cycle counter, use the difference to scale the allocation count and size, and accumulate the results into the new fields of memRecordCycle. One complication is that I don’t think we want to add these fields to runtime.MemProfileRecord, so we would need some other way to get them into the runtime/pprof package. We already use private APIs between runtime and runtime/pprof for the CPU profile, and I think we would do something similar for this.
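The bookkeeping described above can be sketched as a small standalone simulation. This is not runtime code: the `special`, `bucket`, and `profiler` types and their methods are hypothetical stand-ins for runtime.specialprofile, the per-site counters in runtime.memRecordCycle, and the runtime.mProf_Free path, and the field names are assumptions.

```go
package main

import "fmt"

// bucket is a hypothetical stand-in for the per-site counters.
type bucket struct {
	frees, freeBytes   uint64 // counters in the style of the existing profile
	objectGCs, byteGCs uint64 // assumed names for the proposed new counters
}

// special is a hypothetical stand-in for the per-object profiling record.
type special struct {
	b          *bucket
	size       uintptr
	allocCycle uint32 // GC cycle count captured at allocation time
}

// profiler tracks the current GC cycle count for this simulation.
type profiler struct {
	cycle uint32
}

// malloc records the allocating cycle in the object's special.
func (p *profiler) malloc(b *bucket, size uintptr) *special {
	return &special{b: b, size: size, allocCycle: p.cycle}
}

// gcCycle simulates the completion of one GC cycle.
func (p *profiler) gcCycle() { p.cycle++ }

// free mirrors the proposed free-time accounting: subtract the allocating
// cycle from the current cycle and scale the count and size by the result.
func (p *profiler) free(s *special) {
	survived := uint64(p.cycle - s.allocCycle)
	s.b.frees++
	s.b.freeBytes += uint64(s.size)
	s.b.objectGCs += survived
	s.b.byteGCs += survived * uint64(s.size)
}

func main() {
	p := &profiler{}
	b := &bucket{}

	short := p.malloc(b, 64)
	long := p.malloc(b, 1024)

	p.free(short) // freed before any cycle completes: contributes nothing
	for i := 0; i < 5; i++ {
		p.gcCycle()
	}
	p.free(long) // survived 5 cycles

	fmt.Printf("frees=%d freeBytes=%d objectGCs=%d byteGCs=%d\n",
		b.frees, b.freeBytes, b.objectGCs, b.byteGCs)
}
```

Because this sketch only accumulates at free time, objects that are still live contribute nothing to the new counters until they are freed.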

/cc @golang/runtime @mknyszek @dr2chase
