Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/metrics: add /gc/heap/live:bytes #56857

Open
felixge opened this issue Nov 20, 2022 · 12 comments
Open

runtime/metrics: add /gc/heap/live:bytes #56857

felixge opened this issue Nov 20, 2022 · 12 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Accepted
Milestone

Comments

@felixge
Copy link
Contributor

felixge commented Nov 20, 2022

There appears to be no way to monitor what is called the Live Heap in the GC Guide.

Screen Shot 2022-11-20 at 04 25 30

The closest runtime/metric seems to be /memory/classes/heap/objects:bytes which is Live Heap + New Heap (i.e. includes potentially dead but unswept allocations).

Monitoring Live Heap seems to be useful when running with GOGC=off and a GOMEMLIMIT to understand if the live heap of a program is approaching the GOMEMLIMIT which could lead to high CPU usage or OOM kills.

Therefor I'm proposing to add /memory/classes/heap/live_objects:bytes and /gc/heap/live_objects:objects to the runtime/metrics package. They should return similar values as what is currently returned by the InUse* values of the runtime.MemProfile() profile. It's probably worth to discuss whether the term InUse is preferable over Live.

I created a very hacky prototype of this here, but there is probably a much better way to do this.

@gopherbot gopherbot added this to the Proposal milestone Nov 20, 2022
@seankhliao
Copy link
Member

cc @mknyszek

@mknyszek
Copy link
Contributor

A similar proposal was rejected previously (#51966). Part of the problem was inclusion in MemStats, but runtime/metrics also existed and I made some other arguments against it. Mainly, I think relying on it solely as an indicator can be somewhat fragile if you put it in context of the Go 1.18 GOGC changes. I wasn't ever fundamentally opposed to this, but at the time for the use-case presented, it just didn't seem worth it. I also wanted to think more carefully about what implications it could have (and why we didn't do it already).

However, I do think your argument that it's useful for identifying issues with a memory limit is "new information" (now that the memory limit has landed) as far as the proposal processes goes. I think that might be enough to push this over the edge into more serious consideration for me personally.

I think to start with I'd be inclined at this point to call this /gc/heap/live:bytes. This way it lives next to /gc/heap/goal:bytes. /memory/classes to me is more about a breakdown of memory at a higher level and all the metrics inside there are supposed to be orthogonal. We'd have to break up some of the other metrics and I don't think it's worth it for this.

If we do this, we should probably also export scannable stack bytes and scannable globals bytes, at least as far as they go into the GOGC calculation. We should also really export GOGC and the memory limit as well, since they can change during application execution.

I don't really see enough utility from exporting the count of live objects though. I think a memory profile is better for that sort of thing anyway. (Also, we don't currently track that (mod memory profiling) and it's more work.)

@mknyszek mknyszek added the compiler/runtime Issues related to the Go compiler and/or runtime. label Nov 21, 2022
@mknyszek
Copy link
Contributor

I created a very hacky prototype of this here, but there is probably a much better way to do this.

Yeah, there's a simpler way for bytes. :) You want gcController.heapMarked. You can just read that directly in the metrics callback without any synchronization since it only gets updated during a STW.

@felixge
Copy link
Contributor Author

felixge commented Nov 21, 2022

Sounds great. I'll update my patch based on your suggestions.

I also wanted to think more carefully about what implications it could have (and why we didn't do it already).

Yeah, that makes sense. Seems like this is missing the code freeze anyway.

@felixge
Copy link
Contributor Author

felixge commented Nov 21, 2022

I updated the CL to use gcController.heapMarked, but found one odd thing.

If we do this, we should probably also export scannable stack bytes and scannable globals bytes, at least as far as they go into the GOGC calculation. We should also really export GOGC and the memory limit as well, since they can change during application execution.

I can prepare CLs for those as well. Should I submit them separately?

@mknyszek
Copy link
Contributor

I updated the CL to use gcController.heapMarked, but found one odd thing.

It's an accounting artifact currently required for good performance. It would be nice to fix, but I'm not yet sure how without either (1) impacting the allocation fast path or (2) slowing down metrics.Read to ReadMemStats levels (this path would avoid the global latency impact that ReadMemStats has because it's not necessary to actually, fully stop the world, but the worst-case latency of metrics.Read becomes effectively the same as stopping the world).

For users that really need this consistency (like the testing package), ReadMemStats provides it, so there is a workaround. Most of the time though, the skew doesn't make a very meaningful difference, and the memory stats (in /memory/classes) are all at least guaranteed to be consistent with each other. It might be worth mentioning in the docs that this stat specifically isn't necessarily consistent with the /memory/classes stats. We can lift this restriction in the future if we can figure out a way to flush cached efficiently (I'll keep thinking about it).

If we do this, we should probably also export scannable stack bytes and scannable globals bytes, at least as far as they go into the GOGC calculation. We should also really export GOGC and the memory limit as well, since they can change during application execution.

I can prepare CLs for those as well. Should I submit them separately?

Either or. Your current CL is not very big.

@rsc
Copy link
Contributor

rsc commented Nov 30, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc changed the title proposal: runtime/metrics: add live_objects metrics proposal: runtime/metrics: add /gc/heap/live:bytes Dec 7, 2022
@rsc
Copy link
Contributor

rsc commented Dec 7, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Dec 14, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: runtime/metrics: add /gc/heap/live:bytes runtime/metrics: add /gc/heap/live:bytes Dec 14, 2022
@rsc rsc modified the milestones: Proposal, Backlog Dec 14, 2022
@prattmic prattmic modified the milestones: Backlog, Go1.21 Dec 14, 2022
@prattmic
Copy link
Member

@felixge were you planning to implement this?

@felixge
Copy link
Contributor Author

felixge commented Dec 14, 2022

@felixge were you planning to implement this?

Yes, I'll try to submit a good patch for /gc/heap/live:byte 👍.

If we do this, we should probably also export scannable stack bytes and scannable globals bytes, at least as far as they go into the GOGC calculation. We should also really export GOGC and the memory limit as well, since they can change during application execution.

If time allows I could also try to look into these, but I'm not sure if they'd be covered by the accepted proposal or would need separate proposals?

@mknyszek
Copy link
Contributor

That's a good question. I personally feel we should just include those in the proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Accepted
Projects
Status: Todo
Status: Accepted
Development

No branches or pull requests

6 participants