runtime/metrics: add goroutine state counts, total goroutines created, total threads #15490

Open
deft-code opened this issue Apr 29, 2016 · 33 comments
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. FeatureRequest help wanted Proposal Proposal-Accepted

@deft-code

deft-code commented Apr 29, 2016

Update, Jun 7 2023: runtime/metrics now exists, but there are a few metrics in the draft CL here that aren't yet exposed. See #15490 (comment).


MemStats provides a way to monitor allocation and garbage collection.

We need a similar facility to monitor the Scheduler.

Briefly:

  • Total goroutines created
  • Current number of goroutines
  • Total number of goroutines scheduled
  • Current number of goroutines scheduled
  • Total thread starts
  • Current number of threads
  • Metrics on the delay between a goroutine being ready and running on a proc
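
For context, here is a minimal sketch of what can already be polled with MemStats, alongside runtime.NumGoroutine, which covers only the "current number of goroutines" item from the list above. The snapshot type and takeSnapshot function are hypothetical names chosen for illustration; the remaining items in the list have no equivalent here.

```go
package main

import (
	"fmt"
	"runtime"
)

// snapshot holds a few values the runtime can already report.
// The type and its field selection are illustrative only.
type snapshot struct {
	HeapAlloc  uint64 // bytes of allocated heap objects, from MemStats
	NumGC      uint32 // completed GC cycles, from MemStats
	Goroutines int    // goroutines that currently exist
}

// takeSnapshot reads MemStats and the current goroutine count.
func takeSnapshot() snapshot {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return snapshot{
		HeapAlloc:  ms.HeapAlloc,
		NumGC:      ms.NumGC,
		Goroutines: runtime.NumGoroutine(),
	}
}

func main() {
	s := takeSnapshot()
	fmt.Printf("heap=%dB gc=%d goroutines=%d\n", s.HeapAlloc, s.NumGC, s.Goroutines)
}
```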
@minux
Member

minux commented Apr 29, 2016 via email

@aclements
Member

@minux, while true, runtime/trace seems like a pretty high overhead way to collect what amounts to a fairly small amount of information. It's certainly low overhead for what it does, but what it does is much more than what's needed here. The metrics @deft-code wants are primarily intended for continuous monitoring (based on offline conversations), so it needs to be cheap.

@aclements
Member

Here are the notes on the desired metrics I had from our meeting a while ago:

Ring buffer of sampled duration between entering and exiting runnable state

  • With some probability, when a goroutine enters runnable, tag it, and when it exits runnable, add the runnable duration to a ring
  • Consumers can do what they want with these samples, including just averaging them, or building distributions.

Four global stats

  • Current number of goroutines
  • Total number of goroutines ever created
  • Current number of runnable goroutines
  • Total number of runnable-to-running transitions

Maybe current number of running goroutines
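
To make the difference between the "current" values (gauges) and the "total" values (cumulative counters) concrete, here is a small model of how the four stats might move on state transitions. The schedCounters type and its methods are hypothetical and simplified: blocking and waking are left out, and this is not how the runtime tracks goroutines.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// schedCounters models the four global stats from the notes.
type schedCounters struct {
	goroutines atomic.Int64  // gauge: goroutines that currently exist
	created    atomic.Uint64 // counter: goroutines ever created
	runnable   atomic.Int64  // gauge: goroutines waiting to run
	runs       atomic.Uint64 // counter: runnable-to-running transitions
}

// create models a new goroutine, which starts out runnable.
func (c *schedCounters) create() {
	c.created.Add(1)
	c.goroutines.Add(1)
	c.runnable.Add(1)
}

// run models a runnable goroutine being scheduled.
func (c *schedCounters) run() {
	c.runnable.Add(-1)
	c.runs.Add(1)
}

// yield models a running goroutine going back to runnable.
func (c *schedCounters) yield() { c.runnable.Add(1) }

// exit models a running goroutine finishing.
func (c *schedCounters) exit() { c.goroutines.Add(-1) }

func main() {
	var c schedCounters
	c.create()
	c.create()
	c.run()
	c.yield()
	c.run()
	c.exit()
	fmt.Printf("goroutines=%d created=%d runnable=%d runs=%d\n",
		c.goroutines.Load(), c.created.Load(), c.runnable.Load(), c.runs.Load())
}
```

The gauges can go up and down, while the counters only grow, which is what lets a monitoring system turn them into rates.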

@bradfitz bradfitz added this to the Unplanned milestone May 4, 2016
@bradfitz
Contributor

Assigning to @aclements to decide what we're willing to support long-term.

@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Aug 22, 2016
@adg
Contributor

adg commented Sep 26, 2016

Ping @aclements

@rsc
Contributor

rsc commented Dec 19, 2016

Still up to @aclements.

@rsc
Contributor

rsc commented Jan 9, 2017

Ping @aclements. Can you look at this during the release candidate quiet?

@aclements
Member

@deft-code, do the specific stats I suggested in #15490 (comment) address your needs?

@aclements aclements modified the milestones: Go1.9Early, Unplanned Jan 9, 2017
@aclements
Member

Sorry, I'd lost track of the fact that there was a concrete proposal doc for this: https://github.com/deft-code/proposal/blob/master/design/15490-schedstats.md

@deft-code, could you mail a CL to add this to the go-proposal repository and, once submitted, edit your first post to link to it? Thanks.

@deft-code
Author

I'll get on top of it.

@aclements
Member

Thanks!

@rsc rsc changed the title proposal: runtime SchedStats API proposal: runtime: add SchedStats API Jan 23, 2017
@gopherbot
Contributor

CL https://golang.org/cl/38180 mentions this issue.

@rsc
Contributor

rsc commented Mar 27, 2017

Do we need to keep this issue open, or should we accept it?

@aclements
Member

There's definitely still work to do on how and what exactly the API should expose, but I think it's pretty clear we need to provide some visibility into the scheduler.

@rsc
Contributor

rsc commented Mar 28, 2017

Teams inside Google are patching in CL 38180 and getting some experience with it. If others would like to do the same, please do. We'll probably wait until Go 1.10 to decide to add the API officially. Putting the proposal on hold until then.

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.9Early May 3, 2017
@bradfitz bradfitz removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label May 22, 2017
@bradfitz bradfitz added early-in-cycle A change that should be done early in the 3 month dev cycle. and removed early-in-cycle A change that should be done early in the 3 month dev cycle. labels Jun 14, 2017
@bradfitz bradfitz removed this from the Go1.10Early milestone Jun 14, 2017
@mknyszek
Contributor

I think a bunch of the metrics in https://go.dev/cl/38180 are not covered by the runtime/metrics package. I've had it on my TODO list for a while but it's never quite made it to the top. Clearly not for this release, but I continue to hope.

In terms of the proposal process, I don't think this needs to be on hold anymore. Adding new metrics is a fair bit more lightweight than it used to be, so even if we don't see an obvious use-case right now, the bar is low enough that I'm comfortable with just adding the remaining metrics.

In terms of the original proposal, I think all we're missing is counts of goroutines in various states, total count of goroutines created (to create a rate metric), and a thread count. (We already have a histogram metric for time spent in "runnable.")

@mknyszek
Contributor

If someone would like to take a stab at implementing this, please be my guest. Otherwise, I'll get to it next cycle.

@mknyszek mknyszek modified the milestones: Unplanned, Go1.22 May 26, 2023
@mknyszek mknyszek self-assigned this May 26, 2023
@ianlancetaylor
Contributor

@mknyszek Thanks. Should we keep this issue open and retarget to runtime/metrics? Or should we open a new proposal?

@mknyszek
Contributor

We can keep this issue open and retarget it. I'll update the header and such.

@mknyszek mknyszek changed the title proposal: runtime: add SchedStats API proposal: runtime/metrics: add more scheduler-related metrics May 26, 2023
@rsc
Contributor

rsc commented Jun 7, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc removed the Proposal-Hold label Jun 7, 2023
@rsc rsc changed the title proposal: runtime/metrics: add more scheduler-related metrics proposal: runtime/metrics: add goroutine state counts, total goroutines created, total threads Jun 21, 2023
@rsc
Contributor

rsc commented Jun 21, 2023

Retitled based on #15490 (comment).
This is about per-state goroutine counts, a count of total goroutines created, and total number of OS threads.

Have all concerns about this proposal been addressed?

@mknyszek
Contributor

Yeah, I believe all concerns are addressed.

@rsc
Contributor

rsc commented Jun 28, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Contributor

rsc commented Jul 5, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: runtime/metrics: add goroutine state counts, total goroutines created, total threads runtime/metrics: add goroutine state counts, total goroutines created, total threads Jul 5, 2023
@mknyszek
Contributor

As for what these metrics should be named, perhaps:

/sched/threads:threads
/sched/goroutines-created:goroutines (cumulative)
/sched/goroutines/waiting:goroutines
/sched/goroutines/runnable:goroutines
/sched/goroutines/running:goroutines
/sched/goroutines/not-in-go:goroutines

The goroutine state metrics come from https://go-review.googlesource.com/c/go/+/38180/9/src/runtime/pstats.go#18. I figure we can reuse most of that implementation.
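
Since these names are only suggestions at this point, a consumer would presumably want to probe for them rather than assume they exist. A minimal sketch, using metrics.All to list what the running Go version supports; the supported helper is a hypothetical name, and the proposed names are copied from the list above without the "(cumulative)" annotation.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// supported reports, for each name, whether the running Go version
// exposes a metric with exactly that name.
func supported(names []string) map[string]bool {
	have := make(map[string]bool)
	for _, d := range metrics.All() {
		have[d.Name] = true
	}
	out := make(map[string]bool, len(names))
	for _, n := range names {
		out[n] = have[n]
	}
	return out
}

func main() {
	// Names as proposed in this thread; they may not exist, or may be
	// spelled differently, in any given Go release.
	proposed := []string{
		"/sched/threads:threads",
		"/sched/goroutines-created:goroutines",
		"/sched/goroutines/waiting:goroutines",
		"/sched/goroutines/runnable:goroutines",
		"/sched/goroutines/running:goroutines",
		"/sched/goroutines/not-in-go:goroutines",
	}
	avail := supported(proposed)
	for _, name := range proposed {
		fmt.Printf("%-42s supported=%v\n", name, avail[name])
	}
}
```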

@prattmic
Member

/sched/goroutines/not-in-go:goroutines is intended to cover syscalls and cgo, I assume? i.e., it is _Gsyscall?

@gopherbot
Contributor

This issue is currently labeled as early-in-cycle for Go 1.22.
That time is now, so a friendly reminder to look at it again.

@gopherbot
Contributor

This issue is currently labeled as early-in-cycle for Go 1.23.
That time is now, so a friendly reminder to look at it again.

Projects
Status: Accepted