Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: runtime/pprof: document mutex profiler stack traces #44920

Open
prattmic opened this issue Mar 10, 2021 · 7 comments
Open

doc: runtime/pprof: document mutex profiler stack traces #44920

prattmic opened this issue Mar 10, 2021 · 7 comments

Comments

@prattmic
Copy link
Member

@prattmic prattmic commented Mar 10, 2021

The mutex profiler records contention on the Unlock path of locks, so the stack traces in the profiles all appear as Unlock calls rather than the more intuitively expected Lock call [1].

This can be a source of confusion. Here's a Stack Overflow question, and I personally fielded a question about this today.

Our own runtime/pprof docs say nothing about this (in fact, the complete documentation we have for mutex profiles is "mutex - stack traces of holders of contended mutexes"). This should certainly be expanded.

There is a bit of documentation elsewhere online. e.g., @felixge's block profile documentation mentions this property.

Related to #14689.

[1] This also means it isn't possible to directly tell which Lock calls are hottest. In practice I doubt this is an issue as typical mutex use has a unique Unlock call for each unique Lock call. If this were a problem, we could move the recorded stack trace to Lock here (at the expensive of requiring a slot in the sudog to store the stack until we can record the event in Unlock).

@prattmic
Copy link
Member Author

@prattmic prattmic commented Mar 10, 2021

@felixge
Copy link
Contributor

@felixge felixge commented Mar 10, 2021

@prattmic +100. I think it's fair to say that all of the builtin profilers could use some additional documentation about the kind of data they produce, the sampling that's involved, etc.

I'd be happy to help with patches for this, but I'm currently still a bit unsure about the best place to document this. A lot of the profilers have various APIs and knobs exposed in different places, so there isn't always a natural place to put this info. Perhaps the Diagnostics doc would be the right place? I'm thinking each profiler could use a section there, and then all the API docs could be updated to point people in the right direction?

@prattmic
Copy link
Member Author

@prattmic prattmic commented Mar 10, 2021

The inverse of footnote [1] above is that recording in Unlock potentially shows which lock holders are most problematic (i.e., they hold the lock for too long). This could be a very useful property. e.g., consider a fast "getter" method that holds the lock only long enough to read a single field vs an update method that holds the lock for a very long time. You may have a very high rate of access to the "getters" without causing significant contention, but a low rate of calls to "update" will block all the "getters". IMO, in this case it is likely more useful to see results on the Unlock in "update" rather than on Lock on the "getters".

@felixge
Copy link
Contributor

@felixge felixge commented Mar 10, 2021

@prattmic also great points about the tradeoffs between Unlock vs Lock. I found reporting Unlock very unnatural and confusing, but you make a great point about it being potentially better.

@prattmic
Copy link
Member Author

@prattmic prattmic commented Mar 10, 2021

@felixge Great question about the best to put this kind of documentation since there are several relevant packages. I was initially thinking runtime/pprof would be most natural, but then I didn't even realize that the Diagnostics page existed (discoverability problem, anyone? 😄).

The main possibilities I see are:

  • Package runtime docs: contains some knobs for profiling, but isn't the primary interface, so I don't think this is great.
  • Package runtime/pprof docs: generally the primary interface for profiling.
  • Website Diagnostics page (or other new golang.org page): This feels a bit more suited to long-form exposition on how to interpret, visualize the profiles, etc, but isn't quite as discoverable.

I'd probably lean slightly in favor of golang.org right now. Wherever this ends up, I think we should:

  1. Have a full section about each profile type (at least a paragraph!). No more single sentence being the complete documentation.
  2. Link to the definitive / expanded documentation from the other package doc locations.

Others like @bcmills and @jayconrod may have more thoughts on the best location for this kind of documentation?

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Mar 10, 2021

I think it makes sense to put this somewhere under https://golang.org/doc/ and link to it from that Diagnostics page and from the runtime/pprof package documentation. It can be hosted in https://github.com/golang/website/tree/master/_content/doc.

@hyangah
Copy link
Contributor

@hyangah hyangah commented Mar 11, 2021

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants