runtime: cannot ReadMemStats during GC #19812

Open
aclements opened this issue Mar 31, 2017 · 5 comments

@aclements
Member

commented Mar 31, 2017

GC holds the worldsema across the entire garbage collection, which means nothing else can stop the world, even during concurrent marking. Anything that does try to stop the world will block until GC is done. Probably the most interesting such operation is ReadMemStats.

We should fix GC to only hold worldsema during sweep termination and mark termination. The trickiest part of this is probably dealing with GOMAXPROCS changing during GC, which will affect mark worker scheduling.
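To check whether a given Go build shows this behaviour, one rough approach is to time `runtime.ReadMemStats` while another goroutine allocates heavily enough to keep the collector busy. The sketch below is only an illustration: `timeReadMemStats` and `maxReadLatency` are hypothetical helper names, the allocation sizes are arbitrary, and the numbers it prints depend entirely on the Go version, heap size, and machine.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// timeReadMemStats reports how long a single runtime.ReadMemStats call takes.
func timeReadMemStats() time.Duration {
	var ms runtime.MemStats
	start := time.Now()
	runtime.ReadMemStats(&ms)
	return time.Since(start)
}

// maxReadLatency calls ReadMemStats n times while a background goroutine
// churns the heap to trigger frequent GC cycles, and returns the worst
// latency observed. (Hypothetical helper; sizes are arbitrary.)
func maxReadLatency(n int) time.Duration {
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		var keep [][]byte
		for {
			select {
			case <-stop:
				return
			default:
			}
			// Allocate 1 MiB at a time; drop everything after 64 MiB
			// so the collector always has garbage to reclaim.
			keep = append(keep, make([]byte, 1<<20))
			if len(keep) > 64 {
				keep = nil
			}
		}
	}()

	var worst time.Duration
	for i := 0; i < n; i++ {
		if d := timeReadMemStats(); d > worst {
			worst = d
		}
		time.Sleep(time.Millisecond)
	}
	close(stop)
	<-done
	return worst
}

func main() {
	fmt.Println("worst ReadMemStats latency:", maxReadLatency(200))
}
```

If the worst latency is close to the length of a full GC cycle rather than a short stop-the-world pause, the call is probably waiting on the collector as described above.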

/cc @RLH

@andreimatei

Contributor

commented May 16, 2019

This issue is kind of a pain for CockroachDB because we use ReadMemStats() to get periodic heap stats. Because ReadMemStats() blocks until GC finishes (and we've seen that take many seconds), we've had to collect these heap stats separately from the loop that samples the other stats in our server (which we care about collecting periodically and in a timely manner), so now our heap stats are out of sync with those other stats.

FWIW, we'd rather have more approximate heap stats that we can collect cheaply than more accurate heap stats that can only be collected at specific points.

@beorn7


commented May 16, 2019

Note that there is a workaround in client_golang now, see prometheus/client_golang#568 (check the final state of the PR, not the whole discussion). It will be released any moment as v0.9.3.

@beorn7


commented May 16, 2019

Meaning: If you use prometheus/client_golang, you are good now. If you are doing it on your own, it might serve as inspiration.

@aclements

Member Author

commented May 16, 2019

Thanks for the reports of this happening in real systems. I knew this was a theoretical problem, but this is the first data I've had on it happening in production.

Making the stats approximate doesn't really help in this case. We're just doing some heavy-handed synchronization that we need to figure out how to break up.

@go101


commented May 17, 2019

Is it acceptable to cache and return the stats before GC starts?
