Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: getMCache causes allocation fast path regression #42305

Closed
mknyszek opened this issue Oct 30, 2020 · 11 comments
Closed

runtime: getMCache causes allocation fast path regression #42305

mknyszek opened this issue Oct 30, 2020 · 11 comments
Assignees
Milestone

Comments

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Oct 30, 2020

bent benchmarks revealed a regression in some community microbenchmarks on 10/26. Bisecting, I discovered that c02134a is to blame.

That commit is very simple, so my best guess is it's the fact that we're looking up the M and the P a second time. Also, the compiler doesn't seem to inline getMCache which is unfortunate and could be part of it. Not sure why, though. Maybe mallogc is too big?

Anyway, we should fix this, probably by manually inlining it back into mallocgc and leaving a comment pointing to getMCache? There are a couple other circumstances where we do this extra dereference where it would be nice to not do so, so maybe getMCache should take a p pointer that might be nil? That means potentially doing a second lookup since it's ambiguous whether we don't have a P or just haven't looked it up yet, but even if we do we're only doing it when bootstrapping which is probably fine.

@prattmic WDYT?

@mknyszek mknyszek added this to the Go1.16 milestone Oct 30, 2020
@mknyszek mknyszek self-assigned this Oct 30, 2020
@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Oct 30, 2020

OK I think we've pretty much confirmed it's the additional function call, not reloading the P.

@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Oct 30, 2020

What's preventing getMCache from inlining seems to be the throw inside. If that gets treated like a regular call, then it's still not quite enough. But a throw is like a panic, but strictly rarer because a throw can't be recovered from. Perhaps it should be cheaper?

CC @dr2chase who knows all about mid-stack inlining and inlining costs.

@josharian
Copy link
Contributor

@josharian josharian commented Oct 31, 2020

Aside: Fixing #24686 might let us get rid of mcache entirely. I never managed to get it implemented, but I don't know the runtime as well as you. :)

@Helflym
Copy link
Contributor

@Helflym Helflym commented Nov 2, 2020

There are some failures in last builds for AIX which seem to related to this issue:
https://build.golang.org/log/fa90b60048ac7f99bc84523d236948a1f090b4c3
https://build.golang.org/log/a5515473b2094744ca0ff0d19c330fa66ca6184a

I've searched really quickly through the buildfarm dashboard but it didn't find any other appearance.

@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Nov 2, 2020

There are some failures in last builds for AIX which seem to related to this issue:
https://build.golang.org/log/fa90b60048ac7f99bc84523d236948a1f090b4c3
https://build.golang.org/log/a5515473b2094744ca0ff0d19c330fa66ca6184a

I've searched really quickly through the buildfarm dashboard but it didn't find any other appearance.

Huh. That is a problem (write barrier without a P...?) but separate from this one. Just the fact that allocSpan now actually cares that we have a P.

@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Nov 2, 2020

@Helflym Could you file a new issue for that and assign it to me?

@Helflym
Copy link
Contributor

@Helflym Helflym commented Nov 2, 2020

@Helflym Could you file a new issue for that and assign it to me?

Done with #42339. But it can't assign it to you. I don't have the permissions I guess

@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Nov 2, 2020

So there are a number of ways to fix this regression. First is to make throw have cost 1, so getMCache gets inlined. Second is to manually inline getMCache and leave a comment. Third is to reorganize getMCache to put the throw in some //go:noinline function and hope that the call cost (57) is low enough that it still gets inlined.

For (1) and (3), we should add getMCache to the list of functions we want to make sure get inlined. I'm leaning toward (2) as the safest solution for the 1.16 release, though (3) is fine also (albeit a bit ugly).

@prattmic @dr2chase WDYT?

@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Nov 2, 2020

A fourth alternative is to change getMCache to return nil if it can't get one instead of throwing. Then it would be up to the caller to throw if necessary. This would be useful to handle #42339 also since apparently allocSpan needs to be able to run without a P. It requires updating each callsite, but because it also helps with another issue, this seems like a good idea to me.

@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Nov 2, 2020

I think I'm gonna go with 4.

(Credit to @prattmic for coming up with option 4 last week.)

@gopherbot
Copy link

@gopherbot gopherbot commented Nov 2, 2020

Change https://golang.org/cl/267157 mentions this issue: runtime: make getMCache inlineable

@gopherbot gopherbot closed this in ac766e3 Nov 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants