runtime: BinaryTree17 performance regression #13535
Comments
This commit has a significant effect on the behavior of the background sweeper, which means the profiling information is largely masked by #13527. In fact, by just about every performance counter, it looks like things should have improved: Before:
After:
The one thing that doesn't improve is the amount of time we spend blocked in futex: it goes from 43528.4 total ms to 45066.6 total ms (though the number of futex calls drops from 272919 to 106681!). (The futex stats are computed from
With the background sweeper disabled as a workaround, the performance metrics become much clearer: Before:
After:
It's clear we're getting far more LLC references and misses. The increased LLC references are presumably from lower-level cache misses. All of the other counters are virtually identical between the runs. A differential profile of cache-references shows that runtime.memclr goes from 0.75% to 27.82% (1st place) and runtime.mCentral_Grow goes from 0.34% to 2.70% (6th place). The rest of the profile is basically unchanged. According to Counting the number of mCache_Grow calls reveals that this commit raised it by 6X from 38,805 to 232,021. However, we're clearly satisfying these from the heap (not from the OS), because the max RSS is virtually unchanged at ~600MB.
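As a quick sanity check on the figures quoted in the comments above, the following is a minimal sketch that derives the mean blocked time per futex call and the growth ratio of the grow calls. The helper names `meanPerCall` and `ratio` are hypothetical and exist only for this illustration; the input numbers are copied from the comments, and nothing here is part of the Go runtime or of the tooling used to collect them.

```go
package main

import "fmt"

// meanPerCall returns the average time per call in milliseconds,
// given a total time in milliseconds and a call count.
// (Hypothetical helper for illustration only.)
func meanPerCall(totalMs float64, calls int) float64 {
	return totalMs / float64(calls)
}

// ratio returns after/before as a floating-point factor.
// (Hypothetical helper for illustration only.)
func ratio(before, after int) float64 {
	return float64(after) / float64(before)
}

func main() {
	// Futex totals and call counts as quoted in the thread.
	fmt.Printf("futex ms/call before: %.3f\n", meanPerCall(43528.4, 272919))
	fmt.Printf("futex ms/call after:  %.3f\n", meanPerCall(45066.6, 106681))

	// Grow-call counts as quoted in the thread (38,805 -> 232,021).
	fmt.Printf("grow calls after/before: %.2fx\n", ratio(38805, 232021))
}
```

Read this way, the total futex time barely moves, but the mean time blocked per call rises from roughly 0.16 ms to roughly 0.42 ms, and the grow-call count increases by a factor of just under 6, consistent with the "6X" stated above.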
CL https://golang.org/cl/17745 mentions this issue.
BinaryTree17 from test/bench/go1 clearly slowed down by ~5% with commit 7407d8e, which made the proportional sweeper less aggressive.