Replicate generic hardware events on all CPU PMUs #2123
cc @captain5050
Does this need any kind of guarding or is this always the right thing to do?
This is always correct as far as I'm aware. I've tested that it does the right thing on an M2 Ultra Mac Studio, a Pixel 10 and a regular x86 PC (where it just finds the single PMU, which returns the same results as not specifying a PMU).
LebedevRI
left a comment
Didn't test, but seems fine.
Can we add GoogleTest unit tests for this please? I don't know how tricky it is to test, but I'm a little concerned about having no tests at all.
We can test it by verifying that counters are non-zero when the process is pinned to each CPU in the system. We can adapt the test
Done
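For readers following along, the testing idea above (pin the process to each CPU, then check that the counters are non-zero) could be sketched roughly as below. This is not the test that landed in the PR; `AllowedCpus` and `ForEachAllowedCpu` are hypothetical helper names, and the sketch assumes Linux with glibc's `sched_setaffinity`.

```cpp
#include <sched.h>

#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: returns the CPUs the calling thread may run on.
std::vector<int> AllowedCpus() {
  std::vector<int> cpus;
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0) return cpus;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &set)) cpus.push_back(cpu);
  }
  return cpus;
}

// Hypothetical helper: pins the calling thread to each allowed CPU in turn
// and calls fn(cpu) while pinned. Returns how many CPUs were visited and
// restores the original affinity mask before returning.
std::size_t ForEachAllowedCpu(const std::function<void(int)>& fn) {
  cpu_set_t original;
  CPU_ZERO(&original);
  if (sched_getaffinity(0, sizeof(original), &original) != 0) return 0;

  std::size_t visited = 0;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (!CPU_ISSET(cpu, &original)) continue;
    cpu_set_t one;
    CPU_ZERO(&one);
    CPU_SET(cpu, &one);
    // Skip CPUs we are not permitted to pin to (e.g. they went offline).
    if (sched_setaffinity(0, sizeof(one), &one) != 0) continue;
    fn(cpu);
    ++visited;
  }
  sched_setaffinity(0, sizeof(original), &original);
  return visited;
}
```

In a real test, the callback would take a counter snapshot, run some work, take a second snapshot, and expect the difference to be greater than zero on every CPU; that wiring is left out here because it depends on the library's test fixtures.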
Looks like we need to update this from head and do some clang-format fun.
On systems with more than one PMU for the CPUs (e.g. Apple M series SoCs), generic hardware events are created on only one, arbitrarily chosen PMU. Usually this is the big cluster's PMU, which can cause inaccuracies when the process is scheduled onto a little core. To fix this, teach PerfCounters to register generic hardware events on all CPU PMUs. CPU PMUs are identified using the same method as perf.
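As a rough illustration of the approach described above, the sketch below enumerates CPU PMUs from sysfs and builds a per-PMU `config` value for a generic hardware event. It is not the code from this PR. It assumes that perf treats a PMU as a CPU PMU when it is named `cpu` or when its sysfs directory contains a `cpus` file, and that the kernel accepts the PMU type in the upper 32 bits of `perf_event_attr.config` for `PERF_TYPE_HARDWARE` events (`PERF_PMU_TYPE_SHIFT` in `<linux/perf_event.h>`). `CpuPmu`, `FindCpuPmus` and `ExtendedHwConfig` are hypothetical names.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Assumed to match PERF_PMU_TYPE_SHIFT from <linux/perf_event.h>; redefined
// here so the sketch also builds against older kernel headers.
constexpr unsigned kPmuTypeShift = 32;

// Hypothetical record for one CPU PMU found in sysfs.
struct CpuPmu {
  std::string name;
  std::uint32_t type;  // value read from the PMU's `type` file
};

// Scans `root` for PMUs that look like CPU PMUs: either named "cpu" or
// exposing a `cpus` file. Returns an empty list if `root` is unreadable.
std::vector<CpuPmu> FindCpuPmus(
    const fs::path& root = "/sys/bus/event_source/devices") {
  std::vector<CpuPmu> pmus;
  std::error_code ec;
  for (fs::directory_iterator it(root, ec), end; !ec && it != end;
       it.increment(ec)) {
    const fs::path dir = it->path();
    const std::string name = dir.filename().string();
    std::error_code exists_ec;
    if (name != "cpu" && !fs::exists(dir / "cpus", exists_ec)) continue;
    std::ifstream type_file(dir / "type");
    std::uint32_t type = 0;
    if (type_file >> type) pmus.push_back({name, type});
  }
  return pmus;
}

// Combines a PMU type with a generic hardware event ID (for example
// PERF_COUNT_HW_CPU_CYCLES) into one perf_event_attr.config value.
constexpr std::uint64_t ExtendedHwConfig(std::uint32_t pmu_type,
                                         std::uint64_t hw_event) {
  return (static_cast<std::uint64_t>(pmu_type) << kPmuTypeShift) |
         (hw_event & 0xffffffffULL);
}
```

A caller would then open one event per entry returned by `FindCpuPmus()` with `attr.type = PERF_TYPE_HARDWARE` and `attr.config = ExtendedHwConfig(pmu.type, event)`, and sum the readings, since the thread only counts on whichever PMU owns the core it is currently running on.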
Done
Thank you so much :)