[profiler] Add Linux Perf support #87866
* Add support for using the Linux kernel perf subsystem via the profiler.
* For now, perf configurability is limited to event names. Threading and other options will come later.
* Because we want to support a variety of CPU types, the list of supported events (in addition to the standard set of events) is also limited.
* On non-Linux platforms, rather than failing with an unsupported-feature error, the profiler returns zeros for all event counts.
* For now, the number of events is capped at 4, and time multiplexing is not allowed.
* The threadpool recreate hack is restricted to mobile only; better support for threading in general still needs to be added.

Differential Revision: [D40238033](https://our.internmc.facebook.com/intern/diff/D40238033/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D40238033/)!
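
The constraints in the description (at most 4 events, no time multiplexing, zeros instead of an error on non-Linux platforms) can be sketched roughly as follows. This is an illustrative sketch only, not the code added by this PR: `MAX_EVENTS`, `validate_events`, `read_counts`, and the `raw_counts` dictionary (a stand-in for values the kernel would report) are hypothetical names, and the event names used are just examples.

```python
import sys

# Cap taken from the PR description: at most 4 events, because
# time multiplexing of counters is not allowed for now.
MAX_EVENTS = 4


def validate_events(events):
    """Reject event lists that would require time multiplexing (hypothetical helper)."""
    events = list(events)
    if len(events) > MAX_EVENTS:
        raise ValueError(
            f"at most {MAX_EVENTS} events are supported, got {len(events)}"
        )
    return events


def read_counts(events, raw_counts, platform=sys.platform):
    """Return one count per requested event, in request order.

    `raw_counts` is a stand-in for the values the kernel would report;
    a real implementation would read them from the perf subsystem.
    On non-Linux platforms the counts are all zero instead of an error.
    """
    events = validate_events(events)
    if not platform.startswith("linux"):
        return [0] * len(events)
    return [raw_counts[name] for name in events]


if __name__ == "__main__":
    sample = {"cycles": 1200, "instructions": 900}
    print(read_counts(["cycles", "instructions"], sample, platform="linux"))
    print(read_counts(["cycles", "instructions"], sample, platform="darwin"))
```

Returning zeros keeps calling code identical across platforms, at the cost that a caller cannot distinguish "unsupported" from "no events counted" without checking the platform separately.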
Dr. CI: 1 failure as of commit 672abf3. Artifacts and rendered test results are at hud.pytorch.org/pr/87866.
ghstack-source-id: 171665397 Pull Request resolved: #87866
@pytorchbot merge -f 'Landed internally' (Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Merge started: Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Pull Request resolved: pytorch#87866 Approved by: https://github.com/SS-JIA
Stack from ghstack (oldest at bottom):
[profiler] Expose experimental performance events to python #87905
Nested profiling support for Linux-perf Profiler #87904
[edge profiler] Add e2e test for profiler event and chrometrace #87877
[edge profiler] Add support for performance events counting #87876
[profiler] Add Performance events support in Kineto profiler #87874
-> [profiler] Add Linux Perf support #87866