-
Notifications
You must be signed in to change notification settings - Fork 25k
Add python stack tracing option on on-demand flow #80919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful links
✅ No Failures (0 Pending)As of commit 7f411f3 (more details on the Dr. CI page): Expand to see more💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
This pull request was exported from Phabricator. Differential Revision: D37410204 |
Summary: X-link: pytorch/pytorch#80919 Pull Request resolved: pytorch#628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Differential Revision: D37410204 fbshipit-source-id: b7ae9b49d87e58b84123c852ba79fe8e80aecbaf
This pull request was exported from Phabricator. Differential Revision: D37410204 |
950c5a6
to
3470eed
Compare
Summary: X-link: pytorch/pytorch#80919 Pull Request resolved: pytorch#628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Differential Revision: D37410204 fbshipit-source-id: 67061846905eeb44aa38f70654731acfde7aa8ad
3470eed
to
9aa30e0
Compare
This pull request was exported from Phabricator. Differential Revision: D37410204 |
Summary: X-link: pytorch/pytorch#80919 Pull Request resolved: pytorch#628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Differential Revision: D37410204 fbshipit-source-id: f3f98c161897d39e57ce1c3df55d6d69ce8c0214
This pull request was exported from Phabricator. Differential Revision: D37410204 |
9aa30e0
to
5b2f55a
Compare
This pull request was exported from Phabricator. Differential Revision: D37410204 |
5b2f55a
to
c6faad6
Compare
This pull request was exported from Phabricator. Differential Revision: D37410204 |
c6faad6
to
93f6725
Compare
This pull request was exported from Phabricator. Differential Revision: D37410204 |
93f6725
to
e78e9ad
Compare
e78e9ad
to
da24bf0
Compare
This pull request was exported from Phabricator. Differential Revision: D37410204 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, with an action item to clean up the interface in a follow up PR.
Summary: X-link: pytorch/pytorch#80919 Pull Request resolved: pytorch#628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Trace after changes by running //kineto/libkineto/fb/integration_tests/trace_tester.cpp (requested by chaekit) https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree%2Ftraces%2Fdynocli%2F0%2F1657304871%2F127.0.0.1%2Flibkineto_activities_3502962.json.gz&bucket=gpu_traces Reviewed By: chaekit Differential Revision: D37410204 fbshipit-source-id: 569c05335ca47701bd0c5a6e821b986846766702
This pull request was exported from Phabricator. Differential Revision: D37410204 |
725b8e6
to
429b378
Compare
Summary: X-link: pytorch/pytorch#80919 Pull Request resolved: pytorch#628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Trace after changes by running //kineto/libkineto/fb/integration_tests/trace_tester.cpp (requested by chaekit) https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree%2Ftraces%2Fdynocli%2F0%2F1657304871%2F127.0.0.1%2Flibkineto_activities_3502962.json.gz&bucket=gpu_traces Reviewed By: chaekit Differential Revision: D37410204 fbshipit-source-id: c9d38d0dd925e54153c95ee2322feaa2ed0fa486
This pull request was exported from Phabricator. Differential Revision: D37410204 |
429b378
to
44805b4
Compare
Summary: Pull Request resolved: pytorch#80919 X-link: pytorch/kineto#628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Trace after changes by running //kineto/libkineto/fb/integration_tests/trace_tester.cpp (requested by chaekit) https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree%2Ftraces%2Fdynocli%2F0%2F1657304871%2F127.0.0.1%2Flibkineto_activities_3502962.json.gz&bucket=gpu_traces Test Plan: 1. launch a python test case with the following command for on-demand flow: echo -e "PYTHON_STACK_TRACE=true" > /tmp/scott_kineto.conf && dyno gputrace --gputrace_duration 300ms --gpuconf /tmp/scott_kineto.conf 2. Then, we can see with_stack enabled as intended from output log: INFO:2022-06-27 15:00:16 1009443:1011716 kineto_client_interface.cpp:22] withStack : 1 Reviewed By: chaekit Differential Revision: D37410204 fbshipit-source-id: 7962c05b3bccb1359dfc46d832abda10caeb0e81
This pull request was exported from Phabricator. Differential Revision: D37410204 |
44805b4
to
7f411f3
Compare
Summary: X-link: pytorch/pytorch#80919 Pull Request resolved: #628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Trace after changes by running //kineto/libkineto/fb/integration_tests/trace_tester.cpp (requested by chaekit) https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree%2Ftraces%2Fdynocli%2F0%2F1657304871%2F127.0.0.1%2Flibkineto_activities_3502962.json.gz&bucket=gpu_traces Reviewed By: chaekit Differential Revision: D37410204 fbshipit-source-id: 2df900afc57ddab776b7cf1fd303debd9eafe8a1
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged) |
@pytorchbot successfully started a merge job. Check the current status here |
Hey @slgong-fb. |
Summary: Pull Request resolved: #80919 X-link: pytorch/kineto#628 Changes: 1. add an option in Config; can use 'PYTHON_STACK_TRACE=true' option (via .conf) 2. deliver PYTHON_STACK_TRACE value to kineto_client_interface start() 3. abstract class also changed. Trace after changes by running //kineto/libkineto/fb/integration_tests/trace_tester.cpp (requested by chaekit) https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree%2Ftraces%2Fdynocli%2F0%2F1657304871%2F127.0.0.1%2Flibkineto_activities_3502962.json.gz&bucket=gpu_traces Test Plan: 1. launch a python test case with the following command for on-demand flow: echo -e "PYTHON_STACK_TRACE=true" > /tmp/scott_kineto.conf && dyno gputrace --gputrace_duration 300ms --gpuconf /tmp/scott_kineto.conf 2. Then, we can see with_stack enabled as intended from output log: INFO:2022-06-27 15:00:16 1009443:1011716 kineto_client_interface.cpp:22] withStack : 1 Reviewed By: chaekit Differential Revision: D37410204 fbshipit-source-id: 2df900afc57ddab776b7cf1fd303debd9eafe8a1
@pytorchbot revert -m "Sorry, reverting as this broke buck build/test https://hud.pytorch.org/pytorch/pytorch/commit/f50a248a5eacb9a9aa475a9e610486aea136e4f5" -c nosignal Could you please add ciflow/periodic to the reland PR to capture the signal? |
@pytorchbot successfully started a revert job. Check the current status here |
@slgong-fb your PR has been successfully reverted. |
This reverts commit f50a248. Reverted #80919 on behalf of https://github.com/janeyx99 due to Sorry, reverting as this broke buck build/test https://hud.pytorch.org/pytorch/pytorch/commit/f50a248a5eacb9a9aa475a9e610486aea136e4f5
To be clear, buck build helped to catch an actual issue with the code: |
the buck build failure is legit but not sure why
this is strictly profiler change and does not touch any PyTorch core. also it passed in CI before landing |
It's unrelated to the BCE test. The issue is that before the class hierarchy looked something like:
And it got changed to:
Honestly, I think the question isn't why did buck fail but rather "why didn't all other CI fail"? (My suspicion is that PyTorch is using Kineto's CMake files so whatever lets it through Kineto CI also lets it through PyTorch CI. But just a hunch.) It didn't fail CI because buck CI doesn't run on PRs, it runs periodically. If it was an issue with buck I would have fought the revert (I have also been bitten by this and it's very annoying), but in this case buck was flagging a real problem as @kit1980 points out. Speaking of which, this seems like a good reason to put buck build in the per PR group. |
Summary:
Changes:
Test Plan:
launch a python test case with the following command for on-demand flow:
echo -e "PYTHON_STACK_TRACE=true" > /tmp/scott_kineto.conf && dyno gputrace --gputrace_duration 300ms --gpuconf /tmp/scott_kineto.conf
Then, we can see with_stack enabled as intended from output log:
INFO:2022-06-27 15:00:16 1009443:1011716 kineto_client_interface.cpp:22] withStack : 1
Differential Revision: D37410204