-
Notifications
You must be signed in to change notification settings - Fork 24.9k
Fixes for profiling JIT code #38453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes for profiling JIT code #38453
Conversation
Summary: Two fixes: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: unittest, CI [ghstack-poisoned]
Summary: Two fixes: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: unittest, CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
💊 CI failures summary and remediationsAs of commit afe7020 (more details on the Dr. CI page):
Extra GitHub checks: 1 failed
ci.pytorch.org: 1 failedThis comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker. This comment has been revised 36 times. |
Summary: Two fixes: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: unittest, CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
Summary: Two fixes: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: unittest, CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
Summary: Two fixes: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: unittest, CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
Summary: Two fixes: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: unittest, CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
Summary: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: python test/test_jit.py TestJit.test_profiler CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
Summary: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: python test/test_jit.py TestJit.test_profiler CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, minor comments
aten/src/ATen/Parallel.h
Outdated
@@ -4,7 +4,7 @@ | |||
#include <c10/macros/Macros.h> | |||
|
|||
namespace at { | |||
namespace internal { | |||
namespace internal {// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove?
@@ -943,6 +943,9 @@ struct InterpreterStateImpl : c10::intrusive_ptr_target { | |||
// unique to every frame with prim::profile across all threads | |||
c10::optional<size_t> id; | |||
static std::atomic<size_t> num_frames; | |||
|
|||
// RecordFunction object associated with this frame | |||
std::shared_ptr<at::RecordFunction> record_function; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like it can be a unique_ptr (or even c10::optional)
int64_t dist_autograd_context_id = 0) | ||
: state(state_), stack(std::move(stack_)) { | ||
int64_t dist_autograd_context_id = 0, | ||
c10::optional<at::ThreadLocalState> tls_state = c10::nullopt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious - can you remind me the reason dist_autograd_context is not part of TLS? (I think there was one but I don't recall)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I better refer to #38510 , there's been some ongoing discussion
Summary: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: python test/test_jit.py TestJit.test_profiler CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
Summary: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: python test/test_jit.py TestJit.test_profiler CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
Summary: - RecordFunction in JIT interpreter should exist during the execution of the frame, and not just when we enter the frame - When creating a JIT continuation in wait instruction, we'd want to preserve the original thread local context, right now when we resume execution in continuation we preserve the thread local state of the thread that set future value (i.e. executed a forked task) Test Plan: python test/test_jit.py TestJit.test_profiler CI Differential Revision: [D21565959](https://our.internmc.facebook.com/intern/diff/D21565959) [ghstack-poisoned]
@ilia-cher merged this pull request in 235f624. |
Stack from ghstack:
Summary:
of the frame, and not just when we enter the frame
preserve the original thread local context, right now when we resume
execution in continuation we preserve the thread local state of the
thread that set future value (i.e. executed a forked task)
Test Plan:
python test/test_jit.py TestJit.test_profiler
CI
Differential Revision: D21565959