[jit] Clear recursive error stack on each compilation #23458
Previously we weren't clearing the stack, so any failures that didn't stop the program stayed around in the stack and would show up if something else accessed the stack.
torch/jit/__init__.py
def script(obj, optimize=None, _frames_up=0, _rcb=None):
    if not _enabled:
        return obj

    # In case there were some previous failed compilations, clear out the stack
    torch._C._clear_compilation_error_stack()
I'm suspicious of using manual calls to manipulate the error stack state. Can we not make pushing/popping an RAII-like property of the IR generation step, so that the constructor/destructor of to_ir pushes onto/pops off the stack?
That would be good, but I don't know if it can work here. When I change CallStack's constructor / destructor to replace push/pop_function, it looks like the stack unwinding happens before exception::what() is called, so the stack gets deleted before it's shown.
lgtm
@pytorchbot retest this please
Can we revive this asap? This is still an open issue on master
The hot fix that's in 1.2 can be landed in #23682
sure, but I'd prefer to get the non-goofy version
@pytorchbot retest this please
@pytorchbot rebase this please
Differential Revision: D16866719