[jit] Clear recursive error stack on each compilation #23458

Closed
wants to merge 11 commits into from

@driazati (Contributor) commented Jul 26, 2019

Previously we weren't clearing the stack, so any failures that didn't
stop the program stayed around in the stack and would show up if
something else accessed the stack.

Differential Revision: D16866719
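
The failure mode described above can be illustrated with a minimal sketch in plain Python. Everything here is hypothetical: `_error_stack`, `compile_fn`, and `try_compile` are stand-ins invented for illustration and are not PyTorch APIs. The sketch only assumes a global stack that is pushed on entry and left in place when a compilation fails.

```python
# Hypothetical stand-in for the JIT's global error stack; not a PyTorch API.
_error_stack = []


def compile_fn(name, fails=False, clear_first=False):
    """Pretend to compile `name`, recording it on the global error stack."""
    if clear_first:
        # The idea of the fix: drop entries left by earlier failed compilations.
        _error_stack.clear()
    _error_stack.append(name)
    if fails:
        # On failure the entry is never popped, so it lingers on the stack.
        raise RuntimeError("failed while compiling: " + " -> ".join(_error_stack))
    _error_stack.pop()


def try_compile(name, **kwargs):
    """Run compile_fn and return the error message, or None on success."""
    try:
        compile_fn(name, **kwargs)
    except RuntimeError as e:
        return str(e)
    return None


# Without clearing, a caught failure leaks into the next error report.
first = try_compile("foo", fails=True)
second = try_compile("bar", fails=True)

# With clearing, each report mentions only its own compilation.
_error_stack.clear()
try_compile("foo", fails=True, clear_first=True)
third = try_compile("bar", fails=True, clear_first=True)
```

In this sketch `second` still mentions `foo` even though that failure was already handled, while `third` mentions only `bar`.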

@pytorchbot added the `oncall: jit` label (Add this issue/PR to JIT oncall triage queue) Jul 26, 2019
def script(obj, optimize=None, _frames_up=0, _rcb=None):
    if not _enabled:
        return obj

    # In case there were some previous failed compilations, clear out the stack
    torch._C._clear_compilation_error_stack()
Member

I'm suspicious of using manual calls to manipulate the error stack state. Can we not make pushing/popping an RAII-like property of the IR generation step, so that the constructor/destructor of to_ir pushes onto/pops off the stack?
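
As a rough illustration of the RAII-like idea suggested here, the following Python sketch ties push and pop to a scope using a context manager. It is an analogy only: `call_stack` and `compiling` are hypothetical names, and the real suggestion concerns the C++ constructor/destructor of `to_ir`, not Python code.

```python
from contextlib import contextmanager

call_stack = []  # hypothetical stand-in for the JIT's error call stack


@contextmanager
def compiling(fn_name):
    """Push fn_name on entry and always pop it on exit (RAII-like)."""
    call_stack.append(fn_name)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, like a destructor would.
        call_stack.pop()


def compile_nested():
    """Return (stack depth inside both scopes, stack depth after leaving)."""
    with compiling("outer"):
        with compiling("inner"):
            depth_inside = len(call_stack)
    return depth_inside, len(call_stack)


def compile_failing():
    """Return the stack depth after a failure inside a scope was handled."""
    try:
        with compiling("outer"):
            raise ValueError("bad IR")
    except ValueError:
        pass
    return len(call_stack)
```

With scoped push/pop, no caller has to remember to clear the stack: it is balanced after both successful and failed compilations.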

Contributor Author

That would be good, but I don't know if it can work here. When CallStack's constructor/destructor replace push/pop_function, it looks like stack unwinding happens before exception::what() is called, so the stack gets deleted before it's shown.
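
The ordering problem described in this reply can be reproduced in a small Python analogy, assuming an exception that builds its message from the stack only when the message is requested (loosely mirroring a lazy `what()`). All names below are hypothetical. The eager variant is included only for contrast; the thread does not say which approach the final version took.

```python
from contextlib import contextmanager

call_stack = []  # hypothetical stand-in for the JIT's error call stack


class LazyCompileError(Exception):
    """Builds its message from call_stack only when str() is called."""

    def __str__(self):
        return "error in: " + " -> ".join(call_stack)


class EagerCompileError(Exception):
    """Snapshots call_stack into the message at construction time."""

    def __init__(self):
        super().__init__("error in: " + " -> ".join(call_stack))


@contextmanager
def compiling(fn_name):
    """Scoped push/pop; the pop runs while the exception propagates."""
    call_stack.append(fn_name)
    try:
        yield
    finally:
        call_stack.pop()


def report(exc_type):
    """Raise exc_type two scopes deep and return the message seen by the handler."""
    try:
        with compiling("outer"):
            with compiling("inner"):
                raise exc_type()
    except Exception as e:
        # By this point both scopes have exited and popped their entries.
        return str(e)


lazy_message = report(LazyCompileError)
eager_message = report(EagerCompileError)
```

Here `lazy_message` carries no frames because the scopes were unwound before the handler asked for the message, whereas `eager_message` still lists `outer -> inner`.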

@pytorchbot added the `module: pybind` label (Related to our Python bindings / interactions with other Python libraries) Jul 31, 2019
@suo (Member) left a comment

lgtm

@driazati (Contributor Author) commented Aug 1, 2019

@pytorchbot retest this please

driazati pushed a commit that referenced this pull request Aug 1, 2019
This is a redo of #23458, which was hitting weird Windows failures
after it changed `CallStack` to be an RAII-type resource. This PR is a
temporary fix for the 1.2 release, but #23458 should be landed in favor
of this eventually.
@suo (Member) commented Aug 8, 2019

Can we revive this asap? This is still an open issue on master

@driazati (Contributor Author) commented Aug 8, 2019

> Can we revive this asap? This is still an open issue on master

The hot fix that's in 1.2 can be landed in #23682

@suo (Member) commented Aug 9, 2019

sure, but I'd prefer to get the non-goofy version

Your Name added 2 commits August 12, 2019 16:22
@driazati (Contributor Author) commented

@pytorchbot retest this please

@driazati (Contributor Author) commented

@pytorchbot rebase this please

@facebook-github-bot (Contributor) commented

@driazati merged this pull request in 10c4564.

@facebook-github-bot facebook-github-bot deleted the driazati/errors/4 branch July 13, 2020 17:54
Labels
Merged
module: pybind (Related to our Python bindings / interactions with other Python libraries)
oncall: jit (Add this issue/PR to JIT oncall triage queue)
5 participants