Fix FallbackKernel behavior on mutable ops #118649
FallbackKernel wasn't handling mutable ops correctly: it would not report them in get_mutation_names or get_alias_names. This would lead to silent incorrectness -- Inductor would incorrectly reorder the mutable op with other mutable ops.

This PR fixes that:
- We only support mutable operations that are "auto_functionalizable". That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified in get_alias_names and processed via mark_node_as_mutating.
- We also do some minor cleanup by killing dead code (FallbackKernel no longer processes OpOverloadPacket) and adding some handling around HOPs.

Test Plan:
- new tests
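To make the reordering hazard concrete, here is a minimal sketch of how a scheduler might derive ordering constraints from reported reads and mutations. It is a toy model, not Inductor's scheduler: `ToyNode`, `must_stay_ordered`, and the buffer name `buf0` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a scheduler node; not an Inductor class."""
    name: str
    reads: set = field(default_factory=set)
    mutation_names: set = field(default_factory=set)


def must_stay_ordered(first: ToyNode, second: ToyNode) -> bool:
    """Return True if `second` may not be moved ahead of `first`."""
    # Write-after-write, read-after-write, and write-after-read hazards
    # on the same buffer all force the original order to be kept.
    return bool(
        first.mutation_names & second.mutation_names
        or first.mutation_names & second.reads
        or first.reads & second.mutation_names
    )


# Two custom ops that both mutate buf0 in place and report it.
op_a_reported = ToyNode("custom_a", reads={"buf0"}, mutation_names={"buf0"})
op_b_reported = ToyNode("custom_b", reads={"buf0"}, mutation_names={"buf0"})

# The same two ops when neither mutation is reported: to the toy scheduler
# they look like two independent readers of buf0.
op_a_silent = ToyNode("custom_a", reads={"buf0"})
op_b_silent = ToyNode("custom_b", reads={"buf0"})

print(must_stay_ordered(op_a_reported, op_b_reported))  # True
print(must_stay_ordered(op_a_silent, op_b_silent))      # False
```

In this model the second pair looks freely reorderable even though both ops write to the same buffer, which is the kind of silent incorrectness the description refers to.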
Dr. CI (automatically generated comment): ✅ No failures as of commit 86e5690 with merge base 064610d. Artifacts and rendered test results are at hud.pytorch.org/pr/118649.
ghstack-source-id: e85d925
Pull Request resolved: #118649
if isinstance(self.op_overload, torch._ops.HigherOrderOperator):
    # We assume here that HOPs with FallbackKernel are functional.
    # This may not always be true! HOPs must individually opt-in to
    # FallbackKernel, so please check this if you opt-in.
    return
Can we check this assumption somehow?
No, there's no notion of a schema for HOPs. Though we generally tell people that they must be functional.
    return

if schema.is_mutable and not can_auto_functionalize(kernel):
    raise NotImplementedError(
Why can't we handle this case?
For custom ops, the schema is not enough information to tell us about the semantics of the operator. For example, if the operator has a schema that looks like "(Tensor(a!) x) -> Tensor(a)", it's unclear if the operator is (1) returning x as-is or (2) returning a view on x.
For aten operators, we have the convention that it is always (1). It's unclear to me how to generate code when we don't know which of these two situations we're in.
Pragmatically, I've never seen a custom op have a schema that looks like the above, so, this is not a case we need to worry about.
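As a rough illustration of the rule being enforced, the sketch below classifies schema strings by their alias annotations. It is a toy string-based check under stated assumptions: `toy_can_auto_functionalize` is a hypothetical helper, not the real `can_auto_functionalize` (which is called on the kernel, not on a string), and the regex only covers simple annotations such as `Tensor(a!)` and `Tensor(a)`.

```python
import re

# An alias annotation like "(a)" or "(a!)" directly after a type name.
# The lookbehind keeps a parenthesized return list such as "(Tensor)" from
# being mistaken for an annotation.
_ALIAS = re.compile(r"(?<=[\w\]])\((\w+)(!?)\)")


def split_schema(schema: str):
    """Split '(args) -> returns' into its argument and return halves."""
    args, _, returns = schema.partition("->")
    return args.strip(), returns.strip()


def mutated_alias_sets(schema: str) -> set:
    """Alias sets that are written to, i.e. annotated with '!' on an argument."""
    args, _ = split_schema(schema)
    return {name for name, bang in _ALIAS.findall(args) if bang}


def returned_alias_sets(schema: str) -> set:
    """Alias sets that appear on a return value."""
    _, returns = split_schema(schema)
    return {name for name, _ in _ALIAS.findall(returns)}


def toy_can_auto_functionalize(schema: str) -> bool:
    """True if the op mutates an input and returns no alias of any input."""
    return bool(mutated_alias_sets(schema)) and not returned_alias_sets(schema)


print(toy_can_auto_functionalize("(Tensor(a!) x) -> ()"))         # True
print(toy_can_auto_functionalize("(Tensor(a!) x) -> Tensor(a)"))  # False
print(toy_can_auto_functionalize("(Tensor x) -> Tensor"))         # False
```

The second schema is the ambiguous case discussed above: the annotation alone cannot say whether the op returns `x` as-is or a view on `x`, so the toy check rejects it.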
torch/_inductor/ir.py
Outdated
schema_args = schema.arguments
args, kwargs = self.unflatten_args(self.inputs, self.constant_args)
input_alias_sets = set()
for arg, info in zip(args, schema_args):
I'm not sure that these two match up: schema_args will not unroll Tensor lists, for example. Could we add some assertions to check this?
Yeah, I'll work through this more
I think an assertion for isinstance(Tensor) is sufficient here, will put it in (and some comments about why)
What about func(Tensor[] inp, Tensor(a!) inp2)? Won't we still unflatten the inp list and have the mutation arg not be aligned?
We've got a test for this. But to answer your question, self.unflatten_args puts the flat list back into the structure that is aligned with the schema. Though I realized I forgot to handle kwargs, will go do that.
@eellison I put in a bunch of safety checks (and we handle kwargs now), please let me know if that's what you were looking for.
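The alignment concern in this thread can be sketched with plain Python values. This is a toy model only: `flatten_args` and `unflatten_args` here are hypothetical helpers written for this example, not the real `self.unflatten_args`, and strings stand in for tensors. It uses the `func(Tensor[] inp, Tensor(a!) inp2)` case raised above.

```python
def flatten_args(args):
    """Flatten positional args, unrolling lists, and record each arg's shape."""
    flat, spec = [], []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            flat.extend(arg)
            spec.append(len(arg))  # list argument with this many elements
        else:
            flat.append(arg)
            spec.append(None)      # single, non-list argument
    return flat, spec


def unflatten_args(flat, spec):
    """Rebuild the per-argument structure recorded by flatten_args."""
    it = iter(flat)
    out = []
    for n in spec:
        if n is None:
            out.append(next(it))
        else:
            out.append([next(it) for _ in range(n)])
    return out


# (name, is_mutated) per schema argument of func(Tensor[] inp, Tensor(a!) inp2).
schema_args = [("inp", False), ("inp2", True)]
args = [["t0", "t1"], "t2"]

flat, spec = flatten_args(args)
restored = unflatten_args(flat, spec)

# Safety check in the spirit of the review: one value per schema argument.
assert len(restored) == len(schema_args)

# Zipping the restored structure picks the right mutated argument ...
mutated = [arg for arg, (_, is_mut) in zip(restored, schema_args) if is_mut]
# ... while zipping the flat list is misaligned by the unrolled Tensor list.
naive = [arg for arg, (_, is_mut) in zip(flat, schema_args) if is_mut]

print(mutated)  # ['t2']
print(naive)    # ['t1']
```

Zipping against the flat list silently marks the wrong value as mutated, which is why restoring the schema-aligned structure (plus a length assertion) matters before reading alias information.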
        return devices[0]
    return None

def has_side_effects(self):
    return self.alias_names

def get_mutation_names(self):
    return self.mutation_names
Do we need to assert that this has 0 or 1 elements?
Yes, will do
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours).
Pull Request resolved: #118649
Approved by: https://github.com/eellison, https://github.com/oulgen
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler