Skip to content

Conversation

zou3519
Copy link
Contributor

@zou3519 zou3519 commented Jan 30, 2024

Stack from ghstack (oldest at bottom):

FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:

  • we only support mutable operations that are "auto_functionalizable".
    That is, they mutate inputs and do not return aliases of any inputs.
  • Following the Triton kernel work, any mutated inputs must be specified
    in get_alias_names and processed via mark_node_as_mutating
  • We also do some minor cleanup by killing dead code (FallbackKernel no
    longer processes OpOverloadPacket) and adding some handling around
    HOPs.

Test Plan:

  • new tests

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler

FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

[ghstack-poisoned]
Copy link

pytorch-bot bot commented Jan 30, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118649

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 86e5690 with merge base 064610d (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zou3519 added a commit that referenced this pull request Jan 30, 2024
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

ghstack-source-id: e85d925
Pull Request resolved: #118649
@zou3519 zou3519 added the keep-going Don't stop on first failure, keep running tests until the end label Jan 30, 2024
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
Comment on lines +4552 to +4556
if isinstance(self.op_overload, torch._ops.HigherOrderOperator):
# We assume here that HOPs with FallbackKernel are functional.
# This may not always be true! HOPs must individually opt-in to
# FallbackKernel, so please check this if you opt-in.
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check this assumption somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there's no notion of a schema for HOPs. Though we generally tell people that they must be functional.

return

if schema.is_mutable and not can_auto_functionalize(kernel):
raise NotImplementedError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why cant we handle this case ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For custom ops, the schema is not enough information to tell us about the semantics of the operator. For example, if the operator has a schema that looks like "(Tensor(a!) x) -> Tensor(a)", it's unclear if the operator is (1) returning x as-is or (2) returning a view on x.

For aten operators, we have the convention that it is always (1). It's unclear to me how to generate code when we don't know which of these two situations we're in.

Pragmatically, I've never seen a custom op have a schema that looks like the above, so, this is not a case we need to worry about.

schema_args = schema.arguments
args, kwargs = self.unflatten_args(self.inputs, self.constant_args)
input_alias_sets = set()
for arg, info in zip(args, schema_args):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that these two match up. schema_args will not unroll Tensor lists for example. could we add some assertions of some kind or another

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll work through this more

Copy link
Contributor Author

@zou3519 zou3519 Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an assertion for isinstance(Tensor) is sufficient here, will put it in (and some comments about why)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about func(Tensor[] inp, Tensor (a!) inp2) ? Wont we still unflatten the inp list and have the mutation arg not be aligned ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've got a test for this. But to answer your question, self.unflatten_args puts the flat list back into the structure that is aligned with the schema. Though I realized I forgot to handle kwargs, will go do that.

Copy link
Contributor Author

@zou3519 zou3519 Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eellison I put in a bunch of safety checks (and we handle kwargs now), please let me know if that's what you were looking for.

FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
@zou3519 zou3519 requested a review from eellison February 5, 2024 21:18
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
@zou3519 zou3519 requested a review from Chillee February 6, 2024 22:20
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
@zou3519 zou3519 added ciflow/trunk Trigger trunk jobs on your pull request release notes: inductor labels Feb 8, 2024
return devices[0]
return None

def has_side_effects(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR: I want us to kill has_side_effects. We should detect this, not mark it.

cc: @eellison @Chillee

return self.alias_names

def get_mutation_names(self):
return self.mutation_names
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to assert that this 0 or 1 elements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will do

FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
@zou3519
Copy link
Contributor Author

zou3519 commented Feb 9, 2024

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

clee2000 pushed a commit that referenced this pull request Feb 14, 2024
FallbackKernel wasn't handing mutable ops correctly: it would not report
them in get_mutation_names or get_alias_names. This would lead to silent
incorrectness -- Inductor would incorrectly reorder the mutable op with other
mutable ops.

This PR fixes that:
- we only support mutable operations that are "auto_functionalizable".
  That is, they mutate inputs and do not return aliases of any inputs.
- Following the Triton kernel work, any mutated inputs must be specified
  in get_alias_names and processed via mark_node_as_mutating
- We also do some minor cleanup by killing dead code (FallbackKernel no
  longer processes OpOverloadPacket) and adding some handling around
  HOPs.

Test Plan:
- new tests

Pull Request resolved: #118649
Approved by: https://github.com/eellison, https://github.com/oulgen
@github-actions github-actions bot deleted the gh/zou3519/900/head branch March 11, 2024 01:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request keep-going Don't stop on first failure, keep running tests until the end Merged module: inductor release notes: inductor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants