Enable nvprims.transpose fusions for nvFuser #86967
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86967.
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit b013d6f.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@SherlockNoMad could you please review the …
@ngimel could you please take a look? It's a purely nvFuser-related change and doesn't modify anything else.
LGTM.
torch/_prims/nvfuser_executor.py (Outdated)
@@ -268,6 +268,29 @@ def __call__(self, *args):
        )

# A set of operators that are supported by nvFuser
# but should not form a fusion group solely on their own
# _non_compute_ops = {
nitpick: commented code should be removed.
It's an example, but it should have been marked as such more clearly 😄
#     "torch.ops.nvprims.broadcast_in_dim.default",
#     "torch.ops.nvprims.squeeze.default",
# }
_non_compute_ops = [
nitpick: peeking through the op return type and categorizing view-like returns as non-compute ops is counter-intuitive (thinking about in-place updates). I think this is safe for our uses, but should we rename this to `_view_like_ops` and add a line in the comment saying that view-like ops are likely non-compute ops, since in-place updates are handled by functionalization?
I agree with you; I used the same name that was already used in the partitioner code. We also don't have the in-place update yet (not merged).
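The filtering idea discussed here — keeping a partition out of nvFuser when it consists solely of view-like/non-compute ops — can be sketched roughly as follows. Note this is an illustrative assumption, not the actual code from `nvfuser_executor.py`; the op names in the set and the helper function are made up for the example.

```python
# Illustrative sketch only: the op names below and the helper are
# assumptions for demonstration, not this PR's actual implementation.
_non_compute_ops = {
    "torch.ops.nvprims.transpose.default",
    "torch.ops.nvprims.broadcast_in_dim.default",
    "torch.ops.nvprims.squeeze.default",
}

def is_worth_fusing(partition_targets):
    # A partition that only rearranges metadata (transpose, view, ...)
    # gains nothing from nvFuser, so keep it only if at least one node
    # performs real computation.
    return any(t not in _non_compute_ops for t in partition_targets)
```

Under this sketch, a partition containing an `add` plus a `transpose` would still be fused, while a transpose-only partition would be left for eager execution.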
partitioner change looks good to me.
@pytorchbot merge -g
Merge started. Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This PR allows transposes to be fused with other operations. If a fusion group is formed only from operations that just manipulate metadata in PyTorch (transpose, view, etc.), then this group is not sent to nvFuser. On top of that, if we have converted to `nvprims` but then decided not to form a fusion group, we modify the graph to use the `prim.impl_aten` attribute instead of calling `prim(*args, **kwargs)`, which has a higher overhead.

cc @kevinstephano @jjsjann123

Pull Request resolved: pytorch#86967
Approved by: https://github.com/jjsjann123, https://github.com/SherlockNoMad
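The `impl_aten` fallback described in the PR summary can be sketched roughly like this. A mock `Prim` class stands in for a real nvprim here — the real change rewrites FX graph nodes rather than wrapping calls, so treat this as an assumption-laden illustration of the dispatch choice, not the PR's code.

```python
class Prim:
    """Mock stand-in for an nvprim: callable, with an eager ATen impl."""
    def __init__(self, name, impl_aten):
        self.name = name
        self.impl_aten = impl_aten  # the cheaper eager implementation

    def __call__(self, *args, **kwargs):
        # Calling the prim directly goes through extra dispatch machinery
        # in PyTorch; this mock simply delegates for demonstration.
        return self.impl_aten(*args, **kwargs)

def lower_unfused_call(prim, fused, *args, **kwargs):
    # When the op did not end up inside a fusion group, bypass the prim's
    # dispatch overhead and call its eager ATen implementation directly.
    if not fused and hasattr(prim, "impl_aten"):
        return prim.impl_aten(*args, **kwargs)
    return prim(*args, **kwargs)
```

Either path yields the same result; the point of the rewrite is purely to avoid per-call overhead for ops that were converted to `nvprims` but will execute eagerly anyway.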