[inductor] Avoid fallback case for custom scan op lowering #130936
Dr. CI: ✅ No failures as of commit f9051b4 with merge base 7919f0b. Artifacts and rendered test results are at hud.pytorch.org/pr/130936.
just an idea for the naming
torch/_inductor/ir.py
    *,
    # Whether we should fallback if split criteria is met but the backend
    # feature is not available
    require_split_scan: bool = True,
can_fallback_to_aten: bool = True?
That's a much better name, thanks!
We currently can't generate split scans when there are multiple scan values, so we normally fall back to ATen. However, for the higher-order scan op, we can't fall back, so it makes sense to just generate the slower kernel anyway. This avoids having special shapes where we fail to codegen.
ghstack-source-id: 232843a
Pull Request resolved: #130936
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
Merge failed. Reason: 1 job has failed: trunk / linux-focal-rocm6.1-py3.8 / test (default, 2, 2, linux.rocm.gpu)
We currently can't generate split scans when there are multiple scan values, so we normally fall back to ATen. However, for the higher-order scan op, we can't fall back, so it makes sense to just generate the slower kernel anyway. This avoids having special shapes where we fail to codegen.
ghstack-source-id: 5346730
Pull Request resolved: #130936
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
…30936)
We currently can't generate split scans when there are multiple scan values, so we normally fall back to ATen. However, for the higher-order scan op, we can't fall back, so it makes sense to just generate the slower kernel anyway. This avoids having special shapes where we fail to codegen.
Pull Request resolved: pytorch#130936
Approved by: https://github.com/lezcano
Stack from ghstack (oldest at bottom):
We currently can't generate split scans when there are multiple scan
values, so we normally fall back to ATen. However, for the higher-order
scan op, we can't fall back, so it makes sense to just generate the slower
non-split kernel anyway. This avoids having special shapes where we fail
to codegen.
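
For readers skimming the thread, the decision described above can be sketched roughly as follows. This is an illustrative stand-alone model, not the actual code in torch/_inductor/ir.py: `ScanPlan`, `choose_scan_plan`, `num_scan_values`, `meets_split_criteria`, and `backend_supports_split_scan` are hypothetical names, and only `can_fallback_to_aten` comes from the review discussion.

```python
from enum import Enum


class ScanPlan(Enum):
    """Hypothetical labels for the three possible lowering outcomes."""

    SPLIT_SCAN = "split_scan"        # scan work is split across blocks
    SINGLE_KERNEL = "single_kernel"  # one non-split kernel, possibly slower
    ATEN_FALLBACK = "aten_fallback"  # skip codegen and call the ATen op


def choose_scan_plan(
    num_scan_values: int,
    meets_split_criteria: bool,
    *,
    backend_supports_split_scan: bool = True,
    can_fallback_to_aten: bool = True,
) -> ScanPlan:
    """Pick a lowering strategy for a scan (illustrative assumption only)."""
    if not meets_split_criteria:
        # The shape does not call for a split scan, so a single kernel is fine.
        return ScanPlan.SINGLE_KERNEL

    # Per the PR description, split scans cannot currently be generated
    # when there is more than one scan value.
    can_split = backend_supports_split_scan and num_scan_values == 1
    if can_split:
        return ScanPlan.SPLIT_SCAN

    if can_fallback_to_aten:
        # Built-in scans: hand the op back to ATen.
        return ScanPlan.ATEN_FALLBACK

    # Higher-order scan op: no fallback exists, so generate the slower
    # non-split kernel instead of failing to codegen.
    return ScanPlan.SINGLE_KERNEL


if __name__ == "__main__":
    print(choose_scan_plan(1, True))
    print(choose_scan_plan(2, True))
    print(choose_scan_plan(2, True, can_fallback_to_aten=False))
```

In this sketch a caller lowering the higher-order scan op would pass `can_fallback_to_aten=False`, so every shape maps to some generated kernel rather than to a fallback that does not exist.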
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang