[inductor] Refactor common triton imports into one function #121438
Conversation
This means that when codegen depends on a particular import, we only need to add it in one place and it is applied to all Triton kernels. This also changes codegen slightly: instead of generating `@pointwise` we now generate `@triton_heuristics.pointwise`, just so the imports are the same for all kernel types. [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/121438
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 7e0dfa6 with merge base 953c6c3.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -91,6 +92,7 @@ def f(lhs, index, rhs):
         kernel_code = kernel_list[0]
         self.assertEqual(metrics._count_pattern(kernel_code, "tl.atomic_add"), 1)

+    @largeTensorTest(25e7 * 2 * 4, device=GPU_TYPE)
Fair enough, but how come this was passing before? Or it was not, and it was just not being tested on small GPU boxes?
It's only actually 2 GB of data, but was failing for me because of other users on the machine. Would be fine in CI.
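The "2 GB" figure follows directly from the decorator argument `25e7 * 2 * 4`: 25e7 elements per tensor, 2 tensors, and 4 bytes per float32 element. A quick check of that arithmetic:

```python
# Sizing behind the largeTensorTest argument: 25e7 elements per tensor,
# 2 tensors, 4 bytes per float32 element.
elements = 25e7
tensors = 2
bytes_per_elem = 4  # float32

total_bytes = elements * tensors * bytes_per_elem
print(total_bytes)        # 2000000000.0 -> 2e9 bytes
print(total_bytes / 1e9)  # 2.0 (decimal GB)
```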
Pull Request resolved: #121267 Approved by: https://github.com/lezcano ghstack dependencies: #121438
This means when codegen depends on a particular import we only need to add it in one place and it's applied to all triton kernels. This also changes codegen slightly so instead of generating `@pointwise` we now generate `@triton_heuristics.pointwise` just so the imports are the same for all kernel types. Pull Request resolved: #121438 Approved by: https://github.com/lezcano
Stack from ghstack (oldest at bottom):
This means when codegen depends on a particular import we only need to
add it in one place and it's applied to all triton kernels.
This also changes codegen slightly: instead of generating
`@pointwise`
we now generate
`@triton_heuristics.pointwise`
just so the imports are the same for all kernel types.
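The idea can be sketched as follows. This is a hypothetical illustration, not the actual inductor code: the helper name `gen_common_triton_imports` and the `codegen_kernel` wrapper are made up for this sketch. The point is that one function owns the import preamble shared by every generated kernel, and the heuristics decorator is spelled through its module so no per-kernel-type import is needed.

```python
# Hypothetical sketch of consolidating the generated-kernel import
# preamble into one function (names here are illustrative only).

def gen_common_triton_imports() -> str:
    # Every generated kernel gets the same header, regardless of its
    # kind, so a new codegen dependency is added in exactly one place.
    return "\n".join([
        "import triton",
        "import triton.language as tl",
        # Importing the module rather than individual decorators lets
        # codegen reference any heuristic as triton_heuristics.<name>.
        "from torch._inductor import triton_heuristics",
    ])

def codegen_kernel(kind: str, body: str) -> str:
    # The decorator is emitted through the module, e.g.
    # @triton_heuristics.pointwise instead of @pointwise.
    return (
        f"{gen_common_triton_imports()}\n\n"
        f"@triton_heuristics.{kind}(...)\n"
        f"@triton.jit\n"
        f"def kernel(...):\n"
        f"    {body}\n"
    )

print(codegen_kernel("pointwise", "..."))
```

With this shape, pointwise, reduction, and template kernels all share the identical preamble, which is what makes the single-site import addition possible.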
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler @amjames @desertfire @chauhang