[inductor] Make AOT CPU Inductor work in fbcode #106225
Conversation
Summary: This diff has a couple of hacks to make inductor-CPU work for AOT codegen in fbcode:

- We need to add the CUDA link flags; AOT-Inductor is specialized for CUDA right now and uses a lot of `at::cuda` stuff. We should do a proper AOT CPU at some point, but this unblocks perf measurement.
- Add an include path to the cpp_prefix. It's kind of hilarious; we remove the include path for remote execution, but then for AOT we need it back. 🤷

Test Plan: internal test

Differential Revision: D47882848

fbshipit-source-id: fb2eb9a70128fb6f0465e0c64ef9c9a5fd2131ac
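The two hacks above amount to conditionally extending the compile/link command line. A minimal sketch of that logic (the function name `build_compile_args` and the default paths are illustrative, not the actual Inductor internals):

```python
def build_compile_args(base_args, aot=True, fbcode=True,
                       cuda_lib_dir="/usr/local/cuda/lib64",
                       cpp_prefix_dir="torch/_inductor/codegen"):
    """Return compiler args, adding CUDA link flags and the cpp_prefix
    include path when doing AOT codegen in fbcode (hypothetical sketch)."""
    args = list(base_args)
    if aot and fbcode:
        # AOT-Inductor currently uses at::cuda, so CUDA libraries must be
        # on the link line even for CPU models.
        args += [f"-L{cuda_lib_dir}", "-lcudart"]
        # The include path stripped for remote execution is restored here
        # so the cpp_prefix header resolves during AOT compilation.
        args += [f"-I{cpp_prefix_dir}"]
    return args
```

A proper AOT CPU path would make the CUDA flags unnecessary; this sketch just mirrors the unblocking workaround the diff describes.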
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/106225. Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures as of commit 83b417a. This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D47882848
LGTM!
Understood that it is a hack.
@pytorchbot merge -f "all tests passed"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov