[AOTI] Support scalar to tensor in the ABI-compatible mode #118024
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118024
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit 585e614 with merge base b5b36cf. This comment was automatically generated by Dr. CI and updates every 15 minutes.
// AOTI_TORCH_VALUE_TO_TENSOR_IMPL(uint32, uint32_t, UInt32)
// error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to
// ‘torch::detail::TensorDataContainer’ is ambiguous
// AOTI_TORCH_VALUE_TO_TENSOR_IMPL(uint64, uint64_t, UInt64)
Hmm... Well you could always use scalar_to_tensor instead, that should work...
I see. Let me try that.
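A minimal sketch of what that route could look like, assuming at::scalar_to_tensor from ATen/ScalarOps.h and shown here for int64_t; the wrapper name is illustrative, not the PR's actual macro expansion:

```cpp
#include <ATen/ScalarOps.h>
#include <cstdint>

// Routing the value through c10::Scalar and at::scalar_to_tensor sidesteps
// the ambiguous implicit conversion from the raw integer to
// torch::detail::TensorDataContainer that the commented-out macro runs into.
at::Tensor int64_to_tensor(int64_t value, c10::Device device = at::kCPU) {
  return at::scalar_to_tensor(c10::Scalar(value), device);
}
```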
// Move the tensor to the heap and hand ownership across the C ABI boundary.
AtenTensorHandle new_tensor_handle(at::Tensor&& tensor) {
  at::Tensor* new_tensor = new at::Tensor(std::move(tensor));
  return tensor_pointer_to_tensor_handle(new_tensor);
}
Odd, is there a reason this utility didn't exist previously?
It should, but somehow everybody has just been copy-pasting. I will post another refactor PR after this.
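For context, a sketch of the call-site cleanup this enables. `make_contiguous_copy` is a made-up example and the signatures are simplified (real shim entry points return AOTITorchError through out-parameters); only `new_tensor_handle`, `tensor_handle_to_tensor_pointer`, and `tensor_pointer_to_tensor_handle` are taken from the shim code above.

```cpp
// Before the helper, each call site repeated the heap-allocate-and-wrap
// dance by hand:
AtenTensorHandle make_contiguous_copy_before(AtenTensorHandle self) {
  at::Tensor result = tensor_handle_to_tensor_pointer(self)->contiguous();
  at::Tensor* new_tensor = new at::Tensor(std::move(result)); // copy-pasted
  return tensor_pointer_to_tensor_handle(new_tensor);         // everywhere
}

// With new_tensor_handle, the same body collapses to a single expression:
AtenTensorHandle make_contiguous_copy_after(AtenTensorHandle self) {
  return new_tensor_handle(tensor_handle_to_tensor_pointer(self)->contiguous());
}
```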
torch/_inductor/graph.py
Outdated
] = (
    []
)  # This is the linemap used by the profiler to mark custom compiled kernels getting run
self.symbol_to_dtype: Dict[sympy.Symbol, torch.dtype] = {}
Needs docs
torch/_inductor/ir.py
Outdated
self.sym = sym.args[0]
self.is_bool = True

V.graph.symbol_to_dtype[self.sym] = data.get_dtype()
So, I remember you asking me about this, and this is emphatically not what I suggested. Tell me more?
If we return another data structure here, say a dataclass holding both the symbol and its dtype, instead of the bare symbol, we could carry the dtype along rather than recording it in this side table:
pytorch/torch/_inductor/lowering.py, line 2558 in e309d6f:
return sym
No, I still think this is very suspicious.
Looking over the modifications you made, the change here is solely so that ShapeAsConstantBuffer can be code generated. But there is a strange divergence between the Python and the C++: in Python, the shape is code generated as a plain int; in C++, the shape is wrapped into a Tensor. This seems... wrong! Now that we are supporting raw integers flowing through AOTInductor, the shape should be represented as is (as an int64_t) when flowing through AOTInductor.
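To make that alternative concrete: a purely hypothetical sketch of an output representation where a ShapeAsConstantBuffer result stays an int64_t at the ABI boundary. None of these names exist in the actual shim; the AtenTensorHandle typedef is modeled on the real header.

```cpp
#include <cstdint>

// Opaque handle typedef, modeled on the real shim header.
typedef struct AtenTensorOpaque* AtenTensorHandle;

// Hypothetical tagged output slot that can carry either a tensor handle or
// a raw integer, so a shape output would not need to be boxed into a
// 0-dim tensor just to fit a tensors-only return convention.
struct AOTIOutputValue {
  enum class Tag { Tensor, Int } tag;
  union {
    AtenTensorHandle tensor;
    int64_t integer;
  };
};
```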
@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
if config.aot_inductor.abi_compatible:
    output_buffer = V.graph.graph_outputs[idx]
    if isinstance(output_buffer, ir.ShapeAsConstantBuffer):
        # Need to wrap scalar into tensor as the main function returns a vector of tensors
HMMMM, are you ever considering changing this ABI? I can imagine that returning shapes from AOTInductor models is not that common today, so maybe papering over the ABI problem this way is fine, but when we support dynamic-shape AOTInductor models, I really don't want input ints to be passed into the model as tensors; it's so unnecessary.
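For reference, the unboxing the current workaround forces on the caller; a minimal sketch in which only at::Tensor::item<int64_t>() is real API and the function name is illustrative:

```cpp
#include <ATen/ATen.h>
#include <cstdint>

// Under the current ABI workaround, a shape output comes back from the
// compiled model as a 0-dim integer tensor and has to be read back out on
// the caller side; this round-trip is what a raw int64_t slot would avoid.
int64_t read_shape_output(const at::Tensor& shape_as_tensor) {
  return shape_as_tensor.item<int64_t>();
}
```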
cc @chenyang78 who may look into this later.
Thanks, looks good!
@pytorchbot merge -f 'Landed internally' (Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler
Differential Revision: D53019485