-
Notifications
You must be signed in to change notification settings - Fork 24.9k
Fix size_hint call sites failing on unbacked SymInts #110520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110520
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ✅ No FailuresAs of commit 960577b with merge base 898482f ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler [ghstack-poisoned]
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: c3044c0 Pull Request resolved: #110520
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be constraining the hint with the value range if it exists, cc @ezyang.
torch/_inductor/dependencies.py
Outdated
@@ -49,7 +49,7 @@ def numbytes_hint(self): | |||
for var, size in zip(self.var_names, self.size): | |||
if var in vars: | |||
numel = numel * size | |||
return V.graph.sizevars.size_hint(numel) * get_dtype_size( | |||
return V.graph.sizevars.size_hint(numel, fallback=8192) * get_dtype_size( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is sound, see
def can_inplace(self, read_dep: dependencies.MemoryDep):
if self.get_aliases() or self.is_template():
return False
if read_dep.name not in self.scheduler.name_to_node:
# don't allow reuse of an 'input' buffer, we don't own it
# (would this have been fixed if I tracked mutations properly above?)
return False
if not isinstance(
self.node, (torch._inductor.ir.AllReduce, torch._inductor.ir.InPlaceHint)
):
# TODO make this a property of the IR
return False
if len(self.read_writes.writes) == 1:
write_dep = next(iter(self.read_writes.writes))
return read_dep.numbytes_hint() == write_dep.numbytes_hint()
return False
We must know for sure that the sizes are the same to inplace, so we shouldn't fallback, we should reject inplacing (or maybe do it symbolically?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the problem you're referring to specifically related to can_inplace
(in which case we can, indeed, reject inplacing in can_inplace
in the face of unbacked SymInts)? Or are there other cases where using the fallback=8192
inside numbytes_hint
can cause problems (in which case we can make fallback=None
a parameter of numbytes_hint
, propagate it to size_hint
, and set it to 8192
only from unproblematic places)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can_inplace
is what needs fixing. But it looks like this function is actually the sole user of numbytes_hint
, soooooo
torch/_inductor/ir.py
Outdated
@@ -1769,7 +1769,7 @@ def _dynamic_reshape_indexer(old_size, new_size): | |||
""" | |||
Perform a reshape entirely by modifying indexing math | |||
""" | |||
size_hint = V.graph.sizevars.size_hint | |||
size_hint = partial(V.graph.sizevars.size_hint, fallback=8192) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty sus. Can we just disable dynamic reshape indexing when things are unbacked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify what "disable" means in this context? I.e., I see dynamic_reshape_indexer
being used in a few places in the IR like this:
https://github.com/pytorch/pytorch/blob/main/torch/_inductor/ir.py?#L996
or here
https://github.com/pytorch/pytorch/blob/main/torch/_inductor/ir.py?#L1705
and there doesn't seem to be obvious ways to get around those calls. Or should we invoke some other (static) reshape indexer within dynamic_reshape_indexer
if we see unbacked SymInts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I am not too sure. @jansel any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dynamic_reshape_indexer is used to implement views.
I think we could just call .realize()
as a fallback if there are unbacked symints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter two changes look OK, but the first two seem trouble.
@aakhundov yea you could clip it within the size hint api |
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler [ghstack-poisoned]
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler [ghstack-poisoned]
@ezyang @eellison Updates since the last time:
The only Please take a look. Thanks! |
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler [ghstack-poisoned]
if unbacked_symbols_in_sizes and (not is_contiguous_storage_and_layout(x)): | ||
# realize x; otherwise, the dynamic_reshape_indexer below will fail | ||
# due to the size_hint's inability to process unbacked SymInts | ||
x.realize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Chillee and @aakhundov discussed this in chat, the thinking is that realize should be initially contiguous as long as no one makes it noncontiguous later.
torch/_inductor/scheduler.py
Outdated
return False | ||
if write_dep.has_unbacked_symbols(): | ||
# can't inplace, as can't guarantee the exact size match | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably make this a little better by doing an exact symbolic match, but this is fine for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if we take symbolic product of the read_dep
and write_dep
shapes with unbacked SymInts in them, and those two symbolic products turn out to be equal, this means that the buffer sizes are provably the same and we can allow in-placing? I this is the case, I think, it's not too hard to do.
On the other hand, I have doubts about the current model for backed SymInts. We're relying on the size hints of the numels, but size hints being the same doesn't necessarily mean that the shapes are the same? I guess, matching symbolic numels would be more reliable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if we take symbolic product of the read_dep and write_dep shapes with unbacked SymInts in them, and those two symbolic products turn out to be equal, this means that the buffer sizes are provably the same and we can allow in-placing? I this is the case, I think, it's not too hard to do.
Exactly.
On the other hand, I have doubts about the current model for backed SymInts. We're relying on the size hints of the numels, but size hints being the same doesn't necessarily mean that the shapes are the same? I guess, matching symbolic numels would be more reliable?
Yeah, this seems a bit sus. Suppose that I have s0 and s1, and they happen to be the same this run. Why would reusing the buffer be OK if s0 and s1 later vary? Maybe we can construct a test case that forces a failure here...
@@ -1624,6 +1629,9 @@ def score_fusion_memory(self, node1, node2): | |||
common_memory_deps = (node1.read_writes.reads | node1.read_writes.writes) & ( | |||
node2.read_writes.reads | node2.read_writes.writes | |||
) | |||
common_memory_deps = { | |||
dep for dep in common_memory_deps if not dep.has_unbacked_symbols() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this one, wouldn't you prefer to use a fallback heuristic number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat hesitant to use a heuristic number here: if the number is too large, this may lead to overoptimistic saving estimation; if too small, this likely won't make a substantial difference. So my thinking was to (somewhat conservatively) drop the dependencies with unbacked SymInts in them, as no reasonably close estimation can be done (like with backed SymInts). Does this make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
# this cannot be done directly as below as we'll choke on the size_hint | ||
# here | ||
if x_size_product > 0 and not any( | ||
V.graph.sizevars.shape_env.is_unbacked_symint(s) for s in sizes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test redundant now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, x.get_size()
and sizes
can generally be different in the expand
op, with unbacked SymInts being present in any of them, but not the other? That's why I kept both tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right
There are some suboptimal bits, but nothing that looks wrong. Thanks for doing this. It would be really good to have tests for this, because this is from some internal model and we really don't want our only test coverage for this to be when we actually run it for real. |
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler [ghstack-poisoned]
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler [ghstack-poisoned]
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. #109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler [ghstack-poisoned]
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Summary: Unbacked SymInts can't get a `sizevars.size_hint` due to being data-dependent. pytorch#109893 has added a new `fallback` parameter to `sizevars.size_hint` to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: pytorch#110520 Approved by: https://github.com/jansel, https://github.com/ezyang
fix for #143498 We were incorrectly using contiguous strides for a non-contiguous tensor. There are two separate causes: 1. #110520 made it so we turn Views contiguous with unbacked symints becuase `dynamic_reshape_indexer below will fail due to the size_hint's inability to process unbacked SymInts`. Seems like we should fix. Regardless - it will make the input contiguous if input is unbacked to workaround this. 2. We weren't actually making it contiguous! I filed an issue for this here: #145561. This is still worth landing as a fix, even though we should those issues. Pull Request resolved: #145548 Approved by: https://github.com/desertfire
Stack from ghstack (oldest at bottom):
Summary: Unbacked SymInts can't get a
sizevars.size_hint
due to being data-dependent. #109893 has added a newfallback
parameter tosizevars.size_hint
to specify the fallback value in cases like unbacked SymInt. In this PR we add more of those.Test Plan: CI
Reviewers:
Subscribers:
Tasks:
Tags:
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler