[export] support linear & layer_norm unbacked #155260
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155260
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 1d571f8 with merge base 9bf6593. UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/_prims_common/__init__.py
Outdated
```diff
  # Short-circuits if the tensor is already contiguous or channels-last contiguous
- if is_contiguous(a) or is_channels_last_contiguous(a):
+ if definitely_contiguous(a) or is_known_channels_last_contiguous(a):
```
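For context, a minimal, self-contained sketch of what separates the strict check from a "definitely" check like the one introduced above. All names below (`UnbackedInt`, `strict_check`, `definitely_check`) are illustrative stand-ins, not the actual torch._prims_common helpers, and the "contiguity" logic is deliberately trivial; the point is the raise-versus-False behavior on data-dependent sizes.

```python
class DataDependentError(RuntimeError):
    """Raised when a question can only be answered by guarding on runtime data."""


class UnbackedInt:
    # A size whose concrete value is unknown at trace time (like u0 during export):
    # comparing it would require a guard, which we model here by raising.
    def __eq__(self, other):
        raise DataDependentError("Could not guard on data-dependent expression")


def strict_check(sizes):
    # Strict check: evaluating it on unbacked sizes raises and stops tracing.
    return all(not (s == 0) for s in sizes)  # deliberately trivial "contiguity" logic


def definitely_check(sizes):
    # Guard-free variant: when the answer is data-dependent, report "not provably
    # true" instead of raising, so a short-circuit caller falls through to the
    # general path below it, which is correct regardless of the actual layout.
    try:
        return strict_check(sizes)
    except DataDependentError:
        return False


sizes = [128, UnbackedInt()]
assert definitely_check(sizes) is False  # no error; the fast path is simply skipped
```

A false negative is harmless on a short-circuit like this: the caller skips the fast path and continues with the general stride computation that follows it.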
- I think we need `definitely_non_overlapping_and_dense`, unless you know for sure that all uses of `is_non_overlapping_and_dense` are already non-material checks. If not, maybe introduce the new one and use it where appropriate?
- My bad, I forgot to change the naming of `is_known_channels_last_contiguous` to `definitely_channels_last_contiguous`. Can you do it as part of this change?
- A quick audit tells me the Python version of `is_non_overlapping_and_dense` is only used three times, and there's always a general path. I'm not sure whether I should be checking the C++ call sites. Also, I'm applying it on a short-circuit check, so I was thinking it's fine to leave it be since there's a more general path below.
- Will change `is_known_channels_last_contiguous` -> `definitely_channels_last_contiguous` in #155499.
torch/_subclasses/fake_impls.py
Outdated
```diff
              continue
-         is_contiguous = is_contiguous and op.is_contiguous(
-             memory_format=torch.contiguous_format
+         is_contiguous = (
```
This seems legit. One way to verify, which I have been doing, is putting up a PR where I always take the slow path and seeing what fails.
Also: `is_contiguous` -> `definitely_contiguous`.
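As a rough, hedged illustration of the pattern this rename points at (not the actual fake_impls code), a fast-path layout fold with a guard-free check could look like the sketch below; `_definitely_contiguous` and `binary_fast_path_layout` are made-up names for this example.

```python
import torch


def _definitely_contiguous(t, memory_format):
    # Simplified stand-in: answer False instead of raising when the question cannot
    # be decided (e.g. it would guard on an unbacked size under fake tensors).
    try:
        return t.is_contiguous(memory_format=memory_format)
    except Exception:
        return False


def binary_fast_path_layout(operands):
    # Fold contiguity across all tensor operands. The fast path is only taken when
    # contiguity is provable for every operand; "can't tell" just disqualifies the
    # fast path instead of erroring out during tracing.
    definitely_contiguous = True
    definitely_channels_last = True
    for op in operands:
        if not isinstance(op, torch.Tensor):
            continue
        definitely_contiguous = definitely_contiguous and _definitely_contiguous(
            op, torch.contiguous_format
        )
        definitely_channels_last = definitely_channels_last and _definitely_contiguous(
            op, torch.channels_last
        )
    return definitely_contiguous, definitely_channels_last


x = torch.randn(2, 3, 8, 8)
print(binary_fast_path_layout([x, x, 2.0]))  # (True, False) for row-major 4-D inputs plus a scalar
```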
torch/_subclasses/fake_impls.py
Outdated
```diff
          )
-         is_channels_last = is_channels_last and op.is_contiguous(
-             memory_format=torch.channels_last
+         is_channels_last = (
```
`is_channels_last` -> `definitely_channels_last`
Left some comments, let me know what you think.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
## What
- Use `definitely_contiguous_for_memory_format` instead of `is_contiguous` when the non-contiguous case is fine if we encounter a DDE.
- Use ref's contiguous over Aten's contiguous because Aten's version will DDE and stop tracing. Ref's version will use `definitely_contiguous_for_memory_format` and clone if there's a DDE (a minimal sketch of this fallback appears at the end of this description).

## Example DDEs
- Fixed with `definitely_contiguous_for_memory_format` in `fast_binary_impl`:
```
torch._dynamo.exc.UserError: Could not guard on data-dependent expression Eq((u0//387), 0) (unhinted: Eq((u0//387), 0)). (Size-like symbols: u0)
Caused by: layer_norm = self.layer_norm(linear)  # caffe2/test/export/test_export.py:4566 in forward (_subclasses/fake_impls.py:1022 in fast_binary_impl)
```
- Fixed with `refs.contiguous` instead of calling Aten's contiguous (that'd require a bigger re-write in Aten):
```
File "c10/core/TensorImpl.h", line 825, in torch::autograd::THPVariable_contiguous(_object*, _object*, _object*)
File "c10/core/SymbolicShapeMeta.h", line 87, in c10::TensorImpl::is_contiguous_default(c10::MemoryFormat) const
File "c10/core/SymbolicShapeMeta.cpp", line 250, in c10::SymbolicShapeMeta::init_is_contiguous() const
torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(128*((u0//387)), 0) (unhinted: Eq(128*((u0//387)), 0)). (Size-like symbols: u0)
Caused by: (_refs/__init__.py:3302 in native_layer_norm)
```
- Fixed with `definitely_contiguous_for_memory_format` in ref's contiguous:
```
torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression 387*((u0//387)) < 2 (unhinted: 387*((u0//387)) < 2). (Size-like symbols: u0)
Caused by: (_prims_common/__init__.py:279 in is_contiguous)
```

Stack from ghstack (oldest at bottom):
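For readers following the What section above, here is a hedged, self-contained sketch of the clone-on-DDE fallback it mentions. The `definitely_contiguous_for_memory_format` body shown here is a simplified stand-in (on regular tensors it is just the strict check; the real helper in torch._prims_common answers False rather than raising when the question is data-dependent), and `ref_style_contiguous` is a hypothetical name used only for this example.

```python
import torch


def definitely_contiguous_for_memory_format(t, *, memory_format):
    # Simplified stand-in: True only when contiguity is provable. Under fake tensors
    # with unbacked sizes, the real helper returns False instead of raising a
    # GuardOnDataDependentSymNode error.
    try:
        return t.is_contiguous(memory_format=memory_format)
    except Exception:
        return False


def ref_style_contiguous(t, *, memory_format=torch.contiguous_format):
    # Mirrors the fallback described above: return the input unchanged when it is
    # provably contiguous, otherwise materialize a contiguous clone so tracing can
    # continue without guarding on data-dependent sizes.
    if definitely_contiguous_for_memory_format(t, memory_format=memory_format):
        return t
    return t.clone(memory_format=memory_format)


x = torch.randn(2, 3, 4, 5).transpose(1, 2)  # not contiguous
y = ref_style_contiguous(x)
assert y.is_contiguous()
```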