[export] support linear & layer_norm unbacked #155260
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155260
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 1d571f8 with merge base 9bf6593. UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/_prims_common/__init__.py
Outdated
```diff
  # Short-circuits if the tensor is already contiguous or channels-last contiguous
- if is_contiguous(a) or is_channels_last_contiguous(a):
+ if definitely_contiguous(a) or is_known_channels_last_contiguous(a):
```
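For context, a minimal, self-contained sketch of what separates the strict check from a "definitely" check like the one introduced above. All names below (`UnbackedInt`, `strict_check`, `definitely_check`) are illustrative stand-ins, not the actual torch._prims_common helpers, and the "contiguity" logic is deliberately trivial; the point is the raise-versus-False behavior on data-dependent sizes.

```python
class DataDependentError(RuntimeError):
    """Raised when a question can only be answered by guarding on runtime data."""


class UnbackedInt:
    # A size whose concrete value is unknown at trace time (like u0 during export):
    # comparing it would require a guard, which we model here by raising.
    def __eq__(self, other):
        raise DataDependentError("Could not guard on data-dependent expression")


def strict_check(sizes):
    # Strict check: evaluating it on unbacked sizes raises and stops tracing.
    return all(not (s == 0) for s in sizes)  # deliberately trivial "contiguity" logic


def definitely_check(sizes):
    # Guard-free variant: when the answer is data-dependent, report "not provably
    # true" instead of raising, so a short-circuit caller falls through to the
    # general path below it, which is correct regardless of the actual layout.
    try:
        return strict_check(sizes)
    except DataDependentError:
        return False


sizes = [128, UnbackedInt()]
assert definitely_check(sizes) is False  # no error; the fast path is simply skipped
```

A false negative is harmless on a short-circuit like this: the caller skips the fast path and continues with the general stride computation that follows it.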
- I think we need `definitely_non_overlapping_and_dense`, unless you know for sure that all uses of `is_non_overlapping_and_dense` are already non-material checks. If not, maybe introduce the new one and use it where appropriate?
- My bad, I forgot to change the naming of `is_known_channels_last_contiguous` to `definitely_channels_last_contiguous`. Can you do it as part of this change?
- A quick audit tells me the Python version of `is_non_overlapping_and_dense` is only used three times, and there's always a general path. I'm not sure whether I should be checking the C++ call sites. Also, I'm applying it on a short-circuit check, so I was thinking it's fine to leave it be since there's a more general path below.
- Will change `is_known_channels_last_contiguous` -> `definitely_channels_last_contiguous` in #155499.
torch/_subclasses/fake_impls.py
Outdated
```diff
              continue
-         is_contiguous = is_contiguous and op.is_contiguous(
-             memory_format=torch.contiguous_format
+         is_contiguous = (
```
This seems legit. One way to verify, which I have been doing, is putting up a PR where I always take the slow path and seeing what fails.
Also: `is_contiguous` -> `definitely_contiguous`.
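As a rough, hedged illustration of the pattern this rename points at (not the actual fake_impls code), a fast-path layout fold with a guard-free check could look like the sketch below; `_definitely_contiguous` and `binary_fast_path_layout` are made-up names for this example.

```python
import torch


def _definitely_contiguous(t, memory_format):
    # Simplified stand-in: answer False instead of raising when the question cannot
    # be decided (e.g. it would guard on an unbacked size under fake tensors).
    try:
        return t.is_contiguous(memory_format=memory_format)
    except Exception:
        return False


def binary_fast_path_layout(operands):
    # Fold contiguity across all tensor operands. The fast path is only taken when
    # contiguity is provable for every operand; "can't tell" just disqualifies the
    # fast path instead of erroring out during tracing.
    definitely_contiguous = True
    definitely_channels_last = True
    for op in operands:
        if not isinstance(op, torch.Tensor):
            continue
        definitely_contiguous = definitely_contiguous and _definitely_contiguous(
            op, torch.contiguous_format
        )
        definitely_channels_last = definitely_channels_last and _definitely_contiguous(
            op, torch.channels_last
        )
    return definitely_contiguous, definitely_channels_last


x = torch.randn(2, 3, 8, 8)
print(binary_fast_path_layout([x, x, 2.0]))  # (True, False) for row-major 4-D inputs plus a scalar
```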
torch/_subclasses/fake_impls.py
Outdated
```diff
          )
-         is_channels_last = is_channels_last and op.is_contiguous(
-             memory_format=torch.channels_last
+         is_channels_last = (
```
`is_channels_last` -> `definitely_channels_last`
Left some comments, let me know what you think.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
## What
- Use `definitely_contiguous_for_memory_format` instead of `is_contiguous` when the non-contiguous case is fine if we encounter a DDE.
- Use ref's contiguous over Aten's contiguous because Aten's version will DDE and stop tracing. Ref's version will use `definitely_contiguous_for_memory_format` and clone if there's a DDE (a minimal sketch of this fallback appears at the end of this description).

## Example DDEs
- Fixed with `definitely_contiguous_for_memory_format` in `fast_binary_impl`:
```
torch._dynamo.exc.UserError: Could not guard on data-dependent expression Eq((u0//387), 0) (unhinted: Eq((u0//387), 0)). (Size-like symbols: u0)
Caused by: layer_norm = self.layer_norm(linear)  # caffe2/test/export/test_export.py:4566 in forward (_subclasses/fake_impls.py:1022 in fast_binary_impl)
```
- Fixed with `refs.contiguous` instead of calling Aten's contiguous (that'd require a bigger re-write in Aten):
```
File "c10/core/TensorImpl.h", line 825, in torch::autograd::THPVariable_contiguous(_object*, _object*, _object*)
File "c10/core/SymbolicShapeMeta.h", line 87, in c10::TensorImpl::is_contiguous_default(c10::MemoryFormat) const
File "c10/core/SymbolicShapeMeta.cpp", line 250, in c10::SymbolicShapeMeta::init_is_contiguous() const
torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(128*((u0//387)), 0) (unhinted: Eq(128*((u0//387)), 0)). (Size-like symbols: u0)
Caused by: (_refs/__init__.py:3302 in native_layer_norm)
```
- Fixed with `definitely_contiguous_for_memory_format` in ref's contiguous:
```
torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression 387*((u0//387)) < 2 (unhinted: 387*((u0//387)) < 2). (Size-like symbols: u0)
Caused by: (_prims_common/__init__.py:279 in is_contiguous)
```

Stack from ghstack (oldest at bottom):
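For readers following the What section above, here is a hedged, self-contained sketch of the clone-on-DDE fallback it mentions. The `definitely_contiguous_for_memory_format` body shown here is a simplified stand-in (on regular tensors it is just the strict check; the real helper in torch._prims_common answers False rather than raising when the question is data-dependent), and `ref_style_contiguous` is a hypothetical name used only for this example.

```python
import torch


def definitely_contiguous_for_memory_format(t, *, memory_format):
    # Simplified stand-in: True only when contiguity is provable. Under fake tensors
    # with unbacked sizes, the real helper returns False instead of raising a
    # GuardOnDataDependentSymNode error.
    try:
        return t.is_contiguous(memory_format=memory_format)
    except Exception:
        return False


def ref_style_contiguous(t, *, memory_format=torch.contiguous_format):
    # Mirrors the fallback described above: return the input unchanged when it is
    # provably contiguous, otherwise materialize a contiguous clone so tracing can
    # continue without guarding on data-dependent sizes.
    if definitely_contiguous_for_memory_format(t, memory_format=memory_format):
        return t
    return t.clone(memory_format=memory_format)


x = torch.randn(2, 3, 4, 5).transpose(1, 2)  # not contiguous
y = ref_style_contiguous(x)
assert y.is_contiguous()
```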