Avoid generating as_strided for alaising views in auto_functionalize_v2 #137149

laithsakka · 2024-10-02T00:04:49Z

Stack from ghstack (oldest at bottom):

-> Avoid generating as_strided for alaising views in auto_functionalize_v2 #137149

During auto_functionalize_v2, if we encounter a view whose size(), stride(), and storage_offset() match those of the base,
we regenerate that view by calling aten.alias instead of as_strided, for better performance.
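
As a rough illustration of the check described above, the sketch below compares the layout metadata of a view with that of its base and picks how the view would be regenerated. `TensorMeta` and `choose_view_regeneration` are hypothetical names introduced only for this example; this is not the PR's actual code, which operates on real tensors.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TensorMeta:
    """Hypothetical stand-in for the layout metadata the check inspects."""
    size: Tuple[int, ...]
    stride: Tuple[int, ...]
    storage_offset: int


def choose_view_regeneration(base: TensorMeta, view: TensorMeta) -> str:
    """Return which op would regenerate `view` from `base` (assumed policy)."""
    same_layout = (
        view.size == base.size
        and view.stride == base.stride
        and view.storage_offset == base.storage_offset
    )
    # A view with the same layout as its base can be rebuilt with alias;
    # anything else falls back to the general as_strided path.
    return "alias" if same_layout else "as_strided"


base = TensorMeta(size=(4, 3), stride=(3, 1), storage_offset=0)
same_layout = TensorMeta(size=(4, 3), stride=(3, 1), storage_offset=0)
first_row = TensorMeta(size=(3,), stride=(1,), storage_offset=0)
shifted = TensorMeta(size=(4, 3), stride=(3, 1), storage_offset=3)

print(choose_view_regeneration(base, same_layout))  # alias
print(choose_view_regeneration(base, first_row))    # as_strided
print(choose_view_regeneration(base, shifted))      # as_strided
```

Only the first case qualifies, because all three pieces of metadata match; a differing storage offset alone is enough to keep the as_strided path.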

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

[ghstack-poisoned]

pytorch-bot · 2024-10-02T00:04:53Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/137149

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ddba772 with merge base 839d356 ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: 3806878 Pull Request resolved: #137149

…ctionalize_v2" cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]

ghstack-source-id: 4627b28 Pull Request resolved: #137149

…ctionalize_v2" title, see unit tests cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]

ghstack-source-id: 792fa7b Pull Request resolved: #137149

…ctionalize_v2" During auto_functionalize_v2, if we encounter a view whose size(), stride(), and storage_offset() match those of the base, we just pass the base. (We could call alias instead, to stay consistent with weird cases that check id inside the custom op; not sure if we should do that.) Those checks should not be inside auto_functionalized_dense, to avoid generating guards in any case. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]

ghstack-source-id: a854d9c Pull Request resolved: #137149

zou3519 · 2024-10-02T21:45:47Z

During auto_functionalize_v2, if we encounter a view whose size(), stride(), and storage_offset() match those of the base,
we just pass the base. (We could call alias instead, to stay consistent with weird cases that check id inside the custom op; not sure if we should do that.)

Can the Tensor ever be used later on? If so then we need to call alias to preserve the semantics (that the tensor is a view instead of the original tensor)
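
To make the semantic difference raised above concrete, here is a toy model in plain Python rather than PyTorch. `ToyTensor` and `op_checks_identity` are hypothetical names for this sketch; the point is only that an alias shares storage with its base while remaining a distinct object, so passing the base in place of the alias is observable to code that inspects identity.

```python
from typing import List


class ToyTensor:
    """Hypothetical stand-in for a tensor: an object wrapping shared storage."""

    def __init__(self, storage: List[float]) -> None:
        self.storage = storage

    def alias(self) -> "ToyTensor":
        # A new object over the same storage, like a view with identical layout.
        return ToyTensor(self.storage)


def op_checks_identity(a: ToyTensor, b: ToyTensor) -> bool:
    """Models a custom op whose behavior depends on object identity."""
    return id(a) != id(b)


x = ToyTensor([1.0, 2.0, 3.0])
view = x.alias()

# Writes through the alias are visible through the base (shared storage).
view.storage[0] = 10.0
shares_storage = x.storage[0] == 10.0

# The op sees two distinct objects only if the alias is actually passed.
distinct_with_alias = op_checks_identity(view, x)
distinct_with_base = op_checks_identity(x, x)

print(shares_storage, distinct_with_alias, distinct_with_base)  # True True False
```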

laithsakka · 2024-10-02T23:16:02Z

During auto_functionalize_v2, if we encounter a view whose size(), stride(), and storage_offset() match those of the base,
we just pass the base. (We could call alias instead, to stay consistent with weird cases that check id inside the custom op; not sure if we should do that.)

Can the Tensor ever be used later on? If so then we need to call alias to preserve the semantics (that the tensor is a view instead of the original tensor)

inside the custom op?

test/inductor/test_auto_functionalize.py

zou3519

Let's beef up the testing with dynamic shapes. Also, we shouldn't delete the as_strided calls, they should be turned into alias calls.

laithsakka · 2024-10-04T16:42:50Z

So we do add guards when the check fails.
For example:

@torch.compile(fullgraph=True, dynamic=True, backend="inductor")
def f(a):
    b = a[0]
    foo(b)

+- LAMBDA_GUARD: Ne(L['a'].size()[0], L['a'].size()[2])  # (_higher_order_ops/auto_functionalize.py:81 in write_single_view)

We do not add new guards for this:

@torch.compile(fullgraph=True, dynamic=True, backend="inductor")
def f(a):
    b = torch.ops.aten.alias(a)
    foo(b)

because both sides of the comparison are the same symbols.

However, what is the likelihood of actually failing the guards in later iterations? Is your concern the guard-checking
overhead, which we can avoid if we do this statically (capturing the alias call), or is your concern potential recompilation?
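
The guard behavior described above can be modeled with a small toy, written in plain Python under simplifying assumptions. `Sym` and `sym_eq` are hypothetical names and this is not PyTorch's symbolic-shapes implementation: comparing a symbol with itself is decided statically and records nothing, while comparing two different symbols is decided from their current hints and records a guard on the outcome.

```python
from typing import List, NamedTuple


class Sym(NamedTuple):
    """Hypothetical symbolic size: a name plus the value seen while tracing."""
    name: str
    hint: int


def sym_eq(lhs: Sym, rhs: Sym, guards: List[str]) -> bool:
    """Compare two symbolic sizes, recording a guard only when needed."""
    if lhs.name == rhs.name:
        # Same symbol on both sides: always equal, so no guard is required.
        return True
    # Different symbols: decide from the hints and guard on that decision.
    result = lhs.hint == rhs.hint
    relation = "Eq" if result else "Ne"
    guards.append(f"{relation}({lhs.name}, {rhs.name})")
    return result


s0 = Sym("s0", hint=4)
s2 = Sym("s2", hint=5)

alias_guards: List[str] = []
alias_result = sym_eq(s0, s0, alias_guards)

slice_guards: List[str] = []
slice_result = sym_eq(s0, s2, slice_guards)

print(alias_result, alias_guards)  # True []
print(slice_result, slice_guards)  # False ['Ne(s0, s2)']
```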

laithsakka · 2024-10-04T16:46:11Z

I see the point that "this guard is not needed in the case above, because we always know that it is true, and we can write the optimization in a way that does not generate the guard".

But someone could counter that with "well, if it's always the same then we don't have to worry about recompilation, so unless we are worried about guard-checking time we should be OK".

I do not have a strong opinion about landing this; just sharing my thoughts. Let me know what you think.

zou3519 · 2024-10-04T18:19:35Z

The as_strided -> alias() change sounds good then, because it does not add new guards. Can you add additional tests to show that it doesn't invoke a recompile when the shape changes? Then we can land this PR.

I'm just concerned about unnecessary recompiles.

For the slice change: I'm confused, why is it generating a guard that looks like Ne(L['a'].size()[0], L['a'].size()[2]) ?
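
To illustrate the recompilation concern, the following toy cache recompiles whenever no stored guard set accepts the incoming shape. `ToyCompileCache` and both guard factories are hypothetical and only loosely mirror how a guarded compile cache behaves: an entry guarded on `size[0] != size[2]` misses once those sizes coincide, while an entry with no guards keeps hitting.

```python
from typing import Callable, List, Tuple

Shape = Tuple[int, ...]
Guard = Callable[[Shape], bool]


class ToyCompileCache:
    """Hypothetical model: one entry per guard set; a miss means a recompile."""

    def __init__(self, make_guards: Callable[[Shape], List[Guard]]) -> None:
        self._make_guards = make_guards
        self._entries: List[List[Guard]] = []
        self.compile_count = 0

    def run(self, shape: Shape) -> None:
        for guards in self._entries:
            if all(guard(shape) for guard in guards):
                return  # Cache hit: an existing entry accepts this shape.
        # Cache miss: "compile" again and remember the new guard set.
        self.compile_count += 1
        self._entries.append(self._make_guards(shape))


def guards_with_size_check(shape: Shape) -> List[Guard]:
    # Guard on the outcome of size[0] != size[2] observed at compile time.
    observed = shape[0] != shape[2]
    return [lambda s: (s[0] != s[2]) == observed]


def no_guards(shape: Shape) -> List[Guard]:
    # Models the alias case, where the comparison needs no guard.
    return []


shapes = [(4, 3, 5), (6, 3, 7), (5, 3, 5)]
guarded = ToyCompileCache(guards_with_size_check)
unguarded = ToyCompileCache(no_guards)
for shape in shapes:
    guarded.run(shape)
    unguarded.run(shape)

print(guarded.compile_count, unguarded.compile_count)  # 2 1
```

The third shape has equal first and last sizes, so the guarded entry rejects it and a second compile happens; the unguarded cache compiles once.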

laithsakka · 2024-10-04T19:03:57Z

as_strided -> alias()

Oh, Ne(L['a'].size()[0], L['a'].size()[2]) is added by the alias change, not by the slice change. I have not checked anything on the slice change yet.
Basically, we fail the check

     base.size() == tensor.size()

and this causes the guard above.


laithsakka · 2024-10-07T18:27:36Z

Added recompilation tests; not sure if that covers your intention of:

we should also have a test where we create a tensor with unbacked symints, and then we take an alias of that, and then we mutate it.

Also updated the diff to generate alias. (I added a test, but it fails; it seems that somewhere down the road Inductor removes the alias.) See issue #137434.


zou3519 · 2024-10-09T01:43:46Z

test/inductor/test_auto_functionalize.py

+        alias_default_1: "f32[s0][1]cpu" = torch.ops.aten.alias.default(arg1_1)
+        foo_default = torch.ops.mylib.foo.default(alias_default, alias_default_1);  alias_default = alias_default_1 = foo_default = None
+        copy_: "f32[s0][1]cpu" = torch.ops.aten.copy_.default(arg1_1, arg1_1);  copy_ = None


To check... does inductor remove the copy_ during lowering?

output code

def call(args):
    arg0_1, arg1_1, arg2_1 = args
    args.clear()
    s0 = arg0_1
    s1 = arg1_1
    assert_size_stride(arg2_1, (s0, s1), (s1, 1))
    # Topologically Sorted Source Nodes: [], Original ATen: []
    torch.ops.mylib.foo.default(arg2_1, arg2_1)
    return (arg2_1, arg2_1, )

zou3519 · 2024-10-09T01:45:56Z

test/inductor/test_auto_functionalize.py

+            def f(x):
+                a = torch.ops.aten.alias.default(x)


can you make this something like...

def f(x):
    a = torch.ops.aten.alias.default(x)
    b = x.clone()
    c = b.nonzero().float()
    d = c.alias()
    torch.ops.mylib.foo(a, d)
    return a, d

d is a Tensor with unbacked symints in the shape

I will add a test for that here and in the recompile test.

Added; they pass.
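
As background for why the suggested test involves unbacked symints: the number of nonzero elements depends on the data, not on the input's shape, so the output size of nonzero() cannot be derived from the input sizes. The plain-Python sketch below shows only that data dependence; `nonzero_indices` is a hypothetical helper, not a PyTorch API.

```python
from typing import List, Sequence


def nonzero_indices(values: Sequence[float]) -> List[int]:
    """Return the positions of nonzero entries (toy analogue of nonzero())."""
    return [i for i, v in enumerate(values) if v != 0]


# Two inputs of the same length produce outputs of different lengths.
same_length_a = [0.0, 1.5, 0.0, 2.0]
same_length_b = [3.0, 1.0, 4.0, 1.0]

len_a = len(nonzero_indices(same_length_a))
len_b = len(nonzero_indices(same_length_b))

print(len_a, len_b)  # 2 4
```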

zou3519 · 2024-10-09T01:47:57Z

test/inductor/test_auto_functionalize.py

+    # that id(x) != id(base)
+    @torch._inductor.config.patch(enable_auto_functionalized_v2=True)
+    @unittest.skip(
+        reason="This test fails because something else in inductor optimize out the alias. issue #137434"


zou3519 · 2024-10-09T02:03:16Z

test/inductor/test_auto_functionalize.py

+            def func(x):
+                a = torch.ops.aten.alias.default(x)
+                torch.ops.mylib.not_eq(a, x)
+


Can you return a, and then assert that the input and output of func have different identities (id)?

Inductor doesn't match the tensor identity of intermediates of the function, but it should better match the identity of inputs/outputs of the compiled function.

I will add another test that does that.

Added that; that one passes.

zou3519

code LGTM, but please see my suggestions for the test cases


laithsakka · 2024-10-09T23:54:25Z

@pytorchbot merge

pytorchmergebot · 2024-10-09T23:56:27Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

…v2 (#137149) during auto_functionalize_v2 if we encounter a view such that size() stride() and storage_offset() matches the base we create a view that is regenerated by calling aten.alias instead of as_strided for better performance. Pull Request resolved: #137149 Approved by: https://github.com/zou3519

avoid generating as_strided for alaising views in auto_functionalize_v2

64d88d7

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Oct 2, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

5f12c1c

ghstack-source-id: 3806878 Pull Request resolved: #137149

pytorch-bot bot added module: inductor labels Oct 2, 2024

laithsakka added a commit that referenced this pull request Oct 2, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

6cd00bd

ghstack-source-id: 4627b28 Pull Request resolved: #137149

laithsakka changed the title ~~avoid generating as_strided for alaising views in auto_functionalize_v2~~ Avoid generating as_strided for alaising views in auto_functionalize_v2 Oct 2, 2024

laithsakka added the topic: not user facing topic category label Oct 2, 2024

laithsakka added a commit that referenced this pull request Oct 2, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

3ba3675

ghstack-source-id: 792fa7b Pull Request resolved: #137149

laithsakka requested a review from zou3519 October 2, 2024 17:05

laithsakka added a commit that referenced this pull request Oct 2, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

ca2df12

ghstack-source-id: a854d9c Pull Request resolved: #137149

laithsakka requested a review from oulgen October 2, 2024 17:33

laithsakka mentioned this pull request Oct 2, 2024

Generate slice.Tensor view operations instead of as_strided when split is used in the original program. #137225

Closed

zou3519 reviewed Oct 3, 2024

test/inductor/test_auto_functionalize.py (outdated)

zou3519 reviewed Oct 3, 2024

test/inductor/test_auto_functionalize.py

zou3519 requested changes Oct 3, 2024


laithsakka mentioned this pull request Oct 3, 2024

Proper handling of arguments passed by in kwargs inside zip_schema #137311

Closed

laithsakka added a commit that referenced this pull request Oct 7, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

f3a4672

ghstack-source-id: 95bc0c7 Pull Request resolved: #137149

laithsakka added a commit that referenced this pull request Oct 7, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

97f9091

ghstack-source-id: 525f1e8 Pull Request resolved: #137149

laithsakka added a commit that referenced this pull request Oct 7, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

90aa064

ghstack-source-id: 2375bd9 Pull Request resolved: #137149

laithsakka added a commit that referenced this pull request Oct 8, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

b9bc3de

ghstack-source-id: 5701e3c Pull Request resolved: #137149

laithsakka added a commit that referenced this pull request Oct 8, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

a13bf85

ghstack-source-id: 965af19 Pull Request resolved: #137149

laithsakka added a commit that referenced this pull request Oct 8, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

dbb2a3f

ghstack-source-id: 53020c2 Pull Request resolved: #137149

laithsakka added a commit that referenced this pull request Oct 8, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

7509bbc

ghstack-source-id: a955348 Pull Request resolved: #137149

zou3519 reviewed Oct 9, 2024


zou3519 approved these changes Oct 9, 2024


laithsakka added a commit that referenced this pull request Oct 9, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

126e775

ghstack-source-id: 14a12cd Pull Request resolved: #137149

laithsakka added a commit that referenced this pull request Oct 9, 2024

avoid generating as_strided for alaising views in auto_functionalize_v2

59f360f

ghstack-source-id: d2505ab Pull Request resolved: #137149

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 9, 2024

pytorchmergebot added the merging label Oct 9, 2024

pytorchmergebot added the Merged label Oct 10, 2024

pytorchmergebot closed this in 1aa130e Oct 10, 2024

pytorchmergebot removed the merging label Oct 10, 2024

github-actions bot deleted the gh/laithsakka/76/head branch November 10, 2024 02:07
