Conversation

@jwieczorekhabana (Contributor) commented Aug 27, 2024

When tensor folding occurs during a matmul operation, the returned tensor is a view. This can cause issues when matmul is used inside a custom function and that view is then returned as the output: it cannot be modified in place, which causes errors.
This is especially problematic when an in-place allreduce is performed after such a function.
The issue is resolved by returning unsafe_view from matmul instead. This aligns the matmul decomposition with the eager implementation, so that a non-view tensor is returned.

The test included in this PR reproduces the issue.
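For readers skimming the thread, here is a minimal sketch of the failure mode described above, assuming a custom autograd.Function that returns the result of a folding matmul followed by an in-place update on that output; the names MatmulFn and fn are illustrative, not taken from the PR's test.

import torch


class MatmulFn(torch.autograd.Function):
    # Illustrative custom Function whose output comes straight from matmul.
    @staticmethod
    def forward(ctx, x, w):
        ctx.save_for_backward(x, w)
        return torch.matmul(x, w)  # 3-D @ 2-D input triggers the tensor-folding path

    @staticmethod
    def backward(ctx, grad_out):
        x, w = ctx.saved_tensors
        grad_x = grad_out @ w.t()
        grad_w = x.reshape(-1, x.size(-1)).t() @ grad_out.reshape(-1, grad_out.size(-1))
        return grad_x, grad_w


@torch.compile
def fn(x, w):
    out = MatmulFn.apply(x, w)
    out.add_(1.0)  # stand-in for an in-place allreduce on the Function's output
    return out


x = torch.randn(2, 3, 4, requires_grad=True)
w = torch.randn(4, 5, requires_grad=True)
# Before the change the decomposed matmul returned a view, so the in-place add
# could raise a view/in-place autograd error; with _unsafe_view the output is a
# regular (non-view) tensor, matching eager behaviour.
fn(x, w)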

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @rec

pytorch-bot bot commented Aug 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/134568

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d155a11 with merge base 41e6534:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla bot commented Aug 27, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@xinyu-intel (Contributor)

@jgong5 @EikanWang Hi, can you help review this PR?

@jwieczorekhabana (Contributor Author)

Error thrown: (screenshot of the runtime error attached)

@albanD requested a review from zou3519 August 27, 2024 14:41
@albanD added the 'triaged' label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Aug 27, 2024
@jwieczorekhabana (Contributor Author)

@zou3519 Hi, I have the linter patch ready. In the error messages I also saw a note about adding a test owner. From what I've seen in similar tests it's just '# Owner(s): ["module: unknown"]'. I'm also wondering whether this is the right place for this kind of test, or whether you already have a better-suited suite somewhere else. Could you give any hints on that?
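For context, a rough sketch of the test-file convention being referenced, assuming the standard PyTorch test harness in torch.testing._internal.common_utils; the layout and test body below are placeholders, not the PR's actual file.

# Owner(s): ["module: unknown"]
from torch.testing._internal.common_utils import TestCase, run_tests


class TestCustomFunction(TestCase):
    def test_autograd_function_with_matmul_folding_at_output(self):
        ...


if __name__ == "__main__":
    run_tests()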

Comment on lines 7 to 17
class TestCustomFunction(TestCase):
    def test_autograd_function_with_matmul_folding_at_output(self):
        """
        When tensor folding occurs during a matmul operation, the returned tensor is a view.
        This can cause issues when matmul is used inside a custom function
        and that view is then returned as the output: it cannot be modified
        in place, which causes errors.
        This is especially problematic when an in-place allreduce is performed
        after such a function. This test recreates that behaviour.
        The issue is resolved by returning unsafe_view from matmul instead.
        """
Contributor

Move this to test/dynamo/test_misc.py

Contributor Author

Done

@zou3519 (Contributor) left a comment

The code change makes sense to me. Let's move the test to an existing file rather than put it in its own testcase.

@xinyu-intel (Contributor)

@pytorchbot label "topic: not user facing"

@pytorch-bot bot added the 'topic: not user facing' label (topic category) Sep 2, 2024
@jwieczorekhabana (Contributor Author)

@pytorchbot rebase

pytorch-bot bot commented Sep 5, 2024

You don't have permissions to rebase this PR since you are a first time contributor. If you think this is a mistake, please contact PyTorch Dev Infra.

@zou3519 (Contributor) commented Sep 5, 2024

@pytorchbot rebase

@pytorchmergebot (Collaborator)

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

When tensor folding occurs during a matmul operation, the returned tensor is a view.
This can cause issues when matmul is used inside a custom function
and that view is then returned as the output: it cannot be modified
in place, which causes errors.
This is especially problematic when an in-place allreduce is performed
after such a function.
The issue is resolved by returning unsafe_view from matmul instead.
This aligns the matmul decomposition with the eager implementation,
so that a non-view tensor is returned.
- Removed return types from forward/backward functions in test_custom_function
  to be compatible with Python 3.8
- Updated the graph in the test_proxy_tensor test_reflect_r_over_x test
  so that _unsafe_view is added instead of view_4 after mm
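As a side note, a small hedged illustration of the distinction the change relies on (shapes and variable names below are arbitrary): aten._unsafe_view produces a tensor with the same data and shape as view, but autograd does not record it as a view of its input, which matches what the eager matmul folding path returns.

import torch

a = torch.randn(2, 3, 4)
b = torch.randn(4, 5)

mm_out = torch.mm(a.reshape(-1, a.size(-1)), b)             # folded 2-D matmul
as_view = mm_out.view(2, 3, 5)                              # recorded as a view of mm_out
as_unsafe = torch.ops.aten._unsafe_view(mm_out, (2, 3, 5))  # same data, not treated as a view

# _base is the tensor a view refers to, or None for a non-view tensor.
print(as_view._base is mm_out)    # True  -> in-place updates can trip autograd's view checks
print(as_unsafe._base is None)    # True  -> behaves like the output of eager matmul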
@pytorchmergebot (Collaborator)

Successfully rebased main onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout main && git pull --rebase)

@zou3519 (Contributor) left a comment

lint failing

@jwieczorekhabana (Contributor Author)

lint failing

done

@zou3519 added the 'ciflow/trunk' label (Trigger trunk jobs on your pull request) Sep 18, 2024
@jwieczorekhabana (Contributor Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Sep 20, 2024
…ytorch#134568)

When tensor folding occurs during a matmul operation, the returned tensor is a view. This can cause issues when matmul is used inside a custom function and that view is then returned as the output: it cannot be modified in place, which causes errors.
This is especially problematic when an in-place allreduce is performed after such a function.
The issue is resolved by returning unsafe_view from matmul instead. This aligns the matmul decomposition with the eager implementation, so that a non-view tensor is returned.

The test included in this PR reproduces the issue.

Pull Request resolved: pytorch#134568
Approved by: https://github.com/zou3519
mryszt pushed a commit to HabanaAI/pytorch-fork that referenced this pull request Oct 14, 2024
When tensor folding occurs during a matmul operation, the returned tensor is a view.
This can cause issues when matmul is used inside a custom function
and that view is then returned as the output: it cannot be modified
in place, which causes errors.
This is especially problematic when an in-place allreduce is performed
after such a function.
The issue is resolved by returning unsafe_view from matmul instead.
This aligns the matmul decomposition with the eager implementation,
so that a non-view tensor is returned.

Pull request opened to pytorch: pytorch#134568

Change-Id: I77484ff6f22d3e290352348b1acbffa267eb063b
mryszt pushed a commit to HabanaAI/pytorch-fork that referenced this pull request Oct 14, 2024
aostrowski-hbn pushed a commit to HabanaAI/pytorch-fork that referenced this pull request Oct 16, 2024
aostrowski-hbn pushed a commit to HabanaAI/pytorch-fork that referenced this pull request Jan 7, 2025
Labels: ciflow/trunk, Merged, module: dynamo, open source, topic: not user facing, triaged
6 participants