sampled_addmm backward: fix incorrect gradient wrt self #103548
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/103548
Note: Links to docs will display an error until the docs builds have been completed.
✅ 1 Unrelated Failure
As of commit 4ed23c1: UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: a639df2ec7b5579743f3cb36560716d46bc227b3
Pull Request resolved: #103548
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
As per the title. The previous gradient was correct only under sparse semantics, i.e. with the `alpha * (mat1 @ mat2)` term ignored. Under generic (non-masked) semantics, however, that parametrization is wrong in the backward pass, since the gradient has to be projected onto the sparsity pattern of `self` (see the sketch below). With this parametrization, `gradcheck` can be expected to succeed with either `masked=True` or `masked=False`, even after #103518 is fixed.
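As a reading aid only (not the PR's actual implementation): a minimal sketch of the projection described above, assuming the documented forward `out = alpha * (mat1 @ mat2) * spy(self) + beta * self`, where `spy(self)` is the 0/1 sparsity-pattern matrix and is non-differentiable. The helper name `grad_wrt_self` is hypothetical.

```python
import torch

# Hypothetical helper illustrating the projection: the alpha term depends
# on `self` only through the non-differentiable pattern spy(self), so
# d(out)/d(self) equals beta on the pattern. A generic (dense) incoming
# gradient must therefore be masked by self's sparsity pattern.
def grad_wrt_self(grad, self_csr, beta):
    # sparse_mask keeps only the entries at self's specified indices
    return beta * grad.sparse_mask(self_csr)
```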
Stack from ghstack (oldest at bottom):
cc @alexsamardzic @pearu @cpuhrsch @amjames @bhosmer @ezyang @gchanan @albanD @zou3519 @gqchen @soulitzer @lezcano @Varal7
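For completeness, a minimal sketch (not taken from the PR's test suite) of exercising both semantics with `torch.autograd.gradcheck`; the shapes, `beta`/`alpha` values, and densification of the output are illustrative assumptions.

```python
import torch

m, k, n = 4, 3, 5
# Sparse CSR `self` whose pattern samples the product mat1 @ mat2;
# relu() zeroes out roughly half the entries to create a pattern.
self_csr = torch.randn(m, n, dtype=torch.float64).relu().to_sparse_csr().requires_grad_()
mat1 = torch.randn(m, k, dtype=torch.float64, requires_grad=True)
mat2 = torch.randn(k, n, dtype=torch.float64, requires_grad=True)

def fn(s, a, b):
    # Densify so gradcheck compares plain dense outputs.
    return torch.sparse.sampled_addmm(s, a, b, beta=0.5, alpha=2.0).to_dense()

# With the corrected backward, both semantics are expected to pass.
for masked in (True, False):
    torch.autograd.gradcheck(fn, (self_csr, mat1, mat2), masked=masked)
```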