sampled_addmm: BSR support #101163

nikitaved · 2023-05-11T07:25:56Z

This PR implements a sampled_addmm kernel that works with a BSR mask.
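As a rough, dependency-free sketch of what sampling with a BSR mask means (this is not the Triton kernel from this PR): for every block specified in the mask, compute `beta * block + alpha * (mat1 @ mat2)` restricted to that block, and skip all other blocks. The function name `sampled_addmm_bsr` and the nested-list representation are illustrative assumptions; only the names `crow_indices`, `col_indices` and `values` follow PyTorch's BSR terminology.

```python
def sampled_addmm_bsr(crow_indices, col_indices, values, mat1, mat2,
                      alpha=1.0, beta=1.0):
    """Hypothetical pure-Python sketch of sampled_addmm on a 2-D BSR mask.

    crow_indices / col_indices describe which blocks are specified,
    values[idx] is the (bm x bn) block stored at position idx.
    Returns the new list of blocks with the same sparsity pattern.
    """
    if not values:
        return []
    bm, bn = len(values[0]), len(values[0][0])  # block shape
    k = len(mat2)                               # inner (reduction) dimension
    out = []
    for block_row in range(len(crow_indices) - 1):
        # Blocks of this block-row live in [crow[block_row], crow[block_row + 1])
        for idx in range(crow_indices[block_row], crow_indices[block_row + 1]):
            block_col = col_indices[idx]
            r0, c0 = block_row * bm, block_col * bn  # top-left corner of the block
            out.append([
                [
                    beta * values[idx][i][j]
                    + alpha * sum(mat1[r0 + i][p] * mat2[p][c0 + j] for p in range(k))
                    for j in range(bn)
                ]
                for i in range(bm)
            ])
    return out


# 4x4 mask with two specified 2x2 blocks on the block diagonal.
crow_indices = [0, 1, 2]
col_indices = [0, 1]
values = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]
mat1 = [[1] * 3 for _ in range(4)]  # 4x3 matrix of ones
mat2 = [[1] * 4 for _ in range(3)]  # 3x4 matrix of ones

result = sampled_addmm_bsr(crow_indices, col_indices, values, mat1, mat2,
                           alpha=0.5, beta=2.0)
```

Only the two specified blocks are ever touched, which is the property a block-sparse kernel exploits; the off-diagonal blocks of `mat1 @ mat2` are never computed.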


pytorch-bot · 2023-05-11T07:25:59Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/101163

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8a25f86:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


cpuhrsch · 2023-05-23T18:32:36Z

test/test_sparse_csr.py

+        batches = [(), (2,), (2, 2)]
+        size = [128, 256, 0]
+
+        def sampled_addmm_ref(input, mat1, mat2, alpha, beta):


Could another way to implement this be `torch.addmm(input.to_dense(), mat1, mat2, alpha, beta).sparse_mask(input.to_sparse()).to_sparse_bsr(input.values().shape[-2:])`?

Sure, it's less efficient, but a simpler reference implementation could give us more confidence?

nikitaved · 2023-05-23

We already test against the CSR path of `torch.sparse.sampled_addmm` for more confidence. The proposed solution also only works for non-batched inputs, so we would still need to loop over batches. Alternatively, we can remove the reference and just test against CSR with half promoted to float. Would that be preferred?

cpuhrsch · 2023-05-23

Yes, testing against CSR with float would work too. That's even simpler.
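For illustration, the dense-reference idea from this exchange could be sketched roughly as follows. This is not the test code from the PR: the helper name `sampled_addmm_dense_ref` is hypothetical, it handles only non-batched 2-D inputs, and it builds the block mask from a BSR tensor of ones rather than calling `sparse_mask`. Inputs are promoted to float32 first, in the spirit of comparing half-precision results against a float reference.

```python
import torch


def sampled_addmm_dense_ref(input_bsr, mat1, mat2, *, alpha=1.0, beta=1.0):
    """Hypothetical dense reference for a 2-D (non-batched) BSR input."""
    crow = input_bsr.crow_indices()
    col = input_bsr.col_indices()
    # Promote to float32 so low-precision inputs get a higher-precision reference.
    values = input_bsr.values().to(torch.float32)
    dense_input = torch.sparse_bsr_tensor(
        crow, col, values, size=input_bsr.shape).to_dense()
    # 1 inside every specified block, 0 elsewhere.
    block_mask = torch.sparse_bsr_tensor(
        crow, col, torch.ones_like(values), size=input_bsr.shape).to_dense()
    # Note: alpha and beta are keyword-only arguments of torch.addmm.
    full = torch.addmm(dense_input, mat1.to(torch.float32),
                       mat2.to(torch.float32), beta=beta, alpha=alpha)
    return full * block_mask


dense = torch.tensor([[1., 1., 0., 0.],
                      [1., 1., 0., 0.],
                      [0., 0., 2., 2.],
                      [0., 0., 2., 2.]])
input_bsr = dense.to_sparse_bsr((2, 2))  # two specified 2x2 blocks
mat1 = torch.ones(4, 3)
mat2 = torch.ones(3, 4)
ref = sampled_addmm_dense_ref(input_bsr, mat1, mat2, alpha=0.5, beta=2.0)
```

The result is a dense tensor that is zero outside the specified blocks, so it can be compared against `.to_dense()` of a sparse result. As noted in the discussion, batched inputs would still require a loop over the batch dimensions.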

cpuhrsch

I'm a bit worried about the complexity of the reference addmm implementation just so we can have a simple comparison point.

cpuhrsch · 2023-05-23T18:34:01Z

test/test_sparse_csr.py

+                    res_csr = torch.sparse.sampled_addmm(csr, mat1csr, mat2csr, alpha=alpha, beta=beta)
+                    self.assertEqual(res_tri.to_dense(), res_csr.to_dense())
+
+                # Check grid consistency


What does this mean?

nikitaved · 2023-05-23

I will update the comment to clarify that.


cpuhrsch · 2023-05-24T19:49:57Z

Looks like there's one lint error left; otherwise this is good to go. Thanks for writing this!


nikitaved · 2023-05-25T12:31:30Z

@pytorchbot merge

pytorchmergebot · 2023-05-25T12:33:46Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here


…2] (#102660)

Test was originally skipped in #98462. Not sure why it was removed in #94825. Now the test hits CUDA illegal memory access on H100 again after #101163.

Pull Request resolved: #102660 Approved by: https://github.com/zou3519

sampled_addmm: BSR support

c8e47bb

[ghstack-poisoned]

nikitaved mentioned this pull request May 11, 2023

bsr_dense_bmm(): enable more precise float32 support with float64 accumulators #100882

Closed

pytorch-bot bot added the release notes: sparse release notes category label May 11, 2023

pytorchbot added the open source label May 11, 2023

Update on "sampled_addmm: BSR support"

2bbe9d9

[ghstack-poisoned]

nikitaved marked this pull request as draft May 11, 2023 07:42

Update on "sampled_addmm: BSR support"

0ac256e

[ghstack-poisoned]

nikitaved mentioned this pull request May 11, 2023

sparse compressed validation: allow empty-batched inputs #101180

Closed

Update on "sampled_addmm: BSR support"

01d7158

[ghstack-poisoned]

Update on "sampled_addmm: BSR support"

f7ea11e

[ghstack-poisoned]

Update on "sampled_addmm: BSR support"

e3abbe5

[ghstack-poisoned]

Update on "sampled_addmm: BSR support"

4ddd69b

[ghstack-poisoned]

Update on "sampled_addmm: BSR support"

276b1eb

[ghstack-poisoned]

nikitaved added a commit that referenced this pull request May 11, 2023

sampled_addmm: BSR support

01318de

ghstack-source-id: ffa39d1 Pull Request resolved: #101163

Update on "sampled_addmm: BSR support"

370e102

[ghstack-poisoned]

nikitaved added a commit that referenced this pull request May 12, 2023

sampled_addmm: BSR support

f6108d5

ghstack-source-id: 3cdcc6f Pull Request resolved: #101163

Update on "sampled_addmm: BSR support"

7d344bd

[ghstack-poisoned]

Update on "sampled_addmm: BSR support"

167fc4d

[ghstack-poisoned]

Update on "sampled_addmm: BSR support"

ef9567f

[ghstack-poisoned]

nikitaved added a commit that referenced this pull request May 15, 2023

sampled_addmm: BSR support

cbf7427

ghstack-source-id: de96103 Pull Request resolved: #101163

Update on "sampled_addmm: BSR support"

f611578

This PR implements a `sampled_addmm` kernel that works with a BSR mask. [ghstack-poisoned]

This was referenced May 22, 2023

sampled_addmm for BSR inputs: fuse softmax #101978

Closed

softmax: Triton kernel for BSR inputs #102095

Closed

Update on "sampled_addmm: BSR support"

e25a2cc

This PR implements a `sampled_addmm` kernel that works with a BSR mask. [ghstack-poisoned]

cpuhrsch reviewed May 23, 2023

View reviewed changes

cpuhrsch requested changes May 23, 2023

View reviewed changes

cpuhrsch reviewed May 23, 2023

View reviewed changes

nikitaved added 2 commits May 24, 2023 08:29

Update on "sampled_addmm: BSR support"

8fee341

This PR implements a `sampled_addmm` kernel that works with a BSR mask. [ghstack-poisoned]

Update on "sampled_addmm: BSR support"

4bb5aee

This PR implements a `sampled_addmm` kernel that works with a BSR mask. [ghstack-poisoned]

nikitaved requested a review from cpuhrsch May 24, 2023 14:21

nikitaved mentioned this pull request May 24, 2023

torch.sparse.softmax: allow negative dim #102171

Closed

Update on "sampled_addmm: BSR support"

4f08ba3

This PR implements a `sampled_addmm` kernel that works with a BSR mask. [ghstack-poisoned]

nikitaved mentioned this pull request May 24, 2023

torch.sparse.softmax: allow negative dim #102172

Closed

cpuhrsch approved these changes May 24, 2023

View reviewed changes

nikitaved added 2 commits May 25, 2023 08:31

Update on "sampled_addmm: BSR support"

64307a6

This PR implements a `sampled_addmm` kernel that works with a BSR mask. [ghstack-poisoned]

Update on "sampled_addmm: BSR support"

8a25f86

This PR implements a `sampled_addmm` kernel that works with a BSR mask. [ghstack-poisoned]

pytorchmergebot added the merging label May 25, 2023

pytorchmergebot added the Merged label May 25, 2023

pytorchmergebot removed the merging label May 25, 2023

pytorchmergebot closed this in 6c7410d May 25, 2023

xwang233 referenced this pull request May 31, 2023

nn.Linear: dispatch to bsr_dense_mm for half and bfloat16 (#94825)

1adb6fa

Pull Request resolved: #94825 Approved by: https://github.com/albanD, https://github.com/cpuhrsch

xwang233 mentioned this pull request May 31, 2023

Skip test test_triton_bsr_dense_bmm if not TEST_WITH_TORCHINDUCTOR [v2] #102660

Closed

facebook-github-bot deleted the gh/nikitaved/45/head branch June 8, 2023 18:07

crcrpar mentioned this pull request Jun 28, 2023

Illegal Memory Access on H100 TestSparseCompressedTritonKernelsCUDA.test_triton_sampled_addmm_block_size_16_cuda_bfloat16 #104322

Open
