@tina134 tina134 commented Sep 22, 2023

Summary:
Implemented along the same lines as BMM and ADDMM: the bias tensor uses the packed-weights layout, as in MM, but the index also advances along the z-dim to reach the other matrices in the batch.

Packed bias (input of MM):
```
ivec3 pos = ivec3(k_, j_, 0);
vec4 v = texelFetch(uInput, pos, 0);
// v.xyzw holds 4 numbers from one matrix
// no batch: z is always 0
// k_ and j_ each cover half of the original range, so the image has 1/4 as many texels
// (H*W matrix => H/2 * W/2 * 1 3D image)
```
Packed bias (input of BMM):
```
ivec3 pos = ivec3(k_, j_, i);
vec4 v = texelFetch(uInput, pos, 0);
// v.xyzw holds 4 numbers from one matrix
// i is the batch id
```
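As a rough illustration of the two lookups above, here is a minimal pure-Python sketch (not the shader code). It assumes each texel holds one 2x2 block of the matrix in row-major order and that odd sizes are zero-padded; both are assumptions for illustration, and `pack_batch` / `texel_fetch` are hypothetical names.

```python
def pack_batch(batch):
    """Pack a batch of H x W matrices (nested lists) into 4-component texels.

    Returns image[i][j][k] = (x, y, z, w): one 2x2 block per texel, one z-slice
    per matrix. The x/y/z/w ordering is an assumption, not taken from the diff.
    """
    image = []
    for matrix in batch:
        h, w = len(matrix), len(matrix[0])

        def at(r, c, matrix=matrix, h=h, w=w):
            # Zero-pad reads that fall outside an odd-sized matrix (assumption).
            return matrix[r][c] if r < h and c < w else 0.0

        plane = []
        for j in range((h + 1) // 2):
            row = []
            for k in range((w + 1) // 2):
                row.append((at(2 * j, 2 * k), at(2 * j, 2 * k + 1),
                            at(2 * j + 1, 2 * k), at(2 * j + 1, 2 * k + 1)))
            plane.append(row)
        image.append(plane)
    return image


def texel_fetch(image, k, j, i):
    """Mimic texelFetch(uInput, ivec3(k, j, i), 0) on the packed image."""
    return image[i][j][k]


# Two 4x4 matrices: the first holds 0..15, the second holds 100..115.
batch = [[[base + 4 * r + c for c in range(4)] for r in range(4)]
         for base in (0, 100)]
image = pack_batch(batch)
mm_texel = texel_fetch(image, 1, 0, 0)   # MM case: z fixed at 0
bmm_texel = texel_fetch(image, 1, 0, 1)  # BMM case: z selects the batch entry
```

Here the packed image is 2 x 2 texels per matrix instead of 4 x 4 values, and only the z coordinate changes between the MM and BMM lookups.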

**To support broadcasting**, the bias packing of `mm` differs slightly from weight packing: it repeats the single element twice along the height-dim to fill the 4 planes (see code for details). It does not repeat the element along the width-dim, but the code still works, because stacking 3 planes with the last one empty yields the same 3D image.
However, this doesn’t work for `bmm`, since it’s a series of `{4 planes} {4 planes} … {4 planes}` in which each `{4 planes}` represents one matrix, so having only 3 planes per matrix completely messes up the indexing. I therefore repeat the single element along the width-dim as well, filling all 4 planes so that the indexing stays correct.
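The indexing problem can be sketched in plain Python. This is only an illustration under the assumption that planes are laid out consecutively and grouped 4 per texel along z; `texel_layout` is a hypothetical helper, not code from this diff.

```python
def texel_layout(num_matrices, planes_per_matrix):
    """List, for each texel depth z, the (matrix, plane) pairs in its components.

    Assumes planes are stored back to back and packed 4 per texel.
    """
    planes = [(m, p)
              for m in range(num_matrices)
              for p in range(planes_per_matrix)]
    return [planes[z:z + 4] for z in range(0, len(planes), 4)]


# mm: a single matrix with 3 planes still fits in one texel (last slot empty).
single = texel_layout(1, 3)

# bmm with 4 planes per matrix: texel z holds exactly matrix z.
full = texel_layout(3, 4)

# bmm with 3 planes per matrix: texel 0 already contains a plane of matrix 1.
short = texel_layout(3, 3)
```

With 3 planes per matrix, `short[0]` is `[(0, 0), (0, 1), (0, 2), (1, 0)]`, so z no longer identifies a batch entry; with 4 planes per matrix every texel in `full` belongs to a single matrix.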

https://pytorch.org/docs/stable/generated/torch.baddbmm.html
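
For reference, the operation documented at the link above computes `beta * input + alpha * (batch1 @ batch2)` per batch entry. A minimal pure-Python sketch of that formula follows; it is not the Vulkan implementation, `baddbmm_ref` is a hypothetical name, and broadcasting is simplified to a single 2D bias shared by every batch entry.

```python
def baddbmm_ref(bias, batch1, batch2, beta=1.0, alpha=1.0):
    """Reference for out[i] = beta * bias + alpha * (batch1[i] @ batch2[i]).

    bias is one n x p matrix broadcast over the batch (a simplification of the
    general broadcasting rules); batch1 is b x n x m and batch2 is b x m x p.
    """
    out = []
    for a, b in zip(batch1, batch2):
        n, m, p = len(a), len(b), len(b[0])
        out.append([
            [beta * bias[r][c]
             + alpha * sum(a[r][t] * b[t][c] for t in range(m))
             for c in range(p)]
            for r in range(n)
        ])
    return out


bias = [[1, 1], [1, 1]]
batch1 = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
batch2 = [[[1, 0], [0, 1]], [[5, 6], [7, 8]]]
result = baddbmm_ref(bias, batch1, batch2)
```

Each batch entry is multiplied by an identity matrix on one side, so `result` is simply the other operand with the bias added to every element.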

Test Plan:
```
[ttingchulin@27298.od /data/sandcastle/boxes/fbsource (bmm)]$ LD_LIBRARY_PATH=third-party/swiftshader/lib/linux-x64/ buck run fbcode/mode/dev-nosan //xplat/caffe2:pt_vulkan_api_test_bin
```

Reviewed By: yipjustin

Differential Revision: D49402181

pytorch-bot bot commented Sep 22, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109851

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit ff78668 with merge base e1d7123 (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the ciflow/periodic, module: vulkan, and release notes: vulkan labels Sep 22, 2023
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D49402181

@facebook-github-bot

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Sep 22, 2023
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team
