[vulkan] Pad channels when using texture storage instead of "tight packing" #95251

Closed
wants to merge 2 commits into from

Contributor

@SS-JIA SS-JIA commented Feb 22, 2023

Stack from ghstack (oldest at bottom):

Currently, the Vulkan backend represents 4D tensors in GPU textures by simply combining the batch and channel dimensions into the depth axis. However, if the number of channels is not a multiple of 4, a single texel can hold data from two different batches, and the data belonging to one batch can cross a texel boundary.

For instance, consider a tensor with N=2, C=3. The depth axis of the texture would contain the data

|tex1|tex2|
-----------
|AAAB|BB00|

where A represents data from n=1 and B represents data from n=2.
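The tight-packed layout in the diagram above can be reproduced with a minimal Python sketch. This only illustrates the packing rule described here and is not the Vulkan backend's code; the function name `tight_pack_depth` and the constant `TEXEL_WIDTH` are hypothetical, and batches are labeled with letters the same way as in the diagram.

```python
TEXEL_WIDTH = 4  # assumption: each texel holds four values, as in the diagram


def tight_pack_depth(num_batches: int, num_channels: int) -> list[str]:
    """Lay out per-batch channel labels along the depth axis with no per-batch padding."""
    # One letter per batch: 'A' for the first batch, 'B' for the second, and so on.
    flat = "".join(chr(ord("A") + n) * num_channels for n in range(num_batches))
    # Zero-fill only the tail so the total length is a multiple of the texel width.
    flat += "0" * (-len(flat) % TEXEL_WIDTH)
    return [flat[i:i + TEXEL_WIDTH] for i in range(0, len(flat), TEXEL_WIDTH)]


print(tight_pack_depth(2, 3))  # ['AAAB', 'BB00']
```

The first texel mixes values from both batches, which is the situation the description calls out.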

This packing structure ("tight packing") makes some ops that care about batch boundaries more complex and inefficient to implement. Therefore this diff introduces channel padding when storing tensors as image textures.
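To make the batch-boundary problem concrete, here is a small Python sketch that computes which texels a batch occupies under tight packing and whether it shares a texel with a neighboring batch. The helper names are hypothetical and batch indices are 0-based here (the description counts from n=1); it is an illustration of the index arithmetic, not code from this diff.

```python
TEXEL_WIDTH = 4  # assumption: four values per texel


def tight_texel_span(n: int, num_channels: int) -> tuple[int, int]:
    """Return (first_texel, last_texel) occupied by batch n under tight packing."""
    first_elem = n * num_channels
    last_elem = first_elem + num_channels - 1
    return first_elem // TEXEL_WIDTH, last_elem // TEXEL_WIDTH


def shares_texel_with_neighbor(n: int, num_batches: int, num_channels: int) -> bool:
    """True if batch n has at least one texel that also holds another batch's data."""
    first, last = tight_texel_span(n, num_channels)
    prev_shared = n > 0 and tight_texel_span(n - 1, num_channels)[1] == first
    next_shared = (
        n + 1 < num_batches and tight_texel_span(n + 1, num_channels)[0] == last
    )
    return prev_shared or next_shared


print(tight_texel_span(1, 3))               # (0, 1): second batch spans both texels
print(shares_texel_with_neighbor(0, 2, 3))  # True
print(shares_texel_with_neighbor(0, 2, 4))  # False: C is a multiple of 4
```

An op that works per batch has to handle such shared texels explicitly whenever the channel count is not a multiple of 4.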

The same tensor with N=2, C=3 would now have the depth axis contain

|tex1|tex2|
-----------
|AAA0|BBB0|
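A comparable sketch for the padded layout, again in Python with hypothetical names (`padded_depth`, `padded_location`) and 0-based batch indices. It assumes each batch's channels are rounded up to a multiple of 4, which is what the diagram above shows; the actual implementation in this diff may differ in detail.

```python
TEXEL_WIDTH = 4  # assumption: four values per texel


def padded_depth(num_batches: int, num_channels: int) -> list[str]:
    """Lay out the depth axis with each batch's channels padded to a texel multiple."""
    texels_per_batch = -(-num_channels // TEXEL_WIDTH)  # ceiling division
    padded_channels = texels_per_batch * TEXEL_WIDTH
    texels = []
    for n in range(num_batches):
        row = chr(ord("A") + n) * num_channels + "0" * (padded_channels - num_channels)
        texels += [row[i:i + TEXEL_WIDTH] for i in range(0, len(row), TEXEL_WIDTH)]
    return texels


def padded_location(n: int, c: int, num_channels: int) -> tuple[int, int]:
    """Return (texel_index, component) of element (n, c) in the padded layout."""
    texels_per_batch = -(-num_channels // TEXEL_WIDTH)
    return n * texels_per_batch + c // TEXEL_WIDTH, c % TEXEL_WIDTH


print(padded_depth(2, 3))        # ['AAA0', 'BBB0']
print(padded_location(1, 2, 3))  # (1, 2)
```

With padding, every batch starts at a texel boundary, so the texel index depends on `n` and `c` separately instead of on the flattened offset `n * C + c`. The cost is that the depth axis needs `N * ceil(C / 4)` texels rather than `ceil(N * C / 4)`.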

Differential Revision: D43068669

NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on Phabricator!

@pytorch-bot

pytorch-bot bot commented Feb 22, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/95251

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2790599:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: vulkan release notes category label Feb 22, 2023
SS-JIA added a commit that referenced this pull request Feb 22, 2023
ghstack-source-id: 180840767
Pull Request resolved: #95251
SS-JIA added a commit that referenced this pull request Feb 22, 2023
ghstack-source-id: 180908974

Contributor

@salilsdesai salilsdesai left a comment


LGTM :)

@facebook-github-bot
Contributor

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).


cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Feb 25, 2023
Pull Request resolved: pytorch/pytorch#95251
Approved by: https://github.com/salilsdesai
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Feb 25, 2023
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 5, 2023
pruthvistony added a commit to ROCm/pytorch that referenced this pull request May 2, 2023
@facebook-github-bot facebook-github-bot deleted the gh/SS-JIA/210/head branch June 8, 2023 14:50
jhavukainen pushed a commit to kulinseth/pytorch that referenced this pull request Mar 15, 2024
Labels
Merged release notes: vulkan release notes category

4 participants