[vulkan] Pad channels when using texture storage instead of "tight packing" #95251
Currently, in Vulkan, 4D tensors are represented in GPU textures by simply combining the batch and channel dimensions into the depth axis. However, if the number of channels is not a multiple of 4, then data belonging to the same batch can cross texel boundaries.

For instance, consider a tensor with `N=2`, `C=3`. The depth axis of the texture would contain the data

```
|tex1|tex2|
-----------
|AAAB|BB00|
```

where A represents data from `n=1` and B represents data from `n=2`.

This packing structure ("tight packing") makes some ops that care about batch boundaries more complex and inefficient to implement. Therefore this diff introduces channel padding when storing tensors as image textures.

The same tensor with `N=2`, `C=3` would now have the depth axis contain

```
|tex1|tex2|
-----------
|AAA0|BBB0|
```

Differential Revision: [D43068669](https://our.internmc.facebook.com/intern/diff/D43068669/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D43068669/)!
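The difference between the two layouts described above can be sketched in a few lines of Python. This is only an illustration of the indexing arithmetic, not the code in this PR: the function names (`tight_depth_layout`, `padded_depth_layout`, `texel_coords_tight`, `texel_coords_padded`) are hypothetical, batches are 0-indexed here, and a texel is assumed to hold 4 components with `0` marking padding, as in the diagrams.

```python
import math

TEXEL_WIDTH = 4  # assumption: each texel stores 4 components


def _chunk(flat):
    """Split a flat list of labels into texel-sized strings."""
    return ["".join(flat[i:i + TEXEL_WIDTH]) for i in range(0, len(flat), TEXEL_WIDTH)]


def tight_depth_layout(n_batches, n_channels, labels):
    """Tight packing: concatenate all N*C values, then pad only the tail."""
    flat = [labels[n] for n in range(n_batches) for _ in range(n_channels)]
    n_texels = math.ceil(len(flat) / TEXEL_WIDTH)
    flat += ["0"] * (n_texels * TEXEL_WIDTH - len(flat))
    return _chunk(flat)


def padded_depth_layout(n_batches, n_channels, labels):
    """Channel padding: round each batch's channels up to a multiple of 4."""
    padded_c = math.ceil(n_channels / TEXEL_WIDTH) * TEXEL_WIDTH
    flat = []
    for n in range(n_batches):
        flat += [labels[n]] * n_channels + ["0"] * (padded_c - n_channels)
    return _chunk(flat)


def texel_coords_tight(n, c, n_channels):
    """(texel index, component) of element (n, c) under tight packing."""
    return divmod(n * n_channels + c, TEXEL_WIDTH)


def texel_coords_padded(n, c, n_channels):
    """(texel index, component) of element (n, c) under channel padding."""
    texels_per_batch = math.ceil(n_channels / TEXEL_WIDTH)
    return n * texels_per_batch + c // TEXEL_WIDTH, c % TEXEL_WIDTH


print(tight_depth_layout(2, 3, "AB"))   # texels along the depth axis, tight packing
print(padded_depth_layout(2, 3, "AB"))  # texels along the depth axis, channel padding
```

With channel padding, every batch starts at component 0 of its own texel, so a shader can locate a batch from `n` and the per-batch texel count alone. Under tight packing the starting component depends on `n * C`, which is where the extra complexity for batch-aware ops comes from. The cost is some unused texel space whenever `C` is not a multiple of 4.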
LGTM :)
@pytorchbot merge -f 'Landed internally' (Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Pull Request resolved: pytorch/pytorch#95251
Approved by: https://github.com/salilsdesai
…tight packing" (pytorch#95251)" This reverts commit 0eeb046.