-
Notifications
You must be signed in to change notification settings - Fork 712
[ET-VK][BE] vTensor cleanup 6/N - Do not use gpu_memory_layout as a source of truth, use packed_dim_whcn_idx directly
#5479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… source of truth, use `packed_dim_whcn_idx` directly ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5479
Note: Links to docs will display an error until the docs builds have been completed. ⏳ No Failures, 1 PendingAs of commit 6f363c6 with merge base 8ef6c79 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
… source of truth, use `packed_dim_whcn_idx` directly ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) ghstack-source-id: 243424497 Pull Request resolved: #5479
|
This pull request was exported from Phabricator. Differential Revision: D62995121 |
…ayout` as a source of truth, use `packed_dim_whcn_idx` directly" ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D62995121 |
…ayout` as a source of truth, use `packed_dim_whcn_idx` directly" ## Context `GPUMemoryLayout` is not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction of `axis_map`. For buffer backed tensors, the `strides` of the tensor is required to fully represent how the data of the tensor is laid out in GPU memory. For texture backed tensors, the `axis_map` and `packed_dim_whcn_idx` is required to fully represent the layout of the tensor as an image texture. Furthermore, with the introduction of functions like `virtual_transpose()`, tensor layouts may be produced which cannot be captured cleanly by an enum. This diff decouples `GPUMemoryLayout` from `vTensor`. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata. The layout of a tensor can be estimated afterwards using `estimate_memory_layout()`, but this is only a "best effort" at producing a comparable memory layout. `GPUMemoryLayout` was helpful as a compact representation of the `packed_dim_whcn_idx` of the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, whenever `GPUMemoryLayout` is referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references to `memory_layout()` to reference `packed_dim_whcn_idx()` instead. Differential Revision: [D62995121](https://our.internmc.facebook.com/intern/diff/D62995121/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D62995121 |
|
This pull request has been merged in 7c6d58a. |
Stack from ghstack (oldest at bottom):
packed_dim_whcn_idxwithpacked_dim#5484gpu_memory_layoutas a source of truth, usepacked_dim_whcn_idxdirectly #5479Context
GPUMemoryLayoutis not a sufficient description of how a tensor is laid out in GPU memory. For buffer backed tensors, this was true ever since strides were introduced; for texture backed tensors, this is true since the introduction ofaxis_map.For buffer backed tensors, the
stridesof the tensor is required to fully represent how the data of the tensor is laid out in GPU memory.For texture backed tensors, the
axis_mapandpacked_dim_whcn_idxis required to fully represent the layout of the tensor as an image texture.Furthermore, with the introduction of functions like
virtual_transpose(), tensor layouts may be produced which cannot be captured cleanly by an enum.This diff decouples
GPUMemoryLayoutfromvTensor. Rather than storing it as a tensor property, it is only used during construction to determine the initial tensor layout metadata.The layout of a tensor can be estimated afterwards using
estimate_memory_layout(), but this is only a "best effort" at producing a comparable memory layout.GPUMemoryLayoutwas helpful as a compact representation of thepacked_dim_whcn_idxof the tensor, which identifies the "fastest moving dimension" in buffer backed tensors, or which dim is packed along a texel for texture backed tensors. Therefore, wheneverGPUMemoryLayoutis referenced, what is really of interest is the packed dim index. Therefore, this diff also replaces references tomemory_layout()to referencepacked_dim_whcn_idx()instead.Differential Revision: D62995121