Investigation: Storage Texture #513

Closed
Jiawei-Shao opened this issue Dec 11, 2019 · 22 comments

@Jiawei-Shao
Contributor

Jiawei-Shao commented Dec 11, 2019

Introduction

“Storage-texture” is a binding type defined in the WebGPU specification. This binding type allows shaders to read textures without sampling and to store to arbitrary texel positions. This report discusses the implementation details needed to support storage textures in WebGPU.

Related Features

Texture Usage

D3D12, Metal and Vulkan all require that the texture must be created with a proper usage before it can be used as a storage texture.

On D3D12, when we want to use the texture as a read-only storage texture, a Shader Resource View (SRV) will be enough. When we want to write to a storage texture, we need to create an Unordered Access View (UAV) on it, which requires the texture be created with the flag D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS. The D3D12 document suggests “applications should avoid setting this flag when unordered access operations will never occur”. UAVs are allocated in the same type of descriptor heap as SRVs (CBV/SRV/UAV Heap).

On Metal, the option MTLTextureUsageShaderRead is required when we want to access the given texture with a read() or sample() function in any shader. When we want to access the texture with the write() function, we need to specify the option MTLTextureUsageShaderWrite when we create the texture.

On Vulkan, we are required to set VK_IMAGE_USAGE_STORAGE_BIT to specify that the image can be used to create a VkImageView suitable for occupying a VkDescriptorSet slot of type VK_DESCRIPTOR_TYPE_STORAGE_IMAGE.
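
The per-backend creation flags described above can be summarized in a small lookup. This is only a minimal sketch, not code from any WebGPU implementation: `nativeUsageFlags` and its `access` argument are hypothetical names, and only the native flag identifiers come from the text above.

```javascript
// Hypothetical helper: which native creation flags a storage texture needs,
// depending on whether shaders only read it or also write it.
function nativeUsageFlags(access) {
  if (access !== "read-only" && access !== "write") {
    throw new Error(`unknown access: ${access}`);
  }
  const writes = access === "write";
  return {
    // D3D12: a read-only storage texture only needs an SRV, so no extra flag.
    d3d12: writes ? ["D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS"] : [],
    // Metal: reading needs ShaderRead, writing needs ShaderWrite.
    metal: writes ? ["MTLTextureUsageShaderWrite"] : ["MTLTextureUsageShaderRead"],
    // Vulkan: storage images need the storage usage bit in both cases.
    vulkan: ["VK_IMAGE_USAGE_STORAGE_BIT"],
  };
}

console.log(nativeUsageFlags("write"));
```

Keeping the read-only case separate is what lets a D3D12 backend skip the unordered-access flag, as the D3D12 documentation recommends.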

Texture Format

D3D12, Metal and Vulkan each define minimum requirements on which texture formats can be used as storage textures. Here we mainly discuss the support for writable and read-write storage textures for each color format.

Writable Storage Textures

The following table summarizes the support for writable storage textures on D3D12, Metal and Vulkan for the color formats required by the current WebGPU spec.

| # | WebGPU Texture Format | D3D12 (Typed UAV Store) | Metal (Writable) | Vulkan (VK_IMAGE_USAGE_STORAGE_BIT) |
|---|---|---|---|---|
| 1 | r8unorm | Supported | Supported | Optional |
| 2 | r8snorm | Supported | Supported | Optional |
| 3 | r8uint | Supported | Supported | Optional |
| 4 | r8sint | Supported | Supported | Optional |
| 5 | r16uint | Supported | Supported | Optional |
| 6 | r16sint | Supported | Supported | Optional |
| 7 | r16float | Supported | Supported | Optional |
| 8 | rg8unorm | Supported | Supported | Optional |
| 9 | rg8snorm | Supported | Supported | Optional |
| 10 | rg8uint | Supported | Supported | Optional |
| 11 | rg8sint | Supported | Supported | Optional |
| 12 | r32uint | Supported | Supported | Supported |
| 13 | r32sint | Supported | Supported | Supported |
| 14 | r32float | Supported | Supported | Supported |
| 15 | rg16uint | Supported | Supported | Optional |
| 16 | rg16sint | Supported | Supported | Optional |
| 17 | rg16float | Supported | Supported | Optional |
| 18 | rgba8unorm | Supported | Supported | Supported |
| 19 | rgba8unorm-srgb | Supported | Not supported on A7 and all Mac | Optional |
| 20 | rgba8snorm | Supported | Supported | Supported |
| 21 | rgba8uint | Supported | Supported | Supported |
| 22 | rgba8sint | Supported | Supported | Supported |
| 23 | bgra8unorm | Supported | Supported | Optional |
| 24 | bgra8unorm-srgb | Supported | Not supported on A7 and all Mac | Optional |
| 25 | rgb10a2unorm | Supported | Not supported on A7 and A8 | Optional |
| 26 | rg11b10float | Supported | Not supported on A7 and A8 | Optional |
| 27 | rg32uint | Supported | Supported | Supported |
| 28 | rg32sint | Supported | Supported | Supported |
| 29 | rg32float | Supported | Supported | Supported |
| 30 | rgba16uint | Supported | Supported | Supported |
| 31 | rgba16sint | Supported | Supported | Supported |
| 32 | rgba16float | Supported | Supported | Supported |
| 33 | rgba32uint | Supported | Supported | Supported |
| 34 | rgba32sint | Supported | Supported | Supported |
| 35 | rgba32float | Supported | Supported | Supported |
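
Reading the table row by row, the formats marked “Supported” in all three columns can be collected into a set. The sketch below does only that; the names `PORTABLE_WRITABLE_STORAGE_FORMATS` and `isPortableWritableStorageFormat` are hypothetical, and the list reflects this table rather than any shipped specification.

```javascript
// Formats marked "Supported" on D3D12, Metal and Vulkan in the table above.
const PORTABLE_WRITABLE_STORAGE_FORMATS = new Set([
  "r32uint", "r32sint", "r32float",
  "rgba8unorm", "rgba8snorm", "rgba8uint", "rgba8sint",
  "rg32uint", "rg32sint", "rg32float",
  "rgba16uint", "rgba16sint", "rgba16float",
  "rgba32uint", "rgba32sint", "rgba32float",
]);

// Returns true when the format can be written as a storage texture without
// relying on any optional backend feature.
function isPortableWritableStorageFormat(format) {
  return PORTABLE_WRITABLE_STORAGE_FORMATS.has(format);
}

console.log(isPortableWritableStorageFormat("rgba8unorm")); // true
console.log(isPortableWritableStorageFormat("rg16float"));  // false
```

Every format left out of the set is excluded because of the Vulkan “Optional” column, with the Metal exceptions (sRGB, rgb10a2unorm, rg11b10float) already covered by that same column.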

Read-Write Storage Textures

D3D12 and Metal have special requirements on the texture formats that support both read and write in one shader.

D3D12 devices that support feature level 11_0 are required to support UAV Load on R32_FLOAT, R32_UINT and R32_SINT.

Metal supports “Texture ReadWrite” since Metal 1.2. On iOS 11+ and macOS 10.13+ we can query the Tier of the support of “ReadWrite Texture” with MTLDevice.readWriteTextureSupport.

The support for read-write storage textures on D3D12, Metal and Vulkan, for all the color formats in the current WebGPU spec, is listed here:

| # | WebGPU Texture Format | D3D12 (Typed UAV Load) | Metal (Read/Write) | Vulkan (VK_IMAGE_USAGE_STORAGE_BIT) |
|---|---|---|---|---|
| 1 | r8unorm | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 2 | r8snorm | Optional | Not Supported | Optional |
| 3 | r8uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 4 | r8sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 5 | r16uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 6 | r16sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 7 | r16float | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 8 | rg8unorm | Optional | Not Supported | Optional |
| 9 | rg8snorm | Optional | Not Supported | Optional |
| 10 | rg8uint | Optional | Not Supported | Optional |
| 11 | rg8sint | Optional | Not Supported | Optional |
| 12 | r32uint | Supported | OSX_GPUFamily1_v2, MTLReadWriteTextureTier1 | Supported |
| 13 | r32sint | Supported | OSX_GPUFamily1_v2, MTLReadWriteTextureTier1 | Supported |
| 14 | r32float | Supported | OSX_GPUFamily1_v2, MTLReadWriteTextureTier1 | Supported |
| 15 | rg16uint | Optional | Not Supported | Optional |
| 16 | rg16sint | Optional | Not Supported | Optional |
| 17 | rg16float | Optional | Not Supported | Optional |
| 18 | rgba8unorm | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 19 | rgba8unorm-srgb | Optional | Not Supported | Optional |
| 20 | rgba8snorm | Optional | Not Supported | Supported |
| 21 | rgba8uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 22 | rgba8sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 23 | bgra8unorm | Optional | Not Supported | Optional |
| 24 | bgra8unorm-srgb | Optional | Not Supported | Optional |
| 25 | rgb10a2unorm | Optional | Not Supported | Optional |
| 26 | rg11b10float | Optional | Not Supported | Optional |
| 27 | rg32uint | Optional | Not Supported | Supported |
| 28 | rg32sint | Optional | Not Supported | Supported |
| 29 | rg32float | Optional | Not Supported | Supported |
| 30 | rgba16uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 31 | rgba16sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 32 | rgba16float | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 33 | rgba32uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 34 | rgba32sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 35 | rgba32float | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
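
The Metal column of this table can be turned into a tier lookup. The following is a rough sketch under stated assumptions: the function name `metalReadWriteFormats` and the numeric encoding of the tier (0 for no support, 1 and 2 for the two tiers) are hypothetical, and the format lists are copied from the table rather than from Apple's documentation.

```javascript
// Formats the table lists as read-write capable at Metal tier 1.
const TIER1_FORMATS = ["r32uint", "r32sint", "r32float"];

// Formats the table lists as additionally read-write capable at tier 2.
const TIER2_EXTRA_FORMATS = [
  "r8unorm", "r8uint", "r8sint",
  "r16uint", "r16sint", "r16float",
  "rgba8unorm", "rgba8uint", "rgba8sint",
  "rgba16uint", "rgba16sint", "rgba16float",
  "rgba32uint", "rgba32sint", "rgba32float",
];

// tier: 0 (none), 1 or 2, a hypothetical stand-in for the value reported by
// MTLDevice.readWriteTextureSupport.
function metalReadWriteFormats(tier) {
  if (tier >= 2) return [...TIER1_FORMATS, ...TIER2_EXTRA_FORMATS];
  if (tier === 1) return [...TIER1_FORMATS];
  return [];
}

console.log(metalReadWriteFormats(1)); // the three 32-bit single-channel formats
```

The tier 1 list matches the only three rows marked “Supported” in every column, so those are the formats a read-write extension could rely on across all backends.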

Sample Count

D3D12 requires that the sample count be 1 when D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS is set.

In the Metal Shading Language (Chapter 2.8, “Textures”), multisampled textures only support the “read” access attribute.

The Vulkan spec requires that if the multisampled storage image feature (“shaderStorageImageMultisample”) is not enabled and the usage contains VK_IMAGE_USAGE_STORAGE_BIT, samples must be VK_SAMPLE_COUNT_1_BIT. The coverage of the Vulkan feature “shaderStorageImageMultisample” is 67%.
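
Taken together, these three rules suggest a single validation check at texture creation time. The sketch below is an illustration only: `validateStorageSampleCount` is a hypothetical name, and `STORAGE` is a local stand-in for the storage bit of GPUTextureUsage whose numeric value is assumed here.

```javascript
// Local stand-in for the storage usage bit (value assumed for this sketch).
const STORAGE = 0x08;

// Returns an error string when a storage texture is multisampled, or null
// when the descriptor passes this particular check.
function validateStorageSampleCount(descriptor) {
  const sampleCount = descriptor.sampleCount ?? 1;
  const isStorage = (descriptor.usage & STORAGE) !== 0;
  if (isStorage && sampleCount !== 1) {
    return "storage textures must have a sample count of 1";
  }
  return null;
}

console.log(validateStorageSampleCount({ usage: STORAGE, sampleCount: 4 }));
```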

Resource Bindings

D3D12 defines a descriptor range type D3D12_DESCRIPTOR_RANGE_TYPE_UAV for UAVs in the root signature.

Metal treats the textures used as storage textures the same as sampled textures (all of them should be set in the related Argument tables). For Metal Argument Buffers, Metal defines two Tiers for Argument Buffers and writable textures are only supported in Tier 2.

Vulkan defines the descriptor type VK_DESCRIPTOR_TYPE_STORAGE_IMAGE for storage images. If descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, the imageView member of each element of pImageInfo must have been created with VK_IMAGE_USAGE_STORAGE_BIT set.
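
To make the binding differences concrete, the mapping from a WebGPU binding type to the native descriptor kinds can be sketched as a lookup. The function name `nativeDescriptorTypes` is hypothetical, the binding type strings are the ones used in the proposal later in this issue, and mapping the read-only case to an SRV on D3D12 is an assumption based on the earlier remark that an SRV is enough for read-only access.

```javascript
// Hypothetical lookup from a WebGPU binding type to native descriptor kinds.
function nativeDescriptorTypes(bindingType) {
  switch (bindingType) {
    case "sampled-texture":
      return {
        d3d12: "D3D12_DESCRIPTOR_RANGE_TYPE_SRV",
        vulkan: "VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE",
      };
    case "readonly-storage-texture":
      // Assumption: read-only storage access is served by an SRV on D3D12.
      return {
        d3d12: "D3D12_DESCRIPTOR_RANGE_TYPE_SRV",
        vulkan: "VK_DESCRIPTOR_TYPE_STORAGE_IMAGE",
      };
    case "storage-texture":
      return {
        d3d12: "D3D12_DESCRIPTOR_RANGE_TYPE_UAV",
        vulkan: "VK_DESCRIPTOR_TYPE_STORAGE_IMAGE",
      };
    default:
      throw new Error(`unknown binding type: ${bindingType}`);
  }
}

console.log(nativeDescriptorTypes("storage-texture"));
```

Note that no single native API distinguishes all three cases: D3D12 groups the first two and Vulkan groups the last two.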

Shader Stages

On D3D12 I could not find any restriction that limits the use of UAVs to particular shader stages. On D3D11 with feature level 11_0, UAVs can only be used in pixel shaders and compute shaders; with feature level 11_1+, UAVs can be used in all shader stages.

Metal supports MTLTextureUsageShaderRead in any shader stage, and originally supported MTLTextureUsageShaderWrite only in compute shaders. Since OSX_GPUFamily1_v2, vertex and fragment functions can also write to textures.

The Vulkan spec requires that storage image loads be supported in all shader stages, and that stores to storage images be supported in compute shaders.

  • The Vulkan feature “fragmentStoresAndAtomics” specifies whether storage buffers and images support stores and atomic operations in the fragment shader stage. This feature has a coverage of 99%.
  • The Vulkan feature “vertexPipelineStoresAndAtomics” specifies whether storage buffers and images support stores and atomic operations in the vertex, tessellation, and geometry shader stages. This feature has a coverage of 85%.
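
On Vulkan, the set of stages that may write to a storage image therefore depends on two device features. A minimal sketch, assuming a hypothetical `features` object whose field names mirror the Vulkan feature names (tessellation and geometry stages are omitted here for brevity):

```javascript
// Stages that can store to storage images on a Vulkan device, given its
// enabled features. Compute is always allowed by the Vulkan spec.
function vulkanWritableStorageStages(features) {
  const stages = ["compute"];
  if (features.fragmentStoresAndAtomics) stages.push("fragment");
  if (features.vertexPipelineStoresAndAtomics) stages.push("vertex");
  return stages;
}

console.log(vulkanWritableStorageStages({ fragmentStoresAndAtomics: true }));
```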

Shader Operations

Sample, Load and Store

On D3D12, according to the HLSL documentation, Sample() is only allowed on read-only texture objects (Texture1D, Texture2D, etc.); calling Sample() on writable texture objects (RWTexture2D, RWTexture2DArray, etc.) is not allowed. Both read-only and writable texture objects support the Load() operation.

In the Metal Shading Language (Chapter 2.8), “sample” and “read” are different access attributes. “sample” implies the ability to read from a texture with or without a sampler, while “read” means that a graphics or kernel function can only read the texture object without a sampler.

On Vulkan, according to the definition of OpTypeImage in SPIR-V, “Sampled” indicates whether or not this image will be accessed in combination with a sampler. In the Vulkan execution environment, OpTypeImage must have a “Sampled” operand of 1 (sampled image) or 2 (storage image). “Storage image” and “sampled image” have different usages in SPIR-V. SPIR-V provides OpImageRead to read a texel from an image without a sampler and OpImageWrite to write a texel to an image without a sampler. Both instructions require that the “Image” operand be an object whose type is OpTypeImage with a “Sampled” operand of 2.

Atomic Functions

On D3D12, feature level 11_0 devices support atomic operations (UAV Atomic Exchange, UAV Atomic Signed Min/Max, UAV Atomic Unsigned Min/Max, UAV Atomic Add, UAV Atomic Bitwise Ops and UAV Atomic Cmp&Store/ Cmp&Exch) on R32_UINT and R32_SINT.

In the Metal Shading Language (Chapter 6.13), atomic functions are only allowed on Metal atomic data, which does not include writable textures.

On Vulkan, the image atomic functions are supported on formats that have VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT. The Vulkan spec (Table 65) requires that VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT be supported on VK_FORMAT_R32_UINT and VK_FORMAT_R32_SINT.
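
Intersecting the three statements above shows why image atomics cannot be offered portably. The sketch below encodes only what this section says; `ATOMIC_STORAGE_FORMATS` and `portableAtomicFormats` are hypothetical names.

```javascript
// Formats on which each backend guarantees image atomics, per the text above.
const ATOMIC_STORAGE_FORMATS = {
  d3d12: ["r32uint", "r32sint"],
  metal: [], // atomics are not allowed on writable textures
  vulkan: ["r32uint", "r32sint"],
};

// Formats that support image atomics on every backend in the table.
function portableAtomicFormats(table) {
  const lists = Object.values(table);
  return lists[0].filter((format) => lists.every((list) => list.includes(format)));
}

console.log(portableAtomicFormats(ATOMIC_STORAGE_FORMATS)); // []
```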

Resource Limits

On D3D12, the resource limits about UAVs are defined together with Hardware Tiers. The maximum number of UAVs in all descriptor tables across all stages are listed as follows:

  • Tier 1: 64 for feature levels 11_1+, 8 for feature level 11_0
  • Tier 2: 64
  • Tier 3: full heap

According to the Metal document Metal-Feature-Set-Tables, the maximum number of entries in the texture argument table, per graphics or compute function are listed here:

  • MTLGPUFamilyApple1, MTLGPUFamilyApple2, MTLGPUFamilyApple3 (A7 - A10): 31
  • MTLGPUFamilyApple4, MTLGPUFamilyApple5 (A11, A12): 96
  • MTLGPUFamilyApple6, MTLGPUFamilyMac1, MTLGPUFamilyMac2: 128

On Vulkan, the minimum required resource limits that are related to storage images in Vulkan SPEC are listed as follows:

  • maxPerStageDescriptorStorageImages (4)
  • maxDescriptorSetStorageImages (4 * 6, 6 is the number of shader stages)
  • maxFragmentCombinedOutputResources (4)
    maxFragmentCombinedOutputResources is the total number of storage buffers, storage images and output buffers which can be used in the fragment stage.
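
A portable limit can be estimated by taking the smallest of the worst-case numbers quoted above. This is a rough sketch only: the names are hypothetical, and the three figures are not strictly comparable, since the D3D12 number counts UAVs across all stages, the Metal number is the whole per-function texture table, and the Vulkan number is per stage.

```javascript
// Worst-case limits quoted in this section.
const WORST_CASE_STORAGE_LIMITS = {
  d3d12: 8,  // Tier 1 hardware at feature level 11_0
  metal: 31, // MTLGPUFamilyApple1 to MTLGPUFamilyApple3 texture table size
  vulkan: 4, // minimum maxPerStageDescriptorStorageImages
};

// The strictest limit across all backends.
function strictestLimit(limits) {
  return Math.min(...Object.values(limits));
}

console.log(strictestLimit(WORST_CASE_STORAGE_LIMITS)); // 4
```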

Resource Barriers

On D3D12 there are two types of barriers that are related to UAVs:

  • Transition Barrier (D3D12_RESOURCE_STATE_UNORDERED_ACCESS): A subresource must be in this state when it is accessed by the 3D pipeline via UAV.
  • UAV barrier (D3D12_RESOURCE_UAV_BARRIER): indicates that all UAV accesses (read or write) to a particular resource must complete before any future UAV accesses (read or write) can begin.

On Metal devices that support OSX_GPUFamily1_v2, it is guaranteed that:

  • Between Command Encoders, all resource writes performed in a given command encoder are visible in the next command encoder. This is true for both render and compute command encoders.
  • Within a Render Command Encoder: for textures, the textureBarrier(deprecated, only available until macOS 10.14, use MTLCommandEncoder.memoryBarrierWithScope since macOS 10.14) method ensures that writes performed in a given draw call are visible to subsequent reads in the next draw call.
  • Within a Compute Command Encoder: all resource writes performed in a given kernel function are visible in the next kernel function.

Vulkan defines image memory barriers that apply only to memory accesses involving a specific image subresource range.

  • Image memory barriers can also be used to define image layout transitions or a queue family ownership transfer for the specified image subresource range.
  • The Vulkan spec requires that if descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, then for each descriptor that will be accessed via load or store operations, the imageLayout member of the corresponding element of pImageInfo must be VK_IMAGE_LAYOUT_GENERAL.

In addition, the Vulkan spec places severe restrictions on the use of image memory barriers within a render pass instance:

  • If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of any element of pImageMemoryBarriers must be equal to the layout member of an element of the pColorAttachments, pResolveAttachments or pDepthStencilAttachment members of the VkSubpassDescription instance that the current subpass was created with, that refers to the same image.
  • If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of an element of pImageMemoryBarriers must be equal.

Because of these restrictions, the group has agreed to not synchronize individual draw calls within a render pass.

Proposal

Now that we have added "storage-texture" in GPUBindingType and “STORAGE” in GPUTextureUsage, we can just discuss some details on the support of Storage Textures in WebGPU implementations.

  1. Textures that are used as writable storage textures cannot be multisampled as it is not allowed in D3D12 and Metal.

  2. Maybe it is better to add “READONLY-STORAGE” as a new enum in GPUTextureUsage because with this extra information we can easily know:

    • Whether we need to set D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS when creating textures on D3D12.
    • Whether we need to set MTLTextureUsageShaderWrite when creating textures on Metal.
    • If we allow “SAMPLED-TEXTURE” being used as “READONLY-STORAGE”, then on Vulkan we have to set both VK_IMAGE_USAGE_SAMPLED_BIT and VK_IMAGE_USAGE_STORAGE_BIT, which may hurt the performance of texture sampling if the texture is actually only used for sampling.
  3. Maybe we also need to add “readonly-storage-texture” as a new type of binding point because we need to know the following information when we create the bind group layouts:

    • Whether we should use D3D12_DESCRIPTOR_RANGE_TYPE_UAV when creating the root signatures on D3D12.
    • Whether we should use VK_DESCRIPTOR_TYPE_STORAGE_IMAGE as the descriptorType member of a VkDescriptorSetLayoutBinding object used in the creation of Vulkan graphics pipeline as VK_DESCRIPTOR_TYPE_STORAGE_IMAGE and VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE are different VkDescriptorType enums.
  4. The color texture formats that are allowed to be writable storage textures on D3D12, Metal and Vulkan are summarized in the previous tables.

  5. The support of Read-Write storage textures has to be an extension because it is only supported on macOS 10.12+ and iOS 11+.

  6. Readable storage textures can be supported in all shader stages, and writable storage textures can only be supported in compute shaders.

    • The support of writable storage textures in fragment shaders has to be an extension as it is only available on macOS 10.12+.
    • The support of writable storage textures in vertex shaders has to be an extension as it requires D3D feature level 11_1+, macOS 10.12+ and the Vulkan feature “vertexPipelineStoresAndAtomics”.
  7. We suggest setting the maximum number of storage images to 4, following the resource limits in Vulkan, which are the strictest among D3D12, Metal and Vulkan.

  8. We cannot support image atomic functions because this feature cannot be supported on Metal.
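
Item 6 of the proposal can be read as a validation rule on bind group layout entries. The following is a minimal sketch of that rule, not the behavior of any implementation: `validateStorageTextureVisibility` and the `entry` shape are hypothetical, the binding type strings are the ones proposed above, and `ShaderStage` is a local stand-in for the shader stage bit flags.

```javascript
// Local stand-in for the shader stage bit flags.
const ShaderStage = { VERTEX: 0x1, FRAGMENT: 0x2, COMPUTE: 0x4 };

// Returns an error string when a writable storage texture is visible outside
// compute shaders, or null when the entry passes this check.
function validateStorageTextureVisibility(entry) {
  const writable = entry.type === "storage-texture";
  const nonComputeStages = entry.visibility & ~ShaderStage.COMPUTE;
  if (writable && nonComputeStages !== 0) {
    return "writable storage textures are only allowed in compute shaders";
  }
  return null;
}

console.log(validateStorageTextureVisibility({
  type: "storage-texture",
  visibility: ShaderStage.FRAGMENT | ShaderStage.COMPUTE,
}));
```

Read-only storage textures pass for any visibility, which matches the first half of item 6.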

@Kangz
Contributor

Kangz commented Dec 11, 2019

Thank you for the excellent investigation!

Another choice is always treating "sampledTexture" as the read-only usage, and in shaders we always translate ImageLoad() as TexelFetch().

That approach looks the simplest without loss of generality imho.

@kvark
Contributor

kvark commented Dec 11, 2019

@Jiawei-Shao
❤️ the amount of detail and links here, great work!
A few corrections:

Metal treats the textures used as storage textures the same as sampled textures (all of them should be set in the related Argument tables).

Argument buffers have 2 tiers, and writable textures are not supported on the 1st tier.

On Metal Shading Language (Chapter 2.8), “sample” and “read”

This block is repeated. Was it meant to explain how the "write" access works?

Vulkan defines image memory barriers that apply only to memory accesses involving a specific image subresource range.

There are severe restrictions on making memory barriers inside a render pass, making it mostly unfeasible to do.

Maybe it is better to add “READONLY-STORAGE” as a new enum in GPUTextureUsage

I agree, that would be useful.

“readonly-storage-texture” as a new type of binding point

How would this help specifically? The list that follows, I think, mostly applies to the benefits of "READONLY-STORAGE".

Another choice is always treating "sampledTexture" as the read-only usage, and in shaders we always translate ImageLoad() as TexelFetch()

Are they equivalent? My understanding was that TexelFetch would still go through the texture unit (even though the coordinate transformation part is skipped), while ImageLoad is a direct access to the image.

The support of Read-Write storage textures has to be an extension because it is only supported on macOS 10.12+ and iOS 11+.

Metal baseline is macOS 10.12, so this isn't a restriction.
iOS is technically a restriction, but I wonder what the portion of devices we'd be talking about when WebGPU V1 ships in, say, a year from now.

If we decide to support writable storage textures in vertex or fragment shaders, we need to expose memory barriers as it is also required in Metal.

Could you clarify this requirement please? The investigation only mentions the barriers for inside-render-pass usage,
which we agreed to allow data-races in for UAV access, so we shouldn't need the barriers.

Agree with the rest of the proposal! 👍

@Jiawei-Shao
Contributor Author

Hi @kvark,

My replies are inline, PTAL, thanks!

@Jiawei-Shao
❤️ the amount of detail and links here, great work!
A few corrections:

Metal treats the textures used as storage textures the same as sampled textures (all of them should be set in the related Argument tables).

Argument buffers have 2 tiers, and writable textures are not supported on the 1st tier.

Oh here I want to talk about the total size of the argument table, not for Metal Argument buffers.

On Metal Shading Language (Chapter 2.8), “sample” and “read”

This block is repeated. Was it meant to explain how the "write" access works?

Yes it's duplicated and I've removed it.

Vulkan defines image memory barriers that apply only to memory accesses involving a specific image subresource range.

There are severe restrictions on making memory barriers inside a render pass, making it mostly unfeasible to do.

Thanks for pointing this out; I have added the restrictions to this paragraph.

Maybe it is better to add “READONLY-STORAGE” as a new enum in GPUTextureUsage

I agree, that would be useful.

“readonly-storage-texture” as a new type of binding point

How would this help specifically? The list that follows, I think, mostly applies to the benefits of "READONLY-STORAGE".

It will help create the right D3D12 root signatures. The binding point "readonly-storage-texture" may relate to D3D12_ROOT_PARAMETER_TYPE_SRV and "storage-texture" may relate to D3D12_ROOT_PARAMETER_TYPE_UAV.

Another choice is always treating "sampledTexture" as the read-only usage, and in shaders we always translate ImageLoad() as TexelFetch()

Are they equivalent? My understanding was that TexelFetch would still go through the texture unit (even though the coordinate transformation part is skipped), while ImageLoad is a direct access to the image.

On Vulkan, VK_IMAGE_USAGE_SAMPLED_BIT and VK_IMAGE_USAGE_STORAGE_BIT are different image usages. TexelFetch is only allowed on sampled textures with a sampler, while imageLoad is only allowed on storage textures without a sampler. I wonder if we can always treat imageLoad as TexelFetch, then we could avoid adding "readonly-storage-texture" to WebGPU.

The support of Read-Write storage textures has to be an extension because it is only supported on macOS 10.12+ and iOS 11+.

Metal baseline is macOS 10.12, so this isn't a restriction.
iOS is technically a restriction, but I wonder what the portion of devices we'd be talking about when WebGPU V1 ships in, say, a year from now.

Yes. It would be great to define the minimum feature requirements for the devices that support WebGPU.

If we decide to support writable storage textures in vertex or fragment shaders, we need to expose memory barriers as it is also required in Metal.

Could you clarify this requirement please? The investigation only mentions the barriers for inside-render-pass usage,
which we agreed to allow data-races in for UAV access, so we shouldn't need the barriers.

Oh I just see the Metal document mentions that within a Render Command Encoder, for textures, we need the textureBarrier(deprecated, only available until macOS 10.14, use MTLCommandEncoder.memoryBarrierWithScope since macOS 10.14) method to ensure that writes performed in a given draw call are visible to subsequent reads in the next draw call, so I think a texture barrier may be needed.

@kvark
Contributor

kvark commented Dec 12, 2019

@Jiawei-Shao I think my points didn't get through. Let me try to re-phrase.

Oh here I want to talk about the total size of the argument table, not for Metal Argument buffers.

We agreed previously that the API is designed with native support by Metal Argument Buffers in mind. So we need to consider their limitations in the binding space.

It will help create the right D3D12 root signatures. The binding point "readonly-storage-texture" may relate to D3D12_ROOT_PARAMETER_TYPE_SRV and "storage-texture" may relate to D3D12_ROOT_PARAMETER_TYPE_UAV.

Or, the user could just use the "sampled-texture" binding point instead. Is there a downside to do so versus adding a new bind point?

I wonder if we can always treat imageLoad as TexelFetch, then we could avoid adding "readonly-storage-texture" to WebGPU.

What I'm saying is that there might be performance and quality considerations for using texelFetch in the context where the user does imageLoad.

for textures, we need the textureBarrier(...) method to ensure that writes performed in a given draw call are visible to subsequent reads in the next draw call, so I think a texture barrier may be needed.

We agreed to not synchronize individual draw calls within a render pass a while ago. That's why I'm saying the barriers are not needed here.

@Jiawei-Shao
Contributor Author

Jiawei-Shao commented Dec 16, 2019

@kvark thanks for your comments!

We agreed previously that the API is designed with native support by Metal Argument Buffers in mind. So we need to consider their limitations in the binding space.

Thanks for your comments and I have added the information.

Or, the user could just use the "sampled-texture" binding point instead. Is there a downside to do so versus adding a new bind point?

In Vulkan VK_DESCRIPTOR_TYPE_STORAGE_IMAGE and VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE are different VkDescriptorType enums, so I think it is better to use different binding points on "sampled-texture" and "readonly-storage-texture".

What I'm saying is that there might be performance and quality considerations for using texelFetch in the context where the user does imageLoad.

OK now I agree with you and I've removed that paragraph.

We agreed to not synchronize individual draw calls within a render pass a while ago. That's why I'm saying the barriers are not needed here.

Thanks for your explanation and I've added this decision into the proposal.

@Kangz
Contributor

Kangz commented Dec 16, 2019

@kvark

Metal baseline is macOS 10.12, so this isn't a restriction.

Didn't we want to target all Metal devices? Metal started with macOS 10.11, so we'd need an extension for read-write storage textures.

@kvark
Contributor

kvark commented Dec 16, 2019

Ah right, so 10.11 is a problem... :/

@Jiawei-Shao
Contributor Author

In SPIR-V, OpImageRead can only be called on storage images while OpImageFetch can only be called on sampled images. One of the differences between the two is that OpImageRead can be affected by the SPIR-V decorations "Coherent" and "Volatile", while OpImageFetch cannot.

"Coherent" and "Volatile" cannot be used in Vulkan Memory Model, which is currently defined in the Vulkan extension VK_KHR_vulkan_memory_model. According to this link, the coverage of VK_KHR_vulkan_memory_model is 36.19% on Windows, 19.76% on Linux and 2.14% on Android.

@Kangz
Contributor

Kangz commented Jan 13, 2020

As discussed offline with @Jiawei-Shao the differing hardware paths between OpImageFetch and OpImageRead as well as potential for instruction targeting only Sampled 0 or Sampled 1 texture means it makes sense to have both "sampled-texture" and "readonly-storage-texture" imho.

@kdashg
Contributor

kdashg commented Nov 18, 2020

I thought it was particularly sad to miss out on rg16float/uint/sint formats, so I went looking.

It looooks like a bunch of these optional-in-Vulkan formats are valid with the feature shaderStorageImageExtendedFormats, but that Android support for the feature as-a-whole is prohibitively poor.

gpuinfo is less than helpful in how it tries to make the format bits easier to read, but now I can't tell what they mean!
https://vulkan.gpuinfo.org/listformats.php?platform=android

I'm hoping that these otherwise-common formats have enough support to reintroduce after further investigation.

@vorg

vorg commented Aug 2, 2021

I'm trying to ping-pong update floating point textures in a compute shader, and when using rgba32float as the texture format I'm getting "Texture component type usage mismatch". The same code works fine for rgba16float and rgba8unorm.

const texture = device.createTexture({
  size: {
    width: 64,
    height: 64,
  },
  format: "rgba32float",
  usage:
    GPUTextureUsage.COPY_DST |
    GPUTextureUsage.STORAGE |
    GPUTextureUsage.SAMPLED,
});

[[group(0), binding(1)]] var writeTex: texture_storage_2d<rgba32float, write>;

Chrome Version 94.0.4593.0 (Official Build) canary (x86_64)
macOS 11.5 (20G71)
AMD Radeon Pro 5600M 8 GB

@vorg

vorg commented Aug 2, 2021

I finally figured it out... The error was "Texture component type usage mismatch" at ValidateTextureBinding, so I thought something was wrong with my bindings, but the problem seems to be that rgba32float textures are not filterable? Switching from textureSampleLevel in my fragment shader to textureLoad fixed the issue.

let computeColor: vec4<f32> = textureSampleLevel(computeTexture, linearSampler, fragData.uv, 0.0).rrrr;
// has to be replaced
let computeColor: vec4<f32> = textureLoad(computeTexture, vec2<i32>(fragData.uv * 64.0), 0).rrrr;
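
Since textureLoad takes integer texel coordinates instead of normalized UVs, the conversion above has an edge case: a UV of exactly 1.0 multiplied by 64 gives 64, which is outside a 64-texel-wide texture. A minimal sketch of a clamped conversion, written in JavaScript for illustration (`uvToTexel` is a hypothetical helper, and the same arithmetic would have to be done in the shader):

```javascript
// Converts a normalized UV pair to integer texel coordinates, clamped so the
// result always lies inside a texture of the given [width, height].
function uvToTexel(uv, size) {
  const clamp = (value, max) => Math.min(Math.max(value, 0), max);
  return [
    clamp(Math.floor(uv[0] * size[0]), size[0] - 1),
    clamp(Math.floor(uv[1] * size[1]), size[1] - 1),
  ];
}

console.log(uvToTexel([0.5, 0.25], [64, 64])); // [ 32, 16 ]
console.log(uvToTexel([1.0, 1.0], [64, 64]));  // [ 63, 63 ]
```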

@fintelia

fintelia commented Aug 11, 2021

I thought it was particularly sad to miss out on rg16float/uint/sint formats, so I went looking.

It looooks like a bunch of these optional-in-Vulkan formats are valid with the feature shaderStorageImageExtendedFormats, but that Android support for the feature as-a-whole is prohibitively poor.

Could someone check whether requiring this feature is now possible? The link seems to indicate only 67 entries for Android that don't have support for shaderStorageImageExtendedFormats compared to 637 entries for devices that do. Yet, the "Device Coverage" percentage for Android is reported as 62% so maybe I'm misunderstanding something

@kvark
Contributor

kvark commented Aug 11, 2021

@fintelia this is a great suggestion!

Yet, the "Device Coverage" percentage for Android is reported as 62% so maybe I'm misunderstanding something

Perhaps it shows 67 unique reports, but the actual number of reports is bigger (i.e. duplicate reports).

Could someone check whether requiring this feature is now possible?

I see a lot of correlation with devices lacking a good maxDrawIndexedIndexValue - #1343 (comment) . Namely, Qualcomm devices.

However, in addition to Qualcomm there are ARM, ImgTec, and software Vulkan implementations (hello #2030). So requiring the shaderStorageImageExtendedFormats will reduce the range of compatible Android devices.

It would be great to make this decision based on the actual statistics from browsers use on these devices, as opposed to vulkaninfo reports.

@fintelia

fintelia commented Aug 11, 2021

However, in addition to Qualcomm there are ARM, ImgTec, and software Vulkan implementations (hello #2030). So requiring the shaderStorageImageExtendedFormats will reduce the range of compatible Android devices.

Looking more closely at the non-Qualcomm devices, it seems the ARM devices all are unsupported due to maxImageArrayLayers=256 (see #1327) while most of the ImgTec ones lack the necessary compressed texture support (EDIT: see #144 (comment)). On the software Vulkan implementation front, shaderStorageImageExtendedFormats is supported by both SwiftShader and LavaPipe at this point

@kvark
Contributor

kvark commented Aug 11, 2021

most of the ImgTec ones lack the necessary compressed texture support

Compressed textures aren't in WebGPU core though. But overall this does sound like quite a strong argument to just require shaderStorageImageExtendedFormats, to make WebGPU more future-proof and future-oriented.

@kainino0x
Contributor

We agreed to require (BC || (ETC && ASTC)) but those devices don't have ASTC

@kvark kvark added this to Needs Discussion in Main Aug 11, 2021
@kvark kvark added this to the MVP milestone Aug 23, 2021
@kvark kvark moved this from Needs Discussion to Needs Investigation/Proposal or Revision in Main Aug 23, 2021
@kvark
Contributor

kvark commented Aug 23, 2021

Editors meeting:

  • this is important, it makes WebGPU more usable
  • needs to be resolved before MVP
  • and this needs an action item to properly list all of the devices/chips/driver versions that we'd lose compatibility with

If anybody wants to collect this list, this would be much welcome!

@kainino0x
Contributor

We should try to collect a complete list of reports from gpuinfo.org, and filter out all the ones we know we've dropped, and then see what else we would lose by requiring this.
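
The filtering described here could be sketched roughly as follows. Everything about the data shape is an assumption: `reports` is a hypothetical array of already-parsed device reports, each with a boolean `shaderStorageImageExtendedFormats` field, and `alreadyDropped` is a caller-supplied predicate for devices that other requirements exclude.

```javascript
// Counts how many device reports would be newly lost by requiring
// shaderStorageImageExtendedFormats, ignoring devices that are already
// excluded for other reasons.
function countNewlyExcluded(reports, alreadyDropped) {
  const remaining = reports.filter((report) => !alreadyDropped(report));
  const lost = remaining.filter(
    (report) => !report.shaderStorageImageExtendedFormats
  );
  return { remaining: remaining.length, lost: lost.length };
}
```

The predicate is passed in so that each previously agreed requirement can be encoded separately, without baking any particular threshold into the counting logic.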

@Kangz
Contributor

Kangz commented Aug 24, 2021

it seems the ARM devices all are unsupported due to maxImageArrayLayers=256 (see #1327)

We should probably revisit that, it's not ok for WebGPU to be completely unsupported on all ARM Android devices.

@Kangz
Contributor

Kangz commented Aug 24, 2021

Based on internal data, we shouldn't require shaderStorageImageExtendedFormats as it culls too many Android devices. I'll see if I can share more info.

@Kangz
Contributor

Kangz commented Nov 3, 2021

I think this investigation can be closed: storage textures have been added to the spec (at least writable ones. We need data on whether to add readwrite ones, see #1772)

@Kangz Kangz closed this as completed Nov 3, 2021
@kainino0x kainino0x moved this from Needs Investigation/Proposal or Revision to No Action in Main Jan 19, 2022
@kainino0x kainino0x moved this from No Action to Specification Done in Main Jan 19, 2022