
Texture views cause unnecessary memory bandwidth use #744

Closed · litherum opened this issue May 2, 2020 · 26 comments

@litherum (Contributor) commented May 2, 2020

Creating a texture view in Metal requires the use of the MTLTextureUsagePixelFormatView usage on the original texture. Specifying this usage disables hardware lossless compression on modern iPhones, because a texture view can have a different pixel format than the original, and the lossless compression is specific to a particular pixel format. Specifying this flag when it isn’t actually needed therefore increases memory bandwidth use unnecessarily.

The user-visible symptom of increased memory bandwidth use is:

  1. Worse performance, if the content is already memory bandwidth limited
  2. Worse battery life, as memory bandwidth use is a good indicator of power use by the GPU on iPhones
  3. A hotter device, which users don't like

To characterize this, you can run the “traditional deferred lighting” mode of this sample code with and without the MTLTextureUsagePixelFormatView flag. Run unmodified (without the flag), an iPhone XS Max reads 40.32 MiB and writes 56.8 MiB of memory per frame. Adding the flag to the texture creation functions increases this to 55.55 MiB read and 71.81 MiB written per frame. In other words, omitting the flag yields a 23.7% reduction in bandwidth use on this content.

[Screenshot: Screen Shot 2020-05-01 at 5.14.54 PM]

This is a simple case. In our experience, it's not uncommon to see memory bandwidth use shrink by 50% in real content.

This is a problem in WebGPU because texture views are used in bind groups. In WebGPU, it is impossible to sample from a texture without a view.
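
In API terms, a minimal sketch of the constraint (using current WebGPU spellings, which postdate this thread; `device` and `bindGroupLayout` are assumed to exist):

```ts
// Even to sample a whole texture in its own format, a GPUTextureView has
// to be created and bound; a GPUTexture is not a valid GPUBindingResource.
declare const device: GPUDevice;
declare const bindGroupLayout: GPUBindGroupLayout;

const texture = device.createTexture({
  size: [256, 256],
  format: "rgba8unorm",
  usage: GPUTextureUsage.TEXTURE_BINDING,
});

const view = texture.createView(); // full-texture view, same format

const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,
  entries: [{ binding: 0, resource: view }],
});
```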

One way to solve this would be to let GPUBindingResource be a GPUTexture. However, that alone probably wouldn’t be sufficient because there are some types of GPUTextureViews which can’t be represented by just GPUTextures, like cubemaps. Therefore, we should unify GPUTextureViewDimension and GPUTextureDimension, and add an arrayLayerCount into GPUTextureDescriptor (and possibly more).

@kainino0x (Contributor) commented May 2, 2020

The documentation of MTLTextureUsagePixelFormatView says:

This option allows you to call any of these methods of the given texture to create a texture view with a different component layout. (...)

The pixel layout is considered different if the number of components differs, or if their size or order is different from the components in the original pixel format.

GPUTextureViews don't have the capability to do that right now (the format can't be changed), so if I'm understanding correctly, MTLTextureUsagePixelFormatView should never be set. (Unfortunately, it looks like we do this incorrectly in Dawn right now.)

@kainino0x (Contributor) commented May 2, 2020

don't have the capability to do that right now

Sorry, there's an option for it in the descriptor, but the behavior isn't specified. It shouldn't be allowed by default (and I'm pretty sure Dawn doesn't allow it). I recall that it has come up in the past that we would need a flag on GPUTexture creation to allow reinterpretation of the format.
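
For reference, the descriptor option in question is the `format` member of GPUTextureViewDescriptor; a sketch of the (then-unspecified) reinterpreting case, with `texture` assumed to be an "rgba8unorm" texture:

```ts
declare const texture: GPUTexture;

// A view whose format differs from the texture's own format. At this
// point in the discussion the behavior was unspecified, and Dawn
// reportedly disallowed it.
const srgbView = texture.createView({
  format: "rgba8unorm-srgb",
  dimension: "2d",
});
```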

@Kangz (Contributor) commented May 4, 2020

Thanks @litherum for the analysis! It provides a good rationale for why we don't want to allow free reinterpretation of formats. It is highly related to #168.

One way to solve this would be to let GPUBindingResource be a GPUTexture. However, that alone probably wouldn’t be sufficient because there are some types of GPUTextureViews which can’t be represented by just GPUTextures, like cubemaps. Therefore, we should unify GPUTextureViewDimension and GPUTextureDimension, and add an arrayLayerCount into GPUTextureDescriptor (and possibly more).

I'm not sure I agree with the conclusion, because the issue here is MTLTextureUsagePixelFormatView, which is just for reinterpreting the format of texture views, but doesn't matter for selecting sub-ranges of mipmaps / array layers or for the dimensionality.

How about adding a flag at texture creation, or a list of formats the application would like to be able to reinterpret as? Having the list of formats would allow for better optimization when some reinterpret_casting is possible without breaking the texture compression.
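
A sketch of what the list-of-formats option might look like at the API level (the `viewFormats` member shown here was only a proposal at this point in the thread):

```ts
declare const device: GPUDevice;

// Hypothetical: declare up front which formats this texture may be
// viewed as, so the implementation can keep compression enabled when the
// listed reinterpretations don't break it.
const texture = device.createTexture({
  size: [256, 256],
  format: "rgba8unorm",
  usage: GPUTextureUsage.TEXTURE_BINDING,
  viewFormats: ["rgba8unorm-srgb"], // proposed list of allowed view formats
});
```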

@kvark (Contributor) commented May 4, 2020

Similar thoughts here. Great investigation by @litherum (thanks!), just not the conclusion I expected. I recall we already discussed the requirements for backends to reinterpret the formats back in the day, I think it's more than just Metal that would benefit from an explicit flag like @Kangz described.

@litherum (Contributor, Author) commented May 4, 2020

I see. If there's a requirement that the view's pixel format has to match the texture's pixel format, and there are architectural reasons why this must be so (as described above), then I suppose the best solution would be to delete the GPUTextureFormat field from the GPUTextureViewDescriptor.

We can add the field back if we want to drop this requirement in the future.

@kainino0x (Contributor) commented:

Metal only requires such a flag if the view format has a "different component layout". We should still be able to support reinterpretations with the same component layout.

The pixel layout is considered different if the number of components differs, or if their size or order is different from the components in the original pixel format.

Don't set this option if the texture view just needs to read the component values in a different order. Instead, create a texture view using a swizzle pattern to specify the new order. Similarly, don’t set this option if your texture view only converts between linear space and sRGB; for example, if your texture uses the MTLPixelFormatRGBA8Unorm pixel format and your texture view uses MTLPixelFormatBGRA8Unorm_sRGB.

@kainino0x kainino0x added this to Needs Discussion in Main May 18, 2020
@kainino0x (Contributor) commented:

I guess this is "Needs Proposal" since a proposal for "valid view formats list" was requested.

@kainino0x kainino0x moved this from Needs Discussion to Needs Investigation/Proposal or Revision in Main May 18, 2020
@litherum (Contributor, Author) commented:

Yep. I'm in the middle of coming up with a proposal now.

@Kangz (Contributor) commented Jun 3, 2020

As discussed in the meeting, let's investigate the exact mechanisms of texture format reinterpretation. It's kind of a duplicate of #168, but worth doing again imho.

Vulkan

Vulkan has two texture creation flags that control whether formats can be reinterpreted:

  • VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT, which says some reinterpretation will be possible inside format compatibility classes (see below), and also allows creating single-plane views of planar formats with a format compatible with that plane's format.
  • VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT, which requires that VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT is set and allows creating texture views of block-compressed formats with a size-compatible uncompressed format (i.e. the texel is the same size in bytes as the texel block of the compressed format).

Vulkan format compatibility classes (expressed in WebGPU formats) are the following:

8-bit formats:

    "r8unorm",
    "r8snorm",
    "r8uint",
    "r8sint",

16-bit formats:

    "r16uint",
    "r16sint",
    "r16float",
    "rg8unorm",
    "rg8snorm",
    "rg8uint",
    "rg8sint",

32-bit formats:

    "r32uint",
    "r32sint",
    "r32float",
    "rg16uint",
    "rg16sint",
    "rg16float",
    "rgba8unorm",
    "rgba8unorm-srgb",
    "rgba8snorm",
    "rgba8uint",
    "rgba8sint",
    "bgra8unorm",
    "bgra8unorm-srgb",
    // Packed 32-bit formats
    "rgb10a2unorm",
    "rg11b10float",

64-bit formats:

    "rg32uint",
    "rg32sint",
    "rg32float",
    "rgba16uint",
    "rgba16sint",
    "rgba16float",

128-bit formats:

    "rgba32uint",
    "rgba32sint",
    "rgba32float",

Depth and stencil formats cannot be reinterpreted:

    // Each is its own compatibility class.
    "depth32float",
    "depth24plus",
    "depth24plus-stencil8"

@RafaelCintron @litherum please help produce similar investigations for D3D12 and Metal.

@damyanp commented Jun 5, 2020

Direct3D 12

The D3D documentation refers to this sort of reinterpretation as casting. Most of the details can be found in the D3D11 Functional Spec and are still relevant to D3D12. Of particular interest here is the discussion around typeless and typed memory. An explicit table showing the various formats is provided in D3D11_3_Formats_FL11_1.xls.

To summarize the rules:

  • Resources created with _TYPELESS can be accessed with compatible resource views using a compatible typed format, as long as the number of components and the number of bits per component are identical.
  • Depending on the value of CastingFullyTypedFormatSupported:
    • CastingFullyTypedFormatSupported == true: resources created with typed formats can be cast to any compatible type.
    • CastingFullyTypedFormatSupported == false: resources created with typed formats can only be accessed with exactly matching resource view formats (so no casting is supported).
  • _TYPELESS is potentially less performant if the resource is only accessed using a single typed format. (Edit: clarified that CastingFullyTypedFormatSupported does not give any promises about performance.)

CastingFullyTypedFormatSupported is required for any WDDM2.2 driver; this corresponds to any driver that supports Windows 10 version 1703 or higher.

Here's the list of D3D12 format compatibility classes in terms of WebGPU formats:

// 8-bit formats (DXGI_FORMAT_R8_TYPELESS)
"r8unorm",
"r8snorm",
"r8uint",
"r8sint",

// 16-bit formats (DXGI_FORMAT_R16_TYPELESS)
"r16uint",
"r16sint",
"r16float",

// 16-bit formats (DXGI_FORMAT_R8G8_TYPELESS)
"rg8unorm",
"rg8snorm",
"rg8uint",
"rg8sint",

// 32-bit formats (DXGI_FORMAT_R32_TYPELESS)
"r32uint",
"r32sint",
"r32float",

// 32-bit formats (DXGI_FORMAT_R16G16_TYPELESS)
"rg16uint",
"rg16sint",
"rg16float",
// note: d3d also supports unorm & int variants here

// 32-bit formats (DXGI_FORMAT_R8G8B8A8_TYPELESS)
"rgba8unorm",
"rgba8unorm-srgb",
"rgba8snorm",
"rgba8uint",
"rgba8sint",

// 32-bit formats (DXGI_FORMAT_B8G8R8A8_TYPELESS)
"bgra8unorm",
"bgra8unorm-srgb",

// 64-bit formats (DXGI_FORMAT_R32G32_TYPELESS)
"rg32uint",
"rg32sint",
"rg32float",

// 64-bit formats (DXGI_FORMAT_R16G16B16A16_TYPELESS)
"rgba16uint",
"rgba16sint",
"rgba16float",
// note: d3d also supports snorm & unorm variants

// 128-bit formats (DXGI_FORMAT_R32G32B32A32_TYPELESS)
"rgba32uint",
"rgba32sint",
"rgba32float",

// Depth and stencil formats (DXGI_FORMAT_R32G8X24_TYPELESS)
"depth32float", // DXGI_FORMAT_D32_FLOAT
"depth24plus",  // DXGI_FORMAT_D32_FLOAT
"depth24plus-stencil8" // DXGI_FORMAT_D32_FLOAT_S8X24_UINT

@kvark (Contributor) commented Jun 15, 2020

@damyanp

Resources created with _TYPELESS can be accessed with compatible resource views using a compatible typed format, as long as the number of components and the number of bits per component are identical.
Here's the list of D3D12 format compatibility classes in terms of WebGPU formats:

These would apply to SRV and RTV/DSV only.
For UAV, it appears to be a bit more relaxed: we can view any 32-bit format as R32, according to this.

@kainino0x (Contributor) commented Jun 16, 2020

I tried to compile all this info into a spreadsheet, and I also tried to summarize the (mostly fairly simple) rules from Metal, but the result is quite complicated and I'm not totally sure it's right.

https://docs.google.com/spreadsheets/d/1PRiOja_AVse0QuB6rDH_YzidGw5fucQIRz0md9Sg4m8/edit?usp=sharing

Please request edit access if you would like to make any changes.

@RafaelCintron (Contributor) commented:

@kainino0x, thank you for putting together this spreadsheet.

One aspect of @damyanp's reply that I am not 100% sure about is the following:

// Depth and stencil formats (DXGI_FORMAT_R32G8X24_TYPELESS)
"depth32float", // DXGI_FORMAT_D32_FLOAT
"depth24plus", // DXGI_FORMAT_D32_FLOAT
"depth24plus-stencil8" // DXGI_FORMAT_D32_FLOAT_S8X24_UINT

It would be unfortunate if depth24plus-stencil8 was always implemented with D32_FLOAT_S8X24_UINT instead of D24_UNORM_S8_UINT. Hence, I think the "depth24plus*" textures should be grouped together from a compatibility perspective, and "depth32float" should be grouped with the "r32*" formats.

Of course, if web developers only test on platforms where "plus" textures give them more bits of precision, we may be forced to provide the same number of bits everywhere to keep content portable.

@kainino0x (Contributor) commented:

Updated the spreadsheet to reflect that.

I don't think it ultimately matters, because I don't think Vulkan will allow us to have any conversions between these formats. And on top of that, it seems potentially dodgy to make views that view the "X" parts of formats with unused bits.

@kainino0x (Contributor) commented:

Also updated the D3D12 and Metal spreadsheets to reflect that views shouldn't view X bits.

@damyanp commented Jun 20, 2020

Good catch, Rafael. I'm not sure how I ended up with the depth/stencil in that state, as it doesn't match the methodology I was using to generate the rest of the table at all.

@litherum (Contributor, Author) commented:

I don't have access to modify that spreadsheet, but it appears the Metal page is incorrect. I wrote a sample program to determine the correct data, and here are the results: Metal Pixel Formats.zip

@litherum (Contributor, Author) commented:

@Kangz

VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT that says some reinterpretation will be possible inside format compatibility classes

Does that mean that, without this bit, all views must have the same format as their underlying texture?

Does setting this bit have any performance implications?

(If the answer to both these questions is "yes" then that means Vulkan's spreadsheet truth table thing is identical to Metal's.)

@litherum (Contributor, Author) commented:

Also: Does setting this bit require any extensions to be enabled in Vulkan?

@litherum (Contributor, Author) commented:

(If the answer to both these questions is "yes" then that means ...)

Assuming the answer is "yes" (and no Vulkan extensions are required), then I think the proposal would be:

Add a flag to GPUTextureUsage called allowsSomeViewFormatReinterpretation (name up for bikeshedding). When this flag is not supplied by the author, texture views must have the same format as their owning texture. When the flag is supplied, some other texture formats are allowed. The set of allowed texture formats is the intersection of the green(ish) boxes in the spreadsheet (meaning reinterpretation between two formats is allowed iff Metal, Vulkan, and D3D12 can all perform the reinterpretation).

The flag makes sense because of the performance vs flexibility tradeoff. And, if the flag is supplied, we need to use the intersection of the greenish boxes, because we can't expand that set while also maintaining implementability on all 3 native APIs.
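
In application code, the proposal might read as follows (the flag name is the one proposed above, explicitly up for bikeshedding; the bit value is invented here for illustration):

```ts
declare const device: GPUDevice;

// Hypothetical usage bit per the proposal; name and value illustrative.
const ALLOWS_SOME_VIEW_FORMAT_REINTERPRETATION = 0x8000;

const texture = device.createTexture({
  size: [256, 256],
  format: "rgba8unorm",
  usage: GPUTextureUsage.TEXTURE_BINDING |
    ALLOWS_SOME_VIEW_FORMAT_REINTERPRETATION,
});

// Would validate only if rgba8unorm -> rgba8unorm-srgb is in the
// intersection of the Metal, Vulkan, and D3D12 tables.
const srgbView = texture.createView({ format: "rgba8unorm-srgb" });
```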

@Kangz (Contributor) commented Jun 22, 2020

I don't have access to modify that spreadsheet, but it appears the Metal page is incorrect. I wrote a sample program to determine the correct data, and here is the results: Metal Pixel Formats.zip

Whoops, I'm not sure I have access either, but I put the table in sheets so it's easy to view.

@litherum (Contributor, Author) commented Jun 22, 2020

Discussed today.

Resolution: Accept the proposal. @kvark to come up with a proposal for how we can use the yellow box in the D3D spreadsheet. (Until then, don't use the yellow box.)

@litherum (Contributor, Author) commented:

Updating from macOS Catalina 19F to 19G caused the macOS spreadsheet to change slightly. Now, converting between RGBA8Unorm and RGBA8Unorm_sRGB is free in either direction, and converting between BGRA8Unorm and BGRA8Unorm_sRGB is free in either direction.

Given that Vulkan hasn't changed at all, I don't think this affects the decision we make in WebGPU. I just wanted to mention it because there was some confusion during the call about how the Metal documentation seemed to disagree with the Metal validation layers. In 19G, they don't disagree.

[Screenshot: Screen Shot 2020-07-20 at 10.52.27 PM]

@kainino0x (Contributor) commented Aug 18, 2020

Updated spreadsheet with rgb9e5 (#975) and the info from Catalina 19G+ (^)

@kainino0x kainino0x moved this from Needs Investigation/Proposal or Revision to Needs Specification in Main Aug 18, 2020
@kainino0x (Contributor) commented:

Given the resolution from the Vulkan investigation, it seems like there are no formats that can be freely reinterpreted without a "reinterpretable" flag. Adding a "reinterpretable" flag is #168. So, the only specification action that comes out of this issue is something that says "texture view format must equal texture format" until #168 is done.
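
The interim rule as a validation sketch (names illustrative):

```ts
// Until #168: a view's format, defaulting to the texture's own format,
// must equal the texture's format.
function validateViewFormat(
  textureFormat: string,
  requestedViewFormat: string | undefined,
): void {
  const viewFormat = requestedViewFormat ?? textureFormat;
  if (viewFormat !== textureFormat) {
    throw new TypeError(
      "texture view format must equal texture format (pending #168)",
    );
  }
}
```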

@litherum (Contributor, Author) commented:

ok
