Proposal: Add a usage bit to allow creating a view with different format when creating a texture #168

Closed
Jiawei-Shao opened this issue Jan 17, 2019 · 44 comments

@Jiawei-Shao
Contributor

Introduction

The explicit APIs all provide a way to declare, at texture creation time, whether views with a different format may later be created from the texture. Drivers can use this to optimize access to textures that are never interpreted in a different format. This proposal adds a new usage bit to WebGPUTextureDescriptor so that WebGPU can make full use of the optimization offered by all explicit APIs.

Native APIs

D3D12

In D3D, there are two ways to specify the layout (or memory footprint) of a resource:

  • Typed - fully specify the type when the resource is created.
  • Typeless - fully specify the type when the resource is bound to the pipeline.

The D3D documentation also mentions that creating a fully-typed resource enables the runtime to optimize access, especially if the resource is created with flags indicating that it cannot be mapped by the application.
However, fully-typed resources cannot be reinterpreted using the view mechanism unless the resource was created with the D3D10_DDI_BIND_PRESENT flag.

On a typeless resource, the data type is unknown when the resource is first created. The exact data format (whether the memory will be interpreted as integers, floating point values, unsigned integers etc.) will not be determined until the resource is bound to the pipeline with a resource view. The format specified must be from the same family as the typeless format used when creating the resource. For example, a resource created with the R8G8B8A8_TYPELESS format cannot be viewed as a R32_FLOAT resource even though both formats may be the same size in memory.

Metal

In Metal, a texture must be created with MTLTextureUsagePixelFormatView usage if you want to call makeTextureView() on this texture.

Vulkan

Vulkan defines VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT for the flags member of the VkImageCreateInfo struct. VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT specifies that the image can be used to create a VkImageView with a different format from the image.

If VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT is not set, drivers can optimize how the image is stored. For example, according to the Mesa implementation, images in some formats that are created without this flag have their storage compressed inside the driver, which improves the performance of all clearing, texturing and rendering operations on these images.

Proposal

Here is our proposal on WebGPU texture view in IDL.

interface WebGPUTextureUsage {
    const u32 NONE = 0;
    const u32 TRANSFER_SRC = 1;
    const u32 TRANSFER_DST = 2;
    const u32 SAMPLED = 4;
    const u32 STORAGE = 8;
    const u32 OUTPUT_ATTACHMENT = 16;
    const u32 PRESENT = 32;

    const u32 VIEW_IN_DIFFERENT_FORMAT = 64;
};

Because D3D12, Metal and Vulkan all provide a mechanism to require that a texture view has the same format as the original texture, WebGPU needs to expose this ability as well.

In our proposal, a new usage bit VIEW_IN_DIFFERENT_FORMAT is added to WebGPUTextureUsage. It specifies, at texture creation time, whether texture views with a different format can be created from the WebGPU texture. This usage bit can be implemented as follows on each backend:

  • On D3D12, when the usage bit is set, we create the D3D12 texture in a typeless format; otherwise we use a typed format.
  • On Metal, when the usage bit is set and the texture will be a 2D texture with only one mipmap level and one array layer, we can avoid setting the MTLTextureUsagePixelFormatView flag when creating Metal textures.
  • On Vulkan, when the usage bit is set, we can avoid setting VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT when creating Vulkan images.
@bbernhar

A GPU could support casting between fully typed resources; if so, typeless is not required. In D3D, this is determined by querying D3D12_FEATURE_DATA_D3D12_OPTIONS, and the conversions are then done transparently in hardware.

@kvark
Contributor

kvark commented Jan 17, 2019

Thank you for the investigation, @Jiawei-Shao!

In general, I like the idea of specifying whether the user intends to create views of a texture with different formats up-front. The particular mechanism to do so is concerning: we have a clear definition of "usage", and we shouldn't add more flags there that don't fit this definition. Perhaps, we could introduce a new flag type:

interface WebGPUTextureCapabilities {
    const u32 NONE = 0;
    const u32 VIEW_AS_DIFFERENT_FORMAT = 1;
    const u32 VIEW_AS_CUBE = 2;
};

Compatibility spec

Major things missing from the investigation are:

  • identifying format compatibility and specifying it. Not every format can be viewed as every other format. D3D12 rules and limitations for UAV views are different from SRV/RTV views, for example.
  • specifying what happens exactly when one format is being used to change the resource while another to read from it. See this note in Vulkan spec for example:

Values intended to be used with one view format may not be exactly preserved when written or read through a different format. For example, an integer value that happens to have the bit pattern of a floating point denorm or NaN may be flushed or canonicalized when written or read through a view with a floating point format. Similarly, a value written through a signed normalized format that has a bit pattern exactly equal to -2^b may be changed to -2^b + 1 as described in Conversion from Normalized Fixed-Point to Floating-Point.

Correction-1: D3D12

The link you provided for the D3D12 API actually points to the D3D11 programming guide. The D3D12 docs on CreateShaderResourceView state the following:

When viewing a resource, the resource-view description must specify a typed format, that is compatible with the resource format. So that means that you can't create a resource-view description using any format with _TYPELESS in the name. You can however view a typeless resource by specifying a typed format for the view. For example, a DXGI_FORMAT_R32G32B32_TYPELESS resource can be viewed with one of these typed formats: DXGI_FORMAT_R32G32B32_FLOAT, DXGI_FORMAT_R32G32B32_UINT, and DXGI_FORMAT_R32G32B32_SINT, since these typed formats are compatible with the typeless resource.

Digging deeper, the rules for casting seem to be very complex: https://docs.microsoft.com/en-us/windows/desktop/direct3ddxgi/hardware-support-for-direct3d-12-0-formats

Look at the "PCS" versus "FCS" versus "FNS" there... ouch

Correction-2: D3D12

The format specified must be from the same family as the typeless format

IIRC, this only applies to RTV and SRV, where the number of bits for each component must match. For UAVs D3D12 allows RGBA8 to be treated as Uint32, for example. I don't know if that's just specified on a case-by-case basis, or simply requires the total number of bits to match. Would be great to get some clarity from @RafaelCintron .

Correction-3: Metal

On Metal, when the usage bit is set and the texture will be a 2D texture with only one mipmap level and one array layer, we can avoid setting the MTLTextureUsagePixelFormatView flag when creating Metal textures.

I don't think we can do this. We'd still need MTLTextureUsagePixelFormatView in order to make views of this texture in different formats, regardless of 2D/mipmap/layers.

@litherum
Contributor

https://developer.apple.com/documentation/metal/mtltexture/1515598-maketextureview

Reinterpretation of image data between pixel formats is supported within the following groups:
All 8-bit color formats
All 16-bit color formats
All 32-bit color formats
All 64-bit color formats
All 128-bit color formats
sRGB and non-sRGB forms of the same compressed format
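
As a rough illustration of the grouping rule quoted above, the sketch below checks whether two uncompressed color formats fall in the same bit-size group. The format names use WebGPU-style spelling and the table is a small hand-picked subset; the table and the helper name inSameMetalSizeGroup are assumptions made for this example, not part of Metal's API.

```typescript
// Bits per pixel for a few uncompressed color formats (illustrative subset only).
const BITS_PER_PIXEL: Record<string, number> = {
  "r8unorm": 8,
  "r8uint": 8,
  "rg8unorm": 16,
  "r16float": 16,
  "rgba8unorm": 32,
  "rgba8unorm-srgb": 32,
  "r32float": 32,
  "rgba16float": 64,
  "rgba32float": 128,
};

// Look up the size of a format, or undefined if it is not in the subset.
function bitsOf(format: string): number | undefined {
  return Object.prototype.hasOwnProperty.call(BITS_PER_PIXEL, format)
    ? BITS_PER_PIXEL[format]
    : undefined;
}

// Hypothetical helper: true when both formats are known and share a bit-size group.
function inSameMetalSizeGroup(a: string, b: string): boolean {
  const bitsA = bitsOf(a);
  const bitsB = bitsOf(b);
  return bitsA !== undefined && bitsA === bitsB;
}
```

Under this rule rgba8unorm and r32float land in the same group, which is the kind of reinterpretation the D3D family rule described earlier rejects for SRVs.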

@kdashg
Contributor

kdashg commented Jan 23, 2019

I prefer if we aim for the subset of "compatible classes" of formats between the underlying APIs.

Vulkan has largely the same classes as Metal, (Vulkan spec "Format Compatibility Classes") though Vulkan is more compatible between compressed textures, but does restrict some formats that are otherwise of the same bit-per-block count. (R10X6G10X6B10X6A10X6 is not compatible with R16G16B16A16)

@RafaelCintron will confirm what the compatibility landscape is for D3D12.

@RafaelCintron
Contributor

Here is the breakdown of the D3D's texture families.

Typeless Format          Specific Format
R32G32B32A32_TYPELESS    R32G32B32A32_FLOAT
                         R32G32B32A32_UINT
                         R32G32B32A32_SINT
R32G32B32_TYPELESS       R32G32B32_FLOAT
                         R32G32B32_UINT
                         R32G32B32_SINT
R16G16B16A16_TYPELESS    R16G16B16A16_FLOAT
                         R16G16B16A16_UNORM
                         R16G16B16A16_UINT
                         R16G16B16A16_SNORM
                         R16G16B16A16_SINT
R32G32_TYPELESS          R32G32_FLOAT
                         R32G32_UINT
                         R32G32_SINT
R32G8X24_TYPELESS        D32_FLOAT_S8X24_UINT
                         R32_FLOAT_X8X24_TYPELESS
                         X32_TYPELESS_G8X24_UINT
R10G10B10A2_TYPELESS     R10G10B10A2_UNORM
                         R10G10B10A2_UINT
                         R10G10B10_XR_BIAS_A2_UNORM
                         R11G11B10_FLOAT
R8G8B8A8_TYPELESS        R8G8B8A8_UNORM
                         R8G8B8A8_UNORM_SRGB
                         R8G8B8A8_UINT
                         R8G8B8A8_SNORM
                         R8G8B8A8_SINT
R16G16_TYPELESS          R16G16_FLOAT
                         R16G16_UNORM
                         R16G16_UINT
                         R16G16_SNORM
                         R16G16_SINT
R32_TYPELESS             D32_FLOAT
                         R32_FLOAT
                         R32_UINT
                         R32_SINT
R24G8_TYPELESS           D24_UNORM_S8_UINT
                         R24_UNORM_X8_TYPELESS
                         X24_TYPELESS_G8_UINT
R8G8_TYPELESS            R8G8_UNORM
                         R8G8_UINT
                         R8G8_SNORM
                         R8G8_SINT
R16_TYPELESS             R16_FLOAT
                         D16_UNORM
                         R16_UNORM
                         R16_UINT
                         R16_SNORM
                         R16_SINT
R8_TYPELESS              R8_UNORM
                         R8_UINT
                         R8_SNORM
                         R8_SINT
                         A8_UNORM
                         R9G9B9E5_SHAREDEXP
                         R8G8_B8G8_UNORM
                         G8R8_G8B8_UNORM
BC1_TYPELESS             BC1_UNORM
                         BC1_UNORM_SRGB
BC2_TYPELESS             BC2_UNORM
                         BC2_UNORM_SRGB
BC3_TYPELESS             BC3_UNORM
                         BC3_UNORM_SRGB
BC4_TYPELESS             BC4_UNORM
                         BC4_SNORM
BC5_TYPELESS             BC5_UNORM
                         BC5_SNORM
                         B5G6R5_UNORM
                         B5G5R5A1_UNORM
B8G8R8A8_TYPELESS        B8G8R8A8_UNORM
                         B8G8R8A8_UNORM_SRGB
B8G8R8X8_TYPELESS        B8G8R8X8_UNORM
                         B8G8R8X8_UNORM_SRGB
BC6H_TYPELESS            BC6H_UF16
                         BC6H_SF16
BC7_TYPELESS             BC7_UNORM
                         BC7_UNORM_SRGB
                         AYUV
                         Y410
                         Y416
                         NV12
                         P010
                         P016
                         420_OPAQUE
                         YUY2
                         Y210
                         Y216
                         NV11
                         AI44
                         IA44
                         P8
                         A8P8
                         B4G4R4A4_UNORM
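
To make the family rule concrete, here is a minimal lookup sketch over three families copied from the table above. The names D3D_FAMILIES and isInFamily are hypothetical, and the check only models "the view format must belong to the resource's typeless family"; it does not cover UAV special cases or hardware that can cast between fully typed formats.

```typescript
// Three families taken from the table above (not the full list).
const D3D_FAMILIES: Record<string, string[]> = {
  R8G8B8A8_TYPELESS: [
    "R8G8B8A8_UNORM",
    "R8G8B8A8_UNORM_SRGB",
    "R8G8B8A8_UINT",
    "R8G8B8A8_SNORM",
    "R8G8B8A8_SINT",
  ],
  R32_TYPELESS: ["D32_FLOAT", "R32_FLOAT", "R32_UINT", "R32_SINT"],
  B8G8R8A8_TYPELESS: ["B8G8R8A8_UNORM", "B8G8R8A8_UNORM_SRGB"],
};

// Hypothetical helper: true when viewFormat is listed in the family of the
// given typeless resource format. Unknown typeless formats yield false.
function isInFamily(typelessFormat: string, viewFormat: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(D3D_FAMILIES, typelessFormat)) {
    return false;
  }
  return D3D_FAMILIES[typelessFormat].indexOf(viewFormat) !== -1;
}
```

With this table, an R8G8B8A8_TYPELESS resource accepts an R8G8B8A8_UNORM_SRGB view but not an R32_FLOAT view, matching the example given in the introduction.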

@kvark
Contributor

kvark commented Feb 9, 2019

@RafaelCintron the documentation for SRV creation has an explicit section on casting from typeless to typed formats, and the families you provided are applicable there. However, the documentation for RTV, DSV, and UAV creation doesn't mention any compatibility requirements with the original resource format. Could you clarify whether this is an oversight in the documentation, or whether the rules for SRVs are indeed different?

@RafaelCintron
Contributor

@kvark , this information is missing from the documentation. The rules are the same for RTV and DSV as they are for SRV.

@kvark
Contributor

kvark commented Feb 20, 2019

@RafaelCintron I don't think this is true, or at least there appears to be something missing here.
Just tried to do a quick experiment: given a texture of format R8G8B8A8_TYPELESS I tried creating SRV and UAV of type R32_UINT. The SRV creation failed with the following explanation:

D3D12 ERROR: ID3D12Device::CreateShaderResourceView: The Format (0x2a, R32_UINT) is invalid, when creating a View; it is not a fully qualified Format castable from the Format of the Resource (0x1b, R8G8B8A8_TYPELESS). [ STATE_CREATION ERROR #28: CREATESHADERRESOURCEVIEW_INVALIDFORMAT]

The UAV however succeeded, even though the target type is outside of the source type family. There is also a relevant documentation piece about casting to R32_UINT, although I'm not aware of a place that would fully specify these rules.

@RafaelCintron
Contributor

Thank you for keeping me honest, @kvark

  • To use a particular UAV type, you must make the resource with the typeless format that matches the family in my earlier table.
  • Typed Loads from UAVs that are R32_[UINT|SINT|FLOAT] are special in that the resource can be created with any of the following typeless formats:
    • DXGI_FORMAT_R10G10B10A2_TYPELESS
    • DXGI_FORMAT_R8G8B8A8_TYPELESS
    • DXGI_FORMAT_B8G8R8A8_TYPELESS
    • DXGI_FORMAT_B8G8R8X8_TYPELESS
    • DXGI_FORMAT_R16G16_TYPELESS
    • DXGI_FORMAT_R32_TYPELESS
  • To figure out whether the UAV of your desired type will work, refer to the documentation in Typed Unordered Access View Loads document. Note that when the document says a format is “supported”, it means “supported for typed load as a UAV”.
  • Hardware has the option to support casting from one typed format to another, without the need for creating the resource as typeless. This can be determined via CheckFeatureSupport using D3D12_FEATURE_DATA_D3D12_OPTIONS3.
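
The R32 special case in the list above can be sketched as a small predicate. The two arrays restate the formats named in this comment; the function name allowsR32UavView is hypothetical, and the sketch deliberately ignores the CheckFeatureSupport query and the per-format typed-load support table.

```typescript
// Typeless resource formats that, per the list above, can back an R32 UAV.
const R32_UAV_CASTABLE_TYPELESS: string[] = [
  "R10G10B10A2_TYPELESS",
  "R8G8B8A8_TYPELESS",
  "B8G8R8A8_TYPELESS",
  "B8G8R8X8_TYPELESS",
  "R16G16_TYPELESS",
  "R32_TYPELESS",
];

// The UAV view formats the special case applies to.
const R32_UAV_VIEW_FORMATS: string[] = ["R32_UINT", "R32_SINT", "R32_FLOAT"];

// Hypothetical helper: true only for the special R32 UAV case. Views that
// stay inside the resource's own family are governed by the family rule
// and are not modeled here.
function allowsR32UavView(resourceFormat: string, uavFormat: string): boolean {
  return (
    R32_UAV_VIEW_FORMATS.indexOf(uavFormat) !== -1 &&
    R32_UAV_CASTABLE_TYPELESS.indexOf(resourceFormat) !== -1
  );
}
```

This reproduces the experiment reported earlier in the thread: an R32_UINT UAV over an R8G8B8A8_TYPELESS resource is accepted even though R32_UINT is outside that family.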

@kainino0x
Contributor

Hopefully comprehensive spreadsheet that we can reference when adding a "reinterpretable" flag: #744 (comment)

@Kangz Kangz modified the milestones: post-V1, V1.0 Sep 2, 2021
@Kangz Kangz added this to Needs Discussion in Main Oct 11, 2021
@litherum
Contributor

What's the rationale for this being in V1?

@Kangz
Contributor

Kangz commented Oct 25, 2021

RGB vs. SRGB seems pretty important but I don't know for sure. Other reinterpretations don't seem that useful for V1 though.

@Jiawei-Shao
Contributor Author

Unorm vs Uint

RGB vs. SRGB seems pretty important but I don't know for sure. Other reinterpretations don't seem that useful for V1 though.

Unorm vs Uint also seems important.

@litherum
Contributor

Unorm vs Uint also seems important.

Why?

@kainino0x
Contributor

kainino0x commented Oct 25, 2021

Meeting result: this probably* won't be super urgent, and it's not obvious how we expose this: usage bit (like metal(?)) vs list of formats (like vulkan(?)). Myles will continue internal discussions and come back, but if after that we're not able to come to a conclusion on this quickly, we're OK bumping it to post-V1.

@kainino0x
Contributor

* Unless we find a particular use case that will need this (and texture-to-buffer-to-texture copy would be a problem).

@shaoboyan
Contributor

Unorm vs Uint also seems important.
Why?

This is triggered by the implementation of CopyExternalImageToTexture. We find that, for example, rgba8unorm and rgba8uint are not allowed to be copied from/to each other directly. Instead, we need to employ a render pipeline to do the blit. Since the contents of the textures are the same, supporting the direct copy would mean we don't need to start a pipeline and do calculations in shaders.

But I don't know of any practical use cases from developers. Maybe someone uses the same texture in a render pipeline with a unorm format and in a compute pipeline with a uint format.

@Jiawei-Shao
Contributor Author

Unorm vs Uint also seems important.

Why?

It seems sometimes using uint is more "precise" than unorm. For example,

  • writing 0.5 into an R8Unorm texture in the shader, and do texture-to-buffer copy, the byte value in the buffer may be 127 or 128.
  • creating an R8Uint texture view over the R8Unorm texture and writing 127 into it in the shader, and do texture-to-buffer copy, the value in the buffer should always be 127.

So now when writing tests I prefer using the pixel values like 0.2, 0.4, 0.6 or 0.8 because they will always generate exactly same values when copying from texture to buffer.

@Kangz
Contributor

Kangz commented Oct 26, 2021

Note that the issue with 0.5 is because a unorm texture represents values of the form, i / 255. 0.5 is exactly halfway between 127 / 255 and 128 / 255. If you use exact values the formats work correctly (I made tests for this in Dawn).
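
The arithmetic behind this can be checked with a short sketch. It assumes an 8-bit unorm channel that stores round(v * 255) and treats an exact tie as implementation-dependent; the function name unorm8Candidates is made up for this example.

```typescript
// Returns the byte value(s) that could be stored for v in [0, 1] in an
// 8-bit unorm channel: a single byte normally, or both neighbours when
// v * 255 lies exactly halfway between two integers.
function unorm8Candidates(v: number): number[] {
  const scaled = v * 255;
  const lower = Math.floor(scaled);
  const upper = Math.ceil(scaled);
  const isTie = upper !== lower && Math.abs(scaled - lower - 0.5) < 1e-9;
  return isTie ? [lower, upper] : [Math.round(scaled)];
}
```

For 0.5 the scaled value is 127.5, so both 127 and 128 are candidates, whereas 0.2, 0.4, 0.6 and 0.8 scale to 51, 102, 153 and 204 and leave no ambiguity.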

@litherum
Contributor

writing 0.5 into an R8Unorm texture in the shader, and do texture-to-buffer copy, the byte value in the buffer may be 127 or 128

Right, this makes sense. Thanks for the explanation.

creating an R8Uint texture view over the R8Unorm texture and writing 127 into it in the shader

Even if the destination texture is R8Unorm, a shader is still capable of writing precise values into it. It can execute a scale-round-unscale operation before writing the result.

I understand that the texture view solution is more elegant than this solution. However, I'm not sure the texture view solution is required for V1 if there's a fairly straightforward workaround which is already achievable today.
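
A minimal sketch of the scale-round-unscale workaround mentioned above, written in TypeScript only to show the arithmetic (in practice it would live in the shader). The helper name snapToUnorm8 and the choice of Math.round for ties are assumptions of this example.

```typescript
// Snap x to the nearest value of the form i / 255 so that writing it to an
// 8-bit unorm channel no longer depends on how the hardware rounds.
function snapToUnorm8(x: number): number {
  const clamped = Math.min(1, Math.max(0, x)); // keep the value in [0, 1]
  const scaled = Math.round(clamped * 255);    // scale and round to an integer byte
  return scaled / 255;                         // unscale back to a unorm value
}
```

After snapping, the value already sits on the unorm grid, so applying the function a second time returns the same result.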

@litherum
Contributor

For reference, #744 seems to have all the research into it.

https://docs.google.com/spreadsheets/d/1PRiOja_AVse0QuB6rDH_YzidGw5fucQIRz0md9Sg4m8/edit?usp=sharing is the big spreadsheet with the allowed conversions in all the 3 APIs.

@litherum
Contributor

In this issue and #744, the only proposal I see is adding a single flag to GPUTextureUsage. And yet, there's talk about a new list-based proposal. Where is this list-based proposal written?

@litherum
Contributor

Oh, I think there's another possible solution: do it like D3D. The application asks whether two formats are compatible, and the device can return an enum: {yes, no, slow}. If the application asks for a "slow" reinterpretation, the texture must have been created with some allow_slow_reinterpretation bit set.

This solution has pros and cons.
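
One way the query-based idea could look, as a purely hypothetical sketch: none of these names exist in any WebGPU draft, and the hard-coded answers in queryReinterpretation merely stand in for whatever a real device would report.

```typescript
// Possible answers from the device, following the {yes, no, slow} idea.
type ReinterpretationSupport = "yes" | "no" | "slow";

// Minimal stand-in for a texture and its creation-time opt-in bit.
interface TextureInfo {
  format: string;
  allowSlowReinterpretation: boolean;
}

// Hypothetical device query. The rule here (sRGB pairs are "slow",
// everything else is "no") is invented purely for illustration.
function queryReinterpretation(from: string, to: string): ReinterpretationSupport {
  if (from === to) return "yes";
  if (from + "-srgb" === to || to + "-srgb" === from) return "slow";
  return "no";
}

// A view is allowed if the cast is free, or if it is slow and the texture
// opted in to slow reinterpretation when it was created.
function canCreateView(texture: TextureInfo, viewFormat: string): boolean {
  const support = queryReinterpretation(texture.format, viewFormat);
  if (support === "yes") return true;
  if (support === "slow") return texture.allowSlowReinterpretation;
  return false;
}
```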

Yet again, I'm thinking that this topic should be postponed because it's a not-insignificant design problem, and solving it isn't urgent.

@Kangz
Contributor

Kangz commented Oct 28, 2021

+1, I don't think we have a use case for which this feature is critical yet.

@magcius

magcius commented Nov 10, 2021

In games, it's common to do most of your world rendering in linear space, and then do tonemapping, and then render the UI in gamma space. If you have textures that you need to reuse between the world and UI, then you need to sample with both sRGB and linear texture formats. A classic example is particle effects that need to appear in both the world and the UI.

In WebGL 2, you would have to upload the same texture data twice, which is unfortunate.

There's other examples too, though I don't know if format reinterpretation would work for them, like writing to a framebuffer texture in sRGB-format, and then sampling from that same framebuffer texture by a separate non-sRGB-format view, which effectively does gamma correction in the hardware.
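
For readers less familiar with what the sRGB view changes, the sketch below shows the standard sRGB transfer functions that the hardware applies when sampling from or writing to an sRGB-format view. The function names are chosen for this example; the constants are those of the sRGB standard.

```typescript
// Decode an sRGB-encoded channel value in [0, 1] to linear light.
// This is what sampling through an sRGB view does per color channel.
function srgbToLinear(c: number): number {
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Encode a linear value in [0, 1] as sRGB.
// This is what writing through an sRGB render target view does.
function linearToSrgb(l: number): number {
  return l <= 0.0031308 ? l * 12.92 : 1.055 * Math.pow(l, 1 / 2.4) - 0.055;
}
```

Sampling the same bytes through a non-sRGB view skips the decode step, which is why one upload can serve both the linear world pass and the gamma-space UI pass.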

@kainino0x
Contributor

In this issue and #744, the only proposal I see is adding a single flag to GPUTextureUsage. And yet, there's talk about a new list-based proposal. Where is this list-based proposal written?

I finally figured this out during the issue triage today. It's #811.

@kainino0x
Contributor

I really wish we could just make srgb reinterpretation always-enabled, but Vulkan and D3D12 imply it must be costly on some (desktop) hardware. OTOH, it is always enabled in Metal, so for them either the cost was small enough, or the driver is doing something special for that hardware (an implicit relayout, or other workaround).

@magcius

magcius commented Nov 10, 2021

Considering that the evolution of sRGB texture reads started as a sampler flag (e.g. D3DSAMP_SRGBTEXTURE in D3D9) rather than a separate texture format, I can't imagine it's very expensive on desktop hardware, but that's just a hunch, I'd be curious to hear from IHVs.

@Kangz
Contributor

Kangz commented Nov 10, 2021

Intel had some reservations about adding SRGB compatibility by default because some GPU generation can use color compression for non-SRGB formats but not for SRGB formats, so that disables framebuffer color compression altogether (ccs_e). See this table that seems to indicate it would prevent color compression on RGBA8Unorm on Intel Gen9 (HD 630 and friends, really popular GPUs).

@magcius

magcius commented Nov 10, 2021

Is that for render target writes only? Or does it apply to just normal textures as well.

If it's just render target writes, does mesa/the GL driver have to do a full decompression step if I run glDisable(GL_FRAMEBUFFER_SRGB) ?

@Jiawei-Shao
Contributor Author

Intel had some reservations about adding SRGB compatibility by default because some GPU generation can use color compression for non-SRGB formats but not for SRGB formats, so that disables framebuffer color compression altogether (ccs_e). See this table that seems to indicate it would prevent color compression on RGBA8Unorm on Intel Gen9 (HD 630 and friends, really popular GPUs).

That's true. So on Intel it will be more performant if the image is created with VkImageFormatListCreateInfo (provided in Vulkan 1.2 or VK_KHR_image_format_list) to specify all the formats that will be used with this texture.

@kainino0x
Contributor

Some more thoughts on viewFormats list (#811) vs flag (this issue / #744), and the proposal in #2336.

IIRC @litherum previously expressed that a single reinterpretation flag was strongly preferable. Since I think we've concretely determined we need more than a single flag (none, vs srgb, vs wider reinterpretation), #2336 attempts a reasonable middle ground.

I think it's a fair argument, at least from a porting standpoint, that not having any presets would be a pain point for developers trying to assume reinterpretability later on. That said, I'm not concerned about it - I don't think format reinterpretation comes up extremely often, and when it does, developers can just hardcode the compatibility families that they might actually need - which might end up being smaller and more efficient than the families they would get with a preset option.

We can infer from the existence of VK_KHR_image_format_list, and determine concretely (e.g. as Dzmitry has done here), that reducing the number of view formats has a concrete benefit. Unless we can concretely determine and standardize more presets that are somehow future-proof against future GPU generations, I don't see any way we can give users the best performance with a less descriptive API. I think the best we could do is make some educated guesses at building a form that is still very expressive without listing every format, like having viewFormats allow (multiple) presets like 'exact same', 'different sRGBness', 'different int sign' (sint/uint), 'different norm sign' (snorm/unorm), 'same rgba bit layout' (D3D12 core rules), 'same size' (MTLTextureUsagePixelFormatView and VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT). And maybe something like 'r32-storage' for the special R32_* UAVs in D3D12. As I describe it, this is sounding like a whole new bitfield.

In any case, we previously seemed to be moving toward not tackling this problem until post-V1. If we do that, we just need a reasonably future-looking API that could be compatible with any of the other future designs (hence exploring those future designs in depth). Hence the proposal in #2336.

@kainino0x
Contributor

That alternative proposal in IDL form:

typedef [EnforceRange] unsigned long GPUTextureViewFamilyFlags;
[Exposed=(Window, DedicatedWorker)]
namespace GPUTextureViewFamily {
    const GPUFlagsConstant SRGB                        = 0x0001;
    const GPUFlagsConstant COMPONENT_SIGN              = 0x0002;
    const GPUFlagsConstant COMPONENT_TYPE              = 0x0004;
    const GPUFlagsConstant SAME_BYTE_SIZE_UNCOMPRESSED = 0x0008; // strict superset of 1|2|4
    const GPUFlagsConstant R32_STORAGE                 = 0x0010;

    // Allows reinterpretation between block and non-block formats.
    // Probably not possible in D3D12, but included for completeness.
    const GPUFlagsConstant SAME_BYTE_SIZE              = 0x0020;
};

partial dictionary GPUTextureDescriptor {
    GPUTextureViewFamilyFlags viewFormats = 0;
};
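
A small sketch of how such a bitfield might be consumed, assuming the constants from the IDL above are mirrored in a plain object. The names ViewFamily, expandFamilies and allowsFamilies are hypothetical. The only implication modeled is the one stated in the IDL comment (SAME_BYTE_SIZE_UNCOMPRESSED is a superset of 1|2|4); whether SAME_BYTE_SIZE implies anything else is left open.

```typescript
// Mirror of the proposed flag values.
const ViewFamily = {
  SRGB: 0x0001,
  COMPONENT_SIGN: 0x0002,
  COMPONENT_TYPE: 0x0004,
  SAME_BYTE_SIZE_UNCOMPRESSED: 0x0008,
  R32_STORAGE: 0x0010,
  SAME_BYTE_SIZE: 0x0020,
};

// Add the bits implied by SAME_BYTE_SIZE_UNCOMPRESSED, per the IDL comment.
function expandFamilies(flags: number): number {
  let expanded = flags;
  if ((expanded & ViewFamily.SAME_BYTE_SIZE_UNCOMPRESSED) !== 0) {
    expanded |= ViewFamily.SRGB | ViewFamily.COMPONENT_SIGN | ViewFamily.COMPONENT_TYPE;
  }
  return expanded;
}

// True when every family in `needed` is covered by the texture's flags.
function allowsFamilies(textureFlags: number, needed: number): boolean {
  return (expandFamilies(textureFlags) & needed) === needed;
}
```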

@fintelia

fintelia commented Dec 3, 2021

// Allows reinterpretation between block and non-block formats.
// Probably not possible in D3D12, but included for completeness.
const GPUFlagsConstant SAME_BYTE_SIZE              = 0x0020;

Is this intended to capture the semantics of Vulkan's VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT (which is slightly more constrained than "any two formats with the same byte size")? If not, I'd definitely encourage adding a family that does provide that functionality.

@kvark
Contributor

kvark commented Dec 3, 2021

@kainino0x

I think the best we could do is make some educated guesses at building a form that is still very expressive without listing every format

That's a difficult enterprise. We'd have to understand all the open-source driver logic and anticipate what happens in closed-source drivers. We can't get it perfect.

There is a different approach we could take. We could define the set of formats that can be viewed with, portably, in all APIs. And then ask the user to specify exactly the ones they want:

partial dictionary GPUTextureDescriptor {
    sequence<GPUTextureFormat> viewFormats = [];
};

I believe this is more future-proof, since we don't have to expose the nasty details of the drivers.
I also believe it's more straightforward for the users. If a user needs to cast between Srgb and non-Srgb, they'd check with the spec if it's possible, and pass in the cast format in this list. They wouldn't need to know which exact sub-class of castable formats this belongs to.
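
To illustrate how a list-based viewFormats could be validated, here is a sketch that accepts only view formats differing from the texture format in sRGB-ness. That rule is just one narrow option from this discussion, assumed here for the example; the helper names are hypothetical and the format strings use WebGPU-style spelling.

```typescript
// Drop a trailing "-srgb" suffix so that sRGB and non-sRGB variants of
// the same format compare equal.
function stripSrgbSuffix(format: string): string {
  return format.replace(/-srgb$/, "");
}

// Hypothetical validation: every entry of viewFormats must be the texture
// format itself or its sRGB / non-sRGB counterpart.
function validateViewFormats(format: string, viewFormats: string[]): boolean {
  const base = stripSrgbSuffix(format);
  for (const viewFormat of viewFormats) {
    if (stripSrgbSuffix(viewFormat) !== base) {
      return false;
    }
  }
  return true;
}
```

With a fixed rule like this, the result of validation does not depend on the platform, and the user only needs to name the formats they intend to use.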

@kainino0x
Contributor

That's a difficult enterprise. We'd have to understand all the open-source driver logic and anticipate what happens in closed-source drivers. We can't get it perfect.

Exactly my point, we can't get it perfect especially for future hardware. We can only get closer to perfect, with better granularity.

There is a different approach we could take. We could define the set of formats that can be viewed with, portably, in all APIs. And then ask the user to specify exactly the ones they want:

I think this is the same as #811. If we allowed a list of formats we should definitely always enforce strictly specified compatibility rules, not vary by platform.

Is this intended to capture the semantics of Vulkan's VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT (which is slightly more constrained than "any two formats with the same byte size")? If not, I'd definitely encourage adding a family that does provide that functionality.

Ah yeah, that was what I had in mind, I just forgot what the name in Vulkan was. (I just didn't think about it too hard because I think it's impossible in D3D12 anyway so we wouldn't be able to expose it.)

@shaoboyan
Contributor

shaoboyan commented Jan 21, 2022

We got inputs from the mesa driver team about creating *-srgb texture views on non-srgb texture.

  • Mesa team confirms that it isn't possible to enable ccs_e on R8G8B8A8_UNORM_SRGB prior to Gen 10. So it has to be disabled.
  • Mesa team can't see a way to turn on/off compression dynamically.
    With descriptor indexing they just don't have a way to tell when an image is going to be used in SRGB/UNORM.

So we(Intel) support Kai's proposal.

@kdashg
Contributor

kdashg commented Jan 22, 2022

WGSL meeting minutes 2022-01-19
  • KN: this just came up in canvas formats discussion, so it's relevant. If we can reinterpret texture as sRGB, we can do that rather than allocating sRGB texture. Reinterpreting the base format.
  • KN: I put up a proposal a while ago. #2336. Tries to sidestep the question of "do we have a yes/no reinterpret flag" or "list of all formats you can reinterpret as", for sRGB in particular. Can discuss the whole gamut of the question.
  • KN: I don't really care, but I don't think a single bit is enough. There are ergonomic concerns about providing the whole list. Happy to do something in between.
  • MM: is the reason we're discussing this now because we want srgb-to-non-srgb views?
  • KN: that's why it's at the top of the queue, yes.
  • MM: in Metal those are free. In other APIs they're not?
  • KN: correct. Nuance around it. Intel devices, according to our reading of open source drivers, even when they do know specifically they only need to reinterpret between e.g. rgba8unorm and rgba8unorm-srgb, they deoptimize, because framebuffer compression doesn't work on srgb formats (on older intel architectures?). Can be really valuable for drivers to know this info.
  • KN: Vulkan used to only have a single bit - now has an extension with a list of reinterpretation formats. The driver would examine the list.
  • KG: but if you wanted it for free - we'd lose framebuffer compression even in cases where user didn't want srgb?
  • KN: not that old - Gen9 or 9.5, not that old.
    • context: Gen10 introduced in 2018.
  • JN: from D3D side - all recent drivers have to support free casting between sRGB and non sRGB specifically. Any broader casting, have to specify typeless formats to reinterpret components, and can't go beyond certain class types. Can't change channel orders or counts.
  • KN: thanks, didn't remember the D3D details.
  • RC: Jesse's right and years ago I put links into MSDN articles describing when it's automatic and not, etc.
  • MM: "all recent drivers" - is WebGPU expected to run on non-recent drivers?
  • RC: there's a desire to do that, yes. In my dream world, you don't need to maintain both WebGL and WebGPU - would like one API that works everywhere. E.g., disabling features of WebGPU, like no compute shaders on certain hardware - to extend the reach. We've discussed WebGPU Compat, though no longer in this format.
  • JN: in terms of era of drivers, circa 2017.
  • KG: that's encouraging.
  • KN: agree.
  • MM: Does Intel have comments / thoughts?
  • MM: this sounds like straightforward tier 1/2/3 thing.
  • KN: Not straightforward. Vulkan doesn't look like that.
  • MM: tier 1 = one set. Tier 2 = bigger set. Tier 3 = even larger set.
  • KN: I mean - Vk API lets you specify reinterpretable format list. No boundaries between different tiers. The best we can do - look at real hardware and what it supports, and draw lines that look nice.
  • JN: I do believe Vk requires reinterpretation of all formats that are the same bytes per pixel.
  • KN: yes. Requires them to be possible.
  • JN: and all formats that fit into that category. No opt-out.
  • KN: can't figure out where to draw lines for perf purposes. Happen to know Gen9 doesn't support FB compression on sRGB formats, but don't know enough about other drivers.
  • KG: could take into account Metal's and D3D's expectations. If those don't perform super poorly on Vk, could adopt those. Castable sRGB support, for example, we'd always ask this.
  • KN: don't think we're willing to take that perf hit on Gen9. It's a big request.
  • MM: lowest tier would not allow sRGB casting. Next tier up, just sRGB to non sRGB casts. After that, would look at D3D and Metal tables and come up with something that makes sense. Maybe higher tier allowing all sorts of casting.
  • KN: did an exercise on how that'd work. Have to dig up the comment.
  • KN: my proposal: my #2336 creates one of these tiers - the sRGB tier. It punts on the rest of the question. Could either put the list there, or put more tiers in the enum. I'd like to accept #2336, sounds like it's in line with what the group's coming to.
  • KG: makes sense to me, could bikeshed on the name.
  • MM: think that's reasonable, agree with KG on the naming.
  • KN: we don't have to land it now, but would like to accept the form.
  • MM: core of this - some devices shouldn't be able to freely cast between sRGB/non-sRGB, so give authors one bit of control per texture to start with.
  • KN: in that case, I'll see if we can talk about this in the editors' meeting. Think this is a good way forward. Would like tiers in the future, or lists - don't think it blocks this proposal.
  • KN: FWIW most of the people arguing to have the list are Corentin and Dzmitry, not able to be present at today's meeting.
  • SY: Intel driver's poor performance, we'll talk with internal driver team.
  • KN: I'll send you the resource we have. The driver says, on Gen9 do this, on Gen10 it's not a problem. I'll send you the info.
    • 168#issuecomment-965782693
    • after the fact: more thoughts:
      • We're pretty confident about the restriction being there, but we would be really interested in (at least qualitatively) how large the performance hit would be.
      • Another question would be whether the driver has any ability to dynamically enable and disable ccs_e based on whether srgb views exist of it. This I have no idea about. If the driver doesn't have to disable ccs_e on R8G8B8A8_UNORM until you actually create a view with format R8G8B8A8_UNORM_SRGB, then it would be significantly less of a problem.
  • KG: it's great to have confirmation from Intel.

@kainino0x kainino0x self-assigned this Jan 26, 2022
@kainino0x kainino0x moved this from Needs Discussion to Needs Specification in Main Jan 26, 2022
kainino0x added a commit to kainino0x/gpuweb that referenced this issue Jan 26, 2022
Initially allows only srgb formats; further rules for format
compatibility can follow.

Issue: gpuweb#168
CC: gpuweb#2322
kainino0x added a commit that referenced this issue Jan 28, 2022
* Add GPUTextureDescriptor viewFormats list

Initially allows only srgb formats; further rules for format
compatibility can follow.

Issue: #168
CC: #2322

* note on canvas config
@litherum
Contributor

I think there's more to discuss here:

  1. What happens if someone puts two totally incompatible formats into the list?
  2. What happens if someone puts two formats in the list which are compatible on one device/API but not compatible on another?
  3. What happens if someone puts RGBA8 and RGBA8_srgb into the list on Intel 630? The hardware is capable of it, at lower performance. What about the same thing, but on a device which doesn't have to disable compression?

@Kangz
Contributor

Kangz commented Jan 31, 2022

What happens if someone puts two totally incompatible formats into the list?

Each format should have a list of compatible formats (that is portable to all APIs of course) that are supposed to all be compatible with each other. Using a viewFormat that's not in the GPUTexture's format compatibility list produces an error.

What happens if someone puts two formats in the list which are compatible on one device/API but not compatible on another?

That wouldn't be allowed, the list of compatible formats should be portable (except for optional features of course).

What happens if someone puts RGBA8 and RGBA8_srgb into the list on Intel 630? The hardware is capable of it, at lower performance. What about the same thing, but on a device which doesn't have to disable compression?

I'm not sure I understand. On HD630 you get a slowdown. On other systems you don't. That's the point of having a creation argument that says whether there's reinterpretation.
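
To illustrate the point about a creation-time argument, here is a minimal TypeScript sketch. It is not taken from the spec or from any implementation: `TextureDescriptorSketch` and `mayBeReinterpreted` are hypothetical names, and the types are local stand-ins so the snippet runs without a GPU. Only the `viewFormats` field name and the usage-flag values mirror the WebGPU API.

```typescript
// Local stand-ins for WebGPU types (assumption: reduced to what the sketch needs).
type TextureFormatSketch = "rgba8unorm" | "rgba8unorm-srgb";

interface TextureDescriptorSketch {
  size: [number, number];
  format: TextureFormatSketch;
  usage: number;
  // Formats, other than `format`, that views of this texture may use.
  viewFormats: TextureFormatSketch[];
}

// Numeric values of GPUTextureUsage.TEXTURE_BINDING and RENDER_ATTACHMENT.
const TEXTURE_BINDING = 0x04;
const RENDER_ATTACHMENT = 0x10;

// Hypothetical helper: true when the descriptor asks for any view format
// that differs from the texture's own format.
function mayBeReinterpreted(desc: TextureDescriptorSketch): boolean {
  return desc.viewFormats.some((f) => f !== desc.format);
}

// No reinterpretation requested: nothing forces a slower path.
const plain: TextureDescriptorSketch = {
  size: [256, 256],
  format: "rgba8unorm",
  usage: TEXTURE_BINDING | RENDER_ATTACHMENT,
  viewFormats: [],
};

// The author opts in to sRGB views up front, at texture creation.
const castable: TextureDescriptorSketch = {
  size: [256, 256],
  format: "rgba8unorm",
  usage: TEXTURE_BINDING | RENDER_ATTACHMENT,
  viewFormats: ["rgba8unorm-srgb"],
};

console.log(mayBeReinterpreted(plain));    // false
console.log(mayBeReinterpreted(castable)); // true
```

An implementation could use a signal like this to decide whether a texture needs the slower configuration on hardware such as the HD 630; how a driver actually makes that decision is outside this sketch.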

@Kangz
Contributor

Kangz commented Jan 31, 2022

Note that #2540 correctly restricts the set of formats that can be reinterpreted:

            Two {{GPUTextureFormat}}s |format| and |viewFormat| are <dfn dfn for=>texture view format compatible</dfn> if:

            - |format| equals |viewFormat|, or
            - |format| and |viewFormat| differ only in whether they are `srgb` formats (have the `-srgb` suffix).
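
The rule quoted above could be sketched in TypeScript roughly as follows. This is an illustration under assumptions, not the spec's algorithm text: the function names are hypothetical, formats are treated as plain strings, and "differ only in whether they are `srgb` formats" is approximated by comparing the names with any trailing `-srgb` removed.

```typescript
const SRGB_SUFFIX = "-srgb";

// Drop a trailing "-srgb" so that "rgba8unorm-srgb" and "rgba8unorm" compare equal.
function stripSrgbSuffix(format: string): string {
  const hasSuffix =
    format.length > SRGB_SUFFIX.length &&
    format.slice(-SRGB_SUFFIX.length) === SRGB_SUFFIX;
  return hasSuffix ? format.slice(0, format.length - SRGB_SUFFIX.length) : format;
}

// True if the two formats are equal or differ only by the "-srgb" suffix.
function isTextureViewFormatCompatible(format: string, viewFormat: string): boolean {
  return stripSrgbSuffix(format) === stripSrgbSuffix(viewFormat);
}

// Hypothetical validation helper: returns the viewFormats entries that
// would be rejected for a texture created with `format`.
function incompatibleViewFormats(format: string, viewFormats: string[]): string[] {
  return viewFormats.filter((v) => !isTextureViewFormatCompatible(format, v));
}

console.log(isTextureViewFormatCompatible("rgba8unorm", "rgba8unorm-srgb")); // true
console.log(isTextureViewFormatCompatible("rgba8unorm", "r32float"));        // false
// Only "rgba8unorm" is rejected for a "bgra8unorm-srgb" texture.
console.log(incompatibleViewFormats("bgra8unorm-srgb", ["bgra8unorm", "rgba8unorm"]));
```

A non-empty result from `incompatibleViewFormats` corresponds to the case described earlier in the thread, where using a view format outside the compatibility list produces an error.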

Daasin added a commit to FOSS-Archives/WebGPU that referenced this issue Mar 21, 2022
* Always allow reassociation (gpuweb#2403)

Fixed: gpuweb#2402

* Update question.md

* Fix GPUBuffer.destroy when "mapping pending" (gpuweb#2411)

Fixes gpuweb#2410

* CopyExternalImageToTexture: Support copy from webgpu context canvas/offscreenCanvas (gpuweb#2375)

Address gpuweb#2350.

* Fix typo in requestDevice and clarify error cases (gpuweb#2415)

* Disallow empty buffer/texture usages (gpuweb#2422)

* Specify dispatchIndirect behavior when exceeding limit (gpuweb#2417)

The rest of the behavior of this command isn't specified yet, but this
gets this into the spec so we can close the issue and edit later.

Fixes gpuweb#323

* Require buffer bindings to have non-zero size (gpuweb#2419)

* Require depth clear value in 0.0-1.0 (gpuweb#2421)

* Require depth clear value in 0.0-1.0

* clarify handling of union

* Fix error conventions for optional features (gpuweb#2420)

* Fix error conventions for optional features

* relax claim

* Add increment and decrement statements (gpuweb#2342)

* Add increment and decrement statements

Phrased in terms of += and -= complex assignments

Fixes: gpuweb#2320

* Remove the note about ++ and -- being reserved

* Specify order of evaluation for expressions (gpuweb#2413)

Fixes gpuweb#2261

* Expressions (and function parameters) are evaluated in left-to-right
  order
* Remove todos that are covered elsewhere in the spec
  * variable lifetime
  * statement and intra-statement order

* Remove [[block]] attribute from sample (gpuweb#2439)

* Enable dynamic indexing on matrix and array values (gpuweb#2427)

Fixes: gpuweb#1782

* Behaviour of empty statement is {Next} (gpuweb#2432)

Fixes: gpuweb#2431

* wgsl: Fix TODOs in Execution section (gpuweb#2433)

- in Technical overview, say that evaluation of module-scope constants
  is the first thing to be executed
- Remove "Before an entry point begins" because that's now fully covered
  by the Technical overview
- Add introductory prose in the "Execution" top level section
- remove parens from "Program order (within an invocation)" section.

* wgsl: Fix "Entry Point" section TODOs (gpuweb#2443)

- Link 'stage' attribute text to definitions later on
- Move definition of "entry point" to the top of the "Entry Points"
  section, away from the "Entry point declaration" section.
- Rework and simplify the first part of "Entry point declaration".
  Link to other parts of the spec, e.g. to user-defined function.

* wgsl: Allow 0X for hex prefix (gpuweb#2446)

Fixes: gpuweb#1453

* Specify compilation message order/locations are impl-defined (gpuweb#2451)

Issue gpuweb#2435

* Disallow pipe for hex literals and allow capital (gpuweb#2449)

* Remove [SameObject] from GPUUncapturedErrorEvent.error (gpuweb#2423)

Implements the same behavior by prose rather than by WebIDL attribute.
The WebIDL attribute isn't currently valid on union types, and we have
to define this in prose anyway since [SameObject] is pure documentation
(has no behavioral impact on its own).

Fixes gpuweb#1225

* Make GPUDevice.lost return the same Promise object (gpuweb#2457)

Fixes gpuweb#2147

* Require alignment limits to be powers of 2 (gpuweb#2456)

Fixes gpuweb#2099

* Define GPUTextureViewDimension values (gpuweb#2455)

including the order of faces in cube maps.

Fixes gpuweb#1946

* Restore the box around algorithm divs (gpuweb#2453)

When the spec template changed, algorithms stopped having an outline
around them, which makes the spec hard to read.

* Add source image orientation to copyExternalImageToTexture (gpuweb#2376)

* Add 'originBottomLeft' attribute in GPUImageCopyTextureTagged

Resolved gpuweb#2324

* Simplify the description and move originBottomLeft to GPUImageCopyExternalImage

* Update spec/index.bs

Address Kai's description.

Co-authored-by: Kai Ninomiya <kainino1@gmail.com>

* Fix typo

* Apply suggestions from code review

* Update spec/index.bs

Co-authored-by: Kai Ninomiya <kainino1@gmail.com>

* Clarify that attachments may not alias (gpuweb#2454)

Fixes gpuweb#1358

* Fix examples classes, globals, and previews (gpuweb#2412)

* Rework encoder state and mixins (gpuweb#2452)

* GPUDebugCommandsMixin

* Move state and command list to a GPUCommandsMixin

* Propagate commands in endPass

* fix which [[commands]] is appended

* nits

* "Validate"->"Prepare" the encoder state

* Fully describe validation of render attachments (gpuweb#2458)

* Fully describe validation of render attachments

Fixes gpuweb#2303

* typos

* more typo

* Texture format caps for MSAA and resolve (gpuweb#2463)

* Texture forma caps for MSAA and resolve

* Fix missing columns, add notes

* Add multisample flags even where rendering isn't supported

* [editorial] wgsl: left shifts are logical (gpuweb#2472)

* Remove 'read','read_write','write' as keywords, image formats as keywords (gpuweb#2474)

* Texture format names are not keywords

Issue: gpuweb#2428

* read, write, read_write are not keywords

Fixes: gpuweb#2428

* Only define image format names usable for storage textures (gpuweb#2475)

* Only define image format names usable for storage textures

Fixes: gpuweb#2473

* Sort texel format names by channel width first

Make it consistent with the other tables in the WGSL spec,
and with the Plain Color Formats table in the WebGPU spec.

* [editorial] Rename "built-in variable" -> "built-in value" (gpuweb#2476)

* Rename "built-in variable" -> "built-in value"

Fixes: gpuweb#2445

* Rewrite builtin-in inputs and outputs section

It needed an overhaul because with pipeline I/O via entry point
parameters and return types.  Previously it was phrased in terms
of *variables*, and some things just didn't make sense.

Added rules expressing the need to match builtin stage and direction
with entry point stage and parameter vs. return type.
This also prevents mixing builtins from different stages or conflicting
directions within a structure.

* Move Limits section to under "WGSL Program" (gpuweb#2480)

I think it makes more sense there.

* Fix declaration-and-scope section for out-of-order decls (gpuweb#2479)

* Fix declaration-and-scope section for out-of-order decls

Also reorganize to bring "resolves to" closer to the definition of "in scope".

Fixes: gpuweb#2477

* Apply review feedback

* Behaviors: Ban obviously infinite loops (gpuweb#2430)

Fixes: gpuweb#2414

* Clarify fract (gpuweb#2485)

* Officially add Brandon as a spec editor (gpuweb#2418)

Brandon has de facto equal standing as an editor and I think it's time
to recognize it.

* Require 1D texture mipLevelCount to be 1. (gpuweb#2491)

Fixes gpuweb#2490

* Add simple examples for create, init, and error handling functions

* Address feedback from Kai

* Tweak to device loss comments

* wsgl: Add bit-finding functions. (gpuweb#2467)

* wsgl: Add bit-finding functions.

- countLeadingZeros, countTrailingZeros
   - Same as MSL clz, ctz
- firstBitHigh, firstBitLow
   - Same as HLSL firstbithi, firstbitlow
   - Same as GLSL findMSB, findLSB
   - Same as SPIR-V's GLSL.std.450 FindSMsb FindUMsb, FindILsb

Fixes: gpuweb#2130

* Apply review feedback

- Better description for countLeadingZeros countTrailingZeros
- For i32, we can say -1 instead of |T|(-1)

* Apply review feedback: drop "positions"

* wgsl: Add extractBits, insertBits (gpuweb#2466)

* wgsl: Add extractBits, insertBits

Fixed: gpuweb#2129 gpuweb#288

* Formatting: break lines between parameters

* insertBits operates on both signed and unsigned integral types

* Add mixed vector-scalar float % operator (gpuweb#2495)

Fixes: gpuweb#2450

* wgsl: Remove notes about non-ref dynamic indexing (gpuweb#2483)

We re-enabled dynamically indexing into non-ref arrays and matrices in
gpuweb#2427, as discussed in gpuweb#1782.

* Disallow aliasing writable resources (gpuweb#2441)

* Describe resource aliasing rules

Fixes gpuweb#1842

* Update spec/index.bs

Co-authored-by: Kai Ninomiya <kainino1@gmail.com>

* Update spec/index.bs

Co-authored-by: Kai Ninomiya <kainino1@gmail.com>

* Editorial changes

* Editorial: split out aliasing analysis

* Consider only used bind groups and consider visibility flags

* Consider aliasing between read-only and writable bindings

* Tentatively add note about implementations

* editorial nits

* Fix algorithm

* Remove loop over shader stages

* Rephrase as "TODO figure out what happens"

* clarify

* Add back loop over shader stages

* map -> list

Co-authored-by: Myles C. Maxfield <mmaxfield@apple.com>
Co-authored-by: Myles C. Maxfield <litherum@icloud.com>

* Fix map/list confusion from gpuweb#2441 (gpuweb#2504)

I forgot to save the file before committing my last fix to gpuweb#2441.

* Fix a typo in packing built-in functions list (gpuweb#2513)

* Remove tiny duplicates (gpuweb#2514)

This removes a few tiny duplicates found in the spec.

* integer division corresponds to OpSRem (gpuweb#2518)

* wgsl: OpMod -> OpRem for integers

* OpURem -> OpUMod again

Co-authored-by: munrocket <munrocket@pm.me>

* Remove stride attribute (gpuweb#2503)

* Remove stride attribute

Fixes: gpuweb#2493

Rework the examples for satisfying uniform buffer layout, using align
and stride.

* Remove attribute list from array declaration grammar rule

Fixes: gpuweb#1534 since this is the last attribute that may be applied to a type declaration.

* Switch to `@` for Attributes (gpuweb#2517)

* Switch to `@` for Attributes

* Convert new examples

* Struct decl does not have to end in a semicolon (gpuweb#2499)

Fixes: gpuweb#2492

* wgsl: float to integer conversion saturates (gpuweb#2434)

* wgsl: float to integer conversion saturates

Fixes a TODO

* Saturation follows rounding toward zero.

Simplify the wording of the rule.

Explain what goes on at the extreme value (as discussed in the issue
and agreed at the group), how you don't actually get the max value
in the target type because of imprecision.

* Store type for buffer does not have to be structure (gpuweb#2401)

* Store type for buffer does not have to be structure

* Modify an example showing a runtime-sized array as the store type
  for a storage buffer.

Fixes: gpuweb#2188

* Update the API-side rules about minBindingSize

The store type of the corresponding variable is not always
going to be a structure type. Qualify the rule accordingly.

* Rework minimum binding size in both WebGPU and WGSL spec

Define 'minimum binding size' in WGSL spec, and link to it from WebGPU.
Repeat the rule in both places (to be helpful).

The minimum binding size for a var with store type |T| is
max(AlignOf(T),SizeOf(T)), and explain why the AlignOf part is needed:
it's because sometimes we have to wrap |T| in a struct.

This also replaces the old rule in WGSL which confusingly depended
on the storage class.  The storage class aspect is already embedded
in the alignment and size constraints for the variable.

* Simplify minimum binding size to Sizeof(store-type)

Underlying APIs don't need the extra padding at the end of any
structure which might wrap the store type for the buffer variable.

* Update API-side to SizeOf(store-type)

* Apply review feedback

- Link to SizeOf in the WGSL spec
- More carefully describe the motivation for the min-binding-size
  constraint.

* Simplify, and avoid using the word "mapping"

"Mapping" is how the buffer's storage is paged into host-accessible
address space. That's a different concept entirely, and would only
confuse things.

* Remove duplicated words (gpuweb#2529)

* Remove duplicated words `be`

Remove two duplicated `be` from the WebGPU spec.

* remove another duplicated `the`

* editorial: streamline array and structure layout descriptions (gpuweb#2521)

* Simplify array layout section

- move definition of element stride to start of memory layout section
- remove redundant explanation of array size and alignment
- remaining material in that example is just examples
- Add more detail to examples, including computing N_runtime for
  runtime-sized array

* Streamline structure layout section

- Make 'alignment' and 'size' defined terms
- Don't repeat the rule for overall struct alignment and size.
- Rename "Structure Layout Rules" to "Structure Member Layout" because
  that's all that remains.
  - Streamline the text in this section.

Fixes: gpuweb#2497

* Apply review feedback:

- state constraints at definition of the align and size attributes
- rename 'size' definition to 'byte-size'
- use the term "memory location" when defining alignment.
- rename the incorrectly-named "lastOffset" to "justPastLastMember"
- in the description of internal layout, state the general rule that the
  original buffer byte offset k must divide the alignment of the type.

* Change notation: say i'th member instead of M<sub>i</sub>

* Remove stray sentence fragment

* Change GPUObjectBase.label from nullable to union-with-undefined (gpuweb#2496)

* Separate loadOp and clear values

* Add note explaining how dispatch args and workgroup sizes interact (gpuweb#2519)

* Add a note explaining how the dispatch arguments and workgroup sizes interact

* Address feedback

* Address feedback from Kai

* Refine the supported swapchain formats (gpuweb#2522)

This removes the "-srgb" formats, and adds "rgba16float".

Fixes: gpuweb#1231

* wgsl: Fix example's builtin name (gpuweb#2530)

* wgsl: detailed semantics of integer division and remainder (gpuweb#1830)

* wgsl: detailed semantics of integer division and remainder

Adds definition for `truncate`

For divide by zero and signed integer division overflow, say they
produce a "dynamic error".

Fixes: gpuweb#1774

* Assume polyfill for the overflow case

This pins down the result for both division and remainder.

* Specify definite results for int division, % by zero

These are no longer "dynamic errors".

For integer division:   e1 / 0 = e1

For integers,           e1 % 0 = 0

The % case is somewhat arbitrary, but it makes this true:
    e1 = (e1 / 0) + (e1 % 0)

Another way of formulating the signed integer cases is to forcibly
use a divisor of 1 in the edge cases:
     where MININT = most negative value in |T|
     where Divisor = select(e2, 1, (e2==0) | ((e1 == MININT) & (e2 == -1)))
then
     "e1 / e2" = truncate(e1 / Divisor)
     "e1 % e2" = e1 - truncate(e1/Divisor) * Divisor

The unsigned integer case is similar but doesn't have (MININT,-1) case.

* Add override declarations (gpuweb#2404)

* Add override declarations

* Refactor `var` and `let` section
  * have a general value declaration subsection
  * subsections on values for let and override
  * move override requirements from module constants to override
    declarations
* introduce override keyword
* remove override attribute and add id attribute
  * literal parameter is now required
* Update many references in the spec to be clearer about a let
  declaration vs an override declaration
* update examples
* Make handling of `offset` parameter for texture builtins consistent
  * always either const_expression or module scope let

* Changes for review

* combine grammar rules
* refactor validation rules for overrides
* fix typos

* add todo for creation-time constant

* fix example

* combine grammar rules

* Rename storage class into address space (gpuweb#2524)

* Rename storage class into address space

* Davids review findings, plus renaming of |SC|

* Add an explainer for Adapter Identifiers to facilitate further design discussion.

* Explain smoothStep better (gpuweb#2534)

- Use more suggestive formal parameter names
- Give the formula at the function definition, not just at the
  error bounds explanation.

* Fix clamp arity in error bounds section (gpuweb#2533)

Also at the definitions, use more suggestive formal parameter names (e,low,high)
instead of the less readable (e1,e2,e3)

* Use consistent capitalisation in section titles (gpuweb#2544)

* remove some unnecessary `dfn` tags

* Remove unnecessary todos (gpuweb#2543)

* Defer API linkage issues to the API spec
* remove issues and todos that are covered in the API
* remove todo about array syntax

Co-authored-by: David Neto <dneto@google.com>

* Add GPUTextureDescriptor viewFormats list (gpuweb#2540)

* Add GPUTextureDescriptor viewFormats list

Initially allows only srgb formats; further rules for format
compatibility can follow.

Issue: gpuweb#168
CC: gpuweb#2322

* note on canvas config

* Enforce presence of an initializer for module-scope let (gpuweb#2538)

* Enforce presence of an initializer for module-scope let

* Since pipeline-overridable constants were split from let declarations
  the grammar for let declarations can enforce the presence of an
  initializer
* Remove global_const_intiailizer since it was only used in for a single
  grammar production (module-scope let) and only had a single grammar
  itself
* Update extract_grammar to initialize type declarations with zero-value
  expressions

* fix typos

* Make depth/stencil LoadOp and StoreOp optional again

This change was landed as part of gpuweb#2387 but was then accidentally
reverted when gpuweb#2386 landed out of order.

* Add the uniformity analysis to the WGSL spec (gpuweb#1571)

* Add the uniformity analysis to the WGSL spec

Make the information computed more explicit per Corentin's suggestion

Add uniformity of builtins, limit the singleton rule to {Next}, do some minor cleanup

Make the typography more uniform and hopefully less confusing

Add rule for switch, and simplify rule for if

Clarify the role of CF_start, and remove two instances of the word 'simply'

Remove TODO and allow accesses to read-only global variables to be uniform

Mark functions that use implicit derivatives as ReturnValueCannotBeUniform

* s/#builtin-variables/#builtin-values/ after rebasing

* Add (trivial) rules for let/var, as suggested by @alan-baker and @dneto

* Add rules for non-shortcircuiting operators, as suggested by @alan-baker and @dneto

* Use the rowspan attribute to simplify the tables in the uniformity section

* Fix syntax of statement sequencing/blocks in the uniformity rules, following an earlier fix to the behavior analysis.

* s/adressing/addressing/, as suggested by @alan-baker in an earlier review.

* Clarify 'local variable'

* Deal with non-reconvergence at the end of functions

* s/global/module-scope/, s/built-in variable/built-in value/, and mention let-declarations

* Address the last issues found by @dneto0

* CannotBeUniform -> MayBeNonUniform

* Apply Dzmitry's suggestions

* s/MustBeUniform/RequiredToBeUniform/g

Co-authored-by: Robin Morisset <rmorisset@apple.com>

* Vectors consist of components (gpuweb#2552)

* Vectors consist of components

* Update index.bs

* Make depth/stencil LoadOp and StoreOp optional again (pt.2)

This change was landed as part of gpuweb#2387 but was then accidentally
reverted when gpuweb#2386 landed out of order.

* WGSL: Replace [SHORTNAME] with WGSL (gpuweb#2564)

Fixes gpuweb#1589

* Fix step() logic (gpuweb#2566)

* Relax vertex stride requirements (gpuweb#2554)

* s/endPass/end/ for pass encoders (gpuweb#2560)

Fixes gpuweb#2555

* Fix canvas resizing example

* Rework "validating texture copy range" for 1D/3D textures. (gpuweb#2548)

That algorithm special cased 1D and 2D textures, making empty copies
valid for 2D and not for 1D. 3D textures were just not discussed.

Fix this by just checking that the copy fits in the subresource size,
and also turn "validating texture copy range" into an algorithm with
arguments.

Co-authored-by: Dzmitry Malyshau <kvark@fastmail.com>

* Add optional trailing comma for the attribute syntax. (gpuweb#2563)

All places that use a variable number of comma-separated things now
support trailing commas. However things with a fixed number of
comma-separated arguments don't. They are:

 - array_type_decl
 - texture_sampler_types
 - type_decl
 - variable_qualifier

Fixes gpuweb#1243

* wgsl: reserve `demote` and `demote_to_helper` (gpuweb#2579)

* if,switch param does not require parentheses (gpuweb#2585)

Fixes: gpuweb#2575

* Add while loop (gpuweb#2590)

Update behaviour analysis and uniformity analysis.

Fixes: gpuweb#2578

* Allow sparse color attachments and ignored FS outputs (gpuweb#2562)

* Allow sparse color attachments and ignored FS outputs

Fixes gpuweb#1250
Fixes gpuweb#2060

* Update pipeline matching rules

Co-authored-by: Dzmitry Malyshau <kvark@fastmail.com>

* Render -- as is (gpuweb#2576)

* Render -- as is

* Use backticks

* Use backticks for plus plus too

* Better separate Security and Privacy sections (gpuweb#2592)

* Better separate Security and Privacy sections

They were largely already separate but the header levels were a
bit confusing so this CL normalizes them and renames the sections
to "Security Considerations" and "Privacy Considerations" as
requested by the W3C security review guidelines.

Also expands the privacy section with a brief header, another
mention of driver bugs as a potentially identifying factor, and
a note indicating that discussions about adapter identifiers are
ongoing.

* Simplify adapter info privacy considerations note.

* Remove SPIR-V mappings (gpuweb#2594)

* Remove most references to SPIR-V opcodes and types in the
  specification
* References remain transitively in the Memory Model section as it is
  necessary for the specification
* Removed goal section as they only described SPIR-V

* Complete the Errors & Debugging section

* Addressed feedback

* Update spec/index.bs

Co-authored-by: Kai Ninomiya <kainino@chromium.org>

* Update spec/index.bs

Co-authored-by: Kai Ninomiya <kainino@chromium.org>

* Make firing the unhandlederror event optional in the algorithm

* Refactored algorithms for more sensible names.

* Fix typo in "validating texture copy range" argument (gpuweb#2596)

* Fix a typo in a note "applicaitions". (gpuweb#2602)

* Disallow renderable 3D textures (gpuweb#2603)

* `createSampler` creates a `GPUSampler` (gpuweb#2604)

Correct the link that errantly pointed to `GPUBindGroupLayout`.

* Extend lifetime of GPUExternalTexture imported from video element (gpuweb#2302)

* Extend lifetime of GPUExternalTexture imported from video element

Original PR: gpuweb#1666
Discussion: gpuweb#2124

* adjust lifetime, add flag

* editorial

* wgsl: reserve 'std', 'wgsl' (gpuweb#2606)

Fixes: gpuweb#2591

* wgsl: Fix typos (gpuweb#2610)

`signficant` -> `significant`
`consectuive` -> `consecutive`

* wgsl: Rename `firstBitHigh` and `firstBitLow`

The current names are confusing, as `High` or `Low` may refer to scanning from the MSB or LSB, or that it is scanning for the bits `1` or `0`.

By renaming to `firstLeadingBit` and `firstTrailingBit` the ambiguity is reduced, and we have a consistent terminology with `countLeadingZeros` / `countTrailingZeros`.

* Update override examples (gpuweb#2614)

Fixes gpuweb#2613

* Update WGSL syntax for overridable constants

* Fix a typo in FP32 internal layout (gpuweb#2615)

* Fix a typo in FP32 internal layout

In the internal layout of float32, Bits 0 through 6 of byte k+2 contain
bits 16 through 22 of the fraction, which has a total of 23 bits.

* remove duplicated "bit"

* WGSL style guide: in progress extensions developed outside main spec (gpuweb#2616)

Fixes: gpuweb#2608

* Clarify when a built-in function name can be redefined (gpuweb#2621)

* wgsl: Cleanup internal layout of matrix type (gpuweb#2624)

* Cleanup internal layout of matrix type

Use alignOf() to cleanup the description of internal layout of matrix type.

* Use "i x AlignOf()" instead of "AlignOf() x i"

* [editorial] Fix uniformity table widths (gpuweb#2623)

* Reduce table width by allowing more line breaks
* Make op consistently italicized

* Add break-if as optional at end of continuing (gpuweb#2618)

* Add break-if as optional at end of continuing

A break-if can only appear as the last statement in a continuing
clause.

Simplifies the rule about where a bare 'break' can occur: It
must not be placed such that it would exit a continuing clause.

Fixes: gpuweb#1867

Also refactor the grammar to make:
  continuing_compound_statement
  case_compound_statement
These are called out as special forms of compound statement, so
that the scope rule of declarations within a compound statement
also clearly apply to them.

* Add statement behaviour of break-if

The expression always includes {Next}.
When the expression is true, the {Break} behaviour is invoked.
Otherwise, the {Next} behaviour is invoked.

So it's   B - {Next} + {Next, Break}
or   B + {Break}

* Add uniformity analysis for break-if

* Apply review feedback

- Tighten the wording about where control is transferred for break and
  break-if
- Allow "break" to be used in a while loop.

* Avoid double-avoid

* wgsl: Reserve words from common programming languages (gpuweb#2617)

* wgsl: Reserve words from common programming languages

Reserves words from C++, Rust, ECMAScript, and Smalltalk

Add a script to automatically generate the contents of the _reserved
grammar rule.

Update extract-grammar.py to strip HTML comments before processing.

* List WGSL as a reason for a keyword reservation

* Reserve 'null'

* Reserve keywords from GLSL 4.6 and HLSL

* Use ECMAScript 2022 instead of ECMAScript 5.1

* Reserve HLSL keywords

* Add acosh, asinh, atanh builtin functions (gpuweb#2581)

* Add acosh, asinh, atanh builtin functions

Use the polyfills from Vulkan.

Fixes: gpuweb#1622

* Result is 0 in regions that make no mathematical sense: acosh, atanh

* wgsl: Cleanup the internal memory layout of vector using SizeOf (gpuweb#2626)

* Cleanup the internal memory layout of vector using SizeOf

Describe the internal memory layout of vector types vecN<T> with
SizeOf(T) rather than literal number.

* Fix typo

* Fix typos of accuracy of exp and exp2 (gpuweb#2634)

Fix the accuracy requirement of exp and exp2 to 3 + 2 * abs(x) ULP.

* Reland: Only allow depth/stencil load/store ops when they have an effect

* Validate query index overwrite in timestampWrites of render pass (gpuweb#2627)

Vulkan requires the query set must be reset between uses and the reset
command must be called outside render pass, which makes it impossible to
overwrite a query index in same query set in a render pass, but we can
do that in different query set or different render pass.

* Add definitions for uniformity terms (gpuweb#2638)

* add definitions (and link back to them) for:
  * uniform control flow
  * uniform value
  * uniform variable
* define the scope of uniform control flow for different shader stages

* Allow statically unreachable code (gpuweb#2622)

* Allow statically unreachable code

Fix gpuweb#2378

* modify behavior analysis to allow statically unreachable code
  * unreachable code does not contribute to behaviors
* modify uniformity analysis to not analyze unreachable code
  * unreachable statements are not added to the uniformity graph

* Improve examples

* Clarify when sequential statement behaviour leads to a different
  behaviour from that of the individual statement
* improve example comment formatting to reduce possible horizontal
  scrolling

* Name in enable directive can be a keyword or reserved word (gpuweb#2650)

Fixes: gpuweb#2649

Also simplify description of where an enable directive can appear.
They must appear before any declaration.

* GPUDevice.createBuffer() requires a valid GPUBufferDescriptor (gpuweb#2643)

* Typo in definition of finish()

* Allow unmasked adapter info fields to be requested individually.

* Update design/AdapterIdentifiers.md

Co-authored-by: Kai Ninomiya <kainino@chromium.org>

* Explicitly note that declined consent rejects the promise

* Uses commas to separate struct members intead of semicolons (gpuweb#2656)

Fixes gpuweb#2587

* Change the separator for struct members from semicolons to commas
  * Comma is optional after the last member
* Changes the grammar to require one or more struct members
  * Already required by the prose of the spec

* [editorial] Compress expression tables (gpuweb#2658)

* [editorial] Compress expression tables

* Combine arithmetic and comparison expression table entries for
  integral and floating-point entries
  * since SPIR-V mappings were removed there is no strong need to have
    separate entries

* improved wording

* make online should die on everything (gpuweb#2644)

* Commiting GPUCommandBuffers to the wrong GPUDevice is an error (gpuweb#2666)

Eventually we'll want to add more logic to make sure that command buffers are only valid on
the queue they're created from. Right now, though, every device just has exactly one queue,
so matching devices is the same as matching queues.

* Mipmap filtering might be extended separately from min/mag filtering in the future

* Add a way of setting the initial label for the default queue

* add rAF/rVFC examples

* [Process] Add RequirementsForAdditionalFunctionality.md

* Addressing Kai and Dzmitry's comments

* GPUSamplerDescriptor.maxAnisotropy gets clamped to a platform-specific maximum (gpuweb#2670)

Co-authored
@Kangz
Contributor

Kangz commented Mar 21, 2022

This landed in the spec, so tentatively closing this issue.

@Kangz Kangz closed this as completed Mar 21, 2022
ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this issue Sep 6, 2022