Minutes 2021 07 19

GPU Web 2021-07-19

Chair: Corentin

Scribe: Ken Russell

Location: Google Meet

Tentative agenda

Administrivia
CTS update
Should the API enforce that writable storage buffer bindings don't alias each other? #1842
Reading from MSAA depth textures #1924
- 2nd, 3rd, and 4th components of sampled depth/stencil textures are undefined in Metal #1198
Allow render pipelines to have color targets not written to by fragment outputs #1900
Remove the read-only texture storage bindings #1794 (naming issue?)
Add opaque region or isOpaque hint #1871
Agenda for next meeting

Attendance

Apple
- Myles C. Maxfield
Google
- Austin Eng
- Ben Clayton
- Brandon Jones
- Kai Ninomiya
- Ken Russell
Microsoft
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Francois Guthmann
Michael Shannon
Mehmet Oguz Derin
Rick Battagline

Administrivia

CW: So VF2F mid october-ish and see about the details early september.
Brandon made a logo a while back, discussed multiple times - can we agree on using it?
- https://github.com/gpuweb/admin/issues/14
- The Font's "Poppins"
- MM: think we could choose a better font…
- BJ: easy to swap out the font. The logo with the triangles is more important to get sign-off on.
A 3D version of the logo's been used in Khronos' joint WebGL and WebGPU meetups:
- https://www.khronos.org/events/webgl-webgpu-meetup-2021-07
CW: So go with it and change stuff for v1 if we feel strongly?
MM: Dark mode variant?
- BJ: Blue shows well on dark, but might need to swap out font color.
- BJ: Feel logo can be adapted to many colors.
CW: let's go with it for now, make changes later if needed.

CTS update

KN: landing lots of CTS improvements. Gaining coverage, improving some existing tests. Making decent progress in past couple of weeks. Lots to do. Contributions are very very appreciated.
KR: how can people most easily sign up to do work?
KN: please send me a message.

Should the API enforce that writable storage buffer bindings don't alias each other? #1842

MM: didn't realize this when we discussed this before, but Metal spec says don't do it - there are more subtle rules - configurations of buffers that are forbidden by the spec.
MM: if we wanted to dictate that every implementation must use argument buffers, we don't need this limitation - but thought Chrome impl didn't use argument buffers.
CW: that's correct now; could be changed though.
MM: we intend to use argument buffers unconditionally. If group thinks that's the right way to go, we could do that. But arguments I made earlier about understandability by the author, and performance - still apply.
DM: our impl doesn't use arg buffers yet. Want to roll it out optionally first, make sure it works correctly. Discussing last time, this kind of UB is different from regular unsynchronized storage access. That's where we stopped the discussion.
MM: I wrote an example in the thread a few days ago. One is unsync access by multiple threads, the other a single thread with this difference. Per spec, these are both UB. Realistically, top one would have mor divergent behavior than the bottom one.
CW: right now we're saying it's UB, but it's really undefined memory op at a specific location. Can't just draw kittens all over your page. Can assume certain things about how hardware works at bit/byte level. Author should know they're going into data race; sort of the point of storage buffers.
MM: there's nothing we can do to make the top example safe. VM, run on the CPU? Can't at compile time figure out there'll be a data race. Symptom of problem vs. its solution are different in these two cases. Second case, there is something we can do; I think we should start with the assumption that we're safe, and dial it back if needed.
CW: I'm worried that here it's not a matter of being safe, but being portable. Undefined reads/writes - small subset of all UB. Think we can say it's safe w.r.t. writing out of bounds.
MM: top example is not safe. Value in buffer can be a mixture of two values.
KN: are you saying this insecure, and we can't allow code to do this?
MM: no. Trying to draw distinction between first and second case. One's behavior, the other programmer expectation. Reasonable for us to have 2 different policies.
KN: the first isn't insecure, right?
MM: technically it is - but nothing we can do to make it secure.
JG: if we don't disagree on the actions we have to take then we don't have to argue about secure/insecure.
CW: makes sense. The second one I have a concern so as to not break developer expectations - we'll forbid some things and add hefty draw call validation, to avoid races.
MM: it's single threaded - not a race.
CW: ok.
MM: also we haven't figured out how expensive the validation is. I say, we start with the validation in the spec, then someone measures it, and if it's too expensive we re-discuss it.
JG: from your comment 6 days ago - they don't reflect what I thought this discussion was about. If you have storage buffer binding, and pass it in as 2 values to the shader. We punt to the API side.
MM: in second example, consider "bar" to be the entry point rather than "baz".
JG: except that you need 2 storage buffers.
CW: it has 2 arguments; you pass the same storage buffer to both. Yes, two declarations.
JG: feels materially different from these two. Do we care about these threads racing? Or shader variable aliasing? Or binding the same subrange of a resource to two storage buffers?
MM: think the second and third examples you mentioned are identical - philosophical problem's the same for both.
JG: not sure. Same reference in shading language to same storage buffer, then split it apart and make compiler deal with aliases. Alternative is like mmap'ing the same file into 2 different pointers.
KN: this aliasing in the sample is local to the shader. On API side you can read entire shader and not know it's aliased.
MM: second example's already forbidden by SPIR-V. Should we have same validation for buffer bindings as we do for helper function arguments?
CW: cost of validation's at a different place. Also less flexibility.
CW: I hear what you're saying - we could just add the validation and measure the cost. Concern: it'll be a while before we have content.
MM: I don't want perfect to be enemy of good. Synthetic test's reasonable. Just want some number we can point to. "This number says we can't do the validation".
DM: I could allocate a hash map per draw call, or multiple hash maps for resources. Have to make sure it has a sufficient level of optimization before we can derive results from it.
MM: right, an impl couldn't make this part slow just to show it's slow.
CW: that would be bad faith. DM's point is we have to do our normal optimizations on this before measuring. We can't do that any time before the origin trial. Would be a week or more of work.
MM: don't think the OT is a particularly important deadline for this bug; shipping is.
DM: we almost agreed on the idea that we don't want people to do this, but also don't want to invest in it right now. Said we should put it in the spec as forbidden, and later, measure it and decide whether to keep or remove the normative text.
MM: default would probably be to keep it, in the absence of data. Bias ourselves toward being portable.
CW: this ties into the next topic as well. We can do a lot of things for portability / flexibility in the API. Guesswork as to which pieces of flexibility are worth keeping / removing.
JG: I expect our impl, even if allowed to do this, would generate warning when passing in aliased ranges. Debug warning.
MM: also more recent part of discussion. If valid WebGPU impl on Metal can not use argument buffers, need more text here.
MM: Metal spec says: resources as argument values to graphics or kernel function cannot alias.
MM: so if we want to say it's legal for a web author to bind one resource to 2 bind points in shader, we'd also be saying every webgpu impl must be using argument buffers. Reasonable, but not great.
KR: concerned about mandating this validation - could be very expensive.
CW: maybe if we had Order Independent Transparency demo and massive amounts of draw calls...it's hard to measure. If we need to, we can put it in the spec, and it can be on the cutting block if we realize it's not needed.
MM: that's acceptable path forward to me.
KN: have the validation be on the cutting block?
MM: yes.
KN: I'm more interested in saying, you can't do this, and put it on the shoulders of the impl to either use arg buffers or find a way to not issue this draw call. Issue warning, lose device, etc.
CW: or pretend it's not an issue on Metal.
KR: ok, let's say this is not allowed in the spec, and file an issue to revisit this later.
KN: maybe in an issue box, say it's not validated. No validation error associated with it for now though.
CW: essentially wait until we can get more representative data.

Reading from MSAA depth textures #1924

CW: we don't have a way to read MSAA depth in WGSL right now. Can only read texture depth through textureDepth intrinsics. In Dawn, we allow using depth with texture2D (?). Seems to work?
MM: it's forbidden by the spec. Apps can not rely on it.
CW: is there a Metal version where this is guaranteed to work?
MM: not sure.
CW: think this is more scary. If Metal says bind point's not supported then we should probably not support it.
MM: why can't we have depth multisample type in the shader?
CW: we can, just don't have it right now.
MM: seems like natural solution.
JG: if we can't fit it into the existing types.
MM: Metal spec says you can't.
CW: ok, so add textureMultisampleDepth or textureDepthMultisample.
MM: shading language question? Discuss tomorrow?
JG: yes. Any API side changes?
CW: only shading language.
MM: do we need bind point so you can't bind single-sample depth texture to multi-sample depth texture type?
JG: we'll do it in the shading language first.
CW: gpu texture sampling type has multisampled as boolean, and texture type.
MM: so no API change?
JG: no.

2nd, 3rd, and 4th components of sampled depth/stencil textures are undefined in Metal #1198

KR: In WebGL we ran into this. Spec'd you should only rely on the red component. It's been fine.
MM: :(
JG: not great, not terrible.
CW: can we make feature requests to Apple?
MM: I already have.
CW: something we can tighten up over time. WebGL had some influence over the graphics ecosystem, e.g. robustness. Hoping this group can make things more consistent over time.
MM: what's a use case of binding the stencil texture as sampleable resource?
CW: seen lot of presentations about Uncharted, or another fancy AAA game, doing coarse rasterization using MSAA, and using stencil using readbacks.
CW: also lot of reading stencil - lot of games use stencil to tag different kinds of objects for TAA. Read back stencil during TAA since it's bundled with depth.
MM: if you bundle it with depth, in Metal, it's inaccessible to shader. Sample depth/stencil, but only get depth component.
CW: can't you create a view from it?
MM: yes.
CW: in WebGPU to be able to sample stencil, either ...STENCIL8 format in which case we can do Metal texture viewing stuff, or you have to select stencil aspect when you create the view.
MM: ok, not going to stop this, just reflecting displeasure.

Allow render pipelines to have color targets not written to by fragment outputs #1900

AE: currently spec requires fragment shader to have fragment output for every color render target of pipeline. Overall, if you want shader that only wrote to one color target, difficult to reuse that shader in multiple render passes / pipelines if # of render pass attachments didn't match. If you wanted to have shader not write some fragment outputs, could say: it has these outputs, then set writemask for those to nothing. But better if impl could validate: if not writing to those, must have writemask of nothing, so you can have shaders that don't have to write to all the attachments.
MM: set of outputs that the shader outputs is less than # attachments of render pass? Currently illegal, want to make it legal?
AE: yes.
MM: so if there's RB attached to pass, shader doesn't write to it, writemask must be 0?
AE: yes.
JG: this is what we have for WebGL.
KN: question about discard. Then you aren't writing to any attachment. Can dynamically decide this. Different than statically deciding?
CW: yes. Can have a draw decide not to write, while others do write. Garbage written? No data written? Nice thing about Austin's solution: if shader says I'm setting writemask to 0, it is allowed to not output this attachment.
MM: did you try this in all backend APIs?
(in WebGL: https://www.khronos.org/registry/webgl/specs/latest/2.0/#5.4 )
AE: yes, it works.
CW: including the validation layers.
DM: even if it doesn't work, you can inject the proper shader code with dummy output.
MM: you can't. You don't know what render pass looks like at shader compilation time.
DM: it's compatibility. You match shader to pipeline.
MM: unless you're saying pipeline layout contains render target information - which I don't think it does…
DM: when creating pipeline today. have to spec all attachments as you'll use with the render pass. What's contested: now your shader code can only emit output to color target. You put writemask of 0 in your descriptor. Doesn't matter whether native API lets you do this. We can inject shader code at pipeline creation time.
MM: that assumes you're compiling shader at pipeline creation time.
CW: AE tested this works on every backend, and with Metal Debug devices, D3D12 / Vulkan validation.
MM: and that was without adding more code to the shader?
AE. In Metal you can clear out the colorWriteMask (?) to 0. Don't have to modify the shader. On D3D12 the pipeline state - it's already a fixed array of 8 targets. You don't set the size of it.
KN: so you reflect on the shader to see what it does? And at pipeline creation you set the colorWriteMask, etc. for the things not written by the shader?
AE: the validation sets the writemask to 0. Could consider changing it to match what user spec'd.
DM: only talking about validating colorWriteMask against the shader.
KN: either seems fine.
MM: OK.
AE: if you have feedback from the Metal team that this isn't OK, let me know.
MM: on tile-based deferred renderers, output attachments refer to pieces of tile memory. When shader's done running, flush tile memory to main memory. Writemask determines that second step. Don't expect problem there, but haven't talked about it with Metal team.
KN: our colorMask's in the pipeline? not the render pass?
DM: correct.
JG: haven't spec'd pipeline / renderpass compatibility yet?
CW: it's clear what the rule is. Same set of attachments, etc.
JG: that's not what the Vulkan spec says.
DM: Vk has explicit renderpass object. We don't. But what we have is already spec'd.
KN: thought so, but don't remember where.
JG: what happens if list of attachments for pipeline is shorter than, but otherwise compatible with, different list of attachments?
DM: in Vk, provide renderpass to pipeline creation. Our impl has to know about renderpass. If you don't have enough attachments, won't know renderpass to use it with.
CW: rule isn't spec'd right now, but pipeline/renderpass compat should be really clear.
JG: if it's not spec'd, it's not clear to me. Should spec it. Blocker for release.
CW: understood. But wgpu and Dawn should already be compatible here. Should spec it.
CW: going back to previous issue; it's only during render pipeline compilation. Know the set of formats/attachments used in renderpass when creating pipeline. Can validate shader / colorWriteMask against attachments.
JG: think you're right, but since we haven't written the spec for the compatibility, hard for me to reason about it.
MM: can we say that if Corentin's statement is true, we agree about this colorMask issue, so we just have to write the spec?
CW: could do that. Think we can make the PR and resolve this offline. At this point, fairly confident we'll be able to move forward. Let's just tighten the spec.
DM: whether spec'd or not, we know how it'll work in Vulkan today. It'll be spec'd this way because we have no other way to implement it in Vk or Metal.
JG: ok, I just can't agree to this until it's written down and spec'd.
MM: no reason to not do what Jeff's asking for.

Remove the read-only texture storage bindings #1794 (naming issue?)

Actually now https://github.com/gpuweb/gpuweb/pull/1963

Add opaque region or isOpaque hint #1871

Agenda for next meeting

Read-only texture storage bindings: Actually now https://github.com/gpuweb/gpuweb/pull/1963
opaque / isOpaque hint #1871
- DM: thought we could resolve that in the editors' meeting?
- CW: either way…
- KR: discussed earlier in https://github.com/gpuweb/gpuweb/issues/1425 , editors should feel free to move forward with it.
#1845. New constraint in macOS 11.
- Should the alignment of destinationOffset in resolveQuerySet be 256 on Metal?
- MM: think there's a larger project here. Have to validate, but counters as currently spec'd in WebGPU are unimplementable on M1. That's a problem. Have to verify, but if it's true, have to figure out a proposal.
https://github.com/gpuweb/gpuweb/pull/1978

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2021 07 19

GPU Web 2021-07-19

Tentative agenda

Attendance

Administrivia

CTS update

Should the API enforce that writable storage buffer bindings don't alias each other? #1842

Reading from MSAA depth textures #1924

2nd, 3rd, and 4th components of sampled depth/stencil textures are undefined in Metal #1198

Allow render pipelines to have color targets not written to by fragment outputs #1900

Remove the read-only texture storage bindings #1794 (naming issue?)

Add opaque region or isOpaque hint #1871

Agenda for next meeting

Clone this wiki locally