
GPU Web 2023 11 08

Kai Ninomiya edited this page Mar 26, 2024 · 2 revisions

GPU Web 2023-11-08 Atlantic-time

Note that unless stated otherwise this is a GPU for the Web community group meeting and not a working group meeting.

Chair: CW

Scribe: KR

Location: Google Meet

Tentative agenda

  • Administrivia
  • CTS Update
  • “Tacit Resolution” Queue
  • WebGPU Compatibility Mode (remaining issues) #4266
    • [PR] Add the mip level and array layer consensus proposals. #4365
    • 7. Disallow textureLoad() of depth textures in WGSL via validation.
    • 8. Disallow texture*() of a texture_depth_2d_array with an offset
    • 10. Disallow bgra8unorm-srgb textures.
    • Anisotropic filtering might not be available (requires GL_EXT_texture_filter_anisotropic).
    • Read-only depth stencil being ok!
    • [Teo] GPUDepthStencilState.depthBiasClamp must be 0.
  • Support for arrays of textures #822
  • Problem measuring time deltas with writeTimestamp #3952
  • Clarification regarding the validation of GPUQueue.submit and submitting duplicate command buffers? #4367
  • what happens if vertex w position is zero, or negative? #4129
  • Default entryPoints to shader modules #4342 (Corentin is very excited about this tiny addition)
  • Agenda for next meeting

Attendance

  • Apple
    • Mike Wyrzykowski
  • Google
    • Ben Clayton
    • Corentin Wallez
    • Kai Ninomiya
    • Ken Russell
    • Stephen White
  • Microsoft
    • Rafael Cintron
  • Mozilla
    • Jim Blandy
    • Kelsey Gilbert
    • Teodor Tanasoaia
  • Mehmet Oguz Derin
  • Rajesh Malviya

Administrivia

  • Please review subgroup proposal https://github.com/gpuweb/gpuweb/pull/4368
    • Pretty large, has portability challenges. Functionality that's been requested vocally by many people; anything we can integrate will be very useful to developers.
  • Call for spec editors?
    • Or, nominate directly. Will discuss with Kelsey.
  • Stage mechanism similar to TC 39?
    • JB: decisions we make here in committee would carry more weight in Google's process than they have in the past.
    • CW: Chrome's shipping process requires "positions" from other browser vendors. Any new thing going into WebGPU is new functionality from Chrome's standpoint, now. To smooth the process, we'd like to formalize our process a bit, so that changes that land in the spec indicate that all browsers are on board with it. Or, if they aren't, we'd add a prominent note saying so. Would like to formalize this so we don't need to spam other browser vendors with requests like "is it OK to ship RGB10A2"?
    • CW: if other browsers are on board, I'll propose stages - perhaps stage 1, an issue, stage 2, a proposal, stage 3, it's in the spec. Stage 2 can be skipped for smaller items.
    • KG: there's warning signs there that we need to sort out, but I hear you and totally understand.
    • CW: we've been holding off on this to avoid spamming folks, but need to move forward.
    • KG: we'll talk with our standards folks on the Firefox side.
    • CW: I can write a brief doc describing this.
    • KG: warning signs on my side: things land in the spec frequently that we're not super keen on. Landing in the spec means that Firefox isn't super negative, but am cautious about things sneaking in, or we didn't realize that something had a repercussion.
    • KN: not something we're looking to do retroactively. Want to change things so that when things land in the spec, folks are positive about it.
    • KG: need to check with our standards folks, but I don't think we can say that if something lands in the spec, we're positive about it.
    • KN: we can keep records of the conversations.
    • CW: we can at least say "neutral". There are things like compilation hints that are in the spec that Chrome doesn't plan to support. We're basically saying that if things are in the spec, all browsers plan to implement them.
    • KG: sounds like you're dealing with process that's perhaps unreasonable - I'll check our half of the circle to see who decides the policy.
    • CW: MW can you ask your folks too?
    • MW: yes, I'll circle back with our web standards people.

CTS Update

  • KN: fixing various things, more WebGPU Compat testing. Some work on fixing a few bugs that were causing flakes and slow tests which were causing flaky timeouts. Test suite should be more reliable. I did some work on the build; it should be faster now.

“Tacit Resolution” Queue

  • Disallow requesting alignment limits that are ≥ 2**32 #4358
    • Because of Web IDL rules
  • KN: looks like everyone except Mozilla looked at this. Now, spec says - you can ask for 2^33 alignment and we'll say OK. Annoying to implement because it's a uint32 internally (and that type's used elsewhere in the spec). Just saying that alignment limit has to be 32-bit integer.
  • JB: just marked this as OK with Mozilla too.
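
A minimal sketch of the rule discussed above, assuming a hypothetical helper `isValidAlignmentRequest` (not a WebGPU API). It only models the "must fit in a 32-bit unsigned integer" part; any other checks the spec applies to alignment limits are out of scope here.

```typescript
// Hypothetical helper, not part of the WebGPU API: a requested alignment
// limit is only acceptable if it fits in an unsigned 32-bit integer.
const UINT32_RANGE = 2 ** 32;

function isValidAlignmentRequest(requested: number): boolean {
  return (
    Number.isInteger(requested) &&
    requested >= 0 &&
    requested < UINT32_RANGE
  );
}
```

Under this sketch a request for 2 ** 33 is rejected up front instead of being silently accepted.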

WebGPU Compatibility Mode (remaining issues) #4266

  • [PR] Add the mip level and array layer consensus proposals. #4365
    • SW: validation will happen at different times for mip levels and array layers because constraints are different.
    • SW: mip levels have to be done at draw time. Can have 2 draws with different mip levels bound. Just illegal when you have the same texture used twice in the same draw with different mip levels. Array layers can be validated at createBindGroup.
    • TT: makes sense.
    • CW: also can't be done at createView time. Might be making one layer views for rendering.
    • TT: was something I asked last time, people said we shouldn't do that.
    • SW: need to be able to make one layer views for rendering.
    • SW: side note, if you create texture with multiple layers as 2D texture - it's useless - you can only reference the first layer. May want to validate this out - don't bother creating a multi-layer 2D texture because you can't do anything with it.
    • Finish discussion on the PR.
    7. Disallow textureLoad() of depth textures in WGSL via validation.
    • SW: Doesn't work in GLES. There are some suggested workarounds. I didn't try them. At this point, we have so few exceptions to add that it's easier to add them now and relax them later. Any comments?
    • KG: texture sampling works?
    • SW: yes.
    • TT: that'd be the next item.
    • SW: the next one is depth 2D array textures.
    8. Disallow texture*() of a texture_depth_2d_array with an offset
    • SW: Think they shoved all the texture parameters in a vec3, but with x/y offsets you'd need a vec5. Would like to forbid this now, and relax it later if we find a way to.
    • TT: left a comment - seems textureGatherOffset exists, but not textureOffset. Also, texture LOD is not present at all.
    • SW: is textureLod possible by adding a bias parameter to the functions?
    • SW: sampler2DArrayShadow's missing these functions?
    • TT: yes.
    • SW: you may be right, sorry missed this comment, will come back about this.
    • TT: think there isn't much we can do about this here.
    • TT: think it's fine to disallow this.
    • MW: would need to look at this more later to decide.
    10. Disallow bgra8unorm-srgb textures.
    • SW: I don't have data on ChromeOS here yet. Android, preferred format is RGBA. Also can’t remember all the issues with this, vs with storage textures.
    • KN: think this format doesn't exist at all in GLES.
    • SW: correct. Could emulate with RGBA via swizzling on upload/readback.
    • TT: is this also about the non-sRGB format?
    • SW: yes, but the BGRA extension has 99% coverage. So the non-sRGB format is actually supported widely.
    • JB: the workaround where we implement buffer-to-texture and t-2-b copies in compute shader - how well does that perform compared to ordinary transfers? Perf cliff?
    • SW: we haven't benchmarked it, but because transfers are often so slow, we expect it to be the bottleneck. Much larger than the compute pass needed later.
    • JB: our general concern is - does the workaround introduce surprising perf characteristics.
    • SW: would require double the VRAM temporarily.
    • JB: yes, that's more of a concern. Do the transfer to a temporary internal buffer that the user didn't ask for.
    • SW: fair point. Don't think there's anything we can do here that the user can't do themselves.
    • JB: agree. Our audience is generally people writing frameworks/libraries, and they're already adapting to preserve their APIs.
    • SW: agree - the one concern I have is there's no way to do a BGRA storage texture in GLES. If you had BGRA and that was your preferred framebuffer format, might be surprising to developers if you can't do it.
    • KG:
    • SW: more about expectations. RGBA sRGB's supported, would be surprising BGRA sRGB isn't.
    • KN: not too worried about that.
    • SW: sounds like people aren't happy with the workaround and would prefer to forbid it.
    • JB: yes. And we could always emulate it later if we find we need to.
    • MW: I'm OK if we disallow this.
  • Anisotropic filtering might not be available.
    • SW: need an extension to support max anisotropy level. There's a CTS test which fails if you don't set this parameter. 40% of GLES devices seem to have it.
    • KN: spec's not prescriptive about this because specs don't say much about how anisotropy works. Technically valid in Vk spec to ignore this option. Vk spec implies you're supposed to do something with it but doesn't provide guarantees. The CTS test is aspirational.
    • CW: reason why anisotropic filtering doesn't have guarantees is because of patent issues. The patent just expired. This is also why WebGPU says we have anisotropic filtering, but no promises. Any behavior's OK here, but don't need to ask developers to set 0 or 1.
    • TT: our current spec says it'll be clamped to the implementation's maximum, so think this should be fine. And if the extension's present, it'll work.
    • SW: sounds good. Maybe our GL impl wasn't clamping it.
    • KN: think the CTS test isn't strict to the spec. Testing things it shouldn't.
    • SW: so consensus is, it's OK to treat all maximums as 1.
    • KN: yes. Test can be aspirational in core and we can skip it in Compat.
    • MW: OK with us.
  • Read-only depth stencil being not ok!
    • SW: Corentin implemented this in our GL backend and it works. Not sure if it's just luck. Seems to require a feedback loop in the renderer.
    • CW: we tested the GLES backend on a bunch of desktop cards. Teo points out that this is undefined behavior in the GLES spec. wgpu had issues both with export to WebGL and on some old mobile devices, where separate read-only depth-stencil didn't work correctly through GLES. Disallow for now, and if we find it works everywhere, re-allow it.
    • TT: sounds good.
    • MW: I'm OK with this as well.
    • KG: you say there's a rendering feedback loop, but it seems to work?
    • CW: WebGPU says there's no feedback because it treats each aspect individually. The GLES spec doesn't. Technically there's rendering feedback happening.
    • KG: what do the drivers say?
    • KN: we don't know but there was a wgpu bug report about this not working on some devices.
    • KG: I'd guess it would work on some devices and not others, so you shouldn't have the functionality.
    • Consensus - forbid this.
  • [Teo] GPUDepthStencilState.depthBiasClamp must be 0.
    • SW: to support this you need the clamped flavor of polygon offset. Only 40% of devices. Might need to force this to only accept 0.
    • KR: could add a WebGPU feature for this. This just showed up as a WebGL extension. Shouldn't be a problem to validate this to be 0 and add a feature for it later.
    • TT: low support on GLES, about 7% in GPUinfo data.
    • CW: conclusion is clear, require that this must be 0 for now. Conveniently, that's the default value. :)
    • Consensus - validate this to be 0.
  • SW: note, there are other things already validated out in Compat that we could add back in as features later. (Like sample mask, which shows up in GLES 3.2.)
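
The Compat restrictions that reached consensus above (no bgra8unorm-srgb, depthBiasClamp fixed at 0, anisotropy clamped to whatever the implementation supports) could be sketched roughly as follows. This is an illustration only: `CompatDescriptor`, `compatValidationErrors`, and `effectiveMaxAnisotropy` are hypothetical names, not WebGPU API, and the actual rules are whatever lands in the Compat proposal.

```typescript
// Hypothetical sketch; none of these names are part of the WebGPU API.
interface CompatDescriptor {
  textureFormat?: string;  // e.g. "rgba8unorm-srgb"
  depthBiasClamp?: number; // mirrors GPUDepthStencilState.depthBiasClamp
}

// Collects validation errors for the restrictions agreed in this meeting.
function compatValidationErrors(desc: CompatDescriptor): string[] {
  const errors: string[] = [];
  if (desc.textureFormat === "bgra8unorm-srgb") {
    errors.push("bgra8unorm-srgb textures are disallowed in Compat");
  }
  // 0 is the default value, so omitting the field passes.
  if ((desc.depthBiasClamp ?? 0) !== 0) {
    errors.push("depthBiasClamp must be 0 in Compat");
  }
  return errors;
}

// Clamps the requested maxAnisotropy to the implementation maximum.
// An implementation without the GLES extension would pass 1 here.
function effectiveMaxAnisotropy(
  requested: number,
  implementationMax: number,
): number {
  return Math.min(requested, implementationMax);
}
```

Because the anisotropy value is clamped rather than validated, content written for core WebGPU keeps working on devices that lack the extension; only the filtering quality differs.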

Support for arrays of textures #822

  • Skipped during this meeting.

Problem measuring time deltas with writeTimestamp #3952

  • CW: issue was reordering on Metal between blit encoders / render passes. Do we need to do anything about this? Or is this something that'll be fixed in Metal, or WebGPU impls?
  • JB: does Metal agree this is a bug?
  • CW: think the bug is - if you do blit encoder A with write timestamp, and render pass B with timestamp write, then blit encoder with writeTimestamp, there's no correlation between the write timestamps. Hardware's allowed to reorder command streams. Devs surprised, because they try to measure a bunch of render passes and it doesn't work.
  • MW: think this is allowed per Metal spec. But from WebGPU's side I understand the issue.
  • JB: our concern - in development, users need to be able to say, this timestamp must come before things I'll add soon, and this timestamp must come after all previous work. Concern - none of the underlying platforms have a good way to do this. In development, you need a trustworthy timing mechanism, even if this imposes cost. Better than something that never tells them what they need to know.
  • KN: I made two possible proposals here. One, try to implement a version of writeTimestamp that prevents anything from running until after the timestamp's written. Not easy. Would have perf impact. Would insert timestamp, and b/c timestamps are all "after" timestamps, we'd wait for everything to finish and for the timestamp to be written. Would have to block all cmdbuf execution. Not easy. Or, remove writeTimestamp from the spec for now. Doesn't seem very useful as it is. We do still have the compute / render pass begin/end timestamps. These work pretty well. Would work if you have ordering / data dependencies between passes, can compare timestamps between 2 different passes. Would make it difficult to time copy operations because those don't have a pass. Maybe we could add “blit pass” begin/endTimestamp?
  • MW: might be able to solve this with fences but could have perf implication and I'm not sure it would work.
  • CW: would it be reasonable - timestamps do give some correct indication, it's just that the driver can reorder commands under the as-if rule. If devs use flags to run browser in a mode, things are a bit slower, but the timestamps are consistent? In Chrome this would be easy, just a toggle.
  • MW: if we do that we might want to offer some perf warning.
  • CW: definitely, when that flag's turned on we'd emit a warning.
  • KN: if we can implement that, then I'd prefer to just expose it, and put something in the API like "insert stop the world timestamp".
  • CW: as sanity check, we're still happy with timestamps on passes?
  • (general nodding)
  • CW: I suggest removing writeTimestamp from the spec for now. Maybe stop-the-world timestamp seems fine. Think we need impl experience on Metal devices to make sure it's actually working.
  • KR: concerns that we'll break Intel's measurement of compute kernels, and MediaPipe again.
  • KN: we can keep what we have behind a flag.
  • KG: they're probably getting something different than they expect right now.
  • CW: that's why we need to work on writeTimestamp a bit more.
  • Consensus: remove writeTimestamp from spec for now. Get more impl experience before adding it back. Hairier than it seems.
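
For the pass-level timestamps that stay in the spec, a minimal sketch of reading back one pass's duration. It assumes the pass was created with `timestampWrites` (`beginningOfPassWriteIndex` / `endOfPassWriteIndex`) and that the query set has already been resolved and copied into a `BigUint64Array`; `passDurationNs` is a hypothetical helper, not a WebGPU API.

```typescript
// Hypothetical helper, not part of the WebGPU API.
// `resolved` holds resolved timestamp query values (nanoseconds, 64-bit).
// Assumes the end-of-pass value is not smaller than the beginning-of-pass
// value; a real tool may want to guard against that.
function passDurationNs(
  resolved: BigUint64Array,
  beginIndex: number,
  endIndex: number,
): bigint {
  return resolved[endIndex] - resolved[beginIndex];
}
```

This only measures a single pass. As discussed above, comparing timestamps across passes is only meaningful when ordering or data dependencies exist between them.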

Clarification regarding the validation of GPUQueue.submit and submitting duplicate command buffers? #4367

  • MW: seems slight omission from spec. Someone submits the same cmdbuf twice in same submit call. No validation failure? But probably not intentional.
  • CW/KN: yes.
  • KN: just editorial problem.
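
A sketch of the missing check, assuming the intended rule is that each command buffer may appear at most once in a single submit call. `hasDuplicates` is a hypothetical helper, not spec text.

```typescript
// Hypothetical helper: true if any item (compared by identity) appears
// more than once in the list passed to a single submit call.
function hasDuplicates<T>(items: readonly T[]): boolean {
  return new Set(items).size !== items.length;
}
```

An implementation would generate a validation error when this returns true for the array passed to GPUQueue.submit.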

what happens if vertex w position is zero, or negative? #4129

  • CW: David was trying to make sure the WGSL spec's consistent / coherent.
  • CW: WebGPU says that pixels with w<=0 are discarded, I think.
  • JB: seems to say that. Has to be 0 <= z <= w. Says pixel should be discarded. Have to read literally. Maybe not bad to have a note. If w is negative, discard.
  • KG: even if z is negative too?
  • JB: no. If z>0 and w<0, this condition isn't true.
  • CW: if w<0 and z<0 what happens?
  • JB: if z<0 we discard.
  • KG: even if w<0?
  • JB: check the last comment on the Github issue.
  • KG: is what we have on purpose?
  • KN: think we copied this from Vulkan but need to check.
  • MW: in addition to D3D having this requirement, Metal also has this requirement. If w<0 then no matter what the xyz values, vertex is discarded.
  • CW: let's record this on the issue so we have it for posterity. Note in the spec would help. WGSL spec?
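
To make the literal reading concrete, a small sketch of the depth condition JB quotes (0 <= z <= w). `depthInClipVolume` is a hypothetical name, and this is a per-position illustration only; actual clipping operates on primitives and also involves x and y.

```typescript
// Literal reading of the depth condition 0 <= z <= w.
// Hypothetical helper for illustration; not spec text.
function depthInClipVolume(z: number, w: number): boolean {
  return 0 <= z && z <= w;
}
```

With w < 0 no z can satisfy both inequalities, which matches "if w is negative, discard" regardless of the sign of z.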

Default entryPoints to shader modules #4342 (Corentin is very excited about this tiny addition)

  • CW: technically Milestone 2. Removes 2 lines from Hello, Triangle!
  • CW: gman@ pointed out - if single entry point per stage in shader module, easy to choose the default name at pipeline compilation time. He made a proposal, we're all in favor of proposal 4.
  • CW: seems everyone's very positive about this. I prototyped this in Dawn. More work than it seems because of HashMaps. :) But probably a very tiny change. People OK getting this in soon for ergonomics?
  • JB: we don't object.
  • MW: thumbs up.
  • CW: great, we'll get to this at some point. Tiny wins for developers.
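
A rough sketch of the defaulting rule as described above: if `entryPoint` is omitted and the module has exactly one entry point for the stage, use it. The minutes don't spell out the details of proposal 4, so `EntryPointInfo` and `resolveEntryPoint` are hypothetical and only model that one rule.

```typescript
// Hypothetical sketch; not the Dawn prototype or spec text.
type Stage = "vertex" | "fragment" | "compute";

interface EntryPointInfo {
  name: string;
  stage: Stage;
}

function resolveEntryPoint(
  entryPoints: readonly EntryPointInfo[],
  stage: Stage,
  requested?: string,
): string {
  // An explicit name wins; checking that it exists is out of scope here.
  if (requested !== undefined) return requested;
  const candidates = entryPoints.filter((e) => e.stage === stage);
  if (candidates.length !== 1) {
    throw new Error(
      `cannot default entryPoint: found ${candidates.length} ${stage} entry points`,
    );
  }
  return candidates[0].name;
}
```

For a module with one vertex and one fragment entry point, both pipeline stages can then omit `entryPoint`, which is where the two saved lines in Hello, Triangle would come from.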

Agenda for next meeting

  • CW: will we have enough Compat items?
    • SW: need to circle back on items where we didn't get consensus today.
    • CW: will we have enough items?
    • SW: we could wait a week, we're in the long tail of CTS failures now, could make sure that no more dragons are lurking.
    • CW: I'm concerned we might not have enough items so that the meeting would only be 15 minutes for Compat stuff.
    • KG/KN: that sounds fine.
    • SW: think we should go under the assumption that we have the meeting, and if the agenda's super thin, then cancel.
    • CW: OK, sounds fine to hold the meeting and end early if needed.
  • CW: more items, please put them here!