Minutes 2022 08 24

GPU Web 2022-08-24

Chair: Kai guest chairing for Kelsey

Scribe: Ken / others

Location: Google Meet

Tentative agenda

Administrivia
CTS Update
“Tacit Resolution” Queue
Specify that texture_external sampling functions clamp coords to [0, 1] #2766
Colorspace conversions for GPUExternalTexture are arbitrarily complex #3293
Frame advancement of canvases that aren't visible #3295
Buffer state validation in unmap() is unnecessary #3251
maxColorAttachments not sufficient? #2965
Agenda for next meeting

Attendance

Google
- Brandon Jones
- Gregg Tavares
- Kai Ninomiya
- Ken Russell
Microsoft
- Jesse Natalie
- Rafael Cintron
Mozilla
- Jim Blandy
Unity
- Brendan Duncan

Administrivia

KR: since Unity's here, any updates on WebGPU backend?
BD: things progressing. Hitting some impl issues - main hurdles: shaders' uniformity analysis. Lot of problems with that because we don't write our own shaders, they go through several transformation steps and abstraction layers are complex. Dependent texture fetches are one problem. Lot of projects/games are working though!

CTS Update

KN: Gyuyoung from Igalia working through validation in spec that's untested, and testing it
Also found discrepancy between spec and impl I opened an issue about; small change to validation to make things more ergonomic. Editors will discuss, bring back here.
Work going on from couple of folks.
Backlog of items landed in spec which need test updates. Planning to go through those and make sure they're filed, unless someone else would like to volunteer. Can give you list of Github issues you can file testing issues for.
Looking at test perf on automated test suite. Identified couple of really slow tests. Brandon fixed them! Allocating a lot of ArrayBuffers is slow, which is a problem we've run into multiple times before. 3x faster, or more, gains.
BJ: on chunk already done, CopyTextureToTexture suite went from 70s -> 20s (!).
KN: did this optimization for some other tests. Either accidentally removed that, or another instance of same problem.
BJ: another instance. Current one's a little more complicated. Allocating lots of typed arrays, don't need to allocate that many.
KN: would be nice if typed array allocation were faster.

“Tacit Resolution” Queue

KN: Since Myles isn't here we have to make sure he's had a chance to look at them.
When source usability is bad, error asynchronously #3311
- KN: when writing spec for getting various image sources into the API, esp. videos - tried to follow WebGPU structure - (accidentally) made some async validation synchronous.
- Video decoding can fail. Don't want that to have to throw an exception. Now we turn it into a validation error. Decode can proceed in the background now.
- Reported by Justin Novosad from the Chrome team; he reviewed the changes, said they looked good.
Reduce max binding index and make it a limit #3318
- Had hardcoded limit of indices in a bind group. Now 640, and it's a limit. Decided to call it maxBindingsPerBindGroup. Not possible to get remotely close to this because there are other limits preventing it from being reached.
- It's a limit on the index.
- Have another instance of a similar pattern. This seems reasonable.
- Not perfect name, but reasonably straightforward.
Make error reporting unordered #3362
- Will move to top-level agenda item; one new thing comes up.
Specify OOM behavior in mapping #3351
- 2+ years ago we discussed errors in mapping, and creating buffers with mappedAtCreation:true.
- CPU-side error with mappedAtCreation:true - supposed to happen immediately. Hide the error? Make the buffer invalid, but still let you get mapped ranges?
- Changes it to be simpler. Throw a RangeError if we can't allocate memory in createBuffer. Additional error that can happen, but same thing if you did "new ArrayBuffer(size)". Not a major additional thing apps would have to deal with.
- So: do everything we decided 2 years ago, as well as RangeError from createBuffer. Also rejected promise from mapAsync.
create*PipelineAsync prevents using the pipeline synchronously #3297
- You get Promise result - can't synchronously wait for completion of pipeline creation
- Decided a while ago - something to pursue, not critical for V1
- Issue to consider if customers keep running into it
- Not something that's blocking us from V1
- Tacit resolution: punt this to post-V1.
- Maybe in future, will figure out a good way of exposing this capability. Or, create the pipeline again with the same inputs.
- KR: add an awaitSync to JS? :)
- KN: maybe on workers. :)
Stopping there.
Please post comments if you have any issues with these things landing.

Specify that texture_external sampling functions clamp coords to [0, 1] #2766

MM: Please defer resolutions here until I’m back (sorry everyone)
KR: Sounds like the compromise solutions we’re talking about are going to adversely affect both the API and implementations. Would like us to consider whether folding repeat modes into the texture will have an unnecessary cost - seems like having application allocate the sampler is the most clear and predictable signal. Experts in the area, please continue to think about this.

Colorspace conversions for GPUExternalTexture are arbitrarily complex #3293

MM: Please defer resolutions here until I’m back (sorry everyone)

Frame advancement of canvases that aren't visible #3295

KN: had active discussion with Myles about this.
Do other folks think the issue Myles raised is worthwhile to consider as a design constraint?
Proposal: when you call getCurrentTexture, we'd queue a microtask to expire the current texture, after the current task. Happens after microtasks queued before it - so, middle of microtask queue.
Myles' concern - possible and reasonable to do rendering spanning multiple microtasks.
E.g.: getCurrentTexture, then do something causing microtask to be posted. In that microtask, want to do more rendering. It's expired - can't do it.
Having hard time coming up with situations where this would be necessary, but agree it's a possible situation/pattern to be in. Most important thing to me - consistent, well-defined time for this texture expiry. Not a big deal if it expires in middle of microtask queue, and think people would find workarounds for it.
Can relax this later - would make more programs valid. Inclined to do this now, and perhaps relax it in the future.
GT: don't know if it's a real problem but brought it up earlier - want to know if getCurrentTexture succeeded. Wait on it. But now it's gone. You'd want to maybe allocate a smaller texture in response to failure. But can't do it because it'll expire on the microtask.
KN: that's true. Problem - even now, that won't work. Hard for us to guarantee the ErrorScope will resolve in the same frame. Left as is, where it expires when rAF happens / frame's getting drawn, no guarantee your ErrorScope comes back before that happens. Sometimes before, sometimes after - depends on impl. Don't want this situation - unreliable, unportable behavior.
GT: maybe different topic then. What do you do if you want to handle those situations?
KN: best you can do - don't render that frame. If failed to allocate texture, skip this frame, render the next one. (Can't skip rendering - you do the rendering, find out it failed - try again on the next frame with a different size.)
Within constraints of async errors, might be the best we can do. Reporting the same frame if allocation failed - results in ~1 ms round-trip stall. Not sure what right solution to that problem is. Think it'll be relatively unusual that apps have to handle that case robustly. Most don't today. If your canvases are so large they won't allocate, … don't really have a good answer.
KN: any other microtask concerns?

Buffer state validation in unmap() is unnecessary #3251

KN: right now unmap() verifies buffer was mapped. Think it's pointless. Number of things have to happen conditionally in unmap(). Right now we do some things based on current state of buffer, and do validation afterward. Validation not preventing anything from happening. Don't think people need to get an error if they unmap without it being mapped. Takes a step out of the spec.
JB: another advantage to not doing validation - multiple Promises issue. Want to make sure there aren't server-side paths hard to exercise from healthy content process. If paths that can only be reached by compromised content processes - hard to test.
KN: validation's on the service side right now. No validation on the client side. Might be something we need to add anyway for multithreading - or something similar. Have a diagram about how this could work in multithreading.
With resolution of other issue, already in that situation - server could get in state from unhealthy client that it couldn't get into from healthy client.
JB: w.r.t buffer mapping or other stuff?
KN: buffer mapping. Client side won't send map request if it knows it won't go anywhere.
KN: IPC fuzzing is such a fundamental requirement, don't think we have to design the API around it.
KN: guess this is implicitly tacit resolution.

maxColorAttachments not sufficient? #2965

MM: I made a pull request for this! \o/
KN: this is about limits on the total size of the render attachment formats you use. Size per sample of attachments.
KN: Myles figured out how the Metal validation works and wrote an algorithm that does the same validation. Computes # bytes per sample in a RenderPipeline/RenderPass. Would add a limit.
KN: haven't reviewed the PR yet. Contains algorithm, validates it against base limit, which is 32 bits.
KN: this validation would be Metal specific. We don't know of comparable validation constraints on other APIs. Little nervous about it. Would like similar constraints on other APIs, and have them represented here. Think other APIs will spill when this happens. They can fall back, but Metal makes this a validation error.
KR: Have run into related issues in WebGL with tile local storage. Similar limits exist abstractly in other APIs like D3D11 and concepts like shader images.
KN: there are 2 limits in Metal. If using tile size control and tile local storage, you get an additional limit, just on amt of memory on tile. In compute, = workgroup level storage. Rendering, tile local storage. That limit doesn't apply to normal rendering, only if you're using those features.
KN: this is an additional limit that applies regardless of use of tile local storage. Think this isn't tile local storage limit, but something to do with writing out to memory? A per-sample limit, doesn't depend on the # samples. Not a per-pixel or per-tile limit, so don't think it has to do with tile local storage directly. Don't have good understanding of what the limit is.
RC: asked Jesse offline; D3D doesn't have a similar limit, just a limit on the number of attachments, whatever they are.
KN: think that's pretty definitive - no strict limit in D3D. Maybe in practice architectures have some limit where they'll spill something, but we don't know what that is, so can't advertise it. Best to take this constraint from Metal.
RC: per Jesse, it's "8". Not based on feature level.
KN: whatever this represents in Metal, probably hidden in the driver in D3D/Vulkan. Somewhat restrictive limit. 32 bytes per sample. Will be able to lift the limit on most devices.
RC: D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT
KN: we have this as max color attachments.
KR: what about non-portability?
KN: we'll enforce this limit consistently across all backends. Devs should be aware there are consequences to lifting the limit.

Agenda for next meeting

KN: please add agenda items here for next week's meeting.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2022 08 24

GPU Web 2022-08-24

Tentative agenda

Attendance

Administrivia

CTS Update

“Tacit Resolution” Queue

Specify that texture_external sampling functions clamp coords to [0, 1] #2766

Colorspace conversions for GPUExternalTexture are arbitrarily complex #3293

Frame advancement of canvases that aren't visible #3295

Buffer state validation in unmap() is unnecessary #3251

maxColorAttachments not sufficient? #2965

Agenda for next meeting

Clone this wiki locally