Minutes 2021 04 26

GPU Web 2021-04-26

Chair: Corentin / Jeff Gilbert

Scribe: Ken / Kai

Location: Google Meet

Tentative agenda

Method of ensuring GPUShaderModules can contain MTLLibraries #1064
Make lost devices appear to function as much as possible #1629
Should "mark adapters stale" be called more quickly? #1630
(stretch) Reduce blocking on shader compilation #1641
(stretch) Allow mat2/3/4 as attribute types #1652
PR burndown
- Pluralize requestAdapters() and add GPUAdapter.isSoftware #1660
Agenda for next meeting

Attendance

Apple
- Myles C. Maxfield
Google
- Brandon Jones
- Corentin Wallez
- James Darpinian
- Kai Ninomiya
- Ken Russell
- Shrek Shao
Intel
- Brandon Jones
Kings Distributed Systems
- Hamada Gasmallah
Microsoft
- Damyan Pepper
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Andrew Varga
Dominic Cerisano
Francois Guthmann
Michael Shannon
Mehmet Oguz Derin
Rick Battagline
Timo de Kort

Administrivia

TAG review stuff.
Kai working on explainers. gpuweb/explainers
- Think it's ready to go for review.
- All happy with the explainers in their current form?
- Group OK with us filing for review?
MM: haven't read them but don't want to block. Will review. Queue for TAG review is months long. Think will have time to review before it comes up.
We'll file now.

Method of ensuring GPUShaderModules can contain MTLLibraries #1064

CW: don't feel there's a good solution that makes everyone happy right now. Good solution would involve getting a lot more data on the API.
- Chromium would be happy to have a hint, as long as it's just a hint.
- No huge impl cost for Chromium; may help WebKit on Metal. When we get more data from web developers we may do something else.
- Solutions we talked about in this group were heavy-handed aside from just a hint.
MM: in post it was said: "proposals of GPUShaderModuleDescriptor.layout hint" - thumbs up - said other solutions were more heavy handed.
- Don't want to stop progress - just asking - prev direction was more broad hint, authors could provide anything they wanted, browsers could use some / all / none of it. Here, are you saying this as an example of a larger hint, or something more palatable?
CW: just more palatable - example of a larger hint. Our feeling - even if we had this discussion about partial descriptors, etc. - this group wasn't satisfied with it. Convo still going in circles. Just layout would be fine.
- MM: welcome news from our point of view.
DM: in previous convos - Vulkan backend will generate SPIR-V at shader module creation time. Prof Vk devs - SPIR-V conceptually handles multiple entry points but tooling doesn't, so you can't rely on it working properly. If we follow this advice we'll generate SPIR-V at pipeline creation time, so there's no difference between the backends and we don't need different APIs.
- CW: Dawn also specializes modules per-layout - we repack descriptor sets. So depends on the pipeline layout. We already do that, essentially.
- DM: will you use the layout hint?
- CW: probably not. By time we do the optimizations we'll have found other reasons to do the generation at pipeline creation time.
- DM: can we agree collectively that the shader creation time is only to parse WGSL, etc., module validation, and nothing else? These can reveal a lot of errors in user code. Has a lot of value. Can we agree all backend-specific stuff will happen at pipeline creation?
- MM: reason we created this issue is we wanted apps to be able to do compilation before createPipeline. OK if that's a third step that's out-of-band, between shader module creation and createPipeline. Goal though is to make it possible to write WebGPU app that does compilation before createPipeline.
- DM: and that's minus all cases where you need to work around driver issues?
- MM: our intuition - cases where it'll work outnumber cases where it won't work.
- DM: if we provide pipeline layout how will it be used?
- MM: layout field, is optional. First resolution here for that part?
- JG: is it a single layout and not multiple?
- MM: first proposal, would be dictionary of entry point to layout. If controversial, can discuss.
- JG: that's what I was expecting. Take advantage of compiling shader module once with all of its various vertex layouts.
- MM: agree
DM: if single layout, won't be able to validate anything. If layout per entry point, can't validate that bindings match shader.
- CW: we want to validate that?
- JG: want to provide enough info so that we can see whether validation will succeed/fail. And later when have to validate, can use cached result.
- CW: as much as it can be ignored is best on our side. We won't do anything with it, or single layout. If we can just do Web IDL validation and ignore it, that's best.
- MM: are you saying we'll take exactly the set of validation about the layout and move it up if the author supplies the layout?
- JG: Not quite. Rather, pre-cache the validation answers.
- MM: so if the hint is wrong it's un-observable, except for performance.
- JG: yes. Might emit a warning.
- MM: not awesome, but I think that's acceptable.
- DM: seems unlikely you'd use one layout in the hint and another during pipeline creation.
- JG: that's why I'm leaning toward Corentin's suggestion that this be a pure hint.
- MM: what about - if you supply it during shader module creation it must be identical during pipeline creation? Then we'll have some guarantees that our early compilation won't be wasted.
- JG: agree, it's a performance benefit, not a niceness benefit. If wanted to change behavior based on validation then have to spec something about that validation. If it has to match exactly - it's a way to make the validation almost free. But couldn't precache specialization of the shader.
- MM: right, have to call createShaderModule again, without the hint. Would that be a best practice? Create two different pipelines with different layouts, have to call createShaderModule twice?
- Discussion about whether to do this. Avoid draw call time validation errors. Might jank.
MM: this is OK.
CW: either way seems fine for Chromium - with or without the validation that the layout you supply at pipeline creation time is the same as during shader module creation. Prefer to not validate that, but OK if we decide it should be done.
Direction we're going:
- MM: supply dictionary of pipeline layouts. OK for impl to ignore that dictionary.
- CW: except for Web IDL validation.
- MM: correct.
- DM: what if pipeline layout is invalid object?
- MM: that's OK.
- DM: that contradicts other guidelines. Everything's contagious otherwise.
- JG: not necessarily desirable. Not looking for excuses to make errors propagate. Want things to be robust in depth.
MM: I'll write the PR.
MM: thanks to the chairs for spending the time on this. We appreciate that you treated this issue with respect.

Make lost devices appear to function as much as possible #1629

KN: James and I discussed device lost, esp. where requestDevice can return a lost device. Want to make sure apps (1) can expect a device loss at any time, (2) don't break because the device was lost. Don't want middle of rendering loop, everything to crash.
- Device appears as much as possible that it's functioning, until app can react to device loss.
- This is generally true in the API already. More specifically: should we make this a design choice?
- E.g.: popErrorScope rejects when the device is lost (not 100% sure). For example, here, we'd pretend there was no error, so rejection doesn't cause some sort of app breakage.
- Go through API and make sure we don't have other situations like this.
- JD: other situations: if we return lost devices on device creation, then when uploading textures, compiling shaders, etc. - these all work. Pass invalid objects in creation of other objects - doesn't raise exception.
- KN: most of API works like this already. Have to audit.
- JD: also queries. There aren't many, but should watch out.
KN: mapping's a can of worms, but think we've figured it out at this point. Pretend maps work even if OOM on the backend.
- RC: when you map such buffers, what do you get?
- KN: reading - zeros is fine. Writing - let it go into the void.
- CW: if you lose the GPU process - you still get shared memory. Still guarantees mapped buffers are accessible after device is lost.
- DP: would like to check that. There are ways to map GPU memory onto the CPU.
- RC: in D3D think that if things are lost we generate E_FAIL. Have to fill things with zeros, handle things more gracefully.
- KN: not sure we need that. Can return buffers of zeros.
- RC: so only way your app will behave any differently - reading mapped buffers returns 0, and lost promise resolves?
- KN: ideally. Might be few other holes.
- JD: main thing to avoid - JS exceptions, rejected promises that take different code paths in the app.
- JG: sort of moving around where the bug is. popErrorScope…
- JD: could make that error say "device was lost".
- JG: means people have to check for errors.
- KN: if people are using error scopes they have to be checking for errors.
- JG: this is sort of what WebGL does. Had to go back through our error checking code and let it handle the LOST_CONTEXT error in addition to others. We assert either that we didn't hit an error, or had expected error, or got CONTEXT_LOST. People can still write incorrect code. Understand desire to want to make code "just work", but a warning that it won't always.
- KN: any downsides to making this attempt?
- JG: no. Just want to be cautious about taking something, having it imply some approach, applying that with consistency and forgetting why we're doing so.
- JD: did small investigation of WebGL apps in the wild. Seems about half recover from lost context, can mostly thank Ricardo from Three.js for that. The other half fail. Hopefully we can do better in WebGPU.
- JG: hope so.
CW: sounds like we're mostly in agreement, can discuss on case by case basis going forward.

Should "mark adapters stale" be called more quickly? #1630

JD: might make sense to do PR for requestAdapters first. Matters how we design the API to requestAdapters. If we have a list of adapters makes sense to call requestDevice() on the adapter, otherwise no.
- MM: actually not prepared to discuss this now. Postpone?

(stretch) Reduce blocking on shader compilation #1641

JD: I've been implementing WebGL async shader compilation extension. Realized you could block on shader compilation in WebGPU. Guess everyone already knew this. Surprising to me. Understand why we're doing this this way. Lot of apps might do this. Is there a way to avoid blocking apps?
- If we relaxed the requirement that DOM updates wouldn't be blocked during shader compilation - we could continue to update other animations, allow you to scroll, etc., and maybe user would have better experience.
- JG: don't think you can distinguish between blocking on shader compilation and blocking on other work.
- JD: at native API level?
- JG: yes.
- MM: better solution - make createPipeline return a promise.
- JD: we already have createPipelineAsync. Some apps rely on this - esp. for porting of native apps.
- JG: not just that - there are engine designs that require it. You don't know you need it, and you need it right then. Presumptuous for us to say you won't have inter-frame turnaround time. Not just porting concerns.
- CW: maybe we could say - if during an animation frame, createPipeline Sync is called - browser may desynchronize the canvas?
- JG: I don't think so. Pretty big change. Asking for the compositing behavior to sometimes be decoupled on what the app does.
- JD: alternative is the page freezes.
- JG: I'd be more sympathetic if we didn't already have an API to solve this.
- JG: maybe ads should be fully desynchronized. If an ad allocates 2G of GPU time.
- JD: just a proposal to improve things.
- JG: if we don't have desynchronized, like WebGL, then we should add it. I think this is too tricky.
- MS: we plan to get around this with OffcreenCanvas, and desynchronize w.r.t. the DOM.
- JD: OK, this is something we could add later.
- CW: what I understand (1) too complicated to do this in browser impls. (2) with today's best practices you can already do this wtih createPipelineAsync, and desynchronized options. People who write good code can get good results. Is that correct?
- MM: so treat this similarly to infinite loop in JS.
- DM: this addresses one issue out of many. Maybe allocating large buffers per frame, reallocating large bind groups, etc. all do similar things. Solving the pipeline creation problem seems over-specific.
- JG: would like to make sure we have general API solutions for this. creating things asynchronously, desynchronizing rendering.
- CW: we looked at createPipelineAsync. If we want to be async, lots of costs in createShaderModule as well. Browsers will have to handle those asynchronously too in order for that to work.
- JG: do we do that validation synchronously? Not using Unicode identifiers, etc.?
- CW: can observe in 3 diff ways. (1) use shader module. (2) seeing in error scope - promise-based, can be delayed. (3) having the error event fire. Don't have an order guarantee.
- JD: we have a Promise to get compilation errors from shader module, but also async.
- CW: unhandled error - have to say that that event can return errors, not just the first error. createShaderModule launched to another thread - error from draw can trigger unhandled error first, before createShaderModule.
- MM: think that makes sense.
- JG: are we enforcing an order?
- KN: if we enforced an order we'd have to pick one.
- …
- CW: error scope is fine. Can delay it. Problem's more the unhandled error event - supposed to fire on first error in error scope.
- KN: don't think apps should be relying on those to show up in timely manner.

(stretch) Allow mat2/3/4 as attribute types #1652

PR burndown

Pluralize requestAdapters() and add GPUAdapter.isSoftware #1660
- BJ: discussing software, etc. a lot. Devs may want to identify specific adapter, avoid driver bug on specific card, etc.
- BJ: previous discussions, combinatorial explosion. Provoke what dev hopes will be the best result. Conclusion: just give you multiple adapters.
- BJ: primary interesting restriction: you only get one hardware and one software adapter out of it. That's to satisfy fingerprinting concerns. Rest follows logic right now. In future, if we have permissions API that lets you iterate through more adapters, or Electron has a built-in consent for using this information, then we already have the mechanism for exposing more adapters.
- BJ: algorithm does imply that software adapter would be the last one on the list. Might be nice from predictability standpoint.
- BJ: add "isSoftware" attribute to adapter.
- KN: two particular issues. Michael shannon had use case, want to disable particular GPU because it has issues, and either use different GPU or software. Can tell customer to disable GPU access, but doesn't work well at scale. Wouldn't work for Google Meet where you have arbitrarily many users. They need to detect to not use a particular HW impl and fall back to software. Meet would probably fall back to Wasm though. But other apps would benefit. Also JG pointed out that in WebGL, we make it too easy to say "I don't want any major performance caveats". People instead can iterate adapters, see that one's software, and avoid using it, rather than passing in a flag to the browser.
- MM: thought we resolved this with allowSoftware?
- BJ: discussed in Editors' meeting. In trying to consider all use cases we've heard and could conceive of, ended up with this weird combination - I provide this value here, but there are ~4 kinds of allowSoftware flags where it would sit above/below hardware adapters in terms of priority. Hard for editors to parse out the meaning of the flag. This is a more straightforward solution.
- RC: we still have powerPreference flags, more in the future, and that'll change what's in the list?
- BJ: correct. powerPreference is unchanged. It selects the same HW adapter it always has, and puts it in the list. powerPreference is kind of nonsensical when it comes to software.
- RC: so regardless of powerPreference, software always appears?
- BJ: correct.
- RC: so seems like an ordered list of recommendations, with software being in last place.
- BJ: yes. critical thing - would be marked "isSoftware" so there's no ambiguity.
- KN: most imp't thing - the first one is the one we recommend.
- JG: I like that design from D3D.
- RC: happy with this proposal.
- JG: think this goes counter to some concerns from Apple.
- MM: seems like better representation would be - make 2 calls, return the hardware and then the software one. How this opens up the possibility in the future of expanding the list from 2 items today to many items later - we deliberately don't want to go in that direction. Packing these 2 in the list encourages people to write for-loops.
- JG: Foresee wanting to know what adapters are on a system. Number of usecases where powerPreference is not enough. Not plurality, but real. Don’t think this particularly encourages people to write for loops. Most people can pick [0], it can be undefined, becomes the same as a null check.
- KR: Would like to highlight the already existing usecase of WebXR. Quite common that a system might have a regular adapter, an xrCompatible one, and a software one. Would like to push back on Apple’s fingerprinting concerns. Still can avoid fingerprinting with this API, don’t need to design this into API.
- MM: A preference for “prefer the xrCompatible adapter” is totally fine with us. Direction we’d like to avoid is a big list of adapters that represents all adapters in the system either because the spec says so or because the dominant implementation does so.
- CW: Let’s call this discussion for this week, revisit next week after people have had more chance to review.

Agenda for next meeting

Put items here, or mark Needs-Discussion on Github. We'll do #1660 for sure next week.
Pluralize requestAdapters() and add GPUAdapter.isSoftware #1660
Should "mark adapters stale" be called more quickly? #1630

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2021 04 26

GPU Web 2021-04-26

Tentative agenda

Attendance

Administrivia

Method of ensuring GPUShaderModules can contain MTLLibraries #1064

Make lost devices appear to function as much as possible #1629

Should "mark adapters stale" be called more quickly? #1630

(stretch) Reduce blocking on shader compilation #1641

(stretch) Allow mat2/3/4 as attribute types #1652

PR burndown

Agenda for next meeting

Clone this wiki locally