Minutes 2022 09 21

GPU Web 2022-09-21

Chair: Corentin

Scribe: Ken

Location: Google Meet

Tentative agenda

Administrivia
CTS Update
“Tacit Resolution” Queue
Override constant should not provided by pipeline as NaN or Inf #3435
Does getBindGroupLayout return the same underlying object or a new one? #3454
Items from Editors’ Meeting that might be too substantial to discuss on short notice
- Frame advancement of canvases that aren't visible #3295
- mapAsync(WRITE) zero-fills the relevant region of the buffer #2926
Passive Fingerprinting Surface #3101
createPipelineAsync() doesn't reject its promise, even if the created pipeline is invalid #3296
Agenda for next meeting

Attendance

Apple
- Myles C. Maxfield
Google
- Brandon Jones
- Corentin Wallez
- Gregg Tavares
- Kai Ninomiya
- Ken Russell
- Stephen White
Microsoft
- Rafael Cintron
Mozilla
- Erich Gubler
- Jim Blandy
- Kelsey Gilbert

Administrivia

CW: WG charter will be sent for review to W3C by Francois soon.
- CG charter can be whatever the CG wants. Not blocked on anything - no W3C process for this.

CTS Update

KN: Gyuyoung working on operation tests! Wrote some tests for blending - making sure parameters passed through, basic state working.
Our team will do more CTS work, burning down backlog. Balancing with other V1 work.
Will probably do ~1 "hack week" on the CTS.

“Tacit Resolution” Queue

Nothing

Override constant should not provided by pipeline as NaN or Inf #3435

KN: WebIDL doesn't allow NaN and INF to be passed already.
MM: overridable constants aren't the only way data gets to the GPU - vertex buffers for example.
CW: WGSL says you can load values from storage buffers and they can be Inf/NaN but we don't guarantee they come in correctly in that form. Why are overridable constants different from vertex/storage data? We can validate them on the CPU because they live there.
KG: Multiple options for floats in WebIDL, UnrestrictedFloats might support infinity.
KN: Checked, it doesn't.
MM: sounds like the right direction
CW: close as invalid, impls can do whatever they want, decide what to do in C API.

Does getBindGroupLayout return the same underlying object or a new one? #3454

KN: ~2 weeks ago discussed #3263, whether labels should be "setLabel" - issue with multithreading. Label can be synchronization point between threads. Already have something where multiple threads can point to same underlying object - BindGroupLayout.
KN: if you call getBGL twice you get new objects. Set a label on it - is it visible from the other object?
KG: the spec tells us about whether it returns the same object. We wouldn't fill in expandos on another JS object.
KN: they're not expandos - they're part of the object and the API.
BJ: there's the label attribute in JS, and the underlying one in the APIs like Vulkan.
KN: not too important what's passed to Vulkan, but it's important what error messages show.
KN: spec doesn't need to prescribe.
BJ: have a logical thing you want the spec to do, but can't currently.
MM: deduplicating by-value globally defeats the purpose of having a label on them.
MM: another option - domain of labels is moved to DevTools, not part of WebGPU API.
CW: was going to suggest that. Though they show up in browser's error messages, those are DevTools-ish, devs shouldn't rely on programmatic value of error messages.
KN: that's pretty much how the spec works. Spec doesn't say anything about what we do with it; it's just there. But you can retrieve the attribute. Applications could reasonably use this.
MM: Another use case is using a native graphics debugger. Neither use cases require seeing the label from JS. Could even be just a creation time attribute.
KN: got feedback from folks not wanting it to be creation-time attribute, but OK with it being write-only.
RC: wanted it to not be just creation-time, Value in game engine, temp textures for terrain, reuse them. Nice to set label accordingly.
CW: hearing consensus to make "label" either just a setter, or a set-only method.
MM: across the whole API?
KG: feels weird to me. Sweeps it under the rug. Like we didn't tell you what'd happen if you set a label on a BGL, then did the same thing on another JS object for the same BGL.
KN: we can strictly define this in the spec. If we say the label's associated with the underlying BGL there's only one piece of state.
BJ: still will surprise developers until they learn about this. First time I look at it in RenderDoc it'll be confusing.
KG: this would change how I'd feel about getBGL always returning a new object. If label should inherit, should return same object, not a new one.
KN: it's a solution, but it doesn't solve multithreading issues in the future.
CW: Dawn for example deduplicates pipelines too. Call getBGL on each twice - says you get two JS objects, one for each pipeline. If we had global deduplication we'd have one GPU process objects.
KG: on content validation side they're different objects.
CW: but the label's GPU process-side for RenderDoc, etc.
MM: unfortunate for the user who wants to connect to the browser via RenderDoc.
BJ: if there's not a 1:1 relationship between JS object and WebGPU object - understand Kai's POV, but we've broken association between the two. I'd like JS object to be more functional, let me read the label. Admit it doesn't translate perfectly. Unfortunate to make labels less functional in JS as a result.
KN: orig, wasn't thinking about deduplicating objects because they have the same value. Spec doesn't specify this. Expect it'd maintain 2 labels for the object.
MM: that's required because getLabel's a sync operation - has to be done web process side.
KN: even if changing to setLabel, will have problem with error messages/tooling. Sounds like impl problem. Maybe change the API so impls don't have this problem anymore? What if instead of setLabel, we had addLabel, and can only addLabel to things.
MM: addLabel to an object, might appear on other objects?
KN: yes
MM: not great.
JB: i like this idea and was going to suggest it. On the GPU, this is multiple objects in your program. User can look at them, they're the same, that makes sense.
BJ: not sure that works well for Rafael's use case. Texture cache you're cycling through. I'm brick now, used to be cobblestone. :)
JB: legit.
RC: subtraction?
KG: feels like recreating atomic ops for labels.
MM: isn't "subtract" not a solved problem? 2 objects, add label to one, subtract from the other - disappears from the first.
JB: transactions.
KG: would like to punt on multithreading issue. Big solution space there. Spooky action - setting labels equivalent to something else - would like to return the same object.
MM: about global deduplication - did you gather metrics? What's the benefit?
CW: we did it because we could. Way back when we were working with Babylon.js. Their impl wasn't as strong then, and it was a big benefit to deduplicate pipelines for them. It's easy to do this, and we do this for all immutable objects - samplers, pipelines, etc. There are more.
MM: understand the deduplicating calls to perform compilation. Not as obviously good to me (naively) to do it for BGLs. Maybe difference between 2 compilations is in the BGL?
CW: yes, but we use this to do fast BGL comparison. E.g. BGL for bindgroup / pipeline are the same value - we just check the pointers because we deduplicated them. Very cheap.
MM: does Babylon use labels?
CW: if they're not they should. :)
KN: feedback has been - everyone who uses labels likes them.
BJ: emphatically so.
CW: back to issue - are labels shown in error messages / graphics debugging tools - are these impl details, or things the spec should mention?
MM: other question - encountered anyone who was confused about labels, setting on one object and it showed up on another?
CW: no. People love the error messages. If they're confused about them, sometimes we tell them to add labels, and that clarifies things.
BJ: I'm usually the one giving that advice. :) Generally - most people interact with the error messages coming from Dawn. Haven't found anyone confused by this.
MM: compilation error messages might be a special case - each has only one BGL. If you report a label that doesn't meet user's expectation - there's only one, so no confusion.
RC: as we move forward - maybe devs will have one piece of code making a BGL exactly the same as piece of code B - we might report something and they might get confused. Maybe code B will have been written by someone else?
MM: good point.
BJ: I'm sure that'll happen. Esp if I meant to make separate BGL, but error matches an earlier created one. Will be confusing. Also probably an edge case.
MM: the severity increases the bigger the error scope is.
GT: I strongly object to debuggers giving me the wrong info. :)
MM: I see both sides. Agree with RC's point. I do understand that we want to do deduplication, at creation time. Would be unfortunate if we couldn't do deduplication because someone might set a label in the future.
KN: I really just want "addLabel / removeLabel". I think it solves the problem.
MM: setLabel can be implemented by subtract / add.
KN: not true - to remove, you need to know what the old label was. setLabel doesn't.
KG: think there are impl solutions available that give users the info they need. Not dedup'ing things - potential impl directions. Being able to assign labels is reasonable.
MM: can we move labels to an optional extension?
CW: no. Suggest we keep labels as-is. KN's / KG's comments help - the wrapper JS objects can have setLabel, and GPU process can have "replace this specific label with this one". GPU process, you can 100% track the set of content process labels and surface them. If browser does a bad job, might do badly for debuggers, people will complain, and we'll fix it.
MM: I don't understand how add/removeLabel solves the problem.
CW: GPU process had add/remove; JS only has "set". Refcount, if you really want it. Can do perfect tracking of this between content/GPU process.
BJ: I like that.
KN: benefit - your underlying objects can have all the labels. It's also an impl choice, which is also nice.
KG: I think the ability to clear/reset a label is desirable. Don't want to do addLabel/removeLabel.
CW: my suggestion is - leave the API as is. Impls can do perfect tracking. Issue we're talking about now is just an impl detail.
KG: OK, I'm satisfied then.
MM: can you write down how the perfect tracking will work?
CW: yes, will do.
KN: I think I can do that.
KN: last remaining thing - getting the label off the JS object. Think it's fine if it's the last one set on the JS object.

Frame advancement of canvases that aren't visible #3295

KN: trying to create semantics that are very reliable to apps, so they won't break flakily depending on relative timing of 2 things like vsync and video playback.
KN: latest proposal BJ and I came up with - enqueue a microtask like we used to for external textures. Myles had concerns there are reasonable apps wanting to render across multiple microtasks. Would lose texture access in the middle.
KN: but, would break at a deterministic point.
KN: Q: how impt is that use case? Something we want to support, or are we oK with breaking that?
MM: if I'm writing a WebGL app and in middle of raf i queue a microtask, and in that microtask i finish the rendering - what's the behavior?
KN: supposed to work. Microtasks happen immediately, before presentation.
KG: my understanding - a microtask like this is like creating an async function and triggering it. Hard for me to talk in DOM spec language. Need concrete use case.
MM: https://developer.mozilla.org/en-US/docs/Web/API/HTML_DOM_API/Microtask_guide
JB: two-level structure of event loop. The real one - things coming in at any time - and microtask stuff that's deterministic. Microtask ordering is deterministic, specified. Want to avoid introducing nondeterminism.
BJ: to answer question - if you scheduled work in microtask updating a texture, and getTexture after that - texture would be alive. But if microtask scheduled after texture - microtask expiring the texture would be made in getCurrentTexture call.
MM: think problematic for 2 reasons. 1) analogy with WebGL. 2) reason Jim gave, but in reverse- because ordering/timing of microtasks is deterministic, and authors can schedule them at any point- if they have microtask soup they know the order they'll occur. Reasonable to write C++'s RAII using microtasks. Acquire resource, schedule microtask to release it - you know it'll be released at end of block
JB: they don't nest, but understood.
MM: not sure what right answer is. At least, larger granularity than microtasks.
KG: task boundary?
MM: think that's natural and makes sense.
KR: WebGL is different, maybe not even at a set time. (for preserveDrawingBuffer:false, which we don’t have) Would like to solve this better in webgpu. I would lean towards on a microtask boundary because that’s better defined than task-boundary.
MM: could we say, at the end of the last microtask of the current task?
KN: we could, if we inject something where the microtask queue is flushed. If any canvas textures, then expire them. Can in theory does this, but according to spec, we're adding something for every task ever. Putting it in super super hot code. Doesn't make sense to spec something assuming you'll put in a lot of effort optimizing it. Our spec folks' feedback: reason microtask queue exists is so you can do thing sin a well-defined way when the task ends. The reason it was added was there used to be a soup of tasks and now they're better defined/specified.
KG: saying "last microtask for current task" sounds like impl approach to doing something at task boundary.
KN: 2 ways we could spec it. Insert item in event loop after running a task. Also, insert items after running rAF. 1) after tasks are run; 2) after rAF callbacks are run in main thread; 3) after rAF's run in worker thread. Have to make sure it's expired after we display it.
CW: isn't inserting after rAF vs. after end of task rAF runs in non-observable from JS?
KG: no because you might have another task that's not the rAF task.
KN: rAF's not a task, but a point in the event loop that runs callbacks. It alternates tasks and rAF callbacks.
BJ: what if you schedule a microtask to schedule a microtask to expire textures? Taking to logical conclusion - schedule microtask to see if more tasks in microtask queue? If so - do the same check again, otherwise expire textures.
KG: sounds like task boundary.
BJ: yes, but written in current spec language.
KG: sounds like we want to do at task boundary - figure out what primitives spec folks want us to use.
KN: thought of this earlier. Returning to why we need this - I'm not convinced this is an imp't use case..
KG:
KN: do we want to let people render in microtasks?
KG: yes.
KN: what if …
KG: sounds like we want to do this at a task boundary? Can we do this?
KN: I think it'd be kind of OK to do it at a task boundary, but think it's complex and difficult to optimize. I don't see what's wrong with queueing a microtask to expire textures.
KG: others of us are convinced of this. If we don't know how to talk about task boundaries we
KN: problem is that anywhere in the web platform that a task happens, we have to potentially expire textures.
KG: I think we're not the right people to determine this.
BJ / KN: we asked them. They said, microtasks.
KG: did we iterate with them? Say that we wanted to use microtasks?
JB: I don't understand why using microtasks for expiration prevents us from using them in rendering.
KG: it's a queue.
KN: if your render task got enqueued after getCurrentTexture, wouldn't be able to use it.
CW: could we ask HTML people to comment on the issue?
KG: I can find the right person on our side at least.
MM: answering Jim's question - rendering often runs compute shaders. Natural for an app to do compute work not needing current texture, then do rendering work which does, then do compute work that doesn't, then rendering work which does.
JB: letting people use async functions is imp't.
KN: (
KR: Have run into problems with people (Construct engine) using async functions with rAF that turned out not to be already-resolved and ran into unreliable rendering results. Shouldn’t try to make things work for this pattern.
KG: I think there are use cases.
BJ: concern is - you want this on task boundaries - but if you await, you don't have control over microtask boundaries.
MM: there are 2 well defined ways people can put microtask boundaries in their code. queueMicrotask, and Promise constructor - new Promise(resolve,reject). JS author can call those functions in their code - deterministically awaiting, running in current task.
await new Promise(go => go)?
MM: understand there's difference between popularity and existence claim.
KN: problem is the usefulness claim - I'm not convinced of this.
CW: for next week - please think about - how we'll decide on this. If we need to pull in more people (HTML folks), let's do so.
CW: this is a breaking change compared to current API, and influences external textures. Let's resolve it soon.

mapAsync(WRITE) zero-fills the relevant region of the buffer #2926

MM: Not ready for discussion yet, needs more work before discussion

Passive Fingerprinting Surface #3101

createPipelineAsync() doesn't reject its promise, even if the created pipeline is invalid #3296

Agenda for next meeting

Postpone APAC meeting for China National Day.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2022 09 21

GPU Web 2022-09-21

Tentative agenda

Attendance

Administrivia

CTS Update

“Tacit Resolution” Queue

Override constant should not provided by pipeline as NaN or Inf #3435

Does getBindGroupLayout return the same underlying object or a new one? #3454

Frame advancement of canvases that aren't visible #3295

mapAsync(WRITE) zero-fills the relevant region of the buffer #2926

Passive Fingerprinting Surface #3101

createPipelineAsync() doesn't reject its promise, even if the created pipeline is invalid #3296

Agenda for next meeting

Clone this wiki locally