Skip to content

Minutes 2020 11 23

Corentin Wallez edited this page Nov 27, 2020 · 1 revision

GPU Web 2020-11-23

Chair: Corentin

Scribe: Ken

Location: Google Meet

Tentative agenda

  • Administrivia
    • Joint WebGL / WebGPU meetups?
    • Tomorrow’s WGSL meeting
    • Patent policy
  • PR burndown
    • Refactor GPUBindGroupLayout to isolate binding type definitions #1223
    • Add validation steps for createTexture and createTextureView. #1237
    • Specify [=debug label=]. #1240
    • Remove GPUFence #1217
  • Primitive restart value #1220
  • Rename "depth-comparison" to "depth-stencil" #1230
  • Make blending state explicit #1134
  • Multi-queue synchronization #1169
  • Agenda for next meeting

Attendance

  • Apple
    • Dean Jackson
  • Google
    • Austin Eng
    • Brandon Jones
    • Corentin Wallez
    • James Darpinian
    • Kai Ninomiya
    • Ken Russell
  • Intel
    • Yunchao He
  • Kings Distributed Systems
    • Hamada Gasmallah
  • Microsoft
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Dominic Cerisano
  • Michael Shannon
  • Mehmet Oguz Derin
  • Nathan Johnson

Administrivia

  • Joint WebGL / WebGPU meetups?
    • CW: there are a number of WebGL meetups currently. Good attendance, lot of attention from ecosystem. Expectation is that ecosystem will expand
    • JG: as long as it isn’t to the detriment of either, having joint “web graphics” meetups would be good
    • BJ: haven’t we heard from Neil that Khronos is supportive of this?
    • CW: yes. WebGL / Khronos in general thinks it’s a good idea. Half of the people in this WG are in the WebGL WG.
    • KR: Reached out to Neil because there was a lot of questions in the last WebGL meetup about WebGPU. <list of questions>. Given the interest we should expand the meetups and start including WebGPU. Neil was strongly supportive about doing this on an ongoing basis. He wants to expand collaboration between Khronos and W3C. Khronos funds the WebGL meetups (GDC space, Zoom account, ...) and it would be great to use these resources. WebGL Webinar was 160 live participants. Suggest that we have WebGPU content in these meetups instead of forking.
    • DM: Only webinars and meetups or also standardization?
    • KR: Webinar + external audience meetups.
    • DM: Sounds great!
    • CW: if there’s any concern at some point, we can fork - e.g. if Khronos doesn’t want to support the cost of it, etc. But if problems arise we can figure them out. Think we should do it.
    • KR: Plan is that the next meetup would be mid january (6 meetups per year ish), could have WebGPU tutorials there
    • Nathan: about WebGPU tutorials - would be extremely useful - trying to get into WebGPU, but only code examples exist.
    • CW: makes sense. There are a couple of outdated tutorials, but as Chromium’s aiming for origin trial, we’re aiming to have a canonical compute example and others.
  • Patent policy
    • CW: as agreed last meeting: will use new patent policy. François filed the form to make that happen. We don’t have to do anything, companies have to rejoin the WG, but that’ll be trivial.
  • WGSL meeting tomorrow?
    • Dean / Dan can’t attend WGSL meeting tomorrow. Other constraints?
    • DM: Jeff has vacation tomorrow as well.
    • JG: I do, but was planning on taking the call. Useful to keep it going forward. Mozilla’s given us the days off.
    • CW: OK, let’s assume we do it tomorrow.
    • DM: I was thinking it might be worth skipping without Dan, Apple and potentially Jeff.
    • DM: let’s all focus on the homework, review PRs that David’s posted today.
    • CW: sounds good. Might start having less topics on the API side, so maybe we can use parts of this meeting for WGSL stuff.
    • CW: tomorrow’s meeting is cancelled then.

PR burndown

  • Refactor GPUBindGroupLayout to isolate binding type definitions #1223
    • BJ: everyone seems happy with this, small bikeshedding, I don’t have enough knowledge of breakdown aside from strawman Dzmitry suggested.
    • CW: strawman looks good. Since we can’t sample integer types. GPUTextureType is a bit generic. My texture’s multisampled, is that its type? No. It’s texture 2D, is that its type? No. Etc.
    • JG: aren’t these previously existing problems?
    • CW: we had GPUTextureComponentType. This removes it.
    • JG: would be nice - when we make these larger changes - would be nice to have an example embedded in the spec which is updated. Then can see the difference.
    • BJ: would like to have examples embedded in the spec, period.
    • JG: we have some, would like to have more.
    • DM: is this for readers / end users, or reviewers?
    • JG: everyone. Logical checksum to make sure we’re implementing what we expect. Examples are non-normative, but give us good view if we have spec inconsistencies later. Helps with review, too.
    • DM: there is some example in the repository.
    • CW: right, we have examples. They’re out of date.
    • DM: I try to update with my PRs. If we do that it’ll make reviewing easier.
    • JG: as these types become more composite, would be helpful to have them updated. Right there next to it. As reader of the spec, trying to figure out how it works. Should keep them updated.
    • DM: think naming needs to be worked on, can iterate. Structurally, think we should do the changes I suggested before landing, because they change the structure.
    • CW: yes. Arguable whether to separate storage textures or not. That’s a structural change needing to be highlighted.
    • BJ: I echo Myles’ comment - separating out storage textures made writing the validation code, and easier to read too.
    • KN: major point of this - to make it easier to understand. Since storage textures are something you don’t have to think about very much, it’s good that we separate them out so you mainly don’t have to think about them.
    • JG: +1
    • CW: sounds like everyone’s happy with it. Let’s integrate Dzmitry’s proposal, land it, and then work on naming.
    • KN: think we can come up with good names in Editors’ meeting later today.
    • BJ: can have structural change ready by that meeting later.
  • Add validation steps for createTexture and createTextureView. #1237
    • JG: addressed most of the concerns. DM brought up #799, did related work we need to integrate at some point. This forbids 0xn textures (min size 1x1). Not a lot left to talk about. DM brought up: we can turn things into switches, but I don’t know how to write - depth should be related to array layers for non-3D textures. Something needs to be fixed, not sure what it is.
    • CW: the duality of 2D array and 3D textures. Same thing but different because APIs might swizzle them differently.
    • JG: ok, just needs to live in a switch then. BaseArrayLayer, ArrayLayerCount should be 0 and 1 for non-array textures.
    • CW: yes.
    • JG: there’s the option to ignore them, but that puts backward compat at risk later.
    • CW: yes, and also there’s eventually the thing about interpreting 3D texture as 2D texture array.
    • JG: I’ll add that. Other comments?
    • CW: think we can defer to editors at this point.
  • Specify [=debug label=]. #1240
    • JG: sorry, haven’t addressed comments on this yet. Anyone else have comments? So far feedback is positive. I have nits but nothing preventing landed. Idea: should try to preserve these debug labels, and must not use labels for internal heuristics. Can’t enforce this, but impls must not rely on labels to give ourselves out-of-band information.
    • CW: oh, like DOOM 2016 runs faster when renamed to rage.exe?
    • JG: yes. Also maybe not clear mapping between label you give us - UTF16 DOM string - and underlying APIs that use ASCII and have length limits. Little stronger language now. Previous was completely ambiguous.
    • CW: non-controversial, can edit offline.
  • Remove GPUFence #1217
    • DM: had fences originally because we didn’t know how data transfers and multi-queue would work out. Now we know how they’re going. Fences not needed in the way we had them. PR simplifies them greatly. onCompletion promise on the queue, which handles all prior uses of fences in WebGPU. Any concerns?
    • CW: sounds good. Only thing: onCompletion might need to stay. If we add back signals, etc. in your multi-queue proposal, we might still need onCompletion. onSignalCompleted, too, taking specific signal? Don’t think we can overload onCompletion to take signals.
    • JG: this revives that decision between signals and semaphores, in Vulkan.
    • DM: they’re both fences.
    • CW: onCompletion without parameters - wait for all previous work. onSignalCompleted taking signal, and another call - … - can distinguish between onCompletion with no argument, and onSignalCompleted with single arg.
    • JG: this would be wait(), then?
    • CW: in current multi-queue proposal, we need signal values to signal and wait on on different queues. That proposal also rewords fences. Queue.onCompletion of signal value.
    • DM: I don’t think it’s necessary. IF you know where you want callback called, call onCompletion at this point. Don’t need to pass a signal there.
    • CW: what I meant - in MQ proposa, there’s teh same function name taking a signal value.
    • JG: not any more.
    • CW: ah - never mind then.
    • JG: about the name - one thing - would like name to emphasize that it’ll resolve promise when all pending conversations are complete.
    • CW: and writeBuffer and stuff?
    • JG: all pending queue operations.
    • DM: what about Kai’s suggestion, workCompleted?
    • JG: I liked pendingWorkCompleted. Would like it to be clear - work submitted previously. Not a “waitForIdle() or similar.
    • CW: name doesn’t have to be super-compact.
    • CW: pending is a bit weird. Is a submit really pending, or is it executing? Bikeshedding at this point.
    • DM: mind if we polish on Editors’ call?
    • JG: that’s fine.

Primitive restart value #1220

  • CW: think we can skip this - part of the bind group entry …
  • DM: I added #1220 to the agenda. Can we talk about this for a moment?
  • DM: have problem with primitive restart Metal, affects all topologies. Same effect can be had in Vulkan and D3D12. SetIndexBuffer - format not specified, only at pipeline creation.
  • DM: did test on Metal - doesn’t do the primitive restart on non-stripped topologies.
  • KN: got confirmation from Jasper that it doesn’t. Don’t want to have to do scan for …
  • DM: doc bug in Metal most likely. Just keeping spec we have today.
  • KN: Want to make sure that the behavior is actually correct, might not be able to test it easily for 32bit values. Should at least test it for 16bit. If there are drivers with bugs we'll have to treat it as an invalid index??
  • DM: my test confirms that 16-bit index with all F’s is treated as invalid index on non-stripped topologies.
  • KN: might be driver bugs where that’s not true.
  • CW: we’ll have to discuss what to do with driver bugs. Not sure what WebGL policy is, but we’ll take inspiration from it. Let’s trust the documentation for now, and that the APIs are bug-free.
  • JG: sounds fine for now. Keep eye out for any issues later. Want to make sure people don’t expect they can use max value for some weird terminal, and we have to change it later. Thanks DM for investigation & test case.

Rename "depth-comparison" to "depth-stencil" #1230

  • DM: we’re going to bind stencil with a separate aspect for it. Don’t need depth-stencil name for texture component type.
  • JG: would this just be depth then?
  • DM: depth, or depth-float, something along those lines.
  • JG: depth-sampling?
  • KN: think this’ll be folded into #1222.
  • CW: depends on what GPUTextureType is renamed to. Texture type = depth makes sense. Texture component type = depth-float. Let’s re-discuss after Brandon’s PR.

Multi-queue synchronization #1169

  • DM: main contention is what’s implicit vs. explicit. API currently has everything explicit. Interesting convo with Rafael - diff choices btwn what we can do in these cases. https://github.com/gpuweb/gpuweb/pull/1169#issuecomment-732199828
  • DM: bothers me. Don’t think API will be good with this. Think things will be better explicit. Less work for us too. Fewer fake submissions just to synchronize between queues.
  • CW: agree different levels of explicitness. If explicitness in transfer_start, then don’t need fake submissions. Explicitness in transfer receiving ends…
  • JG: Here, explicit source transfers. Not implicit use of resources on other queues without initiating resource transfer. Here: do we want implicit waitSignal or should these just be implicit? Note, for expedience, would be backward compatible to choose to allow skipping Queue.wait in the future. If waiting implicit in the future, it “just works”. Makes me inclined to accept this in its current form.
  • JG: I have concerns about what Rafael’s echoed too.
  • RC: I don’t understand with current API - you have to do correct signals. You can’t do them incorrectly. Can the developer do better wait/signal than what the API lets them do? If yes, then we should take the PR. Think we should add this as an extension later, maybe as part of bindless, or as explicit barriers. Nobody’s saying we should do implicit transfers, just wait/signal to be implicit vs. explicit.
  • JG: extra benefit DM and CW are putting forward here: forces users to know what they’re doing, and for synchronization to be visible to them. Though it makes extra work, much more obvious when you’re incurring a wait and pipeline stall. May choose to design the API to be more obvious and cause that extra work, because developers write better code.
  • RC: if you the web developer do the wait early, code works more poorly than it should, if the API does the work for you. If you want to do better and know what you’re running on, seems like something you should opt into.
  • CW: goes both ways. Hard to judge. Perf discussion to be had, if developers knows/doesn’t what they’re doing. Ergonomics seem superior in one case. Perf discussion - tradeoff. Explicit wait/signal - if dev gets it wrong, they can lose some perf. When browser does receiving on behalf of developer - can just delay wait a little bit more - but wanted both my shadow passes to run in parallel of this thing, not just the second one. For MQ, coarse-grained parallelism, juggling resources where we don’t know what’s going on in users’ code. Limited trust - but have to trust perf-wise they know what they’re doing, because we have no idea.
  • JG: related to D3D team’s feedback. If we can do the validation, we should just eject the fences. Worth remembering that those experts told us that.
  • RC: was going to say that too. Min bar is what browser can validate. Question for us, should we allow web developer to do better. Risk that they could do worse. Some hardware that all goes serial. Appears more readable, but parallelism you get is lower than you expect. I think signals should be implicit at first, and later if you’re writing for specific hardware, then later let them opt into that. I realize it’s 3 against 1, so if you want to do explicit waits/syncs, we can see how it goes.
  • CW: I think we’ve made a lot of progress on this discussion, but think it doesn’t need to be in MVP of the API. DM wanted this discussion to inform wgpu’s prototyping - think we have a precise enough direction of where this is going that we should just prototype it and defer it until later.
  • JG: I’d love to have the spec text saying this is how it works. This is why I approved the PR. We could go both ways even in backward-compatible way in the future. IF I had to choose if i wanted only explicit or only implicit - I’d want only explicit. Also want MQ in the spec even if we change it later.
  • CW: so you really want MQ in MVP of spec?
  • JG: yes.
  • DM: one thing not clear - if impl can do it better than the users. I think the answer is no. Users should be able to do it as efficient as impl in all cases. Rafael assumed answer was yes.
  • RC: if user puts in a wait that’s too early, could lead to poor perf.
  • DM: we’re talking about extra powerful feature.
  • JG: sort of a footgun. Users will think they know enough to use it, but they might wait too early and incur over-synchronization in the app. Rafael’s asking, why are we letting them make their code worse?
  • CW: making these ops on the Queue makes it much easier to understand what’s going on rather than browser doing it behind your back.
  • JG: use case of wanting more explicit transfer / receiving - have some way for apps to opt into strict mode, issue queue assert, opt into resource reception. If you want to do that work and make it more obvious in your code, user can choose to do so. Implementability, we wouldn’t need to check their signals just to tell them they’re wrong, even though we can issue the waits for them.
  • JG: API on the table for initiating queue transfers is good / sufficient to work with. Just discussing receiving side. There, do you wait on the right signal at the right time? Feels to me like people are trying to satisfy real user desire. Don’t think every user has this desire. Users do get a footgun, where if we didn’t give this to them, they wouldn’t mess up.
  • CW: but they wouldn’t get the perf they want. Not sure who uses it in Vk except for id Software. Today, people using MQ are AAA developers. That’s it. Nobody else using it. OK, we add this to the API now - but do we need to put training wheels on it when it’s an advanced feature?
  • JG: feels like we’re putting a kill-switch into it. Users most wanting to do this can - users who think they’re good enough to and get it wrong, will get it wrong - but we’re being mean to middle-ground users who would get benefits from this, but we’re making it hard.
  • MS: A much simpler MQ would mean that more people are trying to use it, and incorrectly - so what RC and JG are saying makes sense.
  • DM: maybe implicit transitions are usable / needed - but I strongly believe that studios will ask,how can I make sure this is happening? Otherwise can’t optimize. Maybe they’re writing middleware, and have to understand what submissions are happening. If WebGPU does things implicitly they’ll throw it in the trash. The path there is to give them what they need to get perf, and then think about how to make novice life easier.
  • JG: counter-proposal - this is what Metal does.
  • KN: no. Metal’s queues are doing something different than other APIs’ queues.
  • CW: they can’t talk to one another.
  • CW: what about: super expert developers want to know their graph ops on queues run exactly like this. This means wait/signal. Without resource transfers. Just want to run things in parallel. But when you do GPU resource transfers, can do autoWait:true and we’ll wait for you.
  • JG: what about for users needing to know when resource transfers happen, can they polyfill that?
  • CW: no, because they want to control the graph precisely.
  • JG: they can write the code for that if they need to. Sounds like validation layer that can be written later.
  • CW: if you do two transfers and only the second one uses the resource, how does browser assert that it only received for the second transfer and not the first? EA Seed wants to know that they both ran in parallel.
  • JG: think that any information can be done at user level. Dump information into SharedArrayBuffer.
  • CW: I think there’s a misunderstanding but we’re out of time.
  • JG: I just want us to think hard about how to solve use cases and be flexible about how we solve them. Not just focus on having a solution and using it.
  • CW: 2 things. If we go with implicit receiving have to talk about concurrent access on multiple queues at the same time. Second, we want to get WebGPU out, and MQ is big and complicated. ~2 months of multiple engineers to implement and test it. I don’t think it’s worth it for the MVP.
  • RC: +1 to that.
  • DC: heard - Metal doesn’t do resource transfers between its queues.
  • JG: ignore what I said about that. I have a less good grounding in Metal than others who responded.
  • DC: I thought Metal did support MQ and support the use case of parallel queues with no resource transfers. Could that be included in MVP of MQ?
  • JG: don’t think resource transfers are the hard part here.

Agenda for next meeting

  • Bikeshed more on MQ?
  • We don’t have lot of stuff for the API, so what about we use this slot next week for WGSL.
  • JG: I’d like to talk about MQ a bit more. Don’t want to spend more than half of timeslot on it.
  • CW: let’s do WGSL next meeting and collect thoughts for meeting afterward.
Clone this wiki locally