Minutes 2018 05 09

GPU Web 2018-05-09

Chair: Ken

Scribe: Corentin

Location: Google Hangout

Minutes from last meeting

TL;DR

Summarized offline feedback from ISVs/IHVs
- Explicit memory barriers don’t give that much gain and incredibly hard to get right in practice. Usually an IHV devrel optimizes them for AAA games.
- Vulkan-style render passes aren’t general enough to be useful, they might give small gains because they allow the driver to place barriers a bit better.
Came up with a list of questions to ask our partners.
Discussed structure of a C header for the API.
- They should be for WebGPU, community can create shims from D3D / Metal / Vulkan to WebGPU.
- Header would allow running the WASM test suite against implementation in native.
- But not all implementations might not be separable at the WebGPU level.
- Need a C header to start, can be a Javascript shim and MVP.

Tentative agenda

CLA
Joint questions to ask ISV/IHV
Structure of the C header for WebGPU
Agenda for next meeting

Attendance

Apple
- Myles C. Maxfield
Google
- Corentin Wallez
- James Darpinian
- John Kessenich
- Kai Ninomiya
- Ken Russell
Microsoft
- Chas Boyd
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
- Markus Siglreithmaier
Elviss Strazdiņš
Joshua Groves

CLA

CW: Described the CLA

CW: AI send info about the CLA so people can review.

CW: AI write down the conclusions of the discussions with partners at the Khronos F2F.

CW: AI produce a C header for a subset of the API.

Feedback from IHVs / ISVs

CW: people were adamant that manual memory barriers didn’t give you very much, and essentially needed a driver engineer sitting next to you to make them (1) correct, and (2) better than what you could do automatically.
- Got this feedback from multiple hardware vendors, and one software vendor.
CW: multi-pass render passes, like in Vulkan, are very difficult to take advantage of. Very good at a particular kind of deferred shading, and that’s about it.
Conclusions:
- Tentatively, do implicit, not explicit, memory barriers
- Do single-pass (Metal / D3D-style) render passes.
RC: didn’t realize more feedback had been received.
DM: Arseny (from Roblox) was probably the ISV providing the feedback.
- They agreed memory barriers aren’t that useful.
- Re: subpasses: they take advantage of them in order to place the barriers efficiently. Combined with feedback from NVIDIA that they can’t produce split barriers from SetEvent/WaitForEvent in Vulkan, you get the idea that they can produce split barriers from multiple subpasses. Probably get some performance from that.
- But, this all means that we probably don’t need to provide either memory barriers or multiple subpasses.
- Not the type of gain that tilers get from subpasses.
CW: unless we get some extraordinary argument, we should stick with this position.
DM: we should have asked them about shading languages and then we would have solved all the problems. :D

Questions for ISVs/IHVs

CW: need to come up with a cogent list of questions to ask partners
MM: the conversations at the Khronos F2F (with IHVs/ISVs) were not well written down.
- CW: we will note them somewhere. (See below)
RC: since the WebGPU F2F happens first, didn’t know that the conclusions from partners were in the meeting notes.
CW: would like to ask them about memory barriers.
CW: would like to ask about multi-pass render passes.
MM: ask about the relationship between SPIR-V and LLVM intermediate bytecode?
- This is a question for driver vendors.
CW: our buffer mapping is quite different from what other APIs do. (You don’t even get the memory when there’s a potential for data races)
JG: would like an overview of how they both upload and download memory. (Do they download anything, ever?)
MM: that question is interesting in the context of how many compute vs. graphics applications they see. Graphics probably doesn’t download much, compute probably does.
- JG: good point, they probably have different perf requirements.
- MM: also would like to know the trend lines to know if compute is growing or shrinking.
- CB: there may be an interest in AI / ML which is mostly compute.
JG: should figure out how much people want geometry / tessellation shaders.
- CW: yes, basically, what features they want beyond OpenGL ES 3.1.
RC: wonder how much they want multiple queues with differing capabilities per queue. (For IHVs)
- JG: for ISVs: do you use multiple queues? If you do, how do you choose which ones to use? (On Vulkan in particular. Like, do you use a transfer-only queue when you can?)
CW: would like to ask how they use multithreaded APIs on multiple threads. Can they give us a high-level overview of how their rendering works?
- JG: think nobody should be trying to build the same command buffer on multiple threads at once. I assume most people are building one command buffer per thread. If anyone’s not doing this, would like to know.
JG: do we have any questions about swap chain output?
- CW: it’s usually platform-specific code so it’s okay if there’s a special case for the web platform.
- JG: more about vsync / SwapBuffers / present
CW: would like to ask about error handling proposal and if it addresses their use cases.
CW: limitations of one or other APIs.
- E.g., Metal can’t do partial clears. (Would need to do it as a render pass.)
RC: how concerned should we be about the slowness of memory allocation, and if we’re going to handle allocation on behalf of the developer, what should we do?
- JG: do they keep their own pools? Do they want to? Do they do aliasing?
CW: would be nice to ask how many draw calls they want to do
- JG: also how many command buffer submits they would like to do
CW: which backends do people have in their engine? What experience do you have with the various APIs? What works and doesn’t?
DM: how do people use bundles / secondary command buffers / encoders in Metal? Both IHVs and ISVs.
JG: AI: make a document with the questions so people can review them.

Structure of the C header for the API

CW: Since there will be multiple implementations of WebGPU, we should come up with a standard set of C headers against which the test suite can be written.
JG: the shape it will take is probably more or less Vulkan-y (or Metal-y). Take an existing API that they have a backend for and use it as a pseudo-native backend. That does the translation to our API. Think that’s more attractive than exposing a C API of our own design.
KN: think we may want both.
MM: we don’t need to own a shim. The community can create it without us doing anything.
CW: for sure someone will do it. But for our own sanity, think we should define C FFI compatibility between the multiple WebGPU implementations in native.
KN: the thinnest possible thing as the testing interface.
CW: it doesn’t have to be nice; just portable between gfx-rs and whatever else implements WebGPU.
JG: not sure how useful that is. We don’t have that with WebGL.
CW: you have the ES headers.
JG: we don’t run the test suites against headless WebGL.
CW: I assume the work in Firefox’s WebGL implementation is mostly validation, not translation. The translation between ES and D3D is done by ANGLE, and ANGLE exposes the ES interface, which is tested separately.
JG: this is another layer which we’d need to keep stable. I’d rather spend the time to get full-stack tests running in the browser than polyfill APIs.
KN: this is the API I imagine would underlie any language’s bindings to WebGPU. Native app targeting WASM, native app running natively. Think it’s good to have the test suite be written in native code. That requires having a common interface, like dEQP runs on the OpenGL ES native interface.
- If the layer’s as thin as possible we ought to be able to write C tests.
JG: the layer in Firefox is likely to not be separable in that way. gfx-rs is exposing the Vulkan portability API and that’s what we’d be more interested in testing conformance against.
KN: NXT won’t have a Vulkan portability interface. So we can’t take Vulkan tests and run them on both implementations.
JG: I just don’t think it’s useful to put a WebGPU native API in front of gfx-rs.
MM: you want to run Vulkan tests on WebGPU?
JG: the Vulkan tests are a way to ensure we don’t make mistakes in gfx-rs.
KN: gfx-rs already has a Vulkan portability interface.
DM: it’s working toward an interface. Trying to understand Jeff’s point better. Is there an advantage to having a common header file used by all of us? So we define Web IDL, and have WebAssembly call-outs go through that IDL?
- JG: more or less. What’s being proposed here is putting together a native header and running our polyfill tests link against “WebGPU native”. I think that’s not an ideal direction. Would rather stand things up inside the browser.
KN: on our side, implementing this header is going to happen anyway; it’s not additional engineering effort.
JG: okay. But standardizing across these APIs is something that …
CW: if we have a WASM test suite, it will call out to something that’s the WebAssembly API for WebGPU. This header will exist. NXT is going to implement that header, or something close to it. And just so we can run the same tests in native and sidestep the entire browser, so we can make this simpler, we want that.
JG: Firefox isn’t going to be separable at that level.
MM: but this is how you define a WebAssembly API.
JG: you can’t link against the WebGPU entry points separately in Firefox.
MM: thought we were going to be the first consumer of a WebAssembly API that calls out without going through JS.
(Discussion about WebAssembly host bindings and import tables…)
JG: but the shape this takes is that you pretend you’re taking in JS.
CW: you can both say that it’s a WebAssembly specific API and that it’s not.
KN: I see how we can not standardizing the header.
JG: seems to be some idea that by standardizing the header that we can run the test suite on a half-stack which is not a goal.
MM: if NXT wants to implement this that’s their own business.
CW: that’s fair. For FF, still think it could be a good idea. You might want to partition things off to be able to test that portion of the stack outside the browser.
CW: basically I’d like to define the C header so that we can start writing tests. Want to do it early so that we don’t have to fix the implementation after the fact to conform to the header. But maybe others don’t have the need as strongly as us.
MM: Safari isn’t going to ship a library that’s “WebGPU outside the browser.”
JG: could be useful to throw a header together that defines how the shim to JS works. But defining more than that seems premature.
CW: what’s the layering between FF and gfx-rs going to look like?
JG: we don’t know yet. One option: do it similar to ANGLE today. Consume it as a pure library. Would be writing against the Vulkan portability API, and then turning it into WebGPU.
DM: so we need the header because we want WASM tests? But we don’t need to standardize it?
MM: if we claim we’re making a WASM API then we need to also make a header.
KN: if it’s not standardized then it would live in the Emscripten project.
JG: having a standard header is not a requirement for WASM today.
CW: Emscripten exposes existing APIs to WASM via shims. This API doesn’t exist at the C level yet. You need function prototypes for everything.
JG: yes, but the point here isn’t that you don’t need to use Emscripten’s OpenGL layer to call out to WebGL. Emscripten’s is the de facto default.
MM: would be great, in our model, if we could cut out the JS shims and go directly to the implementation.
JG: would be cool. But think we don’t have all the information we need to work on that. We have an idea of where the WASM group is going, and where we’re going. But don’t want to put too much investment on that.
KR: We will work with WASM CG and WebGL is a prototype for it. We still need a C header so we can compile WASM apps against it. It can be a Javascript shim to start and the shim can be eliminated later down the line.
JG: as long as we consider this a work-in-progress, MVP shim, that’s OK.
KR: sounds good.
CW: sounds correct.
CW: on our side, we’d like to have something we can tentatively write against. How about we produce a strawman “this IDL translates to this C structure” and show it? Discuss, see if there are any red flags? Would be more or less a straight translation of the IDL.
JG: proposals welcome.
MM: how will we handle Promises since there’s no runloop in WASM? One obvious thing is eliminate all places that use Promises, but we could possibly do better
CW: I think we can do callbacks. That’s the way to do promises in C.
KN: can also have a pollable Promise-like thing.
RC: what about all these dictionaries?
- CW: thinking extensible structs like Vulkan.
JG: can throw structs at it. Because you’re statically linking against it.
CW: but if you want extensions you need the “pNext” pointer.
MM: how does D3D do it?
JG: COM.
CW: D3D doesnt’ have extensions per se. Add new features, query a device, then try to cast it to a Device1 type. They don’t have extensions going in multiple directions at once, just foward.
MM: so there are three options and all are terrible.
CW: yes, all 3 are terrible but the community will build things on top of it.
KN: we don’t have a choice, a C API is the way you bind things into other languages.
JG: static linking can work. The COM thing can work.
KN: what if you want to use WebGPU from ruby?
JG: they write their own bindings.
KN: there needs to be a C FFI.
JG: you can just link against the thing? How do you call D3D from Ruby?
CW: it’s C.
JG: but D3D isn’t C.
KR: we have to keep WASM apps running across WebGPU feature upgrades.
CW: there will be a small amount of shimming on the non-critical function calls.
MM: (something about calling function 1 vs. function 2)
JG: the shims we’re talking about writing don’t have to be that consistent and portable because it’s always statically linked into the app. We have more leeway here. The API surface (which can’t change easily) is the WebIDL one. Everything else is what we want it to be.
CW: we’ll take an AI to translate some of the WebIDL to a C header.
JG: if you want to take a subset of the API and do that that’d be great.

Agenda for next meeting

JG: More stuff about error handling?
- CW: Think we’ll just merge.
JG: get the questions ready for the advisory list.
- KN: review them beforehand.
CW: how do we do mathematical definition of what is allowed and not allowed with implicit memory barriers? e.g., this is allowed, this is a validation error.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly