Skip to content

Minutes 2019 02 11

Corentin Wallez edited this page Jan 4, 2022 · 1 revision

GPU Web 2019-02-11

Chair: Dean (Corentin was out)

Scribe: Ken (great! as usual)

Location: Google Meet

TL;DR

  • JF continuing to work on WebKit implementation and demo
  • AE continuing to work on Chrome IPC/wire work
  • Mozilla’s native implementation taking shape
  • #196: Validation Error Logging
    • General agreement.
    • Error message strings implementation-defined but we could sanity test them in “optional” tests.
    • KN: Should be orthogonal to object status queries.
    • RC: Not really an MVP blocker. KN: True.
    • Merged.
  • #197: tryCreate*
    • Design needs iteration since we will also need object status for tests. KN: Probably can fold them together.
    • KR: Need to trust drivers not to break down on OOMs.
    • Using an OOM object could be device lost instead of validation error. May allow better dedup of recoverable and non-recoverable allocation entry points.
    • KN: Will iterate by next week. Maybe will keep something close to what we already had.
  • #198: Device Loss
    • General agreement.
    • Slightly conflicts with the unsolved requestAdapter vs getAdapterList decision.
      • Finish #198 first, then work on that.
    • RC: Have comments about handling requestAdapter rejection. KN: Will address. Hope we can merge this week.
  • #188: Binding Model
    • Discussion (largely recap), no resolution. Wait for CW.

Tentative agenda

  • Error handling
  • #188 and binding model
  • Vertex input and attribute changes
  • Agenda for next meeting

Attendance

  • Apple
    • Dean Jackson
    • Justin Fan
    • Myles C. Maxfield
  • Google
    • Austin Eng
    • Dan Sinclair
    • James Darpinian
    • Kai Ninomiya
    • Ken Russell
    • Shuai Shao
  • Microsoft
    • Chas Boyd
    • Jungkee Song
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Yandex
    • Kirill Dmitrenko
  • Elviss Strazdiņš
  • Joshua Groves

Implementation Status

  • Apple
    • JF: have a patch that’ll let you specify a depth attachment. Next thing is to redo impl of buffer uploading because that’s approved and merged. Right now we have a simple persistently mapped buffer.
    • DJ: that means that JF can do a simple rotating cube w/o depth culling. Next it will have it, then be broken by buffer uploading, then fixed.
    • JF: I have a demo that I’m keeping up to date.
    • DJ: we should also upload it to the repo. Right now it uses MTL shaders, but within the week it will use WHLSL.
    • JF: I have two versions. One is Frankenstein, only works on WebKit internal version, and one’s a reference that I finished over the weekend. We should reach an agreement on what a WebGPU demo should look like when impls are ready for MVP.
    • DJ: technically WebKit nightly should still work for this demo, otherwise build for yourself and it’s behind a flag. (Enable developer menu, turn on item, etc.)
  • Google
    • AE: investigating how we’ll integrate impl into Chrome with all cross-process and cmdbuf impl. Have a plan to move forward. Working with fences now, may have small updates. Next we’ll look at buffer uploads. Still need to match WebGPU on buffer and error handling APIs
    • KN: we had a WebGL bug fixit last week. Corentin probably did WebGPU work last week.
    • DJ: we’re trying to write tests for this. Should make progress on Kai’s test harness.
    • KN: yes. I’ve been prioritizing IDL snapshot over tests recently.
    • DJ: right now even with your harness it would be possible to write simple reftests.
    • KN: should be easy to copy.
  • Microsoft
    • RC: not much new. Reviewing lots of pull requests. Brandon and Brian from Intel have been contributing to Dawn. Resource allocation, debug layers.
  • Mozilla
    • DM: progressing toward native impl. Can render textured cube. Fairly close to what Apple has in the browser, but we haven’t started our browser integration yet.

Error handling

https://github.com/gpuweb/gpuweb/pull/196

https://github.com/gpuweb/gpuweb/pull/197

https://github.com/gpuweb/gpuweb/pull/198

  • KN: #196: (Error logging callback)
    • Gives you OOB channel to receive validation errors. Generally, app doesn’t need to know when programmer’s made a mistake. Can be used to generate a bug report, etc. Reports validation errors in English. Does so through event mechanism.
    • DJ: will we define the strings?
    • KN: no plans to.
    • DJ: so in tests we’ll just verify that validation error fired, not that something useful was reported?
    • KN: we may want to add back some object status query just for tests. Can be orthogonal.
    • DJ: validation error won’t have a property keeping a reference to the thing that was bad, right?
    • KN: used to be there, but figured we can just use the debug labels to refer to them.
    • KR: Shrek didn’t you just deal with a WebGL bug that had to have the string “drawElements” in the error message?
    • SS: it was a typo.
    • KN: it was drawArrays vs. drawElements typo.
    • JG: in WebGL tests we tend to require e.g. INVALID_OPERATION vs. INVALID_VALUE errors.
    • RC: will we require specific error messages?
    • KN: maybe. Might be beyond conformance. Plenty of heuristics browser might use to avoid spamming error messages.
    • JG: if impls try to give good error messages, probably more a developer tools thing.
    • RC: API does spec what makes an object invalid, etc.
    • KN: we will have to test those.
    • RC: ok, for those, you’ll have to raise this event in those cases.
    • KN: that’s what I haven’t thought about yet. Since we have to test that an object is invalid in certain cases, objects have to be able to know whether they’re valid, and telemetry isn’t the way to know that.
    • RC: ok. I have no objections to details of PR, but: is doing this telemetry something people have asked us to do? Doesn’t seem to be an MVP thing.
    • KN: I understand this.
    • JG: it’s easy enough to do. Have prior art (KHR_debug extensions from OpenGL are extremely useful). INVALID_OPERATION reporting is useful. Same for WebGL’s warnings to the JS console.
    • KN: my thought was that we’d print the console warnings separately.
    • RC: I’m all for console warnings but have people asked for this?
    • KR: Maps team definitely verifies that the app is running correctly on users’ machines.
    • RC: ok.
    • JG: sgtm
    • DJ: do we want to say this one is accepted? Doesn’t have anyone say “tick, reviewed” on it.
    • KN: would anyone like to review this offline before we merge it?
    • DJ: I think we should just merge it.
    • JG: +1.
    • RC: fine merging it.
    • KN: ok. can’t merge rest because they’ll conflict.
  • #197: tryCreate.
    • KN: this is interesting. Gives us a Promise API for allocating Buffers and Textures specifically for trying and failing memory allocations. Issues: if you do fill up memory and you make another required allocation in the app, and it fails, your app dies.
    • This particular flavor, with Promises, is reasonably clean. Similar to createReadyPipelines. Can’t speculate on something where you allocate lots of objects and then fail cmdbuf submit. Probably fine. But maybe that’s not the implementation we want to have.
    • JG: would CIN be an approach for the infallible ones? Create these, I expect them to succeed, and if they don’t it’s a problem for the dependency graph?
    • KN: yes. Problem: would have to be able to check object status. Need that for tests anyway though. Also don’t want to print errors to console until you actually submit cmdbuf. Can delay error message a lot. Not sure what value you get from being able to allocate graphs of objects against a failing one.
    • JG: what’s strange to me is that if you have an unchecked allocation which fails I don’t think we should incur device loss. But maybe we should.
    • KN: incurring device loss for failure to allocate is something I’m not sure about. It makes sense to not let your app run in a state that’s invalid.
    • KR: Experience from WebGL. In Chrome, OOM is typically fatal. FF can generally recover from e.g. texture allocation. I tried to remove limitations on backbuffer. Relying on drivers to report and recover from OOM caused crashes on multiple platforms.
    • JG: FF’s structure is constrained. When you have a malloc/realloc equivalent, we flush everything at that point. Check on client side, did you actually get the memory? e.g. BufferData, TexImage2D, we look for OOM at that point.
    • KR: Haven’t solved bug, so don’t know why it happened. But have seen multiple platforms that behave poorly when a test OOMs.
    • DJ: I’m taking your comment to say we should device loss if we can’t create a buffer.
    • KR: not really. I think mainly we have to try something. Need experience with trying to recover. WebGL can’t query memory size. Has been a problem. Have needed to advise developers on heuristics like sizing texture caches to 100 bytes per pixel of the canvas.
    • JG: point about fallible allocations. Think you could do it with either Promise or CIN approach.
    • KN: think that if we were to have only one entry point it should not return a Promise. Since we need a mechanism for testing we should probably choose one. Certainly possible, and since we need a way to test object state for tests, would make sense for them to be the same.
    • JG: I like that idea.
    • KN: I think this basically means not merging this PR. We did have object status before this PR, but might want to rephrase it.
    • JD: is the suggestion we should have only one entry point for allocating buffers and textures?
    • KN: yes.
    • JG: all entry points have to be validatable. Seems redundant.
    • KN: might be nice to have entry points / arguments where if you allocate things you lose the context.
    • KR: can that be done with a debug layer?
    • JG: or do it yourself.
    • KN: probably. Let me think about it.
    • KR: we should make sure that if it’s a debug layer it doesn’t kill performance.
    • KN: we will think about that. Will have a way to lose the context. Not so much a debug layer as implementing a slightly different semantic on what we give.
    • RC: ok, so for now, not doing promises, all allocations appear synchronous, there’s CIN, and out-of-memory causes device loss?
    • KN: not aiming to do device loss. Can recover.
    • JD: if you get OOM and use it, is that device loss?
    • KN: planning to have this be a validation error. Let’s discuss more later after rethinking.
  • #198: fatal errors. (Device/adapter loss)
    • KN: this is the simplest API I could come up with for app being able to recover without having to do crazy things like “wait for this promise, check that flag over there, etc.”. What do people think?
    • JG: generally like it. For requestDevice, we’re now requiring a device loss callback.
    • KN: there’s some discussion on that in the comments.
    • JG: think it should be part of the device descriptor.
    • KN: probably more likely it’ll be changed to an event. Issue: requestDevice, then attach event, have to be guaranteed that there wasn’t an event fired that you couldn’t see. Then app will be in a weird state. There was some discussion with Francois, Rafael, etc.
    • RC: yes. Now that you’re making this two promises, do we get rid of the notion of an adapter? Maybe you just ask for the required list of extensions.
    • JG: I don’t think we should do that but can we discuss that separately?
    • RC: if we have adapters then we need the same kind of events there.
    • KN: Corentin did reply to your comment overnight. Removing the adapter prevents you from learning about limits, extensions, etc.
    • RC: WebXR got rid of device notion recently. You say what you want. Reduces fingerprinting.
    • DJ: that’s the way media queries work too. You don’t get an enumeration of all of the cameras, but you say what you want up front.
    • JG: think it’s hard for graphics APIs to describe what you want ahead of time. For example everyone tries to use eglChooseConfig up front, and that doesn’t work so they enumerate everything.
    • DJ: is that because you need logic in your configuration function?
    • JG: yes.
    • KR: perennial problem with config choosing. WebGL kept it simple (getContext args) deliberately because of problems trying to enumerate the configs.
    • DJ: what are the knobs you want?
    • JG: discrete vs. integrated.
    • KN: this is actually slightly separate. Today we have requestAdapter and that doesn’t let you do this selection logic. The sketch IDL doesn’t have enumeration.
    • JG: can we keep talking about this in a separate PR? Think we should handle device / adapter loss in an 80% fashion.
    • KN: SGTM.
    • Discussion about Promises resolving to null vs. rejecting.
    • DJ: OK, let’s resolve to merge this one as is. Kai or Jeff can open issue/PR about requestAdapter issue?
    • RC: if we leave the adapter in there, sample code doesn’t handle all situations where adapter fails. Will we take that up as a separate issue?
    • KN: let me think about that particular feedback more and make sure my sample code is handling the right cases. Need to document what it means for requestAdapter to fail. Separate issue of enumerating a list of adapters, we should discuss afterward. If we decide we don’t want a list of adapters then we can discuss requestAdapter again.
    • RC: ok agreed.

#188 and binding model

https://github.com/gpuweb/gpuweb/pull/188

  • DJ: Corentin was vocal in the pull request. Myles / Apple not completely happy with resolution. Maybe we can talk about that next week.
  • DM: I would be happy to talk about any possible misunderstandings about my comment, and describe how descriptors work in all of the APIs. If anything is unclear I can clear it up.
  • MM: I think I understand it. Fundamental disagreement is: what is the relative importance of being compatible with HLSL vs. impl complexity?
  • JG: my off-the-cuff answer: if we require people to add a line or two to their shader during porting, that would not be the end of the world. If that’s equivalent to polyfilling them I’m OK with that.
  • MM: I think that makes sense. The specific change is rewriting all of the register annotations so the numbers don’t collide. It’s a global find/replace. In general I think your point makes sense. The reason we’re interested in HLSL source compatibility is because this is something this CG said they were interested in.
  • JG: there’s not a midpoint. I think we derive value from being ~97% compatible.
  • MM: I think that’s OK but I think we need to decide what the 3% incompatibility is.
  • KN: the # of lines for bindings will generally be small.
  • DJ: though this will be a small change per shader it probably occurs in a huge number of shaders. Myles can correct me but our research shows that the number of changes would be large.
  • KN: there’s an important distinction between D3D11 and D3D12 shaders. I understand the root signature is usually done outside the shader source in D3D12. It’s not that we want to inherit that complexity but it ...
  • KR: point about not overspecializing the binding model for D3D11 shaders.
  • DM: upon further consideration I don’t think we should have the map I described. Most of the 3 APIs agree on the binding model and they don’t have this map. It’s not that we have to change the shaders. All of the D3D12 apps using these signatures have this root signature mapping. They just have to pass it in some shape. It’s not something new, it’s something they already have. It’s just providing the API to pass that info to us.
  • KN: this is true both for D3D12 and Vulkan apps that use HLSL shaders.
  • MM: OK, so either: providing an API to supply this info, or restricting WHLSL further.
  • KN: idea to have a flavor of WHLSL where you don’t use the type as part of your index. There was some discussion about this.
  • MM: I don’t think this is worth adding a flavor. Perhaps the API construct might only be necessary for source code that has conflicts.
    • KN: Code without conflicts is really what I meant by flavor.
  • DJ: would be nice to resolve this next week if Corentin’s around.

Vertex input and attribute changes

  • Not done this time

Agenda for next meeting

  • Vertex input and attribute changes
  • #188 (WHLSL and binding model)
  • DM: #205 (pipeline layout). Corentin proposed to remove the interface.
  • KN: will try to make changes to #198 and probably merged before meeting. #197 will see if we’re ready to discuss again next week.
  • DJ: maybe pull request burndown.
Clone this wiki locally