Skip to content

Minutes 2018 05 23

Corentin Wallez edited this page Jan 4, 2022 · 1 revision

GPU Web 2018-05-23

Chair: Corentin

Scribe: Dean

Location: Google Hangout

TL;DR

  • Validation rules for implicit barriers
    • Sub-resources
      • Track sub-resources (per-slice, per-mip)of textures independently.
        • Need to talk about resource aspects at some point (depth vs. stencil vs. color)
      • Do not track sub-ranges for buffers.
      • Discussion about how texture views are expose (separate object or “child textures” like Metal)
    • Write-write hazards
      • Strong desire to prevent write-write hazards in the general case.
      • Difficult point is about UAVs because draw-calls can be reordered but splitting render-passes kills tiling optimizations.

Tentative agenda

  • More on validation rules for implicit barriers.
  • Agenda for next meeting

Attendance

  • Apple
    • Dean Jackson
    • Myles C. Maxfield
  • Google
    • Corentin Wallez
    • Victor Miura
  • Microsoft
    • Chas Boyd
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
    • Markus Siglreithmaier
  • Yandex
    • Kirill Dmitrenko
  • Elviss Strazdiņš
  • Joshua Groves

Admin updates

CW: Reviews of the CLA are still in progress

CW: Hoping that in a couple of weeks everyone is agreed.

Validation rules for implicit barriers

  • CW: Last time we agreed that we need at least read/write memory hazards validated out. There was the hazard with UAV and sub-resources (mip level) that still needed discussion. Has there been any thoughts?
  • DM: The latter seems easy. Track them like Vulkan and D3D12. Do range tracking. Instead of working with a single resource, you work with a set.
  • CW: OK. We thought something similar. As an impl detail, we don’t need to remember things for most textures, because they are just used for sampling. It seems it would be fast enough.
  • DM: That aligns well with the generic advice we were given from AMD. Work with a small number of resources that can change state.
  • CW: Do we want to track ranges of buffers separately?
  • MOZILLA: No.
  • JG: There isn’t a big advantage to using a single mega-buffer. Just use smaller ones. We could also come back to this later.
  • CW: Yeah, we could possibly create sub-buffers from buffers, or sub-ranges, later.
  • MM: The other point to make is that when WebGPU says to make a buffer, we don’t have to always pass it on to the driver. We could reuse an existing buffer, or implement our own suballocator.
  • MM: In other words, if we require the user to make lots of small buffers, we can handle it ourselves.
  • CW: At least for impl on Vulkan and D3D, we’ll have to work on our allocator.
  • RC: I agree.
  • DM: This would mean that you can’t use half a buffer for reading, and the other half for writing.
  • CW: Correct. If developers complain, we can go back to it.
  • KD: I’m worried about performance here if the WebGPU impl isn’t allocating at the time the user asks for it. Streaming content would want allocations to happen instantly.
  • CW: Good point. Memory allocation will be expensive, but shouldn’t be too bad.
  • DM: We discussed this during the F2F, and the conclusion was that we’re not providing API for re-allocation. It doesn’t seem straightforward to go any further right now.
  • CW: We can easily measure this when we have implementations. For Chrome it will never be a problem, but that’s not typical use. We should measure the distribution for allocation times on other implementations.
  • CW: A user could also create a buffer on a worker.
  • JG: I don’t think that will help.
  • JG: If we don’t want to allow sub-view ranges, and if people want a few buffers around, and want to create and destroy them, then we might need to provide API to manage it rather than relying on re-use.
  • CW: Not quite what I meant. I just suggested that applications could hide that cost by performing it off-thread.
  • CW: Sub-resources for textures -> YES
  • CW: Sub-resources for buffers -> NO
  • MM: What are the rules? Are we adding a new API object?
  • CW: Per slice and per mip-level. No new API object.
  • DM: It is still exposed to the user when they have to create a bind-group.
  • CW: They are exposed inside the creation of ImageViews. You specifiy a base slice, range and mip level.
  • MM: So we are having a new object ImageView?
  • CW: Yes. Maybe we’ll have a BufferView in the future, but not for MVP.
  • MM: If I remember, our threading discussion involved buffer views.
  • JG: I think that was one of the theories. Immutable views could be sharable.
  • CW: That’s my recollection too.
  • MM: So we are having a new API object, it just isn’t tracked for hazards?
  • MM: What types are in the API?
  • CW: WebGPUTexture that is the full texture, all slices and mip-levels. When you want to use that texture inside a shader, you create a view. The hazard detection is done per-view.
  • MM: Is it the texture or the view that is hazard object?
  • CW: It is the texture itself but the texture view encodes which range of the texture is read from / written to.
  • DM: In Metal, there isn’t a view, there is a parent texture that is conceptually a view.
  • CW: Can it be recursive?
  • MM: Yes.
  • JG: I think in Metal, a MTLTexture is really a view. You only hold on to views.
  • CW: We should decide whether we want to explain it this way too.
  • DM: The other place image resources are specified is copy operations - image to image and buffer to image.
  • CW: That’s specified when we set up the render targets.
  • CW: I can’t remember how Metal treats this.
  • DG: We might have to track this. I can remember instances where we wanted read-only depth but mutable stencil.
  • MM: Doesn’t seem impossible.
  • MM: I don’t know how this will be implemented. If it involves a custom shader with swizzling, then that would be bad.
  • JG: The idea is as little hidden work as possible.
  • CW: Do we want write/write hazards?
  • JG: Also no.
  • JG: I don’t see why you’d remove read/write hazards, but leave write/write.
  • CW: What if you want to copy from one into multiple?
  • MM: Mulitiple passes
  • JG: Or barriers.
  • JG: You either choose one writer or many readers at the same time.
  • DM: There are use cases where write/write is important. Accumulation buffers.
  • MM: We covered this last week. It can be solved by adding another attachment or splitting it up into multiple passes.
  • CW: I think DM is saying we might want to do something different for UAVs.
  • DM: I don’t. I want to treat write/write hazards to be allowed.
  • JG: I think that term is misleading.
  • CW: I think an example is order-independent transparency.
  • JG: I get the idea.
  • [missed a bit]
  • JG: I just mean that I wouldn’t call this write/write. If you are doing an associative operation, then it isn’t contentious, and should be ok.
  • MM: Our model is that hazard detection is that it happens at submission time. And that fits Jeff’s model.
  • CW: My feeling that preventing w/w is ok in most cases. But since starting and stopping render passes is expensive, we shouldn’t insert barriers automatically in render passes but allow the UAV footgun instead.
  • MM: You example above requires reading from the buffer also. That’s not a w/w hazard, it’s a r/w hazard (that is on atomics so ok)
  • CW: My feeling is that if they go into UAVs they are getting a footgun.
  • JG: I would prefer to not do it and just warn in a debug mode. Don’t make the API harder to use just to eliminate half the possible errors.
  • JG: I think this requires more thought.
  • MM: Proposals - 1. Disallow 2. Allow everything 3. Add machinery that no-one has thought of yet, to work out what hazards are not real
  • CW: My proposal is 1 but allowed for UAVs
  • DM: My proposal is 2 but… (missed this)
  • DM: (describes why 3 won’t work)
  • MM: All these lists are in one big buffer. We can’t know if there are any collisions.
  • CW: You’re asking that the impl doesn’t need to do analysis to work out if there are problems.
  • CW: I think someone invented an AppendBuffer, but it didn’t work.
  • DM: They are not truly associative.
  • CW: Try to give it more thought. Write some more formal definitions of our proposals.
  • DM: If you exclude UAVs, then your own test case wouldn’t work efficiently anyway.
  • CW: Right, we should just think about this case.
  • RC: You’re saying that if you put a bunch of write commands, we don’t check. But as soon as you do a read, then the machinery kicks in.
  • CW: Inside a render pass I’d like to use the same r/w UAV between draw calls
  • MM: In a world where D3D can re-order draw calls, and we’re not inserting barriers inside passes, then how would this work?
  • JG: We’d have to work out what barriers to insert.
  • CW: You can use UAVs as much as you want inside a path - no hazard detection. And you’re on your own.
  • CW: These APIs have extremely predictable parts. But fine-grained parallelizism is controlled by the developer.
  • MM: If we accept this, then we either have to say draw calls inside a path can be re-ordered (which makes it very hard for the author), or the D3D backend will have to insert barriers.
  • DM: Myles, they can’t be re-ordered.
  • RC: If two vertex shaders have UAV access, order is not guaranteed. But for pixel access it is.
  • DM: Outside the UAV it is strictly ordered.
  • RC: Correct.
  • MM: WebGPU has to swallow that and make it part of the API, or D3D has to enforce the order.
  • CW: Even Metal has some parts that do not define an order
  • CW: Without allowing this, write access to UAVs is basically useless.
  • MM: Unless you do multiple passes
  • CW: That’s expensive
  • CW: (describes tiling GPU and UAV access).
  • CW: Splitting them up breaks tiling optimizations. I don’t think any API has a way around it.
  • MM: OK, so we have to go with “we swallow it” and spec it as part of the API
  • RC: You mean allow unchecked writes. Developer must order them by render pass.
  • RC: If we can determine that two shaders are writing to the same thing, we insert barriers.
  • MM: The WebGPU runtime can work out if a resource is read or written in a draw call.
  • MM: What do you do when there is a conflict?
  • MM: 1 Insert a barrier
  • MM: 2. Return an error and make the developer fix it
  • MM: We proposed 1 right from the start, but people disagreed.
  • RC: But I think we changed our mind at the F2F.
  • CW: Inserting a barrier would involve flushing the tiler. We really want one render pass to correspond to one actual hardware pass. It seems like we can’t do this.
  • RC: What does Metal do? Does it look at the shader and put in a barrier.
  • MM: Nope. Same as D3D. Anything goes. You put in barriers by making new passes.
  • MM: Which means to do this, Metal backends would be making more passes.
  • CW: Outside render passes, Metal does everything it can for you w.r.t barriers. Inside a pass, it does nothing
  • CW: It even does more and adds barriers inside compute passes, just not in render passes.
  • MM: The design for the types of problems that hit this is …
  • DM: I’m not convinced
  • MM: Right. This isn’t an automatic conversion.
  • CW: Let’s continue that next week.

ISV/IHV Questions Feedback

  • CW: Let’s do this next week too.
  • JG: Just wanted to give people a change to provide feedback.

Agenda for next meeting

  • Sync rules
  • Formal definitions of proposal
  • Gather more feedback (e.g. CW to talk to mrdoob)
  • Discussion about the testing/programming language
Clone this wiki locally