Skip to content

GPU Web 2020 04 13

François Daoust edited this page Dec 1, 2020 · 1 revision

Chair: Corentin

Scribe: Ken / Austin

Location: Google Meet

Tentative agenda

  • Data upload #605, #649, #650** **and #680
  • Add pipeline shader module error enumeration. #646
  • Zero-sized buffers #546
  • Binding to zero-sized range of a buffer #686
  • Copying of depth24plus formats #652
  • Disallow storage-buffers/textures in fragment shaders in core #639
  • Pipeline statistics query #691
  • Agenda for next meeting

Attendance

  • Apple
    • Justin Fan
    • Myles C. Maxfield
  • Google
    • Austin Eng
    • Brandon Jones
    • Corentin Wallez
    • Kai Ninomiya
    • Ken Russell
  • Intel
    • Yunchao He
  • Microsoft
    • Damyan Pepper
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Matijs Toonen
  • Mehmet Oguz Derin
  • Timo de Kort

Administrivia

  • CW: Please get your legal to rereview the draft charter for the WG.
    • JG: my understanding: multi-stage process where there’s formal submission for approval, but before that there’s all-but-official pre-flighting pass. If we’re at the pre-flighting pass, we need to do things, but before that we don’t.
    • CW: I can ask about it, but I’m pretty sure if everyone says it looks good, then we’ll do the formal submission.
  • CW: Next F2F in Phoenix beginning of October (5-9)
    • Would be around the time we'd been talking about going to Toronto
    • MM: fine with me - TPAC is last week of October
    • CW: we'll need to find a F2F spot for ~3 months after that. Not sure about Toronto in February.
  • CW: W3C asked about chairs for the WG
    • Typically the same chairs between the CG and WG
    • Opinions?
    • MM: having same chairs seems reasonable
    • JG: absent any reason to deviate, like someone with experience doing W3C stuff

Data upload #605, #649, #650** **and #680

  • CW: concerns I've heard:
    • Want range for mapReadAsync.
    • JG wants two map calls to be made the same again.
    • MAP_READ and MAP_WRITE allowed.
    • JG: that's something we could do later. Could change restrictions, but if we wanted to make that change, it'd be weird if the API was only read OR write.
  • CW: most people were happy-ish with #605?
    • Where the two calls become mapAsync with offset / size?
  • MM: WriteButter is also being included in this proposal?
  • CW: I say yes
  • JG: I think the answer's no. Different discussion to me.
  • MM: doesn't have to be writeBuffer, but some way...
  • JG: how much does it change your opinion to have this proposal if we might not have writeBuffer? How dependent are those two things?
  • MM: goal from our side: should be an API well-behaved by default, where you don't have to do memory management to upload and download data. Your data should be able to be transferred without you having to do a lot of bookkeeping.
  • JG: if we didn't have writeBuffer, would that change your opinion on this proposal?
  • MM: yes.
  • JG: is there a shape for this proposal that satisfies your concerns that isn't writeBuffer?
  • MM: not off the top of my head. Will think about it.
  • JG: there's some discussion offline about getMappedRange when Buffer's not in the Mapped state. API looks like it's infallible, but has to either return null, or throw.
  • CW: there's still the state machine. Can't guarantee you can't get the ArrayBuffer if the thing isn't mapped. So, throw or null. Up for discussion.
  • MM: core tenet: can't get mapped range w/o mapped buffer, and can't map buffer for the current frame. So, can't have a simple modification of this API that allows for the use case of someone wanting to upload something for the current frame and not having a queue of these lying around.
  • JG: the concern is that you'd need a polyfill.
  • CW: #605 does have mapAtCreation which allows you to shim this. But the efficiency won't be great. That's why I think we should have a writeBuffer at the same time. Technically possible to shim. mapAtCreation lets you get upload space synchronously.
  • MM: doesn't this have the same problem as createBufferMapped, where you have to destroy the buffer to not have unbounded mem growth?
  • JG: true, but true of a lot of patterns on the web.
  • DM: not true for writeToBuffer.
  • JG: true, but you have different tradeoffs. Buffer / data management will be our drawArrays- you can't just begin(triangles) any more. It's part of the complexity but also why the API's more efficient.
  • MM: think we can have an efficient API while still having a simple one (?)
  • JG: you can't have writeBuffer more efficient than what you can build with mapBufferAsync.
  • CW: true in terms of copies, maybe not in terms of garbage.
  • CW: Having an efficient staging allocator in terms of map async requires a bit of work and hooks into the app -- before / after submit, etc.
  • JG: this is polyfillable.
  • MM: point is: want API that is well-behaved by default, even without a polyfill.
  • JG: I see using writeBuffer to the exclusion of anything else as a perf and reliability footgun. Tradeoffs in implementing writeBuffer are better off done exposing the mapping primitives.
  • DM: what footgun?
  • JG: extra copies, synchronization, data still being live in-flight. Different fast paths you might want writeBuffer to have. Not one best fast path for all writeBuffer usages. If you give people this primitive, they'll use it for small uploads, big uploads, etc., and there isn't one impl that handles all of those cases without heuristics.
  • CW: How about the browser can emit warnings if you upload a ton of data every frame. Second, the API should allow for a mostly-happy path for very common and easy operations so people don’t get it wrong. There’s a lot of value in having an option that works, and is not bad. It might not be the greatest, but you know it will work and you can’t get it wrong when you get started.
  • JG: If you’re just learning, you won’t be writing from scratch.
  • MM: I don't think that's necessarily true.
  • JG: can you point me to a tutorial that onboards people to graphics without using a common library?
  • MM: There’s plenty of tutorials for learning Metal.
  • JG: how did they deal with matrices?
  • MM: Metal is only available on systems that also include a matrix multiplication routine. SIMD have a builtin library and GL Kit which does stuff like perspective transforms.
  • JG: that was my question. If you don't have those things, there will be library / code you're importing that you don't necessarily understand. Will have primitives / constructs that aren't in the core API. Will be bunch of impls of common workloads, upload chains, etc. The desire to take a 10-line polyfill and make it a core part of the language is moving things in the wrong direction from a discoverable design standpoint. Now you have a "writeBuffer" and you don't know how it's implemented / what it does. Mapping primitives are clearer.
  • MM: you are right, having understandability is valuable. API with good defaults is also valuable. A compromise would be good. Both APIs are a compromise. If writeBuffer were the only primitive, I'd be more sympathetic to your argument.
  • JG: I worry the area we’re compromising in is not fully understood by all parties. I feel some are going to hope it can stay slow and simple, and some think it’s a path that can be made fast with client-side tracking, etc. I feel the gaps in that understanding will make us choose a path that has flaws.
  • CW: what you're saying is that we can't see the future. It's true. There are certain ways to evaluate these opinions. We have little experience with WebGPU API so far.
  • JG: we have experience here - WebGL, OpenGL. That's what's advising my stance as a software / design professional. Think it will lead people into problems. That's why I'm advising against this. But it's just advice.
  • KR: I thought we were close to agreeing that if writeToBuffer had hard limits on the upload size, but we would strongly encourage other primitives for larger uploads, would we be on the same page?
  • DM: We didn’t want those restrictions. If you already have an ArrayBuffer of data, then writeToBuffer is as-efficient, if not more efficient, than mapping.
  • JG: not sure I necessarily agree with that.
  • CW: the number of copies stays the same, and the tracking's simpler if you already have your data in an ArrayBuffer which is definitely the case for WASM.
  • CW: would it make you more at ease if we added a warning if you tried to writeBuffer too much data?
  • JG: think it's easier to write a setSubData that works well for large data than small data. writeBuffer for uniform matrices, impl is tricky because you have to coalesce writes, passing through intermediary staging buffer, and that's what I want to avoid. Small amount of data was to surface embedding it in the command buffer. Then it's clear from a design standpoint where the data lives. with writeToBuffer, there’s heuristics in coalescing writes and then scattering them to correct locations.
  • CW: not sure I see where the complexity is.
  • JG: writeBuffer, create shmem of that size, in GPU process you map the resource and copy from shmem into destination. That's actually not what people want. People want writeBuffer to be good enough for general case, and I don't think that's good enough to work with a moderately complex scene with thousands of objects. Uploading thousands of matrices per frame, would not work well.
  • DM: so you'd coalesce those into a bigger submission?
  • JG: probably.
  • DM: conceptually same thing as going through the staging buffer.
  • JG: I think people should write the staging buffer themselves and understand the problems with it, rather than relying on the driver to do the good thing. And there would be pressure to optimize for this path. If we include this in the API I want us to do this expecting to have to implement it well, and that's non-trivial complexity.
  • CW: in Chromium we're looking at doing the shim, but in the browser. Allocate chunks of shmem, say this shmem range goes into that buffer. That's it.
  • JG: at this point I've given my advice, please take it into account. I'm still a weak "no" on writeToBuffer.
  • CW: I'd like to make progress on #605 and let's have more discussions on writeToBuffer or alternatives.
  • JG: could you coalesce #605 into PR with the final IDL?
  • CW; I will. Can we make this return null if the buffer/s not int he mapped state?
  • MM: yes.
  • CW: what about single mapAsync, but buffer can only be MAP_WRITE or MAP_READ?
  • MM: fine with me.
  • CW: I'll write IDL and bikeshed.
  • JG: thanks.
  • KN: nit: it's a programming error, should be error to fetch buffer in unmapped state?
  • DM: usages are constrained? How will you know what's written?
  • CW: good question, let's discuss more on the PR.
  • CW: how do we make progress on writeToBuffer?
  • DM: Jeff raised a good point. We do need to understand better the heuristics the browser needs to do to handle multiple small writes. Hidden surprises for users.
  • CW: our impl is naive right now. Embed in command stream and it's slow. Would it help if we wrote a design doc of what we plan the final impl to be?
  • DM: would appreciate.
  • MM: I'd appreciate that too.
  • CW: would appreciate Austin's help on that. I'll help.

Add pipeline shader module error enumeration. #646

  • JG: made some changes based on feedback. Message Text now. On ShaderModule rather than pipeline. There's some language about source maps. PR #645 has source maps. This probably depends on that.
  • RC: don't we eventually need this on both the ShaderModule and the pipeline?
  • JG: good question. There will likely be linking errors. Those would be at pipeline creation time. Would not have the benefit of this structured output. But tricky to have this structured output anyway. Link error: two sources you're referring to, or maybe two things in the same source. Linking errors will be less common, perhaps? I think they need less of this error reporting structure.
  • RC: linking errors you're thinking would go through normal push/pop error scope?
  • JG: yes.
  • CW: in C++ IDEs, you can see compilation errors right as you code. Linking errors are more subtle, depend on overall source, only see them when you put everything together.
  • RC: OK. Will a shader module with compilation messages, it'll be an error to use it in a rendering context?
  • JG: we do have info messages. For errors we do assume that we'll have a compilation message of type error. Only if that's the case will it be an error to use.
  • MM: It might be valuable to provide a convenience boolean that tells you if you can use the ShaderModule or not. So turn it into status + sequence<message>.
  • JG: if we do that I'd actually make a new interface for compilation results, that would have these things on it.
  • MM: makes sense.
  • CW: until we agree on the source maps, can we have it separated for now?
  • JG: could I get a straw poll on the source maps proposal, could we just get that in?
  • CW: one concern about it.
  • JG: OK, just file it, thanks.
  • JG: since there are concerns I can remove the source maps.
  • CW: not a concern, just could be a discussion.
  • RC: we're sure we want this as a separate thing from push/popErrorScope?
  • JG: this type of error benefits from structure. Instead of using the other path and errors in different formats, could have more structure to getting them. Just text messages coming out of pushErrorScopes.
  • RC: the other error scopes do have structure. Out of memory, no string, and validation error, which does have a string. I can see where it's not symmetric.
  • JG: Other reason is that it lends itself better to being done in parallel. With push/pop there’s less of that, maybe.
  • CW: maybe. Maybe compilation result, if we have reflection data, would be an interesting place to put it.

Zero-sized buffers #546

  • CW: the question comes up because we have crashes. Not clear what we do.
  • DM: backend would have to treat effective buffer size as larger than requested buffer size, e.g. because of alignment Vulkan requires. Effective size > 0 even though user requested 0 size. Don't think it's a big deal for us to support it.
  • DM: binding zero-sized range of buffer is a separate issue.
  • CW: no concern here. And robust access rules / validation should ensure we never touch that data. Or, it's a bunch of zeros. Can't write it, and if you read from it, the robust access rules will either make you not read it, or read zeros.
  • JG: seems to me something that doesn't need to be in MVP, but probably easy to do. Seems pretty easy, slight complication though. We could always add this later.
  • MM: think it matters whether it's an exception or just returns null. If we return null and you try to map it, that'll be an exception, so maybe it's no difference?
  • JG: you can map it but just can't request a single byte of it.
  • MM: I thought the two options were, upon buffer creation, throw an exception, or return an object that's not mapped by a platform buffer.
  • JG: I'd think it'd be a buffer, but in the error state.
  • KN: yes, question is whether it's a buffer in the error state, or a buffer you just can't read from.
  • MM: what's the difference?
  • JG: if you had a writeBuffer polyfill and naively requested length of ArrayBuffer, created buffer from that length, and enqueued series of copies, it'd be either a bunch of no-ops or a buffer upload failure. You can't do anything with zero-length buffers.
  • MM: someone will have to have an if-statement somewhere. In browser or application source code? Sounds like there's benefit to having it in-browser.
  • CW: in the browser, it’s only at buffer creation time, and the rest of the validation in the API which uses the size provided at creation will just work as expected.
  • JG: in GL I think this is valid. zero-sized buffers, textures, enqueue copies out of them. Strange use case, but helps with not having to do "if (total_size > 0)". We can give it a try and remove it if it becomes complicated.
  • KN: I think we've had two users run into this.
  • MM: because they're not allowed in Chrome?
  • CW: because they run into crashes in D3D12 because we don't have the if-statement.
  • DM: it's just an addition to the logic we all already have to compute the effective allocated size.

Binding to zero-sized range of a buffer #686

  • DM: it's an investigation. There are things that may go wrong, like asking for size of buffer in the shader.
  • CW: let's come back to it next week.
  • RC: thought it'd be a validation error if you used the zero-sized buffer? When will we run into these in the shader?
  • JG: One of the options is to not make it an error, it’s only an error when you try to bind or read/write more than 0 bytes.
  • CW: it's OK to move around null or garbage pointers in C++ as long as you don't dereference them. Same reasoning here.
  • RC: what happens when you try to read the buffer in a shader?
  • MM: same OOB access handling
  • DM: same problem as zero-sized binding. Whether we allow zero-sized buffers doesn't solve the zero-sized binding question. The use case I had in mind: you have N things the shader might want to access, they're in a storage buffer, and shader indexes dynamically. It works, when the shader doesn't access it, because you have 0 things there.
  • KN: I'd like this to work.
  • RC: so it won't be validation failure, and robust_buffer_access will handle it.
  • CW: zero-sized buffers are the same as other buffers. Question is, are zero-sized bindings allowed? If not, zero-sized buffers aren't useful.
  • RC: so when you try to bind it with the API it won't let you?
  • CW: depends on resolution of #686.

Disallow storage-buffers/textures in fragment shaders in core #639

  • CW: Jiawei from Intel found that in metal, ... requires fairly recent support. Mostly asking Apple. Do we want to target hardware older than this? unsupported up through A8, iOS 9, and macOS10.11
  • KN: also Adreno 400 series.
  • MM: I don't think we can discuss this in 0 minutes.
  • CW: let's put this on our radar.
  • KN: I remember you had a target platform in mind but I couldn't find it. Possible to write it down?
  • MM: not sure I'm allowed to.
  • KN: ok.
  • CW: let's work on this issue offline.

Agenda for next meeting

  • Zero sized bindings
  • Pipeline statics queries
  • depth24-plus copies
  • Will we have writeToBuffer stuff? Might take a bit longer.
  • PR burndown
  • WGSL side: have used ForMeeting tag for issues. Great idea. If by the homework deadline you want to volunteer an issue, please add that tag.
    • DM: it's mirroring some things in the project lifecycle.
    • CW: happy to change it. Discuss in spec editors' meeting.
Clone this wiki locally