GPU Web 2020 08 17

Chair: Corentin

Scribe: Ken + Austin

Location: Google Meet

Tentative agenda

New test development process, and invitation to contribute (Kai)
Out of range loadOp clear value behavior #965 (Corentin?)
Query API: execution time of the Dispatch command #992 (Kai / Corentin)
Copy Validation of Depth/Stencil textures #997 (Austin)
Case for empty usages #1001 (Dzmitry)
PR burndown
- Add depth texture component type #970
- Add rgb9e5Float texture format #975
- Internal usage flags and rules #994
Agenda for next meeting

Attendance

Apple
- Dean Jackson
- Myles C. Maxfield
Google
- Austin Eng
- Brandon Jones
- Corentin Wallez
- James Darpinian
- Kai Ninomiya
- Ken Russell
Kings Distributed Systems
- Daniel Desjardins
- Dominic Cerisano
- Hamada Gasmallah
- Wes Garland
Microsoft
- Damyan Pepper
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Matijs Toonen
Mehmet Oguz Derin
Timo de Kort

WG Creation Updates

CW: Creation of the WebGPU working group has been approved. Some things still being set up.
CW: Boundaries between the WG and CG should be pretty porous. CG makes spec and WG stamps it as a W3C recommendation. The idea is that the CG does most of the things.
CW: Francois suggests:
- WG reuses the public-gpu@ mailing list for its public mailing list. Think this should be non-controversial.
- internal-gpu@ could be the internal mailing list of both the CG and WG. More controversial. Not sure how this works in W3C, where some members are in CG but not WG. But there are other examples of this.
- CG and WG meetings could be the same. We’d call out when it’s a WG section that talks about recommendation track, or other WG-specific things.
- We’d probably get pushback from the W3C in order to make some things more explicit.
DJ: makes sense to me. Don’t see why we can’t have joint meetings. “Switching into WG mode”. Not sure it matters - our WG operates in the public, so we allow visitors, like we do now for the CG (they just join the CG). Think from an IP perspective it should all be OK. Obviously we wouldn’t allow random people from the street to contribute new graphics algorithms.
CW: what if some IP discussion comes up? Switch to full WG mode w/only those participants?
DJ: don’t think it matters. CG/WG difference: in CG you’re only on the hook for IP you’ve contributed yourself. In WG, if you’re a member, you’re on the hook for all IP in the spec that we publish. If you’re in the group / W3C you already have that responsibility. If you’re in the CG you don’t. It’s up to you to decide if you should leave the room. Like, you don’t want your IP to be granted. Which we shouldn’t encourage anyway. As long as we’re clear about which state we’re operating in at all times, should be fine.
DC: the internal mailing list requires a W3C nomination process. Filtering out certain groups.
KN: the nomination process is only for people joining.
MM: we probably should continue having the internal list as an internal list. Having it be public would defeat the purpose.
CW: it’s an internal list that anyone’s free to join. If you’re part of a company then you have to get your W3C rep to nominate you. If you’re a random person you can join in 5 minutes. It’s not searchable on any search engine, but anyone can see it.

New test development process, and invitation to contribute (Kai)

KN: Thanks to all CTS contributors (recently most notably Intel) and input from DM, was able to create a process for test development that will help us make sure we robustly test the API’s behavior, including new spec clarifications and additions.
- For reference, the link to the test plan (on hackmd) and process can be found in the CTS readme.
- Each item is tracked in exactly one place.
- Don’t want to forget to test anything.
KN: With the process in place, we’d like to explicitly invite other folks to start contributing to the CTS.
DC: we have the webgpu-napi port, focused mainly on compute.
CW: these kinds of workload tests are useful to know that e.g. drivers support at least this many registers, these kinds of workloads.
- KN: the test plan workflow is more for testing spec features. Those kinds of one-off tests probably don’t need to go through this process. Things that make sure an entire workflow works are good, and useful.
DC: prob need to discuss at some point, we have a native implementation of WebGPU on top of NAPI and Dawn, probably WGPU version too, but how exactly do we include the CTS for the build? Right now CTS is a submodule.
- CW: are you asking how a project includes the CTS? Seems very project-specific.
- KN: it is project-specific. Something we’re dealing with in the browsers too. Question about how we use a separate repo for the tests and upload a build into the wpt repo. Issue with that: browsers auto-push and pull changes into wpt tests. Because the API’s not stable, wasn’t possible for browsers to be pinned to specific version of the test.
- KN: recently, I removed the build from wpt. We’ll add it back once it’s more stable and once impls can all use same version of the test suite. Till then, each impl’s expected to pick a revision, build it, and use it for their testing. And keep it up to date as the tests evolve. Until all browsers / NAPI / etc. can pick the same revision, have to do this. Process should be the same for browsers and NAPI for now.

Out of range loadOp clear value behavior #965 (Corentin?)

CW: loadOp clear is doubles right now so it can be ints or doubles. Jiawei found when passing a negative float. Some APIs clamp to 0, some give garbage. WebGPU should decide on one behavior. Is it a validation error? is it clamped/rounded?
MM: There’s another option: don’t use doubles. Use different data types.
KN: We might end up having cases where a component format in the spec doesn’t match some JS format. Right now, for example, we have normalized formats which are fixed-point numbers between 0-1. We still need some rule for that.
MM: JS doesn’t have distinction between ints and doubles. They’re just numbers. If we have different types then we will have narrowing rules.
KN: I meant to say that WebIDL doesn’t provide types like that. e.g. 0..1 floating-point numbers. We use [EnforceRange] in some places, like indices.
CW: Doesn’t Metal use doubles for the clear color?
KN: Yes. It’s not possible for us to describe the conversions at an IDL level. We have to put the numbers into Metal as doubles, but also the texture formats are more complicated than uint16, etc.
KR: Seems reasonable to clamp, and probably the easiest thing. We’ve done this in WebGL.
WG: I think that’s why Clamped exists in JS typed arrays as well
DM: We know that render targets expect a certain range based on the format. Why don’t we just throw an exception? Why do we even consider clamping?
CW: You mean validation error?
DM: Yes.
CW: That sounds fine. As I understand, on the JS side we can’t know between float and int?
WG: you can sort of know these things from JS. Some experience from doing SpiderMonkey FFI. Can know whether numbers are ints, IEEE 754 doubles, etc. W.r.t integers, they’re all subsets of the FP type (double) anyway. Impl in SpiderMonkey limited them to 2^31, when they flipped over to using floats internally. If you can look at a binding that works with 64-bit floats correctly, all impls can get them down into what WebGPU needs.
MM: I think there’s some benefit from being consistent w.r.t the rest of the web platform. If an API requires an int and you pass it a negative double, does anyone know what Web IDL’s behavior is?
KN: Web IDL defines it. I think it rounds it and does modulo 2^32 or similar.
MM: so if a field required unsigned int and I passed -1.5, it’d turn into very large number?
KN: think so.
MM: why would out of bounds - saying number’s too negative - be different than too positive?
KN: we’re talking about using negative numbers with uint formats here.
...discussion about behavior of large unsigned ints...
KN: Whatever rules we want to have, we have to apply them before we pass things to the driver. We could probably reproduce the rules WebIDL uses by default if not using EnforceRange for integer conversions. But WebIDL does not have rules for conversions to signed or unsigned normalized formats.
MM: could we try to generalize the rules that already exist? Consistency with the rest of the web platform’s more important than consistency with the native APIs in this case. Would be great if we could generalize the web’s behavior.
KN: Hard to generalize something that doesn’t apply to fixed-point. But we can do somewhat similar rules. Agree with JG that prior art is WebGL.
JG: WebGL’s clearColor is prior art.
MM: saying number has to be between 0..1 is conceptually similar to having it between 0 and 4 billion.
CW: In fixed point where you have 256 values between 0..1, maybe some API doesn’t round and it does ceil or floor. We want them to be consistent.
MM: we talking out of bounds or rounding?
KN: were’ talking about mapping the double spaced into arbitrarily sized float, int, and normalized formats. I see your point about making it similar to integer conversions. We could define it - do this clamping, divide by this number, for the normalized formats.
JG: do we have a concrete proposal?
MM: non-concrete one - generalize the rules already present in Web IDL.
KN: do we want to use the default non-[EnforceRange] rules for conversion?
MM: no strong opinion
KN: I think we should use those.
CW: we should probably discuss more on the issue, esp. if we’re trying to generalize the Web IDL rules. Maybe reach conclusion offline.
MM: should I try to come up with what that looks like?
KR: I think we should find some specific spec conversion. No other web spec is dealing with the types of numbers here, except for WebGL.
KN: yes. usually don’t think about mapping a double to RGB10A2.
JG: In WebGL, it’s up to the implementation of whether you get 0x7F or 0x80 for 0.5.
KR: Right there’s a tolerance of 1 in the conformance tests, but the wraparound is well defined and well tested.
MM: Not saying we should mirror WebIDL. I’m saying for the cases that do match we should match WebIDL, and for others we can do something similar (match in spirit).
MM: I’ll take the AI to make a concrete proposal.

Copy Validation of Depth/Stencil textures #997 (Austin)

Partial-region validation

AE: D3D12 disallows copies of partial regions of depth/stencil subresources (depth32float, depth24plus, depth24plus-stencil8, stencil8, and friends). WebGPU currently only has this restriction on T2T copies. We need to add it for B2T, T2B, and WriteTexture.
DM: we don’t have buffer to depth texture copies.
AE: we do have it to stencil.
CW: and we have depth32float to buffer.
AE: and stencil to buffer.
DM: sounds reasonable. Don’t understand point about programmable sample positions.
AE: 1) restriction only applies to textures used as output attachments in D3D12. In WebGPU, we could also have that restriction. Copy partial regions of depth/stencil textures you never render to. Not sure why you’d want to, but it works. Could make this part of the spec more accurate, but would be more API complexity.
AE: 2) with programmable sample positions feature level in D3D12, D3D12 team said that feature lifts this restriction. Docs on this page doesn’t say that though. Docs talk about copies the feature does allow. Partial region copies allowed as long as sample regions match. CW’s hypothesis was that previously, they always matched. A little unclear, though on older versions w/o this feature, validation layers do warn that copies are disallowed.
DM: they warn, or error? We already have a few warnings happening because of the way the API’s shaped. So we should consider allowing some warnings like this one.
CW: think what we need here is strong confirmation from D3D team that if you don’t have programmable sample shading, whether you can do partial copies of these resources. Think it’s not super clear.
RC: I can try to clarify that with the D3D team if needed.
AE: would appreciate that.
RC: thought that for these “plus” formats, didn’t we agree we weren’t going to allow buffer-to-texture copies for them?
CW: yes, depth24plus can’t be copied, but stencil8 part can.
RC: ok.
AE: also we allow depth24plus to depth24plus.
CW: and depth32float to buffer.
AE: if this partial thing is a rule, then you are allowed to do T2T copy of depth24plus, but must copy entire sub-resource.
CW: Austin please work with Rafael to ask D3D team whether we can do this or not.
AE: sure.
DM: some feedback on Austin’s first point. Suggest to not place restrictions based on usage. Our impl may need to add extra usages behind the scenes. Introducing b2t copies (emulated using render) for example.
AE: I’m fine with this. Doesn’t seem useful anyway.
CW: consensus: AE and RC to ask D3D12 oracle. We don’t worry about allowing it based on usage.

Potential future extension

These partial copy region restrictions are lifted if sample positions match and D3D12 has the Programmable Sample Positions feature. In the future we can consider an extension that allows partial copies.

Case for empty usages #1001 (Dzmitry)

DM: usage flags, if you don’t have any on a buffer, Vk complains. Can work around it. 1) not useful to do this. 2) workaround would be arbitrary. Like handling 0-sized buffers, where you bump the size. Here it’s not useful.
DM: usually, think there might be some programmable scenarios you’d want to make still work. If we’re talking about usage specifically, more functional than data bit. Users should still know what they’re doing with usage.
MM: in WebGPU today, if you make texture with empty usage, what function calls could you pass it to and have that call not fail?
JG: mapBuffer?
MM: need usage flags.
JG: then createBufferMapped.
KN: you can destroy it, create a view from the texture, and start with it mapped, but can’t do anything with it afterward.
MM: such a WebGPU texture doesn’t need a Vk texture underneath it then, since you can’t do anything with it
DM: it’s just another level of indirection. Had similar discussion with BindGroups. Make them adhere to pipeline layouts, since creating would be useless. We agreed to align these to avoid creating useless objects.
MM: I thought we allowed 0-sized buffers to be created?
JG: it is similar. Is it useful to have as a feature for the API though?
KN: I don’t think it matters that much. Could always allow it later without breaking anything.
MM: why is this different than creating empty buffers? If we disallow this kind of thing for textures then we should do it for both.
CW: difference between usages and sizes: size you can easily imagine an app that does a lot of computation and comes up with a zero-sized buffer. Just a case that might happen. Don’t want app to fail at submit time when this happens. An app should mostly know beforehand what a given buffer’s used for. Less likely a case where app would programmatically select usages and just fail to use the buffer.
MM: ok, think I’m convinced by that argument.
CW: resolution: disallow empty usages on resource creation.

Query API: execution time of the Dispatch command #992 (Kai / Corentin)

skipped

PR burndown

Add depth texture component type #970
- CW: There’s comparison-samplers and samplers. If you use a comparison-sampler with a texture that is not a depth texture, there are undefined results depending on the API. Ill-specified everywhere. DM’s suggestion is to add another texture component type which is depth-comparison which means it may only be used by comparison-samplers in the shader and you can only put a depth texture in that position in the bind group.
- KR: If I remember correctly, this came up pretty late in WebGL, and James found the issue and had to add more draw-call validation. In WebGPU think we should be strict and validate this early.
- All: sounds good.
Add rgb9e5Float texture format #975
- CW: think we agreed to call this rgb9e6UFloat because these floats don’t have a sign bit.
- CW: Also rename rg11b10float to rg11b10ufloat
- DM: I like that a lot.
- JG: agree it makes sense.
- CW: we have an object called BindGroupLayoutEntry so we’re no stranger to long confusing names.
- MM: is this the first time a web API has used the term “ufloat”?
- KN: think this comes from compressed textures. WebGL has these types but didn’t call them ufloat.
- JG: I’m a bit on the fence because float/ufloat aren’t different types like norm/snorm. They’re just floats that have no representation below 0. Useful to surface that they’re unsigned. We’d forgotten that RGB10/11 is unsigned.
Internal usage flags and rules #994
- Fairly big. Discuss next time. Nice PR where DM tried to formalize our usage validation rules more precisely. In particular, nice discussion to be had about what’s allowed / disallowed. For example, disallowing concurrent usage of read-only and write-only storage textures. Should be part of the main agenda next time.

Agenda for next meeting

Internal usage flags and rules #994
Progress on the CTS? :D (should we make this a regular update?)
CW: I plan on making a proposal for multi-queue in September.
- DC: put this on the agenda?
- CW: no, I’ll make a proposal on Github to be read first.
- CW: multi-queue doesn’t help with multi-GPU because you still need to expose things like cross-GPU texture copies.
CW: Kai, you had a rework of the device initialization? Could we talk about that next week?
- KN: probably not. Fortunately this issue is not blocking anything else, but do need to get to it.
DM: multisampling. Adding different bind group layout type.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GPU Web 2020 08 17

Tentative agenda

Attendance

WG Creation Updates

New test development process, and invitation to contribute (Kai)

Out of range loadOp clear value behavior #965 (Corentin?)

Copy Validation of Depth/Stencil textures #997 (Austin)

Case for empty usages #1001 (Dzmitry)

Query API: execution time of the Dispatch command #992 (Kai / Corentin)

PR burndown

Agenda for next meeting

Clone this wiki locally