Minutes 2022 06 08

GPU Web 2022-06-08/09 APAC-timed

Chair: Kelsey

Scribe: MM, KN, KG

Location: Google Meet

Tentative agenda

Administrivia
CTS Update
“Tacit Resolution” Queue
Specify that texture_external sampling functions clamp coords to [0, 1] #2766
texture_2d_depth for loading depth values might be unnecessary? #2094
Add WebGPU Adapter Info Registry document #2990
maxColorAttachments not sufficient? #2965
(stretch) Try-catching getMappedRange is useful but shouldn't be #3013
(stretch) [TAG review] Should there be a higher granularity for selecting devices? #2945
Agenda for next meeting

Attendance

Apple
- Dan Glastonbury
- Myles C. Maxfield
Google
- Brandon Jones
- Kai Ninomiya
Intel
- Hao Li
- Jiawei Shao
- Zhaoming Jiang
Michael Shannon

Administrivia

CTS Update

KN: Dan is working on infrastructure.
KN: Working on making the tests take less time to run through stubs.
KN: Thinking about infrastructure issues. Spanning CTS and Chrome. Figuring out how to get tests to run efficiently.
KN: We have a new contributor! They are starting to look at learning how to start. Will pick up some tasks soon.
KG: Glad to hear it.

“Tacit Resolution” Queue

https://github.com/gpuweb/gpuweb/pull/3018

KG: Removes multisampling capability. We require that if they support render attachment, then they …
JS: We need render attachment usage to initialize the multisampled texture. So now we can’t create render textures without multisampled usage.
KG: Makes sense. Let’s take it.

Specify that texture_external sampling functions clamp coords to [0, 1] #2766

KG: The spirit of the request is to make sure that external textures are clamped between 0 and 1 on those resources. We’d do this anyways, I think. There is concern that because what we’re sampling from is a sub-rect of an actual resource (because you might get padding out of a video decoder) there is concern that we might allow operations to exceed the bounds of what they’re supposed toa access here.
MM: One option is that we guarantee that no sampler can ask for ‘repeat’, and must request clamp. However, we can also polyfill this ourself instead. The concern for this is that we don’t necessarily know at sampling time what the size is. There are ways to solve this.
MM: The usecase of tiling a video frame is something that we think that it would be a shame if we lost for webgpu. It would be unfortunate if we always clamp.
KN: I think we should discuss implementation, how we would achieve that.
MM: I think this stuff is possible.
KN: I don’t think we need to worry about comparison samplers, and also anisotropic filtering is conveniently poorly defined, so we’re pretty free there.
MM: I would be ok with aniso being under-specified.
KN: I think everything is implementable.
KG: Filtered+repeat, at the edges? Needs extra samples.
KN: Mirrored-repeat+filter would be fine.
BJ: We know we need to clamp in the shader, then we have to be doing tricks to prevent pulling in samples from outside the rect. That is, if we’re going to need clamping, we’re going to need to handle sampling anyway.
MM: My proposal, is we say in spec “filtering supported”, aspirationally, and respond to implementation feedback if problematic.
KN: I think it’s possible to do the extra samplers, though it’s a performance issue for the fragments around the edge.
KG: I worry that it affects performance more than just around the edge.
MM: I’m imagining a branch where we only do the extra sample for the edge fragments.
KN: Think the blocks of fragments that overlap edges would do 5 samples (SIMT across the two branches, 1 sample + 4 samples)
KG: Should we add this aspirationally? It’s extra work on top of implementing external textures to begin with.
GT: Can ask this offline but I think I don’t understand the problem.
KG: Clamping is table stakes. Seems obvious we should offer that option. The other options I’m not sure about.
…
BJ: Isn’t it a concern that if the clamped value is produced in the vertex shader then samples at the edge of the clamped range will naturally include undefined values outside the clamped range when using normal hardware filtering?
GT: I implemented something like this recently, don’t remember where. You clamp from the center of the pixels and lerp from the edges. So filtering won’t read outside the area you want.
KN: I agree with Kelsey that clamping is more basic. We could offer the rest after V1. I’m pretty convinced that it’s not very easy to do. That said, in order to enable that for the future we’re going to have to change the current API because right now we can’t offer that because we can’t introspect into the sampler object. We’d have to combine the external texture and the sampler into a single API object.
We will think more offline about solutions vs implementation cost would be here, and what changes we may need to make to the API.

texture_2d_depth for loading depth values might be unnecessary? #2094

Recap
- MM: Metal shading language has a depth texture type and according to the rules of Metal it isn’t valid to bind a depth texture to a non depth binding. But in practice, de facto, it works. Games rely on this. Should we intentionally break these rules in WebGPU?
- MM: No matter what, we need a depth texture type (KG: for comparison sampling), so that’s not in question.
- MM: (Describing the three options listed in the issue)
MM: We have no confidence that all future texture functions are going to work on depth textures. WGSL is presumably going to add more texture functions eventually. We wouldn’t be able to validate that they’re not used.
MM: So we think we need to do option 2.
KG: Think users can probably make this work.
KG: Anyone want option 1 or try harder for 3?
KN: just came up with an idea that maybe there’s a middle ground between 2 and 3 that would let us validate that those new texture functions aren’t used. Need to think about this.
KG: Take option 2 with the possibility of expansion as we flesh it out?
KN: sounds good
Resolved

Add WebGPU Adapter Info Registry document #2990

BJ: Recently we merged in the adapter info into the spec, where authors can query some things from the adapter. Still in flux, but hopefully we’ll get some parts of this.
BJ: Because we’re returning strings, we specify some rules about how the strings are constructive. We want to consider having a doc that is prescriptive for how things are formatted. (and also make clear that UAs may choose not to return anything)
BJ: Doc would say, “on such-and-such adapter, here is the string you should expect.
BJ: For chrome, we have come up with a database to map PCI device IDs to architectures. Would like to make this part of a public registry so we can all keep it up to date and use it in our browsers.
- Do we feel like a registry is a good idea?
- Do we want to maintain a public asset that can be ingested by various browsers?
MM: A registry may or may not be harmful, but it definitely does not belong in our registry/spec. This won’t be exhaustive, and will never be exhaustive. Devs may say, “great I’ll switch over all these” and ignore e.g. new devices. The idea that you can know-know what the entire list will be.
MM: Also, it (falsely) assures devs that they will get certain data on certain devices.
MM: If a browser wants to publish their registry, that’s fine. If multiple browsers want to agree to use the same registry, OK. caniuse OK. But not in the gpuweb organization.
BJ: Some parallels here to what we’ve done in the webxr space. E.g., we tell what a user has in terms of controller device name, which is a string. It does suffer from some of what you describe, where it’s incomplete, missing new devices. What we did was have it hosted as part of the org, but not the spec. Understand there are differing opinions on whether that’s different.
KG: The general feeling is similar. What i’m worried about is signing up to officially or not return certain things for certain devices. I don't’ want to give people the idea that they can get certain data from certain devices. I’ve gotten the mount of pushback I expect from Firefox not telling you exactly which GPU you have (using a model GPU - FF gives rough information). One user has (what we think is) a halfway decent reason for a case where we got this wrong and we’re not providing enough info based on teh granularity we’re exposing. In some ways I consider that a success. That said, the structure proposed here doesn’t forbid that, and it sounds like we’re taking seriously the worry about whether we’re saying that things are going to end up return certain things (explicitly or implicity).
KG: Maybe it’s useful for the org, but I think it would nto be part of the org, in a different place that is maintained. I’ve generally felt putting things in the org but not the spec is the wrong optics (as a participant but not a chair). There’s an implication that it’s the standard, but it’s not. It might be a valuable resource to share, bu this org is not the right place to do it.
BJ: That’s decent reasoning. My purpose for wanting ot host the JSON file somewhere outside of Chromium/Dawn is for ease of contribution. Going back to WebXR, we’ve had great success at having the different vendors pre-emptively offer assets to that repo. Adding their own controllers to the registry, to make them show up correctly on people’s new devices. My hope would be we could establish an asset that vendors could pre-emptively be more pro-active about adding things to. I don’t know what the right place is, but it sounds like there’s feedback against it.
KN: I think that resolution makes sense for the PCI device -> architecture mapping. I’m on board with the reasoning. The first half of this is whether we wanted to have a registry of possible values for architecture. Without saying anything about how they’re mapped. I still think this is a good idea. My thought is “if a user agent wants to expose this particular architecture, it should spell it this way” but it doesn’t have to. It’s just to normalize spelling across browsers.
MM: You don’t need it to be in the org for browser to agree on a spelling.
KN: Still think it’s bad to put the spellings of architectures in the org?
MM: Yes, same reasoning.
KG: WebGL has done no such thing, and everything is fine. So I’m not super keen on it either.
BJ: I don’t know if that’s a desired end-state, though
KG: We should address things commensurate with how much of a problem they are. The maximum benefit is telling people roughly what card they have so authors can work around zero-day problems. I don’t need them to be super standardized that way. I can see where you’d continue going in that direction, though, to make things even better. But, this doesn’t …
BJ: **One of my motivations is that this list in Chrome wasn’t easy to gather this data [pci to architecture mapping]. I don’t want other people to go through that. I’d like to share! I thought there was some benefit to having this in a more publicly visible place. **But I understand why wanting to shy away from that. Chrome has a file, so if people want to use it, let’s find a place to reference it.
KG: One of the best places possible to put that would be in notes in the relevant MDN page.
MM: We would have no problem with that.
KN: I think someone created some MDN stubs but I don’t know how long ago.
KG: I will look for it later.
KN: Maybe not.
KG: That seems to accomplish most things that people feel like they need. Right? We have various ideas about how things can be better, but the consensus for what we need right now is satisfied.
KN: I’m convinced the consistency isn’t that important. And eventually they are likely to end up consistent anyway.
BJ: I look forward to editing this MDN page in the future.

maxColorAttachments not sufficient? #2965

KN: Metal feature set tables define an additional limit of the max render target size per pixel (the sum of the sizes for all the attachments)
KN: I found an interesting documentation from Imagination about how this is a limit and if you go over it you will spill to main memory (so you should not do that)
KN: I’m surprised that Vulkan doesn’t have a limit like this (maybe they just spill?)
KN: Metal has this restriction. It is possible to go over this with 8 attachments. So we need a new limit. I started drafting a pull request, and that made me realize I didn’t fully understand the limit in Metal. I don’t know if it’s number of bytes per sample, or per pixel across all samples, and whether it includes downsampling (resolve) targets
KN: Based in Imagination’s documentation, I have a theory, but I’d like to verify the theory.
KN: We’re going to need a limit here
MM: Yes
KG: Yes
KG: Can Apple contribute more information about this limit?
MM: Yes

(stretch) Try-catching getMappedRange is useful but shouldn't be #3013

(stretch) [TAG review] Should there be a higher granularity for selecting devices? #2945

Agenda for next meeting

Try-catching getMappedRange is useful but shouldn't be #3013
[TAG review] Should there be a higher granularity for selecting devices? #2945
New items: *

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2022 06 08

GPU Web 2022-06-08/09 APAC-timed

Tentative agenda

Attendance

Administrivia

CTS Update

“Tacit Resolution” Queue

Specify that texture_external sampling functions clamp coords to [0, 1] #2766

texture_2d_depth for loading depth values might be unnecessary? #2094

Add WebGPU Adapter Info Registry document #2990

maxColorAttachments not sufficient? #2965

(stretch) Try-catching getMappedRange is useful but shouldn't be #3013

(stretch) [TAG review] Should there be a higher granularity for selecting devices? #2945

Agenda for next meeting

Clone this wiki locally