GPU Web 2020 02 12 Redmond F2F Day 1
So as to not be anonymous animals, the doc is shared for writing with the google accounts that are invited to the meeting. If you can’t edit let email@example.com know. Also be sure to be in “Edit” mode and not “Suggest” mode.
Scribe: Austin (inb4 fingers on fire)
Location: Building #22, Microsoft, 3050 152nd Ave NE, Redmond, WA 98052 + Google Meet
- Progress update
- TAG review
- WG creation
- Next F2Fs
- Shading language
Other topics with no timing constraints:
Keeping data on chip #435 (Myles)
(done) Feature levels (including what was formerly webgpu compat)
Timer query and other queries (occlusion query, pipeline statistics query)
- ~~[#548](https://github.com/gpuweb/gpuweb/pull/548) Switch dynamicOffsetsDataLength to 32 instead of 64 (Done)~~
- [#543](https://github.com/gpuweb/gpuweb/pull/543) ~~(get bind group layout) (not ready)~~
- [#522](https://github.com/gpuweb/gpuweb/pull/522) gpu.onadapteradded
- ~~[#520](https://github.com/gpuweb/gpuweb/pull/520) separate 3d texture depth to 2d texture array layer (Done)~~
- ~~[#517](https://github.com/gpuweb/gpuweb/pull/517) (formerly known as imageHeight) (Done)~~
Agenda for next meeting
WIP, it is the list of all the people invited to the meeting. In bold the people that have been seen in the meeting:
- Qian Qian Yuxuan
- Dean Jackson
- Jason Aftosmis
- JF Bastien
- Justin Fan
- Myles C. Maxfield
- Robin Morisset
- Theresa O'Connor
- Thomas Denney
- Austin Eng
- Corentin Wallez
- Dan Sinclair
- David Neto
- Idan Raiter
- James Darpinian
- John Kessenich
- Kai Ninomiya
- Ken Russell
- Shrek Shao
- Ryan Harrisson
- Miller Hooks
- Brandon Jones
- Bryan Bernhart
- Yunchao He
- Chas Boyd
- Damyan Pepper
- Rafael Cintron
- Michael Dougherty
- Dzmitry Malyshau
- Jeff Gilbert
- Kirill Dmitrenko
- Elviss Strazdiņš
- Joshua Groves
- Markus Siglreithmaier
- Mateusz Kielan
- Mehmet Oguz Derin
- Samuel Williams
- Timo de Kort
- Tyler Larson
CW: Still haven't made much progress on Spec / CTS. We need to have a test plan and a more concrete idea of how to get there. CTS is mostly a problem of engagement so we need to have guidelines to do it.
DM: For spec, people are assigned to tasks, but mostly busy. If companies had more resources to contribute to spec, editors could help shape those contributions more. We’d appreciate it if there were more involvement in the spec.
MM: It’s common for people to submit tests to WPT directly. Can we do this?
KN: It’s a project that I think I know how to do, but haven’t done yet. It’s not how it works now; every file is autogenerated from typescript sources. Have to make changes on Github and then a roll script is used to roll it into the browser. Not amazing right now, but it does work.
MM: In a world where the typescript is checked in, would there be a build step?
KN: There would probably be a short build step to generate the list of test files, but it should be possible to get rid of the rest of the build step.
MM: Typescript -> JS in the browser at runtime?
KN: CTS update: We need better validation coverage. We have some, but we don’t have every entrypoint or exhaustive testing of all options of dictionaries.
We have almost no functional testing beyond that, like drawing a triangle. We have a few barebones tests that run compute or do copies, etc. We have nothing right now to test IDL rules (will address this later in the meeting). I think what’s important now is validation tests, so that we’ll have consistent implementations across browsers.
DM: With small validation rules and entrypoints, do you think there is a way to automate / autogenerate the tests?
KN: I think it’s to some extent possible where the rules are encoded some way. Integer types should throw an exception if outside of [EnforceRange]. For the most part, those will be fairly easy. Those tests will be short and not more complicated than the rules themselves.
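A minimal sketch of the kind of generated check KN describes, written as a Python model rather than as real CTS code (the CTS itself is TypeScript). It assumes the WebIDL behaviour for `[EnforceRange]`: non-finite values throw, the value is truncated toward zero, and anything outside the type's range throws. The helper names `enforce_range`, `boundary_cases` and `throws` are hypothetical.

```python
import math

# Inclusive bounds for a few WebIDL integer types.
INTEGER_BOUNDS = {
    "octet": (0, 2**8 - 1),
    "unsigned short": (0, 2**16 - 1),
    "unsigned long": (0, 2**32 - 1),
    "long": (-(2**31), 2**31 - 1),
}


def enforce_range(value, idl_type):
    """Model the [EnforceRange] conversion of a number to an integer type.

    Python's TypeError stands in for the JavaScript TypeError.
    """
    lower, upper = INTEGER_BOUNDS[idl_type]
    if math.isnan(value) or math.isinf(value):
        raise TypeError(f"{value!r} is not a finite number")
    truncated = math.trunc(value)  # round toward zero
    if truncated < lower or truncated > upper:
        raise TypeError(f"{value!r} is outside [{lower}, {upper}] for {idl_type}")
    return truncated


def boundary_cases(idl_type):
    """Generate (value, should_throw) pairs around the limits of the type."""
    lower, upper = INTEGER_BOUNDS[idl_type]
    return [
        (lower - 1, True),
        (lower, False),
        (upper, False),
        (upper + 1, True),
        (float("nan"), True),
        (float("inf"), True),
    ]


def throws(value, idl_type):
    """Return True if converting the value raises TypeError."""
    try:
        enforce_range(value, idl_type)
        return False
    except TypeError:
        return True
```

The generated cases are no more complicated than the rule itself, which is the point made above: a test generator only needs the type name of each integer member to produce them.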
JF: Not much has changed since the last F2F in September. Most effort has been keeping up with spec updates.
MM: We’ve been doing a bunch of shading language investigations - address later today.
MM: For work on Windows, are you invoking FXC?
CW: Yes, through SPIRV-Cross which uses FXC. Been talking with Microsoft about producing DXIL directly.
DM: With so many demos running on WebGPU, any feedback on it?
CW: People are confused about bind groups.
DM: What about performance?
CW: A few measurements. I think Earth is about just as good as WebGL, but we don’t have detailed feedback yet.
DM: My understanding was that BGFX recently got a Vulkan backend, so I’m surprised they would be confused about bind groups.
CW: This contributor is not a maintainer - just doing it for fun.
DJ: Google is one of the places writing the most Vulkan code. What are the chances that Google internal stuff will start using webgpu.h instead of Vulkan? Is that a goal of webgpu.h?
CW: There’s Skia which has a Vulkan backend, and we also have a prototype Dawn backend. In a way, Google has enough engineering and enough vested interest in Vulkan that we're looking to target Vulkan directly. WebGPU gives us Metal and Web and D3D12 at the same time. Depending on the amount of engineering, you can use Vulkan directly and WebGPU as well. If you don’t have a lot, then just Vulkan. For Earth, they want to target the web, so you have to use WebGPU anyway.
DJ: Do you think that Earth would just use webgpu.h and use that to ship native apps as well? If it performs well enough.
CW: Can’t really comment. We support use cases like this.
MM: On fingerprinting: Yesterday we heard from David about the performance improvements on WebGPU. What do you think makes this particular workload slower than WebGL? His workloads were faster.
YH: Right now it’s quite close. One thing: in WebGPU we use buffers. Don’t have storage textures yet. WebGL uses the fragment shader.
CW: From what I remember, copyImageBitmapToTexture has a fast path in WebGL, not yet in WebGPU.
CW: This workload is more GPU bound. For David, his workload was CPU bound.
YH: For compute shaders, we have to tune the workgroup sizes. For a fragment shader, the hardware helps you execute the shader invocations.
MM: If the shader is ALU bound or memory bound, it could be that the shaders you’re writing in SPIR-V don’t map to the hardware as well as GLSL shaders.
KN: It’s due to the programming model of compute shaders, I think. They are inherently giving you more knobs: workgroup sizes, manual control of workgroup local memory. Whereas fragment shaders don’t have control over everything else. The hardware can make it fast. For compute shaders, you’re responsible for getting the knobs right.
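As a small illustration of one of these knobs, the sketch below shows only the arithmetic a developer takes on when choosing a workgroup size: how many workgroups to dispatch and how many invocations end up with no work. It says nothing about which size is fast on a given GPU. The function names are hypothetical.

```python
def workgroup_count(total_items, workgroup_size):
    """Number of workgroups needed to cover every item (ceiling division)."""
    if total_items < 0 or workgroup_size <= 0:
        raise ValueError("total_items must be >= 0 and workgroup_size > 0")
    return -(-total_items // workgroup_size)


def idle_invocations(total_items, workgroup_size):
    """Invocations in the last workgroup that have no item to process."""
    return workgroup_count(total_items, workgroup_size) * workgroup_size - total_items
```

For 1000 items, a workgroup size of 64 needs 16 workgroups and leaves 24 invocations idle, while a size of 96 needs 11 workgroups and leaves 56 idle. With a fragment shader the hardware makes the equivalent decisions on the developer's behalf.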
MM: Yea that explanation is a little concerning. If the natural way to use the tool is worse, then we’re doing something wrong.
KN: They can still use a fragment shader. But yes, we were really disappointed.
MM: This group should provide help to let people get the most out of compute shaders.
KN: Yes, still will be very hardware-specific. I think the way we can help developers is to make sure WebGPU works well with tools like NSight, etc.
MM: That’s a good idea. Not sure if it’s sufficient, but possibly necessary.
YH: Intel also compared experimental WebML workloads with WebGL TF.js. The WebGL backend is being optimized by Google folks, but there has been little optimization of the WebGPU backend so far.
DM: Should we ultimately expect WebGPU to be faster than WebGL? I agree with MM that we should seek an opportunity to make the knobs easier for users. If we can get to the root of why it’s slower today.
KN: What we learned when working on TF.js -- one is that the way TF.js is structured is that you have many separate operations. Between each op, no matter how small, you’re writing out to memory and reading it back in. That’s a problem with both compute and draws. It’s possible with draws that there’s some tile-local memory that’s making it better. We were trying to squeeze performance out, and our compute shaders were pretty close to the theoretical maximum. WebGL was getting half which is crazy that a fragment shader could do that.
Work for WebGPU in Chromium on Windows. Lazy resource init, device loss handling (Natasha).
DM: Godot is interested in doing a WebGPU backend
DM: Made a FOSDEM talk about “Building WebGPU in Rust”
DM: Lots of improvements to the wgpu library. Growing the ecosystem of libraries and applications.
DM: Already applications in the iOS store that use wgpu.
DM: Pain points from our users: uploading data, and understanding timelines.
DM: Integrating wgpu in Gecko, and it works! Have compute shaders working. Most of the API works but showing pictures on the screen is a challenge.
DM: Servo runs the compute examples correctly too as of last night!
MM: Why hook it into Servo?
DM: Servo is our playground for our latest and greatest things.
CW: Many people asking questions about WebGPU. We’ve started chatting with the community on a Matrix channel. Sent an email to the mailing list recently. Definitely not an official channel. More support, not decisions for WebGPU.
(with Wendy Seltzer)
Draft charter: https://github.com/gpuweb/admin/pull/15
CW: All companies have shown the legal folks the draft charter. As soon as we agree on the charter, we can go ahead and have a working group.
WS: Yes, W3C will then send it for Advisory Committee review for a month, encourage people who support the group to get representatives to indicate support, and provided we get support, we will have the working group.
CW: Can people in this group say we will work on the charter and send a final version by the end of this quarter?
DJ: Wendy, was there any feedback from the advance notice?
WS: None to me or to Francois.
CW: We haven’t had specific feedback either -- only what’s in the pull request.
WS: Generally there’s a sense that this is exciting work, and we would like to support that. Sometimes chartering a working group can help bring others into the mix or get team resources and attention.
CW: Okay, let’s agree to finish the charter by the end of this quarter; earlier if possible.
RC: How final is the charter? Lawyers like to review final things.
DJ: I think just minor edits left on bad links. I would think it’s one editing pass away from being final.
CW: We’ve put this off for a while, we should probably do it. There’s a long queue to get review. It’s better to send it earlier than later. At which point would people in this room be comfortable sending for TAG Review, and what additional documents would we like to send?
MM: I think there are at least two bars to cross. One of them we’ve already crossed: API well formed. The other bar: Some semblance of direction on shading languages. If we can cross those two, what we deliver to the TAG shouldn’t just be a spec. They don’t know 3D graphics languages, and many aren’t programming language experts either. This community group should produce an explainer for the general concepts. What’s also important is the Why.
CW: Should the explainer and Why be a living document for after TAG review as well, or do you see it as a throwaway?
MM: No preference. The audience for this type of document isn’t a web developer who wants to make an app, it’s not even for this group. Maybe it’s for new members to the group. Seems like a lot of effort to maintain, and I would be mildly opposed to it.
RC: FWIW the WebXR group does keep a living document, and it was helpful when they got TAG review to point to things and say why things are the way they are. They do keep it up to date even after TAG.
MM: Is it useful?
RC: It is useful to me; sometimes I look at it.
CW: Okay so cross the bars, and then figure out all the things that should go into the explainer.
MM: Before sent to the TAG, this group should review it.
CW: Some discussion about doing another in Phoenix, the week of May 4-8. It would be co-located with WebGL. We’ll do two days of WebGPU somewhere around the Khronos F2F.
DM: Ideally it would be adjacent to WebGL.
CW: And the next? Some discussions of Toronto.
DJ: I think Apple could also host whenever we need to, given enough notice.
MM: Haven’t met in Toronto before.
DM: I’ll look into it.
CW: Google could also host in Waterloo -- or Toronto. Maybe late summer? September-ish.
MM: If we go to Toronto, I’d like to not go in the winter. TPAC is 26-30 of October in Vancouver, CA.
Using buffer with multiple usages in the same compute pass #547
KN: Inside a single compute pass, can you use a buffer as both Storage and Indirect? We discussed that there’s implicit synchronization for RW storage buffers on the dispatch. The question is whether we define the usage scope as one dispatch (whereas for rendering it is per render pass), or whether compute passes have the same validation rules, but we do implicit synchronization between dispatches.
The benefit of it not being valid is that it makes things easier. Only Storage->Storage synchronization in implementations.
RC: What is the use case for what he wants?
KN: The benefit is that in a single compute pass you can create indirect data and use it. The workaround is that he has two separate compute passes. One that creates the data, and one that uses it. We can go either way.
DM: I agree we can go either way. My preference would be to not have the complication and allow various usages inside the compute pass. It would be more complicated to define synchronization scopes, etc. Every dispatch is its own synchronization scope.
MM: Fine for me.
CW: Okay, resolved that every dispatch is a synchronization scope.
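A simplified model of what this resolution means for validation, not spec text. It assumes the rule that, within one usage scope, a buffer may have any number of read-only usages or a single writable usage, and it treats `storage` as the only writable usage. The usage strings and function names are hypothetical.

```python
from collections import defaultdict

# Usages that write to the buffer; every other usage is treated as read-only.
WRITABLE_USAGES = {"storage"}


def validate_usage_scope(uses):
    """Check one usage scope given as a list of (buffer_name, usage) pairs.

    Returns the sorted names of buffers that combine a writable usage
    with any other usage inside the same scope.
    """
    usages_by_buffer = defaultdict(set)
    for buffer_name, usage in uses:
        usages_by_buffer[buffer_name].add(usage)
    conflicts = []
    for buffer_name, usages in usages_by_buffer.items():
        has_writable = bool(usages & WRITABLE_USAGES)
        if has_writable and len(usages) > 1:
            conflicts.append(buffer_name)
    return sorted(conflicts)


def validate_compute_pass(dispatches):
    """Validate a compute pass in which every dispatch is its own scope."""
    return [validate_usage_scope(uses) for uses in dispatches]
```

With one dispatch that writes `args` as storage and a second that reads `args` as indirect, each per-dispatch scope is valid. Merging both into a single pass-wide scope would flag `args`, which is the case #547 asked about.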
#517 (formerly known as imageHeight)
KN: Important part is the link to #519 (table of APIs and what units they use for rowPitch and imagePitch).
KN: #517 proposes making imageHeight optional. Two semantics. Tightly packed or only valid for 2D copies. We could do a similar thing with rowPitch.
KN: The complexity with tightly packed is that there are some rules in some APIs that make tightly packed not an option.
KN: Personally, I’d advocate that imageHeight is only valid for 3D copies.
DM: I agree.
RC: No strong opinion.
CW: Okay. Resolved that imageHeight is only valid for 3D copies.
JG: Vote no on weird strides.
RC: How difficult is it to convert between weird strides and beautiful strides?
KN: They’re all easy to convert. We would just disallow weird strides. I’m happy to stick with the D3D12 way.
JG: In some way, I like bytes better, but I have no strong opinions. Having it in terms of pitches makes it constrained to the thing you need. Whereas if you do bytes and mess up, it’s gross. You also have to do more math that way. Weak vote for rowPitches.
CW: Resolved: Count in number of rowPitches.
KN: “imagePitch”. Thoughts?
DM: Doesn’t that imply bytes?
KN: I guess? D3D phrases it very differently. When you do a copy, you tell it how to interpret buffer data as texture data by specifying a SubresourceFootprint, and a D3D12Box which tells you the source region of the virtual texture you copy out / in.
RC: So will we not have the box?
KN: We don’t. You can do the same thing, because we have the sourceOffset, and you can set the rowPitch such that it’s large enough. You can do it but there’s math to convert.
RC: Other than ImageBitmap, is there anything else in the API called “image”?
MM: That’s good though. Textures are textures because they have mipmaps. This is just one slice.
KN: Sort of a unique case. Don’t have anything else in the API that describes this concept. Most of the APIs call it “image”. Metal, Vulkan, and GL.
RC: But Vulkan has a ton of other “Image” things.
JG: Yes, but they’re all purely two-dimensional.
All: nah, VkImage is a texture and can be 3D.
JG: Is “slice” better?
KN: I like “slice” for 3D textures and “layer” for 2D array textures.
DM: “texelsPerRow” for rowPitch ?
KN: Okay: “bytesPerRow”, “rowsPerImage” ?
CW: “rowStrideInBytes”, “imageStrideInRows” ?
CW: Defer to spec editors!
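To make the "math to convert" concrete, here is a minimal sketch of how a texel's position in a buffer could be computed when the row stride is given in bytes and the image stride is counted in rows, which is one reading of the resolution above. The parameter names follow the `bytesPerRow` and `rowsPerImage` candidates from the discussion. The final names were deferred to the editors, so treat them as placeholders. `bytes_per_texel` and `buffer_offset` are added for illustration.

```python
def texel_byte_offset(x, y, z, bytes_per_texel, bytes_per_row,
                      rows_per_image, buffer_offset=0):
    """Byte offset of texel (x, y, z) inside a linear buffer.

    bytes_per_row is the stride between rows in bytes; rows_per_image is
    the stride between images (slices or layers) counted in rows.
    """
    if min(x, y, z, buffer_offset) < 0:
        raise ValueError("coordinates and offset must be non-negative")
    if y >= rows_per_image:
        raise ValueError("row index must be smaller than rows_per_image")
    if (x + 1) * bytes_per_texel > bytes_per_row:
        raise ValueError("texel does not fit inside one row")
    image_stride_in_bytes = rows_per_image * bytes_per_row
    return (buffer_offset
            + z * image_stride_in_bytes
            + y * bytes_per_row
            + x * bytes_per_texel)
```

Counting the image stride in rows means the byte stride between images is always a whole multiple of the row stride, so the "weird strides" voted down above cannot be expressed.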
CW: Google would like to talk about Tint which is our proposal for compromise.
- text based, not binary
- described in terms of SPIR-V
- there will be SPIR-V → Tint and Tint → SPIR-V converters
MM: This completely solves one of the three goals Apple has. We suggest that this language be designed in this group and used by WebGPU. Very happy.
MM: We have some ideas about improvements to the language, technically. But before the technical details of variable scoping, we should talk about whether or not it’s a good idea.
DM: I was skeptical at first, about doing a higher-level bijective text format. This looks good to me. There are some places that are rough, it is probably not the ideal language for people to write in, but it is probably ideal for writing the CTS. One question: why is the goal to be bijective as opposed to trivially convertible?
DS: The bijectivity helps us make sure that everything you say is absolutely convertible to SPIR-V. We also think it’s useful to go the other way: convert SPIR-V to Tint. If someone has SPIR-V generation already, they can easily use a library to convert to Tint and use in WebGPU. There’s an ecosystem around it.
CW: Bijectivity means it’s at least as powerful as SPIR-V and not more powerful. We cover things like pointers, which don’t exist in GLSL. Also, outside of the web, almost every code base that supports multiple shading languages portably uses SPIR-V as their intermediate format. Transparent bijectivity does not close us off from the native ecosystem via WebAssembly. For SPIR-V → GLSL we know there are transpilers today. If we know there’s Tint → SPIR-V, we know we can definitely target GLSL.
DM: Is that a true claim? Most shader pipelines go through SPIR-V? My understanding is that most people write in HLSL, and avoid SPIR-V unless they need Vulkan.
CW: Well, they write in HLSL, and then go through SPIR-V to compile to MSL. My understanding is that Unity, Unreal, Valve, etc. go through SPIR-V. Except for DXIL because there’s DXC.
DM: The value of SPIR-V → Tint is that there’s existing codebases that use SPIR-V as an intermediate step. But don’t we know that some instructions are different in WebGPU and people will have to go through some steps to fix them?
CW: I don’t think so?
DJ: Are you asking that they would have had to be converted to WebGPU SPIR-V first? and that still applies.
MM: There are requirements that the Web has. SPIR-V is more expressive than any language we could possibly accept.
CW: To rephrase: the value of SPIR-V → Tint transparent is that the native ecosystem for portable stuff coalesces around SPIR-V for the shader compiling pipeline. So, something that transparently translates to Tint has a lot of value.
MM: I think it’s also worth pointing out that there are concepts in Tint that don’t correspond to exactly one SPIR-V instruction. So the language is capable of representing things that aren’t exactly 1:1. There aren’t Op prefixes everywhere.
JG: Would you prefer those?
MM: No, not at all.
MM: It is up to this group to decide the future of the language. If it does get inducted, and the group adds something representable in SPIR-V but not in Tint, it’s up to us to fix that.
CW: Dan already mentioned this for for loops. The loop construct doesn’t exist in other languages. It would be desugared to something else. That’s something that can be designed by this group but doesn’t map 1:1 to a single SPIRV OpCode.
MM: Another way of saying this is that this language is a platform, and we can do with it what we want. This is a good starting point to accept now, so we can at least have those discussions.
DM: I’m just shocked to see the violent agreement we have.
MD: First question: Do you have examples of a more complicated shader that has been converted from SPIR-V to Tint? My guess is that it would be a little bit unreadable. We’ll get a lot of machine-looking output code. The second question: It looks like there’s a gate around all of SPIR-V. Some sort of validation error or that something isn’t supported for WebGPU.
DS: It’s not so unreadable if you haven’t stripped Debug names.
MD: The question is what is the value above and beyond SPIRV directly? Why did I spend effort converting to it? I see:
- We like text
- There’s a gate. We’re not taking all of SPIR-V, we’re taking a subset. The second question is around defining exactly what that gate means.
MM: Text is good because on the web anyone can just start writing something. The second win is that web applications can do runtime compilation by stitching together strings. We don’t want websites to be forced to include a compiler.
MD: I think it’s okay. I think in practice with really long shaders, the code that will be generated is not going to be something a human will want. If it’s 1000 lines, I’m not going to want to touch it.
CW: Also, SPIR-V Cross does have ways to make generated expressions look more human-readable. We, or another tool, could do that.
MD: Second question: Large number of features in HLSL and SPIR-V that won’t be supported. There will be some gating function that says what shaders are acceptable.
DS: Yes, in this flavor, it’s the WebGPU Tint. You can only have f32 and i32. Other constructs won’t be convertible.
DJ: At one time we made a proposal for a language that by design was limited, but that does not give you the ability to translate back and forth.
MM: Anything that isn’t in the intersection of D3D12, Metal, Vulkan, isn’t allowed. It’s a problem for any API to enforce the intersection.
CW: Tint’s syntax also ensures that the SPIR-V produced is valid, e.g. it is not basic block soup like LLVM. It ensures structural validity.
DP: Are we ready to start talking about what form the spec would look like? In terms of what is / isn’t valid? Is that specified at the Tint level, or in the SPIR-V spec?
DS: Will have validation rules included from SPIR-V and Vulkan, which would be a starting point. We would probably port the SPIR-V universal 1.0 validation rules. We’d probably then look at Vulkan stuff as well.
DS: We would have the spec reference a specific version of the SPIR-V spec, copy that one, and then reference it. Wouldn’t modify it in any way but just reference it.
DP: And there’s some effort to specify a WebGPU execution environment? is that still necessary?
DS: I believe that’s still necessary. The rules will always produce what is valid in that execution environment, but it is still useful to specify so we can write tests against it.
CW: Execution environment explains and constrains stuff like variable initialization. Also stuff at the intersection of the shader and the API. Like if you declare a uniform buffer, there are alignment rules. 1. Details specifying the interface. 2. Additional validation. 3. More precise semantic of instructions.
MM: I mentioned our three desires. One of them is a single spec document describing where everything lives. We already had a discussion about this, and I wanted to bring it up that it matters. It would be really cool if you didn’t have to cross-reference three documents to write a valid program.
DS: You shouldn’t need the environment spec to write a valid program. Tint should have that in it. That’s more for implementation. You might have to look at what OpKill means.
MM: I think that decreases the count from three to 2, but my point still stands.
CW: Based on discussions with Neil Trevett, we can definitely merge Tint and SPIR-V spec into a single document.
MM: Also, there are legal implications. Neil Trevett said some stuff, and we need to validate. Still in the process of doing so.
DJ: To be clear, we’re happy with what he said. I believe his suggestion was that SPIR would publish the spec under Creative Commons, after which WebGPU could take a copy and republish it. What we need to look into is the bit about the click-through license.
CW: Okay, to be perfectly clear, the strong constraint we have on Tint is that it’s based on SPIR-V semantics. We don’t believe this group is able to come up with semantics that cover all the GPUs out there. We think that targeting SPIR-V semantics is the simplest and only practical way to ensure Tint works portably across all GPU architectures.
MM: We’re not disagreeing.
DJ: Totally happy with that.
MM: We’re more worried about the presentation.
DJ: Will this group be able to publish a single document that has an appendix or reference to what OpKill is?
CW: What’s the normative one if it explains it and points to SPIR-V?
DJ: I think it would be the Tint document, but if we did have to make some changes, we would discuss it with the SPIR-V group. We don’t want to change SPIR-V. The flip side is if SPIR changes SPIR-V….
DS: And we would pin a specific version. Even the environment spec pins us to 1.3.
MM: Can also say that in the unlikely event we have copied some text wrong, and it doesn’t match, we as a community group can say it’s a bug in our spec and we should fix it. If we’re targeting a particular version, it seems hard to believe we would copy-paste incorrectly.
DJ: David / John suggested ways to write the Tint spec and have the SPIR-V spec inline.
DJ: I think the other point Fil was making is that if we do that, there’s parts that may not be relevant to WebTint. This is just a processing issue. The question is whether we’re willing to do this.
CW: The worry I have with the normative version being Tint, is that there will be incentive or risk to evolve the language in ways that semantically don’t translate to SPIR-V.
MM: That’s up to us. We the community group get to decide the direction. Making the argument that something doesn’t map well to SPIR-V is a totally valid argument.
DS: Same sort of discussions we have now. Can we have push constants? well doesn’t work well on X so we can’t do it. Everything has to be representable otherwise it’s of no use to WebGPU.
CW: To clarify the worry, say we have TintKill and it produces OpKill which is a link to SPIR-V. If the normative version is TintKill, it’s a small refactor to say “TintKill does XYZ”. This type of refactoring ends up with something completely new that doesn’t structurally reference the parts of SPIR-V and diverges more and more.
MM: We can just not do that though? We are in control here.
DJ: I think you would have to make an exceptionally strong argument to make a change to go away from SPIR-V. We do not right now have any desire for that.
CW: Building on the SPIR-V semantics is our number one constraint to make sure we cover all GPU vendors.
DS: I think at this point we’re in agreement. If someone says down the line we want to implement something not supported in SPIR-V, this would be very frustrating.
DJ: And again, this would happen Tint or no Tint. But strong agreement on SPIR-V semantics.
CW: Any other thoughts?
DM: I would like to say that if we are to copy stuff into our own spec and decide we’re not going to diverge, then that’s a poor environment in which to not diverge. We’ll have to keep track of how the SPIR-V ecosystem is changing and keep up. Would rather have a stronger forcing function.
RC: Are you saying as SPIR-V evolves, we should not evolve?
DM: Saying that as SPIR-V fixes semantics of OpCodes, we have to keep up.
MM: There has to be some process though, we can’t automatically take things because we need to ensure security. Needs more than zero words said about it.
RC: My point of view:
- Generally supportive. As with any SPIR-V variant, need to ensure whatever we do is portable.
- If not true, we need to look at the intersection and specify that.
- This new thing, should be done in this group, the W3C.
DJ: I think that was our point (3) as well. This group should be able to control the language. We don’t want to change it, but if we really have to, we should be able to.
CW: Okay, so what I hear from all browser implementers, is that yay for the concept of a text language that is defined in terms of SPIR-V semantics, the spec lives in this group. References the SPIR-V spec, but at any point can normatively say that X is the semantic for a particular op. And the spec would strip out things unnecessary for WebGPU, and we would ingest a DOMString that is Tint in ShaderModuleDescriptor.
Discussion on scoping
DS: Current proposal: shadowing is forbidden. But also if a variable is defined in an inner block it becomes forbidden to use it after that block.
<discussion about what’s allowed and what isn’t>
MM: Want full lexical scoping. We have to do all the validation of if scoping rules are correct anyway.
DP: SPIRV → Tint still easy, and it would never do scoping. Just declare all variables at the top.
KN: I like disallowing shadowing, and enforcing names are different.
JG: Can we try lexical scoping and see if it breaks?
DS: Yea we can try.
DM: Does var have to be initialized?
DS: In WebGPU, we say yes.
DM: Even in a loop, at the beginning of the iteration?
DS: If not specified, we will initialize it to 0 every time.
DM: What I don’t understand is why we’re trying to pursue lexical scopes.
JG: It’s so close to what we have.
DM: In Rust, we use const most of the time, not var. And in Rust.. there’s a lot of complexity in how lexical scopes are implemented, and it would be simpler if we just follow SPIR-V here.
JG: It feels really straightforward to me, based on the renaming we have to do to convert to SSA ids. Maybe I’m wrong, but if we can do lexical scoping basically for free, we should probably just do it.
DM: Are we doing renaming? between the same variable name of different functions?
JG: Well any variable will be a series of SSAs.
DP: Generating new names for subexpressions.
DS: ai: Will talk to David on Tuesday.
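The two options discussed above can be compared with a toy scope checker. This is a sketch over a made-up statement representation, not Tint syntax: a program is a nested list of `("decl", name)`, `("use", name)` and `("block", [...])` tuples. With `allow_shadowing=False` it models the current proposal (no shadowing, and no use of a variable after its block ends). With `allow_shadowing=True` it models full lexical scoping.

```python
def check_scopes(statements, allow_shadowing=False, _scopes=None):
    """Return a list of error strings for a nested statement list."""
    scopes = _scopes if _scopes is not None else [set()]
    errors = []
    for statement in statements:
        kind = statement[0]
        if kind == "decl":
            name = statement[1]
            if name in scopes[-1]:
                errors.append(f"redeclaration of {name}")
            elif not allow_shadowing and any(name in s for s in scopes[:-1]):
                errors.append(f"{name} shadows an outer variable")
            scopes[-1].add(name)
        elif kind == "use":
            name = statement[1]
            if not any(name in s for s in scopes):
                errors.append(f"{name} is not in scope")
        elif kind == "block":
            scopes.append(set())  # names declared here vanish at block end
            errors.extend(check_scopes(statement[1], allow_shadowing, scopes))
            scopes.pop()
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return errors
```

Both modes need the same scope stack, which is in line with MM's remark that the validation has to be done anyway. The only difference is the single shadowing check.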
Operators (!=!= et al)
DS: Nuked all of them. Will do the thing where we have methods. SPIR-V has a bunch of weird comparisons like unordered equal or unordered greater than, etc. Those used to be !== and !=!=, etc. Those are functions now.
Can I write var x : i32 = f(g(y + 7)) ?
4 + 3 * 7 = 25
DS: Precedence is back in. Does whatever GLSL does.
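The `4 + 3 * 7 = 25` line can be reproduced with a small precedence-climbing evaluator. This is a sketch for integer `+`, `-` and `*` only. It is not the Tint grammar, and the `use_precedence` switch exists purely to show what a flat left-to-right reading would give instead.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*()])")
PRECEDENCE = {"+": 1, "-": 1, "*": 2}


def tokenize(source):
    """Split the source into number, operator and parenthesis tokens."""
    source = source.rstrip()
    tokens, position = [], 0
    while position < len(source):
        match = TOKEN.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens


def evaluate(source, use_precedence=True):
    """Evaluate an integer expression by precedence climbing."""
    tokens = tokenize(source)
    index = 0

    def primary():
        nonlocal index
        if index >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[index]
        index += 1
        if token == "(":
            value = expression(1)
            if index >= len(tokens) or tokens[index] != ")":
                raise SyntaxError("missing closing parenthesis")
            index += 1
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    def expression(min_level):
        nonlocal index
        left = primary()
        while index < len(tokens) and tokens[index] in PRECEDENCE:
            op = tokens[index]
            level = PRECEDENCE[op] if use_precedence else 1
            if level < min_level:
                break
            index += 1
            right = expression(level + 1)  # left-associative operators
            if op == "+":
                left = left + right
            elif op == "-":
                left = left - right
            else:
                left = left * right
        return left

    result = expression(1)
    if index != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[index]!r}")
    return result
```

With precedence, `evaluate("4 + 3 * 7")` gives 25. With `use_precedence=False` the same input gives 49, which is the result that `(4 + 3) * 7` has to spell out when precedence is in force.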
DS: Preference is to make it explicit so we can’t screw it up -- the implementation. I think the explicitness of types means that different implementations won’t do the wrong thing.
JG: In general, an expression has a return type. If we already know that, then we either need conversion rules.
CW: Hard to screw up until you have implicit promotions or implicit conversions.
DS: Don’t have those.
CW: Can you do (uint)1 + (int)1 ?
JG: I’d want that to be a type error.
JG: My dream would be there’s no type annotation on type declarations. Because every variable needs to be assigned, it can be inferred from the return type.
DS: We still need type annotation for input variables.
MM: I think it should at least be possible to write the type of something.
var <id> [ : <type>][= <expr>]
DP: There’s value in having the optional type so it is clear the programmer is definitely intending to do something.
DS: ai: Ask David if we can make the type optional.
MM: I looked at the SPIR-V op codes. Of the ones that are implementable on Metal, the return type can be inferred from the arguments.
DM: There are definitely some instructions. I can look into it and follow up. FOLLOW UP
CW: Think optional type is okay as long as we have no implicit conversion or promotion.
DM: Cautious about optional type. Don’t want to go too high level from SPIR-V. Don’t want to aim to make it a language everyone super loves. Either we have the explicit type or no explicit type. Optional is more complex than it should be.
MM: So you want: var x = isfloat(_____) where isfloat is only defined for floats.
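A minimal sketch of the combination being discussed: an optional type annotation on `var`, with the type inferred from the initializer, and no implicit conversion or promotion anywhere. Types are plain strings, and the names `u32`, `i32` and `f32`, as well as both function names, are placeholders rather than anything agreed in the meeting.

```python
def binary_result_type(op, left_type, right_type):
    """Type of `left op right` when no implicit conversion is allowed."""
    if left_type != right_type:
        raise TypeError(
            f"cannot apply {op!r} to {left_type} and {right_type} "
            "without an explicit conversion"
        )
    return left_type


def declared_type(name, annotation=None, initializer_type=None):
    """Resolve the type of `var <id> [: <type>] [= <expr>]`."""
    if annotation is None and initializer_type is None:
        raise TypeError(f"cannot infer a type for {name}")
    if (annotation is not None and initializer_type is not None
            and annotation != initializer_type):
        raise TypeError(
            f"{name} is annotated {annotation} but initialized with "
            f"{initializer_type}"
        )
    return annotation if annotation is not None else initializer_type
```

Under these rules CW's `(uint)1 + (int)1` is rejected, as JG wanted, and an annotation that disagrees with its initializer is an error instead of a silent conversion. That is the property CW says makes the optional form safe.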
CW: Jumping on what DM said. Initial Tint proposal was as exactly SPIR-V as we could. The question is how much sugar we want to have.
JG: I want a lot of sugar. There are a couple of these nice things that should be pretty easy. As long as we can reasonably find our way back to SPIR-V, then it’s okay.
CW: Right now, we go ask David Neto and see what he thinks. Probably not good he is an oracle for all things. We should also probably separate shading language and API discussions.
DS: I want to run it by him because he knows most about it to ensure we didn’t miss X, Y, Z.
DS: We have a process for doing things at the API level. I assume we can do a similar thing for Tint. Create a Github Issue, discuss, etc.
MM: Two other questions: Should we make a different repository? Should we have a subset of people discussing shading language? I agree that we should use the same repository and process.
DS: Yes, need to check with Google how things happen. Need to go through Open Source releasing.
CW: I think it would be more like gpuweb/tint.
MM: separate meetings?
CW: I think it depends on whether we think we’ll make enough progress cutting API-side meetings in half. Should probably be separate.
MD: I think if it’s separate, it’s more interesting for me and my team.
CW: Is it the same spec editors for Tint and the WebGPU API, or are the two too much?
DM: I think too much, also different expertise.
CW: No strong preference. Don’t want WSL, etc because it’s overloaded.
DM: Also a preference to not reuse previous names, because they’ve been used already and backfired.
MM: Would be cool if the name indicated what it’s used for, what you can do with it, etc. I’m not particular about the name, but I think Tint is a bad name.
MM: Guessing we can’t use SPIR.
JF: I’d like WebSL, as long as it is never shortened to WSL.
DM: Web GPU Language, WebGL
JG: I like Tint. It’s not immediately clear what it’s about, but the name is going to be infrequently used outside of WebGPU.
JG: I think we should think about this and discuss again.
CW: Though I agree with MM that we need to know what the name is soon.
JG: I don’t think a lot of time, perhaps just the next conference call. Let’s give ourselves a chance to do some creative thinking.
Debug extended instruction set
DS: In theory it would just be another import. That’s why Tint is generating OpNames and stuff to make debugging easier.
DP: The reason I ask is if there’s enough debug information to round trip to/from HLSL.
DS: Not sure how far. Tint would definitely output debugging information. Don’t know if it would round trip stuff like OpLine.
MM: Yea, we should have that.
KN: Iffy on that. It becomes confusing if there’s line numbers from HLSL, as well.
MM: C++ has stuff like #pragma line
KN: Also, I think there’s a SPIR-V opcode that lets you put the entire source.
DS: Yes there’s OpSource and OpSourceContinued.
MM: I’d like to not put entire other languages inside, because then you need escaping with \ or ` and so on.
Detection of the OpCapability used by Tint
MD: Had a question about feature declaration. SPIR-V lists all of the feature bits that will be used by the program. HLSL compiler checks what features are used and annotates the SPIR-V with it. Is it the responsibility of the Tint compiler to validate, or do I need to annotate at the top of my source?
DS: I’d like to say annotate, but I don’t know how feasible that is. Some of them turn on other capabilities. I feel like we would have to do it in the compiler, but I don’t want to. From the HLSL side, is doing it there problematic?
MD: Whether a feature is used or not is a huge complicated map. Reducing flags is also complicated, determining if one thing enables another. The other thing that is super expensive is validating correct usage.
DS: My gut instinct is that we say the users have to tell us in the first version. And we can see if we can relax that in the future. If not tenable, we can try to relax it.
MM: I think it’s a little different, because as far as I know, we won’t have that many optional capabilities -- just like extensions. We don’t want to fragment the web.
MD: I think that’s fair. I don’t know how many features we think we’ll be supporting. As it grows, it’s something you’ll need to think about, if it does.
DS: Right now, there are seven capabilities and one extension: the memory model.
MM: And in Tint, there’s no way to enable/disable. Right now we don’t have this problem. At the point we encounter the problem, we should come up with a good solution for it.
CW: The intent of the capability being “one of the following” is that they are singular. But you can have all of them, declared separately, if you want.
MM: So an intelligent compiler could look through your program and figure out the capabilities used. But for a first pass, we’ll probably just turn them all on.
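A rough sketch of the "compiler figures out the capabilities" option, including MD's point that one capability can enable another. The `implies` table is an illustrative subset of SPIR-V's "implicitly declares" relationships (the SPIR-V specification is the authoritative source), and the feature names in `featureToCapability` are hypothetical.

```typescript
// Illustrative subset of SPIR-V "implicitly declares" relationships.
const implies: Record<string, string[]> = {
  Shader: ["Matrix"],
  Image1D: ["Sampled1D"],
  ImageBuffer: ["SampledBuffer"],
};

// Hypothetical front-end feature names mapped to the capability each needs.
const featureToCapability: Record<string, string> = {
  entryPoint: "Shader",
  matrixType: "Matrix",
  storageImage1d: "Image1D",
  sampledImage1d: "Sampled1D",
};

// Returns the smallest set of capabilities to declare: every capability a
// used feature needs, minus those already implied by another one in the set.
function capabilitiesToDeclare(usedFeatures: string[]): string[] {
  const direct: string[] = [];
  for (let i = 0; i < usedFeatures.length; i++) {
    const cap = featureToCapability[usedFeatures[i]];
    if (cap === undefined) {
      throw new Error("unknown feature: " + usedFeatures[i]);
    }
    if (direct.indexOf(cap) === -1) {
      direct.push(cap);
    }
  }

  // Transitive closure of everything the direct capabilities imply.
  const implied: string[] = [];
  const pending: string[] = direct.slice();
  while (pending.length > 0) {
    const cap = pending.pop() as string;
    const next = implies[cap] || [];
    for (let i = 0; i < next.length; i++) {
      if (implied.indexOf(next[i]) === -1) {
        implied.push(next[i]);
        pending.push(next[i]);
      }
    }
  }

  return direct
    .filter(function (cap) {
      return implied.indexOf(cap) === -1;
    })
    .sort();
}
```

With these tables, a program that uses an entry point, a matrix type, and both 1D storage and sampled images only needs to declare `Image1D` and `Shader`; the other two are implied.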
Mandatory block offset
MM: Remind me again why it’s mandatory.
DS: In Vulkan, you have UBOs and SSBOs. UBOs are std140. SSBOs are std430. And there are extensions in Vulkan to let UBOs be std430, and relaxed block layout and scalar block layout. From the perspective of not trying to guess, “you tell me” is better.
MM: What about optional?
DS: If not provided, how do we know the offsets?
MM: From the rules you just said.
DS: So do you specify it’s an std140 struct?
MM: From JS there are no structs.
MM: What you’ve put in the spec is strictly more expressive than std140.
CW: In Vulkan, you can say the offset is 1. But the execution environment can enforce.
JG: Let’s consider rephrasing: we need to figure out what we can/can’t do with packing. Do we say std140 everywhere? etc.
DS: Right, explicit offsets are in Tint right now, because it makes the compiler part easier.
CW: Most of the modern hardware can use packed layouts. So the offsets are useful when you want to control the layout exactly.
MM: Yea, not arguing it should be impossible. Just don’t want every author to have to deal with byte offsets.
.. so we can default to std140, but optionally say std430, and specify offsets as well.
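To make "from the rules you just said" concrete, here is a small sketch of how default std140 offsets could be derived when an author does not write them. It covers only 32-bit float scalars and vectors (no arrays, matrices, or nested structs), and the names `MemberType`, `layoutInfo`, and `std140Layout` are made up for this illustration.

```typescript
type MemberType = "float" | "vec2" | "vec3" | "vec4";

interface LayoutInfo {
  size: number;
  align: number;
}

// Size and base alignment in bytes for 32-bit float scalars and vectors.
// Note that vec3 occupies 12 bytes but aligns to 16.
const layoutInfo: Record<MemberType, LayoutInfo> = {
  float: { size: 4, align: 4 },
  vec2: { size: 8, align: 8 },
  vec3: { size: 12, align: 16 },
  vec4: { size: 16, align: 16 },
};

function roundUp(value: number, multiple: number): number {
  return Math.ceil(value / multiple) * multiple;
}

interface StructLayout {
  offsets: number[];
  size: number;
}

// Assigns each member the next offset that satisfies its alignment.
function std140Layout(members: MemberType[]): StructLayout {
  const offsets: number[] = [];
  let cursor = 0;
  for (let i = 0; i < members.length; i++) {
    const info = layoutInfo[members[i]];
    cursor = roundUp(cursor, info.align);
    offsets.push(cursor);
    cursor += info.size;
  }
  // std140 rounds a struct's alignment, and so its size, up to 16 bytes.
  return { offsets: offsets, size: roundUp(cursor, 16) };
}
```

For the members `float, vec3, float, vec2` this yields offsets 0, 16, 28, 32 and a total size of 48 bytes. An explicit-offset syntax would let an author override exactly these numbers.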
CW: But we would still need the block declaration.
MD: Don’t have a strong opinion. You are absolutely correct that the mystical alignment restrictions were due to hardware. In practice, developers just do whatever until they get it right. I’m a big fan of having something specific.
CW: Would it be possible to have a pass to make it so that scalar layout can be converted into something like std140?
MD: Seems bad for performance, but sure, I think.
MD: With respect to that, I don’t know if we’ve talked about reflection and if we’ll enable that. Can I compile and get a reflection object out?
JG: It should generally be possible to not use reflection. Shouldn’t be mandatory.
MD: When it gets big, every real game engine is using some form of reflection.
JG: I’m going to say we should not have reflection. This has been a bottleneck moving WebGL implementations out of process. With WebGL, uniform locations are opaque objects, and then you need to use those to bind data. The performant data points to having everything in the shader being an explicit layout. It’s not done using API-level reflection. The engine knows that if I combine these shaders, I know where the attributes are.
MD: I have a lot of big customers and that’s not what they’re doing.
JG: My assertion is that from WebGL, mandating and even encouraging reflection is a big footgun for performance.
MM: Even if it’s asynchronous?
MD: Other thing that happens a lot is that people don’t know when assembling shaders, what exactly the inputs to the shaders will be. The shaders are complicated enough and they expect the compiler to figure it out and reflect.
CW: For an HLSL compiler, I build all my shaders, give it to DXC, and get back some reflection data. This takes into account optimizations that may have been done by the compiler. Say a texture is not used, the compiler omits it, etc.
MD: Absolutely, they depend on it actually. When you’re dealing with a huge shader library, it takes hundreds of inputs. They say the shader does a ton of things. What I want it to do is a simple diffuse map. They throw a few defines, throw it at the compiler, and get a reflection object.
JG: My hope would be that this is a level above us and they don’t have to deal with our security guarantees. We have to move the shader compilation, etc. out of the content process. We have to take it over to a different process, pretransform it, pass it to the GLSL driver, have it do the compilation, wait for that, then we ask it for reflection, and propagate it back to the content process.
CW: But in WebGL, you can’t see that the driver has optimized out a texture. If one browser has a fancier compiler, then browsers will reflect different data.
MM: This reflection information can be gathered by any compiler.
JG: I’m okay having this if it’s async, but I’m worried that’s not how these clients are going to use it.
MD: What do you mean async? You’re saying that Tint is not responsible for any optimization at all? Who exactly does the reflection?
JG: The game engine should do that. There should be another compiler above to optimize the shader and remove unused things.
CW: Don’t think reflection is off the table completely. One constraint is that it has to be asynchronous.
DP: Game engines do indeed do asynchronous shader loading. They usually do it on a background thread.
DM: I think we agreed softly last time to treat pipeline creation as asynchronous.
Sad but true that people make decisions based on shader reflection (example).
KN: We’ll definitely have a way to do it.
JG: Feels like biting off more than we originally thought. Thought that in user space you’d use an HLSL compiler with reflection, and pass SPIR-V.
CW: I think the question is: is there reflection, and if there is, is it deterministic?
MM: My claim is you can have both, a mix. Some parts that are deterministic and some that are not.
CW: The other problem is a pipeline is compiled against a BindGroupLayout. Either at the ShaderModule, we have to reflect, or after creating the pipeline, reflect, and then rebuild the pipeline?
MD: People use the reflection data to build the pipeline.
CW: We don’t have a two-step compilation model. There’s no step to look at the reflection data and change things.
MM: We can add an optional step.
DS: My other question. If you have Tint with 100 inputs, then DXC gives you 1 input, what data do we reflect?
MD: Customers want to rely on the optimizer to tell them what the inputs are.
JG: Don’t want to give customers false hope reflecting shaders, but not the optimized version.
DM: Having portability dependent on the optimizer is not good.
MM: I’m saying you produce it earlier pre-optimized.
CW: I think it would be great for usability to expose unoptimized reflection data.
DP: Is this a step toward just exposing the AST?
CW: This would be only the interface of the shader.
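One way to picture "unoptimized reflection of only the shader interface" is a function that reads the declared bindings and ignores whether the shader body uses them, so the result cannot differ between optimizers. This is a sketch under that assumption; `DeclaredBinding`, `LayoutEntry`, and `reflectInterface` are hypothetical names, not a proposed API.

```typescript
// Hypothetical description of one resource declared in a shader's interface.
interface DeclaredBinding {
  group: number;
  binding: number;
  kind: "uniform-buffer" | "storage-buffer" | "sampled-texture" | "sampler";
  // Whether the shader body statically references the resource. An optimizer
  // could drop unreferenced resources; this sketch deliberately ignores it.
  referenced: boolean;
}

interface LayoutEntry {
  binding: number;
  kind: DeclaredBinding["kind"];
}

// Builds per-group layout entries from declarations only, so the result
// depends on the source text and not on any optimizer.
function reflectInterface(declared: DeclaredBinding[]): LayoutEntry[][] {
  const groups: LayoutEntry[][] = [];
  for (let i = 0; i < declared.length; i++) {
    const d = declared[i];
    while (groups.length <= d.group) {
      groups.push([]);
    }
    groups[d.group].push({ binding: d.binding, kind: d.kind });
  }
  for (let g = 0; g < groups.length; g++) {
    groups[g].sort(function (a, b) {
      return a.binding - b.binding;
    });
  }
  return groups;
}
```

In this model a declared but unused texture still appears in the reflected layout, which is the opposite of what MD's customers get from an optimizing compiler such as DXC.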
DP: Would be clearly following the spec.
MD: In that case, don’t do it.
MD: Tooling is what’s important. It would be interesting if the Tint compiler had a mode where static analysis was guaranteed. Just for tooling (not included in the browser). Otherwise they will go Tint -> SPIR-V and reflect that, then SPIR-V -> Tint. Pick your poison.
CW: I like the last one.
CW: Break time. Agenda for the final session?
JG: Need to triage tomorrow’s agenda.
CW: Nothing else. Done with shading language discussions for today!
JG: One of the concerns is that our base hardware/driver requirements for webgpu mvp are generally not achievable ~everywhere. Part of the question is: do we care? what do we do about it?
JG: One of the options we floated before is that we are moving toward the ideal API we want, then make an “ES” version that adds some restrictions. Combined texture/sampler, etc.
CW: Texture view format reinterpretation. Others - will find list.
JG: Another option is to retarget our “baseline” API for these systems. Then go to our ideal API later. So: Do we want to change course?
- Separate texture and samplers (not in GL)
- Texture view reinterpretation
- Cubemap vs. 2D array
JG: Part of this question is how much support do we want to have for WebGPU 1.0?
RC: We should gather this list as soon as possible. Don’t want to realize we have to change a lot of stuff. If we are going to have feature levels, we should put all the limits (maxBindGroups, etc.) in the feature level. Better for fingerprinting and, from my experience in game dev, way easier to program against.
CW: I think it’s super important because devs can write code without peppering magic constants all over. Also, in contrast with D3D/OpenGL, you are validated against the limits you asked for, not the limits of the machine.
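A minimal sketch of "validated against the limits you asked for, not the limits of the machine". The default value, the `LimitTable` type, and the function names are assumptions made for the illustration.

```typescript
type LimitTable = Record<string, number>;

// Hypothetical baseline granted when the application requests nothing.
const defaultLimits: LimitTable = { maxBindGroups: 4 };

// Grants exactly what was requested (or the default), and fails if the
// adapter cannot provide it. The adapter's higher limits are not leaked.
function createDeviceLimits(adapter: LimitTable, requested: LimitTable): LimitTable {
  const granted: LimitTable = {};
  const names = Object.keys(defaultLimits);
  for (let i = 0; i < names.length; i++) {
    const limitName = names[i];
    const want =
      requested[limitName] !== undefined ? requested[limitName] : defaultLimits[limitName];
    if (want > (adapter[limitName] || 0)) {
      throw new Error(limitName + " of " + want + " is not supported by the adapter");
    }
    granted[limitName] = want;
  }
  return granted;
}

// Validation consults the device's granted limits, never the adapter's.
function fitsBindGroupLimit(deviceLimits: LimitTable, bindGroupCount: number): boolean {
  return bindGroupCount <= deviceLimits["maxBindGroups"];
}
```

On an adapter that supports 8 bind groups, a device created without a request still rejects a pipeline layout with 6 bind groups, so code that works on a large machine behaves the same on a small one.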
CW: WRT fingerprinting, the browser can “round” the limits to the feature levels. Or it can provide only min feature levels (e.g. for an ad frame).
RC: We should document feature levels and the limits of those feature levels.
CW: Agree. We can have feature levels and avoid fingerprinting by rounding adapters to specific limits to a feature level.
RC: Suggest specifying the feature levels and saying browsers must round to them.
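The rounding idea can be sketched as picking the highest feature level whose limits the adapter fully meets, and then exposing that level's limits in place of the raw hardware values. The level names and numbers below are invented for the example.

```typescript
interface LevelSketch {
  label: string;
  limits: Record<string, number>;
}

// Hypothetical feature levels, ordered from least to most capable.
const levels: LevelSketch[] = [
  { label: "FL0", limits: { maxBindGroups: 4, maxTextureSize: 4096 } },
  { label: "FL1", limits: { maxBindGroups: 8, maxTextureSize: 8192 } },
];

// Returns the highest level the adapter satisfies, or null if it does not
// even meet the lowest one. Only the level's limits would be exposed.
function roundToFeatureLevel(adapter: Record<string, number>): LevelSketch | null {
  let best: LevelSketch | null = null;
  for (let i = 0; i < levels.length; i++) {
    const level = levels[i];
    const names = Object.keys(level.limits);
    let satisfied = true;
    for (let j = 0; j < names.length; j++) {
      if ((adapter[names[j]] || 0) < level.limits[names[j]]) {
        satisfied = false;
      }
    }
    if (satisfied) {
      best = level;
    }
  }
  return best;
}
```

An adapter reporting 6 bind groups and a 16384 texture size rounds down to the hypothetical FL0, so a page sees the same values as on every other FL0 device, which is the fingerprinting benefit being discussed.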
JG: That’s definitely where we’re sort of heading.
DP: And explicit feature levels can be very nice because the developer knows exactly what to expect.
JG: Corentin’s point about exposing the dictionary anyway is also quite useful (though easy to polyfill).
JG: Explicit feature levels allows us to avoid people e.g. incorrectly guessing the max cubemap size based on the max texture size.
DJ: Can someone remind me where we landed on the fallback to WebGL with compute as feature level 0, and why that wasn’t pursued?
JG: I think the primary concern was our development work. Easier to develop WebGPU without some features than make a new WebGL version.
CW: Our concern was that some people may not adopt WebGL 2 because it doesn’t exist on iOS at this point. If we say WebGPU doesn’t work on many, many devices, it would be a big blocker for adoption, similar to WebGL 2.
MM: People generally eventually update their devices. Devices will be eventually able to use WebGPU. Saying that some devs don’t choose WebGL 2 because of iOS is just because we haven’t implemented it.
MM: If you fast forward into the future, eventually almost everything will support WebGPU.
CW: But 3-4 years ago when WebGL 2 came out, you could say the same thing. But in the meantime, adoption has lagged a lot because you could not have more than 90% reach with it.
JG: I guess you’re saying that it’s okay if it takes a while.
DJ: If we could get WebGL 2 support everywhere before WebGPU FL0 everywhere...
DJ: The devices you’re talking about are the Android devices with ES 3.1/the feature pack.
CW: For us, having WebGPU available on an additional very large fraction of Android devices and Windows devices is important enough that we want to do this. We think the things that change in the API can be subtractions and additional validation. Just remove things from the API to make it work. If this group said it didn’t want to do it, we would probably do it anyway for Chrome-specific stuff.
DJ: What would the performance impact be of using WebGPU FL0 rather than WebGPU?
CW: I think most applications would almost not see the difference. Some weirder cases like texture reinterpretation would see it.
DJ: Is there any way to implement the WebGPU 1 API fully, but emulate the bits that aren’t technically supported - and give a warning?
CW: I think it’s not possible to have a technically conformant implementation. I think it will pass 99.9% of the tests.
CW: We think we should add a level to the feature levels list.
DJ: Just trying to find a way to do it without fragmenting the spec. At lunch we were talking about feedback that WebGPU is already too far behind the latest (MM: doesn’t have last 2 generations of GPU features).
CW: We intend to do this work and find out exactly what we can and can’t paper over. I think the things that can’t be papered over will be counted on one hand. Then we can make an informed decision.
DM: Don’t we already know it’s not possible due to combined texture/samplers?
DM: Seems clear we can’t paper over everything. Maybe some. Would not like to have to limit the API in that way. I also don’t think this is a feature level. It’s not just limits.
MM: Agree. They’re not superset/subset. They’ve branched.
CW: Don’t agree that feature levels are strictly ordered. E.g. mobile has explicit tile control that desktop doesn’t have, desktop has uint64.
JG: Along the two branches they’re supersets. With just 2 tracks I think we can do well.
MM: I think there was a conflation between fingerprinting mitigations and supporting more devices, and those are different.
JG: This is basically a D3D11 feature level for D3D12.
RC: That exists.
CW: From talking with Geoff Lang, we figured out some things. But I think there’s a lot that we can’t find without just implementing and then running the CTS. And the list of tests that fails on our GLES3.1 and D3D11 backends will tell us what does and doesn’t work. It will take a while. But I wanted to discuss the possibility of FL0.
DM: We also intend to implement on ES3.1. We would like to share our set of limitations with Google so they match.
MM: If this is exposed to JS, we would be concerned about a version of WebGPU that doesn’t run in Safari.
DM: It should be all restrictions. Programs should run on WebGPU.
DM: Limitation would be that samplers and textures can only be combined statically.
JG: It would run, but there would be desire from people to use FL0 in Safari. So we’d be asking some work from Safari.
CW: It will be at least 6 months before we have an answer.
JG: Can we have a persistent doc? CW: Let’s start a HackMD.
CW: Ideas for what should be in it?
JG: Float blending. …
CW: A FL could require extensions; it wouldn’t be an extension.
DM: I like the idea of tiers that we all agree on. But the main problem is: how do we get them? How do we figure out the limits and their values?
RC: Let’s ask tomorrow about how they decided those. There are both FLs and tiers.
YH: How do you draw the line between feature levels and extensions?
MM: I propose: eventually everyone will have devices that have a feature level. An extension is optional forever. FLs designed to be mandatory “in the future”.
YH: we can’t say an extension is optional forever. It can become a core feature later.
MM: At the point when we decide whether something is an extension or a FL it’s done. We can’t do it retrospectively.
DP: Actually you can - we discover feature levels over time.
CW: OpenGL does this with taking extensions into core.
KN: Don’t think there are features that make sense to be included in an FL but don’t make sense as an extension.
YH: Say raytracing becomes a core feature in every mainstream API - then it becomes a higher feature level?
CW: Scared about raytracing in particular, but e.g. subgroup operations I think everyone will eventually have.
MM: A higher value for number of bind groups would be an extension first?
CW: This can be expressed by a limit.
CW: Kai would like to rename “extensions” to “capabilities”. Each limit is a number that can get better. Each capability is a boolean.
JG: Would like an FL to be defined exactly as a set of limits and extensions. So we can add limits and extensions and combine them into FLs later.
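A sketch of a feature level defined exactly as a set of limits (numbers) plus capabilities (booleans, modeled here as a list of names), together with a check that shows why levels need not be strictly ordered. It assumes higher is better for every limit, and the capability names `tile-control` and `uint64` are placeholders taken from CW's mobile/desktop example.

```typescript
interface LevelDefinition {
  limits: Record<string, number>;
  capabilities: string[];
}

// True when everything level `a` guarantees is also guaranteed by level `b`.
function isSubsetOf(a: LevelDefinition, b: LevelDefinition): boolean {
  const names = Object.keys(a.limits);
  for (let i = 0; i < names.length; i++) {
    if ((b.limits[names[i]] || 0) < a.limits[names[i]]) {
      return false;
    }
  }
  for (let i = 0; i < a.capabilities.length; i++) {
    if (b.capabilities.indexOf(a.capabilities[i]) === -1) {
      return false;
    }
  }
  return true;
}

// Hypothetical levels: a shared base and two tracks that branch from it.
const baseLevel: LevelDefinition = {
  limits: { maxBindGroups: 4 },
  capabilities: [],
};
const mobileLevel: LevelDefinition = {
  limits: { maxBindGroups: 4 },
  capabilities: ["tile-control"],
};
const desktopLevel: LevelDefinition = {
  limits: { maxBindGroups: 8 },
  capabilities: ["uint64"],
};
```

Here `baseLevel` is a subset of both tracks, but neither `mobileLevel` nor `desktopLevel` is a subset of the other, which matches JG's picture of two tracks that are supersets only along their own branch.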
JG: I think there might be some need for FLs to define hard upper limits.
KN: Are you saying if you got a FL you can’t go past it?
JG: Yes, but don’t really support it.
CW: How do we define feature levels? Where do they live in the IDL?
KN: Good to have agreement, but don’t think we need feature levels yet.
JG: I think we can have at least one feature level right now: one that has maxBindGroups=8 etc.
CW: Think that particular one will be hard to get past. Some Intel desktop chips...
KN: Maybe not: gpuinfo.org shows those may have moved past (Bryan: I think it’s 4 or 8 using a newer driver).
JG: Will make a proposal.
KN: Should not separate them, but should consider consistently combining throughout the API. So they’re the same.
KN: So TextureDescriptor would only have Extent3D, not arrayLayers.
DM: You can have 3D array?
MM: SampleLOD would need to take five parameters and need to fit in a float4.
KN: As far as I can tell, no one has implemented it.
RESOLVED: Number of layers for 2d-arrays should be size.z.
DM: So if you're creating a cube texture, you don’t have to specify depth of 6, but cube array..?
CW: No such thing as a cube texture. There’s only 2D array and you create a cubemap view.
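The resolution and CW's answer can be sketched together: a 2D texture's layer count comes from the z component of its size, and a cubemap is a view over six of those layers. The descriptor types below are hypothetical stand-ins (with `depth` playing the role of size.z), not the actual WebGPU IDL.

```typescript
interface Extent3DSketch {
  width: number;
  height: number;
  depth: number; // z component: depth for 3D textures, layer count for 2D
}

interface TextureDescSketch {
  dimension: "2d" | "3d";
  size: Extent3DSketch;
}

// For 2D textures the z component is the number of array layers.
// A 3D texture has a single layer; its z component is real depth.
function arrayLayerCount(desc: TextureDescSketch): number {
  return desc.dimension === "2d" ? desc.size.depth : 1;
}

// A cube view needs a square 2D texture and six consecutive layers.
function canCreateCubeView(desc: TextureDescSketch, baseLayer: number): boolean {
  return (
    desc.dimension === "2d" &&
    desc.size.width === desc.size.height &&
    baseLayer >= 0 &&
    baseLayer + 6 <= arrayLayerCount(desc)
  );
}
```

Under this model there is no separate `arrayLayers` field and no "cube texture" type; a 256x256 texture with a z size of 6 is simply a six-layer 2D array that happens to admit a cube view starting at layer 0.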
RC: In D3D11, you have to say you want a cubemap, it’s not just an array of textures.
JG: In ANGLE there’s a conversion copy that happens.
CW: That’s something we can paper over at some performance cost without changing the API to support D3D11.