Validation of whole shader modules #974

kvark · 2020-08-04T19:40:29Z

This is spawned from #889, related to #644
The issue is split into 3 sections: problem statement, investigation, and proposal. Please consider these parts to be separate when reviewing.

Problem

Main question is, what can we validate at createShaderModule ("compiling"), and what should we put in the spec, versus what we validate at createXxxPipeline ("linking").

The question has 2 sides:

where/when the validation happens (i.e. "compiling" vs "linking")
what is validated (do we allow bad entry points that aren't used?)

Investigation

I can think of splitting all the shader work in 3 stages.

(1) Things we can do at compilation for the module as a whole:

parsing the module
- syntax checks, tokenization, reserved words, etc
type checking all variables, constants, and functions in the module
- this includes things like calling texture_sample ([wgsl] Adding textures to the spec. #909) with wrong number of texture coordinates

(2) Things we can do at compilation for any function:

check if it's (statically) recursive
determine if calling it breaks uniformity of the context (e.g. it discards pixels)
determine if it expects the context to be uniform (e.g. it samples from a texture with an implicit LOD)
figure out the interface: all inputs, outputs, samplers, uniform and storage resources, that the function statically uses
figure out what device features does a function require

For entry points specifically, we can detect binding collisions (#889) here.

(3) Things that we have to do at link time:

check for required device features
match the interface between programmable stages (Interface matching rules #644), e.g. varyings
match the interface with fixed-function hardware. This includes vertex attributes, output colors, multisampling, etc
match the binding with the pipeline layout
generate platform-dependent shader code. This stage should expect no errors, with an exception of unforeseen things, like the driver refusing to link a program that uses too many registers.

Proposal

We should check for shader validity early (i.e. (1) and (2) at shader compilation time), and report errors eagerly. Letting invalid code to be dead weight is only going to serve as a footgun, and I don't see how this would make our implementors life easier either.

The only things we have to validate at linking (e.g. createRenderPipeline) are basically interface matching rules, plus device features.

Whether or not we allow the implementations to defer the work to linking is something we can compromise on. However, forcing the implementations to be silent at shader compilation time is something I feel strongly against.

The text was updated successfully, but these errors were encountered:

Kangz · 2020-08-11T07:28:37Z

check for required device features

Don't we know which device features are available at createShaderModule time? The call to createShaderModule is on a GPUDevice so we should know what's available.

Whether or not we allow the implementations to defer the work to linking is something we can compromise on. However, forcing the implementations to be silent at shader compilation time is something I feel strongly against.

To go further, I think we should force some things to be checked at createShaderModule time, like parsing, typechecks and device feature checks, so that they can be reported in GPUShaderModule.compilationInfo.

kvark · 2020-08-11T13:33:58Z

Don't we know which device features are available at createShaderModule time? The call to createShaderModule is on a GPUDevice so we should know what's available.

We briefly touched on that in the last WGSL call: it would be nice if you could have entry points that use specific features. I.e. same module having entry points for subgroup operations and entry points without them (as a fall-back). In this case, shader module creation ("compilation") is totally fine, but at pipeline creation we'd compare the list of required features by the entry points with what the device supports.

grorg · 2020-08-11T18:43:13Z

Discussed at the 2020-08-11 meeting.

Resolution: Accept this and move it into Needs Specification

magcius · 2020-08-11T19:23:38Z

There's a possible complication here, which is how much cross-stage work WebGPU and drivers are going to do which might affect your "portable performance" goal.

Some drivers (basically NVIDIA) are a lot more aggressive at cross-stage optimizations, so they might notice that a vertex shader export isn't used in a pixel shader, and DCE all the way back up. This makes the compilation stage kind of pointless; it does basically the bare minimum error checking necessary by the standard, but defers most, if not all compilation work, to PSO creation, with options to do more shader compilation and shader replacement in the background on special driver threads.

Other drivers don't have these implemented, e.g. AMD's official driver does not do that many cross-stage DCE optimizations to the best of my knowledge, and so games that have only been tested on NVIDIA drivers might run more poorly on AMD. There is -- possibly! -- an opportunity for middleware like WebGPU to apply cross-stage optimizations, but it's a tricky tradeoff and it's always difficult to do serious performance analysis on a black box driver.

kvark · 2020-08-11T21:54:14Z

@magcius I agree with the input data you referenced (i.e. a lot of driver work happens at linking), but not with the "This makes the compilation stage kind of pointless" statement. Please note that this issue doesn't mention performance: it's focused on correctness. From that perspective, we can parse, load, and validate everything in a module before any linking is done. This is work that we have to do anyway, and deferring it to later doesn't bring any benefits, only confusing the users.

The example I provided on the call is users having a shader module that is not used in any pipeline. If the class (1) or (2) work is deferred to linking by the implementation, the user will not get any errors. But a different implementation will throw an error.

Kangz · 2022-02-03T10:54:01Z

#1241 fixes this partially and there's the notion of "spurious errors". Can this be closed?

kvark added investigation proposal wgsl WebGPU Shading Language Issues labels Aug 4, 2020

grorg added this to For Next Meeting in WGSL Aug 11, 2020

kdashg moved this from For Next Meeting to Resolved: Needs Specification Work in WGSL Aug 11, 2020

kvark mentioned this issue Nov 20, 2020

spell out when an implementation must reject a shader module #1241

Closed

Kangz added this to the V1.0 milestone Feb 3, 2022

kvark closed this as completed Feb 4, 2022

WGSL automation moved this from Resolved: Needs Specification Work to Done Feb 4, 2022

ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this issue Sep 6, 2022

Remove the last use of readonly-storage-textures. (gpuweb#974)

729be85

kainino0x mentioned this issue Feb 11, 2023

wgsl: limit validation #3815

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Validation of whole shader modules #974

Validation of whole shader modules #974

kvark commented Aug 4, 2020

Kangz commented Aug 11, 2020

kvark commented Aug 11, 2020

grorg commented Aug 11, 2020 •

edited

magcius commented Aug 11, 2020

kvark commented Aug 11, 2020

Kangz commented Feb 3, 2022

Validation of whole shader modules #974

Validation of whole shader modules #974

Comments

kvark commented Aug 4, 2020

Problem

Investigation

(1) Things we can do at compilation for the module as a whole:

(2) Things we can do at compilation for any function:

(3) Things that we have to do at link time:

Proposal

Kangz commented Aug 11, 2020

kvark commented Aug 11, 2020

grorg commented Aug 11, 2020 • edited

magcius commented Aug 11, 2020

kvark commented Aug 11, 2020

Kangz commented Feb 3, 2022

grorg commented Aug 11, 2020 •

edited