Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the uniformity analysis to the WGSL spec #1571

Merged
merged 15 commits into from
Feb 2, 2022
Merged

Conversation

RobinMorisset
Copy link
Contributor

@RobinMorisset RobinMorisset commented Mar 29, 2021

This is spec text based on my previous uniformity analysis for WSL, you can see the previous version at:https://lists.w3.org/Archives/Public/public-gpu/2019Jul/0000.html (and the detailed version in the doc at the bottom of that email). The main difference is adding support for "discard", which somewhat complicates things, but still slots reasonably well into the overall framework.

The analysis currently does not support pointers (since it is a part of the spec that is still in flux). When we do add pointers, every read of a pointer will be considered non-uniform, and every taking of the adress of a variable will make that variable non-uniform (improvable later if it proves crippling, but it would be fairly complex).

It has a couple of other limitations:

  • It cannot infer the uniformity of global variables, so it currently assumes that none of them is uniform. We can easily improve it by adding an annotation on global variables, declaring them as uniform. In that case, we would check that every store to them is uniform (preserving soundness), and consider every read of them to be uniform.
  • It does not split the lifetimes of variables. What I mean by that, is that if there is any non-uniform store to a local variable, then every read of that variable will be considered non-uniform as well, even those that happen before the store. Correcting for that in the context of loops would be extremely painful, and I don't consider this limitation particularly major (in any case where it bites the programmer, they may easily split the variable into two).

There are still a couple of TODOs in the uniformity section of the spec:

  • One about the treatment of infinite loops, I will do more research on that
  • One about double-checking the rules for loop. I will test them on a bunch of examples in the next few days as I add tests to the test suite.
  • One about the list of functions in the standard library that have interesting uniformity requirements. This one should be straightforward to fill.

I am very interested in getting feedback on this PR, both on the analysis itself and on its presentation. I've tried to explain it as clearly as I could, but I realize that I probably failed in quite a few places; I've just spent long enough working on this that I don't know which parts are hard to understand.


Preview | Diff

@github-actions
Copy link
Contributor

Previews, as seen at the time of posting this comment:
WebGPU | IDL
WGSL
c6cf71b

wgsl/index.bs Outdated

### Uniformity analysis overview ### {#uniformity-overview}

In this section we specify an analysis that verifies that functions which are only safe to call in uniform control flow (barriers and derivatives) are only called in such a context.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"e.g." barriers and derivatives

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will add.

wgsl/index.bs Outdated

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
An expression is said to be uniform if it is in uniform control-flow, and all invocations compute the same value for it.
The location of a local variable is said to be uniform, if all invocations have the same value in that variable at every program point where it is live.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really mean "location" of a local variable? Why not just a variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to replace it with just variable, I can't remember why I went with the more complicated phrase.

wgsl/index.bs Outdated
- If the analysis refuses a program, it provides a straightforward chain of implications that can be used to craft a good error message
</div>

The analysis analyzes each function in turn, verifying that there is a context where it is safe to call this function (otherwise it rejects the program), and computing metadata about the function to help analyze its callers in turn. This means that the call graph must first be built, and functions must be analyzed from the leaves (functions that call no function outside the standard library) upwards (toward the entry point), so that whenever we analyze a function we have already computed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: there are a lot of parentheticals here. I think reading them is required to understand the paragraph, so they should be promoted to regular prose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, will fix.

wgsl/index.bs Outdated
</div>

The analysis analyzes each function in turn, verifying that there is a context where it is safe to call this function (otherwise it rejects the program), and computing metadata about the function to help analyze its callers in turn. This means that the call graph must first be built, and functions must be analyzed from the leaves (functions that call no function outside the standard library) upwards (toward the entry point), so that whenever we analyze a function we have already computed
the metadata for all of its callees. There is no risk of being trapped in a cycle, as recurrence is forbidden in the language.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a newline here? (Throughout the rest of this PR, you seem to be putting paragraphs on single lines, except for this one specific place.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be valuable to use the term "topographical sort" to make this more clear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The newline is just an accident. Explicitly using the term "topographical sort" is a good idea, I'll add it.

wgsl/index.bs Outdated
The first phase walks over the AST of the function, building a directed graph along the way.
The second phase explores that graph, resulting in either rejecting the program, or computing the constraints on calling this function.

Note: Each vertex in this graph corresponds either to the statement "The control-flow at a specific program point must be uniform" or to "The value of a specific expression must be uniform" or to "A specific variable must always hold a uniform value" or is one of the special vertices True and False.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "corresponds to" mean here? Does it mean that each vertex holds log2(5) bits of state? Or does each vertex actually just hold 1 bit of state, and the meaning of that one bit is different depending on what kind of AST node it corresponds to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, paraphrasing each of these 5 values, I see:

  1. Something must be uniform
  2. Something else must be uniform
  3. A third thing must hold a uniform value
  4. true
  5. false

So how does a vertex indicate the possibility of non-uniformity? None of the states (except for "false") seem to be able to describe that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Corresponds to" means "here is how I think about the meaning of this, and why the rules look like they do". The vertices hold no information, they can be represented by a single integer, and edges are just pairs of integers.

Something having to be uniform is indicated by having a path from True.
Something that cannot be guaranteed to be uniform is indicated by having a path to False.
I will try to make it more explicit.

wgsl/index.bs Outdated

TODO: do we need an extra behavior for possibly infinite loops? I should check whether some form of reconvergence happens after something like `while(tid mod 2) {}`.

If the set of behaviors for a statement is a singleton, we take that to mean that reconvergence happened, and we pick the vertex corresponding to control-flow at the start of the statement as the vertex corresponding to control-flow at the exit of the statement, overriding the other rules below.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using words like "we" in a spec? Maybe instead of "we take that to mean that" we should instead say "this means that"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, will fix.

wgsl/index.bs Outdated

### Uniformity rules for function calls ### {#uniformity-function-calls}

The most important and complex rule is for function calls:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we really be ascribing importance to individual rules? It's certainly the most complex, but "important" seems like a value judgement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I guess for a spec it might be inappropriate phrasing. I'll try to rephrase it.

wgsl/index.bs Outdated
<td>
<td class = "nowrap">*CF*, *s1* -> *CF1*, *B1*<br>
*CF1*, *s1* -> *CF2*, *B2*
<td>*B1* \ {Nothing} U *B2*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the actual Unicode characters for these operations?

  • U+2229 INTERSECTION
  • U+222A UNION
  • U+2208 ELEMENT OF
  • U+2209 NOT AN ELEMENT OF
  • U+2216 SET MINUS
  • U+2205 EMPTY SET
  • U+2282 SUBSET OF

https://www.unicode.org/charts/PDF/U2200.pdf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I took the shortcut of staying with ascii, but I probably should replace all of these. Thanks for the link.

Copy link
Contributor

@kdashg kdashg Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. &#x2229; -> ∩? Fancy!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.toptal.com/designers/htmlarrows/math/
&cap; &cup; &xcap; &xcup; &isin; &notin; &setminus; &empty; &sub; &nsub; &sube; &supe; &nsup; &sup;
∩ ∪ ⋂ ⋃ ∈ ∉ ∖ ∅ ⊂ ⊄ ⊆ ⊇ ⊅ ⊃

wgsl/index.bs Outdated
<table class='data'>
<caption>Uniformity rules for statements</caption>
<thead>
<tr><th>Statement<th>New vertices<th>Precondition<th>Behaviors<th>Resulting control-flow vertex<th>New edges
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for loops don't seem to be listed here. We should either
A) Describe the desugaring in a more robust way in the "for statement" section of the spec, and link to it here
B) Add a row in the table for the for loop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got confused by the "new vertices" column heading of the chart, because the "resulting control-flow vertex" is a new vertex, but doesn't appear in the "new vertices" column.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding the "precondition" column. Above, you say

In the table below, CF1, S -> CF2, B means "run the analysis on S starting with control-flow CF1, apply the required changes to the graph, and name the resulting control-flow CF2, and the resulting set of behaviors B".

And these CF1, S -> CF2, B rules are inside the "precondition" column of the table. How can a precondition involve running an analysis and modifying a graph? I always thought a precondition was just a claim that either is true or untrue, and didn't modify state or anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not list the rules for "for" loops, because they are supposed to desugar to "loop".

The "resulting control-flow vertex" may be a brand new vertex (in which case it will also be in the "new vertices column"), but it may instead be the result of the application of a rule on a sub-expression.

You are right that "precondition" is a bit of an abuse of language here, I should not have kept it from more conventional type systems. Maybe "recursive analyses" or something like that is a better name for this column.

*CF1*, *s1* -> *CF2*, *B2*
<td>*B1* \ {Nothing} U *B2*
<td>*CF2*
<td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can there not be a new edge between s1 and s2 in s1; s2? I think I'm missing something fundamental here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, there is a typo here: it should be *CF1*, *s2* -> *CF2*, *B2* on the second line, I accidentally repeated s1 instead of s2.

I guess it did not help with understanding, sorry. To further clarify things:

  • s1 and s2 are statements, not vertices in the graph, so it makes no sense to have edges between them.
  • the first line says that CF1 is the vertex corresponding to the control-flow at the exit of s1.
  • the second line says to then use it as the control-flow at the entrance of s2, and name the vertex corresponding to the control-flow afterwards CF2
  • so we don't need to explicitly create a new edge here, because the rules for *CF1*, *s2* -> *CF2*, *B2* must already have made a path from CF2 to CF1, since we cannot have uniform control-flow after s2 if it was non-uniform before it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 The way I'm thinking of it is that each statement generates two nodes in the analysis graph, essentially a precondition and a post-condition. So the two lines express CF1===Post(s1) === Pre(s1) and so the path in the analysis graph is CF -> CF1 -> CF2

wgsl/index.bs Outdated
<tr><td class="nowrap">loop {*s1*}
<td>*CF'*
<td class="nowrap">*CF'*, *s1* -> *CF1*, *B1*
<td>*B1* \ {Break, Continue} U {Nothing}
Copy link
Contributor

@litherum litherum Mar 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that loop should U {Nothing}? Control flow will always enter the loop, and it can't exit without a break or a return or a discard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, I took that part from the rules for while, which can exit whenever the condition is true.
The more precise behaviors here should be B1, removing Continue, and replacing Break by Nothing if it exists.
I'll make the change in the next commit.

@RobinMorisset
Copy link
Contributor Author

Thanks a lot @litherum for all of these comments! They inform me of which parts need some clarification.
I'll answer all of them tomorrow morning, but will try to clarify a few things inline right now.

@litherum
Copy link
Contributor

Instead of "true" and "false" we might want to use "definitely uniform" and "possibly nonuniform"

@RobinMorisset
Copy link
Contributor Author

Instead of "true" and "false" we might want to use "definitely uniform" and "possibly nonuniform"

This is almost what I had in a previous version of the analysis. I picked True/False to make the analogy between edges and logical implication clearer, but I can easily reverse this and go back to these names.

@dneto0
Copy link
Contributor

dneto0 commented Mar 30, 2021

Before getting into the heart of the analysis:

  • data in var<uniform> is always uniform because it's the same variable shared across all invocations in the entry point, and a read-only storage class
  • samplers, and sampled textures have the same properties

Also, right now WebGPU is guaranteeing (i asked long ago) that there is no aliasing of storage locations across resource variables. That's probably implicitly assumed by the analysis.

For infinite loops: I think the standard thing to say is that no semantics is defined for the program if it loops infinitely. Shaders are assumed to finish in finite time; in real life operating systems intervene to kill runaway processes, which pragmatically speaking is the same.

I'd be interested to see if the analysis allows an interesting example such as @toji 's clustered-forward shading example: https://twitter.com/tojiro/status/1344409446802362372?lang=en and live page here

wgsl/index.bs Outdated

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
An expression is said to be uniform if it is in uniform control-flow, and all invocations compute the same value for it.
A local variable is said to be uniform, if all invocations have the same value in that variable at every program point where it is live.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a little misleading, since it's not "same value" really, but rather the value is dependent on the same control flow in all invocations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something must be said about what "all invocations" mean here. Is it "all" through the draw/dispatch? "all" in the current scope? Or even "all" as in all active at any point within the wave?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with @jdashg's comment here. The way I use the term, a variable can be in uniform control flow and be non-uniform because it has different values across the scope of the uniformity. I would argue this definition because it is useful. If a control flow construct depends on a non-uniform variable under my definition, the control flow will be non-uniform. To express that in a definition where uniformity of a variable depends on the uniformity of its control flow, we would have to redefine a characteristic of variables to plug into that statement.

Furthermore, the uniformity of a uniform buffer concerns the persistence of its values across a broad scope. The same should apply to defining a variable as uniform.

@pow2clk pow2clk self-requested a review March 30, 2021 18:38
@kdashg kdashg self-requested a review March 30, 2021 18:38
@kvark kvark self-requested a review March 30, 2021 18:44
@alan-baker alan-baker self-requested a review March 30, 2021 19:04
Copy link
Contributor

@alan-baker alan-baker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the 2021-03-30 meeting, I was asked to link to Vulkan's scoping definitions.

Vulkan defines invocation group as workgroup if that is a defined concept, otherwise it is the command. See https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#shaders-scope-command

For shaders without defined workgroups, this set of invocations forms an invocation group as defined in the SPIR-V specification.

SPIR-V only defines uniform control flow (i.e. divergence and reconvergence) in terms of invocation groups. See https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#UniformControlFlow.

wgsl/index.bs Outdated Show resolved Hide resolved
wgsl/index.bs Outdated Show resolved Hide resolved
wgsl/index.bs Outdated
The following definitions are merely informative, trying to give an intuition for what the analysis in the next subsection is computing.
The analysis is what actually defines these concepts, and when a program is valid or break the uniformity rules.

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even informatively I would avoid saying lockstep execution occurs. Also, I think the invocations need scoped. For compute each workgroup defines uniformity. For other shaders I'd lean towards what SPIR-V defines as an invocation group (though this may prove to be to restrictive).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For purposes of utility, I would define uniform control flow as being control flow that depends on values certain to be the same over the defined scope of uniformity. The programmer doesn't care if they are executed in lockstep or not. It's not strictly relevant to the restrictions this introduces. Rather than having to define how hardware works, let's define how you can write a program that trips it up.

wgsl/index.bs Outdated

<table class='data'>
<caption>Uniformity rules for expressions in lvalue positions</caption>
<thread>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<thread>
<thead>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, fixed.

wgsl/index.bs Outdated
<td>*B1* \ {Nothing} U *B2*
<td>*CF2*
<td>
<tr><td class="nowrap">if (*e*) then *s1* else *s2*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this chains for elseif, but it would be good to include.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

@dneto0
Copy link
Contributor

dneto0 commented Mar 30, 2021

First discussion in meeting of 2021-03-30

@kdashg
Copy link
Contributor

kdashg commented Mar 30, 2021

WGSL meeting minutes 2021-03-30
  • AB: Misunderstanding of the proposal. Expected “there are some things that originate non-uniform values or behaviors” but don’t see that addressed. E.g. some builtins are definitely uniform and others definitely non-uniform. Eg. group ID value, invocation ID value. Is this more conservative than I thought?
  • RM: Have TODOs to fill in for different builtin functions. The analysis is about propagating the facts through a view of the program. But can change the inputs / leaf facts as needed.
  • AB: Looking at behaviours of a function body. If neither Nothing or Return, then reject the program. But each loop that terminates must have a Break; so anything with a loop is rejected.
  • RM: When you exit a loop, you remove the “Break” from the set of behaviours, and replace it with “Nothing”. Will address by separating the prose.
  • AB: I think I get the intent. The discard one confused me. The only behaviour is Discard, or Return. Is a singleton of behaviours allowable, as long as each invocation has the same behaviour.
  • RM: Yes, a singleton would be accepted. If there is a singleton behaviour for a function, then the behaviour after the call site is the same as the behaviour before the call site: either all the invocations are still there, or none of them are.
  • RM: Idea is if you have a single behaviour for a compound statement (or function body), then the convergence property after the end of that statement is the same as at the beginning. Will try to add examples, and check against the test suite.
  • MM: Examples would have helped reading through.
  • MM: Bikeshed has a mechanism for marking a section as an example; examples are non-normative.
  • DM: does the PR allow private variables to be uniform?
  • RM: Currently all globals are considered non-uniform. DN pointed out some kinds of module-scope variables are uniform.
  • Discussion: sampled textures are uniform. Storage textures are pointers
  • Need some things to be clarified editorially; RM to continue edits. Direction is good.
  • AB: Need to define the invocation group.
  • DM: For vertex shader it doesn’t matter.
  • AB: Yep. For a fragment shader, it gets complicated.
  • JG: Is invocation group an entire draw call? That’s what it brings to mind, but maybe it’s the raster group.
  • AB: Will look it up.
  • MM: Does it affect the analysis for today’s WGSL? Sure, it may when adding features.
  • DM: It affects flat-interpolated floats.
  • GR: Derivatives don’t care about the full group; it only cares about the quad.
  • AB: SPIR-V / Vulkan doesn’t guarantee reconvergence of the quad after a prior divergence. So it’s more than just the quad.
  • JG: If interested in the content, tag yourself as a reviewer. Wait on merging until all the reviewers have had a chance to review. Expect to be permissive: don’t expect it to be perfect.
  • RM: Feel should wait at least one week. Hope to do the first round of edits by office hour tomorrow; may confirm builtins lists in office hours tomorrow.
  • AB: Checked Vulkan spec: invocation group for a fragment shader is the whole draw.
    • DM: Please post the reference.

Copy link
Contributor

@dneto0 dneto0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only partially reviewed this.
Sending this review now because it has actionable feedback, particularly on request for clarity of what's being modeled.

wgsl/index.bs Outdated
The following definitions are merely informative, trying to give an intuition for what the analysis in the next subsection is computing.
The analysis is what actually defines these concepts, and when a program is valid or break the uniformity rules.

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wording: "lockstep" is the wrong word here. Let's work on this as an editorial fix.

wgsl/index.bs Outdated
The analysis is what actually defines these concepts, and when a program is valid or break the uniformity rules.

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
An expression is said to be uniform if it is in uniform control-flow, and all invocations compute the same value for it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is stronger than I'm used to: I'm used to "all invocations compute the same value for it".
But maybe it amounts to the same thing?
Is there extra conservatism added because of requiring it to be in uniform control flow. Is it for simplicity of analysis/ diagnostics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to me. How would you say "all invocations" without implying the uniform control flow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only reason we need to define whether an expression is "uniform" is so we can place limits on what operations are in it. For this reason, we only care if it is in uniform control flow. These expressions might produce different results and still be in uniform control flow, that doesn't mean you can't use the uniform control limited operations in such expressions.

wgsl/index.bs Outdated

### Uniformity analysis overview ### {#uniformity-overview}

In this section we specify an analysis that verifies that functions which are only safe to call in uniform control flow (barriers and derivatives) are only called in such a context.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call out that it's a static analysis: it is done only by inspecting the program source.

wgsl/index.bs Outdated
The first phase walks over the AST of the function, building a directed graph along the way.
The second phase explores that graph, resulting in either rejecting the program, or computing the constraints on calling this function.

Note: Each vertex in this graph corresponds either to the statement "The control-flow at a specific program point must be uniform" or to "The value of a specific expression must be uniform" or to "A specific variable must always hold a uniform value" or is one of the special vertices True and False.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which agent has the obligation for the "must"? The analysis, the programmer, or the runtime behaviour of the implementation?

Is it provability?

The control flow at this point in the program is proven to be uniform?

Or

Semantics of the language require that this statement be executed in uniform control flow. (e.g. this is a control barrier)

wgsl/index.bs Outdated

In more detail, here is the algorithm for analyzing a function:

* Create some vertices called "True", "False", "CF_start", "CF_return" and if the function is non-void a vertex called "Value_return"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intuition behind True and False.

Is it

  • True = control flow is known to be uniform at this point
  • False = control flow is not known to be uniform at this point.

That highlights the one-sided error of the analysis, which is important to make clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I originally went for "mustBeUniform" "cannotBeGuaranteedToBeUniform", but found it too verbose, and think that True/False makes it clearer why edges can be interpreted as edges.

wgsl/index.bs Outdated
The second phase explores that graph, resulting in either rejecting the program, or computing the constraints on calling this function.

Note: Each vertex in this graph corresponds either to the statement "The control-flow at a specific program point must be uniform" or to "The value of a specific expression must be uniform" or to "A specific variable must always hold a uniform value" or is one of the special vertices True and False.
Each (directed) edge corresponds to an implication.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to describing more what implication is supposed to capture.

@kvark kvark mentioned this pull request Apr 1, 2021
Copy link
Contributor

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't finished my review yet, will do another pass after the first iteration.

wgsl/index.bs Outdated
The analysis is what actually defines these concepts, and when a program is valid or break the uniformity rules.

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
An expression is said to be uniform if it is in uniform control-flow, and all invocations compute the same value for it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to me. How would you say "all invocations" without implying the uniform control flow?

wgsl/index.bs Outdated
The following definitions are merely informative, trying to give an intuition for what the analysis in the next subsection is computing.
The analysis is what actually defines these concepts, and when a program is valid or break the uniformity rules.

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to define the scope of this, i.e. what subset of invocations is considered for uniformity

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree completely that this needs to be defined. To do so, we'll have to introduce the concept of waves/subgroups/warps/wavefronts/whatever. Which I don't think is in the spec currently.

This brings up another concern of mine. Will having one scope for "uniform" that applies to buffers and another scope for "uniform" as it applies to variables and control flow be confusing? I think so.

wgsl/index.bs Outdated

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
An expression is said to be uniform if it is in uniform control-flow, and all invocations compute the same value for it.
A local variable is said to be uniform, if all invocations have the same value in that variable at every program point where it is live.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something must be said about what "all invocations" mean here. Is it "all" through the draw/dispatch? "all" in the current scope? Or even "all" as in all active at any point within the wave?

wgsl/index.bs Outdated Show resolved Hide resolved
wgsl/index.bs Outdated Show resolved Hide resolved
wgsl/index.bs Outdated Show resolved Hide resolved
wgsl/index.bs Outdated
<thead>
<tr><th>Statement<th>New vertices<th>Recursive analyses<th>Behaviors<th>Resulting control-flow vertex<th>New edges
</thead>
<tr><td class="nowrap">*s1*; *s2*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this part. What statement is this in the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a sequence. E.g. two function calls {foo(); bar();}, this simply says that you should first analyze the first statement, then the second, using the control flow at the exit of the first one as the control flow at the entrance fo the second.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantically speaking, the grammar doesn't have this in the statements. Perhaps, we should move this rule out of the table?

wgsl/index.bs Outdated
<tr><td class="nowrap">if (*e*) then *s1* else *s2*
<td>*CFinside*, *CFend*
<td class="nowrap">*CF*, *e* -> *CF'*, *V*, *B*<br>
*CFinside*, *s1* -> *CF1*, *B1*<br>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm confused here again. We are defining a new vertex "CFinside", but what is it?
This command is supposed to mean "run the analysis on S starting with control-flow CF1, apply the required changes to t...", so it assumes that CF1 is known and given. But here where we are using this, we aren't defining it in any way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CFinside is a freshly allocated node.
It is passed to the recursive analysis so that it can make edges from other nodes to it.
And from the last column you have "CFinside -> {CF', V}", which adds two edges starting from it.

In terms of implementation, a vertex/node is just an integer. So this entire rule would look something like this (in pseudo-C++ syntax):

unsigned CFinside = ++allocatedVerticesCounter;
unsigned CFend = ++allocatedVerticesCounter;
auto (CF', V, B) = analyzeExpr(CF, e);
auto (CF1, B1) = analyzeStmt(CFinside, s1);
auto (CF2, B2) = analyzeStmt(CFinside, s2);
behaviorSet bReturn = union(B, B1, B2);
edges[CFinside].push_back(CF');
edges[CFinside].push_back(V);
edges[CFend].push_back(CF1);
edges[CFend].push_back(CF2);
return (CFend, bReturn);

It is probably not perfect C++ (in particular I have not checked the syntax for creating/destructuring tuples, and we probably should expand the edges vector when we increment allocatedVerticesCounter, and we should not use an apostrophe in an identifier), but it should hopefully clarify the meaning of this table.

Did this answer your question? And if so, do you have any advice on how I should present it in the spec?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think we should have some wording here to help the reader, like "CFinside is a node corresponding to XXX", and same for all the nodes we are introducing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edges[CFinside].push_back(CF');

I agree this is necessary, but I can't figure out from the table where this edge comes from. I'd suggest stating it explicitly.

wgsl/index.bs Show resolved Hide resolved
wgsl/index.bs Outdated
### Uniformity rules for function calls ### {#uniformity-function-calls}

The most complex rule is for function calls:
- For each argument, apply the corresponding expression rule, with the control-flow at the exit of the previous argument (using the control-flow at the beginning of the function call for the first argument). Name the corresponding value vertices "arg_i" and the corresponding control-flow vertices "CF_i"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the "CF_i" exactly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're the nodes returned by the recursive calls to the analysis when analyzing the arguments.
They're intended to correspond to the uniformity requirement of the control-flow at the end of evaluating the corresponding argument.

@kvark
Copy link
Contributor

kvark commented Apr 6, 2021

Also, before I forget, we really need to figure out the exact scopes for uniformity (see #669 thread) before proceeding with the analysis. The design space includes some interesting options if the scope is at least the sub-group.

@alan-baker
Copy link
Contributor

I'm concerned about the following case:

loop {
  storageBarrier();
  if (non_uniform_expr) {
    continue;
  }
  // ...
};

SPIR-V will not guarantee that the invocations are reconverged at the start of the subsequent loop iterations, so the barrier cannot be considered to be in uniform control flow. I don't think the analysis captures this.

Copy link
Contributor

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the last minute review.

Loathe as I am to bog us down in renaming discussions, I fear that fact that "uniform" means a different scope for variables and buffers will be confusing.

I think the definitions of "uniform" for different nouns should be based on utility as I've described internally.

I think introducing the notion of a block would be helpful in simplifying the way to determine uniformity as it would allow simply saying that a variable is uniform if it is assigned in a basic block whose execution depends on uniform control flow and all its dependencies are uniform. We could do away with defining a uniform expression and just say that certain operations cannot be used within a basic block that is non-uniform.

wgsl/index.bs Outdated

Control-flow is said to be uniform at a point in the program if all invocations execute in lockstep at that program point.
An expression is said to be uniform if it is in uniform control-flow, and all invocations compute the same value for it.