Vulkan Guide draft entry for VK_KHR_shader_subgroup_uniform_control_flow (#118)

* Vulkan Guide draft entry for VK_KHR_shader_subgroup_uniform_control_flow
alan-baker committed Jul 9, 2021
1 parent 6787f60 commit 605e3f3
Showing 2 changed files with 124 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -68,6 +68,7 @@ The Vulkan Guide is designed to help developers get up and going with the world
- [VK_KHR_imageless_framebuffer](./chapters/extensions/VK_KHR_imageless_framebuffer.md)
- [VK_KHR_sampler_ycbcr_conversion](./chapters/extensions/VK_KHR_sampler_ycbcr_conversion.md)
- [VK_KHR_timeline_semaphore](https://www.khronos.org/blog/vulkan-timeline-semaphores)
- [VK_KHR_shader_subgroup_uniform_control_flow](./chapters/extensions/VK_KHR_shader_subgroup_uniform_control_flow.md)
----

#### [Contributing](./CONTRIBUTING.md)
123 changes: 123 additions & 0 deletions chapters/extensions/VK_KHR_shader_subgroup_uniform_control_flow.md
@@ -0,0 +1,123 @@
# VK_KHR_shader_subgroup_uniform_control_flow

## Overview

[VK_KHR_shader_subgroup_uniform_control_flow](https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VK_KHR_shader_subgroup_uniform_control_flow.html)
provides stronger guarantees for reconvergence of invocations in a shader. If
the extension is supported, shaders can be modified to include a new attribute
that provides the stronger guarantees (see
[GL_EXT_subgroup_uniform_control_flow](https://github.com/KhronosGroup/GLSL/blob/master/extensions/ext/GL_EXT_subgroup_uniform_control_flow.txt)).
This attribute can only be applied to shader stages that support subgroup
operations (check `VkPhysicalDeviceSubgroupProperties::supportedStages` or
`VkPhysicalDeviceVulkan11Properties::subgroupSupportedStages`).
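
As a minimal sketch of what opting in looks like, the following compute shader
enables the GLSL extension and applies the attribute to its entry point. The
`#version`, the workgroup size, and the choice of a compute stage are
assumptions made for illustration; the attribute and extension names come from
`GL_EXT_subgroup_uniform_control_flow`.

```glsl
#version 450
// Subgroup built-ins such as subgroupElect() come from this extension.
#extension GL_KHR_shader_subgroup_basic : enable
// Enables the [[subgroup_uniform_control_flow]] attribute.
#extension GL_EXT_subgroup_uniform_control_flow : enable

// Workgroup size chosen arbitrarily for this sketch.
layout(local_size_x = 64) in;

// The attribute is written between the parameter list and the body of the
// entry point.
void main() [[subgroup_uniform_control_flow]] {
    // Shader body that relies on reconvergence within the subgroup.
}
```

When compiled to SPIR-V, the attribute is expected to become an execution mode
on the entry point, as described under Related Extensions.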

The stronger guarantees cause the uniform control flow rules in the SPIR-V
specification to also apply to individual subgroups. The most important part of
those rules is the requirement to reconverge at a merge block if all
invocations were converged upon entry to the header block. This is often
implicitly relied upon by shader authors, but not actually guaranteed by the
core Vulkan specification.

## Example

Consider the following GLSL snippet of a compute shader that attempts to reduce
the number of atomic operations from one per invocation to one per subgroup:

```glsl
// Free should be initialized to 0.
layout(set=0, binding=0) buffer BUFFER { uint free; uint data[]; } b;
void main() {
    bool needs_space = false;
    ...
    if (needs_space) {
        // gl_SubgroupSize may be larger than the actual subgroup size so
        // calculate the actual subgroup size.
        uvec4 mask = subgroupBallot(needs_space);
        uint size = subgroupBallotBitCount(mask);
        uint base = 0;
        if (subgroupElect()) {
            // "free" tracks the next free slot for writes.
            // The first invocation in the subgroup allocates space
            // for each invocation in the subgroup that requires it.
            base = atomicAdd(b.free, size);
        }
        // Broadcast the base index to other invocations in the subgroup.
        base = subgroupBroadcastFirst(base);
        // Calculate the offset from "base" for each invocation.
        uint offset = subgroupBallotExclusiveBitCount(mask);
        // Write the data in the allocated slot for each invocation that
        // requested space.
        b.data[base + offset] = ...;
    }
    ...
}
```

There is a problem with the code that might lead to unexpected results. Vulkan
only requires invocations to reconverge after the if statement that performs
the subgroup election if all the invocations in the __workgroup__ are converged at
that if statement. If the invocations don’t reconverge, the broadcast and
offset calculations operate on only part of the subgroup and produce incorrect
values. For example, invocations that have not rejoined the elected invocation
may never receive the allocated `base`, so they would write their results to
the wrong index.

`VK_KHR_shader_subgroup_uniform_control_flow` can be utilized to make the shader
behave as expected in most cases. Consider the following rewritten version of
the example:

```glsl
// Free should be initialized to 0.
layout(set=0, binding=0) buffer BUFFER { uint free; uint data[]; } b;
// Note the addition of a new attribute.
void main() [[subgroup_uniform_control_flow]] {
    bool needs_space = false;
    ...
    // Note the change of the condition.
    if (subgroupAny(needs_space)) {
        // gl_SubgroupSize may be larger than the actual subgroup size so
        // calculate the actual subgroup size.
        uvec4 mask = subgroupBallot(needs_space);
        uint size = subgroupBallotBitCount(mask);
        uint base = 0;
        if (subgroupElect()) {
            // "free" tracks the next free slot for writes.
            // The first invocation in the subgroup allocates space
            // for each invocation in the subgroup that requires it.
            base = atomicAdd(b.free, size);
        }
        // Broadcast the base index to other invocations in the subgroup.
        base = subgroupBroadcastFirst(base);
        // Calculate the offset from "base" for each invocation.
        uint offset = subgroupBallotExclusiveBitCount(mask);
        if (needs_space) {
            // Write the data in the allocated slot for each invocation that
            // requested space.
            b.data[base + offset] = ...;
        }
    }
    ...
}
```

The differences from the original shader are relatively minor. First, the
addition of the `subgroup_uniform_control_flow` attribute informs the
implementation that stronger guarantees are required by this shader. Second,
the first if statement no longer tests `needs_space` directly. Instead, all
invocations in the subgroup enter the if statement if any invocation in the
subgroup needs to write data. Because the whole subgroup enters together, it is
still converged at the inner subgroup election, so the enhanced guarantees
apply there. The write itself is guarded by a new inner test of `needs_space`.

There is a final caveat with this example. In order for the shader to operate
correctly in all circumstances, the subgroup must be uniform (converged) prior
to the first if statement.
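
One way to satisfy this caveat is sketched below: any per-invocation
(divergent) work is fully enclosed in its own if statement, and the
`subgroupAny` test sits at the top level of `main`. The `SRC` buffer, its
members, the `#version`, and the workgroup size are hypothetical and exist only
for this illustration. The sketch also assumes that the invocations of a
subgroup begin `main` converged, so that the stronger guarantees apply to the
merge point of the first if statement.

```glsl
#version 450
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_vote : enable
#extension GL_EXT_subgroup_uniform_control_flow : enable

// Workgroup size chosen arbitrarily for this sketch.
layout(local_size_x = 64) in;

// Hypothetical input buffer, not part of the original example.
layout(set=0, binding=1) readonly buffer SRC { uint count; uint values[]; } src;

void main() [[subgroup_uniform_control_flow]] {
    uint i = gl_GlobalInvocationID.x;
    bool needs_space = false;

    // Divergent region: only some invocations take this branch. It is
    // entered from converged control flow, so with the attribute the
    // subgroup is expected to reconverge at the end of the if statement.
    if (i < src.count) {
        needs_space = (src.values[i] != 0u);
    }

    // Top level of main again: the subgroup is expected to be converged
    // here, which is the precondition for the allocation pattern.
    if (subgroupAny(needs_space)) {
        // Ballot, election, broadcast, and guarded write as in the
        // rewritten example above.
    }
}
```

Placing the `subgroupAny` test inside the divergent branch instead would bring
back the original problem, because only part of the subgroup would reach it.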

## Related Extensions

* [GL_EXT_subgroup_uniform_control_flow](https://github.com/KhronosGroup/GLSL/blob/master/extensions/ext/GL_EXT_subgroup_uniform_control_flow.txt) - adds a GLSL attribute for entry points
to notify implementations that stronger guarantees for convergence are
required. This translates to a new execution mode in the SPIR-V entry point.
* [SPV_KHR_subgroup_uniform_control_flow](http://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/KHR/SPV_KHR_subgroup_uniform_control_flow.html) - adds an execution mode for entry
points to indicate the requirement for stronger reconvergence guarantees.
