Barriers proposal #1374
Comments
I'd be interested to hear from Apple about whether Metal allows composable
Wonderful investigation and proposal, thank you for the contribution!
We debated that aspect internally. The major alternative considered was using enums and constexprs, both of which are currently unspecified in WGSL. Long term, I think that would be good to add if a fully configurable barrier is ever added to WGSL (SPIR-V with the Vulkan memory model provides highly configurable barriers). I generally prefer a single configurable function call, but obviously different languages take different routes.
WGSL meeting minutes 2021-02-16
Add workgroupBarrier to WGSL
Fixes #1374
* Adds workgroupBarrier as a control barrier templated on affected storage classes
* Modifies func_call_statement grammar to allow limited templates
* Fix link to program order
* fix typo
* Implement outcome of VF2F 2020-02-23
* Remove templated function and replace with two control barriers
* workgroupBarrier - affects memory and atomics in workgroup
* storageBarrier - affects memory and atomics in storage
* both barriers synchronize with each other
* remove stale sentence
This PR syncs the CTS text to the spec text for the new @const decorations. Issue: gpuweb#2787
Workgroup Barrier
Add a single builtin function for barrier:
workgroupBarrier<storage_class_list>() -> void
workgroupBarrier is templated on a list of affected storage classes. storage_class_list is a comma-separated list of storage classes. The only valid storage classes for MVP are storage and workgroup.
workgroupBarrier is a control barrier with acquire_release memory ordering. That is, all memory, atomic, and barrier operations are ordered in program order relative to the barrier. Additionally, the affected memory and atomic operations program ordered before the barrier must be visible to all other threads in the workgroup before any affected memory or atomic operation program ordered after the barrier is executed by a member of the workgroup.
workgroupBarrier must only be used in compute shaders and must only be called from workgroup uniform control flow.
Translations
MSL
HLSL
SPIR-V
Discussion
This proposal represents the intersection of functionality across the underlying implementations. Memory ordering is not exposed in the MVP since all barriers use acquire_release orderings. There is no separate memory barrier because MSL does not expose one. Translating a memory barrier to a control barrier is not ideal because the control barrier would then require uniform control flow. A control-only (no memory) barrier was not included because there is no good translation into HLSL and its value is dubious.
Post-MVP, if read_write textures are supported, the storage class list would need some way to include textures.
Subgroup barriers will be an interesting extension: MSL uses uniform subgroup barriers, Vulkan uses non-uniform subgroup barriers, and subgroup barriers do not appear to be present in HLSL.
Survey
MSL
Barriers in MSL can only be used in kernel (compute) shaders. MSL provides two barrier functions: threadgroup_barrier and simdgroup_barrier. These are equivalent to OpControlBarrier with a Workgroup and Subgroup execution scope respectively. The synchronization is controlled by the mem_flags parameter on the barrier. mem_flags can have the following values:
All barriers are required to be executed in dynamically uniform control flow for the threadgroup or simdgroup. MSL does not explicitly state that mem_flags values can be OR'd together like a bit mask (examples only use a single value). Since barriers order all affected memory operations, they should have acquire_release memory order.
MSL does not appear to provide memory barriers, just control barriers.
Curiously, the documentation seems to indicate that simdgroup_barrier could be used in fragment shaders, but this is contradicted earlier in the specification.
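The control-barrier behaviour described above (every thread arrives, and writes made before the barrier are visible to reads made after it) can be mimicked on the CPU. The following is only an analogy, not shader code: Python threads stand in for the invocations of a workgroup, a plain list stands in for workgroup memory, and threading.Barrier stands in for the barrier. WORKGROUP_SIZE and run_workgroup are hypothetical names chosen for this sketch.

```python
import threading

WORKGROUP_SIZE = 4  # hypothetical workgroup size for this sketch


def run_workgroup():
    shared = [None] * WORKGROUP_SIZE   # stands in for workgroup memory
    results = [None] * WORKGROUP_SIZE  # one output slot per invocation
    barrier = threading.Barrier(WORKGROUP_SIZE)

    def invocation(local_id):
        # Write phase: each thread publishes one value.
        shared[local_id] = local_id * 10
        # Control barrier: no thread continues until all have arrived,
        # so every write above is visible to every read below.
        barrier.wait()
        # Read phase: read the slot written by the neighbouring thread.
        neighbour = (local_id + 1) % WORKGROUP_SIZE
        results[local_id] = shared[neighbour]

    threads = [
        threading.Thread(target=invocation, args=(i,))
        for i in range(WORKGROUP_SIZE)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_workgroup())  # [10, 20, 30, 0]
```

If one thread skipped the barrier.wait() call, the others would block forever, which loosely mirrors why these barriers must be called from uniform control flow.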
HLSL
HLSL provides both memory barriers and control barriers.
Barriers:
The WithGroupSync variants also act as control barriers with a Group (workgroup) execution scope. As such, these variants are required to be executed in control flow that is dynamically uniform across the workgroup.
All variants are only permitted in compute shaders, except DeviceMemoryBarrier, which is also permitted in Pixel (fragment) shaders. Since barriers order all affected memory operations, they should have acquire_release memory order.
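As a rough illustration of how a translator might pick an HLSL intrinsic for the proposed storage class list, here is a small lookup sketch. The intrinsic names come from the HLSL documentation, but the mapping itself (storage to device memory, workgroup to group-shared memory) and the helper name hlsl_barrier_for are assumptions made for this example, not something stated by the proposal.

```python
# Assumed mapping from the proposal's storage class list to the HLSL
# WithGroupSync intrinsics (the control-barrier variants).
HLSL_CONTROL_BARRIERS = {
    frozenset({"storage"}): "DeviceMemoryBarrierWithGroupSync",
    frozenset({"workgroup"}): "GroupMemoryBarrierWithGroupSync",
    frozenset({"storage", "workgroup"}): "AllMemoryBarrierWithGroupSync",
}


def hlsl_barrier_for(storage_classes):
    """Return the HLSL call for a list of affected storage classes."""
    key = frozenset(storage_classes)
    if key not in HLSL_CONTROL_BARRIERS:
        # An empty list would be a control-only barrier, which has no
        # direct HLSL counterpart; unknown classes are rejected too.
        raise ValueError(f"no HLSL barrier for {sorted(key)}")
    return HLSL_CONTROL_BARRIERS[key] + "();"


if __name__ == "__main__":
    print(hlsl_barrier_for(["storage", "workgroup"]))
```

The empty list is rejected on purpose, since none of the WithGroupSync variants synchronizes control flow without also ordering some memory.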
SPIR-V
Vulkan provides both memory (OpMemoryBarrier) and control (OpControlBarrier) barriers that are configurable in terms of affected execution scopes, storage classes and memory ordering.
Unlike other APIs, OpControlBarrier with a Subgroup execution scope is not a uniform barrier. That is, only active threads in the subgroup synchronize at the barrier. Because active threads are not well defined, it is difficult to specify exactly which threads will be involved in the synchronization.
Both memory and control barriers support the following memory orderings: None, Acquire, Release, and AcquireRelease. Acquire (and AcquireRelease) orders all loads to occur in program order relative to the barrier. Release (and AcquireRelease) orders all stores to occur in program order relative to the barrier.
Both memory and control barriers support synchronizing the following storage classes: UniformMemory (storage), SubgroupMemory, WorkgroupMemory (workgroup), ImageMemory (texture) and OutputMemory (output).
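To make the configurability concrete, the operands of an OpControlBarrier can be assembled as a bit mask. The sketch below uses the numeric values from the SPIR-V specification for the Workgroup scope and the memory semantics bits. The choice of Workgroup for both the execution and memory scope, and the helper name control_barrier_operands, are assumptions for illustration rather than the translation defined by this proposal.

```python
# Numeric values from the SPIR-V specification.
SCOPE_WORKGROUP = 2        # Scope: Workgroup
ACQUIRE_RELEASE = 0x8      # Memory Semantics: AcquireRelease

# Storage class name (as used in the proposal) -> memory semantics bit.
STORAGE_CLASS_BITS = {
    "storage": 0x40,       # UniformMemory
    "workgroup": 0x100,    # WorkgroupMemory
}


def control_barrier_operands(storage_classes):
    """Return (execution scope, memory scope, semantics) for OpControlBarrier."""
    semantics = ACQUIRE_RELEASE
    for storage_class in storage_classes:
        semantics |= STORAGE_CLASS_BITS[storage_class]
    # Assumption: both scopes are Workgroup.
    return (SCOPE_WORKGROUP, SCOPE_WORKGROUP, semantics)


if __name__ == "__main__":
    execution, memory, semantics = control_barrier_operands(["storage", "workgroup"])
    print(execution, memory, hex(semantics))  # 2 2 0x148
```

For both storage classes the mask is 0x8 | 0x40 | 0x100 = 0x148.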
The Vulkan Memory Model additionally enables finer-grained control of the availability and visibility of memory operations. Other APIs assume availability and visibility are automatic.