
Use HLSL barriers that better match SPIR-V semantics #383

Closed · wants to merge 1 commit

Conversation

rossy (Contributor) commented Jan 6, 2018

We noticed one of our upscaler compute shaders was producing artifacts in recent builds of mpv (see mpv-player/mpv#5353):

[screenshot: upscaler output in mpv showing visual artifacts]

I narrowed this down to the recent barrier changes in glslang. Our upscaler uses the following barriers:

groupMemoryBarrier();
barrier();

KhronosGroup/glslang@070aaeafcdc0 (before the barrier changes) emits the following SPIR-V (parameters translated for easier reading):

OpMemoryBarrier Device CrossWorkgroup|Relaxed
OpControlBarrier Workgroup Device Relaxed

The current version of SPIRV-Cross translates them to the following HLSL. This worked in mpv:

DeviceMemoryBarrier();
GroupMemoryBarrierWithGroupSync();

KhronosGroup/glslang@2505057af8f4 (after the changes) emits different SPIR-V memory semantics:

OpMemoryBarrier Workgroup Uniform|Workgroup|AtomicCounter|Image|AcquireRelease
OpControlBarrier Workgroup Workgroup WorkgroupMemory|AcquireRelease

SPIRV-Cross, without this patch, translates the new semantics to the following HLSL, which results in the above artifacts in mpv:

DeviceMemoryBarrier();
DeviceMemoryBarrierWithGroupSync();

The new HLSL seems wrong because, as far as I can tell, DeviceMemoryBarrier doesn't enforce memory ordering for accesses to workgroup memory. Since SPIRV-Cross currently checks for MemorySemanticsWorkgroupMemoryMask with an equality comparison rather than treating the memory-semantics argument as a bitfield, adding the AcquireRelease flag makes it incorrectly emit DeviceMemoryBarrierWithGroupSync.

To try and figure out the proper semantics, I had a look at the SPIR-V generated by glslang for GLSL and HLSL intrinsics and the SPIR-V generated by Microsoft's DirectXShaderCompiler for HLSL intrinsics. The results are below:

| GLSL intrinsic | HLSL intrinsic | Opcode | Execution scope | Memory scope | Storage | Memory order |
| --- | --- | --- | --- | --- | --- | --- |
| memoryBarrier | AllMemoryBarrier | MemoryBarrier | - | Device | All | AcquireRelease |
| - | AllMemoryBarrierWithGroupSync | ControlBarrier | Workgroup | Device | All | AcquireRelease |
| - | DeviceMemoryBarrier | MemoryBarrier | - | Device | Uniform\|Image | AcquireRelease |
| - | DeviceMemoryBarrierWithGroupSync | ControlBarrier | Workgroup | Device | Uniform\|Image | AcquireRelease |
| - | GroupMemoryBarrier | MemoryBarrier | - | Workgroup | Workgroup | AcquireRelease |
| barrier | GroupMemoryBarrierWithGroupSync | ControlBarrier | Workgroup | Workgroup | Workgroup | AcquireRelease |
| memoryBarrierAtomicCounter | - | MemoryBarrier | - | Device | AtomicCounter | AcquireRelease |
| memoryBarrierBuffer | - | MemoryBarrier | - | Device | Uniform | AcquireRelease |
| memoryBarrierImage | - | MemoryBarrier | - | Device | Image | AcquireRelease |
| memoryBarrierShared | - | MemoryBarrier | - | Device | Workgroup | AcquireRelease |
| groupMemoryBarrier | - | MemoryBarrier | - | Workgroup | All | AcquireRelease |

As far as I understand, SPIRV-Cross should emit the least restrictive HLSL barrier intrinsic that still provides the semantics in the table above. I've done my best to implement this, though I'm not too familiar with compute-shader memory models, so I may have made some incorrect assumptions. I ignored the memory-scope argument because spirv_glsl.cpp does as well, and because I didn't understand what it means for memoryBarrierShared to operate at Device scope.

With these changes, SPIRV-Cross generates the following HLSL, which fixes the bug in mpv:

AllMemoryBarrier();
GroupMemoryBarrierWithGroupSync();

I also updated the tests with expected translations from GLSL to HLSL. Edit: I just noticed that Travis fails because its glslang is pinned to an older commit that generates different barriers. Testing the other shaders with the new glslang fails. I'm not sure what to do about this.

This should fix some issues with generating the wrong barriers after
KhronosGroup/glslang@8297936dd6eb.
@@ -18,6 +18,12 @@ void comp_main()
sShared[gl_LocalInvocationIndex] = asfloat(_22.Load(gl_GlobalInvocationID.x * 4 + 0));
GroupMemoryBarrierWithGroupSync();
_44.Store(gl_GlobalInvocationID.x * 4 + 0, asuint(sShared[(4u - gl_LocalInvocationIndex) - 1u]));
GroupMemoryBarrierWithGroupSync();
Contributor:

6 barriers back to back here is pretty intense; this seems like overkill.

rossy (Contributor, Author):

I was just making sure each GLSL intrinsic was translated into the expected HLSL intrinsic. In general, I think I'd feel safer if SPIRV-Cross's tests eventually covered every GLSL intrinsic. If this is too much, some of the memoryBarrier*() calls and the intrinsics that were already tested a few lines up could probably be removed.

Contributor:

right, ok, we should have tests for all possible combinations.

@HansKristian-Work (Contributor):

If glslang changed behavior, we can just change the commit hash used in the Travis and test_shader scripts.

@rossy (Contributor, Author) commented Jan 8, 2018

If the glslang commit hash is changed, I think the GLSL and MSL codegen will have to be fixed as well. With the new version of glslang, SPIRV-Cross translates:

memoryBarrierShared();
barrier();

to

memoryBarrier();
memoryBarrier();
barrier();

...which is probably still correct, but redundant. I don't think I'm the best person to fix the GLSL and MSL barriers.

@HansKristian-Work (Contributor):

OK, if GLSL/MSL has to be changed as well, I should probably have a look at the new codegen in glslang first.

@rossy (Contributor, Author) commented Jan 10, 2018

Fixed by #386

@rossy rossy closed this Jan 10, 2018