New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make helper invocations take part in subgroupQuad operations #1798
Conversation
Test summary for commit 777c182CTS tests (Failed: 1/187820)
Rhel 8.2, Gfx10Ubuntu 18.04, Gfx9Ubuntu 20.04, Gfx8 |
You've only done this for CreateSubgroupQuadBroadcast. What about CreateSubgroupQuadSwap*? More importantly, what about all the other subgroup operations that don't have "quad" in their name. They can also access other lanes in their quad, but they can also access lanes from outside the quad, so what does the spec say about how they are supposed to work? (I.e. are there different rules for subgroup "quad" operations and subgroup "non-quad" operations?) |
Yes, I suspect it needs to be added to quite a few more operations. |
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2274033318/index.html. |
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2274033318/index.html. |
I think we need to use wqm instead of soft_wqm because we need to enable wqm to make subgroupQuadXXX() work correctly. Based on https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-helper-invocations, whether helper invocation need to be active/inactive for other subgroup operations is not quite clear. I guess we are inserting soft_wqm for subgroup vote operations is fixing either CTS or some application assume helper invocations will participate subgroup operations. |
Ideally, we only want helper invocations around if they are required as not having them can save energy. However, I am also willing to be pragmatic about it. |
Note: I've rewritten the way WQM is handled for subgroup operations. I believe this should be based on the shader stage alone and not looking for specific operations in the SPIRV. All the operations listed were specific the fragment shader stage alone, and helper invocations are only valid in the fragment shader. |
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2292479142/index.html. |
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2292479142/index.html. |
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2292468908/index.html. |
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2292468908/index.html. |
Test summary for commit d9bf703CTS tests (Failed: 0/187823)
Rhel 8.2, Gfx10Ubuntu 18.04, Gfx9Ubuntu 20.04, Gfx8 |
We need to consider correctness before talking about saving energy. What if a fragment shader has only calls to subgroupQuadBroadcast() with no image sample, no demotes? The launched wave still need to have helper invocations enabled to get defined behavior if some invocations are helpers at wave launch time. I think for quad group operations (https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-quad-operations), we should use wqm intrinsic. For non quad group operations, I think it is ok to use soft_wqm based on (https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-helper-invocations). The change to always put a soft_wqm for subgroup vote operation in fragment shader sounds fine to me. |
That makes sense to me. Thank you for explaining. I did not know that the Vulkan spec had special rules for the "quad" operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one nit. I don't know the semantics well enough to comment on the correctness of this.
to address a potential issue with Vulkan CTS test:
dEQP-VK.draw.renderpass.shader_invocation.helper_invocation
If this affects a CTS test, a shaderdb test that does not require a GPU would be very welcome: https://github.com/GPUOpen-Drivers/llpc/blob/dev/docs/Contributing.md#write-useful-tests. From what I remember, there weren't many tests that exercise OpKill
/OpDemoteToHelper
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Some type of test would be nice.
I agree with @ruiling that quad ops must be (hard)wqm for correctness. For other subgroup ops, Carl's approach of enabling softwqm when demote is present is correct. If we want to simplify things because we don't care about power, then enabling wqm unconditionally in fragment shaders is also correct. |
Subgroup quad operations should be marked as WQM because helper invocations take part in subgroup operations if enabled. This addresses a potential issue with Vulkan CTS test: dEQP-VK.draw.renderpass.shader_invocation.helper_invocation Rework use of WQM intrinsics in subgroup operations to be based only on shader stage and not use knowledge of operations used in SPIRV.
d9bf703
to
40de282
Compare
|
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2305223867/index.html. |
The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2305223867/index.html. |
Test summary for commit 40de282CTS tests (Failed: 0/187823)
Rhel 8.2, Gfx10Ubuntu 18.04, Gfx9Ubuntu 20.04, Gfx8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this new version and looks great to me. Removing useHelpInvocation helps a lot to make it easy to understand.
Looks like Jenkins test stalled. |
I re-run the CI for you. |
Test summary for commit 40de282CTS tests (Failed: 0/186422)
Rhel 8.2, Gfx10Ubuntu 18.04, Gfx9Ubuntu 20.04, Gfx8 |
Test summary for commit 40de282CTS tests (Failed: 1/172777)
Rhel 8.2, Gfx10Ubuntu 18.04, Gfx9Ubuntu 20.04, Gfx8 |
I think the GFX9 CTS test failure is spurious, as the test involved has no subgroup operations. |
retest this please |
Test summary for commit 40de282CTS tests (Failed: 0/186422)
Rhel 8.2, Gfx10Ubuntu 18.04, Gfx9Ubuntu 20.04, Gfx8 |
Subgroup quad broadcasts should be marked as WQM because
helper invocations take part in subgroup operations if enabled.
This patch is free standing, but is intended to be paired with
LLVM D124981 to address a potential issue with Vulkan CTS test:
dEQP-VK.draw.renderpass.shader_invocation.helper_invocation