Make helper invocations take part in subgroupQuad operations #1798

perlfu · 2022-05-05T06:25:11Z

Subgroup quad broadcasts should be marked as WQM because
helper invocations take part in subgroup operations if enabled.

This patch is free standing, but is intended to be paired with
LLVM D124981 to address a potential issue with Vulkan CTS test:
dEQP-VK.draw.renderpass.shader_invocation.helper_invocation

amdvlk-admin · 2022-05-05T07:22:03Z

Test summary for commit `777c182`

CTS tests (Failed: 1/187820)

Built with version 1.3.0.0

Rhel 8.2, Gfx10

Passed: 36645/65225 (56.2%)
Failed: 0/65225 (0.0%)
Not Supported: 28580/65225 (43.8%)
Warnings: 0/65225 (0.0%)

Ubuntu 18.04, Gfx9

Passed: 31111/57370 (54.2%)
Failed: 0/57370 (0.0%)
Not Supported: 26259/57370 (45.8%)
Warnings: 0/57370 (0.0%)

Ubuntu 20.04, Gfx8

Passed: 37802/65225 (58.0%)
Failed: 1/65225 (0.0%)
Not Supported: 27422/65225 (42.0%)
Warnings: 0/65225 (0.0%)

Failures:

FAILURE: dEQP-VK.synchronization.basic.event.multi_secondary_command_buffer
Stack trace: Script:
synchronizationWrapper->queueSubmit(queue, *fence): VK_TIMEOUT at vktSynchronizationBasicEventTests.cpp:337

jayfoad · 2022-05-05T08:01:23Z

You've only done this for CreateSubgroupQuadBroadcast. What about CreateSubgroupQuadSwap*?

More importantly, what about all the other subgroup operations that don't have "quad" in their name. They can also access other lanes in their quad, but they can also access lanes from outside the quad, so what does the spec say about how they are supposed to work? (I.e. are there different rules for subgroup "quad" operations and subgroup "non-quad" operations?)

perlfu · 2022-05-05T09:07:20Z

> You've only done this for CreateSubgroupQuadBroadcast. What about CreateSubgroupQuadSwap*?
>
> More importantly, what about all the other subgroup operations that don't have "quad" in their name. They can also access other lanes in their quad, but they can also access lanes from outside the quad, so what does the spec say about how they are supposed to work? (I.e. are there different rules for subgroup "quad" operations and subgroup "non-quad" operations?)

Yes, I suspect it needs to be added to quite a few more operations.
We do already use softwqm for several of the ballots.

github-actions · 2022-05-05T09:15:29Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2274033318/index.html.
Configuration: release_clang_shadercache_coverage_assertions.

github-actions · 2022-05-05T09:16:17Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2274033318/index.html.
Configuration: release_clang_coverage.

ruiling · 2022-05-08T15:21:51Z

I think we need to use wqm instead of soft_wqm, because WQM must be enabled for subgroupQuadXXX() to work correctly. Based on https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-helper-invocations, it is not quite clear whether helper invocations need to be active or inactive for other subgroup operations. I guess we insert soft_wqm for subgroup vote operations either to fix CTS or because some application assumes helper invocations will participate in subgroup operations.

perlfu · 2022-05-09T06:35:43Z

> I think we need to use wqm instead of soft_wqm, because WQM must be enabled for subgroupQuadXXX() to work correctly. Based on https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-helper-invocations, it is not quite clear whether helper invocations need to be active or inactive for other subgroup operations. I guess we insert soft_wqm for subgroup vote operations either to fix CTS or because some application assumes helper invocations will participate in subgroup operations.

Ideally, we only want helper invocations around if they are required, as not having them can save energy.
That is why I proposed using softwqm and the backend change.
The point of softwqm is to defer the actual decision to the backend, in effect saying "run in WQM if it would change the result".
The presence of other explicit WQM operations (image samples, demotes, etc.) would change the result.

However, I am also willing to be pragmatic about it.
In testing, I found that softwqm only turned up in 29 out of 10362 game pipelines I compiled.
In only one case was the softwqm not converted to WQM.
So in practice it probably makes little difference.
We can entirely remove all uses of softwqm and everything works (in the CTS test sense).
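
A minimal sketch of the deferral described above, under a deliberately simplified model: a hard request always forces WQM, while a soft request only follows along if something else in the shader already forces it. The names `WqmRequest` and `executesInWqm` are hypothetical and exist only for illustration; they are not LLVM or LLPC APIs, and the real backend analysis is more involved.

```cpp
#include <cassert>

// Hypothetical model for illustration only; not LLVM or LLPC code.
enum class WqmRequest { Soft, Hard };

// Decide whether an operation ends up executing in whole quad mode (WQM).
// shaderHasHardWqm: true if anything in the shader explicitly requires WQM
// (in the thread's terms: image samples, demotes, explicit WQM operations).
inline bool executesInWqm(WqmRequest request, bool shaderHasHardWqm) {
  if (request == WqmRequest::Hard)
    return true;           // hard requests always enable WQM
  return shaderHasHardWqm; // soft requests defer to the rest of the shader
}

// Usage: a soft request on its own does not enable WQM.
inline void softWqmExample() {
  assert(!executesInWqm(WqmRequest::Soft, /*shaderHasHardWqm=*/false));
  assert(executesInWqm(WqmRequest::Soft, /*shaderHasHardWqm=*/true));
}
```

In this model, a shader whose only WQM-related operation is a soft request runs without helper invocations, which is the energy saving being argued for.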

perlfu · 2022-05-09T06:45:03Z

Note: I've rewritten the way WQM is handled for subgroup operations. I believe this should be based on the shader stage alone, and not on looking for specific operations in the SPIRV. All the operations listed were specific to the fragment shader stage, and helper invocations are only valid in the fragment shader.

github-actions · 2022-05-09T07:14:26Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2292479142/index.html.
Configuration: release_clang_coverage.

github-actions · 2022-05-09T07:14:31Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2292479142/index.html.
Configuration: release_clang_shadercache_coverage_assertions.

github-actions · 2022-05-09T07:16:53Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2292468908/index.html.
Configuration: release_clang_coverage.

github-actions · 2022-05-09T07:16:59Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2292468908/index.html.
Configuration: release_clang_shadercache_coverage_assertions.

amdvlk-admin · 2022-05-09T07:41:24Z

Test summary for commit `d9bf703`

CTS tests (Failed: 0/187823)

Built with version 1.3.0.0

Rhel 8.2, Gfx10

Passed: 36645/65226 (56.2%)
Failed: 0/65226 (0.0%)
Not Supported: 28581/65226 (43.8%)
Warnings: 0/65226 (0.0%)

Ubuntu 18.04, Gfx9

Passed: 31111/57371 (54.2%)
Failed: 0/57371 (0.0%)
Not Supported: 26260/57371 (45.8%)
Warnings: 0/57371 (0.0%)

Ubuntu 20.04, Gfx8

Passed: 37804/65226 (58.0%)
Failed: 0/65226 (0.0%)
Not Supported: 27422/65226 (42.0%)
Warnings: 0/65226 (0.0%)

ruiling · 2022-05-09T08:23:23Z

> Ideally, we only want helper invocations around if they are required, as not having them can save energy. That is why I proposed using softwqm and the backend change. The point of softwqm is to defer the actual decision to the backend, in effect saying "run in WQM if it would change the result". The presence of other explicit WQM operations (image samples, demotes, etc.) would change the result.

We need to consider correctness before talking about saving energy. What if a fragment shader has only calls to subgroupQuadBroadcast(), with no image samples and no demotes? The launched wave still needs to have helper invocations enabled to get defined behavior if some invocations are helpers at wave launch time.

I think for quad group operations (https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-quad-operations), we should use wqm intrinsic. For non quad group operations, I think it is ok to use soft_wqm based on (https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-helper-invocations).

The change to always put a soft_wqm on subgroup vote operations in fragment shaders sounds fine to me.
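
The correctness concern above can be illustrated with a toy model of one quad. This is only a sketch under stated assumptions: `Quad` and `quadBroadcast` are hypothetical names, a helper lane is assumed to execute only when WQM is enabled, and a `false` return stands in for an undefined result.

```cpp
#include <array>
#include <cassert>

// Toy model of one quad (four lanes); hypothetical, not LLPC code.
struct Quad {
  std::array<int, 4> value;     // per-lane value being broadcast
  std::array<bool, 4> isHelper; // true if the lane is a helper invocation
};

// Toy quad broadcast: every lane reads the value held by lane `src`.
// Returns false (leaving `result` untouched) when the source lane is not
// executing, which models an undefined broadcast result.
inline bool quadBroadcast(const Quad &quad, unsigned src, bool wqmEnabled,
                          int &result) {
  assert(src < 4);
  // Assumption: helper lanes only execute when WQM is enabled.
  const bool srcExecutes = !quad.isHelper[src] || wqmEnabled;
  if (!srcExecutes)
    return false;
  result = quad.value[src];
  return true;
}
```

With lane 1 marked as a helper, broadcasting from lane 1 yields no defined value unless `wqmEnabled` is true, while broadcasting from a non-helper lane is defined either way.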

jayfoad · 2022-05-09T08:30:35Z

> I think for quad group operations (https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-quad-operations), we should use wqm intrinsic. For non quad group operations, I think it is ok to use soft_wqm based on (https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#shaders-helper-invocations).

That makes sense to me. Thank you for explaining. I did not know that the Vulkan spec had special rules for the "quad" operations.
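
The rule agreed on here, combined with the stage-based approach mentioned earlier in the thread, could be sketched roughly as follows. All names (`ShaderStage`, `SubgroupOpKind`, `WqmMode`, `chooseWqmMode`) are hypothetical and are not taken from SubgroupBuilder.cpp; the sketch assumes the decision depends only on the shader stage and on whether the operation is a quad operation.

```cpp
#include <cassert>

// Hypothetical names for illustration; none of these are LLPC/LGC APIs.
enum class ShaderStage { Vertex, Fragment, Compute };
enum class SubgroupOpKind { Quad, NonQuad };
enum class WqmMode { None, SoftWqm, Wqm };

// Policy as discussed in the thread:
//  - helper invocations only exist in fragment shaders, so other stages
//    need no WQM marking;
//  - quad operations get hard WQM for correctness;
//  - other subgroup operations get soft WQM, deferring to the backend.
constexpr WqmMode chooseWqmMode(ShaderStage stage, SubgroupOpKind kind) {
  return stage != ShaderStage::Fragment ? WqmMode::None
         : kind == SubgroupOpKind::Quad ? WqmMode::Wqm
                                        : WqmMode::SoftWqm;
}

static_assert(chooseWqmMode(ShaderStage::Fragment, SubgroupOpKind::Quad) ==
                  WqmMode::Wqm,
              "quad operations in fragment shaders use hard WQM");
```

Keeping the decision a pure function of stage and operation kind avoids scanning the SPIR-V for specific operations.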

kuhar

Just one nit. I don't know the semantics well enough to comment on the correctness of this.

> to address a potential issue with Vulkan CTS test:
> dEQP-VK.draw.renderpass.shader_invocation.helper_invocation

If this affects a CTS test, a shaderdb test that does not require a GPU would be very welcome: https://github.com/GPUOpen-Drivers/llpc/blob/dev/docs/Contributing.md#write-useful-tests. From what I remember, there weren't many tests that exercise OpKill/OpDemoteToHelper.

lgc/builder/SubgroupBuilder.cpp

s-perron

LGTM. Some type of test would be nice.

nhaehnle · 2022-05-10T14:07:34Z

I agree with @ruiling that quad ops must be (hard)wqm for correctness. For other subgroup ops, Carl's approach of enabling softwqm when demote is present is correct.

If we want to simplify things because we don't care about power, then enabling wqm unconditionally in fragment shaders is also correct.

Subgroup quad operations should be marked as WQM because
helper invocations take part in subgroup operations if enabled.

This addresses a potential issue with Vulkan CTS test:
dEQP-VK.draw.renderpass.shader_invocation.helper_invocation

Rework use of WQM intrinsics in subgroup operations to be based
only on shader stage and not use knowledge of operations used in SPIRV.

perlfu · 2022-05-11T06:27:44Z

Use explicit WQM for quad operations
Add basic tests
Address other review comments

github-actions · 2022-05-11T06:52:11Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2305223867/index.html.
Configuration: release_clang_coverage.

github-actions · 2022-05-11T06:52:17Z

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2305223867/index.html.
Configuration: release_clang_shadercache_coverage_assertions.

amdvlk-admin · 2022-05-11T07:21:57Z

Test summary for commit `40de282`

CTS tests (Failed: 0/187823)

Built with version 1.3.0.0

Rhel 8.2, Gfx10

Passed: 36645/65226 (56.2%)
Failed: 0/65226 (0.0%)
Not Supported: 28581/65226 (43.8%)
Warnings: 0/65226 (0.0%)

Ubuntu 18.04, Gfx9

Passed: 31111/57371 (54.2%)
Failed: 0/57371 (0.0%)
Not Supported: 26260/57371 (45.8%)
Warnings: 0/57371 (0.0%)

Ubuntu 20.04, Gfx8

Passed: 37804/65226 (58.0%)
Failed: 0/65226 (0.0%)
Not Supported: 27422/65226 (42.0%)
Warnings: 0/65226 (0.0%)

ruiling

Thanks for making this new version; it looks great to me. Removing useHelpInvocation makes the code much easier to understand.

perlfu · 2022-05-16T08:51:16Z

Looks like Jenkins test stalled.
Can we retest this please?

amdrexu · 2022-05-16T08:55:14Z

I re-ran the CI for you.

amdvlk-admin · 2022-05-16T09:44:25Z

Test summary for commit `40de282`

CTS tests (Failed: 0/186422)

Built with version 1.3.0.0

Rhel 8.2, Gfx10

Passed: 36645/65226 (56.2%)
Failed: 0/65226 (0.0%)
Not Supported: 28581/65226 (43.8%)
Warnings: 0/65226 (0.0%)

Ubuntu 18.04, Gfx9

Passed: 30002/55970 (53.6%)
Failed: 0/55970 (0.0%)
Not Supported: 25968/55970 (46.4%)
Warnings: 0/55970 (0.0%)

Ubuntu 20.04, Gfx8

Passed: 37804/65226 (58.0%)
Failed: 0/65226 (0.0%)
Not Supported: 27422/65226 (42.0%)
Warnings: 0/65226 (0.0%)

amdvlk-admin · 2022-05-16T10:40:13Z

Test summary for commit `40de282`

CTS tests (Failed: 1/172777)

Built with version 1.3.0.0

Rhel 8.2, Gfx10

Passed: 36645/65226 (56.2%)
Failed: 0/65226 (0.0%)
Not Supported: 28581/65226 (43.8%)
Warnings: 0/65226 (0.0%)

Ubuntu 18.04, Gfx9

Passed: 25675/42325 (60.7%)
Failed: 1/42325 (0.0%)
Not Supported: 16649/42325 (39.3%)
Warnings: 0/42325 (0.0%)

Failures:

FAILURE: dEQP-VK.robustness.image_robustness.bind.notemplate.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.img.samples_1.2d.vert
Stack trace: Script:
Crash

Ubuntu 20.04, Gfx8

Passed: 37804/65226 (58.0%)
Failed: 0/65226 (0.0%)
Not Supported: 27422/65226 (42.0%)
Warnings: 0/65226 (0.0%)

perlfu · 2022-05-18T08:29:58Z

I think the GFX9 CTS test failure is spurious, as the test involved has no subgroup operations.
Can you re-run the CI again? Thanks!

JaxLinAMD · 2022-05-18T08:44:35Z

retest this please

amdvlk-admin · 2022-05-18T09:41:55Z

Test summary for commit `40de282`

CTS tests (Failed: 0/186422)

Built with version 1.3.0.0

Rhel 8.2, Gfx10

Passed: 36645/65226 (56.2%)
Failed: 0/65226 (0.0%)
Not Supported: 28581/65226 (43.8%)
Warnings: 0/65226 (0.0%)

Ubuntu 18.04, Gfx9

Passed: 30002/55970 (53.6%)
Failed: 0/55970 (0.0%)
Not Supported: 25968/55970 (46.4%)
Warnings: 0/55970 (0.0%)

Ubuntu 20.04, Gfx8

Passed: 37804/65226 (58.0%)
Failed: 0/65226 (0.0%)
Not Supported: 27422/65226 (42.0%)
Warnings: 0/65226 (0.0%)

perlfu requested a review from a team as a code owner May 5, 2022 06:25

perlfu force-pushed the wqm-quad-broadcast branch from 777c182 to 2aebd50 Compare May 9, 2022 06:43

perlfu requested review from kuhar and s-perron as code owners May 9, 2022 06:43

perlfu force-pushed the wqm-quad-broadcast branch from 2aebd50 to d9bf703 Compare May 9, 2022 06:46

perlfu changed the title ~~Make helper invocations take part in subgroupQuadBroadcast.~~ Make helper invocations take part in subgroupQuad operations May 9, 2022

kuhar reviewed May 9, 2022

s-perron reviewed May 9, 2022

perlfu force-pushed the wqm-quad-broadcast branch from d9bf703 to 40de282 Compare May 11, 2022 06:25

ruiling approved these changes May 11, 2022

amdrexu approved these changes May 11, 2022

amdrexu merged commit 65603de into GPUOpen-Drivers:dev May 25, 2022
