Implement part-pipeline scheme #1704

Merged
merged 15 commits into from Apr 4, 2022

piotrAMD
Contributor

This is a series of commits to implement LGC part-pipeline compilation scheme.

The part-pipeline is guarded by the new option EnablePartPipeline, which
is off by default. It is expected that when all the commits in the series
have been pushed, the old partial-pipeline compilation scheme will go
away.

The part-pipeline scheme is as follows.

First it compiles the FS with FS-applicable pipeline state (or finds it in
the shader cache). The FS ELF's PAL metadata contains information on FS
input packing for use when compiling the pre-rasterization
part-pipeline.

Then it compiles the pre-rasterization part-pipeline (VS,TCS,TES,GS)
with pre-rasterization pipeline state and the FS input packing
information (or finds it in the shader cache).

Then it uses the LGC ELF linker to link them together.
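
The three steps above can be sketched roughly as follows. This is only an illustration of the control flow under stated assumptions: `Elf`, `Cache`, `getOrCompile`, and `buildWithPartPipelines` are hypothetical names invented here, plain strings stand in for real hashes and ELF contents, and none of this mirrors the actual LLPC/LGC interfaces.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-ins; these are NOT real LLPC/LGC types.
struct Elf {
  std::string code;           // placeholder for compiled code
  std::string fsInputPacking; // placeholder for the FS input packing metadata
};

// Toy shader cache keyed by a string standing in for a hash of shader + state.
using Cache = std::map<std::string, Elf>;

// Return the cached part if present; otherwise compile it and cache the result.
Elf getOrCompile(Cache &cache, const std::string &key, const std::function<Elf()> &compile) {
  auto it = cache.find(key);
  if (it != cache.end())
    return it->second;
  Elf elf = compile();
  cache.emplace(key, elf);
  return elf;
}

Elf buildWithPartPipelines(Cache &cache, const std::string &fsKey, const std::string &preRastKey) {
  // Step 1: FS part, compiled with FS-applicable state only.
  Elf fs = getOrCompile(cache, fsKey, [] { return Elf{"fs", "packing-v1"}; });

  // Step 2: pre-rasterization part. Its cache key includes the FS input
  // packing, because the generated code depends on that information.
  Elf preRast = getOrCompile(cache, preRastKey + "|" + fs.fsInputPacking,
                             [&] { return Elf{"prerast(" + fs.fsInputPacking + ")", ""}; });

  // Step 3: "link" the two parts (string concatenation stands in for the ELF linker).
  return Elf{preRast.code + "+" + fs.code, fs.fsInputPacking};
}
```

In this sketch a second build with the same keys hits the cache for both parts and only repeats the link step.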

See also: llpc/docs/LlpcOverview.md.

Patch by @trenouf with contributions from @bsaleil and myself.

@piotrAMD piotrAMD requested a review from a team as a code owner February 18, 2022 13:07
@piotrAMD
Contributor Author

I realise this PR is quite large, but the way I have split the patch should make it more manageable to review.

Please expect delayed response time initially, as I am away next week.

@github-actions

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_1864537179/index.html.
Configuration: release_clang_shadercache_coverage_assertions.

@github-actions

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_1864537179/index.html.
Configuration: release_clang_coverage.

@amdvlk-admin
Collaborator

Test summary for commit 68195e9

Driver commits used in build
  • CWPACK: amd-master 39f8940199e60c44d4211cf8165dfd12876316fa
  • METROHASH: amd-master 3c566dd9cda44ca7fd97659e0b53ac953f9037d2
  • PAL: dev b638e90ca4e6e5a6fc4f00029d62f8e064aa18eb
  • SPVGEN: dev 34c9f9a74bc1b2b9a739933923920c5eeb9aa08f
  • XGL: dev 89a4cf115b9a2972ff00e2fe4959f4f1c8e7175b
  • LLVM-PROJECT: amd-gfx-gpuopen-dev afe13751c21d672a992d83f7d33856958f7d2f3b
CTS tests (Failed: 0/199959)
  • Built with version 1.3.0.0
  • Rhel 8.2, Gfx10
    • Passed: 37595/66653 (56.4%)
    • Failed: 0/66653 (0.0%)
    • Not Supported: 29058/66653 (43.6%)
    • Warnings: 0/66653 (0.0%)
  • Ubuntu 18.04, Gfx9
    • Passed: 37446/66653 (56.2%)
    • Failed: 0/66653 (0.0%)
    • Not Supported: 29207/66653 (43.8%)
    • Warnings: 0/66653 (0.0%)
  • Ubuntu 20.04, Gfx8
    • Passed: 38653/66653 (58.0%)
    • Failed: 0/66653 (0.0%)
    • Not Supported: 28000/66653 (42.0%)
    • Warnings: 0/66653 (0.0%)

Contributor

@s-perron s-perron left a comment

I have a few issues I would like to see fixed up:

  1. There are no new tests, but lots of new code. Tests are needed. I've commented on some new functionality that absolutely requires a test.
  2. Some of these changes are useful for more than part-pipeline compilation. We should make sure they work in other compilation schemes as well.
  3. The intermingling of the hashing code in the pipeline context is awkward. A function should do 1 thing. Also, most of the code that does the hashing is found in the poorly named dumper class. I think it would be useful to have all of the hashing code in one class. It makes it easier for people to find. I find it hard enough when there is a bug caused by something missing from the hash when I know all of the hashing is in 1 file. If it gets spread out all over the place, it will be that much harder.

lgc/elfLinker/ElfLinker.cpp (resolved)
lgc/interface/lgc/Pipeline.h (resolved)
lgc/patch/PatchResourceCollect.cpp (outdated, resolved)
lgc/include/lgc/state/PalMetadata.h (outdated, resolved)
lgc/state/PalMetadata.cpp (outdated, resolved)
llpc/context/llpcCompiler.cpp (outdated, resolved)
llpc/context/llpcCompiler.cpp (outdated, resolved)
llpc/context/llpcCompiler.cpp (outdated, resolved)
-void PipelineContext::setColorExportState(Pipeline *pipeline) const {
+// @param [in/out] pipeline : Middle-end pipeline object; nullptr if only hashing
+// @param [in/out] hasher : Hasher object; nullptr if only setting LGC pipeline state
+void PipelineContext::setColorExportState(Pipeline *pipeline, Util::MetroHash64 *hasher) const {
Contributor

I don't like having a single entry point that does two different things. It should be two functions. Leave setColorExportState as is and create a new function, hashColorExportState.

Member

As above.

-void PipelineContext::setVertexInputDescriptions(Pipeline *pipeline) const {
+// @param [in/out] pipeline : Middle-end pipeline object; nullptr if only hashing
+// @param [in/out] hasher : Hasher object; nullptr if only setting LGC pipeline state
+void PipelineContext::setVertexInputDescriptions(Pipeline *pipeline, Util::MetroHash64 *hasher) const {
Contributor

Same as the color export state. Two different functions please.

Member

I think I did that...
The rationale of the same function is that, when someone adds a new bit of state, it makes it more difficult to forget to add it to the hash.

Contributor

You could have two entry points that share code behind the scenes. It is very awkward to read a line like "setVertexInputDescriptions" that won't actually set anything because pipeline is nullptr.

Also, having the hashing code in PipelineContext is inconsistent with the rest of the hashing code, which is located in its own class. We should be moving to a single design handling the hashing to make things clearer. I'm open to which way that is, but I don't want a mixture of designs.
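
One way to get two entry points that share code, while still making it hard to add a field to the state and forget it in the hash, is to enumerate the fields in exactly one place. The sketch below is only an illustration of that idea: `ColorExportState`, `Pipeline`, `Hasher`, and `forEachColorExportField` are hypothetical stand-ins written for this note, not the real LLPC, LGC, or MetroHash types, and the FNV-style mixing is just a placeholder hash.

```cpp
#include <cstdint>

// Hypothetical stand-ins; NOT the real LLPC/LGC/MetroHash types.
struct ColorExportState {
  bool alphaToCoverageEnable;
  bool dualSourceBlendEnable;
};

struct Pipeline {
  ColorExportState colorExport{};
};

struct Hasher {
  std::uint64_t value = 1469598103934665603ull; // FNV-1a offset basis (placeholder hash)
  void update(std::uint64_t v) {
    value ^= v;
    value *= 1099511628211ull; // FNV prime
  }
};

// Shared code: the single place that lists the fields. A new field added here
// is picked up by both entry points below.
template <typename Fn> void forEachColorExportField(Fn &&fn) {
  fn(&ColorExportState::alphaToCoverageEnable);
  fn(&ColorExportState::dualSourceBlendEnable);
}

// Entry point 1: only sets state on the pipeline.
void setColorExportState(const ColorExportState &src, Pipeline &pipeline) {
  forEachColorExportField([&](auto member) { pipeline.colorExport.*member = src.*member; });
}

// Entry point 2: only hashes the same fields, in the same order.
void hashColorExportState(const ColorExportState &src, Hasher &hasher) {
  forEachColorExportField([&](auto member) { hasher.update(src.*member); });
}
```

Each function name then says what the call does, and neither takes a pointer that may be nullptr.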

Member

@trenouf trenouf left a comment

LGTM modulo Steven's and Jakub's comments. Especially if it passes tests. :-)

lgc/interface/lgc/Pipeline.h (resolved)

@piotrAMD
Contributor Author

I have a few issues I would like to see fixed up:

  1. There are no new tests, but lots of new code. Tests are needed. I've commented on some new functionality that absolutely requires a test.
  2. Some of these changes are useful for more than part-pipeline compilation. We should make sure they work in other compilation schemes as well.
  3. The intermingling of the hashing code in the pipeline context is awkward. A function should do 1 thing. Also, most of the code that does the hashing is found in the poorly named dumper class. I think it would be useful to have all of the hashing code in one class. It makes it easier for people to find. I find it hard enough when there is a bug caused by something missing from the hash when I know all of the hashing is in 1 file. If it gets spread out all over the place, it will be that much harder.

Thanks for the comments. I will add more tests and look into the refactorings suggested.

Re 3, this is a deliberate decision to keep the state and hash in sync, but maybe there is a better way to do that.

@s-perron
Contributor

s-perron commented Mar 3, 2022

Re 3, this is a deliberate decision to keep the state and hash in sync, but maybe there is a better way to do that.

The problem I have is that it creates a separate method for hashing. Maybe we can try to find a single way to do it. We can discuss this in another forum.

@piotrAMD
Contributor Author

piotrAMD commented Mar 4, 2022

Addressing most of the review comments. The changes include:

  • A shaderdb test for each of the corner cases where a follow-up fix was needed. In addition to CTS, there is now shaderdb coverage for these cases.
  • Some extra tests.
  • Using shaderStageToMask() instead of bit manipulation.
  • Simplifying setRegister().
  • Refactoring of finalizePipeline().
  • Clang formatting fixes.
  • Rebase, where I dropped 8fda43c in favour of "Fix ELF symbol type for symbols added to linker output sections" (#1708).
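
As an illustration of the "helper instead of bit manipulation" item in the list above, here is a minimal sketch. The enum `Stage` and the helpers `stageToMask` and `hasStage` are hypothetical and written only for this note; the real stage enum and `shaderStageToMask()` in the code base may have different names, values, and signatures.

```cpp
// Illustrative enum; the real shader stage enum may differ.
enum class Stage : unsigned { Vertex, TessControl, TessEval, Geometry, Fragment };

// One named place for the shift, instead of repeating (1 << stage) at call sites.
constexpr unsigned stageToMask(Stage stage) {
  return 1u << static_cast<unsigned>(stage);
}

// Query helper built on top of the mask conversion.
constexpr bool hasStage(unsigned mask, Stage stage) {
  return (mask & stageToMask(stage)) != 0;
}

// Mask covering the pre-rasterization stages (VS, TCS, TES, GS).
constexpr unsigned PreRasterizationMask = stageToMask(Stage::Vertex) | stageToMask(Stage::TessControl) |
                                          stageToMask(Stage::TessEval) | stageToMask(Stage::Geometry);
```

Call sites then read as `hasStage(mask, Stage::Geometry)` rather than an open-coded shift and AND.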

The update does not include any particular fixes to the interface commit 688ffe8 as it needs more work/discussion.

@piotrAMD
Contributor Author

piotrAMD commented Mar 4, 2022

CTS did not start (I need another rebase to get a build error fix), but the sanitizer errors are real and they need fixing.

@amdvlk-admin
Collaborator

Test summary for commit 3c3d216

Driver commits used in build
  • CWPACK: amd-master 39f8940199e60c44d4211cf8165dfd12876316fa
  • METROHASH: amd-master 3c566dd9cda44ca7fd97659e0b53ac953f9037d2
  • PAL: dev 2483d46fa27c30502e497ea169ee53b142e9fa06
  • SPVGEN: dev 9b30a4a91ec444943b23843853dfce2c6618f8fc
  • XGL: dev 1ce25b1ed8829c27645edd646a3289e4c524c84c
  • LLVM-PROJECT: amd-gfx-gpuopen-dev 0d6cb5407b689f465d37b7f39e2de2c747ef28f1
CTS tests (Failed: 1/189750)
  • Built with version 1.3.0.0
  • Rhel 8.2, Gfx10
    • Passed: 38237/67723 (56.5%)
    • Failed: 0/67723 (0.0%)
    • Not Supported: 29486/67723 (43.5%)
    • Warnings: 0/67723 (0.0%)
  • Ubuntu 18.04, Gfx9
    • Passed: 33884/54304 (62.4%)
    • Failed: 1/54304 (0.0%)

      Failures:

      FAILURE: dEQP-VK.subgroups.clustered.graphics.subgroupclusteredmin_uvec3
      Stack trace: Script:
      Crash
      
      

    • Not Supported: 20419/54304 (37.6%)
    • Warnings: 0/54304 (0.0%)
  • Ubuntu 20.04, Gfx8
    • Passed: 39235/67723 (57.9%)
    • Failed: 0/67723 (0.0%)
    • Not Supported: 28488/67723 (42.1%)
    • Warnings: 0/67723 (0.0%)

@piotrAMD
Contributor Author

piotrAMD commented Mar 8, 2022

Rebased, fixed the sanitizer error and refactored buildGraphicsPipelineWithPartPipelines according to Steven's suggestions.

@piotrAMD
Contributor Author

Rebased.

@s-perron @kuhar Are you happy with the changes so far?

I think the only remaining item is to rework the way hashing works by letting the LGC API hash the LGC structs.

Contributor

@s-perron s-perron left a comment

LGTM

@amdvlk-admin
Collaborator

Test summary for commit cdb41ea

Driver commits used in build
  • CWPACK: amd-master 39f8940199e60c44d4211cf8165dfd12876316fa
  • METROHASH: amd-master 3c566dd9cda44ca7fd97659e0b53ac953f9037d2
  • PAL: dev 2483d46fa27c30502e497ea169ee53b142e9fa06
  • SPVGEN: dev fa39b110650bc72fd7fda54af621e69306036888
  • XGL: dev 6d693c7342bb7f18318e80c1da53061dfcd48262
  • LLVM-PROJECT: amd-gfx-gpuopen-dev 2a87cb67e0180322832e722db8a4c53a5fc4e867
CTS tests (Failed: 1/203169)
  • Built with version 1.3.0.0
  • Rhel 8.2, Gfx10
    • Passed: 38237/67723 (56.5%)
    • Failed: 0/67723 (0.0%)
    • Not Supported: 29486/67723 (43.5%)
    • Warnings: 0/67723 (0.0%)
  • Ubuntu 18.04, Gfx9
    • Passed: 38074/67723 (56.2%)
    • Failed: 1/67723 (0.0%)

      Failures:

      FAILURE: dEQP-VK.memory.pipeline_barrier.all.65536_vertex_buffer_stride_2
      Stack trace: Script:
      34:HostMemoryAccess Result differs from reference, Expected: 0x00000027, Got: 0x00000085, At offset: 48937
      
      

    • Not Supported: 29648/67723 (43.8%)
    • Warnings: 0/67723 (0.0%)
  • Ubuntu 20.04, Gfx8
    • Passed: 39235/67723 (57.9%)
    • Failed: 0/67723 (0.0%)
    • Not Supported: 28488/67723 (42.1%)
    • Warnings: 0/67723 (0.0%)

lgc/patch/PatchResourceCollect.cpp (outdated, resolved)
lgc/state/PalMetadata.cpp (outdated, resolved)
lgc/state/PalMetadata.cpp (outdated, resolved)
lgc/state/PalMetadata.cpp (outdated, resolved)
lgc/state/PalMetadata.cpp (resolved)
llpc/util/llpcUtil.cpp (outdated, resolved)
@github-actions

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_1999265763/index.html.
Configuration: release_clang_coverage.

@github-actions

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_1999265763/index.html.
Configuration: release_clang_shadercache_coverage_assertions.

@piotrAMD
Contributor Author

Rebased, addressed Jakub's comments and disabled the old caching scheme in part-pipeline mode.

@github-actions

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2031013281/index.html.
Configuration: release_clang_coverage.

kuhar previously approved these changes Mar 24, 2022
Member

@amdrexu amdrexu left a comment

Just some naming nits. LGTM.

include/vkgcDefs.h (resolved)
include/vkgcDefs.h (outdated, resolved)
lgc/include/lgc/state/PalMetadata.h (outdated, resolved)
lgc/include/lgc/state/PalMetadata.h (outdated, resolved)
lgc/include/lgc/state/PipelineState.h (outdated, resolved)
llpc/context/llpcGraphicsContext.h (outdated, resolved)
llpc/context/llpcGraphicsContext.h (outdated, resolved)
llpc/context/llpcGraphicsContext.h (outdated, resolved)
llpc/context/llpcPipelineContext.h (outdated, resolved)
llpc/context/llpcPipelineContext.h (outdated, resolved)
@JaxLinAMD
Contributor

retest this please

@amdvlk-admin
Collaborator

Test summary for commit dd24db7

Driver commits used in build
  • CWPACK: amd-master 39f8940199e60c44d4211cf8165dfd12876316fa
  • METROHASH: amd-master 3c566dd9cda44ca7fd97659e0b53ac953f9037d2
  • PAL: dev 2483d46fa27c30502e497ea169ee53b142e9fa06
  • SPVGEN: dev fa39b110650bc72fd7fda54af621e69306036888
  • XGL: dev 36c5637649089bf65d42c36b431f47c170b57276
  • LLVM-PROJECT: amd-gfx-gpuopen-dev eb9d3792bdfda0b6fb94af6bd9989cad51842403
CTS tests (Failed: 1/189214)
  • Built with version 1.3.0.0
  • Rhel 8.2, Gfx10
    • Passed: 38237/67464 (56.7%)
    • Failed: 0/67464 (0.0%)
    • Not Supported: 29227/67464 (43.3%)
    • Warnings: 0/67464 (0.0%)
  • Ubuntu 18.04, Gfx9
    • Passed: 33866/54286 (62.4%)
    • Failed: 1/54286 (0.0%)

      Failures:

      FAILURE: dEQP-VK.subgroups.clustered.graphics.subgroupclusteredadd_int8_t
      Stack trace: Script:
      Crash
      
      

    • Not Supported: 20419/54286 (37.6%)
    • Warnings: 0/54286 (0.0%)
  • Ubuntu 20.04, Gfx8
    • Passed: 39235/67464 (58.2%)
    • Failed: 0/67464 (0.0%)
    • Not Supported: 28229/67464 (41.8%)
    • Warnings: 0/67464 (0.0%)

trenouf and others added 15 commits April 1, 2022 00:30
This is the first commit in the series to implement LGC part-pipeline
compilation scheme.

The part-pipeline is guarded by the new option EnablePartPipeline, which
is off by default. It is expected that when all the commits in the series
have been pushed, the old partial-pipeline compilation scheme will go
away.

The part-pipeline scheme is as follows.

First it compiles the FS with FS-applicable pipeline state (or finds it in
the shader cache). The FS ELF's PAL metadata contains information on FS
input packing for use when compiling the pre-rasterization
part-pipeline.

Then it compiles the pre-rasterization part-pipeline (VS,TCS,TES,GS)
with pre-rasterization pipeline state and the FS input packing
information (or finds it in the shader cache).

Then it uses the LGC ELF linker to link them together.
Set usage flag for built-in passed as generic input to FS.
Do not set user data limit with empty data nodes list.
Pass ClipDistance and CullDistance array sizes in FS input mappings metadata.

The metadata fragBuiltInInputInfo is needed at all times, otherwise
the error occurs "unhandled empty msgpack node" when writing to a blob.
Pass information in the interface on whether a geometry shader is present
in the pre-rasterization stage, so that this information can be queried
during packing in the fragment shader.

Add metadata serialization.
Change PalMetadata::setRegister() to overwrite the register value,
instead of ORing.
Rework how DB_SHADER_CONTROL is set for alphaToCoverageEnable.

Fixes: ./deqp-vk --deqp-case=*alpha_to_coverage*
Move code related to gl_ViewportIndex from the config builders
to finalizePipeline. Only at this stage do we know whether
gl_ViewportIndex was used in the pre-rasterizer stages.

Fixes: ./deqp-vk --deqp-case=*viewport_index*
Fixes:
test/shaderdb/multiple_inputs/GlslTwoStages.multi-input
test/shaderdb/multiple_inputs/SpirvTwoEntryPoints.spvasm
The default wave size for a part-pipeline compilation relies on whether
subgroup size is used in the other part. In whole-pipeline mode this
is normally handled by querying the shader modes field, so we need to
make sure that the shader modes have proper information even in
part-pipeline mode.
Add tests with -enable-part-pipeline option.
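
The setRegister() change in the commit messages above (overwrite the register value instead of ORing) can be illustrated with a toy register map. `RegisterFile` and its methods are hypothetical and exist only for this sketch; they are not the real `PalMetadata` class, and the register number and values are arbitrary.

```cpp
#include <cstdint>
#include <map>

// Hypothetical register map; NOT the real lgc::PalMetadata class.
class RegisterFile {
public:
  // ORing behaviour: new bits are merged into whatever is already there,
  // so a bit that was set earlier can never be cleared by a later write.
  void orRegister(unsigned regNum, std::uint32_t value) { m_regs[regNum] |= value; }

  // Overwriting behaviour: the last write wins, so a later stage can clear bits.
  void setRegister(unsigned regNum, std::uint32_t value) { m_regs[regNum] = value; }

  // Read a register; registers that were never written read as 0.
  std::uint32_t get(unsigned regNum) const {
    auto it = m_regs.find(regNum);
    return it == m_regs.end() ? 0 : it->second;
  }

private:
  std::map<unsigned, std::uint32_t> m_regs;
};
```

With ORing, writing 0x3 and then 0x4 to the same register leaves 0x7; with overwriting, the second write leaves 0x4.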
@piotrAMD
Contributor Author

Addressed Rex's comments.

@github-actions

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_shadercache_coverage_assertions_2073959639/index.html.
Configuration: release_clang_shadercache_coverage_assertions.

@github-actions

The LLPC code coverage report is available at https://storage.googleapis.com/amdvlk-llpc-github-ci-artifacts-public/coverage_release_clang_coverage_2073959639/index.html.
Configuration: release_clang_coverage.

@amdvlk-admin
Collaborator

Test summary for commit 4113c4c

Driver commits used in build
  • CWPACK: amd-master 39f8940199e60c44d4211cf8165dfd12876316fa
  • METROHASH: amd-master 3c566dd9cda44ca7fd97659e0b53ac953f9037d2
  • PAL: dev 2483d46fa27c30502e497ea169ee53b142e9fa06
  • SPVGEN: dev fa39b110650bc72fd7fda54af621e69306036888
  • XGL: dev 36c5637649089bf65d42c36b431f47c170b57276
  • LLVM-PROJECT: amd-gfx-gpuopen-dev eb9d3792bdfda0b6fb94af6bd9989cad51842403
CTS tests (Failed: 0/201585)
  • Built with version 1.3.0.0
  • Rhel 8.2, Gfx10
    • Passed: 38010/67195 (56.6%)
    • Failed: 0/67195 (0.0%)
    • Not Supported: 29185/67195 (43.4%)
    • Warnings: 0/67195 (0.0%)
  • Ubuntu 18.04, Gfx9
    • Passed: 37848/67195 (56.3%)
    • Failed: 0/67195 (0.0%)
    • Not Supported: 29347/67195 (43.7%)
    • Warnings: 0/67195 (0.0%)
  • Ubuntu 20.04, Gfx8
    • Passed: 39069/67195 (58.1%)
    • Failed: 0/67195 (0.0%)
    • Not Supported: 28126/67195 (41.9%)
    • Warnings: 0/67195 (0.0%)
