[AMDGPU] Use alias scope to relax waitcounts for LDS DMA #75974

Closed
rampitec wants to merge 18 commits


rampitec
Collaborator

LDS DMA loads increment VMCNT, and a load from the LDS location they
store to must wait on this counter so that it reads memory only after
it is written. The wait count insertion pass does not track memory
dependencies; it tracks register dependencies. To model the LDS
dependency, a pseudo register is used in the scoreboard, as if the
LDS DMA writes it and the LDS load reads it.

This patch adds 8 more pseudo registers to use for independent LDS
locations if we can prove they are disjoint using alias scope info.

Fixes: SWDEV-433427
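
The scheme described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the code in this patch: `ToyLdsDmaScoreboard`, its members, and the use of plain integers as alias scope IDs are all hypothetical, and the fallback for untagged writes is a conservative choice made for the sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <optional>

// Hypothetical toy model; none of these names come from the actual pass.
constexpr int kExtraSlots = 8; // the "8 more pseudo registers"

struct ToyLdsDmaScoreboard {
  unsigned issued = 0;         // VMCNT events issued so far
  unsigned genericScore = 0;   // generic pseudo register: last LDS DMA of any kind
  unsigned untrackedScore = 0; // last LDS DMA that did not get its own slot
  std::array<int, kExtraSlots> slotScope{};      // assumed integer scope ID per slot
  std::array<unsigned, kExtraSlots> slotScore{}; // last DMA score per slot
  int usedSlots = 0;

  // Record an LDS DMA write; `scope` is empty when no alias scope is known.
  void recordLdsDma(std::optional<int> scope) {
    ++issued;
    genericScore = issued; // the generic slot sees every LDS DMA
    if (scope) {
      for (int i = 0; i < usedSlots; ++i) {
        if (slotScope[i] == *scope) {
          slotScore[i] = issued;
          return;
        }
      }
      if (usedSlots < kExtraSlots) {
        slotScope[usedSlots] = *scope;
        slotScore[usedSlots] = issued;
        ++usedSlots;
        return;
      }
    }
    // No scope, or no free slot: loads of any scope must respect this write.
    untrackedScore = issued;
  }

  // Largest number of outstanding VMCNT events an LDS load may tolerate.
  unsigned vmcntForLdsLoad(std::optional<int> scope) const {
    unsigned score = genericScore; // unknown scope: wait for every LDS DMA
    if (scope) {
      score = untrackedScore;
      for (int i = 0; i < usedSlots; ++i)
        if (slotScope[i] == *scope)
          score = std::max(score, slotScore[i]);
    }
    return issued - score;
  }
};
```

In this model, after two LDS DMA writes tagged with scopes 1 and 2, a load tagged with scope 1 only needs the first write to complete, so it can leave one event outstanding, while a load with no scope information still waits for both.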

LDS DMA loads increment VMCNT, and a load from the LDS location they
store to must wait on this counter so that it reads memory only after
it is written. The wait count insertion pass does not track memory
dependencies; it tracks register dependencies. To model the LDS
dependency, a pseudo register is used in the scoreboard, as if the
LDS DMA writes it and the LDS load reads it.

This patch adds 8 more pseudo registers to use for independent LDS
locations if we can prove they are disjoint using alias analysis.

Fixes: SWDEV-433427
Changed getVmemWaitEventType() to use mayWriteLDSThroughDMA instead
of isLDSDMA as this is more sound and added a comment.

github-actions bot commented Dec 19, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@rampitec
Collaborator Author

One thing to note: I create this alias.scope myself in the module LDS lowering, so I know exactly what to expect. Because of the module LDS lowering, even if an alias scope had been created earlier (which never happens, much less for an intrinsic call), it would already be lost, along with the memory objects deleted by the lowering. That is the whole point of creating alias.scope metadata during the lowering: we put all module LDS into a single structure, so no AA will ever disambiguate it without alias scope info. In this situation I am the sole creator of the metadata, of the instructions carrying it, and of the memory object accessed, and I am also the sole consumer of this metadata.

At -O0 there will be no LDS lowering, but there will be no AA either. I do not see how this could be exploited in practice.

One other thing to note here: !noalias metadata is also generated in the very same place. I do not care about it because I am really searching for a store into this memory, which is what a scope identifies.

When I was writing the code to generate this metadata, I had exactly this kind of scenario in mind.

@rampitec
Collaborator Author

This is the place I am creating it: https://reviews.llvm.org/D108315

@rampitec
Collaborator Author

rampitec commented Jan 2, 2024

Ping

@rampitec rampitec closed this Jan 18, 2024
@rampitec rampitec deleted the lds-dma-wait-scope-only branch July 16, 2024 08:05