-
Notifications
You must be signed in to change notification settings - Fork 15.5k
[AMDGPU][InsertWaitCnts][NFC] Merge VMEM_ACCESS and VMEM_READ_ACCESS into a single event type #171973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU][InsertWaitCnts][NFC] Merge VMEM_ACCESS and VMEM_READ_ACCESS into a single event type #171973
Conversation
|
@llvm/pr-subscribers-backend-amdgpu Author: Pankaj Dwivedi (PankajDwivedi-25) ChangesThis patch merges VMEM_READ_ACCESS and VMEM_ACCESS into a single VMEM_READ_ACCESS event type, simplifying Full diff: https://github.com/llvm/llvm-project/pull/171973.diff 1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 70db7b4918515..53ad7d81e3d08 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -111,7 +111,6 @@ struct HardwareLimits {
};
#define AMDGPU_DECLARE_WAIT_EVENTS(DECL) \
- DECL(VMEM_ACCESS) /* vmem read & write */ \
DECL(VMEM_READ_ACCESS) /* vmem read */ \
DECL(VMEM_SAMPLER_READ_ACCESS) /* vmem SAMPLER read (gfx12+ only) */ \
DECL(VMEM_BVH_READ_ACCESS) /* vmem BVH read (gfx12+ only) */ \
@@ -362,8 +361,8 @@ class WaitcntGeneratorPreGFX12 : public WaitcntGenerator {
assert(ST);
static const unsigned WaitEventMaskForInstPreGFX12[NUM_INST_CNTS] = {
- eventMask({VMEM_ACCESS, VMEM_READ_ACCESS, VMEM_SAMPLER_READ_ACCESS,
- VMEM_BVH_READ_ACCESS}),
+ eventMask(
+ {VMEM_READ_ACCESS, VMEM_SAMPLER_READ_ACCESS, VMEM_BVH_READ_ACCESS}),
eventMask({SMEM_ACCESS, LDS_ACCESS, GDS_ACCESS, SQ_MESSAGE}),
eventMask({EXP_GPR_LOCK, GDS_GPR_LOCK, VMW_GPR_LOCK, EXP_PARAM_ACCESS,
EXP_POS_ACCESS, EXP_LDS_ACCESS}),
@@ -399,7 +398,7 @@ class WaitcntGeneratorGFX12Plus : public WaitcntGenerator {
assert(ST);
static const unsigned WaitEventMaskForInstGFX12Plus[NUM_INST_CNTS] = {
- eventMask({VMEM_ACCESS, VMEM_READ_ACCESS}),
+ eventMask({VMEM_READ_ACCESS}),
eventMask({LDS_ACCESS, GDS_ACCESS}),
eventMask({EXP_GPR_LOCK, GDS_GPR_LOCK, VMW_GPR_LOCK, EXP_PARAM_ACCESS,
EXP_POS_ACCESS, EXP_LDS_ACCESS}),
@@ -549,7 +548,7 @@ class SIInsertWaitcnts {
// LDS DMA loads are also stores, but on the LDS side. On the VMEM side
// these should use VM_CNT.
if (!ST->hasVscnt() || SIInstrInfo::mayWriteLDSThroughDMA(Inst))
- return VMEM_ACCESS;
+ return VMEM_READ_ACCESS;
if (Inst.mayStore() &&
(!Inst.mayLoad() || SIInstrInfo::isAtomicNoRet(Inst))) {
// FLAT and SCRATCH instructions may access scratch. Other VMEM
|
arsenm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they are going to cover read and write, this should merge to the VMEM_ACCESS name?
Agreed, and we need a comment explaining that it's used for read and write pre-GFX10 but only for read in GFX10+. |
|
Thank you, I have addressed the review. |
jayfoad
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
…into a single event type (llvm#171973)
Pre-GFX10 uses VMEM_ACCESS (reads/writes), GFX10+ uses VMEM_READ_ACCESS (reads only). Since no subtarget uses both, this patch merges them into VMEM_ACCESS.