-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] Produce better memoperand for LDS DMA #75247
Conversation
1) It was marked as volatile. This is not needed and the only reason it was done is because it is both load and store and handled together with atomics. Global load to LDS was marked as volatile just because buffer load was done that way. 2) Preserve at least LDS (store) pointer which we always have with the intrinsics. 3) Use PoisonValue instead of nullptr for load memop as a Value.
@llvm/pr-subscribers-backend-amdgpu Author: Stanislav Mekhanoshin (rampitec) Changes
Full diff: https://github.com/llvm/llvm-project/pull/75247.diff 1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 85cc3cfec19cd..2549e41ee527d 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -1128,11 +1128,10 @@ bool SITargetLowering::getTgtMemIntrinsic(IntrinsicInfo &Info,
MachineMemOperand::MOStore |
MachineMemOperand::MODereferenceable;
- // XXX - Should this be volatile without known ordering?
- Info.flags |= MachineMemOperand::MOVolatile;
-
switch (IntrID) {
default:
+ // XXX - Should this be volatile without known ordering?
+ Info.flags |= MachineMemOperand::MOVolatile;
break;
case Intrinsic::amdgcn_raw_buffer_load_lds:
case Intrinsic::amdgcn_raw_ptr_buffer_load_lds:
@@ -1140,6 +1139,7 @@ bool SITargetLowering::getTgtMemIntrinsic(IntrinsicInfo &Info,
case Intrinsic::amdgcn_struct_ptr_buffer_load_lds: {
unsigned Width = cast<ConstantInt>(CI.getArgOperand(2))->getZExtValue();
Info.memVT = EVT::getIntegerVT(CI.getContext(), Width * 8);
+ Info.ptrVal = CI.getArgOperand(1);
return true;
}
}
@@ -1268,8 +1268,8 @@ bool SITargetLowering::getTgtMemIntrinsic(IntrinsicInfo &Info,
Info.opc = ISD::INTRINSIC_VOID;
unsigned Width = cast<ConstantInt>(CI.getArgOperand(2))->getZExtValue();
Info.memVT = EVT::getIntegerVT(CI.getContext(), Width * 8);
- Info.flags |= MachineMemOperand::MOLoad | MachineMemOperand::MOStore |
- MachineMemOperand::MOVolatile;
+ Info.ptrVal = CI.getArgOperand(1);
+ Info.flags |= MachineMemOperand::MOLoad | MachineMemOperand::MOStore;
return true;
}
case Intrinsic::amdgcn_ds_bvh_stack_rtn: {
@@ -9095,7 +9095,9 @@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op,
MachinePointerInfo LoadPtrI = LoadMMO->getPointerInfo();
MachinePointerInfo StorePtrI = LoadPtrI;
- StorePtrI.V = nullptr;
+ LoadPtrI.V = PoisonValue::get(
+ PointerType::get(*DAG.getContext(), AMDGPUAS::GLOBAL_ADDRESS));
+ LoadPtrI.AddrSpace = AMDGPUAS::GLOBAL_ADDRESS;
StorePtrI.AddrSpace = AMDGPUAS::LOCAL_ADDRESS;
auto F = LoadMMO->getFlags() &
@@ -9173,6 +9175,8 @@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op,
MachinePointerInfo LoadPtrI = LoadMMO->getPointerInfo();
LoadPtrI.Offset = Op->getConstantOperandVal(5);
MachinePointerInfo StorePtrI = LoadPtrI;
+ LoadPtrI.V = PoisonValue::get(
+ PointerType::get(*DAG.getContext(), AMDGPUAS::GLOBAL_ADDRESS));
LoadPtrI.AddrSpace = AMDGPUAS::GLOBAL_ADDRESS;
StorePtrI.AddrSpace = AMDGPUAS::LOCAL_ADDRESS;
auto F = LoadMMO->getFlags() &
|
SIInstrInfo::areMemAccessesTriviallyDisjoint does a DS offset checks, but does not account for LDS DMA instructions. Added these checks. Without it code falls through and returns true which is wrong. As a result mayAlias would always return false for LDS DMA and a regular LDS instruction or 2 LDS DMA instructions. At the moment this is NFCI because we do not use this AA in a context which may touch LDS DMA instructions. This is also unreacheable now because of the ordered memory ref checks just above in the function and LDS DMA is marked as volatile. This volatile marking is removed in PR llvm#75247, therefore I'd submit this check before llvm#75247.
SIInstrInfo::areMemAccessesTriviallyDisjoint does a DS offset checks, but does not account for LDS DMA instructions. Added these checks. Without it code falls through and returns true which is wrong. As a result mayAlias would always return false for LDS DMA and a regular LDS instruction or 2 LDS DMA instructions. At the moment this is NFCI because we do not use this AA in a context which may touch LDS DMA instructions. This is also unreacheable now because of the ordered memory ref checks just above in the function and LDS DMA is marked as volatile. This volatile marking is removed in PR #75247, therefore I'd submit this check before #75247.
What is the effect of that? I thought nullptr was supposed to represent an unknown value, so you have to conservatively assume it might alias with anything. The effect is that MI.mayAlias does not bail before checking AA. It needs at least something as a Value, and that something must be not a PseudoValue. There is alias.scope info attached specifically to work after lowering, when no Values exist, but even to get there you need to have a Value. This is just a general distrust of the llvm in the AA info past lowering. And no, this is exactly the point: it may not alias with anything. It has alias.scope info, created just for that. |
1) It was marked as volatile. This is not needed and the only reason it was done is because it is both load and store and handled together with atomics. Global load to LDS was marked as volatile just because buffer load was done that way. 2) Preserve at least LDS (store) pointer which we always have with the intrinsics. 3) Use PoisonValue instead of nullptr for load memop as a Value. Change-Id: I3f937cff3280a411653712357a144d5ce819fea6
it was done is because it is both load and store and handled
together with atomics. Global load to LDS was marked as volatile
just because buffer load was done that way.
the intrinsics.