-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AMDGPU: Only allow 31bit positive offset for scratch #71895
Conversation
This just helps code safer as how the value interpreted by hardware might change.
@llvm/pr-subscribers-backend-amdgpu Author: Ruiling, Song (ruiling) ChangesThis just helps code safer as how the value interpreted by hardware might change. Full diff: https://github.com/llvm/llvm-project/pull/71895.diff 1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index cd810f0b43e50db..014026dd209da25 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -1859,7 +1859,7 @@ bool AMDGPUDAGToDAGISel::SelectScratchSVAddr(SDNode *N, SDValue Addr,
std::tie(SplitImmOffset, RemainderOffset)
= TII->splitFlatOffset(COffsetVal, AMDGPUAS::PRIVATE_ADDRESS, true);
- if (isUInt<32>(RemainderOffset)) {
+ if (RemainderOffset > 0 && isInt<32>(RemainderOffset)) {
SDNode *VMov = CurDAG->getMachineNode(
AMDGPU::V_MOV_B32_e32, SL, MVT::i32,
CurDAG->getTargetConstant(RemainderOffset, SDLoc(), MVT::i32));
|
@@ -1859,7 +1859,7 @@ bool AMDGPUDAGToDAGISel::SelectScratchSVAddr(SDNode *N, SDValue Addr, | |||
std::tie(SplitImmOffset, RemainderOffset) | |||
= TII->splitFlatOffset(COffsetVal, AMDGPUAS::PRIVATE_ADDRESS, true); | |||
|
|||
if (isUInt<32>(RemainderOffset)) { | |||
if (RemainderOffset > 0 && isInt<32>(RemainderOffset)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also needs matching GlobalISel change? Is it testable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GlobalISel does not have such logic, so no changes needed there. I am not quite sure do we really need a test for this. I guess this can only be tested with out-of-bounds access, as we cannot allocate too much scratch, right? I felt the change is not quite useful in practice. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good to get all the edge cases right, even if it's an unrealistically large allocation. We haven't sorted out how TLS should work, but it's possible we'll end up using scratch instructions to access it at weird offsets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we can write LLVM IR like: %ptr = getelementptr inbounds i8, ptr addrspace(5) %arg, i64 u0x80000000
for a positive offset with 31bit being one, it will be lowered into t4: i32 = add t2, Constant:i32<-2147483648>
as our private address is 32bit. Seems like we cannot make a test to show difference here.
I cannot think of ways to show the existing code is wrong. So it should be correct. |
This just helps code safer as how the value interpreted by hardware might change.