-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] Disallow negative offset when addressing constant memory #76009
Conversation
When loading from scalar memory, the load instruction is illegal and undefined if the immediate offset is negative and is less than M0 or SGPR[offset]. Therefore disallow any negative offset for now as a workaround until this issue is fixed.
@llvm/pr-subscribers-backend-amdgpu Author: None (vangthao95) ChangesWhen loading from scalar memory, the load instruction is illegal and undefined if the immediate offset is negative and is less than M0 or SGPR[offset] according to SPG. This patch is a workaround that makes negative offset illegal for GFX9+ when addressing constant memory. It is not a full fix as we must also check if there is use of M0 or soffset present. I believe there also needs to be a fix during instruction selection regarding this issue but would like to open this to discussion as how to address this. Full diff: https://github.com/llvm/llvm-project/pull/76009.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 4f4bc45e49b43e..f4dbb852cf9487 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -1462,6 +1462,13 @@ bool SITargetLowering::isLegalAddressingMode(const DataLayout &DL,
// for S_BUFFER_* instructions).
if (!isInt<21>(AM.BaseOffs))
return false;
+
+ // FIXME: When addressing scalar memory, it is illegal and undefined for
+ // IOFFSET + (M0 or soffset) to be negative. This also need to
+ // verify that M0 or soffset is present. For now it is only rejecting
+ // all negative offsets.
+ if (AM.BaseOffs < 0)
+ return false;
} else {
// On GFX12, all offsets are signed 24-bit in bytes.
if (!isInt<24>(AM.BaseOffs))
diff --git a/llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-smem.ll b/llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-smem.ll
index f5846c3d6db737..add148c2f43064 100644
--- a/llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-smem.ll
+++ b/llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-smem.ll
@@ -279,31 +279,19 @@ end:
}
define amdgpu_cs void @test_sink_smem_offset_neg400(ptr addrspace(4) inreg %ptr, i32 inreg %val) {
-; GFX678-LABEL: test_sink_smem_offset_neg400:
-; GFX678: ; %bb.0: ; %entry
-; GFX678-NEXT: s_add_u32 s0, s0, 0xfffffe70
-; GFX678-NEXT: s_addc_u32 s1, s1, -1
-; GFX678-NEXT: .LBB5_1: ; %loop
-; GFX678-NEXT: ; =>This Inner Loop Header: Depth=1
-; GFX678-NEXT: s_waitcnt lgkmcnt(0)
-; GFX678-NEXT: s_load_dword s3, s[0:1], 0x0
-; GFX678-NEXT: s_add_i32 s2, s2, -1
-; GFX678-NEXT: s_cmp_lg_u32 s2, 0
-; GFX678-NEXT: s_cbranch_scc1 .LBB5_1
-; GFX678-NEXT: ; %bb.2: ; %end
-; GFX678-NEXT: s_endpgm
-;
-; GFX9-LABEL: test_sink_smem_offset_neg400:
-; GFX9: ; %bb.0: ; %entry
-; GFX9-NEXT: .LBB5_1: ; %loop
-; GFX9-NEXT: ; =>This Inner Loop Header: Depth=1
-; GFX9-NEXT: s_waitcnt lgkmcnt(0)
-; GFX9-NEXT: s_load_dword s3, s[0:1], -0x190
-; GFX9-NEXT: s_add_i32 s2, s2, -1
-; GFX9-NEXT: s_cmp_lg_u32 s2, 0
-; GFX9-NEXT: s_cbranch_scc1 .LBB5_1
-; GFX9-NEXT: ; %bb.2: ; %end
-; GFX9-NEXT: s_endpgm
+; GFX6789-LABEL: test_sink_smem_offset_neg400:
+; GFX6789: ; %bb.0: ; %entry
+; GFX6789-NEXT: s_add_u32 s0, s0, 0xfffffe70
+; GFX6789-NEXT: s_addc_u32 s1, s1, -1
+; GFX6789-NEXT: .LBB5_1: ; %loop
+; GFX6789-NEXT: ; =>This Inner Loop Header: Depth=1
+; GFX6789-NEXT: s_waitcnt lgkmcnt(0)
+; GFX6789-NEXT: s_load_dword s3, s[0:1], 0x0
+; GFX6789-NEXT: s_add_i32 s2, s2, -1
+; GFX6789-NEXT: s_cmp_lg_u32 s2, 0
+; GFX6789-NEXT: s_cbranch_scc1 .LBB5_1
+; GFX6789-NEXT: ; %bb.2: ; %end
+; GFX6789-NEXT: s_endpgm
;
; GFX12-LABEL: test_sink_smem_offset_neg400:
; GFX12: ; %bb.0: ; %entry
@@ -331,3 +319,6 @@ loop:
end:
ret void
}
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; GFX678: {{.*}}
+; GFX9: {{.*}}
|
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing this does nothing to address the correctness issue. All treatment of constant address space addressing modes here is a very rough heuristic assuming constant loads will really use scalar loads. This patch should be a separate optimization patch to consider after the correctness issue is addressed.
For fixing selection, it's probably sufficient to modify getSMRDEncodedOffset. If you want to be more aggressive, you can also consider known bits for the register offset in each of the SMRD selection complex patterns
When loading from scalar memory, the load instruction is illegal and undefined if the immediate offset is negative and is less than M0 or SGPR[offset] according to SPG. This patch is a workaround that makes negative offset illegal for GFX9+ when addressing constant memory. It is not a full fix as we must also check if there is use of M0 or soffset present. I believe there also needs to be a fix during instruction selection regarding this issue but would like to open this to discussion as how to address this.