AMDGPU: Handle invariant loads when considering if a load can be scalar #168787
@llvm/pr-subscribers-backend-amdgpu
Author: Matt Arsenault (arsenm)
Changes
Doesn't touch the globalisel version because the handling there looks a bit broken.
Full diff: https://github.com/llvm/llvm-project/pull/168787.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index 6a0a9e3d3e5ac..6c36f8ad9b6a9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -4437,7 +4437,8 @@ bool AMDGPUDAGToDAGISel::isUniformLoad(const SDNode *N) const {
Ld->getAlign() >=
Align(std::min(MMO->getSize().getValue().getKnownMinValue(),
uint64_t(4))) &&
- ((Ld->getAddressSpace() == AMDGPUAS::CONSTANT_ADDRESS ||
+ (MMO->isInvariant() ||
+ (Ld->getAddressSpace() == AMDGPUAS::CONSTANT_ADDRESS ||
Ld->getAddressSpace() == AMDGPUAS::CONSTANT_ADDRESS_32BIT) ||
(Subtarget->getScalarizeGlobalBehavior() &&
Ld->getAddressSpace() == AMDGPUAS::GLOBAL_ADDRESS &&
diff --git a/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll b/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
index 6815050d0a441..23970d454526c 100644
--- a/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
+++ b/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
@@ -10,7 +10,7 @@
; GCN-DAG: buffer_load_dwordx2 [[PTR:v\[[0-9]+:[0-9]+\]]],
; GCN-DAG: v_mov_b32_e32 [[K:v[0-9]+]], 0x1c8007b
; GCN: buffer_store_dword [[K]], [[PTR]]
-define amdgpu_kernel void @test_merge_store_constant_i16_invariant_global_pointer_load(ptr addrspace(1) dereferenceable(4096) nonnull %in) #0 {
+define void @test_merge_store_constant_i16_invariant_global_pointer_load(ptr addrspace(1) dereferenceable(4096) nonnull %in) #0 {
%ptr = load ptr addrspace(1), ptr addrspace(1) %in, !invariant.load !0
%ptr.1 = getelementptr i16, ptr addrspace(1) %ptr, i64 1
store i16 123, ptr addrspace(1) %ptr, align 4
@@ -30,6 +30,19 @@ define amdgpu_kernel void @test_merge_store_constant_i16_invariant_constant_poin
ret void
}
+; Invariant global load should be equivalently handled to constant.
+; GCN-LABEL: {{^}}test_merge_store_global_i16_invariant_uniform_global_pointer_load:
+; GCN: s_load_dwordx2 s[[[SPTR_LO:[0-9]+]]:[[SPTR_HI:[0-9]+]]]
+; GCN: v_mov_b32_e32 [[K:v[0-9]+]], 0x1c8007b
+; GCN: buffer_store_dword [[K]], off, s[[[SPTR_LO]]:
+define amdgpu_kernel void @test_merge_store_global_i16_invariant_uniform_global_pointer_load(ptr addrspace(1) dereferenceable(4096) nonnull %in) #0 {
+ %ptr = load ptr addrspace(1), ptr addrspace(1) %in, !invariant.load !0
+ %ptr.1 = getelementptr i16, ptr addrspace(1) %ptr, i64 1
+ store i16 123, ptr addrspace(1) %ptr, align 4
+ store i16 456, ptr addrspace(1) %ptr.1
+ ret void
+}
+
!0 = !{}
attributes #0 = { nounwind }
jayfoad
left a comment
This is saying that invariant loads can be selected to SMEM instructions regardless of any "no clobber" check, right? I think that's safe...
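For readers following the question above, here is a rough standalone model of the condition as changed by this patch. It is only a sketch: `LoadInfo`, `AddrSpace`, and `canSelectScalar` are hypothetical names, not LLVM types, and the `Uniform` and `NoClobber` fields are assumptions standing in for checks that are not visible in the quoted hunk. The point it illustrates is that the invariant flag sits alongside the constant address space check, so it bypasses the no-clobber requirement that otherwise gates global loads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are not LLVM's types.
enum class AddrSpace { Global, Constant, Constant32Bit, Other };

struct LoadInfo {
  bool Uniform;        // assumed: address is the same for every lane
  uint64_t AlignBytes; // alignment of the access
  uint64_t SizeBytes;  // size of the access
  AddrSpace AS;        // address space of the pointer operand
  bool Invariant;      // load carries !invariant.load
  bool NoClobber;      // assumed: result of a "no clobbering store" analysis
};

// Simplified model of the predicate after this patch.
constexpr bool canSelectScalar(const LoadInfo &L, bool ScalarizeGlobal) {
  if (!L.Uniform)
    return false;

  // Mirrors the alignment check in the hunk: at least min(size, 4) bytes.
  if (L.AlignBytes < std::min(L.SizeBytes, uint64_t(4)))
    return false;

  // New in this patch: an invariant load qualifies on its own.
  if (L.Invariant)
    return true;

  // Constant address spaces qualified before the patch as well.
  if (L.AS == AddrSpace::Constant || L.AS == AddrSpace::Constant32Bit)
    return true;

  // Non-invariant global loads still need the subtarget option and the
  // no-clobber check.
  return ScalarizeGlobal && L.AS == AddrSpace::Global && L.NoClobber;
}
```

Under this model, a uniform invariant global load with a possibly aliasing store nearby is accepted, while the same load without `!invariant.load` is rejected. That matches the new test, where the pointer load is expected to become `s_load_dwordx2`.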
Doesn't touch the globalisel version because the handling there looks a bit broken.