Contributor

@arsenm arsenm commented Nov 19, 2025

Doesn't touch the GlobalISel version because the handling
there looks a bit broken.

Member

llvmbot commented Nov 19, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Doesn't touch the GlobalISel version because the handling
there looks a bit broken.


Full diff: https://github.com/llvm/llvm-project/pull/168787.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+2-1)
  • (modified) llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll (+14-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index 6a0a9e3d3e5ac..6c36f8ad9b6a9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -4437,7 +4437,8 @@ bool AMDGPUDAGToDAGISel::isUniformLoad(const SDNode *N) const {
          Ld->getAlign() >=
              Align(std::min(MMO->getSize().getValue().getKnownMinValue(),
                             uint64_t(4))) &&
-         ((Ld->getAddressSpace() == AMDGPUAS::CONSTANT_ADDRESS ||
+         (MMO->isInvariant() ||
+          (Ld->getAddressSpace() == AMDGPUAS::CONSTANT_ADDRESS ||
            Ld->getAddressSpace() == AMDGPUAS::CONSTANT_ADDRESS_32BIT) ||
           (Subtarget->getScalarizeGlobalBehavior() &&
            Ld->getAddressSpace() == AMDGPUAS::GLOBAL_ADDRESS &&
diff --git a/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll b/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
index 6815050d0a441..23970d454526c 100644
--- a/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
+++ b/llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
@@ -10,7 +10,7 @@
 ; GCN-DAG: buffer_load_dwordx2 [[PTR:v\[[0-9]+:[0-9]+\]]],
 ; GCN-DAG: v_mov_b32_e32 [[K:v[0-9]+]], 0x1c8007b
 ; GCN: buffer_store_dword [[K]], [[PTR]]
-define amdgpu_kernel void @test_merge_store_constant_i16_invariant_global_pointer_load(ptr addrspace(1) dereferenceable(4096) nonnull %in) #0 {
+define void @test_merge_store_constant_i16_invariant_global_pointer_load(ptr addrspace(1) dereferenceable(4096) nonnull %in) #0 {
   %ptr = load ptr addrspace(1), ptr addrspace(1) %in, !invariant.load !0
   %ptr.1 = getelementptr i16, ptr addrspace(1) %ptr, i64 1
   store i16 123, ptr addrspace(1) %ptr, align 4
@@ -30,6 +30,19 @@ define amdgpu_kernel void @test_merge_store_constant_i16_invariant_constant_poin
   ret void
 }
 
+; Invariant global load should be equivalently handled to constant.
+; GCN-LABEL: {{^}}test_merge_store_global_i16_invariant_uniform_global_pointer_load:
+; GCN: s_load_dwordx2 s[[[SPTR_LO:[0-9]+]]:[[SPTR_HI:[0-9]+]]]
+; GCN: v_mov_b32_e32 [[K:v[0-9]+]], 0x1c8007b
+; GCN: buffer_store_dword [[K]], off, s[[[SPTR_LO]]:
+define amdgpu_kernel void @test_merge_store_global_i16_invariant_uniform_global_pointer_load(ptr addrspace(1) dereferenceable(4096) nonnull %in) #0 {
+  %ptr = load ptr addrspace(1), ptr addrspace(1) %in, !invariant.load !0
+  %ptr.1 = getelementptr i16, ptr addrspace(1) %ptr, i64 1
+  store i16 123, ptr addrspace(1) %ptr, align 4
+  store i16 456, ptr addrspace(1) %ptr.1
+  ret void
+}
+
 !0 = !{}
 
 attributes #0 = { nounwind }


github-actions bot commented Nov 19, 2025

🐧 Linux x64 Test Results

  • 186425 tests passed
  • 4867 tests skipped

@arsenm arsenm force-pushed the users/arsenm/amdgpu/reapply-allow-select-ptr-combine-168292 branch from 7bad49b to e2c7ac8 Compare November 20, 2025 16:11
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-invariant-load-isUniformLoad branch from 7d7cabd to cd38396 Compare November 20, 2025 16:11
Contributor

@jayfoad jayfoad left a comment

This is saying that invariant loads can be selected to SMEM instructions regardless of any "no clobber" check, right? I think that's safe...
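
For readers following the review question, here is a rough standalone model of the condition after this change. It is only a sketch: `LoadModel`, `AddrSpace`, and `selectableAsScalarLoad` are hypothetical names rather than LLVM APIs, and it covers only the part of `isUniformLoad` visible in the hunk. The tail of the global-address clause is cut off in the diff, so a single `noClobber` flag stands in for it, on the assumption that this is the "no clobber" check mentioned above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-ins for the LLVM types used in the hunk; not the real API.
enum class AddrSpace { Global, Constant, Constant32Bit, Private };

struct LoadModel {
  uint64_t alignBytes;  // models Ld->getAlign()
  uint64_t sizeBytes;   // models the known minimum size from the MMO
  AddrSpace addrSpace;  // models Ld->getAddressSpace()
  bool invariant;       // models MMO->isInvariant()
  bool noClobber;       // assumed stand-in for the elided "no clobber" check
};

// Mirrors the shape of the condition after the patch:
//   aligned && (invariant || constant address space || unclobbered global)
inline bool selectableAsScalarLoad(const LoadModel &ld, bool scalarizeGlobal) {
  // Simplification: compares raw byte counts instead of building an Align.
  const bool alignedEnough =
      ld.alignBytes >= std::min<uint64_t>(ld.sizeBytes, 4);
  const bool constantAS = ld.addrSpace == AddrSpace::Constant ||
                          ld.addrSpace == AddrSpace::Constant32Bit;
  const bool unclobberedGlobal = scalarizeGlobal &&
                                 ld.addrSpace == AddrSpace::Global &&
                                 ld.noClobber;
  // The new disjunct: an invariant load qualifies regardless of address
  // space or the no-clobber result.
  return alignedEnough && (ld.invariant || constantAS || unclobberedGlobal);
}
```

Under this model, a sufficiently aligned global load marked invariant qualifies even when `noClobber` is false. That matches the new test, which expects `s_load_dwordx2` for the `!invariant.load` pointer load.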

Base automatically changed from users/arsenm/amdgpu/reapply-allow-select-ptr-combine-168292 to main November 20, 2025 17:13
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-invariant-load-isUniformLoad branch from cd38396 to 34253e8 Compare November 20, 2025 17:18
@arsenm arsenm merged commit e79c7c1 into main Nov 20, 2025
8 of 9 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/handle-invariant-load-isUniformLoad branch November 20, 2025 18:07