@arsenm arsenm commented Nov 20, 2025

Global loads marked invariant should be treated identically to
constant address space loads.

llvmbot commented Nov 20, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Global loads marked invariant should be treated identically to
constant address space loads.


Full diff: https://github.com/llvm/llvm-project/pull/168914.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/load-global-invariant.ll (+15-8)
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 875278a3b4f97..c681d12ba7499 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -11944,7 +11944,7 @@ SDValue SITargetLowering::LowerLOAD(SDValue Op, SelectionDAG &DAG) const {
       AS == AMDGPUAS::CONSTANT_ADDRESS_32BIT ||
       (AS == AMDGPUAS::GLOBAL_ADDRESS &&
        Subtarget->getScalarizeGlobalBehavior() && Load->isSimple() &&
-       isMemOpHasNoClobberedMemOperand(Load))) {
+       (Load->isInvariant() || isMemOpHasNoClobberedMemOperand(Load)))) {
     if ((!Op->isDivergent() || AMDGPU::isUniformMMO(MMO)) &&
         Alignment >= Align(4) && NumElements < 32) {
       if (MemVT.isPow2VectorType() ||
diff --git a/llvm/test/CodeGen/AMDGPU/load-global-invariant.ll b/llvm/test/CodeGen/AMDGPU/load-global-invariant.ll
index b881edde0f448..6cdadc5bab5fb 100644
--- a/llvm/test/CodeGen/AMDGPU/load-global-invariant.ll
+++ b/llvm/test/CodeGen/AMDGPU/load-global-invariant.ll
@@ -50,15 +50,22 @@ define amdgpu_kernel void @load_global_v3i64(ptr addrspace(1) %dst, ptr addrspac
 define amdgpu_kernel void @load_global_v3i64_invariant(ptr addrspace(1) %dst, ptr addrspace(1) %src) #0 {
 ; CHECK-LABEL: load_global_v3i64_invariant:
 ; CHECK:       ; %bb.0:
-; CHECK-NEXT:    v_mov_b32_e32 v6, 0
-; CHECK-NEXT:    s_load_dwordx2 s[0:1], s[8:9], 0x0
-; CHECK-NEXT:    s_load_dwordx2 s[2:3], s[8:9], 0x8
+; CHECK-NEXT:    v_mov_b32_e32 v4, 0
+; CHECK-NEXT:    s_load_dwordx2 s[4:5], s[8:9], 0x0
+; CHECK-NEXT:    s_load_dwordx2 s[6:7], s[8:9], 0x8
 ; CHECK-NEXT:    s_waitcnt lgkmcnt(0)
-; CHECK-NEXT:    global_load_dwordx4 v[0:3], v6, s[2:3]
-; CHECK-NEXT:    global_load_dwordx2 v[4:5], v6, s[2:3] offset:16
-; CHECK-NEXT:    s_waitcnt vmcnt(0)
-; CHECK-NEXT:    global_store_dwordx2 v6, v[4:5], s[0:1] offset:16
-; CHECK-NEXT:    global_store_dwordx4 v6, v[0:3], s[0:1]
+; CHECK-NEXT:    s_load_dwordx4 s[0:3], s[6:7], 0x0
+; CHECK-NEXT:    s_nop 0
+; CHECK-NEXT:    s_load_dwordx2 s[6:7], s[6:7], 0x10
+; CHECK-NEXT:    s_waitcnt lgkmcnt(0)
+; CHECK-NEXT:    v_mov_b32_e32 v0, s6
+; CHECK-NEXT:    v_mov_b32_e32 v1, s7
+; CHECK-NEXT:    global_store_dwordx2 v4, v[0:1], s[4:5] offset:16
+; CHECK-NEXT:    v_mov_b32_e32 v0, s0
+; CHECK-NEXT:    v_mov_b32_e32 v1, s1
+; CHECK-NEXT:    v_mov_b32_e32 v2, s2
+; CHECK-NEXT:    v_mov_b32_e32 v3, s3
+; CHECK-NEXT:    global_store_dwordx4 v4, v[0:3], s[4:5]
 ; CHECK-NEXT:    s_endpgm
   %ld = load <3 x i64>, ptr addrspace(1) %src, align 32, !invariant.load !0
   store <3 x i64> %ld, ptr addrspace(1) %dst, align 32
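
The one-line change in the `SIISelLowering.cpp` hunk can be read as a widening of a boolean predicate. The following is a minimal standalone sketch of that predicate, not the LLVM implementation: `AddrSpace`, `LoadInfo`, `qualifiesBefore`, and `qualifiesAfter` are hypothetical names introduced here, and the `Constant` case is an assumption, since the hunk's context only shows the `CONSTANT_ADDRESS_32BIT` comparison. The later checks in the hunk (uniformity, `Alignment >= Align(4)`, `NumElements < 32`) are not modeled.

```cpp
// Hypothetical stand-ins for the LLVM types; this is not the real API.
enum class AddrSpace { Global, Constant, Constant32Bit, Other };

struct LoadInfo {
  AddrSpace AS;
  bool IsSimple;    // models Load->isSimple()
  bool IsInvariant; // models Load->isInvariant()
  bool NoClobber;   // models isMemOpHasNoClobberedMemOperand(Load)
};

// Address-space condition as it reads before the patch.
// ScalarizeGlobal models Subtarget->getScalarizeGlobalBehavior().
bool qualifiesBefore(const LoadInfo &L, bool ScalarizeGlobal) {
  return L.AS == AddrSpace::Constant ||      // assumed, not shown in the hunk
         L.AS == AddrSpace::Constant32Bit ||
         (L.AS == AddrSpace::Global && ScalarizeGlobal && L.IsSimple &&
          L.NoClobber);
}

// Address-space condition after the patch: an invariant global load is
// accepted even when no "no clobber" information is available.
bool qualifiesAfter(const LoadInfo &L, bool ScalarizeGlobal) {
  return L.AS == AddrSpace::Constant ||      // assumed, not shown in the hunk
         L.AS == AddrSpace::Constant32Bit ||
         (L.AS == AddrSpace::Global && ScalarizeGlobal && L.IsSimple &&
          (L.IsInvariant || L.NoClobber));
}
```

In this model, a simple, invariant global load without no-clobber information is rejected by `qualifiesBefore` and accepted by `qualifiesAfter`, which corresponds to the `load_global_v3i64_invariant` test switching from `global_load_*` to `s_load_*` instructions. A load that is not simple is rejected by both forms.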

github-actions bot commented Nov 20, 2025

🐧 Linux x64 Test Results

  • 186432 tests passed
  • 4864 tests skipped

@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-invariant-load-split branch from a406bd2 to 954dc93 Compare November 20, 2025 18:59
@arsenm arsenm force-pushed the users/arsenm/amdgpu/add-baseline-test-invariant-vector-load branch from dd3f339 to 1c8ddb0 Compare November 20, 2025 18:59
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-invariant-load-split branch from 954dc93 to 4be9e5b Compare November 21, 2025 00:58
@arsenm arsenm force-pushed the users/arsenm/amdgpu/add-baseline-test-invariant-vector-load branch from 1c8ddb0 to a8b806c Compare November 21, 2025 00:58
       (AS == AMDGPUAS::GLOBAL_ADDRESS &&
        Subtarget->getScalarizeGlobalBehavior() && Load->isSimple() &&
-       isMemOpHasNoClobberedMemOperand(Load))) {
+       (Load->isInvariant() || isMemOpHasNoClobberedMemOperand(Load)))) {
Contributor
And there is no test change with this?

Contributor Author
No. There was one in the test from #168913, but it turned out to be restoring a regression from one of the later patches.

Base automatically changed from users/arsenm/amdgpu/add-baseline-test-invariant-vector-load to main November 21, 2025 18:53
@arsenm arsenm merged commit f8bbb21 into main Nov 21, 2025
16 of 17 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/handle-invariant-load-split branch November 21, 2025 18:58