@carlobertolli
Member

A copy from SHARED_BASE to VCC is observed when exhausting scalar registers in kernels with high register usage.

@llvmbot
Member

llvmbot commented Oct 13, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: None (carlobertolli)

Changes

This is observed when exhausting scalar registers in kernels with high register usage.


Full diff: https://github.com/llvm/llvm-project/pull/163244.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+2-1)
  • (modified) llvm/test/CodeGen/AMDGPU/sgpr-phys-copy.mir (+9)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index ec5c5bb349ac4..86a60e5a6242b 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -899,7 +899,8 @@ void SIInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
     }
 
     if (DestReg == AMDGPU::VCC) {
-      if (AMDGPU::SReg_64RegClass.contains(SrcReg)) {
+      if (AMDGPU::SReg_64RegClass.contains(SrcReg) ||
+      AMDGPU::SReg_64_EncodableRegClass.contains(SrcReg)) {
         BuildMI(MBB, MI, DL, get(AMDGPU::S_MOV_B64), AMDGPU::VCC)
           .addReg(SrcReg, getKillRegState(KillSrc));
       } else {
diff --git a/llvm/test/CodeGen/AMDGPU/sgpr-phys-copy.mir b/llvm/test/CodeGen/AMDGPU/sgpr-phys-copy.mir
index 9553fcc1c51c8..f11fe4aa6e00e 100644
--- a/llvm/test/CodeGen/AMDGPU/sgpr-phys-copy.mir
+++ b/llvm/test/CodeGen/AMDGPU/sgpr-phys-copy.mir
@@ -58,6 +58,15 @@ body:             |
     $sgpr0_sgpr1 = COPY $src_shared_base
 ...
 
+---
+name: src_shared_base_to_vcc
+body:             |
+  bb.0:
+    ; GFX9-LABEL: name: src_shared_base_to_vcc
+    ; GFX9: $vcc = S_MOV_B64 $src_shared_base
+    $vcc = COPY $src_shared_base
+...
+
 ---
 name: sgpr96_aligned_src_dst
 body:             |
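
To make the effect of the extra `contains` check easier to see, here is a minimal standalone sketch. It is not LLVM code: the `Reg` enum, the `RegClass` struct, the register membership lists, and the `copyToVCC` helper are all hypothetical stand-ins, and the sketch assumes (as this PR implies) that `SRC_SHARED_BASE` belongs to `SReg_64_Encodable` but not to `SReg_64`.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-ins for physical registers; NOT the real AMDGPU register enum.
enum class Reg { SGPR0_SGPR1, VCC, SRC_SHARED_BASE };

// Toy register class: a set of registers with a contains() query,
// loosely modeled on how a register class answers membership questions.
struct RegClass {
  std::set<Reg> Members;
  bool contains(Reg R) const { return Members.count(R) != 0; }
};

// ASSUMED membership, for illustration only.
const RegClass SReg_64{{Reg::SGPR0_SGPR1, Reg::VCC}};
const RegClass SReg_64_Encodable{
    {Reg::SGPR0_SGPR1, Reg::VCC, Reg::SRC_SHARED_BASE}};

// Hypothetical helper with the same shape as the `DestReg == VCC` branch.
// Returns the opcode name chosen for the copy, or an empty string when the
// source register is rejected by the scalar check (the real else branch is
// not shown in the diff and is not modeled here).
std::string copyToVCC(Reg Src, bool AcceptEncodable) {
  bool IsScalar64 =
      SReg_64.contains(Src) ||
      (AcceptEncodable && SReg_64_Encodable.contains(Src));
  return IsScalar64 ? "S_MOV_B64" : "";
}
```

Under these assumptions, `copyToVCC(Reg::SRC_SHARED_BASE, false)` models the old check and rejects the copy, while `copyToVCC(Reg::SRC_SHARED_BASE, true)` models the patched check and selects `S_MOV_B64`, matching the new `src_shared_base_to_vcc` test.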

@github-actions

github-actions bot commented Oct 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

SReg_64 is part of SReg_64_EncodableRegClass.

Co-authored-by: Stanislav Mekhanoshin <rampitec.github@runbox.com>
@carlobertolli carlobertolli force-pushed the enable_saving_shareb_base_to_vcc.trunk branch from cb2fc37 to 06ca679 Compare October 13, 2025 19:25
Collaborator

@rampitec rampitec left a comment

LGTM

@shiltian shiltian changed the title Enable saving SHARED_BASE to VCC. [AMDGPU] Enable saving SHARED_BASE to VCC Oct 13, 2025
  bb.0:
    ; GFX9-LABEL: name: src_shared_base_to_vcc
    ; GFX9: $vcc = S_MOV_B64 $src_shared_base
    $vcc = COPY $src_shared_base
Contributor

Also test the inverse case?

Contributor

ping. We do not structurally prevent writing to constant registers and it is encodable

Member Author

@carlobertolli carlobertolli Oct 19, 2025

@arsenm I added a test that copies vcc to src_shared_base, but it fails:

name: vcc_to_src_shared_base
body:             |
  bb.0:
    ; GFX9-LABEL: name: vcc_to_src_shared_base
    ; GFX9: $src_shared_base = S_MOV_B64 $vcc
    $src_shared_base = COPY $vcc
...

I'll have to fix the copy to enable this, unless I misunderstood you?

Member Author

like this?
#164138

@carlobertolli carlobertolli merged commit 8892825 into llvm:main Oct 13, 2025
10 checks passed
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025