AMDGPU: Constrain readfirstlane operand when writing to m0 #168004
@llvm/pr-subscribers-backend-amdgpu
Author: Matt Arsenault (arsenm)
Changes
Fixes another verifier error after introducing AV registers.
Full diff: https://github.com/llvm/llvm-project/pull/168004.diff
3 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp b/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
index 27b9af2d3885f..6a54057e26897 100644
--- a/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
@@ -902,14 +902,28 @@ bool SIFixSGPRCopies::lowerSpecialCase(MachineInstr &MI,
// really much we can do to fix this.
// Some special instructions use M0 as an input. Some even only use
// the first lane. Insert a readfirstlane and hope for the best.
- if (DstReg == AMDGPU::M0 &&
- TRI->hasVectorRegisters(MRI->getRegClass(SrcReg))) {
+ const TargetRegisterClass *SrcRC = MRI->getRegClass(SrcReg);
+ if (DstReg == AMDGPU::M0 && TRI->hasVectorRegisters(SrcRC)) {
Register TmpReg =
MRI->createVirtualRegister(&AMDGPU::SReg_32_XM0RegClass);
- BuildMI(*MI.getParent(), MI, MI.getDebugLoc(),
- TII->get(AMDGPU::V_READFIRSTLANE_B32), TmpReg)
+
+ const MCInstrDesc &ReadFirstLaneDesc =
+ TII->get(AMDGPU::V_READFIRSTLANE_B32);
+ BuildMI(*MI.getParent(), MI, MI.getDebugLoc(), ReadFirstLaneDesc, TmpReg)
.add(MI.getOperand(1));
+
+ unsigned SubReg = MI.getOperand(1).getSubReg();
MI.getOperand(1).setReg(TmpReg);
+ MI.getOperand(1).setSubReg(AMDGPU::NoSubRegister);
+
+ const TargetRegisterClass *OpRC = TII->getRegClass(ReadFirstLaneDesc, 1);
+ const TargetRegisterClass *ConstrainRC =
+ SubReg == AMDGPU::NoSubRegister
+ ? OpRC
+ : TRI->getMatchingSuperRegClass(SrcRC, OpRC, SubReg);
+
+ if (!MRI->constrainRegClass(SrcReg, ConstrainRC))
+ llvm_unreachable("failed to constrain register");
} else if (tryMoveVGPRConstToSGPR(MI.getOperand(1), DstReg, MI.getParent(),
MI, MI.getDebugLoc())) {
I = std::next(I);
diff --git a/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll b/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
index b05b89fe503f2..116f46df01049 100644
--- a/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
+++ b/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
@@ -49,4 +49,19 @@ bb16: ; preds = %bb16, %bb
br label %bb16
}
-
+define void @av_class_to_m0(ptr addrspace(1) %ptr) {
+; CHECK-LABEL: av_class_to_m0:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: global_load_dword v0, v[0:1], off
+; CHECK-NEXT: s_waitcnt vmcnt(0)
+; CHECK-NEXT: v_readfirstlane_b32 s4, v0
+; CHECK-NEXT: s_mov_b32 m0, s4
+; CHECK-NEXT: ;;#ASMSTART
+; CHECK-NEXT: ; use m0
+; CHECK-NEXT: ;;#ASMEND
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+ %load = load i32, ptr addrspace(1) %ptr
+ call void asm sideeffect "; use $0", "{m0}"(i32 %load)
+ ret void
+}
diff --git a/llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir b/llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
index ac4f41282ab73..03e3ff95bbad2 100644
--- a/llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
+++ b/llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
@@ -90,3 +90,22 @@ body: |
S_ENDPGM 0
...
+---
+name: constrain_readfirstlane_av64_subreg_m0
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; CHECK-LABEL: name: constrain_readfirstlane_av64_subreg_m0
+ ; CHECK: liveins: $vgpr0_vgpr1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[DEF:%[0-9]+]]:sreg_32 = IMPLICIT_DEF
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
+ ; CHECK-NEXT: [[V_READFIRSTLANE_B32_:%[0-9]+]]:sreg_32_xm0 = V_READFIRSTLANE_B32 [[COPY]].sub0, implicit $exec
+ ; CHECK-NEXT: $m0 = COPY [[V_READFIRSTLANE_B32_]]
+ %0:sreg_32 = IMPLICIT_DEF
+ %1:av_64 = COPY $vgpr0_vgpr1
+ $m0 = COPY %1.sub0
+...
+
jayfoad left a comment:
Same as #168001 - LGTM but I would really hope we could stop proliferating the subreg handling code.
Fixes another verifier error after introducing AV registers. Also fixes not clearing the subregister index if there was one.
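The class selection in the patch can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyRegClass`, `commonSubClass`, `matchingSuperRegClass`, and `constrainForReadFirstLane` are hypothetical names invented for illustration and are not LLVM APIs. A register class is reduced to a width in dwords plus a bitmask of banks (VGPR, AGPR), and the `V_READFIRSTLANE_B32` source operand is assumed to accept only a 32-bit VGPR class.

```cpp
#include <cassert>
#include <optional>

// Toy model only; none of these names are LLVM APIs.
constexpr unsigned BankVGPR = 1u;
constexpr unsigned BankAGPR = 2u;

struct ToyRegClass {
  unsigned Dwords; // register width in 32-bit units
  unsigned Banks;  // bitmask of banks the class may allocate from
};

inline bool operator==(const ToyRegClass &A, const ToyRegClass &B) {
  return A.Dwords == B.Dwords && A.Banks == B.Banks;
}

// Rough analogue of constraining a class: same width, intersected banks.
inline std::optional<ToyRegClass> commonSubClass(const ToyRegClass &A,
                                                 const ToyRegClass &B) {
  if (A.Dwords != B.Dwords)
    return std::nullopt;
  unsigned Banks = A.Banks & B.Banks;
  if (Banks == 0)
    return std::nullopt;
  return ToyRegClass{A.Dwords, Banks};
}

// Rough analogue of a matching super-register class lookup: the part of
// Super whose 32-bit subregisters all belong to Sub.
inline std::optional<ToyRegClass> matchingSuperRegClass(const ToyRegClass &Super,
                                                        const ToyRegClass &Sub) {
  if (Sub.Dwords >= Super.Dwords)
    return std::nullopt; // a subregister must be narrower than its parent
  unsigned Banks = Super.Banks & Sub.Banks;
  if (Banks == 0)
    return std::nullopt;
  return ToyRegClass{Super.Dwords, Banks};
}

// Mirrors the shape of the patch: use the operand class directly when there
// is no subregister index, otherwise lift it to the source register's width.
inline std::optional<ToyRegClass>
constrainForReadFirstLane(const ToyRegClass &SrcRC, bool HasSubReg) {
  const ToyRegClass OpRC{1, BankVGPR}; // assumed operand constraint
  std::optional<ToyRegClass> Want =
      HasSubReg ? matchingSuperRegClass(SrcRC, OpRC)
                : std::optional<ToyRegClass>(OpRC);
  if (!Want)
    return std::nullopt;
  return commonSubClass(SrcRC, *Want);
}
```

In this model, a 64-bit AV register read through a subregister is narrowed to a 64-bit VGPR-only class, which corresponds to `%1:av_64` becoming `vreg_64` in the MIR test's CHECK lines. An AGPR-only source has no valid constraint, which is the case the patch treats as unreachable.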
557be13 to ab51862
LLVM Buildbot has detected a new failure on builder
Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/20985
Here is the relevant piece of the build log for the reference