GlobalISel: Relax verifier between physreg and typed vreg #159281
Accept a mismatch between register size and type size if the type is legal for the register class.

For AMDGPU, boolean registers have two possible interpretations depending on the type of the use context. For example, these are both equally valid:

  %0:_(s1) = COPY $vcc
  %1:_(s64) = COPY $vcc

vcc is a 64-bit register, which can be interpreted as a 1-bit or a 64-bit value depending on the use context. SelectionDAG has never required an exact match between the register size and the type of the value used; you can assign a type with a smaller size to a larger register class. Relax the verifier to match. There are several hacks holding these copies together in various places, and this is preparation for removing one of them.

The x86 test change stems from what I would consider an X86 usage bug. X86 defines an FR32 register class and an F16 register class, but the F16 register class is functionally an alias of FR32, with the same members and size. There's no need to have the F16 class.
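To make the rule above concrete, here is a minimal standalone sketch of the relaxed size check. It is a toy model, not the MachineVerifier code: `ToyRegClass`, `ToyPhysReg`, `hasClassForType`, `copyIsAccepted`, `makeToyVcc`, and `makeToyVgpr32` are hypothetical names, and the legal-type lists are assumptions chosen to mirror the copies exercised in the new AMDGPU test rather than the real register class definitions.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a register class: its size in bits plus the value sizes
// assumed to be legal for it. This is NOT LLVM's TargetRegisterClass.
struct ToyRegClass {
  unsigned SizeInBits;
  std::vector<unsigned> LegalTypeBits;
};

// Toy physical register: its natural size and the classes that contain it.
struct ToyPhysReg {
  unsigned SizeInBits;
  std::vector<ToyRegClass> Classes;
};

// Returns true if some class containing Reg lists TypeBits as a legal size.
// Loosely models a lookup that finds a register class for the given type.
bool hasClassForType(const ToyPhysReg &Reg, unsigned TypeBits) {
  for (const ToyRegClass &RC : Reg.Classes)
    for (unsigned Bits : RC.LegalTypeBits)
      if (Bits == TypeBits)
        return true;
  return false;
}

// Relaxed check: if a class is legal for the virtual register's type, the
// physical size is left at zero, which means "do not compare". Otherwise
// fall back to the register's own size and require an exact match.
bool copyIsAccepted(const ToyPhysReg &Reg, unsigned VRegTypeBits) {
  unsigned PhysSize = hasClassForType(Reg, VRegTypeBits) ? 0 : Reg.SizeInBits;
  return PhysSize == 0 || PhysSize == VRegTypeBits;
}

// Assumed example data: a 64-bit boolean register usable as s1 or s64,
// and a 32-bit register usable as s32 or s16.
ToyPhysReg makeToyVcc() { return ToyPhysReg{64, {ToyRegClass{64, {1, 64}}}}; }
ToyPhysReg makeToyVgpr32() {
  return ToyPhysReg{32, {ToyRegClass{32, {32, 16}}}};
}

// Exercises the accepted and rejected cases with plain assertions.
void runToyExamples() {
  assert(copyIsAccepted(makeToyVcc(), 1));     // s1 from a 64-bit register
  assert(copyIsAccepted(makeToyVcc(), 64));    // s64 from a 64-bit register
  assert(!copyIsAccepted(makeToyVcc(), 32));   // s32 is not legal here
  assert(!copyIsAccepted(makeToyVgpr32(), 8)); // s8 is not legal here
}
```

Under these assumptions, a copy is rejected only when no register class is legal for the type and the sizes also differ, which is the behaviour the description asks the verifier to adopt.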
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-globalisel
Author: Matt Arsenault (arsenm)
Full diff: https://github.com/llvm/llvm-project/pull/159281.diff
3 Files Affected:
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index 2b24fe49c970b..e911ce8a75828 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -2376,20 +2376,24 @@ void MachineVerifier::visitMachineInstrBefore(const MachineInstr *MI) {
// If we have only one valid type, this is likely a copy between a virtual
// and physical register.
- TypeSize SrcSize = TRI->getRegSizeInBits(SrcReg, *MRI);
- TypeSize DstSize = TRI->getRegSizeInBits(DstReg, *MRI);
+ TypeSize SrcSize = TypeSize::getZero();
+ TypeSize DstSize = TypeSize::getZero();
if (SrcReg.isPhysical() && DstTy.isValid()) {
const TargetRegisterClass *SrcRC =
TRI->getMinimalPhysRegClassLLT(SrcReg, DstTy);
- if (SrcRC)
- SrcSize = TRI->getRegSizeInBits(*SrcRC);
+ if (!SrcRC)
+ SrcSize = TRI->getRegSizeInBits(SrcReg, *MRI);
+ } else {
+ SrcSize = TRI->getRegSizeInBits(SrcReg, *MRI);
}
if (DstReg.isPhysical() && SrcTy.isValid()) {
const TargetRegisterClass *DstRC =
TRI->getMinimalPhysRegClassLLT(DstReg, SrcTy);
- if (DstRC)
- DstSize = TRI->getRegSizeInBits(*DstRC);
+ if (!DstRC)
+ DstSize = TRI->getRegSizeInBits(DstReg, *MRI);
+ } else {
+ DstSize = TRI->getRegSizeInBits(DstReg, *MRI);
}
// The next two checks allow COPY between physical and virtual registers,
diff --git a/llvm/test/MachineVerifier/AMDGPU/test_copy_physregs_llt_virtreg.mir b/llvm/test/MachineVerifier/AMDGPU/test_copy_physregs_llt_virtreg.mir
new file mode 100644
index 0000000000000..0fd50391a7e3a
--- /dev/null
+++ b/llvm/test/MachineVerifier/AMDGPU/test_copy_physregs_llt_virtreg.mir
@@ -0,0 +1,58 @@
+# RUN: not --crash llc -mtriple=amdgcn -run-pass=none -filetype=null %s 2>&1 | FileCheck -implicit-check-not="Bad machine code" %s
+
+---
+name: test_valid_copies
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vcc
+ %0:_(s32) = COPY $vgpr0
+ %1:_(s16) = COPY $vgpr0
+ %2:_(s64) = COPY $vcc
+ %3:_(s1) = COPY $vcc
+ $vgpr0 = COPY %0
+ $vgpr0 = COPY %0
+ $vcc = COPY %2
+ $vcc = COPY %3
+...
+
+---
+name: test_invalid_copies
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1, $vgpr2, $vcc
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: %0:_(s32) = COPY $vgpr0_vgpr1
+ %0:_(s32) = COPY $vgpr0_vgpr1
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: %1:_(s64) = COPY $vgpr2
+ %1:_(s64) = COPY $vgpr2
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: %2:_(s32) = COPY $vcc
+ %2:_(s32) = COPY $vcc
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: %3:_(s8) = COPY $vgpr2
+ %3:_(s8) = COPY $vgpr2
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: $vgpr0_vgpr1 = COPY %0:_(s32)
+ $vgpr0_vgpr1 = COPY %0
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: $vgpr2 = COPY %1:_(s64)
+ $vgpr2 = COPY %1
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: $vcc = COPY %2:_(s32)
+ $vcc = COPY %2
+
+ ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
+ ; CHECK: - instruction: $vgpr2 = COPY %3:_(s8)
+ $vgpr2 = COPY %3
+
+...
diff --git a/llvm/test/MachineVerifier/test_copy_physregs_x86.mir b/llvm/test/MachineVerifier/test_copy_physregs_x86.mir
index a239379a34e62..f3323c4353142 100644
--- a/llvm/test/MachineVerifier/test_copy_physregs_x86.mir
+++ b/llvm/test/MachineVerifier/test_copy_physregs_x86.mir
@@ -29,34 +29,26 @@ body: |
liveins: $xmm0, $xmm1, $xmm2, $xmm3
; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: %0:_(s16) = COPY $xmm0
- %0:_(s16) = COPY $xmm0
+ ; CHECK: - instruction: %0:_(<4 x s16>) = COPY $xmm1
+ %0:_(<4 x s16>) = COPY $xmm1
; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: %1:_(<4 x s16>) = COPY $xmm1
- %1:_(<4 x s16>) = COPY $xmm1
+ ; CHECK: - instruction: %1:_(s256) = COPY $xmm2
+ %1:_(s256) = COPY $xmm2
; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: %2:_(s256) = COPY $xmm2
- %2:_(s256) = COPY $xmm2
+ ; CHECK: - instruction: %2:_(<8 x s32>) = COPY $xmm3
+ %2:_(<8 x s32>) = COPY $xmm3
; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: %3:_(<8 x s32>) = COPY $xmm3
- %3:_(<8 x s32>) = COPY $xmm3
+ ; CHECK: - instruction: $xmm1 = COPY %0:_(<4 x s16>)
+ $xmm1 = COPY %0
; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: $xmm0 = COPY %0:_(s16)
- $xmm0 = COPY %0
-
- ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: $xmm1 = COPY %1:_(<4 x s16>)
- $xmm1 = COPY %1
+ ; CHECK: - instruction: $xmm2 = COPY %1:_(s256)
+ $xmm2 = COPY %1
; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: $xmm2 = COPY %2:_(s256)
- $xmm2 = COPY %2
-
- ; CHECK: *** Bad machine code: Copy Instruction is illegal with mismatching sizes ***
- ; CHECK: - instruction: $xmm3 = COPY %3:_(<8 x s32>)
- $xmm3 = COPY %3
+ ; CHECK: - instruction: $xmm3 = COPY %2:_(<8 x s32>)
+ $xmm3 = COPY %2
...
Not vcc specifically, but this is to support lowering i1 argument values to lanemask SGPRs.
OK, looks good so far. Does uniformity analysis correctly determine whether an argument is uniform or divergent? If not, it would be easier to use ballot or inverse ballot in argument lowering, plus a normal copy.
I'd need to check, but it's not important for the purposes of this patch. Right now I'm just trying to avoid breaking existing tests that use these copies after I rip out one of the wavesize hacks supporting them.