[RISCV][GISel][WIP] Support s64 G_PHI on RV32 when D extension is enabled. #119026
Conversation
When the D extension is enabled we should be able to have s64 phis to support f64 values. Unfortunately, this hurts the codegen of i64 phis because we now end up emitting them as an s64 G_PHI surrounded by G_MERGE/G_UNMERGE. Since there are no instructions to move between two GPRs and an FPR, we have to go through a stack temporary. I'm looking for advice on how to improve this.
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes: When the D extension is enabled we should be able to have s64 phis to support f64 values. Unfortunately, this hurts the codegen of i64 phis because we now end up emitting them as an s64 G_PHI surrounded by G_MERGE/G_UNMERGE. Since there are no instructions to move between two GPRs and an FPR, we have to go through a stack temporary. I'm looking for advice on how to improve this.

Patch is 20.71 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119026.diff

3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index 1199dabf3d1659..d7ac081913c5d9 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -412,9 +412,10 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
getActionDefinitionsBuilder(G_BRINDIRECT).legalFor({p0});
getActionDefinitionsBuilder(G_PHI)
- .legalFor({p0, s32, sXLen})
+ .legalFor({p0, s32})
+ .legalFor(XLen == 64 || ST.hasStdExtD(), {s64})
.widenScalarToNextPow2(0)
- .clampScalar(0, s32, sXLen);
+ .clampScalar(0, s32, (XLen == 64 || ST.hasStdExtD()) ? s64 : s32);
getActionDefinitionsBuilder({G_GLOBAL_VALUE, G_JUMP_TABLE, G_CONSTANT_POOL})
.legalFor({p0});
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir
index c8f73afc0ac7c7..49b1538294c748 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir
@@ -1,6 +1,8 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
# RUN: llc -mtriple=riscv32 -run-pass=legalizer %s -o - \
-# RUN: | FileCheck %s
+# RUN: | FileCheck %s --check-prefixes=CHECK,RV32I
+# RUN: llc -mtriple=riscv32 -mattr=+d -run-pass=legalizer %s -o - \
+# RUN: | FileCheck %s --check-prefixes=CHECK,RV32ID
--- |
define i1 @phi_i1(i1 %cnd, i1 %a, i1 %b) {
@@ -413,30 +415,57 @@ machineFunctionInfo:
varArgsFrameIndex: 0
varArgsSaveSize: 0
body: |
- ; CHECK-LABEL: name: phi_i48
- ; CHECK: bb.0.entry:
- ; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
- ; CHECK-NEXT: liveins: $x10, $x11, $x12, $x13, $x14
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
- ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
- ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
- ; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
- ; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
- ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
- ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
- ; CHECK-NEXT: G_BRCOND [[AND]](s32), %bb.2
- ; CHECK-NEXT: G_BR %bb.1
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: bb.1.cond.false:
- ; CHECK-NEXT: successors: %bb.2(0x80000000)
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: bb.2.cond.end:
- ; CHECK-NEXT: [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
- ; CHECK-NEXT: [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
- ; CHECK-NEXT: $x10 = COPY [[PHI]](s32)
- ; CHECK-NEXT: $x11 = COPY [[PHI1]](s32)
- ; CHECK-NEXT: PseudoRET implicit $x10, implicit $x11
+ ; RV32I-LABEL: name: phi_i48
+ ; RV32I: bb.0.entry:
+ ; RV32I-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; RV32I-NEXT: liveins: $x10, $x11, $x12, $x13, $x14
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+ ; RV32I-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+ ; RV32I-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; RV32I-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+ ; RV32I-NEXT: G_BRCOND [[AND]](s32), %bb.2
+ ; RV32I-NEXT: G_BR %bb.1
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: bb.1.cond.false:
+ ; RV32I-NEXT: successors: %bb.2(0x80000000)
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: bb.2.cond.end:
+ ; RV32I-NEXT: [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
+ ; RV32I-NEXT: [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
+ ; RV32I-NEXT: $x10 = COPY [[PHI]](s32)
+ ; RV32I-NEXT: $x11 = COPY [[PHI1]](s32)
+ ; RV32I-NEXT: PseudoRET implicit $x10, implicit $x11
+ ;
+ ; RV32ID-LABEL: name: phi_i48
+ ; RV32ID: bb.0.entry:
+ ; RV32ID-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; RV32ID-NEXT: liveins: $x10, $x11, $x12, $x13, $x14
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; RV32ID-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32ID-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32ID-NEXT: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
+ ; RV32ID-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+ ; RV32ID-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+ ; RV32ID-NEXT: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY3]](s32), [[COPY4]](s32)
+ ; RV32ID-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; RV32ID-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+ ; RV32ID-NEXT: G_BRCOND [[AND]](s32), %bb.2
+ ; RV32ID-NEXT: G_BR %bb.1
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: bb.1.cond.false:
+ ; RV32ID-NEXT: successors: %bb.2(0x80000000)
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: bb.2.cond.end:
+ ; RV32ID-NEXT: [[PHI:%[0-9]+]]:_(s64) = G_PHI [[MV1]](s64), %bb.1, [[MV]](s64), %bb.0
+ ; RV32ID-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI]](s64)
+ ; RV32ID-NEXT: $x10 = COPY [[UV]](s32)
+ ; RV32ID-NEXT: $x11 = COPY [[UV1]](s32)
+ ; RV32ID-NEXT: PseudoRET implicit $x10, implicit $x11
bb.1.entry:
liveins: $x10, $x11, $x12, $x13, $x14
@@ -486,30 +515,57 @@ machineFunctionInfo:
varArgsFrameIndex: 0
varArgsSaveSize: 0
body: |
- ; CHECK-LABEL: name: phi_i64
- ; CHECK: bb.0.entry:
- ; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
- ; CHECK-NEXT: liveins: $x10, $x11, $x12, $x13, $x14
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
- ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
- ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
- ; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
- ; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
- ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
- ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
- ; CHECK-NEXT: G_BRCOND [[AND]](s32), %bb.2
- ; CHECK-NEXT: G_BR %bb.1
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: bb.1.cond.false:
- ; CHECK-NEXT: successors: %bb.2(0x80000000)
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: bb.2.cond.end:
- ; CHECK-NEXT: [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
- ; CHECK-NEXT: [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
- ; CHECK-NEXT: $x10 = COPY [[PHI]](s32)
- ; CHECK-NEXT: $x11 = COPY [[PHI1]](s32)
- ; CHECK-NEXT: PseudoRET implicit $x10, implicit $x11
+ ; RV32I-LABEL: name: phi_i64
+ ; RV32I: bb.0.entry:
+ ; RV32I-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; RV32I-NEXT: liveins: $x10, $x11, $x12, $x13, $x14
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+ ; RV32I-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+ ; RV32I-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; RV32I-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+ ; RV32I-NEXT: G_BRCOND [[AND]](s32), %bb.2
+ ; RV32I-NEXT: G_BR %bb.1
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: bb.1.cond.false:
+ ; RV32I-NEXT: successors: %bb.2(0x80000000)
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: bb.2.cond.end:
+ ; RV32I-NEXT: [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
+ ; RV32I-NEXT: [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
+ ; RV32I-NEXT: $x10 = COPY [[PHI]](s32)
+ ; RV32I-NEXT: $x11 = COPY [[PHI1]](s32)
+ ; RV32I-NEXT: PseudoRET implicit $x10, implicit $x11
+ ;
+ ; RV32ID-LABEL: name: phi_i64
+ ; RV32ID: bb.0.entry:
+ ; RV32ID-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; RV32ID-NEXT: liveins: $x10, $x11, $x12, $x13, $x14
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; RV32ID-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32ID-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32ID-NEXT: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
+ ; RV32ID-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+ ; RV32ID-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+ ; RV32ID-NEXT: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY3]](s32), [[COPY4]](s32)
+ ; RV32ID-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; RV32ID-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+ ; RV32ID-NEXT: G_BRCOND [[AND]](s32), %bb.2
+ ; RV32ID-NEXT: G_BR %bb.1
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: bb.1.cond.false:
+ ; RV32ID-NEXT: successors: %bb.2(0x80000000)
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: bb.2.cond.end:
+ ; RV32ID-NEXT: [[PHI:%[0-9]+]]:_(s64) = G_PHI [[MV1]](s64), %bb.1, [[MV]](s64), %bb.0
+ ; RV32ID-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI]](s64)
+ ; RV32ID-NEXT: $x10 = COPY [[UV]](s32)
+ ; RV32ID-NEXT: $x11 = COPY [[UV1]](s32)
+ ; RV32ID-NEXT: PseudoRET implicit $x10, implicit $x11
bb.1.entry:
liveins: $x10, $x11, $x12, $x13, $x14
@@ -562,34 +618,71 @@ machineFunctionInfo:
varArgsFrameIndex: 0
varArgsSaveSize: 0
body: |
- ; CHECK-LABEL: name: phi_i72
- ; CHECK: bb.0.entry:
- ; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
- ; CHECK-NEXT: liveins: $x10, $x11, $x12
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
- ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
- ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x11
- ; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x11
- ; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x12
- ; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY $x12
- ; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY $x12
- ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
- ; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
- ; CHECK-NEXT: G_BRCOND [[AND]](s32), %bb.2
- ; CHECK-NEXT: G_BR %bb.1
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: bb.1.cond.false:
- ; CHECK-NEXT: successors: %bb.2(0x80000000)
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: bb.2.cond.end:
- ; CHECK-NEXT: [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY1]](s32), %bb.0
- ; CHECK-NEXT: [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY5]](s32), %bb.1, [[COPY2]](s32), %bb.0
- ; CHECK-NEXT: [[PHI2:%[0-9]+]]:_(s32) = G_PHI [[COPY6]](s32), %bb.1, [[COPY3]](s32), %bb.0
- ; CHECK-NEXT: $x10 = COPY [[PHI]](s32)
- ; CHECK-NEXT: $x10 = COPY [[PHI1]](s32)
- ; CHECK-NEXT: $x10 = COPY [[PHI2]](s32)
- ; CHECK-NEXT: PseudoRET implicit $x10, implicit $x10, implicit $x10
+ ; RV32I-LABEL: name: phi_i72
+ ; RV32I: bb.0.entry:
+ ; RV32I-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; RV32I-NEXT: liveins: $x10, $x11, $x12
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32I-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32I-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32I-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32I-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32I-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; RV32I-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+ ; RV32I-NEXT: G_BRCOND [[AND]](s32), %bb.2
+ ; RV32I-NEXT: G_BR %bb.1
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: bb.1.cond.false:
+ ; RV32I-NEXT: successors: %bb.2(0x80000000)
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: bb.2.cond.end:
+ ; RV32I-NEXT: [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY1]](s32), %bb.0
+ ; RV32I-NEXT: [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY5]](s32), %bb.1, [[COPY2]](s32), %bb.0
+ ; RV32I-NEXT: [[PHI2:%[0-9]+]]:_(s32) = G_PHI [[COPY6]](s32), %bb.1, [[COPY3]](s32), %bb.0
+ ; RV32I-NEXT: $x10 = COPY [[PHI]](s32)
+ ; RV32I-NEXT: $x10 = COPY [[PHI1]](s32)
+ ; RV32I-NEXT: $x10 = COPY [[PHI2]](s32)
+ ; RV32I-NEXT: PseudoRET implicit $x10, implicit $x10, implicit $x10
+ ;
+ ; RV32ID-LABEL: name: phi_i72
+ ; RV32ID: bb.0.entry:
+ ; RV32ID-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; RV32ID-NEXT: liveins: $x10, $x11, $x12
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+ ; RV32ID-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32ID-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32ID-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $x11
+ ; RV32ID-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32ID-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32ID-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY $x12
+ ; RV32ID-NEXT: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+ ; RV32ID-NEXT: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
+ ; RV32ID-NEXT: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY3]](s32), [[DEF]](s32)
+ ; RV32ID-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; RV32ID-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+ ; RV32ID-NEXT: G_BRCOND [[AND]](s32), %bb.2
+ ; RV32ID-NEXT: G_BR %bb.1
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: bb.1.cond.false:
+ ; RV32ID-NEXT: successors: %bb.2(0x80000000)
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: [[DEF1:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+ ; RV32ID-NEXT: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY4]](s32), [[COPY5]](s32)
+ ; RV32ID-NEXT: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY6]](s32), [[DEF1]](s32)
+ ; RV32ID-NEXT: {{ $}}
+ ; RV32ID-NEXT: bb.2.cond.end:
+ ; RV32ID-NEXT: [[PHI:%[0-9]+]]:_(s64) = G_PHI [[MV2]](s64), %bb.1, [[MV]](s64), %bb.0
+ ; RV32ID-NEXT: [[PHI1:%[0-9]+]]:_(s64) = G_PHI [[MV3]](s64), %bb.1, [[MV1]](s64), %bb.0
+ ; RV32ID-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI]](s64)
+ ; RV32ID-NEXT: [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI1]](s64)
+ ; RV32ID-NEXT: $x10 = COPY [[UV]](s32)
+ ; RV32ID-NEXT: $x10 = COPY [[UV1]](s32)
+ ; RV32ID-NEXT: $x10 = COPY [[UV2]](s32)
+ ; RV32ID-NEXT: PseudoRET implicit $x10, implicit $x10, implicit $x10
bb.1.entry:
liveins: $x10, $x11, $x12
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll b/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll
index 494d2c94c181db..b21657dc3efe16 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll
@@ -1,8 +1,10 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc -mtriple=riscv32 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV32
-; RUN: llc -mtriple=riscv64 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV64
-; RUN: llc -mtriple=riscv32 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV32
-; RUN: llc -mtriple=riscv64 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV64
+; RUN: llc -mtriple=riscv32 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV32,RV32I
+; RUN: llc -mtriple=riscv64 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV64,RV64I
+; RUN: llc -mtriple=riscv32 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV32,RV32IF
+; RUN: llc -mtriple=riscv64 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV64,RV64IF
+; RUN: llc -mtriple=riscv32 -mattr=+d -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKID,RV32,RV32ID
+; RUN: llc -mtriple=riscv64 -mattr=+d -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKID,RV64,RV64ID
define i1 @phi_i1(i1 %cnd, i1 %a, i1 %b) {
; CHECK-LABEL: phi_i1:
@@ -93,18 +95,18 @@ cond.end: ; preds = %entry, %cond.false
}
define i64 @phi_i64(i1 %cnd, i64 %a, i64 %b) {
-; RV32-LABEL: phi_i64:
-; RV32: # %bb.0: # %entry
-; RV32-NEXT: mv a5, a0
-; RV32-NEXT: mv a0, a1
-; RV32-NEXT: mv a1, a2
-; RV32-NEXT: andi a5, a5, 1
-; RV32-NEXT: bnez a5, .LBB4_2
-; RV32-NEXT: # %bb.1: # %cond.false
-; RV32-NEXT: mv a0, a3
-; RV32-NEXT: mv a1, a4
-; RV32-NEXT: .LBB4_2: # %cond.end
-; RV32-NEXT: ret
+; RV32I-LABEL: phi_i64:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: mv a5, a0
+; RV32I-NEXT: mv a0, a1
+; RV32I-NEXT: mv a1, a2
+; RV32I-NEXT: andi a5, a5, 1
+; RV32I-NEXT: bnez a5, .LBB4_2
+; RV32I-NEXT: # %bb.1: # %cond.false
+; RV32I-NEXT: mv a0, a3
+; RV32I-NEXT: mv a1, a4
+; RV32I-NEXT: .LBB4_2: # %cond.end
+; RV32I-NEXT: ret
;
; RV64-LABEL: phi_i64:
; RV64: # %bb.0: # %entry
@@ -116,6 +118,40 @@ define i64 @phi_i64(i1 %cnd, i64 %a, i64 %b) {
; RV64-NEXT: mv a0, a2
; RV64-NEXT: .LBB4_2: # %cond.end
; RV64-NEXT: ret
+;
+; RV32IF-LABEL: phi_i64:
+; RV32IF: # %bb.0: # %entry
+; RV32IF-NEXT: mv a5, a0
+; RV32IF-NEXT: mv a0, a1
+; RV32IF-NEXT: mv a1, a2
+; RV32IF-NEXT: andi a5, a5, 1
+; RV32IF-NEXT: bnez a5, .LBB4_2
+; RV32IF-NEXT: # %bb.1: # %cond.false
+; RV32IF-NEXT: mv a0, a3
+; RV32IF-NEXT: mv a1, a4
+; RV32IF-NEXT: .LBB4_2: # %cond.end
+; RV32IF-NEXT: ret
+;
+; RV32ID-LABEL: phi_i64:
+; RV32ID: # %bb.0: # %entry
+; RV32ID-NEXT: addi sp, sp, -16
+; RV32ID-NEXT: .cfi_def_cfa_offset 16
+; RV32ID-NEXT: sw a1, 8(sp)
+; RV32ID-NEXT: sw a2, 12(sp)
+; RV32ID-NEXT: fld fa5, 8(sp)
+; RV32ID-NEXT: andi a0, a0, 1
+; RV32ID-NEXT: sw a3, 8(sp)
+; RV32ID-NEXT: sw a4, 12(sp)
+; RV32ID-NEXT: bnez a0, .LBB4_2
+; RV32ID-NEXT: # %bb.1: # %cond.false
+; RV32ID-NEXT: fld fa5, 8(sp)
+; RV32ID-NEXT: .LBB4_2: # %cond.end
+; RV32ID-NEXT: fsd fa5, 8(sp)
+; RV32ID-NEXT: lw a0, 8(sp)
+; RV32ID-NEXT: lw a1, 12(sp)
+; RV32ID-NEXT: addi sp, sp, 16
+; RV32ID-NEXT: .cfi_def_cfa_offset 0
+; RV32ID-NEXT: ret
entry:
br i1 %cnd, label %cond.end, label %cond.false
@@ -169,6 +205,15 @@ define float @phi_float(i1 %cnd, float %a, float %b) {
; CHECKIF-NEXT: fmv.s fa0, fa1
; CHECKIF-NEXT: .LBB6_2: # %cond.end
; CHECKIF-NEXT: ret
+;
+; CHECKID-LABEL: phi_float:
+; CHECKID: # %bb.0: # %entry
+; CHECKID-NEXT: andi a0, a0, 1
+; CHECKID-NEXT: bnez a0, .LBB6_2
+; CHECKID-NEXT: # %bb.1: # %cond.false
+; CHECKID-NEXT: fmv.s fa0, fa1
+; CHECKID-NEXT: .LBB6_2: # %cond.end
+; CHECKID-NEXT: ret
entry:
br i1 %cnd, label %cond.end, label %cond.false
@@ -179,3 +224,74 @@ cond.end: ; preds = %entry, %cond.false
%cond = phi float [ %b, %cond.false ], [ %a, %entry ]
ret float %cond
}
+
+define double @phi_double(i1 %cnd, double %a, double %b) {
+; RV32I-LABEL: phi_double:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: mv a5, a0
+; RV32I-NEXT: mv a0, a1
+; RV32I-NEXT: mv a1, a2
+; RV32I-NEXT: andi a5, a5, 1
+; RV32I-NEXT: bnez a5, .LBB7_2
+; RV32I-NEXT: # %bb.1: # %cond.false
+; RV32I-NEXT: mv a0, a3
+; RV32I-NEXT: mv a1, a4
+; RV32I-NEXT: .LBB7_2: # %cond.end
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: phi_double:
+; RV64I: # %bb.0: # %entry
+; RV64I-NEXT: mv a3, a0
+; RV64I-NEXT: mv a0, a1
+; RV64I-NEXT: andi a3, a3, 1
+; RV64I-NEXT: bnez a3, .LBB7_2
+; RV64I-NEXT: # %bb.1: # %cond.false
+; RV64I-NEXT: mv a0, a2
+; RV64I-NEXT: .LBB7_2: # %cond.end
+; RV64I-NEXT: ret
+;
+; RV32IF-LABEL: phi_double:
+; RV32IF: # %bb.0: # %entry
+; RV32IF-NEXT: mv a5, a0
+; RV32IF-NEXT: mv a0, a1
+; RV32IF-NEXT: mv a1, a2
+; RV32IF-NEXT: andi a5, a5, 1
+; RV32IF-NEXT: bnez a5, .LBB7_2
+; RV32IF-NEXT: # %bb.1: # %cond.false
+; RV32IF-NEXT: mv a0, a3
+; RV32IF-NEXT: mv a1, a4
+; RV32IF-NEXT: .LBB7_2: # %cond.end
+; RV32IF-NEXT: ret
+;
+; RV64IF-LABEL: phi_double:
+; RV64IF: # %bb.0: # %entry
+; RV64IF-NEXT: mv a3, a0
+; RV64IF-NEXT: mv a0, a1
+; RV64IF-NEXT: andi a3, a3, 1
+; RV64IF-NEXT: bnez a3, .LBB7_2
+; RV64IF-NEXT: # %bb.1: # %cond.false
+; RV64IF-NEXT: mv a0, a2
+; RV64IF-NEXT: .LBB7_2: # %cond.end
+; RV64IF-NEXT: re...
[truncated]
Just to make sure I understand this.
If they are not, the thing that trips me up is why we use FPR for i64. While writing this, I think I understand: that's because only FPR can hold the i64 value after legalization and you're stuck with it. That's actually what RegBankSelect is for. At this point you're supposed to propagate the GPR requirements and break the G_PHI down into smaller G_PHIs to keep the whole chain on GPRs. I believe this type of use case is supported, but I may be misremembering.
Make s64 unconditionally legal, and split and remap the phi in regbankselect.
Yes. Because s64 G_PHI is legal, the G_MERGE/G_UNMERGE instructions created for i64 function arguments/returns, or when we legalize i64 arithmetic, don't get removed. Those G_MERGE/G_UNMERGE instructions get selected to pseudo instructions that eventually become stack stores/loads. So it all works functionally, it's just not optimal.
Is there prior art for that? Would I create G_UNMERGEs to split the s64 PHI inputs, or do I need to peek through the existing G_MERGEs to find their inputs?
AMDGPU has to split most 64-bit operations this way, since VALU ops usually only support 32-bit operations but SALU often supports 64 bits.
You should create new G_UNMERGEs in applyMappingImpl. The API for this really is a mess, though. What the generic code tries to do for you is counter-productive: it attempts to (sometimes) create new virtual registers for you, and sometimes doesn't. When it does create the registers for you, there's no sensible way to do anything with them, like create the new instructions with the appropriate type you need. We should rip all of that out at some point.
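A rough pseudocode sketch of what that applyMappingImpl rewrite could look like for an s64 G_PHI mapped to a GPR pair on RV32 (hypothetical shape only; register and block names are illustrative, and this is not the actual RegBankSelect API):

```
; Before: the s64 phi forces its inputs through a single 64-bit register.
;   %res:_(s64) = G_PHI %a(s64), %bb.A, %b(s64), %bb.B
;
; Rewrite: in each predecessor, before its terminators, split the input:
;   %aLo:_(s32), %aHi:_(s32) = G_UNMERGE_VALUES %a(s64)   ; in %bb.A
;   %bLo:_(s32), %bHi:_(s32) = G_UNMERGE_VALUES %b(s64)   ; in %bb.B
;
; Then replace the s64 phi with one s32 phi per half and re-merge:
;   %lo:_(s32)  = G_PHI %aLo(s32), %bb.A, %bLo(s32), %bb.B
;   %hi:_(s32)  = G_PHI %aHi(s32), %bb.A, %bHi(s32), %bb.B
;   %res:_(s64) = G_MERGE_VALUES %lo(s32), %hi(s32)
```

An artifact combine can then fold the new G_UNMERGEs with G_MERGEs that already feed the phi (as in the i64 tests above), so the halves never have to leave the GPRs.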
Looks like Mips might be solving my exact problem.