[RISCV][GISel][WIP] Support s64 G_PHI on RV32 when D extension is enabled. #119026

Draft · wants to merge 1 commit into base: main
Conversation

topperc (Collaborator) commented Dec 6, 2024

When the D extension is enabled we should be able to have s64 phis to support f64 values.

Unfortunately, this hurts the codegen of i64 phis because we now end up emitting them as an s64 G_PHI surrounded by G_MERGE/G_UNMERGE. Since there are no instructions to move between 2 GPRs and an FPR, we have to go through a stack temporary.

I'm looking for advice on how to improve this.
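As background (not part of the patch): the G_MERGE_VALUES/G_UNMERGE_VALUES pair that legalization wraps around the s64 G_PHI has plain concatenation semantics — the first source operand supplies the least-significant bits. A small Python model of the round trip:

```python
# Model of G_MERGE_VALUES / G_UNMERGE_VALUES on an s64 built from two
# s32 halves (first operand = least-significant 32 bits).

MASK32 = 0xFFFFFFFF

def g_merge_values(lo: int, hi: int) -> int:
    """Concatenate two s32 values into one s64 (lo first)."""
    return (hi << 32) | (lo & MASK32)

def g_unmerge_values(v: int) -> tuple[int, int]:
    """Split an s64 back into its two s32 halves (lo, hi)."""
    return v & MASK32, (v >> 32) & MASK32

# Round-trip: unmerge(merge(lo, hi)) == (lo, hi)
lo, hi = 0xDEADBEEF, 0x12345678
assert g_unmerge_values(g_merge_values(lo, hi)) == (lo, hi)
```

The cost described above comes not from these semantics but from how the merge/unmerge pseudos get selected when the s64 lands on an FPR bank.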

llvmbot (Member) commented Dec 6, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

When the D extension is enabled we should be able to have s64 phis to support f64 values.

Unfortunately, this hurts the codegen of i64 phis because we now end up emitting them as an s64 G_PHI surrounded by G_MERGE/G_UNMERGE. Since there are no instructions to move between 2 GPRs and an FPR, we have to go through a stack temporary.

I'm looking for advice on how to improve this.


Patch is 20.71 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119026.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir (+170-77)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/phi.ll (+132-16)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index 1199dabf3d1659..d7ac081913c5d9 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -412,9 +412,10 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
   getActionDefinitionsBuilder(G_BRINDIRECT).legalFor({p0});
 
   getActionDefinitionsBuilder(G_PHI)
-      .legalFor({p0, s32, sXLen})
+      .legalFor({p0, s32})
+      .legalFor(XLen == 64 || ST.hasStdExtD(), {s64})
       .widenScalarToNextPow2(0)
-      .clampScalar(0, s32, sXLen);
+      .clampScalar(0, s32, (XLen == 64 || ST.hasStdExtD()) ? s64 : s32);
 
   getActionDefinitionsBuilder({G_GLOBAL_VALUE, G_JUMP_TABLE, G_CONSTANT_POOL})
       .legalFor({p0});
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir
index c8f73afc0ac7c7..49b1538294c748 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-phi-rv32.mir
@@ -1,6 +1,8 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
 # RUN: llc -mtriple=riscv32 -run-pass=legalizer %s -o - \
-# RUN:   | FileCheck %s
+# RUN:   | FileCheck %s --check-prefixes=CHECK,RV32I
+# RUN: llc -mtriple=riscv32 -mattr=+d -run-pass=legalizer %s -o - \
+# RUN:   | FileCheck %s --check-prefixes=CHECK,RV32ID
 
 --- |
   define i1 @phi_i1(i1 %cnd, i1 %a, i1 %b) {
@@ -413,30 +415,57 @@ machineFunctionInfo:
   varArgsFrameIndex: 0
   varArgsSaveSize: 0
 body:             |
-  ; CHECK-LABEL: name: phi_i48
-  ; CHECK: bb.0.entry:
-  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
-  ; CHECK-NEXT:   liveins: $x10, $x11, $x12, $x13, $x14
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
-  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
-  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
-  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
-  ; CHECK-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
-  ; CHECK-NEXT:   G_BRCOND [[AND]](s32), %bb.2
-  ; CHECK-NEXT:   G_BR %bb.1
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.1.cond.false:
-  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.2.cond.end:
-  ; CHECK-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
-  ; CHECK-NEXT:   [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
-  ; CHECK-NEXT:   $x10 = COPY [[PHI]](s32)
-  ; CHECK-NEXT:   $x11 = COPY [[PHI1]](s32)
-  ; CHECK-NEXT:   PseudoRET implicit $x10, implicit $x11
+  ; RV32I-LABEL: name: phi_i48
+  ; RV32I: bb.0.entry:
+  ; RV32I-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; RV32I-NEXT:   liveins: $x10, $x11, $x12, $x13, $x14
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+  ; RV32I-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32I-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32I-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+  ; RV32I-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+  ; RV32I-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; RV32I-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+  ; RV32I-NEXT:   G_BRCOND [[AND]](s32), %bb.2
+  ; RV32I-NEXT:   G_BR %bb.1
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT: bb.1.cond.false:
+  ; RV32I-NEXT:   successors: %bb.2(0x80000000)
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT: bb.2.cond.end:
+  ; RV32I-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
+  ; RV32I-NEXT:   [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
+  ; RV32I-NEXT:   $x10 = COPY [[PHI]](s32)
+  ; RV32I-NEXT:   $x11 = COPY [[PHI1]](s32)
+  ; RV32I-NEXT:   PseudoRET implicit $x10, implicit $x11
+  ;
+  ; RV32ID-LABEL: name: phi_i48
+  ; RV32ID: bb.0.entry:
+  ; RV32ID-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; RV32ID-NEXT:   liveins: $x10, $x11, $x12, $x13, $x14
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+  ; RV32ID-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32ID-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32ID-NEXT:   [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
+  ; RV32ID-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+  ; RV32ID-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+  ; RV32ID-NEXT:   [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY3]](s32), [[COPY4]](s32)
+  ; RV32ID-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; RV32ID-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+  ; RV32ID-NEXT:   G_BRCOND [[AND]](s32), %bb.2
+  ; RV32ID-NEXT:   G_BR %bb.1
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT: bb.1.cond.false:
+  ; RV32ID-NEXT:   successors: %bb.2(0x80000000)
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT: bb.2.cond.end:
+  ; RV32ID-NEXT:   [[PHI:%[0-9]+]]:_(s64) = G_PHI [[MV1]](s64), %bb.1, [[MV]](s64), %bb.0
+  ; RV32ID-NEXT:   [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI]](s64)
+  ; RV32ID-NEXT:   $x10 = COPY [[UV]](s32)
+  ; RV32ID-NEXT:   $x11 = COPY [[UV1]](s32)
+  ; RV32ID-NEXT:   PseudoRET implicit $x10, implicit $x11
   bb.1.entry:
     liveins: $x10, $x11, $x12, $x13, $x14
 
@@ -486,30 +515,57 @@ machineFunctionInfo:
   varArgsFrameIndex: 0
   varArgsSaveSize: 0
 body:             |
-  ; CHECK-LABEL: name: phi_i64
-  ; CHECK: bb.0.entry:
-  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
-  ; CHECK-NEXT:   liveins: $x10, $x11, $x12, $x13, $x14
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
-  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
-  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
-  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
-  ; CHECK-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
-  ; CHECK-NEXT:   G_BRCOND [[AND]](s32), %bb.2
-  ; CHECK-NEXT:   G_BR %bb.1
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.1.cond.false:
-  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.2.cond.end:
-  ; CHECK-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
-  ; CHECK-NEXT:   [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
-  ; CHECK-NEXT:   $x10 = COPY [[PHI]](s32)
-  ; CHECK-NEXT:   $x11 = COPY [[PHI1]](s32)
-  ; CHECK-NEXT:   PseudoRET implicit $x10, implicit $x11
+  ; RV32I-LABEL: name: phi_i64
+  ; RV32I: bb.0.entry:
+  ; RV32I-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; RV32I-NEXT:   liveins: $x10, $x11, $x12, $x13, $x14
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+  ; RV32I-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32I-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32I-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+  ; RV32I-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+  ; RV32I-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; RV32I-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+  ; RV32I-NEXT:   G_BRCOND [[AND]](s32), %bb.2
+  ; RV32I-NEXT:   G_BR %bb.1
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT: bb.1.cond.false:
+  ; RV32I-NEXT:   successors: %bb.2(0x80000000)
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT: bb.2.cond.end:
+  ; RV32I-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY3]](s32), %bb.1, [[COPY1]](s32), %bb.0
+  ; RV32I-NEXT:   [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY2]](s32), %bb.0
+  ; RV32I-NEXT:   $x10 = COPY [[PHI]](s32)
+  ; RV32I-NEXT:   $x11 = COPY [[PHI1]](s32)
+  ; RV32I-NEXT:   PseudoRET implicit $x10, implicit $x11
+  ;
+  ; RV32ID-LABEL: name: phi_i64
+  ; RV32ID: bb.0.entry:
+  ; RV32ID-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; RV32ID-NEXT:   liveins: $x10, $x11, $x12, $x13, $x14
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+  ; RV32ID-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32ID-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32ID-NEXT:   [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
+  ; RV32ID-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x13
+  ; RV32ID-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x14
+  ; RV32ID-NEXT:   [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY3]](s32), [[COPY4]](s32)
+  ; RV32ID-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; RV32ID-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+  ; RV32ID-NEXT:   G_BRCOND [[AND]](s32), %bb.2
+  ; RV32ID-NEXT:   G_BR %bb.1
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT: bb.1.cond.false:
+  ; RV32ID-NEXT:   successors: %bb.2(0x80000000)
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT: bb.2.cond.end:
+  ; RV32ID-NEXT:   [[PHI:%[0-9]+]]:_(s64) = G_PHI [[MV1]](s64), %bb.1, [[MV]](s64), %bb.0
+  ; RV32ID-NEXT:   [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI]](s64)
+  ; RV32ID-NEXT:   $x10 = COPY [[UV]](s32)
+  ; RV32ID-NEXT:   $x11 = COPY [[UV1]](s32)
+  ; RV32ID-NEXT:   PseudoRET implicit $x10, implicit $x11
   bb.1.entry:
     liveins: $x10, $x11, $x12, $x13, $x14
 
@@ -562,34 +618,71 @@ machineFunctionInfo:
   varArgsFrameIndex: 0
   varArgsSaveSize: 0
 body:             |
-  ; CHECK-LABEL: name: phi_i72
-  ; CHECK: bb.0.entry:
-  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
-  ; CHECK-NEXT:   liveins: $x10, $x11, $x12
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x11
-  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x11
-  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x12
-  ; CHECK-NEXT:   [[COPY5:%[0-9]+]]:_(s32) = COPY $x12
-  ; CHECK-NEXT:   [[COPY6:%[0-9]+]]:_(s32) = COPY $x12
-  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
-  ; CHECK-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
-  ; CHECK-NEXT:   G_BRCOND [[AND]](s32), %bb.2
-  ; CHECK-NEXT:   G_BR %bb.1
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.1.cond.false:
-  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.2.cond.end:
-  ; CHECK-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY1]](s32), %bb.0
-  ; CHECK-NEXT:   [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY5]](s32), %bb.1, [[COPY2]](s32), %bb.0
-  ; CHECK-NEXT:   [[PHI2:%[0-9]+]]:_(s32) = G_PHI [[COPY6]](s32), %bb.1, [[COPY3]](s32), %bb.0
-  ; CHECK-NEXT:   $x10 = COPY [[PHI]](s32)
-  ; CHECK-NEXT:   $x10 = COPY [[PHI1]](s32)
-  ; CHECK-NEXT:   $x10 = COPY [[PHI2]](s32)
-  ; CHECK-NEXT:   PseudoRET implicit $x10, implicit $x10, implicit $x10
+  ; RV32I-LABEL: name: phi_i72
+  ; RV32I: bb.0.entry:
+  ; RV32I-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; RV32I-NEXT:   liveins: $x10, $x11, $x12
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+  ; RV32I-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32I-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32I-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32I-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32I-NEXT:   [[COPY5:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32I-NEXT:   [[COPY6:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32I-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; RV32I-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+  ; RV32I-NEXT:   G_BRCOND [[AND]](s32), %bb.2
+  ; RV32I-NEXT:   G_BR %bb.1
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT: bb.1.cond.false:
+  ; RV32I-NEXT:   successors: %bb.2(0x80000000)
+  ; RV32I-NEXT: {{  $}}
+  ; RV32I-NEXT: bb.2.cond.end:
+  ; RV32I-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI [[COPY4]](s32), %bb.1, [[COPY1]](s32), %bb.0
+  ; RV32I-NEXT:   [[PHI1:%[0-9]+]]:_(s32) = G_PHI [[COPY5]](s32), %bb.1, [[COPY2]](s32), %bb.0
+  ; RV32I-NEXT:   [[PHI2:%[0-9]+]]:_(s32) = G_PHI [[COPY6]](s32), %bb.1, [[COPY3]](s32), %bb.0
+  ; RV32I-NEXT:   $x10 = COPY [[PHI]](s32)
+  ; RV32I-NEXT:   $x10 = COPY [[PHI1]](s32)
+  ; RV32I-NEXT:   $x10 = COPY [[PHI2]](s32)
+  ; RV32I-NEXT:   PseudoRET implicit $x10, implicit $x10, implicit $x10
+  ;
+  ; RV32ID-LABEL: name: phi_i72
+  ; RV32ID: bb.0.entry:
+  ; RV32ID-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; RV32ID-NEXT:   liveins: $x10, $x11, $x12
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $x10
+  ; RV32ID-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32ID-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32ID-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $x11
+  ; RV32ID-NEXT:   [[COPY4:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32ID-NEXT:   [[COPY5:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32ID-NEXT:   [[COPY6:%[0-9]+]]:_(s32) = COPY $x12
+  ; RV32ID-NEXT:   [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+  ; RV32ID-NEXT:   [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
+  ; RV32ID-NEXT:   [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY3]](s32), [[DEF]](s32)
+  ; RV32ID-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; RV32ID-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
+  ; RV32ID-NEXT:   G_BRCOND [[AND]](s32), %bb.2
+  ; RV32ID-NEXT:   G_BR %bb.1
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT: bb.1.cond.false:
+  ; RV32ID-NEXT:   successors: %bb.2(0x80000000)
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT:   [[DEF1:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+  ; RV32ID-NEXT:   [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY4]](s32), [[COPY5]](s32)
+  ; RV32ID-NEXT:   [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[COPY6]](s32), [[DEF1]](s32)
+  ; RV32ID-NEXT: {{  $}}
+  ; RV32ID-NEXT: bb.2.cond.end:
+  ; RV32ID-NEXT:   [[PHI:%[0-9]+]]:_(s64) = G_PHI [[MV2]](s64), %bb.1, [[MV]](s64), %bb.0
+  ; RV32ID-NEXT:   [[PHI1:%[0-9]+]]:_(s64) = G_PHI [[MV3]](s64), %bb.1, [[MV1]](s64), %bb.0
+  ; RV32ID-NEXT:   [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI]](s64)
+  ; RV32ID-NEXT:   [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[PHI1]](s64)
+  ; RV32ID-NEXT:   $x10 = COPY [[UV]](s32)
+  ; RV32ID-NEXT:   $x10 = COPY [[UV1]](s32)
+  ; RV32ID-NEXT:   $x10 = COPY [[UV2]](s32)
+  ; RV32ID-NEXT:   PseudoRET implicit $x10, implicit $x10, implicit $x10
   bb.1.entry:
     liveins: $x10, $x11, $x12
 
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll b/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll
index 494d2c94c181db..b21657dc3efe16 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/phi.ll
@@ -1,8 +1,10 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc -mtriple=riscv32 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV32
-; RUN: llc -mtriple=riscv64 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV64
-; RUN: llc -mtriple=riscv32 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV32
-; RUN: llc -mtriple=riscv64 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV64
+; RUN: llc -mtriple=riscv32 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV32,RV32I
+; RUN: llc -mtriple=riscv64 -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKI,RV64,RV64I
+; RUN: llc -mtriple=riscv32 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV32,RV32IF
+; RUN: llc -mtriple=riscv64 -mattr=+f -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKIF,RV64,RV64IF
+; RUN: llc -mtriple=riscv32 -mattr=+d -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKID,RV32,RV32ID
+; RUN: llc -mtriple=riscv64 -mattr=+d -global-isel < %s | FileCheck %s --check-prefixes=CHECK,CHECKID,RV64,RV64ID
 
 define i1 @phi_i1(i1 %cnd, i1 %a, i1 %b) {
 ; CHECK-LABEL: phi_i1:
@@ -93,18 +95,18 @@ cond.end:                                         ; preds = %entry, %cond.false
 }
 
 define i64 @phi_i64(i1 %cnd, i64 %a, i64 %b) {
-; RV32-LABEL: phi_i64:
-; RV32:       # %bb.0: # %entry
-; RV32-NEXT:    mv a5, a0
-; RV32-NEXT:    mv a0, a1
-; RV32-NEXT:    mv a1, a2
-; RV32-NEXT:    andi a5, a5, 1
-; RV32-NEXT:    bnez a5, .LBB4_2
-; RV32-NEXT:  # %bb.1: # %cond.false
-; RV32-NEXT:    mv a0, a3
-; RV32-NEXT:    mv a1, a4
-; RV32-NEXT:  .LBB4_2: # %cond.end
-; RV32-NEXT:    ret
+; RV32I-LABEL: phi_i64:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    mv a5, a0
+; RV32I-NEXT:    mv a0, a1
+; RV32I-NEXT:    mv a1, a2
+; RV32I-NEXT:    andi a5, a5, 1
+; RV32I-NEXT:    bnez a5, .LBB4_2
+; RV32I-NEXT:  # %bb.1: # %cond.false
+; RV32I-NEXT:    mv a0, a3
+; RV32I-NEXT:    mv a1, a4
+; RV32I-NEXT:  .LBB4_2: # %cond.end
+; RV32I-NEXT:    ret
 ;
 ; RV64-LABEL: phi_i64:
 ; RV64:       # %bb.0: # %entry
@@ -116,6 +118,40 @@ define i64 @phi_i64(i1 %cnd, i64 %a, i64 %b) {
 ; RV64-NEXT:    mv a0, a2
 ; RV64-NEXT:  .LBB4_2: # %cond.end
 ; RV64-NEXT:    ret
+;
+; RV32IF-LABEL: phi_i64:
+; RV32IF:       # %bb.0: # %entry
+; RV32IF-NEXT:    mv a5, a0
+; RV32IF-NEXT:    mv a0, a1
+; RV32IF-NEXT:    mv a1, a2
+; RV32IF-NEXT:    andi a5, a5, 1
+; RV32IF-NEXT:    bnez a5, .LBB4_2
+; RV32IF-NEXT:  # %bb.1: # %cond.false
+; RV32IF-NEXT:    mv a0, a3
+; RV32IF-NEXT:    mv a1, a4
+; RV32IF-NEXT:  .LBB4_2: # %cond.end
+; RV32IF-NEXT:    ret
+;
+; RV32ID-LABEL: phi_i64:
+; RV32ID:       # %bb.0: # %entry
+; RV32ID-NEXT:    addi sp, sp, -16
+; RV32ID-NEXT:    .cfi_def_cfa_offset 16
+; RV32ID-NEXT:    sw a1, 8(sp)
+; RV32ID-NEXT:    sw a2, 12(sp)
+; RV32ID-NEXT:    fld fa5, 8(sp)
+; RV32ID-NEXT:    andi a0, a0, 1
+; RV32ID-NEXT:    sw a3, 8(sp)
+; RV32ID-NEXT:    sw a4, 12(sp)
+; RV32ID-NEXT:    bnez a0, .LBB4_2
+; RV32ID-NEXT:  # %bb.1: # %cond.false
+; RV32ID-NEXT:    fld fa5, 8(sp)
+; RV32ID-NEXT:  .LBB4_2: # %cond.end
+; RV32ID-NEXT:    fsd fa5, 8(sp)
+; RV32ID-NEXT:    lw a0, 8(sp)
+; RV32ID-NEXT:    lw a1, 12(sp)
+; RV32ID-NEXT:    addi sp, sp, 16
+; RV32ID-NEXT:    .cfi_def_cfa_offset 0
+; RV32ID-NEXT:    ret
 entry:
   br i1 %cnd, label %cond.end, label %cond.false
 
@@ -169,6 +205,15 @@ define float @phi_float(i1 %cnd, float %a, float %b) {
 ; CHECKIF-NEXT:    fmv.s fa0, fa1
 ; CHECKIF-NEXT:  .LBB6_2: # %cond.end
 ; CHECKIF-NEXT:    ret
+;
+; CHECKID-LABEL: phi_float:
+; CHECKID:       # %bb.0: # %entry
+; CHECKID-NEXT:    andi a0, a0, 1
+; CHECKID-NEXT:    bnez a0, .LBB6_2
+; CHECKID-NEXT:  # %bb.1: # %cond.false
+; CHECKID-NEXT:    fmv.s fa0, fa1
+; CHECKID-NEXT:  .LBB6_2: # %cond.end
+; CHECKID-NEXT:    ret
 entry:
   br i1 %cnd, label %cond.end, label %cond.false
 
@@ -179,3 +224,74 @@ cond.end:                                         ; preds = %entry, %cond.false
   %cond = phi float [ %b, %cond.false ], [ %a, %entry ]
   ret float %cond
 }
+
+define double @phi_double(i1 %cnd, double %a, double %b) {
+; RV32I-LABEL: phi_double:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    mv a5, a0
+; RV32I-NEXT:    mv a0, a1
+; RV32I-NEXT:    mv a1, a2
+; RV32I-NEXT:    andi a5, a5, 1
+; RV32I-NEXT:    bnez a5, .LBB7_2
+; RV32I-NEXT:  # %bb.1: # %cond.false
+; RV32I-NEXT:    mv a0, a3
+; RV32I-NEXT:    mv a1, a4
+; RV32I-NEXT:  .LBB7_2: # %cond.end
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: phi_double:
+; RV64I:       # %bb.0: # %entry
+; RV64I-NEXT:    mv a3, a0
+; RV64I-NEXT:    mv a0, a1
+; RV64I-NEXT:    andi a3, a3, 1
+; RV64I-NEXT:    bnez a3, .LBB7_2
+; RV64I-NEXT:  # %bb.1: # %cond.false
+; RV64I-NEXT:    mv a0, a2
+; RV64I-NEXT:  .LBB7_2: # %cond.end
+; RV64I-NEXT:    ret
+;
+; RV32IF-LABEL: phi_double:
+; RV32IF:       # %bb.0: # %entry
+; RV32IF-NEXT:    mv a5, a0
+; RV32IF-NEXT:    mv a0, a1
+; RV32IF-NEXT:    mv a1, a2
+; RV32IF-NEXT:    andi a5, a5, 1
+; RV32IF-NEXT:    bnez a5, .LBB7_2
+; RV32IF-NEXT:  # %bb.1: # %cond.false
+; RV32IF-NEXT:    mv a0, a3
+; RV32IF-NEXT:    mv a1, a4
+; RV32IF-NEXT:  .LBB7_2: # %cond.end
+; RV32IF-NEXT:    ret
+;
+; RV64IF-LABEL: phi_double:
+; RV64IF:       # %bb.0: # %entry
+; RV64IF-NEXT:    mv a3, a0
+; RV64IF-NEXT:    mv a0, a1
+; RV64IF-NEXT:    andi a3, a3, 1
+; RV64IF-NEXT:    bnez a3, .LBB7_2
+; RV64IF-NEXT:  # %bb.1: # %cond.false
+; RV64IF-NEXT:    mv a0, a2
+; RV64IF-NEXT:  .LBB7_2: # %cond.end
+; RV64IF-NEXT:    re...
[truncated]

qcolombet (Collaborator)

Unfortunately, this hurts the codegen of i64 phis because we now end up emitting them as an s64 G_PHI surrounded by G_MERGE/G_UNMERGE. Since there are no instructions to move between 2 GPRs and an FPR, we have to go through a stack temporary.

Just to make sure I understand this.
These are two separate problems:

  1. Merge/unmerge for i64 values
  2. Moving 2 gprs in an FPR

If they are not, the thing that trips me up is: why do we use FPR for i64?

While writing this, I think I understand: that's because only FPR can hold the i64 value after legalization and you're stuck with it.
Is that the problem?

That's actually what RegBankSelect is for. At that point you're supposed to propagate the GPR requirements and break the G_PHI down into smaller G_PHIs to keep the whole chain on GPRs.

I believe this type of use case is supported, but I may be misremembering.
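A value-level sketch of that rewrite (hypothetical names, not the LLVM API): instead of one s64 phi over merged halves whose result is then unmerged, each 32-bit half gets its own phi, so nothing ever needs to live in an FPR:

```python
# Value-level model of splitting an s64 G_PHI into two s32 G_PHIs.
# cnd stands in for the incoming control-flow edge that the phi selects on.

MASK32 = 0xFFFFFFFF

def phi_s64(cnd: bool, a: int, b: int) -> int:
    """s64 G_PHI over G_MERGE_VALUES inputs (the problematic form)."""
    return a if cnd else b

def phi_split(cnd: bool, a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> int:
    """Two s32 G_PHIs over the halves (the RegBankSelect-style rewrite),
    re-merged only at the end for comparison."""
    lo = a_lo if cnd else b_lo
    hi = a_hi if cnd else b_hi
    return (hi << 32) | (lo & MASK32)

# The two forms agree for both phi directions.
a_lo, a_hi, b_lo, b_hi = 0x1111, 0x2222, 0x3333, 0x4444
a = (a_hi << 32) | a_lo
b = (b_hi << 32) | b_lo
for cnd in (True, False):
    assert phi_s64(cnd, a, b) == phi_split(cnd, a_lo, a_hi, b_lo, b_hi)
```

In the real backend the final re-merge would not exist either; the split phis would feed the s32 uses directly.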

arsenm (Contributor) commented Dec 6, 2024

Make s64 unconditionally legal and split and remap the phi in regbankselect

topperc (Collaborator, Author) commented Dec 6, 2024

Unfortunately, this hurts the codegen of i64 phis because we now end up emitting them as an s64 G_PHI surrounded by G_MERGE/G_UNMERGE. Since there are no instructions to move between 2 GPRs and an FPR, we have to go through a stack temporary.

Just to make sure I understand this. These are two separate problems:

  1. Merge/unmerge for i64 values
  2. Moving 2 gprs in an FPR

If they are not, the thing that trips me is why do we use FPR for i64?

While writing this, I think I understand: that's because only FPR can hold the i64 value after legalization and you're stuck with it. Is that the problem?

Yes. Because the s64 G_PHI is legal, the G_MERGEs/G_UNMERGEs created for i64 function arguments/returns, or when we legalize i64 arithmetic, don't get removed. Those G_MERGE/G_UNMERGE get selected to pseudo instructions that eventually become stack store/load pairs. So it all functionally works; it's just not optimal.

topperc (Collaborator, Author) commented Dec 6, 2024

Make s64 unconditionally legal and split and remap the phi in regbankselect

Is there prior art for that?

Would I create G_UNMERGEs to split the s64 PHI inputs or do I need to peek through the existing G_MERGEs to find their inputs?

arsenm (Contributor) commented Dec 6, 2024

Make s64 unconditionally legal and split and remap the phi in regbankselect

Is there prior art for that?

AMDGPU has to split most 64-bit operations this way since VALU ops usually only support 32-bit operations, but SALU often supports 64-bits.

Would I create G_UNMERGEs to split the s64 PHI inputs or do I need to peek through the existing G_MERGEs to find their inputs?

You should create new G_UNMERGEs in applyMappingImpl. The API for this really is a mess, though. What the generic code tries to do for you is counterproductive: it sometimes creates new virtual registers for you and sometimes doesn't, and when it does create them, there's no sensible way to do anything with them, like create the new instructions with the appropriate type you need. We should rip all of that out at some point.

topperc (Collaborator, Author) commented Dec 6, 2024

Make s64 unconditionally legal and split and remap the phi in regbankselect

Is there prior art for that?

AMDGPU has to split most 64-bit operations this way since VALU ops usually only support 32-bit operations, but SALU often supports 64-bits.

Would I create G_UNMERGEs to split the s64 PHI inputs or do I need to peek through the existing G_MERGEs to find their inputs?

You should create new G_UNMERGEs in applyMappingImpl. The API for this really is a mess, though. What the generic code tries to do for you is counterproductive: it sometimes creates new virtual registers for you and sometimes doesn't, and when it does create them, there's no sensible way to do anything with them, like create the new instructions with the appropriate type you need. We should rip all of that out at some point.

Looks like Mips might be solving my exact problem.

@topperc topperc marked this pull request as draft December 7, 2024 05:13