Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Prefer Zcmp push/pop instead of save-restore calls. #66046

Merged
merged 1 commit into from
Sep 20, 2023

Conversation

yetingk
Copy link
Contributor

@yetingk yetingk commented Sep 12, 2023

Zcmp push/pop can reduce more code size then save-restore calls. There are two reasons,

  1. Call for save-restore calls needs 4-8 bytes, but Zcmp push/pop only needs 2 bytes.
  2. Zcmp push/pop can also handles small shift of sp.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 12, 2023

@llvm/pr-subscribers-backend-risc-v

Changes

Zcmp push/pop can reduce more code size then save-restore calls. There are two reasons,

  1. Call for save-restore calls needs 4-8 bytes, but Zcmp push/pop only needs 2 bytes.
  2. Zcmp push/pop can also handles small shift of sp.
    --
    Full diff: https://github.com/llvm/llvm-project/pull/66046.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+56-64)
diff --git a/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h b/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h
index 099ebb4014ca46f..6ee5790b272adbb 100644
--- a/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h
@@ -110,7 +110,8 @@ class RISCVMachineFunctionInfo : public MachineFunctionInfo {
   bool useSaveRestoreLibCalls(const MachineFunction &MF) const {
     // We cannot use fixed locations for the callee saved spill slots if the
     // function uses a varargs save area, or is an interrupt handler.
-    return MF.getSubtarget().enableSaveRestore() &&
+    return !isPushable(MF) &&
+           MF.getSubtarget().enableSaveRestore() &&
            VarArgsSaveSize == 0 && !MF.getFrameInfo().hasTailCall() &&
            !MF.getFunction().hasFnAttribute("interrupt");
   }
@@ -131,8 +132,7 @@ class RISCVMachineFunctionInfo : public MachineFunctionInfo {
     // We cannot use fixed locations for the callee saved spill slots if the
     // function uses a varargs save area.
     // TODO: Use a seperate placement for vararg registers to enable Zcmp.
-    return !useSaveRestoreLibCalls(MF) &&
-           MF.getSubtarget().hasStdExtZcmp() &&
+    return MF.getSubtarget().hasStdExtZcmp() &&
            !MF.getTarget().Options.DisableFramePointerElim(MF) &&
            VarArgsSaveSize == 0;
   }
diff --git a/llvm/test/CodeGen/RISCV/push-pop-popret.ll b/llvm/test/CodeGen/RISCV/push-pop-popret.ll
index af3828ed7d839ce..d8570112a2a50f8 100644
--- a/llvm/test/CodeGen/RISCV/push-pop-popret.ll
+++ b/llvm/test/CodeGen/RISCV/push-pop-popret.ll
@@ -41,27 +41,25 @@ define i32 @foo() {
 ;
 ; RV32IZCMP-SR-LABEL: foo:
 ; RV32IZCMP-SR:       # %bb.0:
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_0
-; RV32IZCMP-SR-NEXT:    addi sp, sp, -512
+; RV32IZCMP-SR-NEXT:    cm.push {ra}, -64
+; RV32IZCMP-SR-NEXT:    addi sp, sp, -464
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa_offset 528
 ; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -4
 ; RV32IZCMP-SR-NEXT:    mv a0, sp
 ; RV32IZCMP-SR-NEXT:    call test@plt
-; RV32IZCMP-SR-NEXT:    li a0, 0
-; RV32IZCMP-SR-NEXT:    addi sp, sp, 512
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_0
+; RV32IZCMP-SR-NEXT:    addi sp, sp, 464
+; RV32IZCMP-SR-NEXT:    cm.popretz {ra}, 64
 ;
 ; RV64IZCMP-SR-LABEL: foo:
 ; RV64IZCMP-SR:       # %bb.0:
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_0
-; RV64IZCMP-SR-NEXT:    addi sp, sp, -512
+; RV64IZCMP-SR-NEXT:    cm.push {ra}, -64
+; RV64IZCMP-SR-NEXT:    addi sp, sp, -464
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa_offset 528
 ; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -8
 ; RV64IZCMP-SR-NEXT:    mv a0, sp
 ; RV64IZCMP-SR-NEXT:    call test@plt
-; RV64IZCMP-SR-NEXT:    li a0, 0
-; RV64IZCMP-SR-NEXT:    addi sp, sp, 512
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_0
+; RV64IZCMP-SR-NEXT:    addi sp, sp, 464
+; RV64IZCMP-SR-NEXT:    cm.popretz {ra}, 64
 ;
 ; RV32I-LABEL: foo:
 ; RV32I:       # %bb.0:
@@ -131,10 +129,10 @@ define i32 @pushpopret0(i32 signext %size){
 ;
 ; RV32IZCMP-SR-LABEL: pushpopret0:
 ; RV32IZCMP-SR:       # %bb.0: # %entry
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV32IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -4
-; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -4
 ; RV32IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV32IZCMP-SR-NEXT:    addi a0, a0, 15
@@ -142,16 +140,15 @@ define i32 @pushpopret0(i32 signext %size){
 ; RV32IZCMP-SR-NEXT:    sub a0, sp, a0
 ; RV32IZCMP-SR-NEXT:    mv sp, a0
 ; RV32IZCMP-SR-NEXT:    call callee_void@plt
-; RV32IZCMP-SR-NEXT:    li a0, 0
 ; RV32IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV32IZCMP-SR-NEXT:    cm.popretz {ra, s0}, 16
 ;
 ; RV64IZCMP-SR-LABEL: pushpopret0:
 ; RV64IZCMP-SR:       # %bb.0: # %entry
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV64IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -8
-; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -8
 ; RV64IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV64IZCMP-SR-NEXT:    slli a0, a0, 32
@@ -161,9 +158,8 @@ define i32 @pushpopret0(i32 signext %size){
 ; RV64IZCMP-SR-NEXT:    sub a0, sp, a0
 ; RV64IZCMP-SR-NEXT:    mv sp, a0
 ; RV64IZCMP-SR-NEXT:    call callee_void@plt
-; RV64IZCMP-SR-NEXT:    li a0, 0
 ; RV64IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV64IZCMP-SR-NEXT:    cm.popretz {ra, s0}, 16
 ;
 ; RV32I-LABEL: pushpopret0:
 ; RV32I:       # %bb.0: # %entry
@@ -255,10 +251,10 @@ define i32 @pushpopret1(i32 signext %size) {
 ;
 ; RV32IZCMP-SR-LABEL: pushpopret1:
 ; RV32IZCMP-SR:       # %bb.0: # %entry
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV32IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -4
-; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -4
 ; RV32IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV32IZCMP-SR-NEXT:    addi a0, a0, 15
@@ -268,14 +264,14 @@ define i32 @pushpopret1(i32 signext %size) {
 ; RV32IZCMP-SR-NEXT:    call callee_void@plt
 ; RV32IZCMP-SR-NEXT:    li a0, 1
 ; RV32IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV32IZCMP-SR-NEXT:    cm.popret {ra, s0}, 16
 ;
 ; RV64IZCMP-SR-LABEL: pushpopret1:
 ; RV64IZCMP-SR:       # %bb.0: # %entry
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV64IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -8
-; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -8
 ; RV64IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV64IZCMP-SR-NEXT:    slli a0, a0, 32
@@ -287,7 +283,7 @@ define i32 @pushpopret1(i32 signext %size) {
 ; RV64IZCMP-SR-NEXT:    call callee_void@plt
 ; RV64IZCMP-SR-NEXT:    li a0, 1
 ; RV64IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV64IZCMP-SR-NEXT:    cm.popret {ra, s0}, 16
 ;
 ; RV32I-LABEL: pushpopret1:
 ; RV32I:       # %bb.0: # %entry
@@ -379,10 +375,10 @@ define i32 @pushpopretneg1(i32 signext %size) {
 ;
 ; RV32IZCMP-SR-LABEL: pushpopretneg1:
 ; RV32IZCMP-SR:       # %bb.0: # %entry
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV32IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -4
-; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -4
 ; RV32IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV32IZCMP-SR-NEXT:    addi a0, a0, 15
@@ -392,14 +388,14 @@ define i32 @pushpopretneg1(i32 signext %size) {
 ; RV32IZCMP-SR-NEXT:    call callee_void@plt
 ; RV32IZCMP-SR-NEXT:    li a0, -1
 ; RV32IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV32IZCMP-SR-NEXT:    cm.popret {ra, s0}, 16
 ;
 ; RV64IZCMP-SR-LABEL: pushpopretneg1:
 ; RV64IZCMP-SR:       # %bb.0: # %entry
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV64IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -8
-; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -8
 ; RV64IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV64IZCMP-SR-NEXT:    slli a0, a0, 32
@@ -411,7 +407,7 @@ define i32 @pushpopretneg1(i32 signext %size) {
 ; RV64IZCMP-SR-NEXT:    call callee_void@plt
 ; RV64IZCMP-SR-NEXT:    li a0, -1
 ; RV64IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV64IZCMP-SR-NEXT:    cm.popret {ra, s0}, 16
 ;
 ; RV32I-LABEL: pushpopretneg1:
 ; RV32I:       # %bb.0: # %entry
@@ -503,10 +499,10 @@ define i32 @pushpopret2(i32 signext %size) {
 ;
 ; RV32IZCMP-SR-LABEL: pushpopret2:
 ; RV32IZCMP-SR:       # %bb.0: # %entry
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV32IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -4
-; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -4
 ; RV32IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV32IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV32IZCMP-SR-NEXT:    addi a0, a0, 15
@@ -516,14 +512,14 @@ define i32 @pushpopret2(i32 signext %size) {
 ; RV32IZCMP-SR-NEXT:    call callee_void@plt
 ; RV32IZCMP-SR-NEXT:    li a0, 2
 ; RV32IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV32IZCMP-SR-NEXT:    cm.popret {ra, s0}, 16
 ;
 ; RV64IZCMP-SR-LABEL: pushpopret2:
 ; RV64IZCMP-SR:       # %bb.0: # %entry
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_1
+; RV64IZCMP-SR-NEXT:    cm.push {ra, s0}, -16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa_offset 16
-; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -8
-; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -8
 ; RV64IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV64IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
 ; RV64IZCMP-SR-NEXT:    slli a0, a0, 32
@@ -535,7 +531,7 @@ define i32 @pushpopret2(i32 signext %size) {
 ; RV64IZCMP-SR-NEXT:    call callee_void@plt
 ; RV64IZCMP-SR-NEXT:    li a0, 2
 ; RV64IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_1
+; RV64IZCMP-SR-NEXT:    cm.popret {ra, s0}, 16
 ;
 ; RV32I-LABEL: pushpopret2:
 ; RV32I:       # %bb.0: # %entry
@@ -1220,7 +1216,7 @@ define void @many_args(i32, i32, i32, i32, i32, i32, i32, i32, i32) nounwind {
 ;
 ; RV32IZCMP-SR-LABEL: many_args:
 ; RV32IZCMP-SR:       # %bb.0: # %entry
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_5
+; RV32IZCMP-SR-NEXT:    cm.push {ra, s0-s4}, -32
 ; RV32IZCMP-SR-NEXT:    lui a0, %hi(var0)
 ; RV32IZCMP-SR-NEXT:    lw a6, %lo(var0)(a0)
 ; RV32IZCMP-SR-NEXT:    lw a7, %lo(var0+4)(a0)
@@ -1259,11 +1255,11 @@ define void @many_args(i32, i32, i32, i32, i32, i32, i32, i32, i32) nounwind {
 ; RV32IZCMP-SR-NEXT:    sw t0, %lo(var0+8)(a0)
 ; RV32IZCMP-SR-NEXT:    sw a7, %lo(var0+4)(a0)
 ; RV32IZCMP-SR-NEXT:    sw a6, %lo(var0)(a0)
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_5
+; RV32IZCMP-SR-NEXT:    cm.popret {ra, s0-s4}, 32
 ;
 ; RV64IZCMP-SR-LABEL: many_args:
 ; RV64IZCMP-SR:       # %bb.0: # %entry
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_5
+; RV64IZCMP-SR-NEXT:    cm.push {ra, s0-s4}, -48
 ; RV64IZCMP-SR-NEXT:    lui a0, %hi(var0)
 ; RV64IZCMP-SR-NEXT:    lw a6, %lo(var0)(a0)
 ; RV64IZCMP-SR-NEXT:    lw a7, %lo(var0+4)(a0)
@@ -1302,7 +1298,7 @@ define void @many_args(i32, i32, i32, i32, i32, i32, i32, i32, i32) nounwind {
 ; RV64IZCMP-SR-NEXT:    sw t0, %lo(var0+8)(a0)
 ; RV64IZCMP-SR-NEXT:    sw a7, %lo(var0+4)(a0)
 ; RV64IZCMP-SR-NEXT:    sw a6, %lo(var0)(a0)
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_5
+; RV64IZCMP-SR-NEXT:    cm.popret {ra, s0-s4}, 48
 ;
 ; RV32I-LABEL: many_args:
 ; RV32I:       # %bb.0: # %entry
@@ -1456,7 +1452,7 @@ define void @alloca(i32 %n) nounwind {
 ;
 ; RV32IZCMP-SR-LABEL: alloca:
 ; RV32IZCMP-SR:       # %bb.0:
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_2
+; RV32IZCMP-SR-NEXT:    cm.push {ra, s0-s1}, -16
 ; RV32IZCMP-SR-NEXT:    addi s0, sp, 16
 ; RV32IZCMP-SR-NEXT:    mv s1, sp
 ; RV32IZCMP-SR-NEXT:    addi a0, a0, 15
@@ -1466,11 +1462,11 @@ define void @alloca(i32 %n) nounwind {
 ; RV32IZCMP-SR-NEXT:    call notdead@plt
 ; RV32IZCMP-SR-NEXT:    mv sp, s1
 ; RV32IZCMP-SR-NEXT:    addi sp, s0, -16
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_2
+; RV32IZCMP-SR-NEXT:    cm.popret {ra, s0-s1}, 16
 ;
 ; RV64IZCMP-SR-LABEL: alloca:
 ; RV64IZCMP-SR:       # %bb.0:
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_2
+; RV64IZCMP-SR-NEXT:    cm.push {ra, s0-s1}, -32
 ; RV64IZCMP-SR-NEXT:    addi s0, sp, 32
 ; RV64IZCMP-SR-NEXT:    mv s1, sp
 ; RV64IZCMP-SR-NEXT:    slli a0, a0, 32
@@ -1482,7 +1478,7 @@ define void @alloca(i32 %n) nounwind {
 ; RV64IZCMP-SR-NEXT:    call notdead@plt
 ; RV64IZCMP-SR-NEXT:    mv sp, s1
 ; RV64IZCMP-SR-NEXT:    addi sp, s0, -32
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_2
+; RV64IZCMP-SR-NEXT:    cm.popret {ra, s0-s1}, 32
 ;
 ; RV32I-LABEL: alloca:
 ; RV32I:       # %bb.0:
@@ -1790,15 +1786,15 @@ define void @foo_no_irq() nounwind{
 ;
 ; RV32IZCMP-SR-LABEL: foo_no_irq:
 ; RV32IZCMP-SR:       # %bb.0:
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_0
+; RV32IZCMP-SR-NEXT:    cm.push {ra}, -16
 ; RV32IZCMP-SR-NEXT:    call foo_test_irq@plt
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_0
+; RV32IZCMP-SR-NEXT:    cm.popret {ra}, 16
 ;
 ; RV64IZCMP-SR-LABEL: foo_no_irq:
 ; RV64IZCMP-SR:       # %bb.0:
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_0
+; RV64IZCMP-SR-NEXT:    cm.push {ra}, -16
 ; RV64IZCMP-SR-NEXT:    call foo_test_irq@plt
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_0
+; RV64IZCMP-SR-NEXT:    cm.popret {ra}, 16
 ;
 ; RV32I-LABEL: foo_no_irq:
 ; RV32I:       # %bb.0:
@@ -2739,8 +2735,7 @@ define void @callee_no_irq() nounwind{
 ;
 ; RV32IZCMP-SR-LABEL: callee_no_irq:
 ; RV32IZCMP-SR:       # %bb.0:
-; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_12
-; RV32IZCMP-SR-NEXT:    addi sp, sp, -32
+; RV32IZCMP-SR-NEXT:    cm.push {ra, s0-s11}, -96
 ; RV32IZCMP-SR-NEXT:    lui a7, %hi(var_test_irq)
 ; RV32IZCMP-SR-NEXT:    lw a0, %lo(var_test_irq)(a7)
 ; RV32IZCMP-SR-NEXT:    sw a0, 28(sp) # 4-byte Folded Spill
@@ -2819,13 +2814,11 @@ define void @callee_no_irq() nounwind{
 ; RV32IZCMP-SR-NEXT:    sw a0, %lo(var_test_irq+4)(a7)
 ; RV32IZCMP-SR-NEXT:    lw a0, 28(sp) # 4-byte Folded Reload
 ; RV32IZCMP-SR-NEXT:    sw a0, %lo(var_test_irq)(a7)
-; RV32IZCMP-SR-NEXT:    addi sp, sp, 32
-; RV32IZCMP-SR-NEXT:    tail __riscv_restore_12
+; RV32IZCMP-SR-NEXT:    cm.popret {ra, s0-s11}, 96
 ;
 ; RV64IZCMP-SR-LABEL: callee_no_irq:
 ; RV64IZCMP-SR:       # %bb.0:
-; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_12
-; RV64IZCMP-SR-NEXT:    addi sp, sp, -48
+; RV64IZCMP-SR-NEXT:    cm.push {ra, s0-s11}, -160
 ; RV64IZCMP-SR-NEXT:    lui a7, %hi(var_test_irq)
 ; RV64IZCMP-SR-NEXT:    lw a0, %lo(var_test_irq)(a7)
 ; RV64IZCMP-SR-NEXT:    sd a0, 40(sp) # 8-byte Folded Spill
@@ -2904,8 +2897,7 @@ define void @callee_no_irq() nounwind{
 ; RV64IZCMP-SR-NEXT:    sw a0, %lo(var_test_irq+4)(a7)
 ; RV64IZCMP-SR-NEXT:    ld a0, 40(sp) # 8-byte Folded Reload
 ; RV64IZCMP-SR-NEXT:    sw a0, %lo(var_test_irq)(a7)
-; RV64IZCMP-SR-NEXT:    addi sp, sp, 48
-; RV64IZCMP-SR-NEXT:    tail __riscv_restore_12
+; RV64IZCMP-SR-NEXT:    cm.popret {ra, s0-s11}, 160
 ;
 ; RV32I-LABEL: callee_no_irq:
 ; RV32I:       # %bb.0:

; RV32IZCMP-SR-NEXT: .cfi_def_cfa_offset 16
; RV32IZCMP-SR-NEXT: .cfi_offset ra, -4
; RV32IZCMP-SR-NEXT: .cfi_offset s0, -8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems be a problem,

the func of __riscv_save_1 is

FUNC_BEGIN (__riscv_save_1)
  .cfi_startproc
  # __riscv_save_* routine use t0/x5 as return address
  .cfi_return_column 5
  addi sp, sp, -16
  .cfi_def_cfa_offset 16
  sd s0, 0(sp)
  .cfi_offset 8, -16
  sd ra, 8(sp)
  .cfi_offset 1, -8
  jr t0
  .cfi_endproc
FUNC_END (__riscv_save_1)

while, I think the inst cm.push {ra, s0}, -16 will execute following insts as Zc Spec has mentioned.

sd s0, -8(sp)
sd ra, -16(sp)
addi sp, sp, -16

That's why I think .cfi_offset should be

.cfi_offset ra, -4
.cfi_offset s0, -8

Copy link
Contributor Author

@yetingk yetingk Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think the behavior of cm.push {ra, s0}, -16 is

sd s0, -4(sp)
sd ra, -8(sp)
addi sp, sp, -16

And there is a llvm fix https://reviews.llvm.org/D156437 for that.

@yetingk
Copy link
Contributor Author

yetingk commented Sep 19, 2023

Ping.

@tclin914
Copy link
Contributor

LGTM.

Zcmp push/pop can reduce more code size then save-restore calls. There are two reasons,
1. Calls for save-restore call needs 4-8 bytes, but Zcmp push/pop only needs 2 bytes.
2. Zcmp push/pop can also handles small shift of sp.
@yetingk
Copy link
Contributor Author

yetingk commented Sep 20, 2023

Rebase.

@yetingk yetingk merged commit ab94fbb into llvm:main Sep 20, 2023
1 of 2 checks passed
@yetingk yetingk deleted the zcmp branch September 27, 2023 06:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants