[AArch64] Support preserve_none calling convention #91046

Merged 10 commits on Jun 3, 2024

antangelo (Contributor) commented May 4, 2024

Adds AArch64 support for the preserve_none calling convention. Registers X0-X7, X9-X15 and X19-X28 are caller save, and can be used to pass arguments. Delegates to AAPCS for all other registers.

Closes #87423
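
For readers unfamiliar with the attribute, here is a minimal sketch of how it might be used from C. The function names (`weighted_sum`, `call_weighted_sum`, `self_check`) and the `PRESERVE_NONE` macro are hypothetical and exist only for illustration; the macro expands to nothing on compilers that do not recognise the attribute, so the code should still build there using the default convention.

```c
#include <assert.h>

/* Use the attribute only when the compiler reports support for it.
 * PRESERVE_NONE is a hypothetical helper macro, not part of this patch. */
#if defined(__has_attribute)
#  if __has_attribute(preserve_none)
#    define PRESERVE_NONE __attribute__((preserve_none))
#  endif
#endif
#ifndef PRESERVE_NONE
#  define PRESERVE_NONE /* unsupported: fall back to the default convention */
#endif

/* Hypothetical callee. With preserve_none it may clobber almost every
 * general-purpose register, and its arguments may arrive in registers
 * that the default convention would not use for argument passing. */
PRESERVE_NONE long weighted_sum(long a, long b, long c, long d)
{
    return a + 2 * b + 3 * c + 4 * d;
}

/* Ordinary caller using the default C convention. The compiler must
 * assume its live values in general registers do not survive the call. */
long call_weighted_sum(long x)
{
    return weighted_sum(x, x, x, x);
}

/* The convention changes register usage, not results. */
void self_check(void)
{
    assert(call_weighted_sum(1) == 10);
}
```

The attribute is part of the function type, so any pointer used to call such a function would need to carry the same attribute.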

pinskia commented May 4, 2024

I don't think you can use x16 and x17 for argument passing due to them being reserved for PLTs and call veneers.
That is, if the linker decides to create a branch island, or if the function is called via a PLT, x16 and x17 will be clobbered on the call, so arguments passed in them won't work either.

brandtbucher (Contributor) commented

Aw, but that means we only have twenty-six registers for argument-passing... ;)
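
As a rough illustration of the register budget being discussed, the sketch below models the 64-bit integer argument order from the `CC_AArch64_Preserve_None` definition in this patch (X19-X28 first, then X0-X15, with X16/X17 left out). It is a simplified model, not the backend's implementation: the array, the `NUM_ARG_REGS` macro, and both helper functions are hypothetical names, and only the `i64` list is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Order copied from the i64 CCAssignToReg list in the patch:
 * callee-saved registers of the standard convention first, then X0-X15.
 * X16/X17 (IP0/IP1), X18, X29 (FP) and X30 (LR) are deliberately absent. */
static const char *const kArgRegs[] = {
    "X19", "X20", "X21", "X22", "X23", "X24", "X25", "X26", "X27", "X28",
    "X0",  "X1",  "X2",  "X3",  "X4",  "X5",  "X6",  "X7",
    "X8",  "X9",  "X10", "X11", "X12", "X13", "X14", "X15",
};

#define NUM_ARG_REGS (sizeof kArgRegs / sizeof kArgRegs[0])

/* Register that would hold the index-th 64-bit integer argument, or NULL
 * once the list is exhausted (the patch then delegates to the AAPCS rules). */
const char *preserve_none_arg_reg(size_t index)
{
    return index < NUM_ARG_REGS ? kArgRegs[index] : NULL;
}

/* Position of a register in the argument order, or -1 if the register is
 * not used for argument passing in this model. */
int preserve_none_arg_index(const char *reg)
{
    assert(reg != NULL);
    for (size_t i = 0; i < NUM_ARG_REGS; ++i) {
        if (strcmp(kArgRegs[i], reg) == 0)
            return (int)i;
    }
    return -1;
}
```

In this model the list holds 26 registers, which is consistent with the count of twenty-six mentioned above, and looking up `X16` or `X17` yields -1.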

llvmbot added the clang, backend:AArch64, clang:frontend, and llvm:ir labels on May 5, 2024

llvmbot commented May 5, 2024

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-clang

Author: None (antangelo)

Changes

Adds AArch64 support for the preserve_none calling convention. Registers X0-X17 and X19-X28 are caller save, and can be used to pass arguments. Delegates to AAPCS for all other registers.

Closes #87423


Patch is 44.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91046.diff

16 Files Affected:

  • (modified) clang/include/clang/Basic/Attr.td (+2-1)
  • (modified) clang/include/clang/Basic/AttrDocs.td (+11-8)
  • (modified) clang/lib/Basic/Targets/AArch64.cpp (+1)
  • (modified) clang/test/CodeGen/preserve-call-conv.c (+3-3)
  • (modified) llvm/docs/LangRef.rst (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64CallingConvention.h (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64CallingConvention.td (+29)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+33-1)
  • (modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp (+9-3)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp (+1)
  • (added) llvm/test/CodeGen/AArch64/dynamic-regmask-preserve-none.ll (+88)
  • (modified) llvm/test/CodeGen/AArch64/preserve.ll (+8-1)
  • (added) llvm/test/CodeGen/AArch64/preserve_nonecc.ll (+92)
  • (added) llvm/test/CodeGen/AArch64/preserve_nonecc_call.ll (+337)
  • (added) llvm/test/CodeGen/AArch64/preserve_nonecc_musttail.ll (+11)
  • (added) llvm/test/CodeGen/AArch64/preserve_nonecc_swift.ll (+16)
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 0225598cbbe8ad..712c79927304e2 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -3038,7 +3038,8 @@ def M68kRTD: DeclOrTypeAttr {
   let Documentation = [M68kRTDDocs];
 }
 
-def PreserveNone : DeclOrTypeAttr, TargetSpecificAttr<TargetAnyX86> {
+def PreserveNone : DeclOrTypeAttr,
+                   TargetSpecificAttr<TargetArch<!listconcat(TargetAArch64.Arches, TargetAnyX86.Arches)>> {
   let Spellings = [Clang<"preserve_none">];
   let Subjects = SubjectList<[FunctionLike]>;
   let Documentation = [PreserveNoneDocs];
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index f8253143b596c0..d23465b77e7edd 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -5658,17 +5658,20 @@ experimental at this time.
 def PreserveNoneDocs : Documentation {
   let Category = DocCatCallingConvs;
   let Content = [{
-On X86-64 target, this attribute changes the calling convention of a function.
+On X86-64 and AArch64 targets, this attribute changes the calling convention of a function.
 The ``preserve_none`` calling convention tries to preserve as few general
 registers as possible. So all general registers are caller saved registers. It
 also uses more general registers to pass arguments. This attribute doesn't
-impact floating-point registers (XMMs/YMMs). Floating-point registers still
-follow the c calling convention.
-
-- Only RSP and RBP are preserved by callee.
-
-- Register RDI, RSI, RDX, RCX, R8, R9, R11, R12, R13, R14, R15 and RAX now can
-  be used to pass function arguments.
+impact floating-point registers. 
+
+- On X86-64, only RSP and RBP are preserved by the callee.
+  Registers RDI, RSI, RDX, RCX, R8, R9, R11, R12, R13, R14, R15 and RAX now can
+  be used to pass function arguments. Floating-point registers (XMMs/YMMs) still
+  follow the C calling convention.
+- On AArch64, only LR and FP are preserved by the callee.
+  Registers X19-X28 and X0-X17 are used to pass function arguments.
+  X18, SIMD and floating-point registers follow the AAPCS calling
+  convention.
   }];
 }
 
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index c8d243a8fb7aea..e1f7dbf1d9f20b 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -1202,6 +1202,7 @@ AArch64TargetInfo::checkCallingConvention(CallingConv CC) const {
   case CC_SwiftAsync:
   case CC_PreserveMost:
   case CC_PreserveAll:
+  case CC_PreserveNone:
   case CC_OpenCLKernel:
   case CC_AArch64VectorCall:
   case CC_AArch64SVEPCS:
diff --git a/clang/test/CodeGen/preserve-call-conv.c b/clang/test/CodeGen/preserve-call-conv.c
index 74bf695e6f331d..65973206403f70 100644
--- a/clang/test/CodeGen/preserve-call-conv.c
+++ b/clang/test/CodeGen/preserve-call-conv.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm < %s | FileCheck %s --check-prefixes=CHECK,X86-LINUX
-// RUN: %clang_cc1 -triple arm64-unknown-unknown -emit-llvm < %s | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm < %s | FileCheck %s --check-prefixes=CHECK,LINUX
+// RUN: %clang_cc1 -triple arm64-unknown-unknown -emit-llvm < %s | FileCheck %s --check-prefixes=CHECK,LINUX
 
 // RUN: %clang_cc1 -triple x86_64-unknown-windows-msvc -emit-llvm %s -o - | FileCheck %s
 // RUN: %clang_cc1 -triple aarch64-unknown-windows-msvc -emit-llvm %s -o - | FileCheck %s
@@ -23,5 +23,5 @@ void boo(void) __attribute__((preserve_all)) {
 // is lowered to the corresponding calling convention attrribute at the LLVM IR
 // level.
 void bar(void) __attribute__((preserve_none)) {
-  // X86-LINUX-LABEL: define {{(dso_local )?}}preserve_nonecc void @bar()
+  // LINUX-LABEL: define {{(dso_local )?}}preserve_nonecc void @bar()
 }
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 2077fdd841fcd6..1259cc568204f9 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -458,7 +458,7 @@ added in the future:
     registers to pass arguments. This attribute doesn't impact non-general
     purpose registers (e.g. floating point registers, on X86 XMMs/YMMs).
     Non-general purpose registers still follow the standard c calling
-    convention. Currently it is for x86_64 only.
+    convention. Currently it is for x86_64 and AArch64 only.
 "``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
     Clang generates an access function to access C++-style TLS. The access
     function generally has an entry block, an exit block and an initialization
diff --git a/llvm/lib/Target/AArch64/AArch64CallingConvention.h b/llvm/lib/Target/AArch64/AArch64CallingConvention.h
index 3b51ee12b7477e..63185a97cba03d 100644
--- a/llvm/lib/Target/AArch64/AArch64CallingConvention.h
+++ b/llvm/lib/Target/AArch64/AArch64CallingConvention.h
@@ -52,6 +52,9 @@ bool CC_AArch64_Arm64EC_CFGuard_Check(unsigned ValNo, MVT ValVT, MVT LocVT,
 bool CC_AArch64_GHC(unsigned ValNo, MVT ValVT, MVT LocVT,
                     CCValAssign::LocInfo LocInfo, ISD::ArgFlagsTy ArgFlags,
                     CCState &State);
+bool CC_AArch64_Preserve_None(unsigned ValNo, MVT ValVT, MVT LocVT,
+                              CCValAssign::LocInfo LocInfo,
+                              ISD::ArgFlagsTy ArgFlags, CCState &State);
 bool RetCC_AArch64_AAPCS(unsigned ValNo, MVT ValVT, MVT LocVT,
                          CCValAssign::LocInfo LocInfo, ISD::ArgFlagsTy ArgFlags,
                          CCState &State);
diff --git a/llvm/lib/Target/AArch64/AArch64CallingConvention.td b/llvm/lib/Target/AArch64/AArch64CallingConvention.td
index 8e67f0f5c8815f..7d24aae99356f6 100644
--- a/llvm/lib/Target/AArch64/AArch64CallingConvention.td
+++ b/llvm/lib/Target/AArch64/AArch64CallingConvention.td
@@ -494,6 +494,31 @@ def CC_AArch64_GHC : CallingConv<[
   CCIfType<[i64], CCAssignToReg<[X19, X20, X21, X22, X23, X24, X25, X26, X27, X28]>>
 ]>;
 
+let Entry = 1 in
+def CC_AArch64_Preserve_None : CallingConv<[
+    // We can pass arguments in all general registers, except:
+    // - X16/X17, used by the linker as IP0/IP1
+    // - X18, used for the 'nest' parameter
+    // - X29, the frame pointer
+    // - X30, the link register
+    // General registers are not preserved with the exception of
+    // FP, LR, and X18
+    // Non-volatile registers are used first, so functions may call
+    // normal functions without saving and reloading arguments.
+    CCIfType<[i32], CCAssignToReg<[W19, W20, W21, W22, W23,
+                                   W24, W25, W26, W27, W28,
+                                   W0, W1, W2, W3, W4, W5,
+                                   W6, W7, W8, W9, W10, W11,
+                                   W12, W13, W14, W15]>>,
+    CCIfType<[i64], CCAssignToReg<[X19, X20, X21, X22, X23,
+                                   X24, X25, X26, X27, X28,
+                                   X0, X1, X2, X3, X4, X5,
+                                   X6, X7, X8, X9, X10, X11,
+                                   X12, X13, X14, X15]>>,
+
+    CCDelegateTo<CC_AArch64_AAPCS>
+]>;
+
 // The order of the callee-saves in this file is important, because the
 // FrameLowering code will use this order to determine the layout the
 // callee-save area in the stack frame. As can be observed below, Darwin
@@ -606,6 +631,8 @@ def CSR_AArch64_AllRegs
 
 def CSR_AArch64_NoRegs : CalleeSavedRegs<(add)>;
 
+def CSR_AArch64_NoneRegs : CalleeSavedRegs<(add LR, FP)>;
+
 def CSR_AArch64_RT_MostRegs :  CalleeSavedRegs<(add CSR_AArch64_AAPCS,
                                                 (sequence "X%u", 9, 15))>;
 
@@ -681,6 +708,8 @@ def CSR_Darwin_AArch64_RT_AllRegs
 // These all preserve x18 in addition to any other registers.
 def CSR_AArch64_NoRegs_SCS
     : CalleeSavedRegs<(add CSR_AArch64_NoRegs, X18)>;
+def CSR_AArch64_NoneRegs_SCS
+    : CalleeSavedRegs<(add CSR_AArch64_NoneRegs, X18)>;
 def CSR_AArch64_AllRegs_SCS
     : CalleeSavedRegs<(add CSR_AArch64_AllRegs, X18)>;
 def CSR_AArch64_AAPCS_SwiftError_SCS
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index b27d204f3dded0..4e69d6c7f7ca95 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -6825,6 +6825,8 @@ CCAssignFn *AArch64TargetLowering::CCAssignFnForCall(CallingConv::ID CC,
     report_fatal_error("Unsupported calling convention.");
   case CallingConv::GHC:
     return CC_AArch64_GHC;
+  case CallingConv::PreserveNone:
+    return CC_AArch64_Preserve_None;
   case CallingConv::C:
   case CallingConv::Fast:
   case CallingConv::PreserveMost:
@@ -7348,6 +7350,20 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
     FuncInfo->setLazySaveTPIDR2Obj(TPIDR2Obj);
   }
 
+  if (CallConv == CallingConv::PreserveNone) {
+    for (const ISD::InputArg &I : Ins) {
+      if (I.Flags.isSwiftSelf() || I.Flags.isSwiftError() ||
+          I.Flags.isSwiftAsync()) {
+        MachineFunction &MF = DAG.getMachineFunction();
+        DAG.getContext()->diagnose(DiagnosticInfoUnsupported(
+            MF.getFunction(),
+            "Swift attributes can't be used with preserve_none",
+            DL.getDebugLoc()));
+        break;
+      }
+    }
+  }
+
   return Chain;
 }
 
@@ -7519,6 +7535,7 @@ static bool mayTailCallThisCC(CallingConv::ID CC) {
   case CallingConv::AArch64_SVE_VectorCall:
   case CallingConv::PreserveMost:
   case CallingConv::PreserveAll:
+  case CallingConv::PreserveNone:
   case CallingConv::Swift:
   case CallingConv::SwiftTail:
   case CallingConv::Tail:
@@ -7949,9 +7966,10 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
       ++NumTailCalls;
   }
 
-  if (!IsTailCall && CLI.CB && CLI.CB->isMustTailCall())
+  if (!IsTailCall && CLI.CB && CLI.CB->isMustTailCall()) {
     report_fatal_error("failed to perform tail call elimination on a call "
                        "site marked musttail");
+  }
 
   // Get a count of how many bytes are to be pushed on the stack.
   unsigned NumBytes = CCInfo.getStackSize();
@@ -8576,6 +8594,20 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     }
   }
 
+  if (CallConv == CallingConv::PreserveNone) {
+    for (const ISD::OutputArg &O : Outs) {
+      if (O.Flags.isSwiftSelf() || O.Flags.isSwiftError() ||
+          O.Flags.isSwiftAsync()) {
+        MachineFunction &MF = DAG.getMachineFunction();
+        DAG.getContext()->diagnose(DiagnosticInfoUnsupported(
+            MF.getFunction(),
+            "Swift attributes can't be used with preserve_none",
+            DL.getDebugLoc()));
+        break;
+      }
+    }
+  }
+
   return Result;
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
index ad29003f1e8173..570ea28646b628 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
@@ -75,6 +75,8 @@ AArch64RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
     // GHC set of callee saved regs is empty as all those regs are
     // used for passing STG regs around
     return CSR_AArch64_NoRegs_SaveList;
+  if (MF->getFunction().getCallingConv() == CallingConv::PreserveNone)
+    return CSR_AArch64_NoneRegs_SaveList;
   if (MF->getFunction().getCallingConv() == CallingConv::AnyReg)
     return CSR_AArch64_AllRegs_SaveList;
 
@@ -264,6 +266,9 @@ AArch64RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
   if (CC == CallingConv::GHC)
     // This is academic because all GHC calls are (supposed to be) tail calls
     return SCS ? CSR_AArch64_NoRegs_SCS_RegMask : CSR_AArch64_NoRegs_RegMask;
+  if (CC == CallingConv::PreserveNone)
+    return SCS ? CSR_AArch64_NoneRegs_SCS_RegMask
+               : CSR_AArch64_NoneRegs_RegMask;
   if (CC == CallingConv::AnyReg)
     return SCS ? CSR_AArch64_AllRegs_SCS_RegMask : CSR_AArch64_AllRegs_RegMask;
 
@@ -298,12 +303,11 @@ AArch64RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
   if (CC == CallingConv::PreserveMost)
     return SCS ? CSR_AArch64_RT_MostRegs_SCS_RegMask
                : CSR_AArch64_RT_MostRegs_RegMask;
-  else if (CC == CallingConv::PreserveAll)
+  if (CC == CallingConv::PreserveAll)
     return SCS ? CSR_AArch64_RT_AllRegs_SCS_RegMask
                : CSR_AArch64_RT_AllRegs_RegMask;
 
-  else
-    return SCS ? CSR_AArch64_AAPCS_SCS_RegMask : CSR_AArch64_AAPCS_RegMask;
+  return SCS ? CSR_AArch64_AAPCS_SCS_RegMask : CSR_AArch64_AAPCS_RegMask;
 }
 
 const uint32_t *AArch64RegisterInfo::getCustomEHPadPreservedMask(
@@ -588,6 +592,8 @@ bool AArch64RegisterInfo::isArgumentRegister(const MachineFunction &MF,
     report_fatal_error("Unsupported calling convention.");
   case CallingConv::GHC:
     return HasReg(CC_AArch64_GHC_ArgRegs, Reg);
+  case CallingConv::PreserveNone:
+    return HasReg(CC_AArch64_Preserve_None_ArgRegs, Reg);
   case CallingConv::C:
   case CallingConv::Fast:
   case CallingConv::PreserveMost:
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp b/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
index c4197ff73187af..2615ea7f81653b 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
@@ -782,6 +782,7 @@ static bool mayTailCallThisCC(CallingConv::ID CC) {
   case CallingConv::C:
   case CallingConv::PreserveMost:
   case CallingConv::PreserveAll:
+  case CallingConv::PreserveNone:
   case CallingConv::Swift:
   case CallingConv::SwiftTail:
   case CallingConv::Tail:
diff --git a/llvm/test/CodeGen/AArch64/dynamic-regmask-preserve-none.ll b/llvm/test/CodeGen/AArch64/dynamic-regmask-preserve-none.ll
new file mode 100644
index 00000000000000..2d4fefe82b9911
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/dynamic-regmask-preserve-none.ll
@@ -0,0 +1,88 @@
+; RUN: llc -mtriple=aarch64-apple-darwin -stop-after finalize-isel <%s | FileCheck %s
+
+; Check that the callee doesn't have calleeSavedRegisters.
+define preserve_nonecc i64 @callee1(i64 %a0, i64 %b0, i64 %c0, i64 %d0, i64 %e0) nounwind {
+  %a1 = mul i64 %a0, %b0
+  %a2 = mul i64 %a1, %c0
+  %a3 = mul i64 %a2, %d0
+  %a4 = mul i64 %a3, %e0
+  ret i64 %a4
+}
+; CHECK:     name: callee1
+; CHECK-NOT: calleeSavedRegisters:
+; CHECK:     RET_ReallyLR implicit $x0
+
+; Check that RegMask is csr_aarch64_noneregs.
+define i64 @caller1(i64 %a0) nounwind {
+  %b1 = call preserve_nonecc i64 @callee1(i64 %a0, i64 %a0, i64 %a0, i64 %a0, i64 %a0)
+  %b2 = add i64 %b1, %a0
+  ret i64 %b2
+}
+; CHECK:    name: caller1
+; CHECK:    BL @callee1, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR implicit $x0
+
+
+; Check that the callee doesn't have calleeSavedRegisters.
+define preserve_nonecc {i64, i64} @callee2(i64 %a0, i64 %b0, i64 %c0, i64 %d0, i64 %e0) nounwind {
+  %a1 = mul i64 %a0, %b0
+  %a2 = mul i64 %a1, %c0
+  %a3 = mul i64 %a2, %d0
+  %a4 = mul i64 %a3, %e0
+  %b4 = insertvalue {i64, i64} undef, i64 %a3, 0
+  %b5 = insertvalue {i64, i64} %b4, i64 %a4, 1
+  ret {i64, i64} %b5
+}
+; CHECK:     name: callee2
+; CHECK-NOT: calleeSavedRegisters:
+; CHECK:     RET_ReallyLR implicit $x0
+
+
+; Check that RegMask is csr_aarch64_noneregs.
+define {i64, i64} @caller2(i64 %a0) nounwind {
+  %b1 = call preserve_nonecc {i64, i64} @callee2(i64 %a0, i64 %a0, i64 %a0, i64 %a0, i64 %a0)
+  ret {i64, i64} %b1
+}
+; CHECK:    name: caller2
+; CHECK:    BL @callee2, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR implicit $x0
+
+
+%struct.Large = type { i64, double, double }
+
+; Declare the callee with a sret parameter.
+declare preserve_nonecc void @callee3(ptr noalias nocapture writeonly sret(%struct.Large) align 4 %a0, i64 %b0) nounwind;
+
+; Check that RegMask is csr_aarch64_noneregs.
+define void @caller3(i64 %a0) nounwind {
+  %a1 = alloca %struct.Large, align 8
+  call preserve_nonecc void @callee3(ptr nonnull sret(%struct.Large) align 8 %a1, i64 %a0)
+  ret void
+}
+; CHECK:    name: caller3
+; CHECK:    BL @callee3, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR
+
+
+; Check that the callee doesn't have calleeSavedRegisters.
+define preserve_nonecc {i64, double} @callee4(i64 %a0, i64 %b0, i64 %c0, i64 %d0, i64 %e0) nounwind {
+  %a1 = mul i64 %a0, %b0
+  %a2 = mul i64 %a1, %c0
+  %a3 = mul i64 %a2, %d0
+  %a4 = mul i64 %a3, %e0
+  %b4 = insertvalue {i64, double} undef, i64 %a3, 0
+  %b5 = insertvalue {i64, double} %b4, double 1.2, 1
+  ret {i64, double} %b5
+}
+; CHECK:     name: callee4
+; CHECK-NOT: calleeSavedRegisters:
+; CHECK:     RET_ReallyLR implicit $x0, implicit $d0
+
+; Check that RegMask is csr_aarch64_noneregs.
+define {i64, double} @caller4(i64 %a0) nounwind {
+  %b1 = call preserve_nonecc {i64, double} @callee4(i64 %a0, i64 %a0, i64 %a0, i64 %a0, i64 %a0)
+  ret {i64, double} %b1
+}
+; CHECK:    name: caller4
+; CHECK:    BL @callee4, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR implicit $x0, implicit $d0
diff --git a/llvm/test/CodeGen/AArch64/preserve.ll b/llvm/test/CodeGen/AArch64/preserve.ll
index d11a45144a9049..a8acdc1df97606 100644
--- a/llvm/test/CodeGen/AArch64/preserve.ll
+++ b/llvm/test/CodeGen/AArch64/preserve.ll
@@ -15,8 +15,15 @@ define preserve_allcc void @foo() #0 {
   call void @bar2()
   ret void
 }
+define preserve_nonecc void @qux() #0 {
+;  CHECK: qux Clobbered Registers: $ffr $fpcr $fpsr $nzcv $sp $vg $wsp $za $b0 $b1 $b2 $b3 $b4 $b5 $b6 $b7 $b16 $b17 $b18 $b19 $b20 $b21 $b22 $b23 $b24 $b25 $b26 $b27 $b28 $b29 $b30 $b31 $d0 $d1 $d2 $d3 $d4 $d5 $d6 $d7 $d16 $d17 $d18 $d19 $d20 $d21 $d22 $d23 $d24 $d25 $d26 $d27 $d28 $d29 $d30 $d31 $h0 $h1 $h2 $h3 $h4 $h5 $h6 $h7 $h16 $h17 $h18 $h19 $h20 $h21 $h22 $h23 $h24 $h25 $h26 $h27 $h28 $h29 $h30 $h31 $p0 $p1 $p2 $p3 $p4 $p5 $p6 $p7 $p8 $p9 $p10 $p11 $p12 $p13 $p14 $p15 $pn0 $pn1 $pn2 $pn3 $pn4 $pn5 $pn6 $pn7 $pn8 $pn9 $pn10 $pn11 $pn12 $pn13 $pn14 $pn15 $q0 $q1 $q2 $q3 $q4 $q5 $q6 $q7 $q8 $q9 $q10 $q11 $q12 $q13 $q14 $q15 $q16 $q17 $q18 $q19 $q20 $q21 $q22 $q23 $q24 $q25 $q26 $q27 $q28 $q29 $q30 $q31 $s0 $s1 $s2 $s3 $s4 $s5 $s6 $s7 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 $s28 $s29 $s30 $s31 $w0 $w1 $w2 $w3 $w4 $w5 $w6 $w7 $w8 $w9 $w10 $w11 $w12 $w13 $w14 $w15 $w16 $w17 $w18 $x0 $x1 $x2 $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x13 $x14 $x15 $x16 $x17 $x18 $z0 $z1 $z2 $z3 $z4 $z5 $z6 $z7 $z8 $z9 $z10 $z11 $z12 $z13 $z14 $z15 $z16 $z17 $z18 $z19 $z20 $z21 $z22 $z23 $z24 $z25 $z26 $z27 $z28 $z29 $z30 $z31 $zab0 $zad0 $zad1 $zad2 $zad3 $zad4 $zad5 $zad6 $zad7 $zah0 $zah1 $zaq0 $zaq1 $zaq2 $zaq3 $zaq4 $zaq5 $zaq6 $zaq7 $zaq8 $zaq9 $zaq10 $zaq11 $zaq12 $zaq13 $zaq14 $zaq15 $zas0 $zas1 $zas2 $zas3 $zt0 $d0_d1 $d1_d2 $d2_d3 $d3_d4 $d4_d5 $d5_d6 $d6_d7 $d7_d8 $d15_d16 $d16_d17 $d17_d18 $d18_d19 $d19_d20 $d20_d21 $d21_d22 $d22_d23 $d23_d24 $d24_d25 $d25_d26 $d26_d27 $d27_d28 $d28_d29 $d29_d30 $d30_d31 $d31_d0 $d0_d1_d2_d3 $d1_d2_d3_d4 $d2_d3_d4_d5 $d3_d4_d5_d6 $d4_d5_d6_d7 $d5_d6_d7_d8 $d6_d7_d8_d9 $d7_d8_d9_d10 $d13_d14_d15_d16 $d14_d15_d16_d17 $d15_d16_d17_d18 $d16_d17_d18_d19 $d17_d18_d19_d20 $d18_d19_d20_d21 $d19_d20_d21_d22 $d20_d21_d22_d23 $d21_d22_d23_d24 $d22_d23_d24_d25 $d23_d24_d25_d26 $d24_d25_d26_d27 $d25_d26_d27_d28 $d26_d27_d28_d29 $d27_d28_d29_d30 $d28_d29_d30_d31 $d29_d30_d31_d0 $d30_d31_d0_d1 $d31_d0_d1_d2 $d0_d1_d2 $d1_d2_d3 
$d2_d3_d4 $d3_d4_d5 $d4_d5_d6 $d5_d6_d7 $d6_d7_d8 $d7_d8_d9 $d14_d15_d16 $d15_d16_d17 $d16_d17_d18 $d17_d18_d19 $d18_d19_d20 $d19_d20_d21 $d20_d21_d22 $d21_d22_d23 $d22_d23_d24 $d23_d24_d25 $d24_d25_d26 $d25_d26_d27 $d26_d27_d28 $d27_d28_d29 $d28_d29_d30 $d29_d30_d31 $d30_d31_d0 $d31_d0_d1 $p0_p1 $p1_p2 $p2_p3 $p3_p4 $p4_p5 $p5_p6 $p6_p7 $p7_p8 $p8_p9 $p9_p10 $p10_p11 $p11_p12 $p12_p13 $p13_p14 $p14_p15 $p15_p0 $q0_q1 $q1_q2 $q2_q3 $q3_q4 $q4_q5 $q5_q6 $q6_q7 $q7_q8 $q8_q9 $q9_q10 $q10_q11 $q11_q12 $q12_q13 $q13_q14 $q14_q15 $q15_q16 $q16_q17 $q17_q18 $q18_q19 $q19_q20 $q20_q21 $q21_q22 $q22_q23 $q23_q24 $q24_q25 $q25_q26 $q26_q27 $q27_q28 $q28_q29 $q29_q30 $q30_q31 $q31_q0 $q0_q1_q2_q3 $q1_q2_q3_q4 $q2_q3_q4_q5 $q3_q4_q5_q6 $q4_q5_q6_q7 $q5_q6_q7_q8 $q6_q7_q8_q9 $q7_q8_q9_q10 $q8_q9_q10_q11 $q9_q10_q11_q12 $q10_q11_q12_q13 $q11_q12_q13_q14 $q12_q13_q14_q15 $q13_q14_q15_q16 $q14_q15_q16_q17 $q15_q16_q17_q18 $q16_q17_q18_q19 $q17_q18_q19_q20 $q18_q19_q20_q21 $q19_q20_q21_q22 $q20_q21_q22_q23 $q21_q22_q23_q24 $q22_q23_q24_q25 $q23_q24_q25_q26 $q24_q25_q26_q27 $q25_q26_q27...
[truncated]
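
The callee-saved sets added in the patch (`CSR_AArch64_NoneRegs` and `CSR_AArch64_NoneRegs_SCS`) can be pictured with the small model below. This is only a sketch: it assumes the `SCS` suffix refers to the shadow-call-stack variants, it uses a plain 32-bit mask rather than LLVM's real register-mask encoding, and every identifier in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* General-purpose register numbers used by this model. */
enum { REG_X18 = 18, REG_FP = 29, REG_LR = 30 };

/* Bit n set means Xn is preserved across a preserve_none call.
 * Mirrors (add LR, FP), plus X18 for the _SCS variant. */
uint32_t preserve_none_preserved_mask(bool shadow_call_stack)
{
    uint32_t mask = (UINT32_C(1) << REG_FP) | (UINT32_C(1) << REG_LR);
    if (shadow_call_stack)
        mask |= UINT32_C(1) << REG_X18;
    return mask;
}

/* True when register Xreg is preserved according to the given mask. */
bool is_preserved(uint32_t mask, unsigned reg)
{
    assert(reg <= 30);
    return ((mask >> reg) & 1u) != 0;
}
```

Under this model a caller has to treat every general register other than FP and LR (and X18 in the assumed shadow-call-stack case) as clobbered by the call, which matches the intent stated in the patch's comments.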

llvmbot commented May 5, 2024

@llvm/pr-subscribers-llvm-ir

+          I.Flags.isSwiftAsync()) {
+        MachineFunction &MF = DAG.getMachineFunction();
+        DAG.getContext()->diagnose(DiagnosticInfoUnsupported(
+            MF.getFunction(),
+            "Swift attributes can't be used with preserve_none",
+            DL.getDebugLoc()));
+        break;
+      }
+    }
+  }
+
   return Chain;
 }
 
@@ -7519,6 +7535,7 @@ static bool mayTailCallThisCC(CallingConv::ID CC) {
   case CallingConv::AArch64_SVE_VectorCall:
   case CallingConv::PreserveMost:
   case CallingConv::PreserveAll:
+  case CallingConv::PreserveNone:
   case CallingConv::Swift:
   case CallingConv::SwiftTail:
   case CallingConv::Tail:
@@ -7949,9 +7966,10 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
       ++NumTailCalls;
   }
 
-  if (!IsTailCall && CLI.CB && CLI.CB->isMustTailCall())
+  if (!IsTailCall && CLI.CB && CLI.CB->isMustTailCall()) {
     report_fatal_error("failed to perform tail call elimination on a call "
                        "site marked musttail");
+  }
 
   // Get a count of how many bytes are to be pushed on the stack.
   unsigned NumBytes = CCInfo.getStackSize();
@@ -8576,6 +8594,20 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     }
   }
 
+  if (CallConv == CallingConv::PreserveNone) {
+    for (const ISD::OutputArg &O : Outs) {
+      if (O.Flags.isSwiftSelf() || O.Flags.isSwiftError() ||
+          O.Flags.isSwiftAsync()) {
+        MachineFunction &MF = DAG.getMachineFunction();
+        DAG.getContext()->diagnose(DiagnosticInfoUnsupported(
+            MF.getFunction(),
+            "Swift attributes can't be used with preserve_none",
+            DL.getDebugLoc()));
+        break;
+      }
+    }
+  }
+
   return Result;
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
index ad29003f1e8173..570ea28646b628 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
@@ -75,6 +75,8 @@ AArch64RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
     // GHC set of callee saved regs is empty as all those regs are
     // used for passing STG regs around
     return CSR_AArch64_NoRegs_SaveList;
+  if (MF->getFunction().getCallingConv() == CallingConv::PreserveNone)
+    return CSR_AArch64_NoneRegs_SaveList;
   if (MF->getFunction().getCallingConv() == CallingConv::AnyReg)
     return CSR_AArch64_AllRegs_SaveList;
 
@@ -264,6 +266,9 @@ AArch64RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
   if (CC == CallingConv::GHC)
     // This is academic because all GHC calls are (supposed to be) tail calls
     return SCS ? CSR_AArch64_NoRegs_SCS_RegMask : CSR_AArch64_NoRegs_RegMask;
+  if (CC == CallingConv::PreserveNone)
+    return SCS ? CSR_AArch64_NoneRegs_SCS_RegMask
+               : CSR_AArch64_NoneRegs_RegMask;
   if (CC == CallingConv::AnyReg)
     return SCS ? CSR_AArch64_AllRegs_SCS_RegMask : CSR_AArch64_AllRegs_RegMask;
 
@@ -298,12 +303,11 @@ AArch64RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
   if (CC == CallingConv::PreserveMost)
     return SCS ? CSR_AArch64_RT_MostRegs_SCS_RegMask
                : CSR_AArch64_RT_MostRegs_RegMask;
-  else if (CC == CallingConv::PreserveAll)
+  if (CC == CallingConv::PreserveAll)
     return SCS ? CSR_AArch64_RT_AllRegs_SCS_RegMask
                : CSR_AArch64_RT_AllRegs_RegMask;
 
-  else
-    return SCS ? CSR_AArch64_AAPCS_SCS_RegMask : CSR_AArch64_AAPCS_RegMask;
+  return SCS ? CSR_AArch64_AAPCS_SCS_RegMask : CSR_AArch64_AAPCS_RegMask;
 }
 
 const uint32_t *AArch64RegisterInfo::getCustomEHPadPreservedMask(
@@ -588,6 +592,8 @@ bool AArch64RegisterInfo::isArgumentRegister(const MachineFunction &MF,
     report_fatal_error("Unsupported calling convention.");
   case CallingConv::GHC:
     return HasReg(CC_AArch64_GHC_ArgRegs, Reg);
+  case CallingConv::PreserveNone:
+    return HasReg(CC_AArch64_Preserve_None_ArgRegs, Reg);
   case CallingConv::C:
   case CallingConv::Fast:
   case CallingConv::PreserveMost:
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp b/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
index c4197ff73187af..2615ea7f81653b 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
@@ -782,6 +782,7 @@ static bool mayTailCallThisCC(CallingConv::ID CC) {
   case CallingConv::C:
   case CallingConv::PreserveMost:
   case CallingConv::PreserveAll:
+  case CallingConv::PreserveNone:
   case CallingConv::Swift:
   case CallingConv::SwiftTail:
   case CallingConv::Tail:
diff --git a/llvm/test/CodeGen/AArch64/dynamic-regmask-preserve-none.ll b/llvm/test/CodeGen/AArch64/dynamic-regmask-preserve-none.ll
new file mode 100644
index 00000000000000..2d4fefe82b9911
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/dynamic-regmask-preserve-none.ll
@@ -0,0 +1,88 @@
+; RUN: llc -mtriple=aarch64-apple-darwin -stop-after finalize-isel <%s | FileCheck %s
+
+; Check that the callee doesn't have calleeSavedRegisters.
+define preserve_nonecc i64 @callee1(i64 %a0, i64 %b0, i64 %c0, i64 %d0, i64 %e0) nounwind {
+  %a1 = mul i64 %a0, %b0
+  %a2 = mul i64 %a1, %c0
+  %a3 = mul i64 %a2, %d0
+  %a4 = mul i64 %a3, %e0
+  ret i64 %a4
+}
+; CHECK:     name: callee1
+; CHECK-NOT: calleeSavedRegisters:
+; CHECK:     RET_ReallyLR implicit $x0
+
+; Check that RegMask is csr_aarch64_noneregs.
+define i64 @caller1(i64 %a0) nounwind {
+  %b1 = call preserve_nonecc i64 @callee1(i64 %a0, i64 %a0, i64 %a0, i64 %a0, i64 %a0)
+  %b2 = add i64 %b1, %a0
+  ret i64 %b2
+}
+; CHECK:    name: caller1
+; CHECK:    BL @callee1, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR implicit $x0
+
+
+; Check that the callee doesn't have calleeSavedRegisters.
+define preserve_nonecc {i64, i64} @callee2(i64 %a0, i64 %b0, i64 %c0, i64 %d0, i64 %e0) nounwind {
+  %a1 = mul i64 %a0, %b0
+  %a2 = mul i64 %a1, %c0
+  %a3 = mul i64 %a2, %d0
+  %a4 = mul i64 %a3, %e0
+  %b4 = insertvalue {i64, i64} undef, i64 %a3, 0
+  %b5 = insertvalue {i64, i64} %b4, i64 %a4, 1
+  ret {i64, i64} %b5
+}
+; CHECK:     name: callee2
+; CHECK-NOT: calleeSavedRegisters:
+; CHECK:     RET_ReallyLR implicit $x0
+
+
+; Check that RegMask is csr_aarch64_noneregs.
+define {i64, i64} @caller2(i64 %a0) nounwind {
+  %b1 = call preserve_nonecc {i64, i64} @callee2(i64 %a0, i64 %a0, i64 %a0, i64 %a0, i64 %a0)
+  ret {i64, i64} %b1
+}
+; CHECK:    name: caller2
+; CHECK:    BL @callee2, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR implicit $x0
+
+
+%struct.Large = type { i64, double, double }
+
+; Declare the callee with a sret parameter.
+declare preserve_nonecc void @callee3(ptr noalias nocapture writeonly sret(%struct.Large) align 4 %a0, i64 %b0) nounwind;
+
+; Check that RegMask is csr_aarch64_noneregs.
+define void @caller3(i64 %a0) nounwind {
+  %a1 = alloca %struct.Large, align 8
+  call preserve_nonecc void @callee3(ptr nonnull sret(%struct.Large) align 8 %a1, i64 %a0)
+  ret void
+}
+; CHECK:    name: caller3
+; CHECK:    BL @callee3, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR
+
+
+; Check that the callee doesn't have calleeSavedRegisters.
+define preserve_nonecc {i64, double} @callee4(i64 %a0, i64 %b0, i64 %c0, i64 %d0, i64 %e0) nounwind {
+  %a1 = mul i64 %a0, %b0
+  %a2 = mul i64 %a1, %c0
+  %a3 = mul i64 %a2, %d0
+  %a4 = mul i64 %a3, %e0
+  %b4 = insertvalue {i64, double} undef, i64 %a3, 0
+  %b5 = insertvalue {i64, double} %b4, double 1.2, 1
+  ret {i64, double} %b5
+}
+; CHECK:     name: callee4
+; CHECK-NOT: calleeSavedRegisters:
+; CHECK:     RET_ReallyLR implicit $x0, implicit $d0
+
+; Check that RegMask is csr_aarch64_noneregs.
+define {i64, double} @caller4(i64 %a0) nounwind {
+  %b1 = call preserve_nonecc {i64, double} @callee4(i64 %a0, i64 %a0, i64 %a0, i64 %a0, i64 %a0)
+  ret {i64, double} %b1
+}
+; CHECK:    name: caller4
+; CHECK:    BL @callee4, csr_aarch64_noneregs
+; CHECK:    RET_ReallyLR implicit $x0, implicit $d0
diff --git a/llvm/test/CodeGen/AArch64/preserve.ll b/llvm/test/CodeGen/AArch64/preserve.ll
index d11a45144a9049..a8acdc1df97606 100644
--- a/llvm/test/CodeGen/AArch64/preserve.ll
+++ b/llvm/test/CodeGen/AArch64/preserve.ll
@@ -15,8 +15,15 @@ define preserve_allcc void @foo() #0 {
   call void @bar2()
   ret void
 }
+define preserve_nonecc void @qux() #0 {
+;  CHECK: qux Clobbered Registers: $ffr $fpcr $fpsr $nzcv $sp $vg $wsp $za $b0 $b1 $b2 $b3 $b4 $b5 $b6 $b7 $b16 $b17 $b18 $b19 $b20 $b21 $b22 $b23 $b24 $b25 $b26 $b27 $b28 $b29 $b30 $b31 $d0 $d1 $d2 $d3 $d4 $d5 $d6 $d7 $d16 $d17 $d18 $d19 $d20 $d21 $d22 $d23 $d24 $d25 $d26 $d27 $d28 $d29 $d30 $d31 $h0 $h1 $h2 $h3 $h4 $h5 $h6 $h7 $h16 $h17 $h18 $h19 $h20 $h21 $h22 $h23 $h24 $h25 $h26 $h27 $h28 $h29 $h30 $h31 $p0 $p1 $p2 $p3 $p4 $p5 $p6 $p7 $p8 $p9 $p10 $p11 $p12 $p13 $p14 $p15 $pn0 $pn1 $pn2 $pn3 $pn4 $pn5 $pn6 $pn7 $pn8 $pn9 $pn10 $pn11 $pn12 $pn13 $pn14 $pn15 $q0 $q1 $q2 $q3 $q4 $q5 $q6 $q7 $q8 $q9 $q10 $q11 $q12 $q13 $q14 $q15 $q16 $q17 $q18 $q19 $q20 $q21 $q22 $q23 $q24 $q25 $q26 $q27 $q28 $q29 $q30 $q31 $s0 $s1 $s2 $s3 $s4 $s5 $s6 $s7 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 $s28 $s29 $s30 $s31 $w0 $w1 $w2 $w3 $w4 $w5 $w6 $w7 $w8 $w9 $w10 $w11 $w12 $w13 $w14 $w15 $w16 $w17 $w18 $x0 $x1 $x2 $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x13 $x14 $x15 $x16 $x17 $x18 $z0 $z1 $z2 $z3 $z4 $z5 $z6 $z7 $z8 $z9 $z10 $z11 $z12 $z13 $z14 $z15 $z16 $z17 $z18 $z19 $z20 $z21 $z22 $z23 $z24 $z25 $z26 $z27 $z28 $z29 $z30 $z31 $zab0 $zad0 $zad1 $zad2 $zad3 $zad4 $zad5 $zad6 $zad7 $zah0 $zah1 $zaq0 $zaq1 $zaq2 $zaq3 $zaq4 $zaq5 $zaq6 $zaq7 $zaq8 $zaq9 $zaq10 $zaq11 $zaq12 $zaq13 $zaq14 $zaq15 $zas0 $zas1 $zas2 $zas3 $zt0 $d0_d1 $d1_d2 $d2_d3 $d3_d4 $d4_d5 $d5_d6 $d6_d7 $d7_d8 $d15_d16 $d16_d17 $d17_d18 $d18_d19 $d19_d20 $d20_d21 $d21_d22 $d22_d23 $d23_d24 $d24_d25 $d25_d26 $d26_d27 $d27_d28 $d28_d29 $d29_d30 $d30_d31 $d31_d0 $d0_d1_d2_d3 $d1_d2_d3_d4 $d2_d3_d4_d5 $d3_d4_d5_d6 $d4_d5_d6_d7 $d5_d6_d7_d8 $d6_d7_d8_d9 $d7_d8_d9_d10 $d13_d14_d15_d16 $d14_d15_d16_d17 $d15_d16_d17_d18 $d16_d17_d18_d19 $d17_d18_d19_d20 $d18_d19_d20_d21 $d19_d20_d21_d22 $d20_d21_d22_d23 $d21_d22_d23_d24 $d22_d23_d24_d25 $d23_d24_d25_d26 $d24_d25_d26_d27 $d25_d26_d27_d28 $d26_d27_d28_d29 $d27_d28_d29_d30 $d28_d29_d30_d31 $d29_d30_d31_d0 $d30_d31_d0_d1 $d31_d0_d1_d2 $d0_d1_d2 $d1_d2_d3 
$d2_d3_d4 $d3_d4_d5 $d4_d5_d6 $d5_d6_d7 $d6_d7_d8 $d7_d8_d9 $d14_d15_d16 $d15_d16_d17 $d16_d17_d18 $d17_d18_d19 $d18_d19_d20 $d19_d20_d21 $d20_d21_d22 $d21_d22_d23 $d22_d23_d24 $d23_d24_d25 $d24_d25_d26 $d25_d26_d27 $d26_d27_d28 $d27_d28_d29 $d28_d29_d30 $d29_d30_d31 $d30_d31_d0 $d31_d0_d1 $p0_p1 $p1_p2 $p2_p3 $p3_p4 $p4_p5 $p5_p6 $p6_p7 $p7_p8 $p8_p9 $p9_p10 $p10_p11 $p11_p12 $p12_p13 $p13_p14 $p14_p15 $p15_p0 $q0_q1 $q1_q2 $q2_q3 $q3_q4 $q4_q5 $q5_q6 $q6_q7 $q7_q8 $q8_q9 $q9_q10 $q10_q11 $q11_q12 $q12_q13 $q13_q14 $q14_q15 $q15_q16 $q16_q17 $q17_q18 $q18_q19 $q19_q20 $q20_q21 $q21_q22 $q22_q23 $q23_q24 $q24_q25 $q25_q26 $q26_q27 $q27_q28 $q28_q29 $q29_q30 $q30_q31 $q31_q0 $q0_q1_q2_q3 $q1_q2_q3_q4 $q2_q3_q4_q5 $q3_q4_q5_q6 $q4_q5_q6_q7 $q5_q6_q7_q8 $q6_q7_q8_q9 $q7_q8_q9_q10 $q8_q9_q10_q11 $q9_q10_q11_q12 $q10_q11_q12_q13 $q11_q12_q13_q14 $q12_q13_q14_q15 $q13_q14_q15_q16 $q14_q15_q16_q17 $q15_q16_q17_q18 $q16_q17_q18_q19 $q17_q18_q19_q20 $q18_q19_q20_q21 $q19_q20_q21_q22 $q20_q21_q22_q23 $q21_q22_q23_q24 $q22_q23_q24_q25 $q23_q24_q25_q26 $q24_q25_q26_q27 $q25_q26_q27...
[truncated]

@antangelo
Contributor Author

> I don't think you can use x16 and x17 for argument passing due to them being reserved for PLTs and call veneers. That is if the linker decides to create a branch island or if the function is called via a PLT, x16 and x17 will be clobbered on the call so arguments using that will also won't work.

Thanks for catching this, I've removed them from the argument passing list.
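
To make the concern concrete, here is a toy C++ model of a call that goes through a linker-inserted veneer. It is only a sketch: RegFile, callThroughVeneer, and argumentSurvivesVeneer are hypothetical names invented for this illustration, and the veneer is assumed to clobber only X16 (a PLT stub may also clobber X17).

```cpp
#include <array>
#include <cstdint>

// Toy register file: regs[n] models Xn. Purely illustrative.
using RegFile = std::array<std::uint64_t, 31>;

// Hypothetical veneer: like a branch island, it materialises the far target
// address in X16 (IP0) before branching, destroying whatever the caller had
// placed there.
inline void callThroughVeneer(RegFile &regs, std::uint64_t targetAddress) {
  regs[16] = targetAddress;
}

// Returns true if an argument the caller placed in argReg is still intact
// when control reaches the callee.
inline bool argumentSurvivesVeneer(unsigned argReg, std::uint64_t argValue,
                                   std::uint64_t targetAddress) {
  RegFile regs{};
  regs[argReg] = argValue;
  callThroughVeneer(regs, targetAddress);
  return regs[argReg] == argValue;
}
```

In this model an argument placed in X15 reaches the callee unchanged, while one placed in X16 is overwritten by the veneer, which is why X16/X17 cannot be in the argument list.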

be used to pass function arguments. Floating-point registers (XMMs/YMMs) still
follow the C calling convention.
- On AArch64, only LR and FP are preserved by the callee.
Registers X19-X28 and X0-X17 are used to pass function arguments.

Even though you removed X16/X17, you seemingly forgot to update the documentation.

CCIfType<[i64], CCAssignToReg<[X19, X20, X21, X22, X23,
X24, X25, X26, X27, X28,
X0, X1, X2, X3, X4, X5,
X6, X7, X8, X9, X10, X11,
Contributor

X8 is used to pass the SRet parameter in AArch64_Common, so it cannot be used as a general argument register.
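
As a rough illustration of the resulting assignment order, the following C++ sketch maps the Nth i64 argument to a register number. It assumes the list from the PR summary (X19-X28 first, then X0-X7, then X9-X15, with X8, X16, X17 and X18 skipped); preserveNoneArgRegs and regForArg are hypothetical helpers, not LLVM APIs.

```cpp
#include <cstddef>
#include <vector>

// Ordered i64 argument registers assumed for this sketch:
// X19-X28 first, then X0-X7, then X9-X15.
inline std::vector<unsigned> preserveNoneArgRegs() {
  std::vector<unsigned> order;
  for (unsigned r = 19; r <= 28; ++r) order.push_back(r);
  for (unsigned r = 0; r <= 7; ++r) order.push_back(r);
  for (unsigned r = 9; r <= 15; ++r) order.push_back(r);
  return order;
}

// Register number for the argument at the given zero-based index, or -1 once
// the list is exhausted (the convention then delegates to the AAPCS rules,
// which this sketch does not model).
inline int regForArg(std::size_t index) {
  const std::vector<unsigned> order = preserveNoneArgRegs();
  if (index < order.size()) return static_cast<int>(order[index]);
  return -1;
}
```

Under these assumptions the first argument lands in X19, the eleventh in X0, and X8 never appears, so an sret pointer in X8 cannot collide with an ordinary argument.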

@antangelo
Contributor Author

Friendly ping

@antangelo antangelo requested a review from weiguozhi May 15, 2024 05:24
Contributor

@weiguozhi weiguozhi left a comment

Looks good to me.
Please wait for an AArch64 maintainer's approval.

@antangelo
Contributor Author

Friendly ping

Member

@DanielKristofKiss DanielKristofKiss left a comment

Thanks for the PR. Just nits; otherwise LGTM.

@@ -7949,9 +7966,10 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
       ++NumTailCalls;
   }

-  if (!IsTailCall && CLI.CB && CLI.CB->isMustTailCall())
+  if (!IsTailCall && CLI.CB && CLI.CB->isMustTailCall()) {
Member

An if with a single statement doesn't need braces.


; When calling a preserve_none function, all live registers must be saved and
; restored around the function call.
declare preserve_nonecc RETTYPE @bar(i64, i64, double, double)
Member

NIT: maybe easier to follow like:

Suggested change
declare preserve_nonecc RETTYPE @bar(i64, i64, double, double)
declare preserve_nonecc RETTYPE @preserve_nonecc2(i64, i64, double, double)
define void @bar() nounwind {

// We can pass arguments in all general registers, except:
// - X8, used for sret
// - X16/X17, used by the linker as IP0/IP1
// - X18, used for the 'nest' parameter
Member

X18 is the Platform Register
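
For context, the callee-saved sets added in the diff (CSR_AArch64_NoneRegs and its _SCS variant) can be pictured as a bitmask. The C++ below is a minimal sketch of that idea, not LLVM code: bit, preservedMask, and isPreserved are hypothetical helpers, and bit n simply stands for Xn (X29 is FP, X30 is LR).

```cpp
#include <cstdint>

// Bit n of the mask stands for register Xn in this toy model.
constexpr std::uint32_t bit(unsigned reg) { return 1u << reg; }

// Registers a preserve_none callee keeps intact in this model: FP (X29) and
// LR (X30), plus X18 when the shadow-call-stack variant is selected.
constexpr std::uint32_t preservedMask(bool shadowCallStack) {
  return bit(29) | bit(30) | (shadowCallStack ? bit(18) : 0u);
}

// True if the callee must leave Xreg unchanged across the call.
constexpr bool isPreserved(unsigned reg, bool shadowCallStack) {
  return (preservedMask(shadowCallStack) & bit(reg)) != 0;
}
```

Every other general-purpose register is treated as clobbered, which is why a caller has to save any live values around a preserve_none call.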

- Label X18 as the platform register in the calling convention comment
- Fix braces on if with single statement
- Update test function naming to be easier to follow
@antangelo antangelo merged commit ae1596a into llvm:main Jun 3, 2024
8 checks passed
@efriedma-quic
Collaborator

X19 is the base register; can we actually allocate arguments in it in general? This seems hard to fix.

It looks like frame lowering assumes X9 is available; that's probably fixable, but the code needs to be reworked, I think.

X15 is used on Windows for stack allocation; I think you can use it in this context, but probably worth a test to verify the interaction works the way you want. (And more generally, that this calling convention doesn't explode on Windows targets.)

What's the interaction between varargs and preserve_none?

@antangelo
Contributor Author

Thank you for catching these.

> X19 is the base register; can we actually allocate arguments in it in general? This seems hard to fix.

I will remove X19 from the argument passing list. ghccc also uses X19 for argument passing, and I didn't run into issues while testing this PR, so I'm not sure under what conditions issues present for this but it makes sense to remove it regardless.

> It looks like frame lowering assumes X9 is available; that's probably fixable, but the code needs to be reworked, I think.

The function that allocates X9 as scratch has a fallback provision for when the basic block is not an entry block, where it checks for liveness before picking a scratch register if X9 is unavailable. I think we can modify this so that preserve_none functions always follow this path, and then X9 should be usable for argument passing.
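
The fallback described above can be modelled roughly as follows. This is a toy C++ sketch under stated assumptions, not the frame-lowering code: LiveSet and pickScratchRegister are hypothetical names, and the X9-X15 candidate range is chosen only for illustration.

```cpp
#include <bitset>

// Bit n set means Xn is live on entry to the block. Toy model only.
using LiveSet = std::bitset<31>;

// Pick a scratch GPR: prefer X9, but only if it is not live. Otherwise take
// the first free register among X9-X15 (an assumed candidate set).
// Returns -1 if every candidate is live.
inline int pickScratchRegister(const LiveSet &liveIn) {
  constexpr int Preferred = 9;
  if (!liveIn.test(Preferred)) return Preferred;
  for (int r = 9; r <= 15; ++r)
    if (!liveIn.test(r)) return r;
  return -1;
}
```

If preserve_none functions always took a liveness-checked path like this, an argument arriving in X9 would mark it live and the prologue would pick a different scratch register instead of overwriting it.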

> X15 is used on Windows for stack allocation; I think you can use it in this context, but probably worth a test to verify the interaction works the way you want. (And more generally, that this calling convention doesn't explode on Windows targets.)

It looks like X15 is clobbered so I don't think it will be usable on Windows. Do you think it should be excluded from argument passing only on Windows or always?

As an aside, it seems like I missed adding the calling convention to the Windows target in clang, so this is currently only reachable from IR directly.

> What's the interaction between varargs and preserve_none?

I don't believe the RFC defines an interaction between varargs and preserve_none. Given that the primary use case is tail calls (which don't support varargs), I imagine it's unlikely to be used with varargs at all. It does not seem to work correctly on either AArch64 with this implementation or the X86_64 reference implementation, so we should probably treat it as unsupported.

I will prepare a patch to address the above issues. Do you think this warrants a revert of this PR in the meantime?

@efriedma-quic
Collaborator

Since nothing should be using preserve_none on aarch64 yet, it's probably fine to just fix forward.

> I'm not sure under what conditions issues present for this

We only allocate a base pointer under restricted circumstances (primarily, functions with dynamic allocation), so maybe GHC never tripped over the issues.

> It looks like X15 is clobbered so I don't think it will be usable on Windows. Do you think it should be excluded from argument passing only on Windows or always?

You probably could make this work if you really wanted to: you can allocate small amounts of stack without __chkstk, which you could then use to spill X15. Not sure if that's the right tradeoff here.

@diegorusso

We (the CPython project) will use it for the new JIT. I'll be testing the current implementation and will report back any issues. I don't think you need to revert the PR, but please let me know when you create the new one to address the post-merge comments so I can test it again.

Thanks

@antangelo
Contributor Author

I have posted the follow-up changes in PR #96259. I have left the frontend changes required for preserve_none to be usable from clang on Windows for a separate patch, since they will require some changes to mangling (which are also required for X86).

Labels
backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:ir
Development

Successfully merging this pull request may close these issues.

Support preserve_none calling convention on AArch64 targets
8 participants