Skip to content

Conversation

liusy58
Copy link
Contributor

@liusy58 liusy58 commented Dec 26, 2024

(1)Use CFI directives OpDefCfaOffset 0 to identify indirect tail call like br x*. OpDefCfaOffset 0 implies that the stack frame from the caller to the target function remains unchanged. So if a br x* instruction is preceded by a OpDefCfaOffset 0 directive, we can conclude that it is an indirect tail call, under the assumption that no other instructions have modified the stack pointer (sp) between the OpDefCfaOffset 0 directive and the br x* instruction. (2)If there are no instructions manipulating the stack within a function, we can conclude that br x* is an indirect tail call.

@llvmbot
Copy link
Member

llvmbot commented Dec 26, 2024

@llvm/pr-subscribers-bolt

Author: Nicholas (liusy58)

Changes

(1)Use CFI directives OpDefCfaOffset 0 to identify indirect tail call like br x*. OpDefCfaOffset 0 implies that the stack frame from the caller to the target function remains unchanged. So if a br x* instruction is preceded by a OpDefCfaOffset 0 directive, we can conclude that it is an indirect tail call, under the assumption that no other instructions have modified the stack pointer (sp) between the OpDefCfaOffset 0 directive and the br x* instruction. (2)If there are no instructions manipulating the stack within a function, we can conclude that br x* is an indirect tail call.


Full diff: https://github.com/llvm/llvm-project/pull/121146.diff

4 Files Affected:

  • (modified) bolt/include/bolt/Core/MCPlusBuilder.h (+5)
  • (modified) bolt/lib/Core/BinaryFunction.cpp (+33)
  • (modified) bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp (+10)
  • (modified) bolt/lib/Target/X86/X86MCPlusBuilder.cpp (+11)
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 3634fed9757ceb..b4a284fba19a6e 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -610,6 +610,11 @@ class MCPlusBuilder {
 
   virtual bool isLeave(const MCInst &Inst) const { return false; }
 
+  virtual bool hasUseOrDefofSPOrFP(const MCInst &Inst) const {
+    llvm_unreachable("not implemented");
+    return false;
+  }
+
   virtual bool isADRP(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
     return false;
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 1c5cd62a095b24..e137effce37dcb 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -1961,6 +1961,39 @@ bool BinaryFunction::postProcessIndirectBranches(
       bool IsEpilogue = llvm::any_of(BB, [&](const MCInst &Instr) {
         return BC.MIB->isLeave(Instr) || BC.MIB->isPop(Instr);
       });
+      // Any adr instruction of aarch64 will generate a new entry,
+      // Adr instruction cannt afford to do any optimizations
+      if (!IsEpilogue && !isMultiEntry()) {
+        BinaryBasicBlock::iterator LastDefCFAOffsetInstIter = BB.end();
+        // find the last OpDefCfaOffset 0 instruction.
+        for (BinaryBasicBlock::iterator Iter = BB.begin(); Iter != BB.end();
+             ++Iter) {
+          if (&*Iter == &Instr) {
+            break;
+          }
+          if (BC.MIB->isCFI(*Iter)) {
+            const MCCFIInstruction *CFIInst = BB.getParent()->getCFIFor(*Iter);
+            if ((CFIInst->getOperation() == MCCFIInstruction::OpDefCfaOffset) &&
+                (CFIInst->getOffset() == 0)) {
+              LastDefCFAOffsetInstIter = Iter;
+              break;
+            }
+          }
+        }
+        if (LastDefCFAOffsetInstIter != BB.end()) {
+          IsEpilogue = true;
+          // make sure there is no instruction manipulating sp between the two
+          // instructions
+          BinaryBasicBlock::iterator Iter = LastDefCFAOffsetInstIter;
+          while (&*Iter != &Instr) {
+            if (BC.MIB->hasUseOrDefofSPOrFP(*Iter)) {
+              IsEpilogue = false;
+              break;
+            }
+            ++Iter;
+          }
+        }
+      }
       if (IsEpilogue) {
         BC.MIB->convertJmpToTailCall(Instr);
         BB.removeAllSuccessors();
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 7e08e5c81d26ff..91a8f755b9aa47 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -1790,6 +1790,16 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
   }
 
   uint16_t getMinFunctionAlignment() const override { return 4; }
+
+  bool hasUseOrDefofSPOrFP(const MCInst &Inst) const override {
+    if (isPseudo(Inst) || isNoop(Inst) || isCFI(Inst)) {
+      return false;
+    }
+    return hasDefOfPhysReg(Inst, AArch64::SP) ||
+           hasUseOfPhysReg(Inst, AArch64::SP) ||
+           hasDefOfPhysReg(Inst, AArch64::FP) ||
+           hasUseOfPhysReg(Inst, AArch64::FP);
+  }
 };
 
 } // end anonymous namespace
diff --git a/bolt/lib/Target/X86/X86MCPlusBuilder.cpp b/bolt/lib/Target/X86/X86MCPlusBuilder.cpp
index 63086c06d74fd9..ad998d8601e26f 100644
--- a/bolt/lib/Target/X86/X86MCPlusBuilder.cpp
+++ b/bolt/lib/Target/X86/X86MCPlusBuilder.cpp
@@ -89,6 +89,17 @@ class X86MCPlusBuilder : public MCPlusBuilder {
 public:
   using MCPlusBuilder::MCPlusBuilder;
 
+  virtual bool hasUseOrDefofSPOrFP(const MCInst &Inst) const override {
+    bool IsLoad, IsStore, IsStoreFromReg, IsSimple, IsIndexed;
+    MCPhysReg Reg;
+    int32_t SrcImm;
+    uint16_t StackPtrReg;
+    int64_t StackOffset;
+    uint8_t Size;
+    return isStackAccess(Inst, IsLoad, IsStore, IsStoreFromReg, Reg, SrcImm,
+                         StackPtrReg, StackOffset, Size, IsSimple, IsIndexed);
+  }
+
   std::unique_ptr<MCSymbolizer>
   createTargetSymbolizer(BinaryFunction &Function,
                          bool CreateNewSymbols) const override {

@liusy58 liusy58 force-pushed the indirect_tail_call branch 3 times, most recently from 3b1a64e to 9721ee8 Compare December 26, 2024 10:44
@liusy58 liusy58 force-pushed the indirect_tail_call branch 2 times, most recently from 8125e73 to a8baf01 Compare January 17, 2025 08:11
Copy link

github-actions bot commented Jan 17, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@liusy58 liusy58 force-pushed the indirect_tail_call branch 2 times, most recently from b69a82c to 99dde03 Compare January 17, 2025 08:49
AArch64 instructions have a fixed size 4 bytes, no need to compute.
@liusy58 liusy58 force-pushed the indirect_tail_call branch from 99dde03 to d016f30 Compare January 17, 2025 08:54
@liusy58 liusy58 closed this Jan 17, 2025
@liusy58 liusy58 deleted the indirect_tail_call branch January 17, 2025 09:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants