Sterling-Augustine
Contributor

Some cfi_escape directives don't affect sframe unwind info, some do. Detect those cases appropriately, following gnu-gas for most cases.

Using a full-blown dwarf expression parser allows for somewhat more precise error detection than other sframe implementations. So this code is less conservative for long and more involved expressions. It could be made even more permissive.

@llvmbot added the labels llvm:mc (Machine (object) code) and bazel ("Peripheral" support tier build system: utils/bazel) on Oct 4, 2025
@llvmbot
Member

llvmbot commented Oct 4, 2025

@llvm/pr-subscribers-llvm-mc

Author: None (Sterling-Augustine)

Changes

Some cfi_escape directives don't affect sframe unwind info, some do. Detect those cases appropriately, following gnu-gas for most cases.

Using a full-blown dwarf expression parser allows for somewhat more precise error detection than other sframe implementations. So this code is less conservative for long and more involved expressions. It could be made even more permissive.


Full diff: https://github.com/llvm/llvm-project/pull/161927.diff

5 Files Affected:

  • (modified) llvm/lib/MC/CMakeLists.txt (+2-1)
  • (modified) llvm/lib/MC/MCSFrame.cpp (+147-2)
  • (added) llvm/test/MC/ELF/cfi-sframe-cfi-escape-errors.s (+37)
  • (added) llvm/test/MC/ELF/cfi-sframe-cfi-escape.s (+43)
  • (modified) utils/bazel/llvm-project-overlay/llvm/BUILD.bazel (+1)
diff --git a/llvm/lib/MC/CMakeLists.txt b/llvm/lib/MC/CMakeLists.txt
index 1e1d0a6d00601..70c4577aeec0a 100644
--- a/llvm/lib/MC/CMakeLists.txt
+++ b/llvm/lib/MC/CMakeLists.txt
@@ -73,9 +73,10 @@ add_llvm_component_library(LLVMMC
   ${LLVM_MAIN_INCLUDE_DIR}/llvm/MC
 
   LINK_COMPONENTS
+  BinaryFormat
+  DebugInfoDWARFLowLevel
   Support
   TargetParser
-  BinaryFormat
 
   DEPENDS
   intrinsics_gen
diff --git a/llvm/lib/MC/MCSFrame.cpp b/llvm/lib/MC/MCSFrame.cpp
index d6fa54c087ca3..c00262471f123 100644
--- a/llvm/lib/MC/MCSFrame.cpp
+++ b/llvm/lib/MC/MCSFrame.cpp
@@ -8,6 +8,8 @@
 
 #include "llvm/MC/MCSFrame.h"
 #include "llvm/BinaryFormat/SFrame.h"
+#include "llvm/DebugInfo/DWARF/LowLevel/DWARFCFIProgram.h"
+#include "llvm/DebugInfo/DWARF/LowLevel/DWARFDataExtractorSimple.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCObjectFileInfo.h"
@@ -211,6 +213,148 @@ class SFrameEmitterImpl {
     return true;
   }
 
+  // Technically, the escape data could be anything, but it is commonly a dwarf
+  // CFI program. Even then, it could contain an arbitrarily complicated Dwarf
+  // expression. Following gnu-gas, look for certain common cases that could
+  // invalidate an FDE, emit a warning for those sequences, and don't generate
+  // an FDE in those cases. Allow any that are known safe. It is likely that
+  // more thorough test cases could refine this code, but it handles the most
+  // important ones compatibly with gas.
+  bool IsCFIEscapeSafe(SFrameFDE &FDE, const SFrameFRE &FRE,
+                       const MCCFIInstruction &CFI) {
+    const MCAsmInfo *AI = Streamer.getContext().getAsmInfo();
+    DWARFDataExtractorSimple data(CFI.getValues(), AI->isLittleEndian(),
+                                  AI->getCodePointerSize());
+
+    // Normally, both alignment factors are extracted from the enclosing Dwarf
+    // FDE or CIE. We don't have one here. Alignments are used for scaling
+    // factors for ops like CFA_def_cfa_offset_sf. But this particular function
+    // is only interested in registers.
+    dwarf::CFIProgram P(/* CodeAlignmentFactor */ 1,
+                        /* DataAlignmentFactor*/ 1,
+                        Streamer.getContext().getTargetTriple().getArch());
+    uint64_t Offset = 0;
+    if (P.parse(data, &Offset, CFI.getValues().size())) {
+      // Not a parsable dwarf expression. Assume the worst.
+      Streamer.getContext().reportWarning(
+          CFI.getLoc(),
+          "skipping SFrame FDE; .cfi_escape with unknown effects");
+      return false;
+    }
+
+    // This loop deals with dwarf::CFIProgram::Instructions. Everywhere
+    // else, this file deals with MCCFIInstructions.
+    for (const dwarf::CFIProgram::Instruction &I : P) {
+      switch (I.Opcode) {
+      case dwarf::DW_CFA_nop:
+        break;
+      case dwarf::DW_CFA_val_offset: {
+        // First argument is a register. Anything that touches CFA, FP, or RA is
+        // a problem, but allow others through. As an even more special case,
+        // allow SP + 0.
+        auto Reg = I.getOperandAsUnsigned(P, 0);
+        if (!Reg) {
+          Streamer.getContext().reportWarning(
+              CFI.getLoc(),
+              "skipping SFrame FDE; .cfi_escape with unknown effects");
+        }
+        bool SPOk = true;
+        if (*Reg == SPReg) {
+          auto Opnd = I.getOperandAsSigned(P, 1);
+          if (!Opnd || *Opnd != 0)
+            SPOk = false;
+        }
+        if (!SPOk || *Reg == RAReg || *Reg == FPReg) {
+          StringRef RN = *Reg == SPReg
+                             ? "SP reg "
+                             : (*Reg == FPReg ? "FP reg " : "RA reg ");
+          Streamer.getContext().reportWarning(
+              CFI.getLoc(),
+              Twine(
+                  "skipping SFrame FDE; .cfi_escape DW_CFA_val_offset with ") +
+                  RN + Twine(*Reg));
+          return false;
+        }
+      } break;
+      case dwarf::DW_CFA_expression: {
+        // First argument is a register. Anything that touches CFA, FP, or RA is
+        // a problem, but allow others through.
+        auto Reg = I.getOperandAsUnsigned(P, 0);
+        if (!Reg) {
+          Streamer.getContext().reportWarning(
+              CFI.getLoc(),
+              "skipping SFrame FDE; .cfi_escape with unknown effects");
+          return false;
+        }
+        if (*Reg == SPReg || *Reg == RAReg || *Reg == FPReg) {
+          StringRef RN = *Reg == SPReg
+                             ? "SP reg "
+                             : (*Reg == FPReg ? "FP reg " : "RA reg ");
+          Streamer.getContext().reportWarning(
+              CFI.getLoc(),
+              Twine(
+                  "skipping SFrame FDE; .cfi_escape DW_CFA_expression with ") +
+                  RN + Twine(*Reg));
+          return false;
+        }
+      } break;
+      case dwarf::DW_CFA_GNU_args_size:
+        if (FRE.Info.getBaseRegister() != BaseReg::FP) {
+          Streamer.getContext().reportWarning(
+              CFI.getLoc(),
+              Twine("skipping SFrame FDE; .cfi_escape DW_CFA_GNU_args_size "
+                    "with non frame-pointer CFA"));
+          return false;
+        }
+        break;
+      // Cases that gas doesn't specially handle. TODO: Some of these could be
+      // analyzed and handled instead of just punting. But these are uncommon,
+      // or should be written as normal cfi directives. Some will need fixes to
+      // the scaling factor.
+      case dwarf::DW_CFA_advance_loc:
+      case dwarf::DW_CFA_offset:
+      case dwarf::DW_CFA_restore:
+      case dwarf::DW_CFA_set_loc:
+      case dwarf::DW_CFA_advance_loc1:
+      case dwarf::DW_CFA_advance_loc2:
+      case dwarf::DW_CFA_advance_loc4:
+      case dwarf::DW_CFA_offset_extended:
+      case dwarf::DW_CFA_restore_extended:
+      case dwarf::DW_CFA_undefined:
+      case dwarf::DW_CFA_same_value:
+      case dwarf::DW_CFA_register:
+      case dwarf::DW_CFA_remember_state:
+      case dwarf::DW_CFA_restore_state:
+      case dwarf::DW_CFA_def_cfa:
+      case dwarf::DW_CFA_def_cfa_register:
+      case dwarf::DW_CFA_def_cfa_offset:
+      case dwarf::DW_CFA_def_cfa_expression:
+      case dwarf::DW_CFA_offset_extended_sf:
+      case dwarf::DW_CFA_def_cfa_sf:
+      case dwarf::DW_CFA_def_cfa_offset_sf:
+      case dwarf::DW_CFA_val_offset_sf:
+      case dwarf::DW_CFA_val_expression:
+      case dwarf::DW_CFA_MIPS_advance_loc8:
+      case dwarf::DW_CFA_AARCH64_negate_ra_state_with_pc:
+      case dwarf::DW_CFA_AARCH64_negate_ra_state:
+      case dwarf::DW_CFA_LLVM_def_aspace_cfa:
+      case dwarf::DW_CFA_LLVM_def_aspace_cfa_sf:
+        Streamer.getContext().reportWarning(
+            CFI.getLoc(), "skipping SFrame FDE; .cfi_escape "
+                          "CFA expression with unknown side effects");
+        return false;
+      default:
+        // Dwarf expression was only partially valid, and user could have
+        // written anything.
+        Streamer.getContext().reportWarning(
+            CFI.getLoc(),
+            "skipping SFrame FDE; .cfi_escape with unknown effects");
+        return false;
+      }
+    }
+    return true;
+  }
+
   // Add the effects of CFI to the current FDE, creating a new FRE when
   // necessary.
   bool handleCFI(SFrameFDE &FDE, SFrameFRE &FRE, const MCCFIInstruction &CFI) {
@@ -265,8 +409,9 @@ class SFrameEmitterImpl {
       FRE = FDE.SaveState.pop_back_val();
       return true;
     case MCCFIInstruction::OpEscape:
-      // TODO: Implement. Will use FDE.
-      return true;
+      // This is a string of bytes that contains an arbitrary dwarf-expression
+      // that may or may not affect unwind info.
+      return IsCFIEscapeSafe(FDE, FRE, CFI);
     default:
       // Instructions that don't affect the CFA, RA, and SP can be safely
       // ignored.
diff --git a/llvm/test/MC/ELF/cfi-sframe-cfi-escape-errors.s b/llvm/test/MC/ELF/cfi-sframe-cfi-escape-errors.s
new file mode 100644
index 0000000000000..1304d5b7d1f57
--- /dev/null
+++ b/llvm/test/MC/ELF/cfi-sframe-cfi-escape-errors.s
@@ -0,0 +1,37 @@
+# RUN: llvm-mc --filetype=obj --gsframe -triple x86_64 %s -o %t.o 2>&1 | FileCheck %s
+# RUN: llvm-readelf --sframe %t.o | FileCheck %s --check-prefix=NOFDES
+
+# Tests that .cfi_escape sequences that are unrepresentable in sframe warn
+# and do not produce FDEs.
+
+        .align 1024
+cfi_escape_sp:
+        .cfi_startproc
+        .long 0
+# Setting SP via other registers makes it unrepresentable in sframe
+# DW_CFA_expression,reg 0x7,length 2,DW_OP_breg6,SLEB(-8)
+# CHECK: skipping SFrame FDE; .cfi_escape DW_CFA_expression with SP reg 7
+	.cfi_escape 0x10, 0x7, 0x2, 0x76, 0x78
+        .long 0
+.cfi_endproc
+
+cfi_escape_args_sp:
+	.cfi_startproc
+	.long 0
+# DW_CFA_GNU_args_size is not OK if cfa is SP
+# CHECK: skipping SFrame FDE; .cfi_escape DW_CFA_GNU_args_size with non frame-pointer CFA
+	.cfi_escape 0x2e, 0x20
+	.cfi_endproc
+
+cfi_escape_val_offset:
+	.cfi_startproc
+	.long 0
+	.cfi_def_cfa_offset 16
+# DW_CFA_val_offset,rbp,ULEB scaled offset(16)
+# CHECK: skipping SFrame FDE;  .cfi_escape DW_CFA_val_offset with FP reg 6
+	.cfi_escape 0x14,0x6,0x2
+	.long 0
+	.cfi_endproc
+
+
+# NOFDES: Num FDEs: 0
diff --git a/llvm/test/MC/ELF/cfi-sframe-cfi-escape.s b/llvm/test/MC/ELF/cfi-sframe-cfi-escape.s
new file mode 100644
index 0000000000000..73120157b46d5
--- /dev/null
+++ b/llvm/test/MC/ELF/cfi-sframe-cfi-escape.s
@@ -0,0 +1,43 @@
+# RUN: llvm-mc --filetype=obj --gsframe -triple x86_64 %s -o %t.o
+# RUN: llvm-readelf --sframe %t.o | FileCheck %s
+
+# Tests that .cfi_escape sequences that are are ok to pass through work.
+
+        .align 1024
+cfi_escape_ok:
+	.cfi_startproc
+	.long 0
+	.cfi_def_cfa_offset 16
+        # Uninteresting register
+# DW_CFA_expression,reg 0xc,length 2,DW_OP_breg6,SLEB(-8)
+	.cfi_escape 0x10,0xc,0x2,0x76,0x78
+# DW_CFA_nop
+	.cfi_escape 0x0
+	.cfi_escape 0x0,0x0,0x0,0x0
+        # Uninteresting register
+# DW_CFA_val_offset,reg 0xc,ULEB scaled offset
+	.cfi_escape 0x14,0xc,0x4
+	.long 0
+	.cfi_endproc
+
+cfi_escape_gnu_args_fp:
+	.cfi_startproc
+	.long 0
+	.cfi_def_cfa_register 6
+	.long 0
+# DW_CFA_GNU_args_size is OK if cfa is FP
+	.cfi_escape 0x2e, 0x20
+	.cfi_endproc
+
+cfi_escape_long_expr:
+	.cfi_startproc
+	.long 0
+	.cfi_def_cfa_offset 16
+# This is a long, but valid, dwarf expression without sframe
+# implications. An FDE can still be created.
+# DW_CFA_val_offset,rcx,ULEB scaled offset(16), DW_CFA_expression,r10,length 2,DW_OP_breg6,SLEB(-8)
+	.cfi_escape 0x14,0x2,0x2,0x10,0xa,0x2,0x76,0x78
+	.long 0
+	.cfi_endproc
+
+# CHECK: Num FDEs: 3
diff --git a/utils/bazel/llvm-project-overlay/llvm/BUILD.bazel b/utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
index 8f607c7ce087e..04bd20d69d94e 100644
--- a/utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+++ b/utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
@@ -577,6 +577,7 @@ cc_library(
     deps = [
         ":BinaryFormat",
         ":DebugInfoCodeView",
+        ":DebugInfoDWARFLowLevel",
         ":Support",
         ":TargetParser",
         ":config",

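For readers decoding the `.cfi_escape` byte strings in the tests by hand, the following is a minimal standalone sketch of how the leading operands can be read. It is not the LLVM `dwarf::CFIProgram` parser used by the patch; `DecodedEscape`, `readULEB128`, and `decodeEscape` are hypothetical names introduced here for illustration. The opcode values (0x10, 0x14, 0x2e) and the ULEB128 encoding are taken from the DWARF specification and the GNU extension. The sketch only looks at the first instruction in the payload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Opcode bytes as they appear in the tests (values from the DWARF spec).
constexpr uint8_t kExpression = 0x10;  // DW_CFA_expression
constexpr uint8_t kValOffset = 0x14;   // DW_CFA_val_offset
constexpr uint8_t kGNUArgsSize = 0x2e; // DW_CFA_GNU_args_size

// Hypothetical result type: the opcode plus up to two ULEB128 operands.
struct DecodedEscape {
  uint8_t Opcode;
  uint64_t First;  // register number, or the size for DW_CFA_GNU_args_size
  uint64_t Second; // scaled offset or expression block length; 0 if unused
};

// Reads one ULEB128 value at Pos and advances Pos. Returns nullopt if the
// input ends before the terminating byte (high bit clear) is seen.
std::optional<uint64_t> readULEB128(const std::vector<uint8_t> &Bytes,
                                    size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (Pos < Bytes.size()) {
    uint8_t B = Bytes[Pos++];
    if (Shift < 64)
      Value |= static_cast<uint64_t>(B & 0x7f) << Shift;
    if ((B & 0x80) == 0)
      return Value;
    Shift += 7;
  }
  return std::nullopt;
}

// Decodes only the first instruction of an escape payload. Anything this
// sketch does not recognize is reported as nullopt.
std::optional<DecodedEscape> decodeEscape(const std::vector<uint8_t> &Bytes) {
  if (Bytes.empty())
    return std::nullopt;
  size_t Pos = 1;
  DecodedEscape D{Bytes[0], 0, 0};
  switch (D.Opcode) {
  case kExpression:
  case kValOffset: {
    auto Reg = readULEB128(Bytes, Pos);
    auto Second = readULEB128(Bytes, Pos);
    if (!Reg || !Second)
      return std::nullopt;
    D.First = *Reg;
    D.Second = *Second;
    // For DW_CFA_expression the second operand is a block length, and the
    // whole block must be present in the payload.
    if (D.Opcode == kExpression && Bytes.size() - Pos < D.Second)
      return std::nullopt;
    return D;
  }
  case kGNUArgsSize: {
    auto Size = readULEB128(Bytes, Pos);
    if (!Size)
      return std::nullopt;
    D.First = *Size;
    return D;
  }
  default:
    return std::nullopt;
  }
}
```

Under these assumptions, `decodeEscape({0x10, 0x7, 0x2, 0x76, 0x78})` yields opcode 0x10, register 7, and a 2-byte expression block, which matches the comment in the error test for `cfi_escape_sp`.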
@Sterling-Augustine Sterling-Augustine requested review from jh7370, labath, vonosmas and weiguozhi and removed request for aaronmondal, keith and rupprecht October 4, 2025 00:17
Streamer.getContext().reportWarning(
CFI.getLoc(), "skipping SFrame FDE; .cfi_escape "
"CFA expression with unknown side effects");
return false;
Contributor

The false return value will cause buildSFDE to abandon the whole function's unwinding information. I think it's too conservative. I expect the typical usage of cfi_escape looks like

...
.cfi_remember_state
.cfi_escape <some_arbitrary_expression>
...
.cfi_restore_state
...

Usually the impacted region (between cfi_remember_state and cfi_restore_state) is small, and we can still build the precise unwinding information outside of this region.

Contributor Author

I agree that it is very conservative. Unfortunately, the format doesn't have a way to represent "no data" in FREs, so there isn't a good way to mark a region as uncovered.

One could, I suppose, split the FDE itself, leaving the invalid region uncovered that way. But that is a big complication.
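As a rough illustration of what splitting would involve at the address-range level, here is a minimal sketch that subtracts unsafe regions from a function's range and returns the pieces that could still be covered. Everything in it is hypothetical: `Range` and `coveredRanges` are not LLVM or SFrame names, and the sketch ignores the harder parts of the problem, such as carrying CFA state across the gap and emitting a separate FDE for each piece.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open address range [first, second). Hypothetical type for illustration.
using Range = std::pair<uint64_t, uint64_t>;

// Returns the parts of Func that are not inside any Unsafe range. Unsafe
// ranges may overlap each other or extend past the function boundaries.
std::vector<Range> coveredRanges(Range Func, std::vector<Range> Unsafe) {
  std::sort(Unsafe.begin(), Unsafe.end());
  std::vector<Range> Out;
  uint64_t Cursor = Func.first; // start of the next candidate covered piece
  for (const Range &U : Unsafe) {
    // Clip the unsafe range to the function.
    uint64_t Lo = std::max(U.first, Func.first);
    uint64_t Hi = std::min(U.second, Func.second);
    if (Lo >= Hi)
      continue; // empty, or entirely outside the function
    if (Lo > Cursor)
      Out.push_back({Cursor, Lo});
    Cursor = std::max(Cursor, Hi);
  }
  if (Cursor < Func.second)
    Out.push_back({Cursor, Func.second});
  return Out;
}
```

For a function at [0, 100) with one unsafe region at [40, 60), this returns [0, 40) and [60, 100), so each remaining piece would need its own FDE.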

Contributor

What's the behavior of gcc/gas in this case?

Contributor Author

gas recognizes a small number of common but important cases, and allows those through. For anything it doesn't recognize it omits the entire FDE. The set is hard coded.
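The policy this patch applies per instruction can be summarized in a small standalone sketch. This is a simplified restatement of the `IsCFIEscapeSafe` logic from the diff, not gas's code and not the LLVM implementation itself. `CfaBase`, `TrackedRegs`, `X86_64Regs`, and `isEscapeOpSafe` are hypothetical names. The register numbers 7 (rsp), 6 (rbp), and 16 (return address) are the x86-64 DWARF numbers.

```cpp
#include <cassert>
#include <cstdint>

// Opcode values from the DWARF spec and the GNU extension.
constexpr uint8_t DW_CFA_nop = 0x00;
constexpr uint8_t DW_CFA_expression = 0x10;
constexpr uint8_t DW_CFA_val_offset = 0x14;
constexpr uint8_t DW_CFA_GNU_args_size = 0x2e;

// Which register the current FRE uses as the CFA base.
enum class CfaBase { SP, FP };

// DWARF numbers of the registers SFrame tracks (hypothetical helper type).
struct TrackedRegs {
  uint64_t SP, FP, RA;
};
constexpr TrackedRegs X86_64Regs{7, 6, 16};

// Returns true if one escaped CFA instruction can be ignored for SFrame
// purposes, and false if the whole FDE has to be skipped.
bool isEscapeOpSafe(uint8_t Opcode, uint64_t Reg, int64_t Offset,
                    CfaBase Base, const TrackedRegs &T) {
  const bool Tracked = Reg == T.SP || Reg == T.FP || Reg == T.RA;
  switch (Opcode) {
  case DW_CFA_nop:
    return true;
  case DW_CFA_val_offset:
    // SP + 0 is allowed as a special case; any other rule for a tracked
    // register is not representable.
    if (Reg == T.SP)
      return Offset == 0;
    return !Tracked;
  case DW_CFA_expression:
    return !Tracked;
  case DW_CFA_GNU_args_size:
    // Only safe when the CFA is based on the frame pointer.
    return Base == CfaBase::FP;
  default:
    // Everything else is treated as having unknown effects.
    return false;
  }
}
```

With this sketch, a `DW_CFA_expression` on register 12 is accepted while the same instruction on register 7 is rejected, which matches the two test files in this PR.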

We have a strong opportunity here to be much less conservative, and we already are with long expressions. It's actually easier for us to accept unproblematic long expressions than to bail out like gas does. We definitely have room for improvement, and I expect us to get less conservative over time.

Contributor

Then it's OK with me.

I think the ability to skip the escape region is more important than interpreting more expressions. But that would require changes to the FRE format, right?

Contributor

Unresolving this to attract @ibhagatgnu's attention.

@ibhagatgnu ibhagatgnu Oct 10, 2025

You are right. The SFrame format does not support holes. The thinking was that a format with no holes is simpler to manage and reason about, and there was no strong reason to support them.

As for the patterns for cfi_escape, yes, gas support has itself evolved over time and we have accommodated simple patterns.

# RUN: llvm-mc --filetype=obj --gsframe -triple x86_64 %s -o %t.o
# RUN: llvm-readelf --sframe %t.o | FileCheck %s

# Tests that .cfi_escape sequences that are are ok to pass through work.
Contributor

redundant are

Labels

bazel "Peripheral" support tier build system: utils/bazel llvm:mc Machine (object) code

4 participants