@mstorsjo mstorsjo commented Oct 8, 2025

When catching a pointer type by reference, Clang generates a special code pattern that accesses the exception data directly, skipping past the _Unwind_Exception header manually (rather than using the return value of __cxa_begin_catch).
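As an illustration of the source-level construct in question, here is a minimal sketch of catching a pointer by reference. The function name `catch_pointer_by_reference` is hypothetical and this is not the reproducer from the linked issue; it only shows the kind of handler for which the header size matters.

```cpp
#include <cassert>

// Hypothetical helper: throws an `int*` and catches it by reference.
// Returns the value read through the caught pointer.
int catch_pointer_by_reference() {
  static int value = 42;
  try {
    throw &value;          // the thrown object has type `int*`
  } catch (int *&p) {      // reference to the pointer stored in the exception object
    assert(p == &value);   // the reference must see the pointer that was thrown
    return *p;
  }
  return -1;               // not reached
}
```

If the compiler assumes the wrong header size, the reference `p` in a handler like this would be bound to the wrong address.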

On most platforms, _Unwind_Exception is 32 bytes, but in some configurations it's different. (ARM EHABI is one preexisting case.) With SEH it also differs: it is 48 bytes in 32-bit mode and 64 bytes in 64-bit mode. (See the SEH ifdef in _Unwind_Exception in clang/lib/Headers/unwind.h.)
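To make the consequence of a wrong size concrete, here is a toy model, not Clang or libunwind code. It assumes a byte buffer holding a header of `header_size` bytes followed directly by the thrown pointer; both helper names are hypothetical. Reading with the correct offset recovers the pointer, while reading with a too-small offset lands inside the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy model: `header_size` zeroed header bytes, then the thrown pointer value.
std::vector<unsigned char> make_exception_buffer(std::size_t header_size,
                                                 const void *payload) {
  std::vector<unsigned char> buf(header_size + sizeof(payload), 0);
  std::memcpy(buf.data() + header_size, &payload, sizeof(payload));
  return buf;
}

// Reads a pointer at the offset the "compiler" believes the header ends at.
const void *read_payload(const std::vector<unsigned char> &buf,
                         std::size_t assumed_header_size) {
  assert(assumed_header_size + sizeof(void *) <= buf.size());
  const void *p = nullptr;
  std::memcpy(&p, buf.data() + assumed_header_size, sizeof(p));
  return p;
}
```

With a 64-byte header, `read_payload(buf, 64)` yields the stored pointer, whereas `read_payload(buf, 32)` reads header bytes instead.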

Handle this case in TargetCodeGenInfo::getSizeOfUnwindException, fixing the code generation for catching pointers by reference.
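The selection added by this change can be modelled in isolation as follows. This is a standalone sketch, not Clang's API: the function name and parameters are hypothetical, and it deliberately ignores the per-target overrides (such as ARM EHABI) that exist elsewhere.

```cpp
#include <cassert>

// Standalone model of the size selection; not Clang's actual interface.
// `has_seh_exceptions` stands in for the SEH exception-model check and
// `pointer_bits` for the target pointer width.
unsigned size_of_unwind_exception(bool has_seh_exceptions,
                                  unsigned pointer_bits) {
  assert(pointer_bits == 32 || pointer_bits == 64);
  if (has_seh_exceptions)
    return pointer_bits > 32 ? 64 : 48;  // SEH: 64 bytes (64-bit), 48 bytes (32-bit)
  return 32;                             // the common default on most platforms
}
```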

This fixes mstorsjo/llvm-mingw#522.

@llvmbot added the clang (Clang issues not falling into any other category) and clang:codegen (IR generation bugs: mangling, exceptions, etc.) labels Oct 8, 2025
llvmbot commented Oct 8, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Martin Storsjö (mstorsjo)

Full diff: https://github.com/llvm/llvm-project/pull/162546.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/TargetInfo.cpp (+2)
  • (modified) clang/test/CodeGenCXX/sizeof-unwind-exception.cpp (+8)
diff --git a/clang/lib/CodeGen/TargetInfo.cpp b/clang/lib/CodeGen/TargetInfo.cpp
index 1e58c3f217812..342a3af0ac1ee 100644
--- a/clang/lib/CodeGen/TargetInfo.cpp
+++ b/clang/lib/CodeGen/TargetInfo.cpp
@@ -82,6 +82,8 @@ TargetCodeGenInfo::~TargetCodeGenInfo() = default;
 // If someone can figure out a general rule for this, that would be great.
 // It's probably just doomed to be platform-dependent, though.
 unsigned TargetCodeGenInfo::getSizeOfUnwindException() const {
+  if (getABIInfo().getCodeGenOpts().hasSEHExceptions())
+    return getABIInfo().getDataLayout().getPointerSizeInBits() > 32 ? 64 : 48;
   // Verified for:
   //   x86-64     FreeBSD, Linux, Darwin
   //   x86-32     FreeBSD, Linux, Darwin
diff --git a/clang/test/CodeGenCXX/sizeof-unwind-exception.cpp b/clang/test/CodeGenCXX/sizeof-unwind-exception.cpp
index 4fb977a5367e7..e40b2d7ae43ea 100644
--- a/clang/test/CodeGenCXX/sizeof-unwind-exception.cpp
+++ b/clang/test/CodeGenCXX/sizeof-unwind-exception.cpp
@@ -3,6 +3,8 @@
 // RUN: %clang_cc1 -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions %s -O2 -o - | FileCheck %s --check-prefix=ARM-DARWIN
 // RUN: %clang_cc1 -triple arm-unknown-gnueabi -emit-llvm -fcxx-exceptions -fexceptions %s -O2 -o - | FileCheck %s --check-prefix=ARM-EABI
 // RUN: %clang_cc1 -triple mipsel-unknown-unknown -emit-llvm -fcxx-exceptions -fexceptions %s -O2 -o - | FileCheck %s --check-prefix=MIPS
+// RUN: %clang_cc1 -triple x86_64-windows-gnu -emit-llvm -fcxx-exceptions -fexceptions -exception-model=seh %s -O2 -o - | FileCheck %s --check-prefix=MINGW-X86-64
+// RUN: %clang_cc1 -triple thumbv7-windows-gnu -emit-llvm -fcxx-exceptions -fexceptions -exception-model=seh %s -O2 -o - | FileCheck %s --check-prefix=MINGW-ARMV7
 
 void foo();
 void test() {
@@ -25,9 +27,15 @@ void test() {
 // ARM-EABI-NEXT:   [[T1:%.*]] = getelementptr i8, ptr [[EXN]], i32 88
 // MIPS:            [[T0:%.*]] = tail call ptr @__cxa_begin_catch(ptr [[EXN:%.*]]) [[NUW:#[0-9]+]]
 // MIPS-NEXT:       [[T1:%.*]] = getelementptr i8, ptr [[EXN]], i32 24
+// MINGW-X86-64:     [[T0:%.*]] = tail call ptr @__cxa_begin_catch(ptr [[EXN:%.*]]) [[NUW:#[0-9]+]]
+// MINGW-X86-64-NEXT:[[T1:%.*]] = getelementptr i8, ptr [[EXN]], i64 64
+// MINGW-ARMV7:      [[T0:%.*]] = tail call arm_aapcs_vfpcc ptr @__cxa_begin_catch(ptr [[EXN:%.*]]) [[NUW:#[0-9]+]]
+// MINGW-ARMV7-NEXT: [[T1:%.*]] = getelementptr i8, ptr [[EXN]], i32 48
 
 // X86-64: attributes [[NUW]] = { nounwind }
 // X86-32: attributes [[NUW]] = { nounwind }
 // ARM-DARWIN: attributes [[NUW]] = { nounwind }
 // ARM-EABI: attributes [[NUW]] = { nounwind }
 // MIPS: attributes [[NUW]] = { nounwind }
+// MINGW-X86-64: attributes [[NUW]] = { nounwind }
+// MINGW-ARMV7: attributes [[NUW]] = { nounwind }

@efriedma-quic

> (See the SEH ifdef in _Unwind_Exception in clang/lib/Headers/unwind.h.)

The header also checks that __USING_SJLJ_EXCEPTIONS__ is not defined. Is that just redundant?

mstorsjo commented Oct 8, 2025

> > (See the SEH ifdef in _Unwind_Exception in clang/lib/Headers/unwind.h.)
>
> The header also checks that __USING_SJLJ_EXCEPTIONS__ is not defined. Is that just redundant?

Possibly, yes. I think those ifdefs stem from the corresponding cases in GCC's unwind.h, and we've tried to stay compatible with that. I'm not sure if it's possible to somehow end up having both __USING_SJLJ_EXCEPTIONS__ and __SEH__ defined at the same time there (but if it is, the interpretation is that SJLJ takes precedence). Within Clang I don't think that's possible; the ExceptionHandling field in CodeGenOptions is an enum that can only have one of the values at a time at least.

@efriedma-quic efriedma-quic left a comment

@rjmccall Can you briefly look to make sure this makes sense? (This is Itanium unwind on Windows.)

For why we're doing this, see 5add20c ... but I don't really have enough context to say much beyond the explanation in the code.

Successfully merging this pull request may close these issues.

Incompatible behavior when catching exceptions