Cannot catch exceptions on Apple M1 #7541

Open
hahnjo opened this issue Mar 16, 2021 · 18 comments

Comments

@hahnjo
Member

hahnjo commented Mar 16, 2021

Describe the bug

It's not possible to catch exceptions in the interactive root prompt on Apple Silicon, nor does TRint take care of uncaught exceptions.

Expected behavior

The user should be able to catch exceptions, or at least the fallback handler should prevent process termination.

To Reproduce

The simplest examples are

root [0] try { throw 1; } catch (...) { }
libc++abi.dylib: terminating with uncaught exception of type int

and

root [0] throw 1;
libc++abi.dylib: terminating with uncaught exception of type int

(which should be handled in TRint::HandleTermInput()).
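The fallback described above amounts to wrapping the evaluation of each prompt line in a catch-all handler, so that a stray exception is reported instead of reaching `std::terminate`. A minimal sketch of that idea follows; `evaluateGuarded` is a hypothetical helper invented for illustration and is not ROOT's `TRint` code.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of ROOT: runs one unit of user code and
// converts any escaping exception into a diagnostic string.
// Returns an empty string when no exception escaped.
std::string evaluateGuarded(const std::function<void()> &userCode) {
  try {
    userCode();
  } catch (const std::exception &e) {
    return std::string("caught std::exception: ") + e.what();
  } catch (...) {
    // Non-standard exception types such as `throw 1;` end up here.
    return "caught unknown exception";
  }
  return "";
}
```

Calling `evaluateGuarded([] { throw 1; })` from compiled code returns the "unknown exception" diagnostic. The bug reported here is that the equivalent handler never gets a chance to run when the throwing frame was produced by the JIT.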

Setup

ROOT 6.25/01 on macphsft25

Additional context

Plenty:

Edit:

Same in 6.22/07, tested on macphsft24

@pcanal
Member

pcanal commented Mar 16, 2021

This reminds me of https://sft.its.cern.ch/jira/browse/ROOT-8544 and https://sft.its.cern.ch/jira/browse/ROOT-8523, which in the end were fixed by a7b0b3e.

Most likely the way the interpreter sets up the stack frames does not match the expectation of the exception handler (usually implemented in (g)libc).

It is very plausibly a problem similar to the one leading to the "can not reallocate code" errors. So I see two plausible paths forward: (a) fix the reallocate code error and hope that it also fixes this, or (b) install a debug version of (g)libc and trace/debug the exception handler's handling of this case ...

@hahnjo hahnjo added this to the 6.24/00 milestone Mar 17, 2021
@hahnjo
Member Author

hahnjo commented Mar 18, 2021

Ugh, this one might become tricky, even LLVM upstream is unable to handle exceptions during JIT: I tested the most basic

int main() {
  try {
    throw 1;
  } catch (...) { }
}

compiled with ./bin/clang++ -S -emit-llvm throw.cc and interpreted using ./bin/lli throw.ll, resulting in

libc++abi.dylib: terminating with uncaught exception of type int
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ./bin/lli throw.ll
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  lli                      0x000000010104824c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  lli                      0x00000001010471b4 llvm::sys::RunSignalHandlers() + 128
2  lli                      0x00000001010488c4 SignalHandler(int) + 304
3  libsystem_platform.dylib 0x00000001826b1c44 _sigtramp + 56
4  libsystem_pthread.dylib  0x0000000182669c24 pthread_kill + 292
5  libsystem_c.dylib        0x00000001825b1864 abort + 104
6  libc++abi.dylib          0x0000000182629cf8 __cxxabiv1::__aligned_malloc_with_fallback(unsigned long) + 0
7  libc++abi.dylib          0x000000018261ae4c demangling_unexpected_handler() + 0
8  libobjc.A.dylib          0x00000001825136d8 _objc_terminate() + 160
9  libc++abi.dylib          0x00000001826290e0 std::__terminate(void (*)()) + 20
10 libc++abi.dylib          0x000000018262beb0 __cxa_get_exception_ptr + 0
11 libc++abi.dylib          0x000000018262be5c __cxxabiv1::exception_cleanup_func(_Unwind_Reason_Code, _Unwind_Exception*) + 0
12 libc++abi.dylib          0x000000010224003c __cxxabiv1::exception_cleanup_func(_Unwind_Reason_Code, _Unwind_Exception*) + 18446744071557956064
13 lli                      0x0000000100c8f680 llvm::MCJIT::runFunction(llvm::Function*, llvm::ArrayRef<llvm::GenericValue>) + 768
14 lli                      0x0000000100bd574c llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, char const* const*) + 1464
15 lli                      0x000000010070e4d8 main + 8576
16 libdyld.dylib            0x0000000182685f34 start + 4
zsh: abort      ./bin/lli throw.ll

It could be that Apple changed something about their exception handling ABI for arm64 because it's not even possible to build my own libunwind to debug the unwinding as I did for ROOT-10703 - it complains about invalid entries. However clang++ emits the right object files, at least when compiling to an executable directly. So maybe it's at runtime? I'll need to think about this...

@pcanal
Member

pcanal commented Mar 18, 2021

even LLVM upstream is unable to handle exceptions during JIT

I am not too surprised. It is the same code (+/- a few things ;)).

So maybe it's at runtime? I

That is what it was last time (in JIT code the instructions space being allocated in an unexpected order).

@hahnjo
Member Author

hahnjo commented Mar 23, 2021

even LLVM upstream is unable to handle exceptions during JIT

I am not too surprised. It is the same code (+/- a few things ;)).

True, but I had hoped that it was fixed in current trunk and it was just a matter of finding and backporting a change or two to make it work. Anyway my investigations aren't going anywhere right now, so I've filed a bug at https://bugs.llvm.org/show_bug.cgi?id=49692

@Axel-Naumann
Member

Bad news for us, from Lang on https://bugs.llvm.org/show_bug.cgi?id=49692 :

This one is on me -- I'll look into supporting compact-unwind, but won't have time to get to it for a couple of weeks.

How urgent is this for you, and are you using ORCv2? Ideally I'll just implement this in JITLink, but that won't help if you're on MCJIT or ORCv1.

I guess disabling exceptions for M1 isn't an option either. But we could prevent the cling throw from happening on M1, until this is fixed. Would that make sense? Won't help for RDF's exceptions, though...
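The idea of not throwing on M1 can be sketched with a compile-time platform guard. Everything below is an illustrative assumption: `kCanCatchJITExceptions` and `checkPointer` are hypothetical names, this is not cling's actual code, and the guard relies on the compiler predefining `__APPLE__` and `__aarch64__` when targeting macOS on Apple Silicon.

```cpp
#include <stdexcept>

// Assumption: these predefined macros identify macOS on Apple Silicon.
#if defined(__APPLE__) && defined(__aarch64__)
constexpr bool kCanCatchJITExceptions = false;
#else
constexpr bool kCanCatchJITExceptions = true;
#endif

// Hypothetical check, not cling's actual code: reports an invalid pointer by
// throwing where exceptions can be caught, and by returning false elsewhere.
bool checkPointer(const void *ptr) {
  if (ptr != nullptr)
    return true;
  if (kCanCatchJITExceptions)
    throw std::invalid_argument("null pointer dereference");
  return false; // Apple Silicon: the caller must test the return value.
}
```

The cost of this approach is that callers need to handle both reporting styles, which is why it only reduces the impact rather than fixing the underlying problem.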

@Axel-Naumann Axel-Naumann removed this from the 6.24/00 milestone Mar 30, 2021
Axel-Naumann added a commit to Axel-Naumann/root that referenced this issue Mar 30, 2021
As llvm JIT cannot catch exceptions on Apple M1 (see
root-project#7541) cling
should throw less. This is a hack to reduce the impact
a bit.
Axel-Naumann added a commit that referenced this issue Mar 31, 2021
As llvm JIT cannot catch exceptions on Apple M1 (see
#7541) cling
should throw less. This is a hack to reduce the impact
a bit.
Axel-Naumann added a commit to Axel-Naumann/root that referenced this issue Mar 31, 2021
As llvm JIT cannot catch exceptions on Apple M1 (see
root-project#7541) cling
should throw less. This is a hack to reduce the impact
a bit.

(cherry picked from commit f7a3eeb)
FonsRademakers pushed a commit to root-project/cling that referenced this issue Mar 31, 2021
As llvm JIT cannot catch exceptions on Apple M1 (see
root-project/root#7541) cling
should throw less. This is a hack to reduce the impact
a bit.
Axel-Naumann added a commit that referenced this issue Apr 12, 2021
As llvm JIT cannot catch exceptions on Apple M1 (see
#7541) cling
should throw less. This is a hack to reduce the impact
a bit.

(cherry picked from commit f7a3eeb)
chrisburr added a commit to chrisburr/root-feedstock that referenced this issue Apr 19, 2021
hahnjo added a commit to hahnjo/root that referenced this issue Jan 27, 2022
As ROOT cannot catch JITted exceptions on Apple Silicon (see the bug
report root-project#7541 for context),
some tests are currently failing there due to the use of exceptions in
the compatibility code for RDataSource. Implement this with a boolean
flag and a pattern inspired by errno.
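A boolean flag used in the style of errno might look roughly like the sketch below. All names (`gLastCallFailed`, `parseNonNegative`, `tryParse`) are hypothetical and chosen for illustration; this is not the RDataSource compatibility code from the commit.

```cpp
#include <string>

// Hypothetical names throughout; this is not the RDataSource code.
// Set by the callee on failure, reset and inspected by the caller, like errno.
thread_local bool gLastCallFailed = false;

// Parses a non-negative decimal integer. On failure it sets the flag and
// returns 0 instead of throwing. Overflow is not handled in this sketch.
int parseNonNegative(const std::string &text) {
  if (text.empty()) {
    gLastCallFailed = true;
    return 0;
  }
  int value = 0;
  for (char c : text) {
    if (c < '0' || c > '9') {
      gLastCallFailed = true;
      return 0;
    }
    value = value * 10 + (c - '0');
  }
  return value;
}

// Caller side: reset the flag, make the call, then inspect the flag.
bool tryParse(const std::string &text, int &out) {
  gLastCallFailed = false;
  out = parseNonNegative(text);
  return !gLastCallFailed;
}
```

Because no exception crosses a JITted frame, this pattern keeps working on platforms where unwinding through JIT code is broken.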
hahnjo added a commit to hahnjo/root that referenced this issue Jan 28, 2022
As ROOT cannot catch JITted exceptions on Apple Silicon (see the bug
report root-project#7541 for context),
some tests are currently failing there due to the use of exceptions in
the compatibility code for RDataSource. Implement this with a boolean
flag and a pattern inspired by errno.
hahnjo added a commit to hahnjo/roottest that referenced this issue Jan 28, 2022
hahnjo added a commit that referenced this issue Jan 28, 2022
As ROOT cannot catch JITted exceptions on Apple Silicon (see the bug
report #7541 for context),
some tests are currently failing there due to the use of exceptions in
the compatibility code for RDataSource. Implement this with a boolean
flag and a pattern inspired by errno.
hahnjo added a commit to root-project/roottest that referenced this issue Jan 28, 2022
@hahnjo
Member Author

hahnjo commented Mar 11, 2022

After LLVM switched from Bugzilla to GitHub issues, here is the link to the migrated issue: llvm/llvm-project#49036

hahnjo added a commit to hahnjo/root that referenced this issue May 11, 2022
hahnjo added a commit that referenced this issue May 11, 2022
FonsRademakers pushed a commit to root-project/cling that referenced this issue May 11, 2022
hahnjo added a commit to hahnjo/root that referenced this issue May 11, 2022
hahnjo added a commit to hahnjo/root that referenced this issue May 11, 2022
hahnjo added a commit to hahnjo/root that referenced this issue May 11, 2022
hahnjo added a commit that referenced this issue May 11, 2022
hahnjo added a commit that referenced this issue May 11, 2022
@vgvassilev
Member

@msneubauer ran some tests on OSX 13 and it seems this issue is fixed. Thanks a lot, Mark! Here is what he ran:

cat test_exceptions.C
void test_exceptions() {
  try {
    std::cout << "got here\n";
    throw 1;
  } catch (...) { }
}
wirelessprv-10-193-242-21:tmp msn$ root.exe -l -b -q -e '.x test_exceptions.C'

got here

This is based on a source build of the root_v6.26.06.source.tar.gz tarball.

cc: @hahnjo, @lhames

@hahnjo
Member Author

hahnjo commented Nov 7, 2022

Hm, this is surprising because fixing libunwind was only part of the story; I thought there were at least two other missing pieces, as outlined in llvm/llvm-project#49036. Could somebody with a setup of macOS 13 on Apple Silicon test whether the catch block is actually executed, by moving the printout there?

@vgvassilev
Member

Our best chance is @msneubauer I think.

@msneubauer

@hahnjo @vgvassilev

$ cat test_exceptions.C
void test_exceptions() {
  try {
    std::cout << "got here\n";
    throw 1;
  } catch (...) { std::cout << "got here too\n"; }
}

$ root.exe -l -b -q -e '.x test_exceptions.C'

got here
got here too

@vgvassilev
Member

Awesome, @hahnjo can we close this as resolved now?

@hahnjo
Member Author

hahnjo commented Nov 16, 2022

Possibly for now. Though I wouldn't be too surprised if it breaks again on a future LLVM upgrade (maybe even llvm13) due to the missing things in LLVM and what Lang wrote in the upstream issue...

@Axel-Naumann
Member

Before we close this I'd like to see the roottest / gtest tests re-enabled that were disabled because of this issue. I cannot find a registry of the changes we did because of this; do we need to grep for -i arm since M1 came out to find all occurrences? :-(

@vepadulano
Member

vepadulano commented Sep 19, 2023

This recent roottest failure looks related to this issue https://lcgapp-services.cern.ch/root-jenkins/job/roottest-pullrequests-build/13090/testReport/projectroot.roottest.python/regression/roottest_python_regression_regression/

(it's a test that was never run before, I am resurrecting it, and that's why the failure was triggered only now)

@guitargeek
Contributor

@hahnjo, @vepadulano, what's the status here? This looks like one of those issues that might have been resolved by the recent LLVM upgrade.

@hahnjo
Member Author

hahnjo commented Dec 19, 2023

I just built a fresh version of master and v6-26-00-patches on macphsft24 with macOS 14.2; the test posted in #7541 (comment) still fails. It's not clear to me how it could have worked in one setup in the past, but it's certainly not working out of the box in the default build configuration.

@dpiparo
Member

dpiparo commented Apr 25, 2024

I confirm it's still broken for llvm16 in root master.

@lhames

lhames commented Apr 25, 2024

Are you using the unw_add/remove_find_dynamic_unwind_sections sequence described in llvm/llvm-project#49036 (comment) ? That's required on macOS 14.0 and later.
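As a rough mental model of that add/remove sequence: the JIT registers a callback with the unwinder that maps a code address to the unwind information covering it, and unregisters the callback before the JITted code is freed. The sketch below only models that idea in plain C++. `UnwinderModel`, `UnwindSections`, `FindSectionsFn`, and `findJITSections` are hypothetical names, and the real libunwind entry points and their signatures should be taken from llvm/llvm-project#49036 rather than from this toy.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only; none of these names exist in libunwind.
struct UnwindSections {
  std::uintptr_t codeBase;   // start of the code range described
  std::size_t codeSize;      // size of that range in bytes
  std::uintptr_t unwindInfo; // where the unwind info for the range lives
};

// Callback type: fills `out` and returns true if `addr` is covered.
using FindSectionsFn = bool (*)(std::uintptr_t addr, UnwindSections &out);

class UnwinderModel {
  std::vector<FindSectionsFn> finders_;

public:
  void addFinder(FindSectionsFn fn) { finders_.push_back(fn); }
  void removeFinder(FindSectionsFn fn) {
    finders_.erase(std::remove(finders_.begin(), finders_.end(), fn),
                   finders_.end());
  }
  // Asks each registered callback in turn, as an unwinder would for a frame
  // that belongs to no statically loaded image.
  bool lookup(std::uintptr_t addr, UnwindSections &out) const {
    for (FindSectionsFn fn : finders_)
      if (fn(addr, out))
        return true;
    return false;
  }
};

// JIT-side callback for a single made-up code region.
bool findJITSections(std::uintptr_t addr, UnwindSections &out) {
  constexpr UnwindSections kJITRegion{0x1000, 0x100, 0x2000};
  if (addr < kJITRegion.codeBase ||
      addr >= kJITRegion.codeBase + kJITRegion.codeSize)
    return false;
  out = kJITRegion;
  return true;
}
```

In this model, a lookup for an address inside the JIT region fails until `addFinder(findJITSections)` has been called, which mirrors why a missing registration would leave exceptions thrown from JITted frames uncatchable.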
