deadlock when catching SIGSEGV. #47

Closed
amosbird opened this issue Nov 28, 2016 · 2 comments

@amosbird

Here is the backtrace printed by gdb:

#0  0x000000000287790a in sys_futex (a=0x4087f80 <tcmalloc::Static::central_cache_+1216>, o=128, v=2, t=0x7febee032860) at /home/amos/repos/dsql/src/gutil/linux_syscall_support.h:2713
#1  0x0000000002877a4b in base::internal::SpinLockDelay (w=0x4087f80 <tcmalloc::Static::central_cache_+1216>, value=2, loop=1073) at /home/amos/repos/dsql/src/gutil/spinlock_linux-inl.h:88
#2  0x00000000028af9a9 in SpinLock::SlowLock() ()
#3  0x0000000002967cd9 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) ()
#4  0x00000000029753b3 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) ()
#5  0x000000000298236b in tc_malloc ()
#6  0x00000000028dca86 in bfd_malloc ()
#7  0x00000000028dd2bb in bfd_follow_gnu_debuglink ()
#8  0x0000000002954d37 in find_line ()
#9  0x0000000002954e97 in _bfd_dwarf2_find_nearest_line ()
#10 0x00000000028efefa in _bfd_elf_find_nearest_line ()
#11 0x00000000011b2858 in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libbfd>::find_in_section (this=0x7febee032ff8, addr=213107247788, base_addr=213106294784, fobj=..., section=0x12e02ce0, result=...) at /home/amos/repos/dsql/src/util/backward.hpp:1084
#12 0x00000000011b26f9 in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libbfd>::find_in_section_trampoline (section=0x12e02ce0, data=0x7febee032d60) at /home/amos/repos/dsql/src/util/backward.hpp:1060
#13 0x00000000028de4dc in bfd_map_over_sections ()
#14 0x00000000011b266c in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libbfd>::find_symbol_details (this=0x7febee032ff8, fobj=..., addr=0x319e2e8aac <clone+108>, base_addr=0x319e200000) at /home/amos/repos/dsql/src/util/backward.hpp:1048
#15 0x00000000011b203e in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libbfd>::resolve (this=0x7febee032ff8, trace=...) at /home/amos/repos/dsql/src/util/backward.hpp:817
#16 0x00000000011b51d1 in backward::Printer::print<backward::StackTrace> (this=0x7febee032ff0, st=..., os=0x319e58f120 <_IO_2_1_stderr_>) at /home/amos/repos/dsql/src/util/backward.hpp:1747
#17 0x00000000011b3ee4 in backward::SignalHandling::sig_handler (info=0x7febee033470, _ctx=0x7febee033340) at /home/amos/repos/dsql/src/util/backward.hpp:1959
#18 0x00007fecddbe8852 in os::Linux::chained_handler(int, siginfo*, void*) () from /home/amos/repos/dsql/install/postgres/jdk/jre/lib/amd64/server/libjvm.so
#19 0x00007fecddbeec86 in JVM_handle_linux_signal () from /home/amos/repos/dsql/install/postgres/jdk/jre/lib/amd64/server/libjvm.so
#20 0x00007fecddbe54b3 in signalHandler(int, siginfo*, void*) () from /home/amos/repos/dsql/install/postgres/jdk/jre/lib/amd64/server/libjvm.so
#21 <signal handler called>

@bombela
Owner

bombela commented Nov 28, 2016

This looks like the memory fault happened inside the memory allocator (tcmalloc in your case) while it was holding a lock (a spinlock, per the backtrace). backward-cpp then tried to allocate memory (indirectly, via the bfd library here), and tcmalloc deadlocked on that same lock.
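To make the mechanism concrete, here is a minimal sketch of why this hangs. `TinySpinLock` and `handler_can_allocate` are hypothetical names for illustration only, not tcmalloc or backward-cpp code. The handler runs on the same thread that already owns the lock, so nothing can ever release it. The sketch uses `try_lock` so that it reports the situation instead of actually spinning forever.

```cpp
#include <atomic>

// Hypothetical stand-in for a non-reentrant allocator lock.
struct TinySpinLock {
    std::atomic<bool> held{false};

    // Returns true if the lock was free and is now owned by the caller.
    bool try_lock() { return !held.exchange(true, std::memory_order_acquire); }
    void unlock() { held.store(false, std::memory_order_release); }
};

// Simulates a signal handler that needs the allocator lock.
// Returns false when the interrupted code already holds it; a real
// blocking lock() at this point would wait forever, because the owner
// is the very thread that the handler interrupted.
bool handler_can_allocate(TinySpinLock& allocator_lock) {
    if (!allocator_lock.try_lock()) {
        return false;  // would deadlock
    }
    allocator_lock.unlock();
    return true;
}
```

If the fault happens while `allocator_lock` is held, `handler_can_allocate` returns `false`, which corresponds to frames #0 through #5 in the backtrace above.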

See what I have written in the past about it: #4

The best solution is to make backward-cpp never allocate any memory. This is really hard to achieve in practice: first, one would have to write a DWARF interpreter that never allocates, which is some serious work (one might exist already?).
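As a rough illustration of what "never allocates" means inside a handler, the sketch below prints raw frame addresses using only stack buffers and `write(2)`. It does no symbol resolution at all, which is the hard part. `format_hex` and `write_raw_frames` are hypothetical helpers, not part of backward-cpp, and how the frame addresses are collected is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <unistd.h>

// Writes "0x" plus the hex digits of `value` into `buf` (no NUL terminator).
// Returns the number of bytes written, or 0 if `cap` is too small.
std::size_t format_hex(std::uintptr_t value, char* buf, std::size_t cap) {
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof(value)];  // enough for every hex digit
    std::size_t n = 0;
    do {
        tmp[n++] = digits[value & 0xf];  // least significant digit first
        value >>= 4;
    } while (value != 0);
    if (cap < n + 2) return 0;
    buf[0] = '0';
    buf[1] = 'x';
    for (std::size_t i = 0; i < n; ++i) buf[2 + i] = tmp[n - 1 - i];
    return n + 2;
}

// Emits one address per line to `fd`. No heap, no stdio, no locale.
void write_raw_frames(int fd, void* const* frames, std::size_t count) {
    char line[2 + 2 * sizeof(std::uintptr_t) + 1];  // "0x" + digits + '\n'
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t len = format_hex(
            reinterpret_cast<std::uintptr_t>(frames[i]), line, sizeof(line) - 1);
        line[len++] = '\n';
        ssize_t ignored = write(fd, line, len);
        (void)ignored;  // nothing useful to do on failure inside a handler
    }
}
```

The addresses can then be resolved offline, outside the crashed process, where allocating is safe.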

Another solution is to spawn a "monitor" process before doing any serious work. When a fatal signal is raised, the monitored process reports it to the monitor process, which then acts much like a traditional debugger.
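A minimal sketch of the monitor idea, under simplifying assumptions: the parent plays the monitor and the forked child does the work, and the only thing reported over the pipe is the signal number. `run_monitored` and `report_fatal_signal` are hypothetical names. A real monitor would go on to inspect the crashed process (for example with ptrace) rather than just read one byte.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_report_fd = -1;  // write end of the pipe, set in the worker

// Runs in the worker on a fatal signal. Uses only write(2) and _exit(2),
// so it never touches the allocator.
static void report_fatal_signal(int signo) {
    unsigned char byte = static_cast<unsigned char>(signo);
    ssize_t ignored = write(g_report_fd, &byte, 1);
    (void)ignored;
    _exit(128 + signo);
}

// Forks a worker that runs `work`. Returns the fatal signal number the
// worker reported, 0 if it finished without reporting, -1 on setup failure.
int run_monitored(void (*work)()) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {  // worker (monitored process)
        close(fds[0]);
        g_report_fd = fds[1];
        struct sigaction sa = {};
        sa.sa_handler = report_fatal_signal;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, nullptr);
        sigaction(SIGABRT, &sa, nullptr);
        work();
        _exit(0);
    }
    close(fds[1]);  // monitor keeps only the read end
    unsigned char byte = 0;
    ssize_t n;
    do {
        n = read(fds[0], &byte, 1);  // returns 0 (EOF) on a clean worker exit
    } while (n < 0 && errno == EINTR);
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return n == 1 ? byte : 0;
}
```

The point of the split is that the heavy lifting (symbol lookup, DWARF parsing, anything that calls malloc) happens in a process whose allocator state is intact.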

From what I understand, Google Breakpad does a combination of safe in-process dump generation plus an external process to process the dump. It's multi-platform! Of course, this is a bit more work to distribute and set up.

@amosbird
Author

ok, thanks for the explanation :-)
