Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't try to print stack trace if we're inside malloc #74

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

davispuh
Copy link
Contributor

If signal handler is called inside malloc we might deadlock, issue #47

Got this deadlock with jemalloc

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f3a01900c55 in __GI___pthread_mutex_lock (mutex=0x7f39fd400310) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007f39ff225327 in je_tcache_alloc_small_hard (tcache=tcache@entry=0x7f39fd018000, tbin=0x130, tbin@entry=0x7f39fd018068, binind=binind@entry=2) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/tcache.c:72
#3  0x00007f39ff207f8e in je_tcache_alloc_small (zero=false, size=2, tcache=0x7f39fd018000) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/tcache.h:303
#4  je_arena_malloc (try_tcache=true, zero=false, size=2, arena=0x0) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/arena.h:957
#5  je_imalloct (arena=0x0, try_tcache=true, size=2) at include/jemalloc/internal/jemalloc_internal.h:771
#6  je_imalloc (size=2) at include/jemalloc/internal/jemalloc_internal.h:780
#7  malloc (size=size@entry=28) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/jemalloc.c:929
#8  0x00007f3a00714a9e in operator new (sz=28) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/gcc-5.4.0/libstdc++-v3/libsupc++/new_op.cc:50
#9  0x00007f3a007677ef in __gnu_cxx::new_allocator<char>::allocate (this=<optimized out>, __n=<optimized out>) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:104
#10 std::string::_Rep::_S_create (__capacity=__capacity@entry=3, __old_capacity=__old_capacity@entry=0, __alloc=...) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:1051
#11 0x00007f3a00767a3e in std::string::_S_construct<char const*> (__beg=0x7f39fcfff010 "#19", __end=<optimized out>, __a=...)
    at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:582
#12 0x00007f3a0076aafd in std::string::_S_construct_aux<char const*> (__a=..., __end=<optimized out>, __beg=0x7f39fcfff010 "#19")
    at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4141
#13 std::string::_S_construct<char const*> (__a=..., __end=<optimized out>, __beg=0x7f39fcfff010 "#19") at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4162
#14 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (this=0x7f39fcfff000, __s=0x7f39fcfff010 "#19", __a=...)
    at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:659
#15 0x00000000009055e8 in backward::Printer::print_trace (this=0x7f39fcfff9e0, os="Stack trace (most recent call last):\n", trace=..., colorize=...) at backward/backward.hpp:1779
#16 0x00000000009071eb in backward::Printer::print<backward::StackTrace> (this=this@entry=0x7f39fcfff9e0, st=..., os="Stack trace (most recent call last):\n") at backward/backward.hpp:1745
#17 0x0000000000b4512b in backward::SignalHandling::sig_handler (info=0x7f39fcfffbf0, _ctx=<optimized out>) at backward/backward.hpp:1965
#18 <signal handler called>
#19 je_bitmap_set (bit=18446744073709551553, binfo=0x7f39ff4309f8 <je_arena_bin_info+248>, bitmap=0x7f39fd394010) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/bitmap.h:105
#20 je_bitmap_sfu (binfo=0x7f39ff4309f8 <je_arena_bin_info+248>, bitmap=0x7f39fd394010) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/bitmap.h:140
#21 arena_run_reg_alloc (bin_info=0x7f39ff4309d0 <je_arena_bin_info+208>, run=0x7f39fd394000) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/arena.c:291
#22 je_arena_tcache_fill_small (arena=0x7f39fd4000c0, tbin=tbin@entry=0x7f39fd018068, binind=binind@entry=2, prof_accumbytes=prof_accumbytes@entry=0) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/arena.c:1479
#23 0x00007f39ff225327 in je_tcache_alloc_small_hard (tcache=tcache@entry=0x7f39fd018000, tbin=tbin@entry=0x7f39fd018068, binind=binind@entry=2) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/tcache.c:72
#24 0x00007f39ff207f8e in je_tcache_alloc_small (zero=false, size=32, tcache=0x7f39fd018000) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/tcache.h:303
#25 je_arena_malloc (try_tcache=true, zero=false, size=2, arena=0x0) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/arena.h:957
#26 je_imalloct (arena=0x0, try_tcache=true, size=2) at include/jemalloc/internal/jemalloc_internal.h:771
#27 je_imalloc (size=2) at include/jemalloc/internal/jemalloc_internal.h:780
#28 malloc (size=size@entry=24) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/jemalloc.c:929
#29 0x00007f3a00714a9e in operator new (sz=sz@entry=24) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/gcc-5.4.0/libstdc++-v3/libsupc++/new_op.cc:50

This PR tries to avoid that by skipping backtrace output in such case.

@@ -500,6 +500,7 @@ class StackTraceImpl {
size_t load_from(void*, size_t=0) { return 0; }
unsigned thread_id() const { return 0; }
void skip_n_firsts(size_t) { }
bool can_safely_backtrace() { return true; }
Copy link
Owner

@bombela bombela Nov 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about returning an enum { MAYBE, NO, DONTKNOW } ?

MAYBE: more yes than no, but we can never prove anything anyway.
NO: definitively not safe
DONTKNOW: didnt even try

@travisdowns
Copy link
Contributor

I just ran into this exact issue today. One option to still have a nice backtrace (which, frankly is awesone: it changes how I develop), would be to have a static array of char in backwards.cpp or in backwards itself and use that to at least give a basic stack in this case. Admittedly that is hard since new and malloc are deeply ingrained in C++ and so avoiding new and malloc entirely can be hard (replacing the new operation for some classes might help).

Just my 2c since I ran into this (I sat around forevre waiting for my process to finish only to fire up gdb and find this stack above).

@bombela
Copy link
Owner

bombela commented Dec 4, 2017 via email

@travisdowns
Copy link
Contributor

travisdowns commented Dec 5, 2017

@bombela - yup, that's true, although note from the stack above that the stack was successfully taken, so at least those calls went through (but that could just be random, perhaps it simply happens they do few enough allocations that malloc doesn't need to take any locks and you don't deadlock).

Probably just spitting out a message like "we handled this but are skipping the stack trace to avoid deadlock" is best. The call write is safe to call from a signal handler, so at least on Linux it can be printed to stderr. At least then the user will get a core and can use the debugger to look at the backtrace.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants