deadlock in je_prof_boot2 #585

tamird · 2017-01-23T21:23:42Z

Occurs when building with profiling enabled on Alpine Linux with gcc 6.2. Full repro (I've left in the entire configure output in case it's useful):

$ docker run -it alpine:3.5 /bin/sh
/ # apk add --no-cache curl gcc g++ make
...
OK: 160 MiB in 31 packages
/ # curl -fsSL https://github.com/jemalloc/jemalloc/releases/download/4.4.0/jemalloc-4.4.0.tar.bz2 | tar jx
/ # cd jemalloc-4.4.0/
/jemalloc-4.4.0 # ./configure --enable-prof
checking for xsltproc... false
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether compiler is cray... no
checking whether compiler supports -std=gnu11... yes
checking whether compiler supports -Wall... yes
checking whether compiler supports -Werror=declaration-after-statement... yes
checking whether compiler supports -Wshorten-64-to-32... no
checking whether compiler supports -Wsign-compare... yes
checking whether compiler supports -pipe... yes
checking whether compiler supports -g3... yes
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking whether byte ordering is bigendian... no
checking size of void *... 8
checking size of int... 4
checking size of long... 8
checking size of long long... 8
checking size of intmax_t... 8
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking whether pause instruction is compilable... yes
checking for ar... ar
checking malloc.h usability... yes
checking malloc.h presence... yes
checking for malloc.h... yes
checking whether malloc_usable_size definition can use const argument... no
checking for library containing log... none required
checking whether __attribute__ syntax is compilable... yes
checking whether compiler supports -fvisibility=hidden... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... no
checking whether tls_model attribute is compilable... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... no
checking whether alloc_size attribute is compilable... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... no
checking whether format(gnu_printf, ...) attribute is compilable... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... no
checking whether format(printf, ...) attribute is compilable... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for ranlib... ranlib
checking for ld... /usr/bin/ld
checking for autoconf... false
checking for memalign... yes
checking for valloc... yes
checking whether compiler supports -O3... yes
checking whether compiler supports -funroll-loops... yes
checking unwind.h usability... yes
checking unwind.h presence... yes
checking for unwind.h... yes
checking for _Unwind_Backtrace in -lgcc... yes
checking configured backtracing method... libgcc
checking for sbrk... yes
checking whether utrace(2) is compilable... no
checking whether valgrind is compilable... no
checking whether a program using __builtin_unreachable is compilable... yes
checking whether a program using __builtin_ffsl is compilable... yes
checking LG_PAGE... 12
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking for pthread_create in -lpthread... yes
checking whether pthread_atfork(3) is compilable... yes
checking for library containing clock_gettime... none required
checking whether clock_gettime(CLOCK_MONOTONIC_COARSE, ...) is compilable... yes
checking whether clock_gettime(CLOCK_MONOTONIC, ...) is compilable... yes
checking whether mach_absolute_time() is compilable... no
checking whether compiler supports -Werror... yes
checking whether syscall(2) is compilable... yes
checking for secure_getenv... no
checking for issetugid... yes
checking for _malloc_thread_cleanup... no
checking for _pthread_mutex_init_calloc_cb... no
checking for TLS... yes
checking whether C11 atomics is compilable... yes
checking whether atomic(9) is compilable... no
checking whether Darwin OSAtomic*() is compilable... no
checking whether madvise(2) is compilable... yes
checking whether madvise(..., MADV_FREE) is compilable... yes
checking whether madvise(..., MADV_DONTNEED) is compilable... yes
checking whether madvise(..., MADV_[NO]HUGEPAGE) is compilable... yes
checking whether to force 32-bit __sync_{add,sub}_and_fetch()... no
checking whether to force 64-bit __sync_{add,sub}_and_fetch()... no
checking for __builtin_clz... yes
checking whether Darwin os_unfair_lock_*() is compilable... no
checking whether Darwin OSSpin*() is compilable... no
checking whether glibc malloc hook is compilable... no
checking whether glibc memalign hook is compilable... no
checking whether pthreads adaptive mutexes is compilable... no
checking for stdbool.h that conforms to C99... yes
checking for _Bool... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating jemalloc.pc
config.status: creating doc/html.xsl
config.status: creating doc/manpages.xsl
config.status: creating doc/jemalloc.xml
config.status: creating include/jemalloc/jemalloc_macros.h
config.status: creating include/jemalloc/jemalloc_protos.h
config.status: creating include/jemalloc/jemalloc_typedefs.h
config.status: creating include/jemalloc/internal/jemalloc_internal.h
config.status: creating test/test.sh
config.status: creating test/include/test/jemalloc_test.h
config.status: creating config.stamp
config.status: creating bin/jemalloc-config
config.status: creating bin/jemalloc.sh
config.status: creating bin/jeprof
config.status: creating include/jemalloc/jemalloc_defs.h
config.status: creating include/jemalloc/internal/jemalloc_internal_defs.h
config.status: creating test/include/test/jemalloc_test_defs.h
config.status: executing include/jemalloc/internal/private_namespace.h commands
config.status: executing include/jemalloc/internal/private_unnamespace.h commands
config.status: executing include/jemalloc/internal/public_symbols.txt commands
config.status: executing include/jemalloc/internal/public_namespace.h commands
config.status: executing include/jemalloc/internal/public_unnamespace.h commands
config.status: executing include/jemalloc/internal/size_classes.h commands
config.status: executing include/jemalloc/jemalloc_protos_jet.h commands
config.status: executing include/jemalloc/jemalloc_rename.h commands
config.status: executing include/jemalloc/jemalloc_mangle.h commands
config.status: executing include/jemalloc/jemalloc_mangle_jet.h commands
config.status: executing include/jemalloc/jemalloc.h commands
===============================================================================
jemalloc version   : 4.4.0-0-gf1f76357313e7dcad7262f17a48ff0a2e005fcdc
library revision   : 2

CONFIG             : --enable-prof
CC                 : gcc
CFLAGS             : -std=gnu11 -Wall -Werror=declaration-after-statement -Wsign-compare -pipe -g3 -fvisibility=hidden -O3 -funroll-loops
EXTRA_CFLAGS       :
CPPFLAGS           :  -D_GNU_SOURCE -D_REENTRANT
LDFLAGS            :
EXTRA_LDFLAGS      :
LIBS               :  -lgcc -lpthread
RPATH_EXTRA        :

XSLTPROC           : false
XSLROOT            :

PREFIX             : /usr/local
BINDIR             : /usr/local/bin
DATADIR            : /usr/local/share
INCLUDEDIR         : /usr/local/include
LIBDIR             : /usr/local/lib
MANDIR             : /usr/local/share/man

srcroot            :
abs_srcroot        : /jemalloc-4.4.0/
objroot            :
abs_objroot        : /jemalloc-4.4.0/

JEMALLOC_PREFIX    :
JEMALLOC_PRIVATE_NAMESPACE
                   : je_
install_suffix     :
malloc_conf        :
autogen            : 0
cc-silence         : 1
debug              : 0
code-coverage      : 0
stats              : 1
prof               : 1
prof-libunwind     : 0
prof-libgcc        : 1
prof-gcc           : 0
tcache             : 1
fill               : 1
utrace             : 0
valgrind           : 0
xmalloc            : 0
munmap             : 0
lazy_lock          : 0
tls                : 1
cache-oblivious    : 1
===============================================================================
/jemalloc-4.4.0 # make -j$(nproc) build_lib_static
...
/jemalloc-4.4.0 # printf '#include<stdlib.h>\nint main() { malloc(1); return 0; }' > main.c
/jemalloc-4.4.0 # gcc -static main.c -Llib -ljemalloc -no-pie
/jemalloc-4.4.0 # ./a.out <--- DEADLOCK

cc @petermattis @bdarnell

davidtgoldblatt · 2017-01-24T18:52:23Z

Thanks for the report. Can you grab a stack trace of the deadlock? I would not be shocked if we're letting some glibc assumptions sneak in.

tamird · 2017-01-24T19:11:30Z

Sure.

$ gdb a.out
(gdb) run
Starting program: /jemalloc/jemalloc-4.4.0/a.out
^C
Program received signal SIGINT, Interrupt.
__syscall () at src/internal/x86_64/syscall.s:13
13	src/internal/x86_64/syscall.s: No such file or directory.
(gdb) bt
#0  __syscall () at src/internal/x86_64/syscall.s:13
#1  0x0000000000460cc8 in __timedwait_cp (addr=addr@entry=0x670024 <init_lock+4>, val=val@entry=-2147483632, clk=clk@entry=0, at=at@entry=0x0,
    priv=priv@entry=128) at src/thread/__timedwait.c:31
#2  0x0000000000460d51 in __timedwait (addr=addr@entry=0x670024 <init_lock+4>, val=-2147483632, clk=clk@entry=0, at=at@entry=0x0,
    priv=priv@entry=128) at src/thread/__timedwait.c:43
#3  0x000000000045fc70 in __pthread_mutex_timedlock (m=0x670020 <init_lock>, at=at@entry=0x0) at src/thread/pthread_mutex_timedlock.c:27
#4  0x000000000045fba3 in __pthread_mutex_lock (m=m@entry=0x670020 <init_lock>) at src/thread/pthread_mutex_lock.c:11
#5  0x00000000004030a2 in je_malloc_mutex_lock (tsdn=0x0, mutex=0x670020 <init_lock>) at include/jemalloc/internal/mutex.h:101
#6  malloc_init_hard () at src/jemalloc.c:1480
#7  0x0000000000406531 in malloc_init () at src/jemalloc.c:317
#8  ialloc_body (slow_path=true, usize=<synthetic pointer>, tsdn=<synthetic pointer>, zero=false, size=4728) at src/jemalloc.c:1577
#9  malloc (size=size@entry=4728) at src/jemalloc.c:1641
#10 0x000000000045dd21 in start_fde_sort (count=589, accu=0x7fffffffdf00)
    at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind-dw2-fde.c:409
#11 init_object (ob=0x670140 <object>) at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind-dw2-fde.c:771
#12 search_object (ob=ob@entry=0x670140 <object>, pc=pc@entry=0x45cf31 <_Unwind_Backtrace+55>)
    at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind-dw2-fde.c:961
#13 0x000000000045e4ff in _Unwind_Find_registered_FDE (bases=0x7fffffffe298, pc=0x45cf31 <_Unwind_Backtrace+55>)
    at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind-dw2-fde.c:1025
#14 _Unwind_Find_FDE (pc=0x45cf31 <_Unwind_Backtrace+55>, bases=bases@entry=0x7fffffffe298)
    at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind-dw2-fde-dip.c:454
#15 0x000000000045bb0c in uw_frame_state_for (context=context@entry=0x7fffffffe1f0, fs=fs@entry=0x7fffffffe040)
    at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind-dw2.c:1241
#16 0x000000000045c769 in uw_init_context_1 (context=context@entry=0x7fffffffe1f0, outer_cfa=outer_cfa@entry=0x7fffffffe4a0,
    outer_ra=0x44b195 <je_prof_boot2+37>) at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind-dw2.c:1562
#17 0x000000000045cf32 in _Unwind_Backtrace (trace=trace@entry=0x4434e0 <prof_unwind_init_callback>, trace_argument=trace_argument@entry=0x0)
    at /home/buildozer/aports/main/gcc/src/gcc-6.2.0/libgcc/unwind.inc:283
#18 0x000000000044b195 in je_prof_boot2 (tsd=tsd@entry=0x681478 <builtin_tls+24>) at src/prof.c:2272
#19 0x000000000040330a in malloc_init_hard () at src/jemalloc.c:1501
#20 0x00000000004001f5 in malloc_init () at src/jemalloc.c:317
#21 jemalloc_constructor () at src/jemalloc.c:2801
#22 0x000000000045e92d in libc_start_init () at src/env/__libc_start_main.c:61
#23 0x000000000045e960 in __libc_start_main (main=0x4003b1 <main>, argc=1, argv=0x7fffffffe568) at src/env/__libc_start_main.c:71
#24 0x00000000004002aa in _start_c (p=<optimized out>) at crt/crt1.c:17
#25 0x0000000000400282 in _start ()

tamird · 2017-02-01T22:30:11Z

Interestingly, switching to GCC intrinsics causes an immediate segfault when profiling is enabled (we're still linking to glibc, not musl).

cockroachdb/cockroach#13345 (comment)

davidtgoldblatt · 2017-02-02T00:36:47Z

Oh, interesting; I had assumed this was a musl thing.

Incidentally, how badly does this bug hurt you? I've been delaying looking into it because some of our other efforts might fix it along the way, but I'll reprioritize if this is actually blocking you on something.

tamird · 2017-02-02T00:43:11Z

We ran into this bug in an attempt to make CockroachDB support older Linux distributions (namely CentOS 6), and broadly, there are two ways to do this:

1. Ship a fully statically linked binary: this doesn't work so well with glibc, see https://sourceware.org/bugzilla/show_bug.cgi?id=19341. So instead, we want to use musl, and hence this issue.
2. Dynamically link against a sufficiently old version of glibc (2.12 for CentOS 6). This, along with the change to use gcc intrinsics rather than libgcc (to support musl), produces the segfault in my last comment.

So we're somewhat stuck - we'd ideally like to provide both kinds of binaries, but there is no configuration of jemalloc that works with both.

Fixing this issue would allow us to use libgcc with both musl and glibc, which would hopefully resolve the segfault issue.

davidtgoldblatt · 2017-02-02T01:28:56Z

Can you see if --enable-prof-libunwind (instead of --enable-prof) works for you? I think the issue with gcc intrinsics is fundamental, and the one with libgcc will be a while to fix.

davidtgoldblatt · 2017-02-02T01:34:38Z

Actually, I suppose that if you're shipping a custom musl, you can compile it without -fomit-frame-pointer; that might get the gcc intrinsics working.

tamird · 2017-02-02T02:05:46Z

Adding -fno-omit-frame-pointer didn't fix the segfault with gcc intrinsics. BTW, note that because we're targeting glibc 2.12 with this build, we're using gcc 4.9.3.

Somewhat expectedly, --enable-prof-libunwind avoids the segfault, but because the system doesn't have libunwind, the build doesn't actually support profiling and logs:

<jemalloc>: Invalid conf pair: prof:true

Any other ideas?

davidtgoldblatt · 2017-02-02T19:03:16Z

Just to double-check: you were building musl with -fno-omit-frame-pointer, not jemalloc?

@jasone, any ideas?

tamird · 2017-02-02T19:12:58Z

Oh, no, I did not try rebuilding musl.

Also, to double-check: you're suggesting that musl built with -fno-omit-frame-pointer would work with libgcc (and not deadlock)? Can you help me understand why you'd expect that to work?

jasone · 2017-02-02T19:14:51Z

Is it possible to install libunwind? That's the best bet. Otherwise you could experiment with moving the block of code that calls prof_boot2() to the end of malloc_init_hard_recursible(). The bootstrapping code is really fragile, so I'm not sure that will work...

tamird · 2017-02-02T19:23:40Z

Quick update: I'm now using a cross-compilation toolchain with gcc 6.3.0 (built with crosstool-ng) to target musl 1.1.16, and I get the deadlock whether I use libgcc or gcc intrinsics.

Regarding libunwind: anything's possible, but documentation seems scant, and I'm also not sure what is meant by libunwind - are we talking about http://www.nongnu.org/libunwind/ or https://github.com/llvm-mirror/libunwind ?

jasone · 2017-02-02T19:28:53Z

We're talking about the project at http://www.nongnu.org/libunwind/ . In my experience it has worked well other than on obscure old platforms and ARM-based systems with otherwise brittle toolchains.

This allows musl builds to avoid profiling which causes deadlock. See jemalloc/jemalloc#585.

davidtgoldblatt · 2017-02-02T20:33:54Z

My guess is that musl compiled with frame pointers + gcc intrinsics (not libgcc) may work. My experience with the gcc intrinsics is that they'll happily crash the process if confused by the stack layout. Musl compiles with -fomit-frame-pointer by default, which is the sort of thing that can confuse the intrinsics.
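
For readers unfamiliar with the intrinsics in question, here is a minimal sketch of the GCC builtins `__builtin_return_address` and `__builtin_frame_address`. The helper name `capture_frame` and the struct are hypothetical and are not taken from jemalloc. Only level 0 is queried here; nonzero levels follow saved frame pointers up the stack, which is the step that can go wrong when a caller was compiled with -fomit-frame-pointer.

```c
#include <stddef.h>

/* Hypothetical illustration, not jemalloc code. */
typedef struct {
    void *return_addr; /* where control resumes in the caller */
    void *frame_addr;  /* this function's own frame */
} frame_info_t;

/* noinline keeps this function in its own frame so level 0 is meaningful. */
__attribute__((noinline))
frame_info_t capture_frame(void) {
    frame_info_t info;
    /* Level 0 refers to the current function and needs no stack walk.
     * Levels above 0 depend on every intermediate frame keeping a
     * frame pointer; GCC documents them as potentially unsafe. */
    info.return_addr = __builtin_return_address(0);
    info.frame_addr = __builtin_frame_address(0);
    return info;
}
```

A profiler that wants a deeper trace has to ask for levels 1, 2, and so on, and each of those trusts the layout of a frame it did not create.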

tamird · 2017-02-02T20:40:27Z

@davidtgoldblatt recall that our problem with musl is deadlock, not crash. We were seeing the crash with gcc intrinsics, debian jessie glibc 2.19, and gcc 4.9.2. We are no longer seeing this crash with gcc 4.9.3 targeting glibc 2.12.

Going back to the original issue here: I've gone and disabled jemalloc profiling in our musl builds since we see these deadlocks with both libgcc and gcc intrinsics. For now, this is not worth the headache of installing libunwind.

So, again, it would be great if you guys could figure out what's causing this deadlock.

davidtgoldblatt · 2017-02-02T21:07:50Z

Sorry, I omitted some context -- when doing the repro, I tried it with the gcc intrinsics + musl and saw crashes that went away with the change to musl compilation.

The deadlock itself is pretty straightforward - grabbing a backtrace with libgcc may call malloc (at https://github.com/gcc-mirror/gcc/blob/1cb6c2eb3b8361d850be8e8270c597270a1a7967/libgcc/unwind-dw2-fde.c#L437 in this case), and we can't in general handle reentrancy (with this being particularly true during bootstrapping).
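
The call chain in the stack trace can be modeled with a toy sketch. None of the names below are jemalloc's; `toy_init_lock_held` stands in for init_lock and `toy_backtrace` for the libgcc unwinder. Where the real code would block forever on a non-recursive mutex, the toy counts the attempt and returns, so the sequence can be observed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every identifier here is hypothetical. */
static bool toy_init_lock_held = false; /* stands in for init_lock */
static bool toy_initialized = false;
static int toy_blocked_attempts = 0;    /* times a real lock would hang */

bool toy_init(void);

/* Stand-in for malloc(): triggers initialization on first use. */
void *toy_malloc(size_t size) {
    static char arena[64];
    if (!toy_initialized && !toy_init()) {
        return NULL; /* the real code never returns from here */
    }
    return size <= sizeof(arena) ? arena : NULL;
}

/* Stand-in for the unwinder, which allocates on its first lookup. */
void toy_backtrace(void) {
    (void)toy_malloc(16);
}

bool toy_init(void) {
    if (toy_init_lock_held) {
        /* Same thread, non-recursive lock: a real mutex waits forever. */
        toy_blocked_attempts++;
        return false;
    }
    toy_init_lock_held = true;
    toy_backtrace(); /* re-enters toy_malloc(), then toy_init() */
    toy_initialized = true;
    toy_init_lock_held = false;
    return true;
}
```

Calling `toy_malloc` once walks the same path as frames #19 down to #5 in the trace: init takes the lock, the backtrace allocates, and the nested init finds the lock already held by its own thread.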

tamird · 2017-02-02T21:10:41Z

How come there's no deadlock with glibc?

davidtgoldblatt · 2017-02-03T02:07:20Z

So, the proximate cause is that musl generates a call to __register_frame_info_bases, so that unseen_objects (in unwind-dw2-fde.c in libgcc) is non-null with musl at the time of the backtrace, but null with glibc.

When the unseen_objects list is nonempty (i.e. in bootstrapping with musl), _Unwind_Find_FDE will go through it, and initialize a sorted array of pointers to allow subsequent lookups to use binary search instead of linear search. This array is malloc'd on first search attempt, causing the reentry.
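
The lazy-index pattern described above can be sketched as follows. This is a simplified illustration, not libgcc's code: the names are made up, and it matches exact keys, whereas the real unwinder looks up the address range containing a pc.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table: entries arrive unsorted, index is built on demand. */
typedef struct {
    const unsigned long *unsorted; /* entries as registered */
    size_t count;
    unsigned long *sorted;         /* NULL until the first lookup */
} toy_table_t;

static int cmp_ulong(const void *a, const void *b) {
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* Returns 1 if key is present, 0 otherwise. */
int toy_lookup(toy_table_t *t, unsigned long key) {
    if (t->sorted == NULL && t->count > 0) {
        /* First search: this malloc is the call that re-enters the allocator. */
        t->sorted = malloc(t->count * sizeof *t->sorted);
        if (t->sorted != NULL) {
            for (size_t i = 0; i < t->count; i++) {
                t->sorted[i] = t->unsorted[i];
            }
            qsort(t->sorted, t->count, sizeof *t->sorted, cmp_ulong);
        }
    }
    if (t->sorted != NULL) {
        return bsearch(&key, t->sorted, t->count, sizeof *t->sorted,
                       cmp_ulong) != NULL;
    }
    /* No index available: fall back to a linear scan. */
    for (size_t i = 0; i < t->count; i++) {
        if (t->unsorted[i] == key) {
            return 1;
        }
    }
    return 0;
}
```

The point of interest is that a pure lookup function has an allocation hidden on its first call, which is harmless in ordinary code and a problem when the caller is the allocator's own bootstrap.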

The upshot is, the search code checks for malloc failure and falls back to linear search in that case, so we can fix this by detecting reentrancy early on in bootstrapping and returning null.
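
A minimal sketch of that idea, assuming a single-threaded bootstrap and using made-up names (`guard_alloc`, `guard_bootstrapping`); it is not the patch that was eventually written. The allocator refuses any request that arrives while its own initialization is in progress, instead of trying to take the initialization lock a second time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical state; a real allocator would need this per thread. */
static bool guard_bootstrapping = false;
static bool guard_ready = false;
static int guard_rejected = 0;

void guard_bootstrap(void);

void *guard_alloc(size_t size) {
    if (guard_bootstrapping) {
        /* Reentrant call during bootstrap: fail fast instead of locking. */
        guard_rejected++;
        return NULL;
    }
    if (!guard_ready) {
        guard_bootstrap();
    }
    return malloc(size); /* system malloc stands in for the real backend */
}

void guard_bootstrap(void) {
    guard_bootstrapping = true;
    /* Stands in for the unwinder's allocation during the first backtrace.
     * It receives NULL and is expected to take its slower fallback path. */
    void *scratch = guard_alloc(64);
    (void)scratch;
    guard_bootstrapping = false;
    guard_ready = true;
}
```

This only works because the reentrant caller tolerates NULL; a caller that dereferenced the result unconditionally would trade the deadlock for a crash.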

@jasone, any problems leaping out at you?

jasone · 2017-02-03T18:19:52Z

If we return NULL during bootstrapping, does that cause slow path lookups forever after, or does the lookup code keep trying to allocate? Either way, I wonder if this is a place where more robust reentry support will come in really handy, whether for performance or plain correctness reasons.

davidtgoldblatt · 2017-02-03T18:51:32Z

No, it will keep trying to allocate on each search if it failed the first time (link: https://github.com/gcc-mirror/gcc/blob/035409c33a6cf53ea48956f723c3e7ef2c68a04b/libgcc/unwind-dw2-fde.c#L983 ).
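
The retry behavior can be illustrated with a small sketch. The names are hypothetical and the allocator is passed in as a parameter so a failing one can be substituted; the real libgcc code is structured differently. A failed allocation is not remembered, so each later search tries again until one succeeds.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* Hypothetical table with an optional, lazily allocated index. */
typedef struct {
    const int *items;
    size_t count;
    int *index;         /* NULL until an allocation succeeds */
    int alloc_attempts; /* how many times the index was requested */
} retry_table_t;

/* Allocator that always fails, as a guarded bootstrap would. */
void *failing_alloc(size_t n) {
    (void)n;
    return NULL;
}

/* Returns 1 if key is present, 0 otherwise. */
int retry_contains(retry_table_t *t, int key, alloc_fn alloc) {
    if (t->index == NULL) {
        /* No "already failed" flag: every search without an index retries. */
        t->alloc_attempts++;
        t->index = alloc(t->count * sizeof *t->index);
        if (t->index != NULL) {
            for (size_t i = 0; i < t->count; i++) {
                t->index[i] = t->items[i];
            }
        }
    }
    const int *src = (t->index != NULL) ? t->index : t->items;
    for (size_t i = 0; i < t->count; i++) {
        if (src[i] == key) {
            return 1;
        }
    }
    return 0;
}
```

With this shape, returning NULL during bootstrap costs a slow lookup only for as long as allocation keeps failing; the first search after bootstrap completes builds the index.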

I agree about reentry. Though, note that our current plans can't handle "reentrancy during bootstrapping". I think tracking this down has pushed me into the "we should have a lock-free base allocator" camp.

antirez · 2017-02-09T14:20:39Z

Potentially same issue as this one? redis/redis#3799

davidtgoldblatt · 2017-02-09T17:50:43Z

Hmm, probably not; I've only ever seen this one manifest as blocking deadlocks or crashes. I'll jump in on the redis issue.

antirez · 2017-02-10T11:40:15Z

Thank you @davidtgoldblatt

liuyuxun · 2017-12-13T09:38:34Z

It seems that I've encountered the same problem, even when I use a static libunwind.a to compile:

./autogen.sh 
autoconf

checking for xsltproc... false
checking for arm-linux-gcc... arm-openwrt-linux-gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... yes
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether arm-openwrt-linux-gcc accepts -g... yes
checking for arm-openwrt-linux-gcc option to accept ISO C89... none needed
checking whether compiler is cray... no
checking whether compiler supports -std=gnu11... no
checking whether compiler supports -std=gnu99... yes
checking whether compiler supports -Wall... yes
checking whether compiler supports -Wshorten-64-to-32... no
checking whether compiler supports -Wsign-compare... yes
checking whether compiler supports -Wundef... yes
checking whether compiler supports -pipe... yes
checking whether compiler supports -g3... yes
checking how to run the C preprocessor... arm-openwrt-linux-gcc -E
checking whether we are using the GNU C++ compiler... yes
checking whether arm-openwrt-linux-g++ accepts -g... yes
checking whether arm-openwrt-linux-g++ supports C++14 features by default... no
checking whether arm-openwrt-linux-g++ supports C++14 features with -std=c++14... no
checking whether arm-openwrt-linux-g++ supports C++14 features with -std=c++0x... no
checking whether arm-openwrt-linux-g++ supports C++14 features with +std=c++14... no
checking whether arm-openwrt-linux-g++ supports C++14 features with -h std=c++14... no
configure: No compiler with C++14 support was found
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking whether byte ordering is bigendian... no
checking size of void *... 4
checking size of int... 4
checking size of long... 4
checking size of long long... 8
checking size of intmax_t... 8
checking build system type... x86_64-pc-linux-uclibc
checking host system type... arm-unknown-linux-gnu
checking number of significant virtual address bits... 32
checking for arm-linux-ar... arm-openwrt-linux-ar
checking for gawk... gawk
checking malloc.h usability... yes
checking malloc.h presence... yes
checking for malloc.h... yes
checking whether malloc_usable_size definition can use const argument... yes
checking for library containing log... -lm
checking whether __attribute__ syntax is compilable... yes
checking whether compiler supports -fvisibility=hidden... yes
checking whether compiler supports -fvisibility=hidden... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... yes
checking whether tls_model attribute is compilable... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... yes
checking whether alloc_size attribute is compilable... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... yes
checking whether format(gnu_printf, ...) attribute is compilable... yes
checking whether compiler supports -Werror... yes
checking whether compiler supports -herror_on_warning... yes
checking whether format(printf, ...) attribute is compilable... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for arm-linux-ranlib... no
checking for ranlib... ranlib
configure: WARNING: using cross tools not prefixed with host triplet
checking for ld... /usr/bin/ld
checking for autoconf... /usr/bin/autoconf
checking for memalign... yes
checking for valloc... yes
checking for __libc_calloc... no
checking for __libc_free... no
checking for __libc_malloc... no
checking for __libc_memalign... no
checking for __libc_realloc... no
checking for __libc_valloc... no
checking for __posix_memalign... no
checking whether compiler supports -O3... yes
checking whether compiler supports -O3... yes
checking whether compiler supports -funroll-loops... yes
checking libunwind.h usability... yes
checking libunwind.h presence... no
configure: WARNING: libunwind.h: accepted by the compiler, rejected by the preprocessor!
configure: WARNING: libunwind.h: proceeding with the compiler's result
checking for libunwind.h... yes
checking configured backtracing method... libunwind
checking for sbrk... yes
checking whether utrace(2) is compilable... no
checking whether a program using __builtin_unreachable is compilable... yes
checking whether a program using __builtin_ffsl is compilable... yes
checking LG_PAGE... 12
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking for pthread_create in -lpthread... yes
checking dlfcn.h usability... yes
checking dlfcn.h presence... yes
checking for dlfcn.h... yes
checking for dlsym... no
checking for dlsym in -ldl... yes
checking whether pthread_atfork(3) is compilable... yes
checking whether pthread_setname_np(3) is compilable... no
checking for library containing clock_gettime... none required
checking whether clock_gettime(CLOCK_MONOTONIC_COARSE, ...) is compilable... no
checking whether clock_gettime(CLOCK_MONOTONIC, ...) is compilable... yes
checking whether mach_absolute_time() is compilable... no
checking whether compiler supports -Werror... yes
checking whether syscall(2) is compilable... yes
checking for secure_getenv... no
checking for sched_getcpu... yes
checking for sched_setaffinity... yes
checking for issetugid... no
checking for _malloc_thread_cleanup... no
checking for _pthread_mutex_init_calloc_cb... no
checking for TLS... yes
checking whether C11 atomics is compilable... no
checking whether GCC __atomic atomics is compilable... no
checking whether GCC __sync atomics is compilable... yes
checking whether Darwin OSAtomic*() is compilable... no
checking whether madvise(2) is compilable... yes
checking whether madvise(..., MADV_FREE) is compilable... no
checking whether madvise(..., MADV_DONTNEED) is compilable... yes
checking whether madvise(..., MADV_[NO]HUGEPAGE) is compilable... no
checking whether to force 32-bit __sync_{add,sub}_and_fetch()... no
checking whether to force 64-bit __sync_{add,sub}_and_fetch()... no
checking for __builtin_clz... yes
checking whether Darwin os_unfair_lock_*() is compilable... no
checking whether Darwin OSSpin*() is compilable... no
checking whether glibc malloc hook is compilable... no
checking whether glibc memalign hook is compilable... no
checking whether pthreads adaptive mutexes is compilable... yes
checking for stdbool.h that conforms to C99... yes
checking for _Bool... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating jemalloc.pc
config.status: creating doc/html.xsl
config.status: creating doc/manpages.xsl
config.status: creating doc/jemalloc.xml
config.status: creating include/jemalloc/jemalloc_macros.h
config.status: creating include/jemalloc/jemalloc_protos.h
config.status: creating include/jemalloc/jemalloc_typedefs.h
config.status: creating include/jemalloc/internal/jemalloc_preamble.h
config.status: creating test/test.sh
config.status: creating test/include/test/jemalloc_test.h
config.status: creating config.stamp
config.status: creating bin/jemalloc-config
config.status: creating bin/jemalloc.sh
config.status: creating bin/jeprof
config.status: creating include/jemalloc/jemalloc_defs.h
config.status: include/jemalloc/jemalloc_defs.h is unchanged
config.status: creating include/jemalloc/internal/jemalloc_internal_defs.h
config.status: include/jemalloc/internal/jemalloc_internal_defs.h is unchanged
config.status: creating test/include/test/jemalloc_test_defs.h
config.status: test/include/test/jemalloc_test_defs.h is unchanged
config.status: executing include/jemalloc/internal/public_symbols.txt commands
config.status: executing include/jemalloc/internal/private_symbols.awk commands
config.status: executing include/jemalloc/internal/private_symbols_jet.awk commands
config.status: executing include/jemalloc/internal/public_namespace.h commands
config.status: executing include/jemalloc/internal/public_unnamespace.h commands
config.status: executing include/jemalloc/internal/size_classes.h commands
config.status: executing include/jemalloc/jemalloc_protos_jet.h commands
config.status: executing include/jemalloc/jemalloc_rename.h commands
config.status: executing include/jemalloc/jemalloc_mangle.h commands
config.status: executing include/jemalloc/jemalloc_mangle_jet.h commands
config.status: executing include/jemalloc/jemalloc.h commands
===============================================================================
jemalloc version   : 5.0.1-0-g0
library revision   : 2

CONFIG             : --host=arm-linux --with-lg-hugepage=20 --with-version=5.0.1-0-g0 --prefix=/home/lyx --enable-prof --enable-prof-libunwind --disable-prof-gcc --disable-prof-libgcc --with-static-libunwind=/home/lyx/lib/libunwind.a host_alias=arm-linux CC=arm-openwrt-linux-gcc 'CFLAGS=-fexceptions -I/home/lyx/include' LDFLAGS= CXX=arm-openwrt-linux-g++
CC                 : arm-openwrt-linux-gcc
CONFIGURE_CFLAGS   : -std=gnu99 -Wall -Wsign-compare -Wundef -pipe -g3 -fvisibility=hidden -O3 -funroll-loops
SPECIFIED_CFLAGS   : -fexceptions -I/home/lyx/include
EXTRA_CFLAGS       : 
CPPFLAGS           : -D_GNU_SOURCE -D_REENTRANT
CXX                : arm-openwrt-linux-g++
CONFIGURE_CXXFLAGS : -fvisibility=hidden -O3
SPECIFIED_CXXFLAGS : 
EXTRA_CXXFLAGS     : 
LDFLAGS            : 
EXTRA_LDFLAGS      : 
DSO_LDFLAGS        : -shared -Wl,-soname,$(@F)
LIBS               : -lm  /home/lyx/lib/libunwind.a -lm -lpthread -ldl
RPATH_EXTRA        : 

XSLTPROC           : false
XSLROOT            : 

PREFIX             : /home/lyx
BINDIR             : /home/lyx/bin
DATADIR            : /home/lyx/share
INCLUDEDIR         : /home/lyx/include
LIBDIR             : /home/lyx/lib
MANDIR             : /home/lyx/share/man

srcroot            : 
abs_srcroot        : /home/jemalloc-5.0.1/
objroot            : 
abs_objroot        : /home/jemalloc-5.0.1/

JEMALLOC_PREFIX    : 
JEMALLOC_PRIVATE_NAMESPACE
                   : je_
install_suffix     : 
malloc_conf        : 
autogen            : 0
debug              : 0
stats              : 1
prof               : 1
prof-libunwind     : 1
prof-libgcc        : 0
prof-gcc           : 0
thp                : 0
fill               : 1
utrace             : 0
xmalloc            : 0
lazy_lock          : 0
cache-oblivious    : 1
cxx                : 0
===============================================================================

#0  0xb6e4b4b4 in __lll_lock_wait () from /tmp/debug_lib//libpthread.so.0
#1  0xb6e556bc in pthread_mutex_lock () from /tmp/debug_lib//libpthread.so.0
#2  0xb6ebbe38 in pthread_mutex_lock () from /tmp/debug_lib//libc.so.0
#3  0xb6f26540 in malloc_mutex_lock_final (mutex=0xb6f5ff18) at include/jemalloc/internal/mutex.h:141
#4  je_malloc_mutex_lock_slow (mutex=0xb6f5ff18) at src/mutex.c:83
#5  0xb6ee7c1c in malloc_mutex_lock (tsdn=0x0, mutex=0xb6f5ff18) at include/jemalloc/internal/mutex.h:205
#6  0xb6ee6f50 in malloc_init_hard () at src/jemalloc.c:1452
#7  0xb6eea6f0 in malloc_init () at src/jemalloc.c:216
#8  imalloc (dopts=<synthetic pointer>, sopts=<synthetic pointer>) at src/jemalloc.c:1927
#9  malloc (size=320) at src/jemalloc.c:1977
#10 0xb6eb8634 in __new_exitfn () from /tmp/debug_lib//libc.so.0
#11 0xb6eb8454 in __cxa_atexit () from /tmp/debug_lib//libc.so.0
#12 0xb6f32eac in je_prof_boot2 (tsd=0xb6fa14a8) at src/prof.c:2342
#13 0xb6ee72f8 in malloc_init_hard () at src/jemalloc.c:1484
#14 malloc_init_hard () at src/jemalloc.c:1446
#15 0xb6f96594 in _dl_run_init_array () from /lib/ld-uClibc.so.0
#16 0xb6f9afe8 in _dl_get_ready_to_run () from /lib/ld-uClibc.so.0
#17 0xb6f9b944 in ?? () from /lib/ld-uClibc.so.0
#18 0xb6f9b944 in ?? () from /lib/ld-uClibc.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

liuyuxun · 2017-12-13T09:39:33Z

Do you have a plan to fix this?

davidtgoldblatt · 2017-12-13T18:59:13Z

It's something I'd like to get to in the abstract, but realistically the interactions with libc during bootstrapping are always going to be complex enough that the whack-a-mole game with uncommon libcs is hard to justify. I've got some ideas about how to re-do bootstrapping to fix this class of issues in a more principled way, but we're stretched pretty thin at the moment.

liuyuxun · 2017-12-14T02:25:16Z

@davidtgoldblatt Thanks for your attention.
I am trying jemalloc on an arm-linux platform because our device has very limited computing and memory resources. Our libc is uClibc (0.9.32), which is a very common choice in embedded development, but it has no leak detection mechanism and its memory management is not as good as jemalloc's.
I think jemalloc is a pretty good choice for embedded devices, thanks to both its excellent memory management and its debugging features.

I hope you get fatter. (joking)

Best wishes to you!

According to jemalloc/jemalloc#585, enabling memory profiling can cause a deadlock on some platforms and with some versions of glibc, so this PR removes it from the default configuration to be safe. Signed-off-by: Jay Lee <BusyJayLee@gmail.com>

tamird added a commit to cockroachdb/c-jemalloc that referenced this issue Jan 23, 2017

musl: use gcc intrinsics instead of libgcc

f42bd4f

Using libgcc creates a deadlock on start. See jemalloc/jemalloc#585.

tamird mentioned this issue Jan 23, 2017

musl: use gcc intrinsics instead of libgcc cockroachdb/c-jemalloc#13

Merged

davidtgoldblatt self-assigned this Jan 24, 2017

tamird added a commit to cockroachdb/c-jemalloc that referenced this issue Feb 2, 2017

Control profiling builds via build tag

966e6b9

This allows musl builds to avoid profiling which causes deadlock. See jemalloc/jemalloc#585.

tamird mentioned this issue Feb 2, 2017

Control profiling builds via build tag cockroachdb/c-jemalloc#18

Merged

benesch mentioned this issue May 3, 2017

fix glibc/musl detection cockroachdb/cockroach-go#32

Open

tamird mentioned this issue May 3, 2017

testserver: use glibc binary on systems with eglibc cockroachdb/cockroach-go#31

Merged

gnusi mentioned this issue Feb 14, 2018

use c++14 arangodb/arangodb#4581

Closed

BusyJay mentioned this issue Oct 12, 2020

*: remove setup configuration of jemalloc profile tikv/tikv#8813

Merged

ti-srebot mentioned this issue Oct 13, 2020

*: remove setup configuration of jemalloc profile (#8813) tikv/tikv#8815

Closed

BusyJay added a commit to tikv/jemalloc that referenced this issue Nov 12, 2020

prof: avoid atexit call

0414c99

atexit call can allocate, which may cause deadlock problem like jemalloc#585. Signed-off-by: Jay Lee <BusyJayLee@gmail.com>

BusyJay added a commit to tikv/jemalloc that referenced this issue Nov 12, 2020

prof: avoid atexit call

f03b1ca

atexit call can allocate, which may cause deadlock problem like jemalloc#585. Signed-off-by: Jay Lee <BusyJayLee@gmail.com>

5kbpers mentioned this issue Nov 3, 2021

Switch alpine to centos7 for tikv docker image PingCAP-QE/ci#458

Closed

han-ian mentioned this issue May 23, 2024

Deadlock when enable jemalloc prof. tikv/tikv#17057

Open
