Note: it looks like the replication backlog resize, running in parallel, causes a deadlock and crashes the server. However, it does not happen every time.
<pre>
Core dump:
(gdb) bt
#0 atomic_load_p (mo=atomic_memory_order_relaxed, a=0x18e000) at include/jemalloc/internal/atomic.h:62
#1 rtree_leaf_elm_bits_read (dependent=true, elm=0x18e000, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:175
#2 rtree_leaf_elm_szind_read (dependent=true, elm=0x18e000, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:227
#3 rtree_szind_read (dependent=true, key=140400094347264, rtree_ctx=, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:434
#4 arena_salloc (ptr=, tsdn=) at include/jemalloc/internal/arena_inlines_b.h:191
#5 isalloc (ptr=, tsdn=) at include/jemalloc/internal/jemalloc_internal_inlines_c.h:38
#6 je_malloc_usable_size (ptr=0x7fb171c00000) at src/jemalloc.c:3740
#7 0x000000000065a56f in _Z21getMemoryOverheadDatav () at object.cpp:1094
#8 0x000000000069256f in genRedisInfoString (c=0x0, section=0xb2383c "all") at server.cpp:5687
#9 0x00000000005ee9e1 in logServerInfo () at debug.cpp:1748
#10 0x00000000005eea43 in printCrashReport () at debug.cpp:2013
#11 0x00000000005eeaec in sigsegvHandler (sig=11, info=0x7fb194ef9c70, secret=0x7fb194ef9b40) at debug.cpp:1999
#12
#13 atomic_load_p (mo=atomic_memory_order_relaxed, a=0x18e000) at include/jemalloc/internal/atomic.h:62
#14 rtree_leaf_elm_bits_read (dependent=true, elm=0x18e000, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:175
#15 rtree_leaf_elm_szind_read (dependent=true, elm=0x18e000, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:227
#16 rtree_szind_read (dependent=true, key=140400094347264, rtree_ctx=, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:434
#17 arena_salloc (ptr=, tsdn=) at include/jemalloc/internal/arena_inlines_b.h:191
#18 isalloc (ptr=, tsdn=) at include/jemalloc/internal/jemalloc_internal_inlines_c.h:38
#19 je_malloc_usable_size (ptr=0x7fb171c00000) at src/jemalloc.c:3740
#20 0x000000000065a56f in _Z21getMemoryOverheadDatav () at object.cpp:1094
#21 0x000000000069256f in genRedisInfoString (c=c@entry=0x7fb185c56600, section=0xb26e30 "default") at server.cpp:5687
#22 0x00000000006941e3 in infoCommand (c=, c=) at server.cpp:6358
#23 0x0000000000695621 in call (c=c@entry=0x7fb185c56600, flags=flags@entry=31) at server.cpp:4488
#24 0x0000000000696650 in processCommand (c=0x7fb185c56600, callFlags=31) at server.cpp:5067
#25 0x00000000005b2255 in processCommandAndResetClient (c=c@entry=0x7fb185c56600, flags=flags@entry=31) at networking.cpp:2616
#26 0x00000000005b4e74 in processInputBuffer (c=c@entry=0x7fb185c56600, fParse=fParse@entry=false, callFlags=callFlags@entry=31) at networking.cpp:2772
#27 0x00000000005b6369 in processClients () at networking.cpp:2922
#28 0x00000000006aa4f2 in _Z25runAndPropogateToReplicasIFvvEJEEvPT_DpT0_ (pfn=) at server.h:3902
#29 0x000000000059bc5e in beforeSleep (eventLoop=0x7fb1ab232900) at server.cpp:2807
#30 0x0000000000597868 in aeProcessEvents (eventLoop=eventLoop@entry=0x7fb1ab232900, flags=flags@entry=27) at ae.cpp:710
#31 0x000000000059b957 in aeMain (eventLoop=) at ae.cpp:770
#32 0x00000000006aa584 in _Z16workerThreadMainPv (parg=0x2) at server.cpp:7324
#33 0x00007fb1ab8fdf3b in ?? () from /usr/lib64/libpthread.so.0
#34 0x00007fb1ab835810 in clone () from /usr/lib64/libc.so.6
</pre>
Steps to reproduce this issue:
Start the KeyDB server with repl-backlog-size and repl-backlog-disk-reserve set as follows:
save ""
server-threads 4
repl-backlog-size 1mb
repl-backlog-disk-reserve 100mb
At the same time, run the following info command every 100 milliseconds:
watch -n 0.1 -c 'date;./keydb-cli -h 192.168.0.221 -p 7001 info'
Run the following benchmark command:
./keydb-benchmark -h 192.168.0.221 -c 200 -p 7001 -r 10000 -t set -d 500 -l -q
In object.cpp (https://github.com/Snapchat/KeyDB/blob/main/src/object.cpp, lines 1094-1095), the crash occurs when the memory size of the repl_backlog is added to mem:
if (g_pserver->repl_backlog)
mem += zmalloc_size(g_pserver->repl_backlog);
Do we need to take a lock around the mem += operation here?
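To make the locking question concrete, here is a minimal, self-contained sketch of the general pattern being asked about. It is not KeyDB code: `BacklogState`, `resizeBacklog`, `backlogOverhead`, and `runDemo` are hypothetical names, and the sketch tracks the capacity in a field instead of calling `zmalloc_size`. It assumes the risky part is reading the buffer pointer while another thread replaces the buffer, so the reader and the resizer take the same mutex.

```cpp
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the shared server state; not KeyDB's real types.
struct BacklogState {
    std::mutex lock;           // guards buf and capacity
    char *buf = nullptr;       // plays the role of repl_backlog
    std::size_t capacity = 0;  // tracked size (the issue's code asks the allocator instead)
};

// Writer side: replaces the buffer, as a backlog resize would.
void resizeBacklog(BacklogState &s, std::size_t newCapacity) {
    std::lock_guard<std::mutex> guard(s.lock);
    char *fresh = static_cast<char *>(std::realloc(s.buf, newCapacity));
    if (fresh == nullptr) return;  // keep the old buffer if allocation fails
    s.buf = fresh;
    s.capacity = newCapacity;
}

// Reader side: samples the size under the same lock, so it never
// observes a pointer that the writer is in the middle of replacing.
std::size_t backlogOverhead(BacklogState &s) {
    std::lock_guard<std::mutex> guard(s.lock);
    return s.buf ? s.capacity : 0;
}

// Runs a resizing thread and a polling reader concurrently,
// then returns the final backlog size.
std::size_t runDemo() {
    BacklogState s;
    std::thread writer([&s] {
        for (std::size_t i = 1; i <= 1000; ++i) resizeBacklog(s, i * 1024);
    });
    std::size_t last = 0;
    for (int i = 0; i < 1000; ++i) last = backlogOverhead(s);
    (void)last;  // the polled values are only read to exercise the lock
    writer.join();
    std::size_t finalSize = backlogOverhead(s);
    std::free(s.buf);
    return finalSize;
}
```

In this sketch the accumulation itself needs no protection as long as the running total is a local variable. What is protected is the read of the shared pointer and size. Whether KeyDB should use a dedicated mutex or one of its existing locks for this is a separate design decision that the sketch does not settle.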
Crash report