crash when populate data by using linkbench #3916

Closed
sergw opened this issue Dec 28, 2018 · 6 comments
Labels: crash, luajit, needs feedback (something is unclear with the issue), qa (issues related to tests or testing subsystem)


@sergw
Contributor

sergw commented Dec 28, 2018

Tarantool version: 2.1.1-151-g911139e

OS version: centos 7.4

Bug description:

[s.voronezhskii@sh7 tarantool]$ ~/tarantool/src/tarantool app.lua
tcp_server: remove dead UNIX socket: ./tarantool.sock
started
2018-12-28 11:10:32.840 [27448] main/101/app.lua C> Tarantool 2.1.1-151-g911139e
2018-12-28 11:10:32.840 [27448] main/101/app.lua C> log level 5
2018-12-28 11:10:32.840 [27448] main/101/app.lua I> mapping 268435456 bytes for memtx tuple arena...
2018-12-28 11:10:32.840 [27448] main/101/app.lua I> mapping 536870912 bytes for vinyl tuple arena...
2018-12-28 11:10:32.846 [27448] iproto/101/main I> binary: bound to 0.0.0.0:3301
2018-12-28 11:10:32.846 [27448] main/101/app.lua I> initializing an empty data directory
2018-12-28 11:10:32.865 [27448] snapshot/101/main I> saving snapshot `./00000000000000000000.snap.inprogress'
2018-12-28 11:10:32.903 [27448] snapshot/101/main I> done
2018-12-28 11:10:32.904 [27448] main/101/app.lua I> ready to accept requests
2018-12-28 11:10:32.905 [27448] main/104/checkpoint_daemon I> scheduled next checkpoint for Fri Dec 28 12:24:58 2018
2018-12-28 11:10:32.932 [27448] main C> entering the event loop
Segmentation fault
  code: SEGV_MAPERR
  addr: 0xfffffff5d016f960
  context: 0x7f71d016e740
  siginfo: 0x7f71d016e870
  rax      0x5                5
  rbx      0x7f71d016ed90     140126799195536
  rcx      0xfffffff5d016f960 -43753473696
  rdx      0x7f71d016f918     140126799198488
  rsi      0x0                0
  rdi      0x7f71de93c930     140127042259248
  rsp      0x7f71d016ed00     140126799195392
  rbp      0x7f71d016f030     140126799196208
  r8       0x0                0
  r9       0x7f71cf7f9cd0     140126789278928
  r10      0x7f71cf7febe0     140126789299168
  r11      0x0                0
  r12      0x7f71d016ee80     140126799195776
  r13      0x7f71d016ed90     140126799195536
  r14      0x7f71d016f040     140126799196224
  r15      0x0                0
  rip      0x7f71dafbe0b8     140126981972152
  eflags   0x10246            66118
  cs       0x33               51
  gs       0x0                0
  fs       0x0                0
  cr2      0xfffffff5d016f960 -43753473696
  err      0x5                5
  oldmask  0x4000000          67108864
  trapno   0xe                14
Current time: 1545984911
Please file a bug at http://github.com/tarantool/tarantool/issues
Attempting backtrace... Note: since the server has already crashed,
this may fail as well
#0  0x522bab in print_backtrace+9
#1  0x40cc0c in _ZL12sig_fatal_cbiP9siginfo_tPv+1cb
#2  0x7f71db1d45d0 in _L_unlock_13+34
#3  0x7f71dafbe0b8 in _Unwind_GetTextRelBase+1bf8
#4  0x7f71dafbefb9 in _Unwind_Backtrace+69
#5  0x7f71cf5f316a in _ZN11ProfileData10FlushTableEv+da
#6  0x7f71cf5f396c in _Z24GetStackTraceWithContextPPviiPKv+3c
#7  0x7f71cf5f0aa9 in _init+ee1
#8  0x7f71cf5f1613 in _ZN14ProfileHandler13SignalHandlerEiP9siginfo_tPv+83
#9  0x7f71db1d45d0 in _L_unlock_13+34
#10 0x545dac in lj_vm_pcall+0
#11 0xfffffff5d016f960 in +0
Aborted

Steps to reproduce:

./bin/linkbench -c LinkConfigTarantool.properties -l
@sergw sergw added the crash label Dec 28, 2018
@sergw sergw added this to the QA milestone Dec 28, 2018
@olegrok olegrok modified the milestones: QA, 1.10.3 Jan 6, 2019
@sergw
Contributor Author

sergw commented Jan 10, 2019

Version: 1.10.2-89-g671aada

2019-01-10 11:05:57.277 [31779] main/255/main D> vy_stmt_alloc(format = 26 12, bsize = 139736760975383) = 0x193ab90
2019-01-10 11:05:57.277 [31779] main/255/main D> tuple_delete(0x195f830)
2019-01-10 11:05:57.277 [31779] main/255/main D> vy_tuple_delete(0x195f830)
Segmentation fault
  code: 128
  addr: (nil)
  context: 0x7f1789a6e700
  siginfo: 0x7f1789a6e830
  rax      0x5                5
  rbx      0x7f1789a6ed50     139739070393680
  rcx      0x34346c846866b536 3761750904704251190
  rdx      0x7f1789a6fad0     139739070397136
  rsi      0x0                0
  rdi      0x7f17d05e4930     139740256815408
  rsp      0x7f1789a6ecc0     139739070393536
  rbp      0x7f1789a6eff0     139739070394352
  r8       0x0                0
  r9       0x7f17c13f9cd0     139740003146960
  r10      0x7f17c13febe0     139740003167200
  r11      0x0                0
  r12      0x7f1789a6ee40     139739070393920
  r13      0x7f1789a6ed50     139739070393680
  r14      0x7f1789a6f000     139739070394368
  r15      0x0                0
  rip      0x7f17ccc660b8     139740196528312
  eflags   0x10246            66118
  cs       0x33               51
  gs       0x0                0
  fs       0x0                0
  cr2      0x0                0
  err      0x0                0
  oldmask  0x4000000          67108864
  trapno   0xd                13
Current time: 1547107557
Please file a bug at http://github.com/tarantool/tarantool/issues
Attempting backtrace... Note: since the server has already crashed,
this may fail as well
#0  0x512926 in print_backtrace+9
#1  0x40cb14 in _ZL12sig_fatal_cbiP9siginfo_tPv+1cb
#2  0x7f17cce7c5d0 in _L_unlock_13+34
#3  0x7f17ccc660b8 in _Unwind_GetTextRelBase+1bf8
#4  0x7f17ccc66fb9 in _Unwind_Backtrace+69
#5  0x7f17c11f316a in _ZN11ProfileData10FlushTableEv+da
#6  0x7f17c11f396c in _Z24GetStackTraceWithContextPPviiPKv+3c
#7  0x7f17c11f0aa9 in _init+ee1
#8  0x7f17c11f1613 in _ZN14ProfileHandler13SignalHandlerEiP9siginfo_tPv+83
#9  0x7f17cce7c5d0 in _L_unlock_13+34
#10 0x53a1bf in gc_finalize+88
#11 0x53acc0 in gc_onestep+2ca
#12 0x53add2 in lj_gc_step+8e
#13 0x5b9d9e in lj_trace_exit+2e5
#14 0x537837 in lj_vm_exit_handler+e1
#15 0x34346c846866b536 in +e1
Aborted

Steps to reproduce:
./bin/linkbench -c LinkConfigTarantool.properties -r

@sergw sergw added the vinyl label Jan 10, 2019
@sergw
Contributor Author

sergw commented Jan 15, 2019

Segmentation fault
  code: 128
  addr: (nil)
  context: 0x7f175741dbc0
  siginfo: 0x7f175741dcf0
  rax      0x5                5
  rbx      0x7f175741e210     139738224910864
  rcx      0xffffffffffffff   72057594037927935
  rdx      0x7f175741fd90     139738224917904
  rsi      0x0                0
  rdi      0x7f1765fcd930     139738472044848
  rsp      0x7f175741e180     139738224910720
  rbp      0x7f175741e4b0     139738224911536
  r8       0x0                0
  r9       0x7f1757b87cd0     139738232683728
  r10      0x7f1757b8cbe0     139738232703968
  r11      0x0                0
  r12      0x7f175741e300     139738224911104
  r13      0x7f175741e210     139738224910864
  r14      0x7f175741e4c0     139738224911552
  r15      0x0                0
  rip      0x7f176264f0b8     139738411757752
  eflags   0x10246            66118
  cs       0x33               51
  gs       0x0                0
  fs       0x0                0
  cr2      0x0                0
  err      0x0                0
  oldmask  0x4000000          67108864
  trapno   0xd                13
Current time: 1547554923
Please file a bug at http://github.com/tarantool/tarantool/issues
Attempting backtrace... Note: since the server has already crashed,
this may fail as well
#0  0x512926 in print_backtrace+9
#1  0x40cb14 in _ZL12sig_fatal_cbiP9siginfo_tPv+1cb
#2  0x7f17628655d0 in _L_unlock_13+34
#3  0x7f176264f0b8 in _Unwind_GetTextRelBase+1bf8
#4  0x7f176264ffb9 in _Unwind_Backtrace+69
#5  0x7f175798116a in _ZN11ProfileData10FlushTableEv+da
#6  0x7f175798196c in _Z24GetStackTraceWithContextPPviiPKv+3c
#7  0x7f175797eaa9 in _init+ee1
#8  0x7f175797f613 in _ZN14ProfileHandler13SignalHandlerEiP9siginfo_tPv+83
#9  0x7f17628655d0 in _L_unlock_13+34
#10 0x7f17616585c0 in __GI___printf_fp_l+4f0
#11 0x7f1761657357 in _IO_vfprintf+4ed7
#12 0x7f1761681f39 in vsnprintf+79
#13 0x7f176165d3c2 in snprintf+82
#14 0x50b59b in say_format_plain+c9
#15 0x50d19b in log_vsay+88
#16 0x50c763 in say_default+cc
#17 0x6227c3 in tuple_delete+51
#18 0x622ac8 in tuple_unref+80
#19 0x62408d in box_tuple_unref+38
#20 0x4e0bb3 in lbox_tuple_gc+2d
#21 0x535c8b in lj_BC_FUNCC+34
#22 0x53a0ef in gc_call_finalizer+21f
#23 0x53a42c in gc_finalize+2f5
#24 0x53acc0 in gc_onestep+2ca
#25 0x53add2 in lj_gc_step+8e
#26 0x5b9d9e in lj_trace_exit+2e5
#27 0x537837 in lj_vm_exit_handler+e1
#28 0xffffffffffffff in +e1
Aborted

@sergw sergw added the qa label Jan 16, 2019
@sergw
Contributor Author

sergw commented Jan 16, 2019

After the loading phase, requests fail with this error:

DEBUG 2019-01-16 03:00:45,563 [Thread-0]: getLinkBetween 5121.123456789.0.1544129179776.off=1.lim=10000
ERROR 2019-01-16 03:00:45,563 [Thread-0]: getLinkListrTime failed! org.tarantool.TarantoolException: Failed to dynamically load module '': /tmp/tntTsVHfi/linkbench.so: invalid ELF header
ERROR 2019-01-16 03:00:45,563 [Thread-0]: GET_LINKS_LIST error Failed to dynamically load module '': /tmp/tntTsVHfi/linkbench.so: invalid ELF header
org.tarantool.TarantoolException: Failed to dynamically load module '': /tmp/tntTsVHfi/linkbench.so: invalid ELF header
        at org.tarantool.TarantoolClientImpl.serverError(TarantoolClientImpl.java:403)
        at org.tarantool.TarantoolClientImpl.complete(TarantoolClientImpl.java:416)
        at org.tarantool.TarantoolClientImpl.readThread(TarantoolClientImpl.java:347)
        at org.tarantool.TarantoolClientImpl$2.run(TarantoolClientImpl.java:163)
        at java.lang.Thread.run(Thread.java:748)
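
The `invalid ELF header` message above is what the dynamic loader typically reports when a file does not start with the ELF magic bytes (`0x7f`, `E`, `L`, `F`). A minimal sketch for checking a shared object before trying to load it follows; `has_elf_magic` is a hypothetical helper, not part of Tarantool or linkbench, and the path in the usage note is simply the one from the log.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every valid ELF file


def has_elf_magic(path):
    """Return True if the file at `path` begins with the ELF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC
```

For example, `has_elf_magic("/tmp/tntTsVHfi/linkbench.so")` returning `False` would suggest the temporary copy of the module was truncated or overwritten. A `True` result only confirms the magic bytes, not that the rest of the file is intact.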

When I try to do it manually: linkbench.get_link_list(5121,123456789,0,1544129179776,1,10000)

Segmentation fault
  code: 128
  addr: (nil)
  context: 0x7ff06885c000
  siginfo: 0x7ff06885c130
  rax      0x5                5
  rbx      0x7ff06885c650     140670522476112
  rcx      0xff180441ca6e88   71802525623938696
  rdx      0x7ff06885d420     140670522479648
  rsi      0x0                0
  rdi      0x7ff07726c930     140670767909168
  rsp      0x7ff06885c5c0     140670522475968
  rbp      0x7ff06885c8f0     140670522476784
  r8       0x0                0
  r9       0x7ff068e26cd0     140670528548048
  r10      0x7ff068e2bbe0     140670528568288
  r11      0x0                0
  r12      0x7ff06885c740     140670522476352
  r13      0x7ff06885c650     140670522476112
  r14      0x7ff06885c900     140670522476800
  r15      0x0                0
  rip      0x7ff0738ee0b8     140670707622072
  eflags   0x10246            66118
  cs       0x33               51
  gs       0x0                0
  fs       0x0                0
  cr2      0x0                0
  err      0x0                0
  oldmask  0x4000000          67108864
  trapno   0xd                13
Current time: 1547597199
Please file a bug at http://github.com/tarantool/tarantool/issues
Attempting backtrace... Note: since the server has already crashed,
this may fail as well
#0  0x512926 in print_backtrace+9
#1  0x40cb14 in _ZL12sig_fatal_cbiP9siginfo_tPv+1cb
#2  0x7ff073b045d0 in _L_unlock_13+34
#3  0x7ff0738ee0b8 in _Unwind_GetTextRelBase+1bf8
#4  0x7ff0738eefb9 in _Unwind_Backtrace+69
#5  0x7ff068c2016a in _ZN11ProfileData10FlushTableEv+da
#6  0x7ff068c2096c in _Z24GetStackTraceWithContextPPviiPKv+3c
#7  0x7ff068c1daa9 in _init+ee1
#8  0x7ff068c1e613 in _ZN14ProfileHandler13SignalHandlerEiP9siginfo_tPv+83
#9  0x7ff073b045d0 in _L_unlock_13+34
#10 0x5417fc in lj_tab_set+17e
#11 0x53a347 in gc_finalize+210
#12 0x53acc0 in gc_onestep+2ca
#13 0x53add2 in lj_gc_step+8e
#14 0x5b9d9e in lj_trace_exit+2e5
#15 0x537837 in lj_vm_exit_handler+e1
#16 0xff180441ca6e88 in +e1
Aborted
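
A possible reading of the four register dumps, offered as an assumption rather than something the participants stated: the first report shows SEGV_MAPERR with trapno 0xe (page fault) and a reported address, while the later three show code 128, addr (nil), and trapno 0xd (general protection fault). On x86-64, touching a non-canonical address raises a general protection fault rather than a page fault, which would be consistent with the values in rcx. The sketch below checks that property for the rcx values copied from the dumps; `is_canonical` is a hypothetical helper and 48-bit virtual addressing is assumed.

```python
def is_canonical(addr, va_bits=48):
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# rcx values copied from the four crash reports, in order of appearance.
RCX_VALUES = [
    0xFFFFFFF5D016F960,  # first report: trapno 0xe, SEGV_MAPERR
    0x34346C846866B536,  # second report: trapno 0xd
    0x00FFFFFFFFFFFFFF,  # third report: trapno 0xd
    0x00FF180441CA6E88,  # fourth report: trapno 0xd
]

if __name__ == "__main__":
    for value in RCX_VALUES:
        print(hex(value), "canonical" if is_canonical(value) else "non-canonical")
```

Under these assumptions only the first value is canonical. That matches the split between the page fault and the general protection faults, but it does not by itself say which component corrupted the value.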

@locker
Member

locker commented Jan 19, 2019

According to the stack trace, the issue has nothing to do with vinyl; it looks like a LuaJIT issue.

@locker locker added luajit and removed vinyl labels Jan 19, 2019
@Totktonada
Member

Totktonada commented Jan 22, 2019

$ objdump -T /usr/lib/libprofiler.so | grep _ZN11ProfileData10FlushTableEv
0000000000008730 g    DF .text	000000000000007d  Base        _ZN11ProfileData10FlushTableEv

It is in the google-perftools library. Try commenting out require('gperftools') and all its usages.

Maybe we should move this issue to tarantool/gperftools.
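
The mangled frame names in the backtraces follow the Itanium C++ ABI, where each name component is prefixed with its length. A rough sketch of decoding just the qualified name by hand is shown below; `demangle_name` is a hypothetical helper that ignores parameter types, templates, and substitutions, so a real tool such as `c++filt` is the better choice when it is available.

```python
import re


def demangle_name(symbol):
    """Return the qualified name from a simple Itanium-mangled symbol, or None."""
    if not symbol.startswith("_Z"):
        return None  # not a mangled C++ symbol (e.g. print_backtrace)
    pos = 2
    if symbol[pos:pos + 1] == "L":  # internal-linkage marker, as in _ZL12sig_fatal_cb...
        pos += 1
    nested = symbol[pos:pos + 1] == "N"  # N ... E wraps a scoped name
    if nested:
        pos += 1
    parts = []
    while True:
        match = re.match(r"\d+", symbol[pos:])
        if not match:
            break  # reached 'E' or the parameter encoding
        length = int(match.group())
        start = pos + match.end()
        parts.append(symbol[start:start + length])
        pos = start + length
        if not nested:
            break  # an unscoped name has a single component
    return "::".join(parts) if parts else None
```

Applied to the frames above, `demangle_name("_ZN11ProfileData10FlushTableEv")` gives `ProfileData::FlushTable` and `demangle_name("_ZN14ProfileHandler13SignalHandlerEiP9siginfo_tPv")` gives `ProfileHandler::SignalHandler`, both of which appear between the two signal-handler frames in every backtrace.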

Totktonada added a commit to tarantool/linkbench that referenced this issue Jan 22, 2019
It leads to stucks and crashes like [1].

[1]: tarantool/tarantool#3916
@Totktonada Totktonada added needs feedback and removed luajit labels Jan 22, 2019
@kyukhin kyukhin added the luajit label Mar 15, 2019
@Totktonada
Member

The commit tarantool/luajit@d92380f (propagated into tarantool in be0506d (2.1) and e5e259a (1.10)) states that it fixes this problem; however, it looks like a gperftools problem. I'm not sure. Maybe the bug in LuaJIT really does cause this segfault. I'll open an issue in the gperftools repo and close this one.
