[CRASH] #2183

Closed
caloveri opened this issue Jul 23, 2020 · 9 comments

Comments

@caloveri

caloveri commented Jul 23, 2020

OpenSIPS version you are running

version: opensips 3.0.2 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, HP_MALLOC, DBG_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
main.c compiled on 03:08:34 Jan 29 2020 with gcc 8

Crash Core Dump
https://drive.google.com/drive/folders/17Tw2uE_FgyBiYkymdgYYgNTcotof9oWW?usp=sharing

Describe the traffic that generated the bug
nothing special

To Reproduce
cannot reproduce

Relevant System Logs

Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9030]: CRITICAL:core:sig_usr: segfault in process pid: 9030, id: 7
Jul 23 10:55:53 centos-8gb-hel1-1 systemd[1]: Started Process Core Dump (PID 62467/UID 0).
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9026]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled 100 ms ago (now 1245356330 ms), delaying execution
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9026]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled 200 ms ago (now 1245356430 ms), delaying execution
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9026]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled 300 ms ago (now 1245356530 ms), delaying execution
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9026]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled 380 ms ago (now 1245356610 ms), delaying execution
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:handle_sigs: child process 9030 exited by a signal 11
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:handle_sigs: core was generated
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:handle_sigs: terminating due to SIGCHLD
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9026]: INFO:core:sig_usr: signal 15 received
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9024]: INFO:core:sig_usr: signal 15 received
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9025]: INFO:core:sig_usr: signal 15 received
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:shutdown_opensips: process 1(9024) [MI FIFO] terminated, still waiting for 6 more
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:shutdown_opensips: process 2(9025) [time_keeper] terminated, still waiting for 5 more
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:shutdown_opensips: process 3(9026) [timer] terminated, still waiting for 4 more
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:shutdown_opensips: process 4(9027) [SIP receiver udp:1.2.3.4:5060] terminated, still waiting for 3 more
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:shutdown_opensips: process 6(9029) [SIP receiver udp:1.2.3.4:5060] terminated, still waiting for 2 more
Jul 23 10:55:53 centos-8gb-hel1-1 opensips[9021]: INFO:core:shutdown_opensips: process 8(9031) [Timer handler] terminated, still waiting for 1 more
Jul 23 10:55:53 centos-8gb-hel1-1 systemd-coredump[62468]: Process 9030 (opensips) of user 0 dumped core.#012#012Stack trace of thread 9030:#012#0  0x00007f8b6219a687 __strlen_avx2 (libc.so.6)#012#1  0x00007f8b6208f43f vfprintf (libc.so.6)#012#2  0x00007f8b621486dc __vfprintf_chk (libc.so.6)#012#3  0x00007f8b6213302b __vsyslog_chk (libc.so.6)#012#4  0x00007f8b62133573 __syslog_chk (libc.so.6)#012#5  0x00000000004eb062 fm_free (opensips)#012#6  0x00007f8b5e9cd6d2 free_cell (tm.so)#012#7  0x00007f8b5e9fdd01 delete_cell (tm.so)#012#8  0x00007f8b5e9fe57a timer_routine (tm.so)#012#9  0x0000000000487bea handle_timer_job (opensips)#012#10 0x0000000000578895 io_wait_loop_epoll.constprop.5 (opensips)#012#11 0x000000000057d217 udp_start_processes (opensips)#012#12 0x000000000041b2be main (opensips)#012#13 0x00007f8b620606a3 __libc_start_main (libc.so.6)#012#14 0x000000000041be2e _start (opensips)
Jul 23 10:55:58 centos-8gb-hel1-1 opensips[9021]: INFO:core:cleanup: cleanup
Jul 23 10:55:58 centos-8gb-hel1-1 kernel: traps: opensips[9021] general protection fault ip:7f8b6219a687 sp:7fff75d22078 error:0 in libc-2.28.so[7f8b6203d000+1b9000]
Jul 23 10:55:58 centos-8gb-hel1-1 opensips[9021]: CRITICAL:core:sig_usr: segfault in attendant (starter) process!
Jul 23 10:55:58 centos-8gb-hel1-1 systemd[1]: Started Process Core Dump (PID 62476/UID 0).
Jul 23 10:55:59 centos-8gb-hel1-1 systemd-logind[693]: Removed session 15.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[1]: user-runtime-dir@0.service: Unit not needed anymore. Stopping.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[1]: Stopping User Manager for UID 0...
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Stopped target Default.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Stopped target Basic System.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Stopped target Timers.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Stopped target Sockets.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Closed D-Bus User Message Bus Socket.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Stopped target Paths.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Reached target Shutdown.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[8198]: Starting Exit the Session...
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[1]: user-runtime-dir@0.service: Unit not needed anymore. Stopping.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[1]: Stopped User Manager for UID 0.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[1]: user-runtime-dir@0.service: Unit not needed anymore. Stopping.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[1]: Stopping /run/user/0 mount wrapper...
Jul 23 10:55:59 centos-8gb-hel1-1 systemd[1]: Removed slice User Slice of UID 0.
Jul 23 10:55:59 centos-8gb-hel1-1 systemd-coredump[62477]: Process 9021 (opensips) of user 0 dumped core.#012#012Stack trace of thread 9021:#012#0  0x00007f8b6219a687 __strlen_avx2 (libc.so.6)#012#1  0x00007f8b6208f43f vfprintf (libc.so.6)#012#2  0x00007f8b621486dc __vfprintf_chk (libc.so.6)#012#3  0x00007f8b6213302b __vsyslog_chk (libc.so.6)#012#4  0x00007f8b62133573 __syslog_chk (libc.so.6)#012#5  0x00000000004eb062 fm_free (opensips)#012#6  0x00007f8b5e9cd6d2 free_cell (tm.so)#012#7  0x00007f8b5e9cffc0 free_hash_table (tm.so)#012#8  0x00007f8b5e9c026f tm_shutdown (tm.so)#012#9  0x00000000004b37c9 destroy_modules (opensips)#012#10 0x00000000004af466 cleanup (opensips)#012#11 0x00000000004afe37 shutdown_opensips (opensips)#012#12 0x00000000004b0533 handle_sigs (opensips)#012#13 0x000000000041bb3d main (opensips)#012#14 0x00007f8b620606a3 __libc_start_main (libc.so.6)#012#15 0x000000000041be2e _start (opensips)
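
The two systemd-coredump entries in the log above are hard to read because each newline in the stack trace has been escaped as `#012` (octal 12, the line-feed character). A minimal Python sketch for unfolding such an entry follows. The helper name `unfold_syslog_escapes` is hypothetical, and the sketch assumes every escape is `#` followed by exactly three octal digits, as in this log; a frame number with three octal digits (for example `#100`) would be misread.

```python
import re

def unfold_syslog_escapes(line: str) -> str:
    """Replace '#ooo' octal escapes (e.g. '#012' for newline) with the real character."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), line)

# Shortened excerpt of the entry for process 9030.
sample = ("Process 9030 (opensips) of user 0 dumped core.#012#012"
          "Stack trace of thread 9030:#012"
          "#0  0x00007f8b6219a687 __strlen_avx2 (libc.so.6)#012"
          "#1  0x00007f8b6208f43f vfprintf (libc.so.6)")

print(unfold_syslog_escapes(sample))
```

Frame markers such as `#0` and `#10` are left alone because they are not followed by three octal digits.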

OS/environment information

  • Operating System: CentOS 8.2
  • OpenSIPS installation: /usr/sbin/opensips
  • other relevant information:

Additional context

@rvlad-patrascu
Member

Please open the core files with gdb and post the output of bt full, as the core files are useless without the original binaries.

@caloveri
Author

for core.9021 I got

[New LWP 9021]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `opensips -v'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:93
93 VPCMPEQ (%rdi), %ymm0, %ymm1
(gdb) bt full
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:93
No locals.
#1 0x00007f8b6208f43f in _IO_vfprintf_internal (s=s@entry=0x2391e70,
format=format@entry=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n", ap=0x7fff75d22730) at vfprintf.c:1638
len =
string_malloced = 0
step0_jumps = {0, 3717, 3277, 3173, 4565, 3053, 5181, 4149, 3805, 5061, 4669, 3469, 4861,
4549, 3661, 4789, 3781, 4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 445,
449, 4957}
space = 0
is_short = 0
use_outdigits = 0
outc =
step1_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 5061, 4669, 3469, 4861, 4549, 3661, 4789, 3781,
4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 445, 449, 0}
group = 0
prec =
step2_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4669, 3469, 4861, 4549, 3661, 4789, 3781,
4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 445, 449, 0}
string = 0x3833373136373239 <error: Cannot access memory at address 0x3833373136373239>
left = 0
is_long_double =
width = 0
signed_number =
step3a_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3565, 0, 0, 0, 3661, 4789, 3781, 4765, 3381,
0, 0, 0, 0, 1741, 0, 0, 0, 0, 0, 0}
alt = 0
showsign = 0
is_long =
is_char =
pad = 32 ' '
step3b_jumps = {0 <repeats 11 times>, 4861, 0, 0, 3661, 4789, 3781, 4765, 3381, 2077,
1453, 1237, 2317, 1741, 1685, 805, 1821, 0, 0, 0}
step4_jumps = {0 <repeats 14 times>, 3661, 4789, 3781, 4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 0, 0, 0}
args_value =
is_negative =
number =
base =
the_arg = {pa_wchar = 0 L'\000', pa_int = 0, pa_long_int = 0, pa_long_long_int = 0, pa_u_int = 0, pa_u_long_int = 0, pa_u_long_long_int = 0, pa_double = 0, pa_long_double = , pa_string = 0x0, pa_wstring = 0x0, pa_pointer = 0x0, pa_user = 0x0}
spec = 115 's'
_buffer = {__routine = 0x7f8b623f6000 <_IO_mem_jumps>, __arg = 0x7f8b6208da3a <_IO_vfprintf_internal+1850>, __canceltype = 32, __prev = 0x7fff75d22668}
avail =
thousands_sep = 0x0
grouping = 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>
done = 87
f = 0x5d5fa6 "s: %s(%ld) - aborting!\n"
lead_str_end =
end_of_spec =
work_buffer = "\000\000\000\000\000\000\000\000\001", '\000' <repeats 15 times>, "\063\373\240^\213\177\000\000\000\000\000\000\213\177\000\000H\201\240^\213\177\000\000\000\000\000\000\377\177\000\000\365\377\377\377\377\177\000\000\000\000\000\000\000\000\000\000\200X?b\213\177\000\000h\r\000\000\000\000\000\000G}\240^\213\177\000\000\004\000\000\000\000\000\000\000@}\240^\213\177\000\000\240(\322u\377\177\000\000
}\240^\213\177\000\000\000\000\000\000\000\000\000\000\030\000\000\000\060\000\000\000\300(\322u\377\177\000\000\000(\322u\377\177\000\000\000\000\000\000\377\177\000\000\371\377\377\377\377\177\000\000\000\000\000\000\000\000\000\000\200X?b\213\177\000\000h\r\000\000\000\000\000\000"...
workstart = 0x0
workend =
ap_save = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7fff75d22810, reg_save_area = 0x7fff75d22750}}
nspecs_done =
save_errno =
readonly_format =
PRETTY_FUNCTION = "_IO_vfprintf_internal"
__result =
#2 0x00007f8b621486dc in ___vfprintf_chk (fp=fp@entry=0x2391e70, flag=flag@entry=1, format=format@entry=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n", ap=ap@entry=0x7fff75d22730) at vfprintf_chk.c:33
_IO_acquire_lock_file = 0x2391e70
done =
#3 0x00007f8b6213302b in __GI___vsyslog_chk (pri=, flag=1, fmt=, ap=0x7fff75d22730) at ../misc/syslog.c:222
now_tm = {tm_sec = 58, tm_min = 55, tm_hour = 10, tm_mday = 23, tm_mon = 6, tm_year = 120, tm_wday = 4, tm_yday = 204, tm_isdst = 1, tm_gmtoff = 7200, tm_zone = 0x2347eb0 "CEST"}
now = 1595494558
fd =
f = 0x2391e70
buf = 0x0
bufsize = 0
msgoff = 21
saved_errno = 4
failbuf = "P'\322u\377\177\000\000\000\205\221\000Ё\237\204\004\000\000\000\000\000\000\000p\036\071\002"
clarg =
#4 0x00007f8b62133573 in __syslog_chk (pri=, flag=flag@entry=1, fmt=fmt@entry=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n") at ../misc/syslog.c:129
ap = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7fff75d22818, reg_save_area = 0x7fff75d22750}}
#5 0x00000000004eb062 in syslog (__fmt=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n", __pri=) at /usr/include/bits/syslog.h:31
No locals.
#6 fm_free (fm=, p=0x7f8b5f79b398, file=, func=, line=) at mem/f_malloc_dyn.h:231
f = 0x7f8b5f79b368
n =
FUNCTION = "fm_free"
#7 0x00007f8b5e9cd6d2 in _shm_free_bulk (line=137, function=, file=0x7f8b5ea092c1 "h_table.c", ptr=) at ../../mem/shm_mem.h:487
No locals.
#8 free_cell (dead_cell=0x7f8b5f9ccfb0) at h_table.c:137
b =
i =
rpl =
tt =
foo =
p =
FUNCTION = "free_cell"
#9 0x00007f8b5e9cffc0 in free_hash_table () at h_table.c:357
p_cell =
tmp_cell = 0x0
i = 20207
FUNCTION = "free_hash_table"
#10 0x00007f8b5e9c026f in tm_shutdown () at t_funcs.c:91
FUNCTION = "tm_shutdown"
#11 0x00000000004b37c9 in destroy_modules () at sr_module.c:593
t = 0x7f8b61047480
foo = 0x7f8b61047370
FUNCTION = "destroy_modules"
#12 0x00000000004af466 in cleanup (show_status=show_status@entry=1) at main.c:360
FUNCTION = "cleanup"
#13 0x00000000004afe37 in shutdown_opensips (status=status@entry=139) at main.c:524
proc =
i =
n =
p =
chld_status = 0
FUNCTION = "shutdown_opensips"
#14 0x00000000004b0533 in handle_sigs () at main.c:607
chld = 0
chld_status = 139
overall_status = 139
i =
do_exit = 1
FUNCTION = "handle_sigs"
#15 0x000000000041bb3d in main_loop () at main.c:869
startup_done =
last_check = 0
rc =
chd_rank = 5
startup_done =
last_check =
rc =
FUNCTION = "main_loop"
#16 main (argc=, argv=) at main.c:1482
c =
r =
tmp = 0x7fff75d22b06 ""
tmp_len =
port =
proto =
protos_no =
options = 0x5cc3f0 "f:cCm:M:b:l:n:N:rRvdDFEVhw:t:u:g:p:P:G:W:o:a:k:s:"
seed = 1409082643
rfd =
FUNCTION = "main"

@caloveri
Author

for core 9030 I got

GNU gdb (GDB) Red Hat Enterprise Linux 8.2-11.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/opensips...Reading symbols from /usr/lib/debug/usr/sbin/opensips-3.0.2-1.el8.x86_64.debug...done.
done.
[New LWP 9030]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `opensips -v'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:93
93 VPCMPEQ (%rdi), %ymm0, %ymm1
(gdb) bt full
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:93
No locals.
#1 0x00007f8b6208f43f in _IO_vfprintf_internal (s=s@entry=0x2391e70,
format=format@entry=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n", ap=0x7fff75d225d0) at vfprintf.c:1638
len =
string_malloced = 0
step0_jumps = {0, 3717, 3277, 3173, 4565, 3053, 5181, 4149, 3805, 5061, 4669, 3469, 4861,
4549, 3661, 4789, 3781, 4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 445,
449, 4957}
space = 0
is_short = 0
use_outdigits = 0
outc =
step1_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 5061, 4669, 3469, 4861, 4549, 3661, 4789, 3781,
4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 445, 449, 0}
group = 0
prec =
step2_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4669, 3469, 4861, 4549, 3661, 4789, 3781,
4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 445, 449, 0}
string = 0x3137303036343135 <error: Cannot access memory at address 0x3137303036343135>
left = 0
is_long_double =
width = 0
signed_number =
step3a_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3565, 0, 0, 0, 3661, 4789, 3781, 4765, 3381,
0, 0, 0, 0, 1741, 0, 0, 0, 0, 0, 0}
alt = 0
showsign = 0
is_long =
is_char =
pad = 32 ' '
step3b_jumps = {0 <repeats 11 times>, 4861, 0, 0, 3661, 4789, 3781, 4765, 3381, 2077,
1453, 1237, 2317, 1741, 1685, 805, 1821, 0, 0, 0}
step4_jumps = {0 <repeats 14 times>, 3661, 4789, 3781, 4765, 3381, 2077, 1453, 1237, 2317, 1741, 1685, 805, 1821, 0, 0, 0}
args_value =
is_negative =
number =
base =
the_arg = {pa_wchar = 1648320512 L'\x623f6000', pa_int = 1648320512, pa_long_int = 140236625502208, pa_long_long_int = 140236625502208, pa_u_int = 1648320512, pa_u_long_int = 140236625502208, pa_u_long_long_int = 140236625502208, pa_double = 6.9286098949345421e-310, pa_long_double = , pa_string = 0x7f8b623f6000 <_IO_mem_jumps> "", pa_wstring = 0x7f8b623f6000 <_IO_mem_jumps> L"", pa_pointer = 0x7f8b623f6000 <_IO_mem_jumps>, pa_user = 0x7f8b623f6000 <_IO_mem_jumps>}
spec = 115 's'
_buffer = {__routine = 0x0, __arg = 0x0, __canceltype = 1976705360, __prev = 0xffffffffffffffff}
_avail =
thousands_sep = 0x0
grouping = 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>
done = 87
f = 0x5d5fa6 "s: %s(%ld) - aborting!\n"
lead_str_end =
end_of_spec =
work_buffer = "\f\000\000\000\377\177\000\000\f", '\000' <repeats 15 times>, "\200X?b\213\177\000\000h\r\000\000\000\000\000\000\a\376\240^\213\177", '\000' <repeats 11 times>, "\376\240^\213\177\000\000\000\000\000\000\213\177\000\000!\376\240^\213\177\000\000 \000\000\000\377\177\000\000\230%\322u\377\177\000\000\f\000\000\000\377\177\000\000\f", '\000' <repeats 15 times>, "\200X?b\213\177\000\000h\r\000\000\000\000\000\000O\377\240^\213\177\000\000\000\000\000\000\000\000\000\000H\377\240^\213\177\000\000\020 \000\000\000\000\000\000k\377\240^\213\177\000\000\000 \000\000\000\000\000\000\030\000\000\000\060\000\000\000\220'\322u\377\177\000\000"...
workstart = 0x0
workend =
ap_save = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7fff75d226b0, reg_save_area = 0x7fff75d225f0}}
nspecs_done =
save_errno =
readonly_format =
PRETTY_FUNCTION = "_IO_vfprintf_internal"
__result =
#2 0x00007f8b621486dc in ___vfprintf_chk (fp=fp@entry=0x2391e70, flag=flag@entry=1, format=format@entry=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n", ap=ap@entry=0x7fff75d225d0) at vfprintf_chk.c:33
_IO_acquire_lock_file = 0x2391e70
done =
#3 0x00007f8b6213302b in __GI___vsyslog_chk (pri=, flag=1, fmt=, ap=0x7fff75d225d0) at ../misc/syslog.c:222
now_tm = {tm_sec = 53, tm_min = 55, tm_hour = 10, tm_mday = 23, tm_mon = 6, tm_year = 120, tm_wday = 4, tm_yday = 204, tm_isdst = 1, tm_gmtoff = 7200, tm_zone = 0x2347eb0 "CEST"}
now = 1595494553
fd =
f = 0x2391e70
buf = 0x0
bufsize = 0
msgoff = 21
saved_errno = 0
failbuf = "\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\025\000\000\000"
clarg =
#4 0x00007f8b62133573 in __syslog_chk (pri=, flag=flag@entry=1, fmt=fmt@entry=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n") at ../misc/syslog.c:129
ap = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7fff75d226b8, reg_save_area = 0x7fff75d225f0}}
#5 0x00000000004eb062 in syslog (__fmt=0x5d5f60 "CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting!\n", __pri=) at /usr/include/bits/syslog.h:31
No locals.
#6 fm_free (fm=, p=0x7f8b5f646f18, file=, func=, line=) at mem/f_malloc_dyn.h:231
f = 0x7f8b5f646ee8
n =
FUNCTION = "fm_free"
#7 0x00007f8b5e9cd6d2 in _shm_free_bulk (line=137, function=, file=0x7f8b5ea092c1 "h_table.c", ptr=) at ../../mem/shm_mem.h:487
No locals.
#8 free_cell (dead_cell=dead_cell@entry=0x7f8b5f775ca8) at h_table.c:137
b =
i =
rpl =
tt =
foo =
p =
FUNCTION = "free_cell"
#9 0x00007f8b5e9fdd01 in delete_cell (p_cell=p_cell@entry=0x7f8b5f775ca8, unlock=unlock@entry=1) at timer.c:239
FUNCTION = "delete_cell"
#10 0x00007f8b5e9fe57a in wait_handler (wait_tl=0x7f8b5f775d28) at timer.c:458
p_cell = 0x7f8b5f775ca8
p_cell =
FUNCTION = "wait_handler"
__mptr =
#11 timer_routine (ticks=1245356, set=) at timer.c:1091
tl = 0x7f8b5f775d28
tmp_tl = 0x0
id = 2
FUNCTION = "timer_routine"
#12 0x0000000000487bea in handle_timer_job () at timer.c:864
t = 0x7f8b5f38def8
l =
FUNCTION = "handle_timer_job"
#13 0x0000000000578895 in handle_io (idx=3, event_type=1, fm=0x7f8b61082528) at net/net_udp.c:276
n = 0
read = 0
n =
read =
FUNCTION = "handle_io"
#14 io_wait_loop_epoll (repeat=0, t=1, h=) at net/../io_wait_loop.h:280
ret =
n = 1
r = 3
i =
e = 0x7f8b61082528
ep_event = {events = 4919000, data = {ptr = 0x61046ea000000000, fd = 0, u32 = 0, u64 = 6990834155059675136}}
fd =
FUNCTION = "io_wait_loop_epoll"
#15 0x000000000057d217 in udp_start_processes (chd_rank=chd_rank@entry=0x975fdc <chd_rank>, startup_done=startup_done@entry=0x0) at net/net_udp.c:496
si = 0x7f8b61046ea0
p_id =
i = 3
p =
FUNCTION = "udp_start_processes"
#16 0x000000000041b2be in main_loop () at main.c:797
startup_done = 0x0
last_check = 0
rc =
chd_rank = 4
startup_done =
last_check =
rc =
FUNCTION = "main_loop"
#17 main (argc=, argv=) at main.c:1482
c =
r =
tmp = 0x7fff75d22b06 ""
tmp_len =
port =
proto =
protos_no =
options = 0x5cc3f0 "f:cCm:M:b:l:n:N:rRvdDFEVhw:t:u:g:p:P:G:W:o:a:k:s:"
seed = 1409082643
rfd =
FUNCTION = "main"

@bogdan-iancu
Member

@caloveri, the real crash is in process 9030 (second backtrace). Do you still have the message matching CRITICAL:core:%s: freeing already freed %s pointer (%p), first free: %s: %s(%ld) - aborting! in the logs? It should appear right above the CRITICAL line reporting the crash of process 9030.
Also, do you still have the core file and the matching binaries?

@caloveri
Author

caloveri commented Jul 31, 2020 via email

@bogdan-iancu
Member

In frame 6, could you try to print *f from gdb?

@caloveri
Author

caloveri commented Aug 21, 2020

(gdb) frame 6
#6 fm_free (fm=, p=0x7f8b5f646f18, file=, func=,
line=) at mem/f_malloc_dyn.h:231
231 check_double_free(p, f, fm);
(gdb) print *f
$1 = {size = 24, u = {nxt_free = 0x7f8b61148b20, reserved = 140236605917984},
prev = 0x7f8b611436a8,
file = 0x3137303036343135 <error: Cannot access memory at address 0x3137303036343135>,
func = 0x78614d0a0d383233 <error: Cannot access memory at address 0x78614d0a0d383233>,
line = 7237954716786705965}
(gdb)
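
One way to read the `file`, `func` and `line` values printed above: all three look like printable ASCII rather than two pointers and a line number. The Python sketch below reinterprets them as 8-byte little-endian words (the build is x86_64). It is an illustration only; the helper name `as_ascii` is made up here, and the thread does not establish what actually overwrote the chunk header.

```python
def as_ascii(value: int) -> str:
    """Interpret a 64-bit value as 8 little-endian bytes and decode them as ASCII."""
    return value.to_bytes(8, "little").decode("ascii")

# Values copied from the `print *f` output in frame 6 (core 9030).
header_words = {
    "file": 0x3137303036343135,
    "func": 0x78614d0a0d383233,
    "line": 7237954716786705965,
}

for name, value in header_words.items():
    print(f"{name}: {as_ascii(value)!r}")

# Concatenated in struct order, the three fields read as one text fragment.
fragment = "".join(as_ascii(v) for v in header_words.values())
print(repr(fragment))
```

The decoded bytes form `51460071328\r\nMax-Forward`, which resembles part of a SIP message: some digits, a CRLF, then the start of a `Max-Forwards` header. The same value 0x3137303036343135 appears as `string` in frame 1 of the 9030 backtrace, which is where strlen faulted while formatting the `%s` arguments. This would be consistent with the debug fields of the chunk having been overwritten by message data, so the "freeing already freed" check may have tripped on a corrupted header rather than a genuine double free. That is a reading of the dump, not a confirmed diagnosis.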

@stale

stale bot commented Sep 5, 2020

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

@stale stale bot added the stale label Sep 5, 2020
@stale

stale bot commented Oct 10, 2020

Marking as closed due to lack of progress for more than 30 days. If this issue is still relevant, please re-open it with additional details.

@stale stale bot closed this as completed Oct 10, 2020