coredump: dmq_usrloc sync from node in "status: disabled" #2451

Closed
marcinkowalczyk opened this issue Aug 20, 2020 · 6 comments

@marcinkowalczyk

Description

I'm running two Kamailio registrars with DMQ replication:

modparam("usrloc", "db_mode", 0)
modparam("usrloc", "use_domain", 1)

loadmodule "dmq.so"
modparam("dmq", "server_address", "sip:own_ip:5062" )
modparam("dmq", "notification_address",  "sip:peer_ip:5062")
modparam("dmq", "ping_interval", 15)

modparam("dmq_usrloc", "enable", 1)
modparam("dmq_usrloc", "sync", 1)
modparam("dmq_usrloc", "replicate_socket_info", 1)
modparam("dmq_usrloc", "usrloc_delete", 1)

Replication works fine: I can see contacts on both registrars. The problem starts when I restart one of the registrars: the other one crashes as soon as it tries to sync with the restarted one.

I investigated further: if the first node starts again while it is still in "active" status in DMQ, the donor node crashes.

# kamcmd dmq.list_nodes
{
        host: 10.0.210.67
        port: 5062
        resolved_ip: 10.0.210.67
        status: disabled
        last_notification: 0
        local: 0
}
{
        host: 10.0.210.58
        port: 5062
        resolved_ip: 10.0.210.58
        status: active
        last_notification: 0
        local: 1
}

If the first node stays down a bit longer (so that DMQ marks it as pending), the crash does not happen and the sync succeeds.

# kamcmd dmq.list_nodes
{
        host: 10.0.210.67
        port: 5062
        resolved_ip: 10.0.210.67
        status: pending
        last_notification: 0
        local: 0
}
{
        host: 10.0.210.58
        port: 5062
        resolved_ip: 10.0.210.58
        status: active
        last_notification: 0
        local: 1
}

Troubleshooting

Reproduction

Restart one of the nodes while all cluster members are in the active state.

Debugging Data

[root /]# gdb /usr/sbin/kamailio /core.50795
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/kamailio...Reading symbols from /usr/sbin/kamailio...(no debugging symbols found)...done.
(no debugging symbols found)...done.
[New LWP 50795]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/kamailio -DD -P /run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f7027eaca38 in usrloc_dmq_send_contact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
Missing separate debuginfos, use: debuginfo-install kamailio-5.4.0-0.el7.centos.x86_64
(gdb) backtrace
#0  0x00007f7027eaca38 in usrloc_dmq_send_contact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#1  0x00007f7027ea29de in usrloc_get_all_ucontact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#2  0x00007f7027ea5e3c in usrloc_dmq_execute_action () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#3  0x00007f7027ea8937 in usrloc_dmq_handle_msg () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#4  0x00007f7028c43b88 in worker_loop () from /usr/lib64/kamailio/modules/dmq.so
#5  0x00007f7028c41304 in child_init () from /usr/lib64/kamailio/modules/dmq.so
#6  0x000000000057c313 in init_mod_child ()
#7  0x000000000057bf88 in init_mod_child ()
#8  0x000000000057bf88 in init_mod_child ()
#9  0x000000000057bf88 in init_mod_child ()
#10 0x000000000057bf88 in init_mod_child ()
#11 0x000000000057bf88 in init_mod_child ()
#12 0x000000000057bf88 in init_mod_child ()
#13 0x000000000057cab2 in init_child ()
#14 0x000000000042ab0d in main_loop ()
#15 0x0000000000433a76 in main ()
(gdb) bt full
#0  0x00007f7027eaca38 in usrloc_dmq_send_contact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#1  0x00007f7027ea29de in usrloc_get_all_ucontact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#2  0x00007f7027ea5e3c in usrloc_dmq_execute_action () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#3  0x00007f7027ea8937 in usrloc_dmq_handle_msg () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#4  0x00007f7028c43b88 in worker_loop () from /usr/lib64/kamailio/modules/dmq.so
No symbol table info available.
#5  0x00007f7028c41304 in child_init () from /usr/lib64/kamailio/modules/dmq.so
No symbol table info available.
#6  0x000000000057c313 in init_mod_child ()
No symbol table info available.
#7  0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#8  0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#9  0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#10 0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#11 0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#12 0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#13 0x000000000057cab2 in init_child ()
No symbol table info available.
#14 0x000000000042ab0d in main_loop ()
No symbol table info available.
#15 0x0000000000433a76 in main ()
No symbol table info available.
(gdb) quit
[root /]#

Log Messages

Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53012]: ALERT: <core> [main.c:777]: handle_sigs(): child process 53050 exited by a signal 11
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53012]: ALERT: <core> [main.c:780]: handle_sigs(): core was not generated
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53012]: INFO: <core> [main.c:802]: handle_sigs(): terminating due to SIGCHLD
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53021]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53014]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53022]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53013]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53023]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53024]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53018]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53028]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53025]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53031]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53026]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53032]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53027]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53035]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53047]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53054]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53038]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53053]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53045]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53057]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53079]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53072]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53033]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53074]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53070]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53077]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53056]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53019]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53036]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53040]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53075]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53043]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53041]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53030]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53020]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53052]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53029]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53015]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53017]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53016]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received

SIP Traffic

Possible Solutions

shared db sync

Additional Information

# kamailio  -v
version: kamailio 5.4.0 (x86_64/linux) 6c4fce
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 6c4fce
compiled on 17:15:32 Jul 29 2020 with gcc 4.8.5

  • Operating System:
CentOS Linux release 7.8.2003 (Core)
 3.10.0-1127.18.2.el7.x86_64 #1 SMP Sun Jul 26 15:27:06 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

@miconda

miconda commented Aug 24, 2020

Can you install the debug symbols package and then grab the backtrace again with gdb? It will then show the file and line for each frame of the backtrace.

@marcinkowalczyk

Hi,

I hope it's OK now:

[root /]# gdb /usr/sbin/kamailio core.70775
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/kamailio...Reading symbols from /usr/lib/debug/usr/sbin/kamailio.debug...done.
done.
[New LWP 70775]
Missing separate debuginfo for /usr/lib64/mysql/libmysqlclient.so.18
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/b5/d3caa8c105db5c047c10ed4fbe66308e69a258.debug
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/kamailio -DD -P /run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f3b39dfca38 in usrloc_dmq_send_contact (ptr=0x7f3b3b34bf28, aor=..., action=1, node=0x7f3b3b319868) at usrloc_sync.c:749
749                     srjson_AddStrToObject(&jdoc, jdoc.root, "sock", ptr->sock->sock_str.s, ptr->sock->sock_str.len);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-307.el7.1.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-46.el7.x86_64 libcom_err-1.42.9-17.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-15.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) backtrace
#0  0x00007f3b39dfca38 in usrloc_dmq_send_contact (ptr=0x7f3b3b34bf28, aor=..., action=1, node=0x7f3b3b319868) at usrloc_sync.c:749
#1  0x00007f3b39df29de in usrloc_get_all_ucontact (node=0x7f3b3b319868) at usrloc_sync.c:255
#2  0x00007f3b39df5e3c in usrloc_dmq_execute_action (jdoc_action=0x27a0340, node=0x7f3b3b319868) at usrloc_sync.c:433
#3  0x00007f3b39df8937 in usrloc_dmq_handle_msg (msg=0x7f3b3b6f8228, resp=0x7fffe405f420, node=0x7f3b3b319868) at usrloc_sync.c:521
#4  0x00007f3b3ab93b88 in worker_loop (id=0) at worker.c:113
#5  0x00007f3b3ab91304 in child_init (rank=0) at dmq.c:296
#6  0x000000000057c313 in init_mod_child (m=0x7f3b51714998, rank=0) at core/sr_module.c:780
#7  0x000000000057bf88 in init_mod_child (m=0x7f3b517156b0, rank=0) at core/sr_module.c:776
#8  0x000000000057bf88 in init_mod_child (m=0x7f3b517179c8, rank=0) at core/sr_module.c:776
#9  0x000000000057bf88 in init_mod_child (m=0x7f3b51718dd0, rank=0) at core/sr_module.c:776
#10 0x000000000057bf88 in init_mod_child (m=0x7f3b517198c0, rank=0) at core/sr_module.c:776
#11 0x000000000057bf88 in init_mod_child (m=0x7f3b5171a068, rank=0) at core/sr_module.c:776
#12 0x000000000057bf88 in init_mod_child (m=0x7f3b5171b1e8, rank=0) at core/sr_module.c:776
#13 0x000000000057cab2 in init_child (rank=0) at core/sr_module.c:825
#14 0x000000000042ab0d in main_loop () at main.c:1763
#15 0x0000000000433a76 in main (argc=10, argv=0x7fffe40604b8) at main.c:2856
(gdb) bt full
#0  0x00007f3b39dfca38 in usrloc_dmq_send_contact (ptr=0x7f3b3b34bf28, aor=..., action=1, node=0x7f3b3b319868) at usrloc_sync.c:749
        jdoc = {root = 0x27bd930, flags = 0, buf = {s = 0x0, len = 0}, malloc_fn = 0x41b430 <malloc@plt>, free_fn = 0x41ad90 <free@plt>}
        __FUNCTION__ = "usrloc_dmq_send_contact"
#1  0x00007f3b39df29de in usrloc_get_all_ucontact (node=0x7f3b3b319868) at usrloc_sync.c:255
        rval = 0
        len = 504820
        buf = 0x7f3b542ec010
        cp = 0x7f3b542ec145
        c = {s = 0x7f3b542ec0f6 "sip:konsultant@10.14.5.93:5060", len = 30}
        recv = {s = 0x7f3b542ec118 "", len = 0}
        path = {s = 0x0, len = 0}
        ruid = {s = 0x7f3b542ec12c "uloc-5f437d69-5ce9-eb\001\004\344\340\035", len = 21}
        aorhash = 3773039617
        send_sock = 0x0
        flags = 0
        aor = {s = 0x7f3b3b34be98 "047350@konsultant.voice.int.ccig.pl", len = 35}
        r = 0x7f3b3b34bdf0
        _d = 0x7f3b3b2f3760
        ptr = 0x7f3b3b34bf28
        res = 0
        n = 3
        __FUNCTION__ = "usrloc_get_all_ucontact"
#2  0x00007f3b39df5e3c in usrloc_dmq_execute_action (jdoc_action=0x27a0340, node=0x7f3b3b319868) at usrloc_sync.c:433
        ci = {ruid = {s = 0x0, len = 0}, c = 0x7f3b3a0033e0 <c.13717>, received = {s = 0x0, len = 0}, path = 0x7f3b3a003400 <path.13719>, expires = 0, q = 0, callid = 0x7f3b3a003410 <callid.13718>, cseq = 0, flags = 0, cflags = 0, user_agent = 0x7f3b3a003420 <user_agent.13720>, sock = 0x0, methods = 0, instance = {s = 0x0, len = 0}, reg_id = 0, server_id = 0,
          tcpconn_id = -1, keepalive = 0, xavp = 0x0, last_modified = 0}
        it = 0x0
        sock = 0x0
        action = 3
        expires = 0
        cseq = 0
        flags = 0
        cflags = 0
        q = 0
        last_modified = 0
        methods = 0
        reg_id = 0
        server_id = 0
        port = 0
        proto = 0
        aor = {s = 0x0, len = 0}
        ruid = {s = 0x0, len = 0}
        received = {s = 0x0, len = 0}
        instance = {s = 0x0, len = 0}
        host = {s = 0x27bd544 "", len = 11}
        c = {s = 0x27bd160 "p\004z\002", len = 30}
        callid = {s = 0x27bd390 "", len = 43}
        path = {s = 0x27a0390 "P\337{\002", len = 0}
        user_agent = {s = 0x27bd2e0 "\004", len = 22}
        __FUNCTION__ = "usrloc_dmq_execute_action"
#3  0x00007f3b39df8937 in usrloc_dmq_handle_msg (msg=0x7f3b3b6f8228, resp=0x7fffe405f420, node=0x7f3b3b319868) at usrloc_sync.c:521
        content_length = 12
        body = {s = 0x7f3b3b6f8ba9 "{\"action\":3}", len = 12}
        jdoc = {root = 0x27a02f0, flags = 0, buf = {s = 0x7f3b3b6f8ba9 "{\"action\":3}", len = 12}, malloc_fn = 0x41b430 <malloc@plt>, free_fn = 0x41ad90 <free@plt>}
        __FUNCTION__ = "usrloc_dmq_handle_msg"
#4  0x00007f3b3ab93b88 in worker_loop (id=0) at worker.c:113
        worker = 0x7f3b3b2d1008
        current_job = 0x7f3b3b6e12f8
        peer_response = {resp_code = 0, content_type = {s = 0x0, len = 0}, reason = {s = 0x0, len = 0}, body = {s = 0x0, len = 0}}
        ret_value = 0
        not_parsed = 1
        dmq_node = 0x7f3b3b319868
        __FUNCTION__ = "worker_loop"
#5  0x00007f3b3ab91304 in child_init (rank=0) at dmq.c:296
        i = 0
        newpid = 0
        __FUNCTION__ = "child_init"
#6  0x000000000057c313 in init_mod_child (m=0x7f3b51714998, rank=0) at core/sr_module.c:780
        __FUNCTION__ = "init_mod_child"
#7  0x000000000057bf88 in init_mod_child (m=0x7f3b517156b0, rank=0) at core/sr_module.c:776
        __FUNCTION__ = "init_mod_child"
#8  0x000000000057bf88 in init_mod_child (m=0x7f3b517179c8, rank=0) at core/sr_module.c:776
        __FUNCTION__ = "init_mod_child"
#9  0x000000000057bf88 in init_mod_child (m=0x7f3b51718dd0, rank=0) at core/sr_module.c:776
        __FUNCTION__ = "init_mod_child"
#10 0x000000000057bf88 in init_mod_child (m=0x7f3b517198c0, rank=0) at core/sr_module.c:776
        __FUNCTION__ = "init_mod_child"
#11 0x000000000057bf88 in init_mod_child (m=0x7f3b5171a068, rank=0) at core/sr_module.c:776
        __FUNCTION__ = "init_mod_child"
#12 0x000000000057bf88 in init_mod_child (m=0x7f3b5171b1e8, rank=0) at core/sr_module.c:776
        __FUNCTION__ = "init_mod_child"
#13 0x000000000057cab2 in init_child (rank=0) at core/sr_module.c:825
        ret = 1367153016
        type = 0x809065 "PROC_MAIN"
        __FUNCTION__ = "init_child"
#14 0x000000000042ab0d in main_loop () at main.c:1763
        i = 8
        pid = 70771
        si = 0x0
---Type <return> to continue, or q <return> to quit---
        si_desc = "udp receiver child=7 sock=10.0.210.58:5062\000Q;\177\000\000\253\037\177", '\000' <repeats 13 times>, "\320\003\006\344\377\177\000\000\070\370|\000\000\000\000\000*\000\000\000\000\000\000\000\000\331uS;\177\000\000\203a\201\000\000\000\000\000\217\331uS;\177\000\000\300\270A\000\000\000\000\000\220\304{Q;\177\000"
        nrprocs = 8
        woneinit = 1
        __FUNCTION__ = "main_loop"
#15 0x0000000000433a76 in main (argc=10, argv=0x7fffe40604b8) at main.c:2856
        cfg_stream = 0x26cc010
        c = -1
        r = 0
        tmp = 0x7fffe406181a ""
        tmp_len = 2496
        port = 2496
        proto = 1472
        ahost = 0x0
        aport = 0
        options = 0x7d2498 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 4061656402
        rfd = 4
        debug_save = 0
        debug_flag = 0
        dont_fork_cnt = 2
        n_lst = 0x7f3b537e9160 <intel_02_known>
        p = 0xf0b5ff <Address 0xf0b5ff out of bounds>
        st = {st_dev = 20, st_ino = 377534, st_nlink = 2, st_mode = 16832, st_uid = 997, st_gid = 995, __pad0 = 0, st_rdev = 0, st_size = 80, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1597913469, tv_nsec = 499342371}, st_mtim = {tv_sec = 1598258538, tv_nsec = 910898271}, st_ctim = {tv_sec = 1598258538, tv_nsec = 910898271}, __unused = {0, 0, 0}}
        tbuf = '\000' <repeats 376 times>...
        option_index = 0
        long_options = {{name = 0x7d468f "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x7cfc94 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x7d4694 "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x7d469a "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x7d46a0 "substdef", has_arg = 1, flag = 0x0, val = 1026}, {
            name = 0x7d46a9 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x7d46b3 "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x7d46bd "loadmodule", has_arg = 1, flag = 0x0, val = 1029}, {name = 0x7d46c8 "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name = 0x7d46d1 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {
            name = 0x7d46dc "debug", has_arg = 1, flag = 0x0, val = 1032}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
        __FUNCTION__ = "main"
(gdb)

And the log file:

Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70736]: ALERT: <core> [main.c:777]: handle_sigs(): child process 70775 exited by a signal 11
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70736]: ALERT: <core> [main.c:780]: handle_sigs(): core was generated
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70736]: INFO: <core> [main.c:802]: handle_sigs(): terminating due to SIGCHLD
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70748]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70737]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70749]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70738]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70739]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70750]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70740]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70741]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70751]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70742]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70743]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70752]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70744]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70745]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70769]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70746]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70747]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70755]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70772]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70756]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70778]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70758]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70780]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70760]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70781]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70762]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70783]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70768]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70785]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70771]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70787]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70773]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70793]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70774]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70794]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70776]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70789]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70791]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70766]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70753]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 24 10:56:06 10.0.210.58 kamailio-dispatcher[70736]: INFO: <core> [core/sctp_core.c:53]: sctp_core_destroy(): SCTP API not initialized

@miconda

miconda commented Aug 24, 2020

Can you get the output of the following gdb commands?

frame 0
list
p *ptr
p*ptr->sock

@marcinkowalczyk

Hi Dan

(gdb) frame 0
#0  0x00007f3b39dfca38 in usrloc_dmq_send_contact (ptr=0x7f3b3b34bf28, aor=..., action=1, node=0x7f3b3b319868) at usrloc_sync.c:749
749                     srjson_AddStrToObject(&jdoc, jdoc.root, "sock", ptr->sock->sock_str.s, ptr->sock->sock_str.len);
(gdb) list
744             srjson_AddStrToObject(&jdoc, jdoc.root, "aor", aor.s, aor.len);
745             srjson_AddStrToObject(&jdoc, jdoc.root, "ruid", ptr->ruid.s, ptr->ruid.len);
746             srjson_AddStrToObject(&jdoc, jdoc.root, "c", ptr->c.s, ptr->c.len);
747             srjson_AddStrToObject(&jdoc, jdoc.root, "received", ptr->received.s, ptr->received.len);
748             if (_dmq_usrloc_replicate_socket_info==1)
749                     srjson_AddStrToObject(&jdoc, jdoc.root, "sock", ptr->sock->sock_str.s, ptr->sock->sock_str.len);
750             srjson_AddStrToObject(&jdoc, jdoc.root, "path", ptr->path.s, ptr->path.len);
751             srjson_AddStrToObject(&jdoc, jdoc.root, "callid", ptr->callid.s, ptr->callid.len);
752             srjson_AddStrToObject(&jdoc, jdoc.root, "user_agent", ptr->user_agent.s, ptr->user_agent.len);
753             srjson_AddStrToObject(&jdoc, jdoc.root, "instance", ptr->instance.s, ptr->instance.len);
(gdb) p *ptr
$1 = {domain = 0x7f3b3b2f3660, ruid = {s = 0x7f3b3b346e20 "uloc-5f437d69-5ce9-eb", len = 21}, aor = 0x7f3b3b34bdf8, c = {s = 0x7f3b3b346c80 "sip:konsultant@10.14.5.93:5060", len = 30}, received = {s = 0x0, len = 0}, path = {s = 0x0, len = 0}, expires = 1598259464, q = -1, callid = {s = 0x7f3b3b346d08 "772b73866b2cee3801ee5a017d1e9b3e@10.14.5.93;;\177",
    len = 43}, cseq = 126, state = CS_NEW, flags = 0, cflags = 0, user_agent = {s = 0x7f3b3b346da0 "CORE__CONF_2.648_G729_", len = 22}, uniq = {s = 0x0, len = 0}, sock = 0x0, last_modified = 1598259346, last_keepalive = 1598259346, ka_roundtrip = 0, methods = 4294967295, instance = {s = 0x0, len = 0}, reg_id = 0, server_id = 0, tcpconn_id = -1, keepalive = 0,
  xavp = 0x0, next = 0x0, prev = 0x0}
(gdb) p*ptr->sock
Cannot access memory at address 0x0
(gdb)

miconda added a commit that referenced this issue Aug 24, 2020
@miconda

miconda commented Aug 24, 2020

Can you try with the latest master or branch 5.4? I pushed a commit for it.
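
The commit itself is not quoted in this thread. As a rough illustration only, the crash at usrloc_sync.c:749 comes from dereferencing ptr->sock while it is 0x0, so a guard of the following shape would avoid it. This is a minimal sketch, not the actual patch: str_t, sock_info_t, contact_t and add_sock_field() are hypothetical stand-ins invented here, not Kamailio types or functions, and the JSON field is written with snprintf instead of srjson.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the structures seen in the
 * gdb output (illustration only, not the real Kamailio definitions). */
typedef struct { const char *s; int len; } str_t;
typedef struct { str_t sock_str; } sock_info_t;
typedef struct { str_t c; sock_info_t *sock; } contact_t;

/* Hypothetical helper: writes a "sock" JSON field into out only when
 * socket replication is enabled AND the contact really has a socket.
 * Returns 1 if the field was written, 0 if it was skipped. */
int add_sock_field(char *out, size_t outlen, const contact_t *ptr,
                   int replicate_socket_info)
{
    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    if (replicate_socket_info != 1)
        return 0;

    /* The contact in the core dump had sock = 0x0, so check the pointer
     * before touching sock_str instead of dereferencing it blindly. */
    if (ptr == NULL || ptr->sock == NULL || ptr->sock->sock_str.s == NULL)
        return 0;

    snprintf(out, outlen, "\"sock\":\"%.*s\"",
             ptr->sock->sock_str.len, ptr->sock->sock_str.s);
    return 1;
}
```

With a contact whose sock is NULL, the helper returns 0 and leaves the output empty instead of crashing. Judging only from line 748 of the listing above, running with replicate_socket_info set to 0 would presumably also skip the failing branch, although that was not tested in this thread.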

@marcinkowalczyk

I've tested against the latest 5.4 branch and the issue seems to be fixed. Thank you!
