Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kamailio 5.4.2 - CentOS 7 - qm_free(): BUG: freeing already freed pointer, called from core: core/usr_avp.c: destroy_avp_list_unsafe #2620

Closed
devryan opened this issue Feb 2, 2021 · 4 comments

Comments

@devryan
Copy link

devryan commented Feb 2, 2021

Hi,
Same issue with @farnk05 on #2224

Wanted to open a fresh issue to not reopen an old one.

This is on kamailio 5.4.2, which appears to have these fixes from #2224 (comment) in them when I checked the src files.

$ sudo rpm -qi kamailio |grep Date
Install Date: Fri 20 Nov 2020 12:03:26 AM PST
Build Date  : Tue 27 Oct 2020 05:37:31 AM PDT
$ sudo rpm -qa | grep kamailio
kamailio-mysql-5.4.2-0.el7.x86_64
kamailio-tls-5.4.2-0.el7.x86_64
kamailio-5.4.2-0.el7.x86_64
kamailio-websocket-5.4.2-0.el7.x86_64
kamailio-tcpops-5.4.2-0.el7.x86_64
kamailio-statsd-5.4.2-0.el7.x86_64

Package Source: https://rpm.kamailio.org/centos/7/5.4/5.4.2/x86_64/
OS: RHEL 7.6.1810
Mem mgr: default/qm
SHM is 4096, PKG is 1024 (system has 16gb ram, 4 core Intel Skylake CPU on a KVM.

There are a few variations we see with the qm_free errors, here are the most common we see when kamailio segfaults, and we have to let systemd restart it, or Monit as we now have to have Monit check for CRITICAL messages since kamailio segfaults so often.

/usr/sbin/kamailio[32734]: CRITICAL: <core> [core/mem/q_malloc.c:521]: qm_free(): BUG: freeing already freed pointer (0x7f0da5012fc0), called from core: core/usr_avp.c: destroy_avp_list_unsafe(626), first free core: core/usr_avp.c: destroy_avp_list_unsafe(626) - ignoring 
/usr/sbin/kamailio[32734]: CRITICAL: <core> [core/mem/q_malloc.c:521]: qm_free(): BUG: freeing already freed pointer (0x7f0da5012fc0), called from core: core/usr_avp.c: destroy_avp_list_unsafe(626), first free core: core/usr_avp.c: destroy_avp_list_unsafe(626) - ignoring ...
/usr/sbin/kamailio[32733]: CRITICAL: dialog [dlg_profile.c:574]: set_dlg_profile(): BUG - dialog not found in a non REQUEST route (1)
/usr/sbin/kamailio[32733]: CRITICAL: dialog [dlg_profile.c:574]: set_dlg_profile(): BUG - dialog not found in a non REQUEST route (1)
/usr/sbin/kamailio[6160]: CRITICAL: <core> [core/mem/q_malloc.c:521]: qm_free(): BUG: freeing already freed pointer (0x7ff00842e320), called from tm: h_table.c: free_cell_helper(189), first free core: core/usr_avp.c: destroy_avp_list_unsafe(626) - ignoring

GDB as requested in the other Issue (note gdb was run on another VM, not the main system, hopefully that is not an issue:

$ gdb /usr/sbin/kamailio /core-kamailio-11-995-992-11912-1612143069
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/kamailio...Reading symbols from /usr/lib/debug/usr/sbin/kamailio.debug...done.
done.

[New LWP 11912]

warning: .dynamic section for "/lib64/libc.so.6" is not at the expected address (wrong library or version mismatch?)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by '/usr/sbin/kamailio -DD -P /var/run/kamailio/kamailio.pid -f /etc/kamailio/kamai'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000006024be in destroy_avp_list_unsafe (list=0x7fdc559d4fd8) at core/usr_avp.c:625
625                     avp = avp->next;
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.6.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_6.x86_64 libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libmaxminddb-1.2.0-1.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64 libunistring-0.9.3-9.el7.x86_64 mariadb-libs-5.5.60-1.el7_5.x86_64 openssl-libs-1.0.2k-16.el7_6.1.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) frame 0
#0  0x00000000006024be in destroy_avp_list_unsafe (list=0x7fdc559d4fd8) at core/usr_avp.c:625
625                     avp = avp->next;
(gdb) list
620             avp_t *avp, *foo;
621
622             avp = *list;
623             while( avp ) {
624                     foo = avp;
625                     avp = avp->next;
626                     shm_free_unsafe( foo );
627             }
628             *list = 0;
629     }
(gdb) p *p_entry
No symbol "p_entry" in current context.
(gdb) p *l
No symbol "l" in current context.
(gdb)
No symbol "l" in current context.
(gdb) p *lh
No symbol "lh" in current context.
(gdb) p *lh
No symbol "lh" in current context.
(gdb) frame 1
#1  0x00007fdd56fb8f92 in free_cell_helper (dead_cell=0x7fdc559d4dd8, silent=0, fname=0x7fdd570d1363 "timer.c", fline=643) at h_table.c:255
255                     destroy_avp_list_unsafe(&dead_cell->uri_avps_from);
(gdb)
#1  0x00007fdd56fb8f92 in free_cell_helper (dead_cell=0x7fdc559d4dd8, silent=0, fname=0x7fdd570d1363 "timer.c", fline=643) at h_table.c:255
255                     destroy_avp_list_unsafe(&dead_cell->uri_avps_from);
(gdb) list
250             if(dead_cell->user_avps_from)
251                     destroy_avp_list_unsafe(&dead_cell->user_avps_from);
252             if(dead_cell->user_avps_to)
253                     destroy_avp_list_unsafe(&dead_cell->user_avps_to);
254             if(dead_cell->uri_avps_from)
255                     destroy_avp_list_unsafe(&dead_cell->uri_avps_from);
256             if(dead_cell->uri_avps_to)
257                     destroy_avp_list_unsafe(&dead_cell->uri_avps_to);
258             if(dead_cell->xavps_list)
259                     xavp_destroy_list_unsafe(&dead_cell->xavps_list);
(gdb) p *dlg
No symbol "dlg" in current context.
(gdb) p *msg
No symbol "msg" in current context.
(gdb)
@miconda
Copy link
Member

miconda commented Feb 2, 2021

Attach also the output of gdb command bt full.

@devryan
Copy link
Author

devryan commented Feb 2, 2021

(gdb) bt full
#0  0x00000000006024be in destroy_avp_list_unsafe (list=0x7fdc559d4fd8) at core/usr_avp.c:625
        avp = 0x554b414c5554594c
        foo = 0x554b414c5554594c
        __FUNCTION__ = "destroy_avp_list_unsafe"
#1  0x00007fdd56fb8f92 in free_cell_helper (dead_cell=0x7fdc559d4dd8, silent=0, fname=0x7fdd570d1363 "timer.c", fline=643) at h_table.c:255
        b = 0x0
        i = 1
        rpl = 0x0
        tt = 0x0
        foo = 0x0
        cbs = 0x0
        cbs_tmp = 0x7fdc53f941c0
        __FUNCTION__ = "free_cell_helper"
#2  0x00007fdd5707dc4d in wait_handler (ti=236720304, wait_tl=0x7fdc559d4e60, data=0x7fdc559d4dd8) at timer.c:643
        p_cell = 0x7fdc559d4dd8
        ret = 0
        unlinked = 0
        rcount = 1
        __FUNCTION__ = "wait_handler"
#3  0x00000000004ea094 in timer_list_expire (t=236720304, h=0x7fdc536c6640, slow_l=0x7fdc536c74c8, slow_mark=4293) at core/timer.c:857
        tl = 0x7fdc559d4e60
        ret = 32733
#4  0x00000000004ea53d in timer_handler () at core/timer.c:922
        saved_ticks = 236720304
        run_slow_timer = 0
        i = 197
        __FUNCTION__ = "timer_handler"
#5  0x00000000004ea9e7 in timer_main () at core/timer.c:961
No locals.
#6  0x000000000042a838 in main_loop () at main.c:1753
        i = 8
        pid = 0
        si = 0x0
        si_desc = "udp receiver child=7 sock=<our public ip>:35060\000\000;0\177", '\000' <repeats 13 times>, "\360\373\"b\377\177\000\000\330\r}\000\000\000\000\000%\000\000\000\000\000\000\000\320\376\200\232\335\177\000\000C`\201\000\000\000\000\000_\377\200\232\335\177\000\000b\233kS\335\177\000\000\020\b\302Z\335\177\000"
        nrprocs = 8
        woneinit = 1
        __FUNCTION__ = "main_loop"
#7  0x0000000000433a96 in main (argc=12, argv=0x7fff6222fcd8) at main.c:2856
        cfg_stream = 0x2359010
        c = -1
        r = 0
        tmp = 0x7fff62230ecc ""
        tmp_len = 0
        port = 0
        proto = 0
        ahost = 0x0
        aport = 0
        options = 0x7d3a38 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 3270549116
        rfd = 4
        debug_save = 0
        debug_flag = 0
        dont_fork_cnt = 2
---Type <return> to continue, or q <return> to quit---
        n_lst = 0x0
        p = 0x0
        st = {st_dev = 20, st_ino = 21156, st_nlink = 2, st_mode = 16832, st_uid = 995, st_gid = 2, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1611404518, tv_nsec = 41000000}, st_mtim = {
            tv_sec = 1612139036, tv_nsec = 999083038}, st_ctim = {tv_sec = 1612139036, tv_nsec = 999083038}, __unused = {0, 0, 0}}
        tbuf = '\000' <repeats 88 times>, "p\362\252\000\000\000\000\000\340\270A\000\000\000\000\000\320\374\"b\377\177", '\000' <repeats 26 times>, "\036-!\233\335\177\000\000\001", '\000' <repeats 23 times>, "\260\203r\232\335\177\000\000\240\374\"b\377\177\000\000\032\251!\233\335\177\000\000\034\000\000\000\000\000\000\000"...
        option_index = 0
        long_options = {{name = 0x7d5c2f "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x7d1234 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x7d5c34 "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x7d5c3a "subst",
            has_arg = 1, flag = 0x0, val = 1025}, {name = 0x7d5c40 "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x7d5c49 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x7d5c53 "server-id", has_arg = 1, flag = 0x0, val = 1028},
          {name = 0x7d5c5d "loadmodule", has_arg = 1, flag = 0x0, val = 1029}, {name = 0x7d5c68 "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name = 0x7d5c71 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name = 0x7d5c7c "debug",
            has_arg = 1, flag = 0x0, val = 1032}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
        __FUNCTION__ = "main"
(gdb)

@devryan
Copy link
Author

devryan commented Feb 3, 2021

A shutdown triggered just this morning, which systemd noticed and restarted the service:

Feb  3 08:59:27 /usr/sbin/kamailio[16761]: CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 54
Feb  3 08:59:27 /usr/sbin/kamailio[16761]: CRITICAL: <core> [core/pass_fd.c:85]: recv_all(): 1st recv on 57 failed: Connection reset by peer
Feb  3 08:59:27 /usr/sbin/kamailio[16761]: CRITICAL: <core> [core/tcp_main.c:3543]: handle_tcp_child(): read from tcp child 6 (pid 16759, no 34) Connection reset by peer [104]
Feb  3 09:00:27 /usr/sbin/kamailio[16724]: CRITICAL: <core> [main.c:686]: sig_alarm_abort(): shutdown timeout triggered, dying...

ssl info if needed

# rpm -qa | grep ssl
openssl-devel-1.0.2k-16.el7_6.1.x86_64
openssl-1.0.2k-16.el7_6.1.x86_64
openssl-libs-1.0.2k-16.el7_6.1.x86_64

Adding in dmesg info as these are what we see when we crash

# rpm -qa | grep glibc
glibc-headers-2.17-260.el7_6.5.x86_64
glibc-2.17-260.el7_6.5.x86_64
glibc-common-2.17-260.el7_6.5.x86_64
glibc-devel-2.17-260.el7_6.5.x86_64
[Sat Jan 30 08:08:57 2021] kamailio[22756]: segfault at c0c0c0c8 ip 00000000006024be sp 00007ffd4069ff50 error 4 in kamailio[400000+4af000]
[Sat Jan 30 08:27:31 2021] traps: kamailio[640] general protection ip:6024be sp:7ffe53193600 error:0 in kamailio[400000+4af000]
[Sat Jan 30 10:53:21 2021] UDP: bad checksum. From <some public client ip>:22071 to <our public ip>:65535 ulen 583
[Sat Jan 30 14:05:32 2021] traps: kamailio[9455] general protection ip:7f12478121a2 sp:7ffd396648b0 error:0 in dialog.so[7f12477eb000+cd000]
[Sat Jan 30 14:05:42 2021] traps: kamailio[9437] general protection ip:7f124780c931 sp:7ffd39664b20 error:0 in dialog.so[7f12477eb000+cd000]
[Sun Jan 31 00:34:03 2021] traps: kamailio[17610] general protection ip:6024be sp:7fffebc18af0 error:0 in kamailio[400000+4af000]
[Sun Jan 31 11:02:37 2021] traps: kamailio[8222] general protection ip:6024be sp:7ffee67a50b0 error:0 in kamailio[400000+4af000]
[Sun Jan 31 11:20:46 2021] traps: kamailio[26901] general protection ip:6024be sp:7ffe6be8df40 error:0 in kamailio[400000+4af000]
[Sun Jan 31 11:45:30 2021] kamailio[3055]: segfault at 3 ip 00007f70cbadeef9 sp 00007fffcfd2d520 error 4 in libc-2.17.so[7f70cba92000+1c2000]
[Sun Jan 31 15:59:56 2021] kamailio[14268]: segfault at 22 ip 00007f86b5b46035 sp 00007fffb896d8f0 error 4 in dialog.so[7f86b5b0f000+cd000]
[Sun Jan 31 16:00:06 2021] traps: kamailio[14242] general protection ip:7f86b5b8ebb3 sp:7fffb896d990 error:0 in dialog.so[7f86b5b0f000+cd000]
[Sun Jan 31 17:31:17 2021] traps: kamailio[11912] general protection ip:6024be sp:7fff6222f1b0 error:0 in kamailio[400000+4af000]
[Mon Feb  1 05:12:03 2021] traps: kamailio[15669] general protection ip:7f5cc4c63ef9 sp:7ffdab5be2c0 error:0 in libc-2.17.so[7f5cc4c17000+1c2000]
[Mon Feb  1 06:26:36 2021] traps: kamailio[6190] general protection ip:6024be sp:7fffdc80bab0 error:0 in kamailio[400000+4af000]
[Mon Feb  1 08:10:28 2021] kamailio[8169]: segfault at 7 ip 00007f200d17eef9 sp 00007ffccd228360 error 4 in libc-2.17.so[7f200d132000+1c2000]
[Mon Feb  1 11:08:44 2021] kamailio[6226]: segfault at 28 ip 00007f732e500ef9 sp 00007ffe0f649620 error 4 in libc-2.17.so[7f732e4b4000+1c2000]
[Mon Feb  1 14:30:13 2021] kamailio[24829]: segfault at c ip 00000000006024be sp 00007fff2aacafd0 error 4 in kamailio[400000+4af000]
[Mon Feb  1 15:38:56 2021] kamailio[2157]: segfault at f0f0f0f8 ip 00007f5cb99601a2 sp 00007ffcb079b070 error 4 in dialog.so[7f5cb9939000+cd000]
[Mon Feb  1 15:39:06 2021] kamailio[2093]: segfault at f0f0f110 ip 00007f5cb995a931 sp 00007ffcb079b2e0 error 4 in dialog.so[7f5cb9939000+cd000]
[Mon Feb  1 15:58:23 2021] kamailio[1014]: segfault at c0c0c0c8 ip 00000000006024be sp 00007ffe6a339590 error 4 in kamailio[400000+4af000]
[Tue Feb  2 01:03:57 2021] kamailio[4755]: segfault at 229 ip 00000000006024be sp 00007ffc4fc04590 error 4 in kamailio[400000+4af000]
[Tue Feb  2 09:01:57 2021] traps: kamailio[6179] general protection ip:6024be sp:7ffcd941ec20 error:0 in kamailio[400000+4af000]
[Tue Feb  2 09:57:03 2021] kamailio[25176]: segfault at 10 ip 00000000006024be sp 00007fff0b5c9620 error 4 in kamailio[400000+4af000]
[Tue Feb  2 10:05:43 2021] traps: kamailio[18233] general protection ip:6024be sp:7ffd0e595f70 error:0 in kamailio[400000+4af000]
[Tue Feb  2 10:39:32 2021] traps: kamailio[22574] general protection ip:6024be sp:7ffee87edc30 error:0 in kamailio[400000+4af000]
[Tue Feb  2 11:16:58 2021] traps: kamailio[5711] general protection ip:7f83d42c5931 sp:7ffe25ab7880 error:0 in dialog.so[7f83d42a4000+cd000]
[Tue Feb  2 11:17:08 2021] traps: kamailio[5682] general protection ip:7f83d42c5931 sp:7ffe25aba530 error:0 in dialog.so[7f83d42a4000+cd000]
[Tue Feb  2 11:51:00 2021] traps: kamailio[22527] general protection ip:7fbd54be1ef9 sp:7ffdf86303f0 error:0 in libc-2.17.so[7fbd54b95000+1c2000]
[Tue Feb  2 12:17:00 2021] kamailio[5721]: segfault at 3 ip 00007f569bdeaef9 sp 00007fff625860b0 error 4 in libc-2.17.so[7f569bd9e000+1c2000]
[Tue Feb  2 14:10:50 2021] kamailio[17395]: segfault at 4e ip 00000000006024be sp 00007ffd38eb78b0 error 4 in kamailio[400000+4af000]
[Tue Feb  2 14:13:19 2021] traps: kamailio[5376] general protection ip:6024be sp:7ffd52be6f70 error:0 in kamailio[400000+4af000]
[Wed Feb  3 08:59:27 2021] traps: kamailio[16759] general protection ip:7f85ba1719c3 sp:7ffd629152c0 error:0 in dialog.so[7f85ba150000+cd000]

@devryan
Copy link
Author

devryan commented Mar 29, 2021

FYI, we fixed these segfaults by completely removing all StatsD-related dialog additions to kamailio.cfg. We still use dialog module, but just for call timers/call states.

@devryan devryan closed this as completed Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants