Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Segfault in free_contacts() under load in 2.4.7 #2095

Closed
attermann opened this issue Apr 30, 2020 · 32 comments
Closed

[BUG] Segfault in free_contacts() under load in 2.4.7 #2095

attermann opened this issue Apr 30, 2020 · 32 comments
Assignees
Milestone

Comments

@attermann
Copy link

attermann commented Apr 30, 2020

OpenSIPS version you are running

version: opensips 2.4.7 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: 495b980
main.c compiled on 03:18:29 Apr 10 2020 with gcc 5.4.0

Describe the bug

Opensips crashes randomly under load. Sometime multiple times a day, and other times a day or more between crashes.

Some backtraces show slight variations in the call stack but always eventually through run_trans_callbacks_locked() and segfaults in free_contacts ().

To Reproduce

Unable to reproduce. Happens randomly in production under load.

Expected behavior

No segfault!

Relevant System Logs

Apr 28 20:37
core.opensips.sig11.18058

(gdb) bt full
#0  free_contacts (_c=0x7f5ece72f610) at parser/contact/contact.c:296
        ptr = 0x6e6f3d32723b
#1  0x00000000005aa5fa in free_contact (_c=_c@entry=0x7f5eb32530e0)
    at parser/contact/parse_contact.c:107
No locals.
#2  0x000000000053e75c in clean_hdr_field (hf=hf@entry=0x7f5eb32530b0)
    at parser/hf.c:87
No locals.
#3  0x00007f5eadfea2c7 in clean_msg_clone (msg=0x7f5eb32521e0, 
    max=0x7f5eb3253a78, min=0x7f5eb32521e0) at sip_msg.h:116
        hdr = 0x7f5eb32530b0
#4  run_trans_callbacks_locked (type=type@entry=128, 
    trans=trans@entry=0x7f5eb2ecdf08, req=<optimized out>, 
    rpl=rpl@entry=0x7f5ece745cc8, code=code@entry=200) at t_hooks.c:265
No locals.
#5  0x00007f5eadf976ae in relay_reply (t=<optimized out>, 
    p_msg=<optimized out>, branch=<optimized out>, msg_status=<optimized out>, 
    cancel_bitmap=<optimized out>) at t_reply.c:1353
        relay = <optimized out>
        save_clone = <optimized out>
        buf = 0x7f5ece731738 "SIP/2.0 200 OK\r\nVia: SIP/2.0/UDP <redacted source>:5060;branch=z9hG4bKt9ULXZ7fmfq6e5fK61D6FA\r\nRecord-Route: <sip:<redacted interface>:7080;transport=tcp;r2=on;lr;did=1f3.6b5ad8f6>\r\nRecord-Route: <sip:<redacted interface>"...
        res_len = 1015
        relayed_code = 200
        relayed_msg = <optimized out>
        bm = {to_tag_val = {s = 0x7f5eb32521e0 "\212B\016", len = 4455636}}
        totag_retr = <optimized out>
        uas_rb = 0x7f5eb2ece008
        cb_s = {
          s = 0x7f5ece731738 "SIP/2.0 200 OK\r\nVia: SIP/2.0/UDP <redacted source>:5060;branch=z9hG4bKt9ULXZ7fmfq6e5fK61D6FA\r\nRecord-Route: <sip<redacted interface>:7080;transport=tcp;r2=on;lr;did=1f3.6b5ad8f6>\r\nRecord-Route: <sip:<redacted interface>"..., len = 1015}
        text = {s = 0x7f5ece745cc8 "_b\002", len = -1515613952}
        __FUNCTION__ = "relay_reply"
#6  0x00007f5eadf9e8a7 in reply_received (p_msg=0x7f5ece745cc8)
    at t_reply.c:1603
        msg_status = 200
        last_uac_status = 100
        branch = 0
        reply_status = <optimized out>
        timer = 2
        cancel_bitmap = 0
        uac = <optimized out>
        t = 0x7f5eb2ecdf08
        backup_list = <optimized out>
        has_reply_route = <optimized out>
        __FUNCTION__ = "reply_received"
#7  0x0000000000470a71 in forward_reply (msg=msg@entry=0x7f5ece745cc8)
    at forward.c:495
        new_buf = 0x0
        to = 0x0
        new_len = 1
        mod = 0x7f5ece6425e8
        proto = <optimized out>
        id = 0
        send_sock = <optimized out>
        s = <optimized out>
        len = <optimized out>
        __FUNCTION__ = "forward_reply"
#8  0x0000000000446d62 in receive_msg (
    buf=buf@entry=0x9bbec0 <tcp_current_req> "SIP/2.0 200 OK\r\nVia: SIP/2.0/TCP <redacted interface>:7080;branch=z9hG4bK9529.26b38601.0\r\nVia: SIP/2.0/UDP <redacted source>:5060;branch=z9hG4bKt9ULXZ7fmfq6e5fK61D6FA\r\nRecord-Route: <sip:<redacted interface>:7080;transpo"..., len=<optimized out>, rcv_info=rcv_info@entry=0x7ffd44971ba0, 
    existing_context=existing_context@entry=0x0, flags=flags@entry=0)
    at receive.c:258
        ctx = 0x7f5ece72de40
        msg = 0x7f5ece745cc8
        start = {tv_sec = 1588106249, tv_usec = 998674}
        rc = 3
        in_buff = {
          s = 0x9bbec0 <tcp_current_req> "SIP/2.0 200 OK\r\nVia: SIP/2.0/TCP <<redacted interface>:7080;branch=z9hG4bK9529.26b38601.0\r\nVia: SIP/2.0/UDP <redacted ource>:5060;branch=z9hG4bKt9ULXZ7fmfq6e5fK61D6FA\r\nRecord-Route: <sip:<redacted interface>:7080;transpo"..., len = 1075}
        __FUNCTION__ = "receive_msg"
#9  0x000000000060b443 in tcp_handle_req (_max_msg_chunks=<optimized out>, 
    con=<optimized out>, req=0x9bbec0 <tcp_current_req>)
    at net/proto_tcp/tcp_common.h:411
        local_rcv = {src_ip = {af = 2, len = 4, u = {addrl = {3526109766, 0}, 
              addr32 = {3526109766, 0, 0, 0}, addr16 = {10822, 53804, 0, 0, 0, 
                0, 0, 0}, addr = "F*,\322", '\000' <repeats 11 times>}}, 
          dst_ip = {af = 2, len = 4, u = {addrl = {3408669254, 0}, addr32 = {
                3408669254, 0, 0, 0}, addr16 = {10822, 52012, 0, 0, 0, 0, 0, 
                0}, addr = "F*,\313", '\000' <repeats 11 times>}}, 
          src_port = 33762, dst_port = 7080, proto = 2, 
          proto_reserved1 = 647886724, proto_reserved2 = 0, src_su = {s = {
              sa_family = 2, 
              sa_data = "\203\342F*,\322\000\000\000\000\000\000\000"}, sin = {
              sin_family = 2, sin_port = 57987, sin_addr = {
                s_addr = 3526109766}, 
              sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {
              sin6_family = 2, sin6_port = 57987, sin6_flowinfo = 3526109766, 
              sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, 
                  __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 
                    0, 0}}}, sin6_scope_id = 1023}}, 
          bind_address = 0x7f5ece63ef10}
        msg_buf = 0x9bbec0 <tcp_current_req> "SIP/2.0 200 OK\r\nVia: SIP/2.0/TCP <redacted interface>:7080;branch=z9hG4bK9529.26b38601.0\r\nVia: SIP/2.0/UDP <redacted_source>:5060;branch=z9hG4bKt9ULXZ7fmfq6e5fK61D6FA\r\nRecord-Route: <sip:<redacted interface>:7080;transpo"...
        msg_len = <optimized out>
        c = 56 '8'
        size = 0
#10 tcp_read_req (con=0x7f5eaed175f8, bytes_read=0x7ffd44971c80)
    at net/proto_tcp/proto_tcp.c:1187
        bytes = <optimized out>
        total_bytes = 1075
        req = 0x9bbec0 <tcp_current_req>
        src_su = {s = {sa_family = 0, sa_data = '\000' <repeats 13 times>}, 
          sin = {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, 
            sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {
            sin6_family = 0, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {
              __in6_u = {__u6_addr8 = '\000' <repeats 15 times>, 
                __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 
                  0, 0}}}, sin6_scope_id = 6273364}}
        dst_su = {s = {sa_family = 0, 
            sa_data = "\000\000\000\000\000\000\000\221\251\245-\325a#"}, 
          sin = {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, 
            sin_zero = "\000\221\251\245-\325a#"}, sin6 = {sin6_family = 0, 
            sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u = {
                __u6_addr8 = "\000\221\251\245-\325a#\000\000\000\000\000\000\000", __u6_addr16 = {37120, 42409, 54573, 9057, 0, 0, 0, 0}, __u6_addr32 = {
                  2779353344, 593614125, 0, 0}}}, sin6_scope_id = 2932582632}}
        __FUNCTION__ = "tcp_read_req"
#11 0x00000000005e88f7 in handle_io (fm=0x7f5ece67d378, idx=idx@entry=49, 
    event_type=event_type@entry=1) at net/net_tcp_proc.c:251
        ret = 0
        n = <optimized out>
        con = 0x7f5eaed175f8
        s = 309
        rw = <optimized out>
        resp = <optimized out>
        response = {140044726089208, 1}
        __FUNCTION__ = "handle_io"
#12 0x00000000005ea88b in io_wait_loop_epoll (h=<optimized out>, 
    t=<optimized out>, repeat=<optimized out>) at net/../io_wait_loop.h:280
        ep_event = {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}
        ret = <optimized out>
        n = 1
        r = 49
        e = <optimized out>
        fd = <optimized out>
        i = <optimized out>
#13 tcp_worker_proc_loop () at net/net_tcp_proc.c:396
        __FUNCTION__ = "tcp_worker_proc_loop"
#14 0x00000000005f87fb in tcp_start_processes (
    chd_rank=chd_rank@entry=0x998104 <chd_rank>, 
    startup_done=startup_done@entry=0x0) at net/net_tcp.c:1909
        r = 31
        reader_fd = {203, 205}
        pid = <optimized out>
        __FUNCTION__ = "tcp_start_processes"
#15 0x000000000041d343 in main_loop () at main.c:790
        startup_done = 0x0
        chd_rank = 48
        rc = <optimized out>
#16 main (argc=<optimized out>, argv=<optimized out>) at main.c:1442
        cfg_stream = <optimized out>
        c = <optimized out>
        r = <optimized out>
        tmp = 0x7ffd44973ee1 ""
        tmp_len = <optimized out>
        port = <optimized out>
        proto = <optimized out>
        protos_no = <optimized out>
        options = 0x639290 "f:cCm:M:b:l:n:N:rRvdDFEVhw:t:u:g:P:G:W:o:"
        ret = -1
        seed = 1514354603
        __FUNCTION__ = "main"

OS/environment information

  • Operating System: Ubuntu 16.04.6 LTS
  • OpenSIPS installation: github latest "2.4" branch

Additional context

Seems to be the same issue as reported mid-March in OpenSIPS-Users "Re: segfault at the opensips 2.4.5 (Liviu Chircu) (Johan De Clercq)".

@attermann
Copy link
Author

I would really appreciate a response to this bug report. If more details are required to troubleshoot this I will provide whatever I can. I am also happy to work on a fix for this myself but I'd like to hear back first to know whether perhaps this has already been fixed since there was a similar report to mine.

@attermann
Copy link
Author

It's worth noting that since reverting back to 2.3 (2.3.6 to be exact) we have not encountered this segfault again.

@liviuchircu
Copy link
Member

hey @attermann -- we're pretty busy these days, getting the 3.1 release ready. Expect this issue to get more attention once we hit the "feature freeze" point, within less than a couple of weeks.

Meanwhile, if you can reproduce this crash in a lab, I highly recommend you use the git bisect feature and help us locate the commit which broke it. For example, on latest 2.4 branch, you could search for the commit which broke it using this efficient technique:

git bisect start # begin hunting the bad commit
git bisect bad # mark the current, last commit as "bad"
git checkout a28b3d4ae5c9 # go to your 2.3.x "found-to-be-good" commit
git bisect good # actually mark this commit as "good"
# at this point, git will select a half-way commit.  Re-compile and re-test.  Based on your results, do either:
git bisect good / git bisect bad
# at this point, git will select a half-way commit.  Re-compile and re-test.  Based on your results, do either:
git bisect good / git bisect bad
...
# keep going until you find the exact commit which broke things.  This will take logarithmic time (say, 5-8 steps)

Cheers!

@vitalikvoip
Copy link

Also have this problem and can't reproduce it intentionally.
Used 2.4.5 and have another issue:

May 20 12:04:18 west /usr/sbin/opensips[1745]: CRITICAL:core:io_watch_add: #12>>> [TCP_main] BUG trying to overwrite entry 738 in the hash(738, 19, 0x7fbab137ea38,1) with (738, 19, 0x7fbab1235138,1)#12#012It seems you have hit a programming bug.
Please help us make OpenSIPS better by reporting it at https://github.com/OpenSIPS/opensips/issues

Updated to 2.4.7 and have these crashes.
It crashes either when tm wants to clean a transaction or in dialog when it wants to get a contact from sip_msg->contact->parsed. "sip_msg->contact->parsed" looks like a valid pointer but the memory it points to looks overwritten:

#3  0x000000000050cb6d in free_contact (_c=0x7f81ba070a20) at parser/contact/parse_contact.c:107
107     parser/contact/parse_contact.c: No such file or directory.
(gdb) p _c
$1 = (contact_body_t **) 0x7f81ba070a20
(gdb) p **_c
$2 = {star = 0 '\000', contacts = 0x63203a73253a6572}

#0  0x00000000004d7fd4 in do_free_params (_p=0xa736e6f69, _shm=0) at parser/parse_param.c:453
#1  0x00000000004d8048 in free_params (_p=0xa736e6f69) at parser/parse_param.c:465
#2  0x000000000050c22f in free_contacts (_c=0x7f81c697b058) at parser/contact/contact.c:298
#3  0x000000000050cb6d in free_contact (_c=0x7f81ba070a20) at parser/contact/parse_contact.c:107
#4  0x00000000004c2f59 in clean_hdr_field (hf=0x7f81ba0709f0) at parser/hf.c:87
#5  0x00007f81b60f20b8 in clean_msg_clone (msg=0x7f81ba06e360, min=0x7f81ba06e360, max=0x7f81ba0710f0) at sip_msg.h:116
#6  0x00007f81b60f2cc0 in run_trans_callbacks_locked (type=128, trans=0x7f81ba0d8bf0, req=0x7f81ba06e360, rpl=0x7f81c697b188, code=200) at t_hooks.c:265
#7  0x00007f81b6101e0d in relay_reply (t=0x7f81ba0d8bf0, p_msg=0x7f81c697b188, branch=0, msg_status=200, cancel_bitmap=0x7ffc7517ad2c) at t_reply.c:1358
#8  0x00007f81b610307c in reply_received (p_msg=0x7f81c697b188) at t_reply.c:1608
#9  0x00000000004364a6 in forward_reply (msg=0x7f81c697b188) at forward.c:495
#10 0x0000000000474e2d in receive_msg (
    buf=0x8dd860 <tcp_current_req> "SIP/2.0 200 OK\r\nCSeq: 3 INVITE\r\nCall-ID: 4fb0a15e993d8b7a-c8a2fa2d29abf7c7\r\nFrom: \"jhon.doe\" <sip:123123123@john.com>;tag=241273b736d581e8\r\nTo: <sip:333222111@john.com>;tag=60147127_c9b66bbf"..., len=4711, 
    rcv_info=0x7ffc7517aef0, existing_context=0x0, flags=0) at receive.c:258
#11 0x000000000054c7f2 in tcp_handle_req (req=0x8dd860 <tcp_current_req>, con=0x7f81b9041f48, _max_msg_chunks=10) at net/proto_tcp/tcp_common.h:411
#12 0x0000000000551f3a in tcp_read_req (con=0x7f81b9041f48, bytes_read=0x7ffc7517b0ac) at net/proto_tcp/proto_tcp.c:1275
#13 0x000000000053fb02 in handle_io (fm=0x7f81c690c0f0, idx=0, event_type=1) at net/net_tcp_proc.c:255
#14 0x000000000053e055 in io_wait_loop_epoll (h=0x8fe4e0 <_worker_io>, t=2, repeat=0) at net/../io_wait_loop.h:280
#15 0x0000000000540c1a in tcp_worker_proc_loop () at net/net_tcp_proc.c:401
#16 0x00000000005380bc in tcp_start_processes (chd_rank=0x80071c <chd_rank>, startup_done=0x7f81b8c8db70) at net/net_tcp.c:2063
#17 0x000000000043ce44 in main_loop () at main.c:790
#18 0x000000000043fae9 in main (argc=13, argv=0x7ffc7517b3e8) at main.c:1442

@vitalikvoip
Copy link

vitalikvoip commented May 20, 2020

@attermann do you use dialog, rtpengine, rest_client modules?

I've briefly checked what changed and seen some contact related changes in dialog module. It's just an assumption and I have no proofs. I've rebuilt opensips with q_malloc and memdbg and sometimes it notifies about overwritten memory chunks.

@attermann
Copy link
Author

attermann commented May 21, 2020

@vitalikvoip Yes I do use the dialog module but only on a small percentage of total calls. I do have one core that crashed in dialog though. Backtrace below for reference.

#0  0x0000000000523d1f in fm_remove_free (n=0x7f076df73418, qm=0x7f076de7a010)
    at mem/f_malloc.c:209
        pf = 0x0
        hash = 0
#1  fm_malloc (qm=0x7f076de7a010, size=0) at mem/f_malloc.c:438
        frag = 0x7f076df73418
        n = <optimized out>
        hash = 0
        __FUNCTION__ = "fm_malloc"
#2  0x00007f074bbe18bc in build_extra_hdr (cell=0x7f0752fd9890, 
    str_hdr=0x7ffce3f9e2d0, extra_hdrs=0x0) at dlg_req_within.c:337
        p = <optimized out>
#3  dlg_end_dlg (dlg=dlg@entry=0x7f0752fd9890, 
    extra_hdrs=extra_hdrs@entry=0x0, send_byes=send_byes@entry=1)
    at dlg_req_within.c:459
        str_hdr = {s = 0x0, len = 0}
        t = 0x10
        i = <optimized out>
        res = 0
        __FUNCTION__ = "dlg_end_dlg"
#4  0x00007f074bbf8d4d in dlg_ontimeout (tl=0x7f0752fd98e0)
    at dlg_handlers.c:2095
        old_ctx = <optimized out>
        new_ctx = <optimized out>
        dlg = 0x7f0752fd9890
        new_state = 0
        old_state = 757703520
        unref = 5
        do_expire_actions = 1
        __FUNCTION__ = "dlg_ontimeout"
#5  0x00007f074bbdcac9 in dlg_timer_routine (ticks=<optimized out>, 
    attr=<optimized out>) at dlg_timer.c:602
        tl = 0xffffffffffffffff
        ctl = 0x7f0752fd98e0
        __FUNCTION__ = "dlg_timer_routine"
#6  0x00000000004ba09c in handle_timer_job () at timer.c:731
        t = 0x7f074e186440
        __FUNCTION__ = "handle_timer_job"
#7  0x00000000005e874a in handle_io (fm=0x7f076debfae8, idx=idx@entry=117, 
    event_type=event_type@entry=1) at net/net_tcp_proc.c:134
        ret = 0
        n = <optimized out>
        con = <optimized out>
        s = 302
        rw = <optimized out>
        resp = <optimized out>
        response = {139669392015104, 1}
        __FUNCTION__ = "handle_io"
#8  0x00000000005ea88b in io_wait_loop_epoll (h=<optimized out>, 
    t=<optimized out>, repeat=<optimized out>) at net/../io_wait_loop.h:280
        ep_event = {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}
        ret = <optimized out>
        n = 1
        r = 117
        e = <optimized out>
        fd = <optimized out>
        i = <optimized out>
#9  tcp_worker_proc_loop () at net/net_tcp_proc.c:396
        __FUNCTION__ = "tcp_worker_proc_loop"
#10 0x00000000005f87fb in tcp_start_processes (
    chd_rank=chd_rank@entry=0x998104 <chd_rank>, 
    startup_done=startup_done@entry=0x0) at net/net_tcp.c:1909
        r = 23
        reader_fd = {187, 189}
        pid = <optimized out>
        __FUNCTION__ = "tcp_start_processes"
#11 0x000000000041d343 in main_loop () at main.c:790
        startup_done = 0x0
        chd_rank = 40
        rc = <optimized out>
#12 main (argc=<optimized out>, argv=<optimized out>) at main.c:1442
        cfg_stream = <optimized out>
        c = <optimized out>
        r = <optimized out>
        tmp = 0x7ffce3f9eee1 ""
        tmp_len = <optimized out>
        port = <optimized out>
        proto = <optimized out>
        protos_no = <optimized out>
        options = 0x639290 "f:cCm:M:b:l:n:N:rRvdDFEVhw:t:u:g:P:G:W:o:"
        ret = -1
        seed = 4044006346
        __FUNCTION__ = "main"

@liviuchircu
Copy link
Member

liviuchircu commented May 22, 2020

@vitalikvoip and @attermann : I did some stress tests on 2.4.7 (f8e19b6) with "dialog" and "topology_hiding", but could not find any crash. If I remember correctly, the free_contacts() crash resembles some kind of PKG memory corruption, so any module could be the culprit, really.

Bottom line: please provide either your opensips.cfg or a minimal version of it that still triggers the crash (ship to liviu@opensips.org if you cannot make it public), so I can study/compare them and get closer to a diagnosis on this memory corruption. Alternatively, if you can help me reproduce the crash on my lab setup, the bug is 80%+ solved.

Cheers!

@attermann
Copy link
Author

@liviuchircu Unfortunately I haven't been able to reproduce outside of production so I'm going to have to send my whole config. I trimmed a lot of commented sections and anonymized so hopefully I didn't break it. Note that most of the "external" connections are TCP, and "internal" are UDP.

opensips.cfg:

####### Global Parameters #########

#log_level=2
log_level=3
#log_level=4
log_stderror=no
log_facility=LOG_LOCAL1

children=4

/* uncomment the following line to enable debugging */
#debug_mode=yes

/* comment the next line to enable the auto discovery of local aliases
   based on revers DNS on IPs */
auto_aliases=no

# CBA
check_via=no
dns=no
rev_dns=no
dns_try_ipv6=no
dns_retr_time=1
dns_retr_no=3

disable_dns_blacklist=yes
disable_dns_failover=no
dns_use_search_list=no

# CBA TCP related params
open_files_limit=16384
#tcp_children=16
tcp_children=32
tcp_max_connections=16384
# test params for TCP connection keep-alive (to resolve webrtc client port changing by setting connection lifetime to exceed registration interval)
#tcp_connection_lifetime=130
tcp_connection_lifetime=610
#tcp_keepalive=1
#tcp_keepcount=3
#tcp_keepidle=300
#tcp_keepinterval=300

# CBA performance checks
event_pkg_threshold = 90
event_shm_threshold = 90
#exec_dns_threshold = 60000
exec_dns_threshold = 100000 /* 100 milliseconds */
#exec_msg_threshold = 60000
exec_msg_threshold = 1000000 /* 1 second */
#tcp_threshold = 60000
tcp_threshold = 100000 /* 100 milliseconds */

# include local config
include_file "local.cfg"

# ------------------ module loading ----------------------------------
#mpath="/usr/lib/x86_64-linux-gnu/opensips/modules"
mpath="/usr/local/opensips/lib64/opensips/modules"

#### SIGNALING module
loadmodule "signaling.so"

#### StateLess module
loadmodule "sl.so"

#### Transaction Module
loadmodule "tm.so"
modparam("tm", "fr_timeout", 5)
#modparam("tm", "fr_inv_timeout", 30)
modparam("tm", "fr_inv_timeout", 120)
modparam("tm", "restart_fr_on_each_reply", 0)
modparam("tm", "onreply_avp_mode", 0)

#### Record Route Module
loadmodule "rr.so"
/* do not append from tag to the RR (no need for this script) */
modparam("rr", "append_fromtag", 0)

#### MAX ForWarD module
loadmodule "maxfwd.so"

#### SIP MSG OPerationS module
loadmodule "sipmsgops.so"

#### FIFO Management Interface
loadmodule "mi_fifo.so"
modparam("mi_fifo", "fifo_name", "/tmp/opensips_fifo")
modparam("mi_fifo", "fifo_mode", 0666)

#### URI module
loadmodule "uri.so"
modparam("uri", "use_uri_table", 0)

#### UDP protocol
loadmodule "proto_udp.so"
loadmodule "proto_tcp.so"

#### TEXTOPS module
loadmodule "textops.so"

#### DbText module
loadmodule "db_text.so"
modparam("db_text", "db_mode", 1)

#### nat traversal module
loadmodule "nat_traversal.so"
modparam("nat_traversal", "keepalive_state_file", "/var/run/opensips/keepalive_state")

##### RTPengine module
loadmodule "rtpengine.so"
modparam("rtpengine", "extra_id_pv", "$avp(extra_id)")
# "rtpengine_sock" is only means to specify rtpengine server in 2.3 and below
modparam("rtpengine", "rtpengine_sock", "udp:1.1.1.2:2223 udp:1.1.1.3:2223")
# db_url" for specifying rtpengine servers is only supported in 2.4 and above
#modparam("rtpengine", "db_url", "text:///etc/opensips/dbtext")

##### user location module
loadmodule "usrloc.so"
modparam("usrloc", "use_domain", 1)
modparam("usrloc", "nat_bflag", "NAT")
modparam("usrloc", "db_mode", 0)
#modparam("usrloc", "db_mode", 1)
#modparam("usrloc", "db_url", "mysql://opensips:opensipsrw@localhost/opensips")

##### registrar module
loadmodule "registrar.so"
/* uncomment the next line not to allow more than 10 contacts per AOR */
modparam("registrar", "max_contacts", 10)

##### Mid-Registrar module
loadmodule "mid_registrar.so"
modparam("mid_registrar", "mode", 2) /* 0 = contact-mirroring / 1 = contact-throttling / 2 = aor-throttling */
modparam("mid_registrar", "outgoing_expires", 7200)
#modparam("mid_registrar", "insertion_mode", 0) /* 0 = contact; 1 = path */
/* uncomment the next line not to allow more than 10 contacts per AOR */
modparam("mid_registrar", "max_contacts", 10)
# CBA set TCP connections to close only after reg expiry
modparam("mid_registrar", "tcp_persistent_flag", 10)

##### dialog module (for call duration limits)
loadmodule "dialog.so"

##### dns_cache module
loadmodule "cachedb_local.so"
modparam("cachedb_local", "cachedb_url", "local://")
loadmodule "dns_cache.so"
modparam("dns_cache", "cachedb_url", "local://")
modparam("dns_cache", "blacklist_timeout", 10) # 10 seconds


####### Routing Logic ########

# main request routing logic

route {

  # perform initial message validation
  if (!mf_process_maxfwd_header("10"))
  {
    xlog("L_NOTICE", "[$ci] sanity: rejecting due to maxfwd\n");
    sl_send_reply( "483", "Too Many Hops");
    exit;
  }
  if ($ml >= 16384)
  {
    xlog("L_NOTICE", "[$ci] sanity: rejecting due to message size\n");
    sl_send_reply( "513", "Message too big");
    exit;
  };
  if ($ua =~ "friendly-scanner")
  {
    xlog("L_NOTICE", "[$ci] blacklist: rejecting due to user-agent\n");
    # do not reply for stealth
    #sl_send_reply("403", "Forbidden");
    exit;
  };

  # OPTIONS processing
  if (is_method( "OPTIONS"))
  {
    # ReachUC clients require response to OPTIONS
    # only reply on internal port(s) for stealth
    #if ($Rp == 6050 || $Rp == 8070)
      sl_send_reply( "200", "OK");
    exit;
  }

  # check for sequential request within a dialog (indicated by tag in To header)
  if (has_totag())
  {
    # sequential request within a dialog should
    # take the path determined by record-routing
    if (loose_route())
    {
      if (is_method( "INVITE|BYE|ACK|CANCEL|REFER|INFO|SUBSCRIBE|NOTIFY"))
      {
        # test for issue with natted uc client
        # fix contact header for natted clients
        #if (client_nat_test("3")) {
        if (client_nat_test("9")) {
          fix_contact();
        }
        # mark routing logic in request
        append_hf("P-hint: rr-enforced\r\n");
        route(sequential);
      }
      else
      {
        xlog("L_WARN", "[$ci] except_lr_method: $rm (from: $fu -> to: $tu)\n");
      }
    }
    # sequential without record-route should be treated as initial
    # (currently only applies to ACK to a failure response)
    else if (is_method( "ACK"))
    {
      # mark routing logic in request
      append_hf("P-hint: rr-absent\r\n");
      route(initial);
    }
    else
    {
      #xlog("L_WARN", " [$ci] except_loose: $rm (from: $fu -> to: $tu)\n");
      append_hf("P-hint: rr-absent\r\n");
      route(initial);
    }
    exit;
  }

  # record routing for dialog forming requests (in case they are routed)
  # remove any preloaded route headers
  remove_hf( "Route");

  # request is an initial request, so apply routing logic

  # check for dialog-forming requests (INVITE and SUBSCRIBE) that should get record-route
  if (is_method( "INVITE|SUBSCRIBE"))
  {
    # test for issue with natted uc client
    # fix contact header for natted clients
    #if (client_nat_test("3")) {
    if (client_nat_test("9")) {
      fix_contact();
    }
    record_route();
    route(initial);
  }
  # allow initial ACK and CANCEL requests as well (outside of dialog)
  else if (is_method( "ACK|CANCEL"))
  {
    # sequential request without to-tag? (ie, ACK to a malformed 407 response without to-tag, or CANCEL request)
    route(initial);
  }
  # check for special-case for REGISTER request
  else if (is_method( "REGISTER"))
  {
    # fix contact header for natted clients
    #if (client_nat_test("3")) {
    if (client_nat_test("9")) {
      fix_contact();
    }
    # TODO determine if record-route is appropriate for REGISTER
    record_route();
    route(register);
  }
  # check for special-case for MESSAGE request
  else if (is_method( "MESSAGE"))
  {
    route(message);
  }
  # check for non-dialog-forming requests
  else if (is_method( "SUBSCRIBE|NOTIFY|INFO"))
  {
    route(initial);
  }
  else
  {
    xlog("L_WARN", "[$ci] unsupported: $rm (from: $fu -> to: $tu)\n");
  }

}

route[sdp] {
  # apply SDP manipulation
  if(has_body("application/sdp")) {
    # fix IPv6 media
    if (search_body("[Cc]=[Ii][Nn] [Ii][Pp]6")) {
      xlog("L_INFO", "[$ci] IPv6 SDP detected in initial request\n");
      #fix_nated_sdp("2", "10.11.12.13");
      #fix_nated_sdp("2");
      # replace ipv6 IP with fake private IP to invoke IP detection logic in NMS
      replace_body_all("[Cc]=[Ii][Nn] [Ii][Pp]6.*", "c=IN IP4 10.11.12.13");
    }
    # fix NS webrtc media
    #if (search_body("[Oo]=[Nn][Ee][Tt][Ss][Aa][Pp][Ii][Ee][Nn][Ss]_[Nn][Mm][Ss].*")) {
    if (false) {
      #xlog("L_INFO", "[$ci] NMS SDP detected\n");
      if (search_body("[Aa]=[Mm][Ss][Ii][Dd]-[Ss][Ee][Mm][Aa][Nn][Tt][Ii][Cc].*")) {
        xlog("L_INFO", "[$ci] removing msid-semantic from NMS SDP\n");
        replace_body_all("[Aa]=[Mm][Ss][Ii][Dd]-[Ss][Ee][Mm][Aa][Nn][Tt][Ii][Cc].*", "a=removed");
      }
      if (search_body("[Aa]=[Mm][Ii][Dd]:[Aa][Uu][Dd][Ii][Oo].*")) {
        xlog("L_INFO", "[$ci] removing mid:audio from NMS SDP\n");
        replace_body_all("[Aa]=[Mm][Ii][Dd]:[Aa][Uu][Dd][Ii][Oo].*", "a=removed");
      }
    }
  }
}

route[transcode]
{

  # check for switch to transcode call
  if (is_method("INVITE|ACK|CANCEL|BYE") && ($Rp == 7080 || $Rp == 8070)) {
    setflag(transcode);
    xlog("L_INFO", "[$ci] *** SET transcode=true\n");
  }

  if (isflagset(transcode)) {

    # create unique rtpengine id for this branch transaction
    #$avp(extra_id) = "branch." + $T_branch_idx;
    #$avp(extra_id) = $hdr(Via) + "." + $T_branch_idx;
    #$avp(extra_id) = $ci + "-" + $T_branch_idx;
    $avp(extra_id) = $ci + "-" + $ft + "-" + $T_branch_idx;
    xlog("L_INFO", "[$ci] extra_id: $avp(extra_id)\n");

    $var(rtpengine_branchflags) = "via-branch=extra";
    xlog("L_INFO", "[$ci] *** SET rtpengine_branchflags=$var(rtpengine_branchflags)\n");

    # codec-mask-CODEC: do not present CODEC to B-side, but use it on A-side
    # codec-strip-CODEC: do not use CODEC for A-side
    # transcode-CODEC: offer CODEC to B-side

    # external interface, coming from client
    if ($Rp == 5060 || $Rp == 7080) {
      # codec-mask-opus: make the A-side receive opus codec, but B-side will use something else
      # codec-strip-PCMU codec-strip-PCMA: prevent the A-side from receiving either PCMU or PCMA codec
      # transcode-PCMU transcode-PCMA: offer either PCMU or PCMA codec to B-side
      #$var(rtpengine_offerflags) = "codec-mask-iLBC codec-mask-opus codec-strip-PCMU codec-strip-PCMA transcode-PCMU transcode-PCMA";
      $var(rtpengine_offerflags) = "via-branch=extra codec-mask-iLBC codec-mask-opus codec-strip-PCMU codec-strip-PCMA transcode-PCMU transcode-PCMA";
    }
    # internal interface, coming from pbx
    else if ($Rp == 6050 || $Rp == 8070) {
      # codec-mask-PCMU codec-mask-PCMA: make the A-side receive PCMU or PCMA codec, but B-side will use something else
      # codec-strip-opus: prevent the A-side from receiving opus codec
      # transcode-opus: offer opus codec to B-side
      #$var(rtpengine_offerflags) = "codec-mask-PCMU codec-mask-PCMA codec-strip-iLBC codec-strip-opus transcode-iLBC transcode-opus";
      $var(rtpengine_offerflags) = "via-branch=extra codec-mask-PCMU codec-mask-PCMA codec-strip-iLBC codec-strip-opus transcode-iLBC transcode-opus";
    }
    xlog("L_INFO", "[$ci] *** SET rtpengine_offerflags=$var(rtpengine_offerflags)\n");
  }
}

route[register]
{

      xlog("L_NOTICE", "[$ci] register_pre: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, contact: $ct)\n");

  if ($rd=~".*.service")
  {
    # handle service non-fqdn domains
    xlog("L_INFO", "[$ci] service_domain: $rd\n");
    # force static route to service proxy
    # service calls routed to <reseller>.hpbx.example.com
    $var(domain) = $rd;
    # route to hostname derived from domain
    $var(proxy) = "sip:" + $(var(domain){s.select,-2,.}) + ".hpbx.example.com";
  }
  else if ($rd=~".*.other.net")
  {
    # handle other domains
    xlog("L_INFO", "[$ci] other_domain: $rd\n");
    # force static route to other proxy
    # other calls routed directly to domain
    $var(proxy) = "sip:" + $rd;
  }
  else
  {
    xlog("L_NOTICE", "[$ci] register unknown_domain: $rd, REJECTING\n");
    # do not reply for stealth
    #sl_send_reply("403", "Forbidden");
    exit;
  }

  # mid-registrar
  #bm_start_timer("mid-reg-save");
  mid_registrar_save("location");
  #mid_registrar_save("location", "r");
  # note call to bm_log_time belows likely disrupts $retcode checked next
  #bm_log_timer("mid-reg-save");
  switch ($retcode) {
  case 1:
    # 1 (success) - current REGISTER request must be dispatched by the script writer over to the main registrar
    xlog("L_INFO", "[$ci] registrar: relaying registration for \"$ru\" to \"$var(proxy)\" [$si:$sp] $ru (from: $fu -> to: $tu)\n");
    $ru = $var(proxy);
    # swap internal and external interfaces
    route(swap);
    xlog("L_NOTICE", "[$ci] register_post: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, contact: $ct)\n");
    # relay the request
    t_on_reply("transaction");
    t_relay();
    break;
  case 2:
    # 2 (success) - current REGISTER request has been absorbed by the mid-registrar; a 200 OK reply has been sent upstream
    xlog("L_INFO", "[$ci] registrar: absorbing registration for \"$ru\" [$si:$sp] (from: $fu -> to: $tu)\n");
    break;
  default:
    # -1 (error) - generic error code; the logs should provide more help
    xlog("L_WARN", "[$ci] registrar: failed to save registration for \"$ru\"! [$si:$sp] (from: $fu -> to: $tu)\n");
    sl_send_reply("500", "Server Internal Error");
    exit;
  }

}

route[initial]
{

  # maniulate sdp
  route(sdp);

  # external interface
  if ($Rp == 5060 || $Rp == 7080)
  {

    xlog("L_NOTICE", "[$ci] initial_external_pre: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");

    # mark routing logic in request
    append_hf("P-hint: external\r\n");

    if ($rd=~".*.service")
    {
      # handle service non-fqdn domains
      xlog("L_INFO", "[$ci] service_domain: $rd\n");
      # service calls routed to <reseller>.hpbx.example.com
      $var(domain) = $rd;
      # route to hostname derived from domain
      $var(proxy) = "sip:" + $(var(domain){s.select,-2,.}) + ".hpbx.example.com";
      # temporarily routing to alternate host
      #$var(proxy) = "sip:ns-dev01.whitelabelvoip.net";

      # honor proxy request if present
#      xlog("L_INFO", "[$ci] *** X-RUC-Proxy: $hdr(X-RUC-Proxy)\n");
#      if ($(hdr(X-RUC-Proxy){s.len}) > 0) {
#        # set proxy
#        xlog("L_INFO", "[$ci] *** Setting new proxy!\n");
#        $var(proxy) = "sip:" + $hdr(X-RUC-Proxy);
#      }
      xlog("L_INFO", "[$ci] proxy: $var(proxy)\n");
      $du = $var(proxy);
    }
    else if ($rd=~".*.other.net")
    {
      # handle direct domains
      xlog("L_INFO", "[$ci] other_domain: $rd\n");
      $du = "sip:" + $rd;
    }
    else
    {
      xlog("L_NOTICE", "[$ci] initial unknown_domain: $rd, REJECTING\n");
      # do not reply for stealth
      #sl_send_reply("403", "Forbidden");
      exit;
    }

    # limit outgoing call duration to specified limit
    xlog("L_INFO", "[$ci] *** X-RUC-Limit: $hdr(X-RUC-Limit) (len: $(hdr(X-RUC-Limit){s.len}))\n");
#    if ($hdr(X-RUC-Limit)!="0") {
#    if ($(hdr(X-RUC-Limit){s.len}) > 0 && $hdr(X-RUC-Limit)!="0") {
    if ($hdr(X-RUC-Limit) != null && $(hdr(X-RUC-Limit){s.len}) > 0 && $hdr(X-RUC-Limit) != "0") {
      # set outgoing call duration limit
      xlog("L_INFO", "[$ci] *** Setting call duration limit of $hdr(X-RUC-Limit) seconds!\n");
      create_dialog("B");
      $DLG_timeout = $hdr(X-RUC-Limit);
    }

    # NOTE: moved to branch_route[initial]
    # conditionally swap internal and external interfaces
    #route(swap);

    xlog("L_NOTICE", "[$ci] initial_external_post: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");
  }
  # internal interface
  else if ($Rp == 6050 || $Rp == 8070)
  {

    # NOTE: ACL now handled by UFW blocking of internal ports to external IPs

    xlog("L_NOTICE", "[$ci] initial_internal_pre: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");

    # mark routing logic in request
    append_hf("P-hint: internal\r\n");

    # mid-registrar
    xlog("L_INFO", "[$ci] registrar: looking up \"$ru\" $rm [$si:$sp] (from: $fu -> to: $tu)\n");
    if (!mid_registrar_lookup("location")) {
      xlog("L_INFO", "[$ci] registrar: could not find \"$ru\" $rm [$si:$sp] (from: $fu -> to: $tu)\n");
      if (is_method( "INVITE") && $rd=~".*.service")
      {
        # can not route .service domains without registration
        xlog("L_INFO", "[$ci] registrar: rejecting request, \"$ru\" not found in lookup $rm [$si:$sp] (from: $fu -> to: $tu)\n");
        t_reply("404", "Not Found");
        exit;
      }
      # 2.3 and below self-check
      else if (uri == myself) {
      # 2.4 and above self-check
      #else if (is_myself("$rd")) {
        # uri refers to this proxy (probably from a past registration)
        xlog("L_INFO", "[$ci] registrar: rejecting request, \"$ru\" refers to myself: $rm [$si:$sp] (from: $fu -> to: $tu)\n");
        t_reply("404", "Not Found");
        exit;
      }
      xlog("L_INFO", "[$ci] registrar: routing request as addressed to \"$ru\" $rm [$si:$sp] $ru (from: $fu -> to: $tu)\n");
      # routing as addressed instead of failing, as NOTIFY is using direct uri
      #t_reply("404", "Not Found");
      #exit;
    }
    else {
      xlog("L_INFO", "[$ci] registrar: found \"$ru\", routing to contact $rm [$si:$sp] (from: $fu -> to: $tu)\n");
    }

    # NOTE: moved to branch_route[initial]
    # conditionally swap internal and external interfaces
    #route(swap);

    xlog("L_NOTICE", "[$ci] initial_internal_post: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");
  }
  else
  {
    xlog("L_NOTICE", "[$ci] unknown interface: $rm [$si:$sp] (from: $fu -> to: $tu)\n");
    sl_send_reply( "403", "Forbidden");
    exit;
  }

/*
  # NOTE: moved rtpengine setup to branch_route[]
  # handle rtpengine setup for new call if early media
  route(transcode);
  if (isflagset(transcode)) {
    if (is_method("INVITE") && has_body("application/sdp")) {
      setflag(early);
      xlog("L_INFO", "[$ci] *** RTPEngine INVITE OFFER\n");
      rtpengine_offer("$var(rtpengine_offerflags)");
    }
  }
*/

    # relay the request
    t_on_branch("initial");
    t_on_reply("transaction");
    if (is_method("CANCEL")) {
      # preserve the received Reason header
      t_relay("8");
    }
    else {
      t_relay();
    }
}

branch_route[initial] {

  xlog("L_NOTICE", "[$ci] branch_initial_pre: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");

  # conditionally swap internal and external interfaces
  route(swap);
  # handle rtpengine setup for new call if early media
  route(transcode);
  if (isflagset(transcode)) {
    if (is_method("INVITE") && has_body("application/sdp")) {
      setflag(early);
      xlog("L_INFO", "[$ci] *** RTPEngine INVITE OFFER\n");
      rtpengine_offer("$var(rtpengine_offerflags)");
    }
  }

  xlog("L_NOTICE", "[$ci] branch_initial_post: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");

}

route[message]
{

  # message received on external interface
  if ($Rp == 5060 || $Rp == 7080)
  {
    xlog("L_NOTICE", "[$ci] message_external_pre: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu)\n");

    # mark routing logic in request
    append_hf("P-hint: msg-to-hub\r\n");

    # conditionally swap internal and external interfaces
    route(swap);
    # relay the request
    t_on_reply("transaction");
    t_relay();

    xlog("L_NOTICE", "[$ci] message_external_post: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu)\n");
  }
  # message received on internal interface
  else if ($Rp == 6050 || $Rp == 8070)
  {
    xlog("L_NOTICE", "[$ci] message_internal_pre: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu)\n");

    # mark routing logic in request
    append_hf("P-hint: msg-from-hub\r\n");

    # mid-registrar
    xlog("L_INFO", "[$ci] registrar: msg looking up $ru $rm [$si:$sp] (from: $fu -> to: $tu)\n");
    if (!mid_registrar_lookup("location")) {
      xlog("L_INFO", "[$ci] registrar: msg_unregistered: $rm [$si:$sp] $ru (from: $fu -> to: $tu)\n");
      if ($rd=~".*.service")
      {
        # can not route .service domains without registration
        xlog("L_INFO", "[$ci] registrar: msg_not_found: $rm [$si:$sp] $ru (from: $fu -> to: $tu)\n");
        t_reply("404", "Not Found");
        exit;
      }
      # 2.3 and below self-check
      if (uri == myself) {
      # 2.4 and above self-check
      #if (is_myself("$rd")) {
        # uri refers to this proxy (probably from a past registration)
        xlog("L_INFO", "[$ci] registrar: msg_this_server: $rm [$si:$sp] $ru (from: $fu -> to: $tu)\n");
        t_reply("404", "Not Found");
        exit;
      }
      xlog("L_INFO", "[$ci] registrar: msg_as_addressed $rm [$si:$sp] $ru (from: $fu -> to: $tu)\n");
      # routing as addressed instead of failing, as NOTIFY is using direct uri
      #t_reply("404", "Not Found");
      #exit;
    }
    else {
      xlog("L_INFO", "[$ci] registrar: msg_lookup_succeeded $rm [$si:$sp] $ru (from: $fu -> to: $tu)\n");
    }

    # conditionally swap internal and external interfaces
    route(swap);
    # relay the request
    t_on_reply("transaction");
    t_relay();

    # route as addressed
    xlog("L_NOTICE", "[$ci] message_post: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu)\n");
  }
}

route[sequential] {

    xlog("L_NOTICE", "[$ci] sequential_pre: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");

  # manipulate sdp
  route(sdp);

  route(transcode);
  if (isflagset(transcode)) {
    # NOTE: ACK may not be considered part of INVITE transaction in which case early flag would be out of scope
    #if (!isflagset(early) && is_method("ACK") && has_body("application/sdp")) {
    if (is_method("ACK") && has_body("application/sdp")) {
      xlog("L_INFO", "[$ci] *** RTPEngine ACK ANSWER\n");
      rtpengine_answer("$var(rtpengine_branchflags)");
    }
    # NOTE: not tearing down rtpengine session on CANCEL because it is
    # sequential and branch is not known, and because branch will reply with
    # failure status that will be associated with branch and this will
    # tear down the rtpengine session for this branch
/*
    else if (is_method("CANCEL")) {
      # handle rtpengine teardown
      xlog("L_INFO", "[$ci] *** RTPEngine CANCEL DELETE\n");
      rtpengine_delete("$var(rtpengine_branchflags)");
    }
*/
    else if (is_method("BYE")) {
      # handle rtpengine teardown
      xlog("L_INFO", "[$ci] *** RTPEngine BYE DELETE\n");
      # NOTE: not passing branch flags to rtpengine_delete here since branch is
      # unknown to sequential request in dialog, and we assume all remaining 
      # rtpengine branches will be torn down as a result.
      rtpengine_delete();
    }
    else if (is_method("INVITE") && has_body("application/sdp")) {
      setflag(early);
      xlog("L_INFO", "[$ci] *** RTPEngine INVITE OFFER\n");
      rtpengine_offer("$var(rtpengine_offerflags)");
    }
  }

  # swap internal and external interfaces
  route(swap);

  # relay the request
  t_on_reply("transaction");
  if (is_method("CANCEL")) {
    # preserve the received Reason header
    t_relay("8");
  }
  else {
    t_relay();
  }

  xlog("L_NOTICE", "[$ci] sequential_post: $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");
}


route[reply] {

  if (is_method("INVITE") && $rs=~"(180)|(183)|(200)|(202)") {
    #xlog("L_INFO", "[$ci] cross-check: Contact $ct\n");
    #xlog("L_INFO", "[$ci] cross-check: To user \"$tU\" in \"$tu\"\n");
    #xlog("L_INFO", "[$ci] cross-check: Contact user \"$(ct.fields(uri){uri.user})\" in \"$ct.fields(uri)\"\n");
    if($(ct.fields(uri){uri.user}) != null && $tU != $(ct.fields(uri){uri.user})) {
      xlog("L_WARN", "[$ci] cross-check: to-user \"$tU\" in \"$tu\" does not match contact-user \"$(ct.fields(uri){uri.user})\" in \"$ct.fields(uri)\"\n");
    }
  }

  # swap internal and external interfaces
  route(swap);

  # test for issue with natted uc client
  # fix contact header for natted clients
  #if (client_nat_test("3")) {
  if (client_nat_test("9")) {
    fix_contact();
  }

  # maniulate sdp
  route(sdp);

}

onreply_route[transaction] {
  # NOTE: when default onreply_route is present in addition to transaction onreply routes, *both* will be called.

  xlog("L_NOTICE", "[$ci] onreply_transaction_pre: $rs to $rm $ru <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");

  route(transcode);
  if (isflagset(transcode)) {
    if (is_method("INVITE")) {
      #if (t_check_status("1[0-9][0-9]")) {
      if (t_check_status("(180)|(183)|(200)|(202)") && has_body("application/sdp")) {
        if (!isflagset(early)) {
          xlog("L_INFO", "[$ci] *** RTPEngine OK OFFER\n");
          rtpengine_offer("$var(rtpengine_offerflags)");
        }
        else {
          xlog("L_INFO", "[$ci] *** RTPEngine OK ANSWER\n");
          rtpengine_answer("$var(rtpengine_branchflags)");
        }
      }
      else if (t_check_status("[3-9][0-9][0-9]")) {
        # currently disabling delete on failure until support for forking is
        # configured since this delete will tear down the active branch too.
        xlog("L_INFO", "[$ci] *** RTPEngine REJECT DELETE\n");
        rtpengine_delete("$var(rtpengine_branchflags)");
      }
    }
  }

  route(reply);

  xlog("L_NOTICE", "[$ci] onreply_transaction_post: $rs to $rm <$pr:$si:$sp -> $Ri:$Rp | $fs -> $dd:$dp> (from: $fu, to: $tu, from-tag: $ft)\n");
}

local.cfg

# listing external udp interface first so that it is used by mid-registrar
#  when relaying registrations to the main registrar
# external interfaces
listen=udp:1.1.1.1:5060
listen=tcp:1.1.1.1:5060
listen=udp:1.1.1.1:7080
listen=tcp:1.1.1.1:7080
# internal interfaces
listen=udp:1.1.1.1:6050
listen=tcp:1.1.1.1:6050
listen=udp:1.1.1.1:8070
listen=tcp:1.1.1.1:8070

route[swap] {

  # if message received on external port then switch to internal port
  # if message received on internal port then switch to external port

  # if message is destined for udp then switch to udp
  # if message is destined for tcp then switch to tcp

  if ($fs != NULL) {
    xlog( "L_INFO","[$ci] swap: send socket already set to $fs, not swapping $pr:$Ri:$Rp\n" );
    return;
  }

  # swap external and internal interfaces
  if ($Rp == 5060) {
    # if ruri is using tcp then force tcp in send socket
    if ($rP == "tcp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for tcp:1.1.1.1:6050\n" );
      force_send_socket(tcp:1.1.1.1:6050);
    }
    else if ($rP == "udp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for udp:1.1.1.1:6050\n" );
      force_send_socket(udp:1.1.1.1:6050);
    }
    else {
      xlog( "L_INFO","[$ci] swap: unknown r-uri proto, not swapping $pr:$Ri:$Rp\n" );
    }
  }
  else if ($Rp == 6050) {
    # if ruri is using tcp then force tcp in send socket
    if ($rP == "tcp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for tcp:1.1.1.1:5060\n" );
      force_send_socket(tcp:1.1.1.1:5060);
    }
    else if ($rP == "udp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for udp:1.1.1.1:5060\n" );
      force_send_socket(udp:1.1.1.1:5060);
    }
    else {
      xlog( "L_INFO","[$ci] swap: unknown r-uri proto, not swapping $pr:$Ri:$Rp\n" );
    }
  }
  else if ($Rp == 7080) {
    # if ruri is using tcp then force tcp in send socket
    if ($rP == "tcp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for tcp:1.1.1.1:8070\n" );
      force_send_socket(tcp:1.1.1.1:8070);
    }
    else if ($rP == "udp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for udp:1.1.1.1:8070\n" );
      force_send_socket(udp:1.1.1.1:8070);
    }
    else {
      xlog( "L_INFO","[$ci] swap: unknown r-uri proto, not swapping $pr:$Ri:$Rp\n" );
    }
  }
  else if ($Rp == 8070) {
    # if ruri is using tcp then force tcp in send socket
    if ($rP == "tcp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for tcp:1.1.1.1:7080\n" );
      force_send_socket(tcp:1.1.1.1:7080);
    }
    else if ($rP == "udp") {
      xlog( "L_INFO","[$ci] swap: swapping $pr:$Ri:$Rp for udp:1.1.1.1:7080\n" );
      force_send_socket(udp:1.1.1.1:7080);
    }
    else {
      xlog( "L_INFO","[$ci] swap: unknown r-uri proto, not swapping $pr:$Ri:$Rp\n" );
    }
  }
}

@vitalikvoip
Copy link

vitalikvoip commented May 23, 2020

I have a feeling that I'm really close to figure it out. Have logs and coredump when contact_body_t allocated in PKG mem in one worker and then freed in another what crashed it. It has crashed because I have mem debug and q_malloc detected broken memory block. Without memory debug I suppose it may either crash or corrupt PKG memory.
That pkg_malloc was called from dlg_update_contact()->parse_contact() during in-dialog CANCEL processing. Actually opensips allocated it in one worker when processed CANCEL and then tried to free when received 200OK for INVITE (the same transaction) in another worker.

To make it clear want to emphasize that in my case crash happens only when UAC sends reINVITE, received 100 Trying, and sends CANCEL immediately. While this "CANCEL" travels across the network, opensips has enough time to forward reINVITE and than it receives CANCEL from UAC and 200 OK for reINVITE from UAS a few moments later.

UAC                 proxy                               UAS
| ---reINVITE->     |                                    |
|  <-----100---     |                                    |
|                   | ---reINVITE, CSeq: 5 INVITE->      |
| ---CANCEL->       |                                    |
|                   |  <--- 200 OK, CSeq: 5 INVITE ---   |
|                 [ bang ]                               |

@vitalikvoip
Copy link

The patch attached fixed it for me. At least the problem I was able to reproduce with SIPP.

diff --git a/modules/dialog/dlg_handlers.c b/modules/dialog/dlg_handlers.c
index 8afca4601..4930dd54b 100644
--- a/modules/dialog/dlg_handlers.c
+++ b/modules/dialog/dlg_handlers.c
@@ -1259,6 +1259,11 @@ static void dlg_update_caller_contact(struct cell* t, int type, struct tmcb_para
                return;
        }
 
+       if (msg->msg_flags&FL_SHM_CLONE) {
+               LM_ERR("skip SHM allocated msg\n",);
+               return;
+       }
+
        dlg_update_contact(dlg, msg, DLG_CALLER_LEG);
 }
 
@@ -1275,6 +1280,11 @@ static void dlg_update_callee_contact(struct cell* t, int type, struct tmcb_para
                return;
        }
 
+       if (msg->msg_flags&FL_SHM_CLONE) {
+               LM_ERR("skip SHM allocated msg\n",);
+               return;
+       }
+
        dlg_update_contact(dlg, msg, callee_idx(dlg));
 }

@vitalikvoip
Copy link

Unfortunately, it looks like that was not the only crash related to dialog module in 2.4.7

Got a new crash:

#0  context_get_ptr (type=CONTEXT_DIALOG, ctx=0xfa, pos=0) at ../../context.h:179
#1  0x00007f476306a1be in dlg_ctx_get_ptr (dlg=0x12, pos=0) at dlg_ctx.c:65
#2  0x00007f47634eee01 in free_acc_ctx (ctx=0x7f476e3af640) at acc_logic.c:205
#3  0x00007f47634f37c8 in unref_acc_ctx (ctx=0x7f476e3af640) at acc_logic.c:1287
#4  0x00007f47691a3c54 in empty_tmcb_list (head=0x7f476d64a3d8) at t_hooks.c:53
#5  0x00007f4769188a51 in free_cell (dead_cell=0x7f476d64a368) at h_table.c:127
#6  0x00007f47691b6885 in delete_cell (p_cell=0x7f476d64a368, unlock=1) at timer.c:239
#7  0x00007f47691b750a in wait_handler (wait_tl=0x7f476d64a3e8) at timer.c:458
#8  0x00007f47691b94b8 in timer_routine (ticks=135779, set=0x1) at timer.c:1091
#9  0x00000000004a703e in handle_timer_job () at timer.c:731
#10 0x000000000054806c in handle_io (fm=0x7f47799bf0a8, idx=3, event_type=1) at net/net_udp.c:265
#11 0x0000000000546983 in io_wait_loop_epoll (h=0x9014e0 <_worker_io>, t=1, repeat=0) at net/../io_wait_loop.h:280
#12 0x000000000054896b in udp_start_processes (chd_rank=0x80371c <chd_rank>, startup_done=0x7f476bd40b70) at net/net_udp.c:389
#13 0x000000000043d0c5 in main_loop () at main.c:784
#14 0x000000000043fde7 in main (argc=13, argv=0x7ffe1e40b5f8) at main.c:1442

@attermann
Copy link
Author

@vitalikvoip Curious did your patch address the crash in free_contacts() during transaction clean-up, or only the dialog-related crash?

@vitalikvoip
Copy link

My patch addresses crash I had in dialog module when had described call flow with CANCEL from caller and 200 OK from callee.
Had another one with transaction clean up but it's still related to dialog because crashes when cleans dialog context created by acc module. I don't actually need acc at the moment, but still wanted to fix it since I've already dived deep into this part of opensips, but couldn't reproduce it manually and my production proxies stay stable since last week. Will try to analyze when it happens again.

@vitalikvoip
Copy link

It's been working without any issues for 5 days. @attermann if your opensips still crashes and you see in coredumps it happens around free_contacts() I think it worth trying my patch.

@bogdan-iancu
Copy link
Member

I suspect that there is a TMCB_RESPONSE_OUT (type 128) TM callback (registered by other module) that is parsing the Contact hdr of the request that was cloned in SHM memory.
Now, if you have high load, you may have SIP retransmissions, so you may end up with two 200 reply's handled in different procs (in the same time), both trying to "parse" + "use" + "free" the contact hdr over the same sip msg structure (the cloned, shared request that is saved into transaction).
Now, the TMCB_RESPONSE_OUT callbacks are executed under lock, so maybe there is some part of the code (outside that callback) that tries to parse the Contact....and overlaps with the callback
There are a couple of modules using the TMCB_RESPONSE_OUT callback and potentially parsing the Contact hdr, so let me try to narrow down the possibilities...

@vitalikvoip
Copy link

After applying the patch I've posted above I had no more crashes. So maybe in my case it was only due to CANCEL handling since CANCEL is not end-2-end request and dlg update functions dealt with the cloned sip msg.
Haven't thought about retransmissions from this angle. Your assumption makes perfect sense and most likely retransmission maybe be another corner case.

@bogdan-iancu
Copy link
Member

You suggest that the concurrent access+parsing+free over the request stored in shmem happens between (1) the 200 OK reply and (2) the CANCEL ?

And your patched eliminated this concurrency by not doing any parsing for the 200 OK ?

@vitalikvoip
Copy link

The problem was not in concurrency. Back then I couldn't understand why it crashes and the described case was the only one I knew to reproduce the crash.
As far as I understood all parsing functions are supposed to be called only on PKG allocated sip_msg received from a wire and before it is cloned to SHM by tm module. Normally, when tm executes TMCB_REQUEST_BUILT callbacks it provides either PKG allocated sip_msg that was read before or faked_req (do_dns_failover) which is cleaned just after usage in the same process. The CANCEL request is an exception since t_cancel.c:cancel_branch() provides an SHM allocated sip_msg (t->uas.request) to callbacks what makes any parsing dangerous. When it happens, one process handles CANCEL request and parses its contact allocating some info in PKG mem and attaching it to SHM allocated sip_msg. Then after a while another process (for example timer) may try to read that data and dereference those PKG pointers and reads some random unexpected data or writes something that leads to memory corruption.

@stale
Copy link

stale bot commented Jul 18, 2020

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

@Fernandojdk
Copy link

Cause I have a lot of environment running the latest 3.1 from rpm packages, I can take some coredump info to provide more info if needed.

Let me know if this will help.

@stale stale bot removed the stale label Jul 31, 2020
@vitalikvoip
Copy link

I have it running almost a month without any issues while had at least one crash day before applied the patch.

@Fernandojdk
Copy link

I’ll try your patch to see if it will fix for me too.

@vitalikvoip
Copy link

I’ll try your patch to see if it will fix for me too.

You've mentioned you are using opensips 3.1. I've tested my patch only with 2.4.7 and don't if this part is still the same in 3.1. Anyhow give it a chance and I can take a look at your coredumps closely when have a spare moment.

@Fernandojdk
Copy link

I’m able to locate the block of code on 3.1 where i need to apply your patch. Today I’ll try to see if it fix for me on the latest 3.1.

@vitalikvoip
Copy link

@Fernandojdk , did you have a chance to check it with 3.1?

@Fernandojdk
Copy link

Fernandojdk commented Aug 20, 2020

Hello @vitalikvoip

I applied your patch on opensips 3.1, but the bug continued when we are under high load. For 3.1 this patch did not work.
Unfortunately.
The frequency at which the bug happens has decreased, but still here.

@liviuchircu
Copy link
Member

liviuchircu commented Aug 21, 2020

Hey guys,

@razvancrainea came up with the following patch for 2.4 that should address the issue (PKG-parsed contacts intermingling with a SHM-stored SIP message), would you mind giving it a try? We are undergoing some testing as well on it, as of lately.

git apply <(base64 -d <<EOF | gzip -dc
H4sIAMyQP18AA7VVa0/bSBT97PyKS1ZQx688SKEtBUEJUKQ0XSHYL93KMvbEtmrPuJ5x22jhv++9
fuRVAlRqpcSex52Zc849cx3E0ynYdhgr8LqpCIqEyW4Qe4kIu0ESupHHg4Tl0vHh9vH5VswD9gN2
9wZs1/cGvuP02WB/Ohh6AfR7vb3hsGXb9lOntEzTfPKk42OwXw6tPTDxuQ/Hxy34K+Z+UgQM2o7T
xV/gKc9NijRzovbPs5mXS5ZXL1eJp2N8yb5SlLkhyhdceb5q3o9t2ITWGz+9oArM8+ejJH36g94r
q48SYeO1NSCVQCpPxT7EPIk5w5cCErbIUKw5El2qvPCrGZ8lCRjYsqAelXHmpjIEAx9WC/5rgYYz
UK89wK4feTkYKs0OWqZGR+RMwSH0qFuHuQoMnwYnN+MxjuOqeAr6Fu5pH9UxsLODw5qmV8wi5gWY
e52OhfejK/f04+T65PTaPbeg14G30IO7O1jZodMp4WnjD+7o3YXevpz8c3l9BiKHm79HJ9j6HqtI
FOj7Bj7YwIWCUo+Yh1v/8naHKGlIocg5cbC1e2CJRPHWAdtHJdIAgcNKavUVUASVgNklsMvJ+ccN
yOA2F18YfxybvYzN1O5J8k3A6FRTa3KA6q8G3YpgRntoKo9TN0G98Rh9p8FdTtHeK9zkPMCCnSV2
GPs76NE2y/xKgksMCHeRx+XU/DhM9hs4FwUPoOB1Tpoln7YdQ37Gza1myEkYX3RkeWid4jW5dL2x
L0lFHu48KPN8RNbgAGEDpuq3bEXXMhH+FxcbOv5Lg1JesG0fJSyUn/Dx2Vkwqm5B1yDLqoiB9FIG
86wFgr/Ayy7A4zMVofRgdOv6MXhpvaLysbtv/Znqgdml6sH9NNsIfyk3KznrwOFh5TXQFnf8RhKD
dZbwltJ+BFO0YPVBge0MBAc8CrYDskMFRnvYFBaUJDC6unJlEgq+lgZ73axaKJQAxoOyhtzTo2Q5
TbxQwt0hjMYX7vn45MI9fX8yuTgbHSyXqyuWJZ5PdEQS/AqTKnm7e3XxH+5b/dd/KH1Y5dHNMkrd
1EtQDn05QY0xtzBqUYrPrq70Nl11xkURRpCyVOSzkhBn3xuei/K2SepaabtfSl19Y+rOsuz31ZV5
0Fq4ovxMbYxAFlRkFj2KRcB+Nvt1v9LauU1L0RdZRbzPtOgzDGrWapAYqMKbumg8IGKtYakaZYoK
+DRnK9VdNTtSJL6QBlWzxk3fRBys+Ah9wPJ1O5GVDMBqQ65Ts4zNLaVS/9bFauelEoxMdlr/AycT
XeKQCgAA
EOF
)

LE: I have edited the above patch, do not use the version included in the email! :)

@Fernandojdk
Copy link

Hello guys,

I've changed the code posted by @liviuchircu and ported to 3.1 to see if it will fix this issue.

On the next week i can tell if this fix works.

razvancrainea added a commit that referenced this issue Aug 26, 2020
This avoids concurrent access to the contact header's parsed structure
in request messages saved in shm - addresses ticket #2095
razvancrainea added a commit that referenced this issue Aug 26, 2020
This avoids concurrent access to the contact header's parsed structure
in request messages saved in shm - addresses ticket #2095

(cherry picked from commit df68e3b)
razvancrainea added a commit that referenced this issue Aug 26, 2020
This avoids concurrent access to the contact header's parsed structure
in request messages saved in shm - addresses ticket #2095

(cherry picked from commit df68e3b)
razvancrainea added a commit that referenced this issue Aug 26, 2020
This avoids concurrent access to the contact header's parsed structure
in request messages saved in shm - addresses ticket #2095

(cherry picked from commit df68e3b)
@razvancrainea
Copy link
Member

We've tested this in several environments and crashes stopped to happened, hence I believe this issue is fixed. If you're still facing this problem, please re-open the ticket with further info.

@Fernandojdk
Copy link

Hello guys.

Just for feedback. I already adapted the above fix to work with 3.1 and until now we have no crash anymore. So i think this issue is fixed.

Thanks all 🥇

@attermann
Copy link
Author

Hi @liviuchircu. I was able to apply your patch above to 2.4.7 and will be testing it in production over the next week. I also tried applying it to 2.4.8 but it failed. Is this patch already in 2.4.8?

@liviuchircu
Copy link
Member

Yes it is, @razvancrainea pushed the same fix in df68e3b

kxin-vonage added a commit to Vonage/h2o-opensips that referenced this issue Nov 16, 2021
* xsnprintf.c: fix excessive expression

(cherry picked from commit 99d35618a22daa7bed3d426cdb9db7eb87d507df)
(cherry picked from commit d3ba9c451e07ebc91f9036f7f2468a6ea1ccf1e2)

* msilo: fix excessive check

(cherry picked from commit 71b6088d888af32609b7dad87f0e250580c6c7ff)
(cherry picked from commit 222a80153abed8fd731a623aa371861ea4f4a006)

* osp: fix use of uninitialized pointer

(cherry picked from commit fc5b59ea3eeeabac4e0a28302b5d0b0d1b5f507d)
(cherry picked from commit 451fe196826ead159e0ddda34719b1f9948c0bad)

* b2b_logic: fix checking potentially uninitialized pointer

(cherry picked from commit fefd32149a97cdd2ad1b1ea0f19a4eeaa0809375)
(cherry picked from commit 081b654024ea627f479c9137414b108db85004d9)

* pua: fix bitmask check

(cherry picked from commit cd7dec8af8f71fa1024b69fb66e8f02ef3d43278)

* proto_help: Fix broken doc build

(cherry picked from commit b9a6292e9de47c02a9e5ae1dcc540c888f90be8a)

* Revert "auth: fix error check"

This reverts commit b7772a087e4bf5c57fa4655d9fd53e074ca4247e.

(cherry picked from commit d8ec4f3b05f2c4b21fe3363b3e146c1811ccf870)
(cherry picked from commit 72a3f08a96149ebd7a18d8338d86fe587d4ab669)

* auth: return -1 on error in auth_get_ha1()

(cherry picked from commit 96fcb5f9fd95796e11ec00a8246426d6dfe11a58)
(cherry picked from commit 9578c7fef7f54000b9e38c5b248302649a739d06)

* parser/content: handle \r for Accept folding lines

When receiving a heeader with folding lines, we also need to skip '\r'
when parsing the new lines. This way we can accept headers such as:
Accept: application/sdp,
	application/isup,
	multipart/mixed,
	application/dtmf,
	application/dtmf-relay

Thanks go to 46Labs for reporting this

(cherry picked from commit f7efd44c9e796e941452d2c173bac6aec4342bb3)

* tls_mgm: enforce at least TLSv1 when SSLv2.3 is used

Starting with Debian 10 (buster), the OpenSSL library is compiled with
TLSv1.2 as the minimum TLS version. Therefore, when specifying SSLv2.3,
basically only TLSv1.2 and TLSv1.3 can be used.
This commit sets the minimum version to TLSv1, since SSLv2.3 will soon
be deprecated.

Thanks go to @telematico for reporting this issue
Close #1986

* Rebuild documentation

* acc: Improve oom handling

Do not populate bogus str values (e.g. {NULL, 3}) when out of SHM and
setting $acc_extra.

(cherry picked from commit 86ef10af2cbaad475dd92d41be64873af40e716e)

* Fix param order in goes_to_gw_1 without partitions

Credits go to @zhouboyong

* Rebuild documentation

* osp: actually fix potential use of uninitialized pointer

If OSPPTransactionNew() failed, callids[0] is still uninitialized.

(cherry picked from commit c8a99ad3d1dea71ddaa73ed4cac42f4c56b49a38)
(cherry picked from commit 8a6d458ec72243c667b39ac1f97cfe7e7e98d1ec)

* exec: fix potential use of uninitialized pointer

(cherry picked from commit 9da92ced576d18e5487cf5dfa584ee638c170520)
(cherry picked from commit a4da730c13d52b494f834fdd5c7ff5d61a17937f)

* exec: fix NULL check after malloc

hf->next_other should be checked instead.

(cherry picked from commit 8d7e2f310d7f43535dea1dff6b6889df02d95c02)
(cherry picked from commit bcd723680b506f6ac0a6ec095e3920777bcb6b65)

* rtpengine: check NULL before assignment

(cherry picked from commit f33f1d839102a24342a91f15a6902c05ee90ea45)
(cherry picked from commit b428f119fb179d00f47d1917362b33484ebaf4b1)

* db_virtual: check NULL before log

(cherry picked from commit bbbe695ea8d64efe8a6152ed69aca88896d7a3dc)
(cherry picked from commit 39a3a06fd2486848d7e22736182602a741d42f5d)

* Rebuild documentation

* Fix compilation warnings with -DUSE_FUTEX

It seems that whenever "time.h" is included without first enabling
_GNU_SOURCE, including "unistd.h" afterwards will leave the "syscall"
function undeclared, regardless whether _GNU_SOURCE is defined or not.

../../parser/../mem/../futex_lock.h: In function ‘get_lock’:
../../parser/../mem/../futex_lock.h:78:31: warning: implicit declaration
of function ‘syscall’; did you mean ‘sysconf’?
[-Wimplicit-function-declaration]
 #define futex_wait(lock, val) syscall(SYS_futex, lock, FUTEX_WAIT, val,
 0, 0, 0)
                               ^
../../parser/../mem/../futex_lock.h:268:4: note: in expansion of macro
‘futex_wait’
    futex_wait(lock, 2);
    ^~~~~~~~~~

(cherry picked from commit fa4e4fa70085db9da4dc72f856e0648341b0f5a6)

* Fix un-released transaction upon t_wait_for_new_branches + cancel.

If t_wait_for_new_branches() is used and no real signaling branch is added, the caller canceling must not be ignored upon branch selection. The injected 408 timeout on the phony branch must be considered, leading the transaction termination.

Thanks to Chris Maciejewski for reporting and providing troubleshooting info.
Also thanks to Razvan Crainea ( @razvancrainea ) for investigating and reproducing the problem.

(cherry picked from commit 7221b984d555cb68fd07e1bfb502607446345d42)

* Improve canceling of transactions doing only t_wait_for_new_branches().

If a transaction has no real signalling branch, but only doing t_wait_for_new_branches(), upon caller cancel inject a 487 reply on the phony/waiting branch in order to terminate the transaction on the spot.
The provious behavior was to wait (even after the cancel) for the phony/waiting branch timeout (fr_inv_timeout) and to generate an 408 timeout.
Now, there is no useless waiting and a proper 487 is generated.

(cherry picked from commit 8af80a40bb9877b91380da670353a68c53f0a36b)

* Merge pull request #2074 from besser82/topic/besser82/json_c_014_so_5

Add support for upcoming json-c 0.14.0.
(cherry picked from commit fe24824e042486575daa01e20df2c8380b9e0b07)

* Merge branch 'master' of https://github.com/robdyck/opensips into robdyck-master

(cherry picked from commit 4558e7af66240a2e359a1ffbeee114bad3b4f71d)

* Removed extra space

(cherry picked from commit b47d7a617cd8384f2929655f497f3b52c554028d)

* [exec] improve docs on exec async + launch

Related to #2049

(cherry picked from commit bb7afdc2dc4363f24d8e28e00ab337c7c174d1e5)

* proto2a(): Do not segfault on PROTO_NONE (0)

(cherry picked from commit 54064b4614d42164b49105b43f9a5dc1ce308a92)

* xml: allow access to nodes with undefined namespace prefixes

Related to #1988

(cherry picked from commit 17b44ea1f2270be8e59b893ab70b3aac8d5a3826)

* Rebuild documentation

* Merge pull request #2037 from l2dy/tracer

tracer: fix null pointer dereference
(cherry picked from commit d59715ec3614c5c9d5afb40aa17704d86fca6021)

* rest_client: Disable "Expect: 100-continue" by default

Although the "100 Continue" flow sometimes saves bandwidth when the
server replies with an error response, it also has some drawbacks:

    * extra roundtrips, adding latency and complicating the async
        handling
    * the client body upload and server body download must be performed
        within the same curl_multi_perform() loop, during the async
        resume callback, limiting the amount of parallelism that can be
        obtained when compared to the simple flow

From now on, the "100 Continue" flow for POST/PUT requests with bodies
larger than 1024 bytes is disabled by default.  Developers may enable it
using a new, backwards-compatible, "enable_expect_100" module parameter.

Fixes #2081

* async statement: Improve error log when tm is missing

(cherry picked from commit afb721ed164206855130202cf5b0ff4c83bf2394)

* Rebuild documentation

* fraud_detection: Avoid random str_strcmp() error logs

* fraud_detection: Fix bad test in e1767c80

* HP_MALLOC:  Dramatically reduce memory fragmentation

For HP_MALLOC to offer optimal parallel allocation performance, it
needs the memory to be fragmented: the more available chunks, the better
its performance.  However, there needs to be some coalescing as well,
otherwise variable-length allocations past the 10K range will
quickly deplete (fragment) even SHM pools of 10+ GB.

This patch adds the following heuristics to HP_MALLOC and HP_MALLOC_DBG:

    * shm_malloc(): only split a fragment if the resulting chunk is at
      least MIN_SPLIT_SIZE bytes (default 256 for PKG, 4096 for SHM)
    * shm_free(): try to coalesce the next fragment (enough to fix all
      fragmentation issues)

* HP_MALLOC: Fix computation of PKG stats

* HP_MALLOC: Add "re-scanning" logic

Given that HP_MALLOC has fine-grained locking, if the big fragment
shifts down from hash bucket N to bucket N-1 due to another
process performing the allocation, we may actually "lose" it during
our own scan, since bucket N-1 was empty and we're now waiting
for bucket N to unlock.  When it does unlock, it will also be empty.

As a solution: retry the scan up to N times, as long as it's feasible!

* HP_MALLOC: Aggregate SHM statistic updates

It is better to update the stats with a diff of the
attached/detached/split/merged fragment changes.  Updating the stats in
a "step by step" manner has the side-effect of creating temporarily
bogus values. (i.e. negative "used" and/or "real_used")

* Core: Add an HP_MALLOC stress testing suite

Fixes #2089

* drouting: Fix segfault during dr API reloads (fraud_detection)

(cherry picked from commit 362f8ffa49f2a1070ac16f5021a81871f568f4af)

* Rebuild documentation

* Add missing typedef keyword

Without this keyword these struct definitions are considered by GCC as a
variable definition.

This was catched because there is a more strict check about these
variable definitions since GCC 10:

* https://gcc.gnu.org/gcc-10/porting_to.html#common

Signed-off-by: Peter Lemenkov <lemenkov@gmail.com>
(cherry picked from commit 242903711399ccb4eafeac614129424ffddb0e1b)

* Fix building with gcc 10

GCC 10 started to use -no-common by default. Here is an official
explanation:

* https://gcc.gnu.org/gcc-10/porting_to.html#common

We have to mark all the declarations of global variables with extern
keyword and put the definitions into a proper file.

Signed-off-by: Peter Lemenkov <lemenkov@gmail.com>
(cherry picked from commit 7b59ee14e95ea3500dd8c3368b081404b9731c9b)

* rtpengine: Remove unused struct fields

These two fields were copied directly from rtpproxy module but never
used by this module.

Signed-off-by: Peter Lemenkov <lemenkov@gmail.com>
(cherry picked from commit fd8262cee78a67e1dcca7f487f919239d0f13f33)

* make: Prepare for upcoming gcc 10 (use -fno-common)

Per https://gcc.gnu.org/gcc-10/porting_to.html#common, -fno-common will
become a default in gcc 10, so variables defined within .h files will no
longer be allowed!

Thanks to Peter Lemenkov for the suggestion!

(cherry picked from commit b864b2915551fc25509a54eaf30df0cedbe1643f)

* HP_MALLOC tests: Fix unused function warning

* Fix more "-fno-common" gcc warnings

* proto_hep: Fix HEPv3 IPv6 destination contents (send DST not SRC addr)

(cherry picked from commit 7667483bbfb43017ccdfc8929ffe91a37c10c71f)

* dispatcher:  Fix missing "tmb" definition

(cherry picked from commit afbf2871a144466a16534db8c86d7d0377077e17)

Fixes #2110

(cherry picked from commit c9e61269f360de6e91fae54f8b6a92964e5f00a3)

* Rebuild documentation

* Fix compilation with non-HP_MALLOC and -DUNIT_TESTS

* signal_pkg_status(): Fix un-initialized pt entry detection

otherwise, OpenSIPS may do a kill(-1, SIGUSR2), which terminates all
processes of the current user...
		    (sometimes this includes the entire X session :-) )

* Rebuild documentation

* Rebuild documentation

* Skip the disabled GWs when using route_to_gw()

Closes #2072

(cherry picked from commit d0d05907099add7666e31820b4b9ecd2d4c85e7e)
(cherry picked from commit a43ee8d78f8596e7f6fbf2d9fb154a879042c660)

* EXpire the WAIT type subscriptions also (produced by async wait_for_event())

Credits go to @liviuchircu
Closes #2054

(cherry picked from commit fe9bfd9c9e17d9ec069b9b98010a151800d30d83)

* RPM spec: don't build cachedb_mongodb for centos-8 because of removed libzstd, fix perl build deps

* db_perlvdb: Fix incorrect integer downcast

Credits to Aaron Meriwether for spotting the bug and suggesting the fix

Fixes #1952

(cherry picked from commit 3e83412d50b0d8dd5b8a0f7498fb8b66e96ef353)

* Rebuild documentation

* Module dependencies: Improve debug log text

Suggested by Mark Farmer

(cherry picked from commit 128c545677bfa7a8bdfd27b4d4ac399937df7bc2)

* Fix crash in tracing xlogs from routes without SIP msg

(cherry picked from commit b0a63d96ca16a26b3b4a623484ff2359d5533811)

* Fixed the HEP type for the logs

Credits go to @bibace
Closes #2137

(cherry picked from commit 7b36e441a6314fbd54745d8084aa5cfe88c341d7)

* Fix the src/dest info for the HEP "log" packages

This fixes the way the logs are displayed in the call flow of Homer

(cherry picked from commit 98c39776965527a4d91c25755239369029a5e023)

* Fixed 3 x htons(), reducing to one :)

(cherry picked from commit 1856155a6c2093efae7448fce80371e97eedd42a)

* tls_mgm: fix a cleanup crash when failing to create a SSL_CTX

(cherry picked from commit c6ac01258b61608b06ca400815e3b89302724107)

* cachedb_mongodb: fix crashes when loading module along with tls_mgm

Loading the cachedb_mongodb module causes the openssl library to be
initialized in the pre-daemon process, by the mongoc library's
constuctor, before tls_mgm gets the chance to do its own initialization
in the attendant process.
This commit overwrites the openssl functions used by the mongoc library
in order to actually skip the initialization and leave it to tls_mgm.

Related to #2091

(cherry picked from commit a3e87277f275d2b26a3f1472e78a07baea9de85c)

* cachedb_mongodb: complete fix in commit 4dfa0ba for openssl < 1.1.0

(cherry picked from commit 9f454463e42f2148990bd5f307eeff0acecb18bb)

* Rebuild documentation

* net/tcp: mark connections as being added in reactor

This way we can "remember" to remove them from the reactor when the
connection times out.

Full credits go to @vitalikvoip for reporting, troubleshooting and
fixing this problem. Close #2134

(cherry picked from commit 3e862f0e34ab9fa7065e6132935ab7c429c730bb)

* proto_wss: proper cleanup wss connection in case of tls error

When TLS domain is not found, or the SSL context cannot be created, make
sure the connection does not keep an hanging `proto_data` that will be
later double-freed by the cleanup routine.

Credits go to Jonathan Hulme (@digipigeon) for reporting this.

Close #2123

(cherry picked from commit b150486a6944f684b238a62dc2d2dc9e307dd007)

* dialog: clarify usage scope of $DLG_dir

Reported by @imdrpn in #2132

(cherry picked from commit 896be267dba83b2e45d4b246a37c2de1dde5d6d1)

* rest_client: fix crash when loading module along with tls_mgm

Closes #2115

(cherry picked from commit f4227fa77a22e3439718fb9fe4b8077ea6fc659a)

* Rebuild documentation

* rest_client: complete fix in commit c4b84f7

Related to #2115

(cherry picked from commit 31104219475b94b864c126f42f1e5abba8e41148)

* The main process (attendant) is no longer running the child_init()

The main proc must stay free of any db connections, so it can later fork and child_init() other procs (due auto_scaling).
For the modules which were keeping a DB conn open in main proc for later usage, during module destroy, the new approach is to open the conn, on the spot, in the destroy handler.

Related to #2003

(cherry picked from commit 3a8f6f1375dc3e3dc8a4a786ee642d4a5fe61069)

* siprec: send BYE for sessions not established

When a SIPREC session was started on a 183 reply, but the session was
not stopped, a BYE to the SIPREC server should be sent to close the
session.

Thanks go to Krunal Patel for reporting this and offering the fix.

Close #2136

(cherry picked from commit b4cd132731f8e2fc0a949ea52b22952879b345ca)

* acc: fix no name tags in extra lists

Before this commit, extra_fields that were having spaces after the last
tag (such as: `last; ` would generate an extra field with no tag. This
commit fixes this case.

(cherry picked from commit a69e6ada6f57989e7a3af940e75d3b99ca8bced3)

* Fix wrong type of lumps.

As the eoh (Enf Of Header) is determined during the on-demand parsing, the add_lump() may wrongly lable a lump as "header" or "body" (as the add_lump() may be used before the parsing reached the EOH).
So, better identify all headers asap, so the add_lump() will properly work all the time. This will not add any overhead, as in 99.9999% of the case, due the functions used in cfg, anyhow you end up parsing the whole message.

Fixes  #2144
This is a 100% safe fix, there are 0 chances for side effects.

* dialog API: Clarify documentation for DLGCB_FAILED

The DLGCB_FAILED callback is subject to the "200 OK" vs. "408 timeout"
SIP race condition during dialog establishment and developers should use
it with extra caution :)

(cherry picked from commit 1f2c8c4854f9494ff7b712be3dc1eba87cba57f7)

* Delete expired subscription only with active tags

In clustering setup, the DB rutine handling subsscription should delete from DB and run the del handler only if the subscription has an active sharing tag.

Fixes #1848

(cherry picked from commit 929ab4d43c0ad1878d60d3ef27f2a71f905b80f0)

* Fix test of recognizing SCTP as UDP-based protocol

...and allow per-socket/listener definition of number of workers/processes

Closes #2125

(cherry picked from commit 749685c7ab6e7baa6562f9803bc52f2fc89ea82b)

* Proper detection/skipping of EOH in multipart.

Do not assume a len of 2 (not even checking the content). OpenSIPS accepts (for hdrs) a single \n or \r as EOH

(cherry picked from commit 3ec9d1b25430715f2b800c455c8518104a243f49)

* Fix wrong inheriting of Contact-Length hdr.

When moving from multi part to single part, do not inherite the Ct-len hdr of the part.
Fixes #1990

(cherry picked from commit 7dd1151341b8229cd30e335b246e56938551f6bd)

* Fix various typos, phrases & small issues

(cherry picked from commit 86f6e0c3038025bfc53243a27cb0fe76869c324e)
(cherry picked from commit 55ebe7b72f5619b4dbf1ebe86585d1599ca53295)

* TLS: Refresh expired default certificates

... and also extend their lifetime from 1 year -> 10 years

* rr: Fix ordering issues with add_rr_param() and create_dialog()

Commits bd1719dcf and c263182ee4d introduced regressions for
record_route() and record_route_preset(), respectively, which caused
add_rr_param() to not work anymore in the following sequence of function
calls:

create_dialog() with a "dlg_match_mode" != DID_NONE
record_route() or record_route_preset()
add_rr_param() # would not do anything, simply acting as a NOP

The reason is that the "fake" lump chain which was meant to hold
all add_rr_param() operations prior to an eventual record_route() call
would no longer be masked after the record_route() and would continue to
accumulate data from subsequent add_rr_param() calls.  Since it was
meant to be a "fake" chain from the beginning (maybe the user doesn't
call record_route() at all!), the data would simply be lost.

Many thanks to John Quick for the accurate report and instructions on
how to reproduce the issue!

* Fixes #2138 - mid_reg_save()

mid_registrar_save() not forwarding REGISTER in aor throttling mode after unregister when in write-back mode.

(cherry picked from commit 6ee1664559e65fefb2f9cc5f59772b4997711ce0)

* Rebuild documentation

* mid_registrar: Be compatible with gcc 4.x

... it seems named initializers were not supported back then.

This commit also makes the initialization more human-friendly by using
the "10" value (UL_EXPIRED_TIME) instead of the opaque "0" value -- the
end result is the same.

Thanks to Nick Altmann and Răzvan Crainea for their help!

(cherry picked from commit 84bea49d7e4392a23526790d135fbff816081ec3)

* rtpengine: use proper buffer iovec count when calling writev to send commands to rtpengine daemon

 - this went undetected for a long time probably because the last buffer had a zero iov_len ...

(cherry picked from commit b1bf482df43320cae37207bf7975215785e825ab)

* rtpengine: improve error logs: print # iovec buffers along errno/strerror on writev failure

(cherry picked from commit bd099afa394a8033f6d988f355ff0923592c8c0b)

* rtpengine: fix writev error 22:Invalid argument

 - this fix is based on a similar fix implemented in the rtpproxy module
 - if the # of iovec buffes is to high, concatenate last buffers and reduce the # of buffers

(cherry picked from commit eda6ec3b6260db52a8420d3a9f0cc2f82273af0f)

Conflicts:
	modules/rtpengine/rtpengine.c

* rtpproxy: fix writev error 22:Invalid argument

 - align code with rtpengine implementation for systems with no IOV_MAX defined

(cherry picked from commit 565e25ba9b967bb5442d62f2ef5791c51787f078)

* rtpproxy: improve error logs: print # iovec buffers along errno/strerror on writev failure

(cherry picked from commit f4ba69bc6708e94faeefe8d008f48c0a80d66eb8)

* cachedb_cassandra: fix unsupported protocol version errors

This commit prevents the Cassandra driver with version >= 2.15 from throwing
errors while trying to negociate the appropriate protocol version.

Closes #2145

(cherry picked from commit d60830f49c9c0ac6c161b56cddbc2572c04d3c66)

* Fixed uac_auth() on sequential requests in dialogs were uac_auth() was already used

The fix solves the conflict between the changes over the cseq value coming from (1) dialog module, trying to increment the cseq to compensate the cseq change during INVITE auth and (2) the uac module incrementing the cseq again because of auth'ing a sequential request.

The solution was to first indetify such case in uac_auth() and to use a larger delete lump (when replacing the cseq) in uac_auth(). Such a larger/wider delete lump will invalidate the lump added by the dialog module.

This is a follow up on #1969

Credits to Ben Newlin for reporting and assisting with the testing

* Fixed potentially un-init dlg variable

Thanks to Trevis for its report :)

* Bump version to 2.4.8

* cluster sync logging: Fix incomplete "if" condition

Reported by William Jin

* tls_mgm docs: Fix example

Reported by JP Hindin

* clusterer: Fix some startup corner-cases (crashes)

The first extra check fixes a crash in 'db_mode = 0', if only
'neighbor_node_info' is given, so cl->current_node will be NULL.

The second extra check prevents:

(gdb) bt full
0  check_seed_flag (cl_list=<optimized out>) at node_info.c:307
1  load_db_info (dr_dbf=dr_dbf@entry=0x7f2e65996300 <dr_dbf>,
     db_hdl=<optimized out>, db_table=db_table@entry=0x7f2e65993370
     <db_table>, cl_list=<optimized out>) at node_info.c:485
2  mod_init () at clusterer_mod.c:408

Fixes #2086

(cherry picked from commit 0b609f03165ee80f525795f278128162f6418817)

* Rebuild documentation

* rtpengine: Remove unused defines

Signed-off-by: Peter Lemenkov <lemenkov@gmail.com>
(cherry picked from commit 463e815cfa1835f1ef97f9b8f16a943fce0b8762)

* Revert "rtpengine: Remove unused defines"

This reverts commit 27e758dfc0fd6ae40b2157c3535928b721c4b26f.

(backport was not needed on 2.4)

* db_mysql: fix crashes when loading module along with tls_mgm

(cherry picked from commit 9459f5d3acb5efa9a4a0e5c8d60c18733062bc04)

* Rebuild documentation

* HP_MALLOC: Fix bad merge conflict solve

It seems the 25eec9f5f backport was not properly done, as the
re-scanning logic ended up _inside_ the main loop, rather than
_outside_ it.

Also improve the default unit test settings, so the HP_MALLOC stress
tests can be more easily launched.

Credits to Ryan Bullock for the report
Fixes #2168

* pua: Fix some list management bugs; Improve code

    * do not use strncmp() to compare "str" structs, since it may lead
      to false positives (e.g. strncmp("foobar", "foo", 3) == 0).  Use
      str_match() or str_strcmp() instead.

    * do not read memory after pkg_free() (severity: low, since it's PKG
      memory, so reading "invalid" heap memory works anyway...)

    * simplify list_pop() and list_free() code

(cherry picked from commit 5df8b636a15a799b5f1ab7a58ebe1c208ce49d2d)

* xml: properly handle oom when extending $xml result buffer

Fixes Coverity CID #200068

(cherry picked from commit 3112b5942636ad281783a5ba0ec13992e0c78693)

* db_text: Fix Coverity warning (bad 'switch' fallthrough)

Thanks to Răzvan Crainea for the help!

(cherry picked from commit 6b56ca5fa1163bc290a4c01f554e07cae860fa8c)
(cherry picked from commit 8b9307f2ec54a0f1551599254d1afbc6e0f7799b)

* mid_registrar: Fix uninitialized variable

Although the severity is low, this code path could be triggered by a
bogus downstream registrar which returns a 200 OK without the Contact
advertised by the mid-registrar in its REGISTER request.

Fixes CID #200051

(cherry picked from commit a4684d54f1d32088936a345afa36cb66bbf53727)
(cherry picked from commit df197a285b7b6d3c751dce42a65f69a7d5abe68b)

* usrloc: Fix unitialized variable

CID #199971

(cherry picked from commit 5f03b8556bf241336795ba8ca517c982d732dd4a)
(cherry picked from commit 96428bcf3e5f849486862f6f4a481674e1d6fed3)

* freeswitch: Fix a rare memleak

CID #164007

(cherry picked from commit 2a9824cdc955eb783e08f88906b70d337dbdebe9)
(cherry picked from commit 2aa619c00666e85631cd687f243d05585ac1478f)

* HP_MALLOC status logging: Fix incorrect bitwise shift

CID #200101

(cherry picked from commit fcdd224a5c8638a28e3be10e4028c27aa404a637)

* statistics macros: Avoid unsigned integer comparisons against 0

CID #199900, #199913, #199923, #199934, #199941, #199964, #199998,
 #200032, #200049, #200074, #200096.

(cherry picked from commit 39fc05ec0ff2122dd79ec5582ac020c999c11c30)
(cherry picked from commit b497af00652cac0421f9e316bec4d5d12bb9ede3)

* mid_registrar: Remove unnecessary check

CID #199954

(cherry picked from commit 7a638403ee648ba629fd1f6f4a17bdea091da16d)
(cherry picked from commit c34e15660de83681a16c84ec61b96a85ec460e5d)

* usrloc: Remove unnecessary NULL check

CID #200086

(cherry picked from commit f305fa88f9341f37eb95bf88980e974723b57e94)
(cherry picked from commit 42b71cf855efa610c1c408caea7c3e54c039aa3f)

* mid_registrar: Add paranoid safety check

This extra check doesn't fix anything right now, but at least it makes
Coverity happy.

CID #200020

(cherry picked from commit 2e60b50228a80c1e9629e43670a8cbcc5f71839c)
(cherry picked from commit 85464ef2a96f9a3b4665f8ddbeea3292149eac79)

* sanitize mpath length

Fix Coverity #199897

(cherry picked from commit 587316afc6032142c96077e3efac8029d7fa8355)
(cherry picked from commit 4f3ed1239338e0c647bb387dcf5acdb51a6f3162)

* mpath: Fix possible buffer underrun

Also attempt to mask the false positive in CID #200070.

CID #211377

(cherry picked from commit 7f244f17a54833c6bb3f9ea58aa4afd855774ce1)
(cherry picked from commit 0108f2e7cba7e596b8f2ab20e0e176cbc9720744)

* drouting: Always NULL-terminate the GW buffer

While this doesn't immediately fix any bug, it should make Coverity
happy.

CID #200062

(cherry picked from commit df2ce6d826587f1130c5928b0815bd10ed2957f1)
(cherry picked from commit 2a8008ac6280059b9ba5548e7d856db7e0953ffb)

* exec: Always NULL-terminate the read buffer

CID #200052

(cherry picked from commit 0230f9c9032bd01661c6c7c197965528507e3f2c)
(cherry picked from commit 5ff4861e836cd97889c831c5f3533a985fb19d5e)

* mi_datagram: Minor improvements

    * always NULL-terminate the read buffer (CID #200029).  Doesn't fix
      any known bug, just a good practice
    * avoid redundant memset(65535) on each read
    * improve startup error log

(cherry picked from commit 0eee2119991270bd23a6279e6b1a825586f2d227)
(cherry picked from commit a2ae1349a19b2fc79d1893bef96d7f01434020cf)

* db_text: Complete 65b18091

(cherry picked from commit 673cf7f9eb3e5bd3a8da805483885e049bddc873)
(cherry picked from commit 11205ebef3ec4f207f33f01afe209518a0ed4453)

* dialog: don't allow RE-INVITE and OPTIONS pinging at the same time

Enabling both RE-INVITE and OPTIONS pinging for the same endpoint causes
issues with the CSEQ value in:
* ACK (incremented value compared to the corresponding INVITE)
* INVITE and OPTIONS (decreasing values between the requests), in case
  races occur between computing the new CSEQ values and sending the messages.

(cherry picked from commit 182ceb0ca6126251ca9e872844925c3bf60b1c90)

* Rebuild documentation

* [registrar] Proper init of delete_nh_he variable

The logic may end up to free_hostent(delete_nh_he) without actually using the delete_nh_he, leading to freeing so random pointers.

Reported by coverity CID 199942

(cherry picked from commit 3b20c9db07ea88bd84609676f2657011588b8b41)

* [core] proper checking on return code for getsockname()

Reported by coverity CID 200064

(cherry picked from commit c489f462d236fddb883e64a6cde7b04929a76191)

* [pua_dialoginfo] Fix bad test on resolving flag name

get_flag_id_by_name() may also return other negative codes than -1

Reported by coverity CID 207914

(cherry picked from commit f3530c10adc1d2a5b01340a9823f84db61c98e72)

* [drouting] removed useless code

Makes no sense to select the table at db_con init time, as various queries (from various modules) may set and use different tables.

Related to coverity report CID 58405

(cherry picked from commit ab7d4430941263d20acd1f45bbbbf09c80a5040f)

* [nathelper] Fix bad test on resolving flag name

get_flag_id_by_name() may also return other negative codes than -1

Reported by coverity CID 200026, 200005, 199906

(cherry picked from commit 249b80741ddbb0952c7f737af2a7c2473f05eacf)

* [nathelper] Proper testing upon extracting hash value

Be sure it is valid hexa and it it within the correct range. The lack of these tests may open the gate for some nasty exploits.

Reported by coverity CID 199958

(cherry picked from commit b1d796344aa674e25fc4408bda5d336bce2bc605)

* [load_balancer] Spelling over log messages

(cherry picked from commit 3b6d0de16d68815b571cebca6bf4c9124f59f06b)

* evi: annotate evi_param_set_int to fix Coverity

Fixes several Coverity alarms

* db_text: prevent buffer overflow

Fixes Coverity CID #200093

* qos: fix getting SDP

* parse_to: add fallthrough comments/indications for coverity CID #200094

* tm: cleanup replicated cancel msg

Fixes Coverity CID #200031

* db_text: fix no-op code

fixes Coverity CID #200004

* fix coverity detected false positives

CID #40557, #211391

* proto_ws: make sure we have destination to connect to

Fixes Coverity CID #58340

* proto_hep: remove useless NULL check

Fixes Coverity CID #163991

* proto_wss: add fall through indication

Fixes Coverity CID #58373

* proto_tls: make sure we have destination to connect to

Fixes Coverity CID #58361

* db_text: complete coverity CID #200093

* db: merge db_is_neq_type() in db_ut

Fix coverity CID #40717, #40718, #40719, #40720, #40721, #40722

* pua_xmpp: release memory in case of error

Fixes Coverity CID #199967

* rtpengine: fix offer vs answer detection for replies

When late negociation is not used, rtpengine_manage() should send an
answer command, not an offer one.

(cherry picked from commit d0641318ff238b01506e18bd4aa7422fb3cfeab7)

* rtpengine: fix late negociation detection of rtpengine_manage()

When a reply is received, do not look into the request if it has body,
since that could lead to headers parsing in pkg - that might leak.
Thanks go to @bogdan-iancu for spotting the issue

(cherry picked from commit 06f7297065f079ebd4201055079a02e69dde2cc5)

* tm: use msg_flags to indicate message is replicated

(cherry picked from commit 804c9a0492940fe779e3dd1d4b4fe9f8cba57cf4)

* event_flatstore: prevent NULL deref in case of bogus param

Fixes Coverity CID #207918

(cherry picked from commit 596c572da62169d453fface97d13ec557d200561)

* tls_mgm: set the TLS session id context with the proper length

Fixes Coverity CID #207904

(cherry picked from commit 4a18ce89c89b4f9831ff748bfa75e2aa5db1b303)

* ut.h: Fix str_casematch() implementation

Since str's have a known length, we should not stop the iteration at
'\0'.

(cherry picked from commit 7d0df27a40d9bd0207637ec08e5deb662c45bbf6)
(cherry picked from commit 4b29d8dc30065c2d92f686e870f9bfc771dd4954)

* Rebuild documentation

* userblacklist:  Be compatible with db_cachedb over MongoDB

Issue reported by Ryan Embgrets

(cherry picked from commit f9fd982beec461cf15bd3001df744066e42af625)

* fraud_detection: Add the possibility to disable stats

A (0) value for any of the fraud detection thresholds now means: "ignore
this threshold", instead of: "raise an event on each call for this
threshold?!".  Although (0) values could, in theory, have been useful
for event generation testing purposes before this patch, this is a small
price to pay in order to gain the ability to disable a stat.  Developers
can now add 1 more CPS and still keep testing with a new (1) value :)

Although the DB schema changes slightly, due to the added "DEFAULT 0",
it is fully backwards-compatible with the previous one, except
fraud_detection row INSERTs will now be more quiet, generating no more
warnings from the backend (e.g. "Column does not have a default
value..." -> but still defaults to 0, thanks to non-strict mode, leading
to strange warnings from the module).

Fixes #890

* Supply a processing context to error route.

When the error is triggered outside the context of another route, be sure it also have a processing context

Closes #2187

* Rebuild documentation

* rtpproxy: ignore unknown payload type media streams

Thanks go to @wangduanduan for reporting and providing troubleshoot info

* presence: reset delete prepare statement when delete not performed

* fraud_detection: Improve accuracy for 'concurrent calls'

Avoid using the DLGCB_FAILED callback, since it has no protection
against the "408 Request Timeout / 200 OK" race condition, and simply
use DLGCB_DESTROY in order to decrement the concurrent calls counter.

This fixes a bug where a "408 Timeout / 200 OK" call would cause a
concurrent calls value of 4294967295, due to an extra decrement.

Fixes #2079

(cherry picked from commit 2404b96c87982387071cea63f5fc42d9169d8798)

* fraud_detection: Fix misleading 'show_fraud_stats' docs

    * the 'fraud_profile' MI command parameter is both invalid and not
      at all required by the current code
    * document the "stale statistics" limitation of this function

(cherry picked from commit 4dba112130651ca6013c7ebc931fc744b3776774)

* drouting: Do not start with NULL DB connections

Force all DB connection objects to be created right from the start,
otherwise strange, hard to diagnose errors may _occasionally_ pop up
during runtime:

 ERROR:core:db_use_table: invalid parameter value (nil), 0x7ff31329a108
 ERROR:drouting:dr_state_flusher: cannot select table "dr_gateways"

    ... or:

 INFO:drouting:dr_reload_cmd: dr_reload MI command received!
 CRITICAL:core:db_table_version: invalid parameter value
 ERROR:core:db_check_table_version: querying version for table dr_gateways
 CRITICAL:drouting:dr_reload_data_head: failed to load routing info
 CRITICAL:drouting:dr_reload_cmd: Failed to load data head

* tls_mgm: fix crashes with db_text and null string values

Closes #2192

(cherry picked from commit 3d26747cc309d7f5c461e3be49fb30db78bc3591)

* Rebuild documentation

* fraud_detection: Fix a bug causing negative CC/CPM

Following a reset of the statistics for a (user, dialed_number) pair,
skip any pending subtraction operations scheduled during a previous
interval (e.g. a dialog which started during fraud detection interval A
and ended in interval B should not decrement CC anymore if it has
already been reset to 0 in the meantime during a more recent
check_fraud() call).

Many thanks to Inderjeet Sharma (@inderjeetsharma) for reporting and
helping track down this bug.

Fixes #2079

(cherry picked from commit 4f414366ee0fcc52fa2d18c59be0c5d753047934)

* rtpengine: handle PRACK for answer

Thanks go to @nikbyte for the report. Close #2208

(cherry picked from commit 1f9b433fd68889c3e692e249d54ea8dbb2d576d3)

* hep_udp listener: Fix "use_children" regression

Commit 899f65c8f4 introduced a regression where "use_children" would be
ignored for the HEP UDP protocol, possibly leading to extra forked
workers after an OpenSIPS upgrade, failed restarts due to exceeded MySQL
max connections, etc.

* Rework previous commit

Although the protocols listed in the @proto array "seem" to be
initialized, they are still missing some essential data, which is only
later provided by trans_load(), during the startup phase.

* tls_mgm: Avoid creating a shared DB conn at mod_init()

Many thanks to Adrien Martin (@adrien-martin) for the immense help in
reporting, troubleshooting and helping close this one out.

Fixes #2161

(cherry picked from commit 2c008cd9225dc9a60df2fbf2e37d032ce6f27db0)

* Rebuild documentation

* dialog: parse contact in separate structure

This avoids concurrent access to the contact header's parsed structure
in request messages saved in shm - addresses ticket #2095

* rtpengine: fix order of specified interface flags

When both 'external' and 'internal' flags were specified, 'internal' was
always assigned as iniface, and 'external' as outiface. This commit
makes sure that the order they are specified in is respected.

(cherry picked from commit 68f8f9d5c7f43af0a6fb914883ad700a93048835)

* rtpengine: Fix docs for `extra_id_pv`

Suggested by Nick Altmann

(cherry picked from commit 968063d9f3d652e94c2895d9dee46b7843c875b5)

* Fix command in documentation for clusterer_list_cap

(cherry picked from commit 7a0f5991de5387b2496b1229f949353b568ad10c)

* Rebuild documentation

* [permissions] Fixed mem leak on pattern field

Closes #2197

(cherry picked from commit d6f3424d1fe71de94a545a36d0e14fbd9ecd3b3f)

* Fix next_hop crash (seen in nathelper nh_timer) due to reading reused mem

Problem:

- get_domain_db_ucontacts (through get_domain_ucontacts) was handing out
  (next_hop) pointers to memory that was unused. This resulted in a
  crash when this memory was reused before the invalid pointer was
  reused.

Relevant issues and commits:

- #1652 [OpenSIPS crashes since of child that serves rtpproxy]
- e162f5f10 [fix 1652: usrloc: make next_hop point within the shared buffer]
- #1710 [nathelper next_hop off by one for usrloc path]
- 0300eb1d5 [fix 1710 / revert 1652: usrloc: fix next hop compute for ...]

That is: e162f5f10 fixes this exact problem in get_domain_db_ucontacts, in
get_domain_mem_ucontacts and in get_domain_cdb_ucontacts (cdb_pack_ping_data).

But in 0300eb1d5 it is reverted for only get_domain_db_ucontacts.

This fix:

- Rewrites the fix for get_domain_db_ucontacts and
  get_domain_cdb_ucontacts, making it less fragile/bug-prone.
- Adds comments about fragility to get_domain_mem_ucontacts
- Fixed unaligned memcpy that might affect non-intel CPUs:
  `((struct proxy_l *)cp)->name.s = next_hop_host`

Bug reported and fix tested by Jasper Hafkenscheid @hafkensite (VoIPGRID).

(cherry picked from commit 5a6b3abe41a2eaed961a530675f6441a692e8640)

* usrloc: Update doxygen comment about packing get_all_ucontacts

(cherry picked from commit fc1c3ec32fa285e90cea457096f557e1088b1031)

* Accept single \r as EOH in multipart

This completes commit 3ec9d1b25430715f2b800c455c8518104a243f49.

(cherry picked from commit f5d1b55aa213e448f2716751227234e073277855)

* load_balancer: Fix AB/BA deadlock in lb_start()

Assuming we have resources A, B, before this patch we would have the
following lock grab order on the corresponding resource locks:
   (see lb_data.c +743):

    lb_start(): A, B
    lb_next(): B, A

This patch fixes this type of deadlock by inverting the resource AVP
elements, such that both functions grab the array locks identically:

    lb_start(): A, B
    lb_next(): A, B

* List new / missing contributors

(cherry picked from commit ef7a6801c483ece3ea8ab07275fcec7aa051a3f8)

* rtpengine: fix single-bit field type

(cherry picked from commit 9c4f6c1c7f7ac88e19ff78b5169c33b17b1d8d38)

* Rebuild documentation

* nathelper: Do not crash on bad ping reply

Credits to Alexey Vasilyev for the report!
Fixes #2242

* dialog: fix/reformat some logs

(cherry picked from commit 2efaeee24c69841e9776e360960bc8d644081ff9)

* cJSON lib: fix double free in case of string parsing error

Fixes #1977

(cherry picked from commit a53f3bd689a01c495809214823d07bfc3cab512a)

* Rebuild documentation

* [core] fix building path hdr for IPv6

IPv6 addresses in path received parameter are not enclosed in brackes

Credits go to @hafkensite / Jasper Hafkenscheid
Closes #2247

* [core] fix limited support for transport vals in PATH

The transport of the received parameter was limited to TPC/TLS/SCTP (no WS or WSS)

* [core] missing file from e1347ebe4

* [core] fix missing check on DNS threshold

For the DNS A record lookups the DNS threshold was not properly checked and reported.
Reported by @bcnewlin / Ben Newlin
Many thanks to Ben for helpin to troubleshoot this issue.

* [core] convert space identation to tabs

* [core] fixed bogus paste from 2736d6

* sqlite - fix upsert in db_insert_update

(cherry picked from commit 6f71f3f065fa78a65ee021931207aded5903c909)

* Rebuild documentation

* o Don't assume bash lives in /bin;

o fix arguments ordering while calling head to work correctly
on non-GNU systems.

(cherry picked from commit b0612c10fe610a76499e9d8170ad392f63d4396f)

* sipcapture: fix overlapping function name

the build_dummy_msg() function being used without the static identifier
causes confusion for some compilers. An example is the AL2 compiler,
which instead of using the module's implementation, it uses the core's
one, hence resulting in broken behavior.

Credit goes to @ghmj2417 (GitHub) for reporting this in ticket #2255
Close #2255

(cherry picked from commit 7ba72d434aee3db949c351a46970122f41a7c50a)

* Rebuild documentation

* usrloc is_contact_registered(): Add useful debug logs

(cherry picked from commit 64c3724faa3b90918bda4c95ee716d487a3f0f1f)

* emergency: fix memory leak on error

(cherry picked from commit 18f9f7873330b7e5d7c1e31828062d80a6268402)

* emergency: Remove bogus curl_global_cleanup() at runtime

Suggested by Walter Doekes (@wdoekes)

(cherry picked from commit f91c13af1fa9ee55071bd99b249902360c8def65)

* Extend STR_NULL and str_init() to set a type explicitly. This
allows those macros to be used elsewhere in the function body,
not only in variable declaration(s), i.e.:

void foobar(int baz)
{
    str localstr;

    if (random() == 42) {
       localstr = str_init(I_STR LUCKY_STR FIN_STR);
    } else {
        localstr = str_init(I_STR UNLUCKY_STR FIN_STR);
    }
    some_api(&localstr);
}

(cherry picked from commit 36e0281b7309e2be39e478c96575cadf9a35260b)
(cherry picked from commit 62b1b5320fc4e9b7e9d71ae8f069e7ea839f0b60)

* Improve auth module examples:

o Use $var(xyz) consistently. $avp's take shared resources and
  aren't really justified in this use case;

o fix www_challenge / proxy_challenge second argument to match
  reality.

(cherry picked from commit d9574358d1670391c2b1cbe1d7ccbe37eb4defca)
(cherry picked from commit 008b5154a6007ad6c223fc4dad38bdaf8e0a2077)

* auth: Fix doc regression in d9574358d :)

Although $var() variables are awesome, some module params (e.g. the
"rpid_avp") really mandate an AVP and won't settle on anything else.

(cherry picked from commit 8ce9eb22e1ead5b0afeff367b01d0b510650cdc5)

* Rebuild documentation

* packaging: fix typo in emergency module description

Thank you @SB-JohnK for reporting it! Close #2263

(cherry picked from commit 25e55f10c7d690230e3e1060958d8b5f0621b540)

* Fix Memory leak in Freeswitch ESL module.

(cherry picked from commit 294e8c489c22fce345450e0441dfbc5a79205d94)

* Rebuild documentation

* event_rabbitmq: cleanup connection after creating the socket

Reported by @ycxwoo in ticket #2230

(cherry picked from commit 87e17276a79cc01ddef9334f0bf21256a6c21606)

* topo_hiding: always allocate space for DID in user

When contact could not be parsed, the new Contact's prefix was not
allocating bytes in buffer for storing for the dialog ID in the user
part. However, the DID was appended to the prefix without any error
checks, resulting in a memory corruption. This commit always allocates
spaces for storing the Dialog ID, even if the contact could not be
parsed.
This fixes all uses of the topology_hiding() function with the `D` flag.

Thanks a lot Sergei Lavrov (@ccppprogrammer) for reporting this in
ticket #2285 and for offering information for figure it out. Close #2285

(cherry picked from commit 2ea8f86813bfbf87b4b0ab61c9f3e852ce777f4c)

* Rebuild documentation

* registrars: Rebuild lib/reg on each build

This fixes an issue where registrar would log lines resembling:
    INFO:mid_registrar:...

due to the lib/reg/*.o files having been built beforehand with the
mid_registrar source code and simply linked into the .so afterwards.

This issue is fixed in 3.2+ using a different mechanism (separate build
of the lib/reg/reg.a library).

Reported by Vlad Paiu

(cherry picked from commit 74a180d2a88546b0ae14a6a91c669ddb63b29f61)

* acc: Remove redundant code

(cherry picked from commit fcbe480f2eeb9b0933bb5a3fb1d15379fb79c493)

* Code quality: Use str_match() instead of !str_strcmp()

Reasons behind this change:
    - better performance: str_strcmp() includes lots of redundant checks
      and performs worse than the simplistic str_match()

    - safer code: although by making this change we increase the chance
      of a segfault (if either "a" or "b" strings are NULL), the crash
      will immediately indicate the coding bug and will be easy to
      troubleshoot.  By the same argument, nobody complains that
      memset() or strcmp() include zero NULL checks: they don't have to!

    - more meaningful code: logically speaking, in most cases, the
      programmer simply intends to _match_ two strings, rather than to
      order them lexicographically...

    (cherry picked from commit 765a84b293db8ec3da960c0ae61f725f3519bfd9)
    (cherry picked from commit d004a6c5e35b7dac783b52dd95fa11d6ad2d13c2)

* permissions: Integrate with 'db_default_url'

    * DB partitions now inherit their DB URL (if not given explicitly)
        from 'db_url' -> 'db_default_url'
    * the above is similar for the 'table_name' property
       (inherit from the 'address_table' modparam)
    * rewrite the 'partitions' parser: no more coding style limitations
    * backwards-compatible: the 'db_url' is still mandatory if you are
      not specifying a partition definition!  This is due to the fact
      that the module has a non-DB based usage mode

Part of a series of patches for #2117

(cherry picked from commit bb57cc1c375b9325138154eb415cebc2fa971ec0)
(cherry picked from commit e663ee6a905f64c1d26954fa896bbc1461001163)
(cherry picked from commit 8cb6048949cc0440d87b205bbaed88fa16d1fc93)

* dialplan: Integrate with 'db_default_url'

    * DB partitions now inherit their DB URL (if not given explicitly)
        from 'db_url' -> 'db_default_url'
    * the above is similar for the 'table_name' property
       (inherit from the 'address_table' modparam)
    * rewrite the 'partitions' parser: no more coding style limitations
    * backwards-compatible: at least one partition must be specified
      (even through simply defining a global 'db_default_url') before
      the module can start

Part of a series of patches for #2117

(cherry picked from commit 2e8164368b31da9292f08623eac6934ef2057e22)
(cherry picked from commit f061ad31d39538de59da05e8a98018cc42c4c336)
(cherry picked from commit 000df2ba61e91142ce3c8f09562e5867519a94b5)

* dialplan docs: Update partition, db_url and table_name

(cherry picked from commit 6443ad5829c15f8ae7dad8bb9f0e4f418702b1ad)
(cherry picked from commit 5185859a5776b6cb6ed49938eeba80c2017c5087)
(cherry picked from commit dbd3e9eace04e0b54a16bfff4b9c78d9a5c51732)

* permissions docs: Update partition, db_url and address_table

(cherry picked from commit 180674b833ef9e82cb3f97105add2ada6843ddc1)
(cherry picked from commit 67c06628eae20cca4fc3e5509435b775a425d17e)
(cherry picked from commit 8ad9847f0b010b5344c1d9fdf4b4f65f347efae8)

* dispatcher: Fix a bug with the 'default' partition

This commit allows the "default" partition to be populated just like the
other ones:

modparam("dispatcher", "partition",
    "default:
            db_url = mysql://opensips:opensipsrw@127.0.0.1/opensips_3_2;
            table_name = v1_dispatcher")

(cherry picked from commit cc2f4a8c49402f99502a58940d2fbb51daed6e47)
(cherry picked from commit 419f0119400229ac54df2b8ce1943d88c31c4b5a)
(cherry picked from commit 1f14390a4b7147a3ba7434b5dc19c6aa98184a44)

* dispatcher: Add useful debug log

(cherry picked from commit 198687e17d13a06b0868676252292643c6d549c4)
(cherry picked from commit 654f1e9ba552ab96e1b3b31b3199f5464b41bd9d)
(cherry picked from commit 5c154e2cec728014993dd980309225b47e6af514)

* dispatcher: Integrate with 'db_default_url'

    * both 'db_url' and a partition DB URL will inherit their starting
      value from the 'db_default_url' global
    * add the option to point the default partition to a known one, in
      order to avoid unwanted loading of the default table.  Example:

	modparam("dispatcher", "partition", "default: trunks")
	modparam("dispatcher", "partition", "trunks:  db_url = ...")

(cherry picked from commit c7d668bad5b2b41a9c9065e1539914f0629ccf64)
(cherry picked from commit ba7a4e33a6e2141aa7300f4e5db268c39f5e9520)
(cherry picked from commit c0b2a0ae4117e395e1c4724b7b91270d83a0d269)

* dispatcher docs: Update partition, db_url and table_name

(cherry picked from commit 89a618be1f005d1bc80d528bc41de73fff24533a)
(cherry picked from commit dd0245e98087e07fb2182ae4e21f06ce034c9b4b)
(cherry picked from commit 02df9b8fdb2bdb2521481b9d1b466466689c38e1)

* dialplan: Allow re-pointing the default partition

This is useful in situations where script writers do not intend to cache
the default 'dialplan' table, rather only the named partitions.

Example syntax:
    db_default_url = "mysql://opensips:opensipsrw@127.0.0.1/opensips"
    ...
    modparam("dialplan", "partition", "
	pstn:
	    table_name = dialplan_pstn")
    modparam("dialplan", "partition", "default: pstn")

(cherry picked from commit efee3438773a83c1a05edc8b05ba45f53a83c5a6)
(cherry picked from commit 4abd2617f970b2505534ee8dfd68fdd11ddcc716)
(cherry picked from commit 03bf0ae1b0ce1931377ae131235413bbc5f01df1)

* dialplan: Fix startup with a single, non-default partition

* dispatcher: prevent possibl uninitialized access

(cherry picked from commit 58804282fe53b25a2d18f8bed35b8f6fc8f8a1a8)

* b2b entities/logic: Inherit `db_default_url` if possible

If there is still no DB URL, simply start without DB support (just as
before)

(cherry picked from commit ff7b5849751d11537d989afe2b17ef35ebd2a133)
(cherry picked from commit a54424be0c0ff2cd42adf6a0d748ac2a866dc319)
(cherry picked from commit 0684e98376a841b6b7351b23e23de1d9a804ba41)

* SIP digest auth: Improve handling for multiple digest challenges

Before this patch, OpenSIPS would always work with the digest
credentials of the 1st WWW/Proxy-Authenticate header field.

While RFC 3261 does not define the behavior with multiple
WWW/Proxy-Authenticate headers, in § 22.3 it is stated that:

   Note that if an authentication scheme that does not support realms is
   used in the Proxy-Authorization header field, a proxy server MUST
   attempt to parse all Proxy-Authorization header field values to
   determine whether one of them has what the proxy server considers to
   be valid credentials.

... so a proxy must _walk_ through unacceptable headers until it finds
one with "valid" credentials.  In the context of the upcoming RFC 8760,
this would also mean: finding an auth header field with an MD5 digest
algorithm, which is the only one currently supported.

TL;DR: this patch improves the "uac", "uac_registrant" and
"b2b_entities" modules so they correctly process 3 WWW-Authenticate
headers with the following algorithm preference:
    algorithm=SHA-512-256
    algorithm=SHA-256
    algorithm=MD5

... and correctly build an MD5-based response for the 3rd header field.

Issue discovered during OpenSIPIt 2020

(cherry picked from commit 27d5862fb834e2c71f2fefb89a27430e30814647)

SIP digest auth: Complete 27d5862fb8

The b2b_entities was not patched well, as some weird switch
fallthrough-on-error behavior was introduced.

(cherry picked from commit 5942d53f7aeec8eadcafd394fd1ee51ede2e5ab1)
(cherry picked from commit 3274fc2b076bc8d2395ef1b68244a43c66463106)

* Merge pull request #2253 from sippy/master_2020

Fix a rather obscure copy-n-paste bug in the digest parser

(cherry picked from commit baddd32e2fd6cb43dd8b55f8a9f494fbd506d403)
(cherry picked from commit 74558d8957f57b73427d50dd7b7908f989537a12)

* Use body->len as the only authoritative way to check if the provided body is empty or not.

(cherry picked from commit b0997b1fb56d6f518a26b9fe32e45cb16b9a0284)
(cherry picked from commit bd5a1d027b238892d798e5c3fc0d0633387192fe)

* dialog: update contact & SDPs for replicated update

* Rebuild documentation

* b2b_entities: fix restoring of DB persisted server entities

Server entities restored from DB would not use the original To tag due to
regenerating the timestamp part of the entity key.

(cherry picked from commit dd237c001791be0b3b071d07b36057221f6ed3f6)

* mid_registrar_save(): Fix shm memleak with no t_relay()

Since mid_registrar_save() is stateful by definition, it needs the SIP
transaction to properly function, so always forcing its creation if the
REGISTER is about to be forwarded is not that significant of a change,
if any at all.

Thus, this patch fixes a SHM memleak on the following type of logic:

    mid_registrar_save();

    # script terminates without having created the transaction,
    # so mid_registrar's "prepped" data does not get freed on T deletion
    exit;

(cherry picked from commit b14a7eaff60b719c8b80af4f8bfdffe2bc7e995b)
(cherry picked from commit e2141a3b66b4ffe2e01fb9ed27412c8532b09fb8)

* tm: Fix transaction leakage with global onreply_route

Given that the opensips.cfg execution order for replies is:
    1) onreply_route (global)
    2) reply_received(), i.e. "tm" module scope
      2.1) onreply_route (branch)
      2.2) onreply_route (transaction)

... this patch fixes some transaction leaks where if the script developer
matches the current transaction within global onreply_route, e.g. with a
random statement such as xlog("$T_reply_code\n"), it would cause an
extra ref after a transaction matching operation would be performed yet
again (??) within reply_received(), as the "tm" scope begins executing.

The fix is to simply avoid transaction lookups when evaluating tm
variables within the non-transactional, global SIP reply route.

Credits to Bogdan-Andrei Iancu for suggesting this solution

(cherry picked from commit 3a25c0b28f9bffc5e6942a7a3f5484ef903e28f6)

* Rebuild documentation

* Bump version to 2.4.9

* Update ChangeLog for 2.4.9

* APT repository: ubuntu 20

* Update Changelog

* Work around for an issue on buster that the database query (pa_dbf.update) with a string NULL value won't match with any record.

* Set up a special condition when 'sharing_tag' is always set to NULL in the active_watcher table.

* Hotfix/openssl double free fix (#75)

* proto_tls/wss: fix crashes when dumping the openssl error stack

This commit serializes the execution of openssl's connect/accept/read/write
operations in order to prevent adding/removing entries concurrently to/from
the openssl error stack. If the performance penalty is deemed too high, the
NO_SSL_GLOBAL_LOCK compilation flag can be used to disable this behavior and
retain the risk of crashes.

Reported in #2362
Credits to Alexey Vasilyev for helping troubleshoot this.

(cherry picked from commit b6b7520024561c3cc65d272e038992adc0825fc7)
(cherry picked from commit 00090791c7101d5c7e4499df16d54eba91c7cfda)

* proto_tls/wss: complete fix in commit b6b7520

(cherry picked from commit dac973404b43b148f91d7fae66417d864f3c047d)
(cherry picked from commit 5897b2294fb9cba16fb6c9590ccee5a3c4c1351c)

Co-authored-by: Vlad Patrascu <vladp@opensips.org>

* presence: fix memory leak when receiving an unmatched replicated publish (#76)

(cherry picked from commit d1d898878555abe48cd3584ab1f3a8fe6f84ced7)

Co-authored-by: Razvan Crainea <razvan@opensips.org>

* transfromation/regex: remove unnecessary regex limitation (#77)

Thanks go to Calvin Ellison for reporting this in the mailing list
Completes 6e1afe64

(cherry picked from commit a8f8aabc245ca86bd40a9eddf9befcd74a087be1)

Co-authored-by: Razvan Crainea <razvan@opensips.org>

* fix record route leak in rls (#78)

* Fix memory leak in sqlite module. (#72)

* initial changes for sqlite memory leak

* fix indentation and single pointer

* using generic free function but freeing values separately

* fix comment to address  warning

* Revert "fix comment to address  warning"

This reverts commit ed53f9ae8b09c30e9de203190023a61c8bc13408.

* Revert "using generic free function but freeing values separately"

This reverts commit 399c111f4186adab33cedd20ad47f197978b60c6.

* Revert "fix indentation and single pointer"

This reverts commit 55c6c56a96b3631a6a34485ecad368b47044e747.

* Revert "initial changes for sqlite memory leak"

This reverts commit f161ac86301fb65306542faa2554325685275525.

* fix sqlite memory leak for row values

* db_sqlite: drop pkg_free() NULL check

Thanks go to Liviu Chircu for better readability suggestions

Co-authored-by: Razvan Crainea <razvan@opensips.org>

Co-authored-by: Zero King <l2dy@icloud.com>
Co-authored-by: Liviu Chircu <liviu@opensips.org>
Co-authored-by: Razvan Crainea <razvan@opensips.org>
Co-authored-by: OpenSIPS <github@opensips.org>
Co-authored-by: Bogdan-Andrei Iancu <bogdan@opensips.org>
Co-authored-by: Vlad Patrascu <vladp@opensips.org>
Co-authored-by: Peter Lemenkov <lemenkov@gmail.com>
Co-authored-by: Walter Doekes <walter+github@wjd.nu>
Co-authored-by: Nick Altmann <nick.altmann@gmail.com>
Co-authored-by: Aron Podrigal <aronp@guaranteedplus.com>
Co-authored-by: Ovidiu Sas <osas@voipembedded.com>
Co-authored-by: Fabian Gast <fabian.gast@nfon.com>
Co-authored-by: Eric Green <eric.green@onsip.com>
Co-authored-by: Maksym Sobolyev <sobomax@sippysoft.com>
Co-authored-by: rance <rance@ytel.com>
Co-authored-by: aniketkuwarvonage <86266545+aniketkuwarvonage@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants