5.0.* dialogs expiring early #1182

Closed
Forty-Tw0 opened this issue Jul 7, 2017 · 6 comments

@Forty-Tw0

Forty-Tw0 commented Jul 7, 2017

Description

I have 3 servers using 5.0.1 and 3 using 5.0.2, all use the same config.

No matter what I set the dialog timeout to, the Unix epoch 'timeout' column in the dialog database table changes at random from call to call and lands as much as 4 years in the past relative to the INVITE/ACK. The actual timeout seems to be around 95 seconds.

I have found a relation which explains the timeout value in the database.
timeout(end_ts) + tl->timeout(lifetime) - start_time(start_ts) = actual timeout that occurs
Where tl->timeout is reported by this log line:

DEBUG: dialog [dlg_timer.c:235]: get_expired_dlgs(): start with tl=0x7ff6657da7f0 tl->prev=0x7ff665748968 tl->next=0x7ff665748968 (44923874) at 44923874 and end with end=0x7ff665748968 end->prev=0x7ff6657da7f0 end->next=0x7ff6657da7f0

94 seconds between the INVITE/ACK and dialog timeout in this case

Jul  7 22:02:51 /usr/sbin/kamailio[31028]: INFO: <script>: RELAYING INVITE to 103@deviceIP:5060 (100 -> 103)
Jul  7 22:02:53 /usr/sbin/kamailio[31029]: INFO: <script>: RELAYING ACK to 103@deviceIP:5060 (100 -> 103)
Jul  7 22:04:24 /usr/sbin/kamailio[31045]: WARNING: dialog [dlg_handlers.c:1577]: dlg_ontimeout(): timeout for dlg with CallID '0_1383046486@192.168.1.60' and tags '3811275978' '8f85dcea-3a30-434c-afbb-60d76ac630ac'

In the database for this dialog
start_time 1499464973
timeout 1448295888 # this randomly changes with every call I make, but is always < start_time
state is 3, which means waiting for ACK?

kamctl kamcmd dlg.list

{
        h_entry: 402
        h_id: 8751
        call-id: 0_1383046486@192.168.1.60
        from_uri: sip:100@serverIP:5062
        to_uri: sip:103@serverIP:5062
        state: 3
        start_ts: 1499464973
        init_ts: 1499464971
        timeout: 1499508172
        lifetime: 43200
        dflags: 512
        sflags: 2
        iflags: 0
        caller: {
                tag: 3811275978
                contact: sip:100@deviceIP:5060
                cseq: 2
                route_set:
                socket: udp:serverIP:5062
        }
        callee: {
                tag: 8f85dcea-3a30-434c-afbb-60d76ac630ac
                contact: sip:deviceIP:5060
                cseq: 0
                route_set: <sip:serverIP:5062>
                socket: udp:serverIP:5060
        }
        profiles: {
        }
        variables: {
        }
}
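
As a quick sanity check on the numbers above, the following Python sketch compares the in-memory values from dlg.list with the values from the database row and with the syslog timestamps. All constants are copied from this report; the variable and function names are only illustrative.

```python
from datetime import datetime, timezone

# Values copied from the report above.
DB_START_TIME = 1499464973   # dialog table: start_time
DB_TIMEOUT = 1448295888      # dialog table: timeout (while state is 3)
MEM_START_TS = 1499464973    # dlg.list: start_ts
MEM_TIMEOUT = 1499508172     # dlg.list: timeout
LIFETIME = 43200             # dlg.list: lifetime


def as_utc(epoch):
    """Render a Unix epoch as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat()


# In memory, timeout minus start_ts should be close to the lifetime.
mem_delta = MEM_TIMEOUT - MEM_START_TS

# In the database, the same difference is hugely negative.
db_delta = DB_TIMEOUT - DB_START_TIME

# Time between the relayed ACK and the dlg_ontimeout() warning in syslog.
fmt = "%H:%M:%S"
ack = datetime.strptime("22:02:53", fmt)
expired = datetime.strptime("22:04:24", fmt)
observed = int((expired - ack).total_seconds())

print("in-memory timeout - start_ts:", mem_delta, "(lifetime", LIFETIME, ")")
print("database timeout - start_time:", db_delta)
print("database timeout as date:", as_utc(DB_TIMEOUT))
print("seconds from ACK to dlg_ontimeout():", observed)
```

The in-memory difference is within a second of the configured lifetime, while the database value sits roughly 592 days before start_time, and the dialog expires 91 seconds after the ACK was relayed.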

Troubleshooting

I tried dialog keepalives with ka_timer and ka_interval. I do see the OPTIONS and 200 OK flowing between the two devices and Kamailio, but this does not prevent the dialog from timing out.

I tried timer_procs=1 to use a separate timer process; it did not help.

Reproduction

I am just setting the dialog flag on the initial INVITE and letting Kamailio do its thing with the dialog.

modparam("dialog", "db_mode", 1)
modparam("dialog", "default_timeout", 43200) # this should be the default as per kamailio documentation

Additional Information

  • Kamailio Version:
version: kamailio 5.0.2 (x86_64/linux)
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 14:05:03 Jun 14 2017 with gcc 4.8.5
  • Operating System:
3.10.0-514.21.1.el7.x86_64 #1 SMP Thu May 25 17:04:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
@miconda
Member

miconda commented Jul 10, 2017

Do you use $dlg_ctx(...) or $dlg(...) variables? What other dialog parameters have you set in config?

@Forty-Tw0
Author

Forty-Tw0 commented Jul 10, 2017

I am not setting any dlg variables in my config. I am only using the dialog module to pull dialog information from the SQL database to figure out which contact to send ACKs and BYEs to. The only dialog-related function I use in my config is the one that sets the dlg flag on the initial INVITE.

From what I have seen, the timeouts and timestamps in memory are correct but the ones in the database are very wrong. Kamailio is loading the timeout from the database in its timer handlers, so it is using the incorrect values.

I am now trying to read contacts from $dlg() variables in place of the SQL database, in hope that disabling the database for dialogs will force Kamailio to use the timeout in memory rather than the database.

@Forty-Tw0
Author

Forty-Tw0 commented Jul 10, 2017

I am now using the $dlg variables without any database modparams, and the dlg_ontimeout() still occurs at around 95 seconds.

This is the only modparam I have set for dialogs now:
modparam("dialog", "dlg_flag", 1)
Which I call on an INVITE with dlg_setflag(1);

I can read the value of these variables: $DLG_lifetime $dlg(lifetime) as and 43200

I have tried setting the default_timeout to the default timeout and it changes nothing.
modparam("dialog", "default_timeout", 43200)

I have even rebooted the server entirely to clear out the RAM.

@miconda
Member

miconda commented Jul 11, 2017

Reading again properly, I noticed now that the state is 3, as you questioned.

In this state, the lifetime of the dialog is not yet in effect. If the ACK doesn't come within 60sec of the 200 OK, then the lifetime is set to 10sec. Depending on the timer step, the actual duration can vary a bit, because all these checks are done on a timer.

In a similar situation, if the dialog is not answered in 300sec from initiation, it is destroyed.

The dialog timeout (lifetime) as configured via mod param is used after the ACK is routed.
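
To make the rules above easier to follow, here is a small Python model of them. It is only a sketch of the description in this comment, not Kamailio's actual code. States 3 and 4 are the ones discussed in this thread; the names and the mapping of states 1 and 2 to the not-yet-answered case are assumptions, as is the function name.

```python
# Dialog states. 3 (waiting for ACK) and 4 (ACK seen) come from this thread;
# treating 1 and 2 as "not yet answered" is an assumption for illustration.
UNCONFIRMED, EARLY, CONFIRMED_NA, CONFIRMED = 1, 2, 3, 4

NO_ANSWER_TIMEOUT = 300   # seconds from initiation if the dialog is not answered
NO_ACK_WAIT = 60          # seconds to wait for the ACK after the 200 OK
NO_ACK_LIFETIME = 10      # lifetime applied once the ACK wait has elapsed


def expiry_budget(state, lifetime=43200):
    """Return the nominal number of seconds before the dialog is expired.

    The timer step can add to this, so treat the result as approximate.
    """
    if state in (UNCONFIRMED, EARLY):
        return NO_ANSWER_TIMEOUT
    if state == CONFIRMED_NA:
        # The configured lifetime is ignored until the ACK has been routed.
        return NO_ACK_WAIT + NO_ACK_LIFETIME
    if state == CONFIRMED:
        return lifetime
    raise ValueError(f"unhandled dialog state: {state}")


for state in (EARLY, CONFIRMED_NA, CONFIRMED):
    print(state, expiry_budget(state))
```

In this model a dialog stuck in state 3 never reaches the 43200-second lifetime, no matter what default_timeout is set to.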

Maybe these values should be made mod params, so that people become more aware of them.

Now, back to your need -- why do you need dialog to route ACK and BYE, aren't the Contact and Record-Route headers enough? Or they are messed up by some nodes in the path (e.g., sbc, alg)?

@Forty-Tw0
Author

Forty-Tw0 commented Jul 11, 2017

Using dlg_manage() on my ACK caused the dialog state to be updated to 4, and my dialogs are no longer timing out early!

The timeout column in my SQL database now contains the correct Unix epoch as well. Is it intended to be much less than the start_time when the dialog state is 3? I guess it isn't really used if the ACK has not arrived.


I am not using record-route headers; I think I had some trouble with the PBX preferring them over the contact header which resulted in lost packets. I will look at them again in the next stage of development, now that everything is working well enough as is.

My architecture relies heavily on an identifier in the URIs and contacts. This identifier is used to forward SIP to registered contacts of PBXs behind NAT. It just so happens that for in-dialog SIP like ACK/PRACK/UPDATE/BYE there are no headers which contain this identifier when sent from the user's device, so for those I have to resort to using the contact stored in the dialog variables.

@miconda
Member

miconda commented Jul 12, 2017

If you want to do your own management for routing of requests within dialog, then it is up to you. Check also the htable module, which could be an alternative for storing data based on Call-ID, but dialog should be fine.

I am closing this item, as dialog relies on SIP specs for internal states. I will keep in mind to add those timeouts for no-ACK and early dialogs as mod parameters.

@miconda miconda closed this as completed Jul 12, 2017