Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pua_dialoginfo: failed calls triggers sending PUBLISH errors #142

Closed
ovidiusas opened this issue Nov 27, 2013 · 13 comments
Closed

pua_dialoginfo: failed calls triggers sending PUBLISH errors #142

ovidiusas opened this issue Nov 27, 2013 · 13 comments
Assignees
Labels
Milestone

Comments

@ovidiusas
Copy link
Member

If a call fails without receiving a provisional reply first, it will trigger an error:
ERROR:pua_dialoginfo:dialog_publish: sending publish failed for pres_uri [SIP URI] to server [SERVER]

The issue here is that the call is in trying state and then is terminated.
If we set:
modparam("pua_dialoginfo", "publish_on_trying", 1)
the error doesn't occur because an initial PUBLISH is sent for trying.

If the "publish_on_trying" parameter is not set (the default value being 0), then the first PUBLISH is sent when the dialog is terminated. The PUBLISH has an expiration set to 0 and since there was no previous states published before, pua does not send any PUBLISH, hence the error.

The relevant pua code (pua/send_publish.c send_publish_int):
if(presentity== NULL)
{
if(publ->expires== 0)
{
LM_DBG("request for a publish with expires 0 and"
" no record found\n");
goto error;
}

The fix should be in pua_dialoginfo, there should be no PUBLISH sent out when:

  • the call is transitioning from early to terminated;
  • the publish_on_trying parameter is set to 0.

To test this, a call should be made to a SIP device that will reject the call without sending any 18x provisional replies first.

Regards,
Ovidiu Sas

@bogdan-iancu
Copy link
Member

Hi Ovidiu,

It may sounds as a stupid question, but shouldn't we send a PUBLISH (with trying) all th time when a new call is started ? (like the publish_on_trying is set to 1 all the time)
Otherwise, if you have a call ending with timeout, no PUBLISH at all will be fired.

Regards,
Bogdan

@ghost ghost assigned bogdan-iancu Dec 6, 2013
@ovidiusas
Copy link
Member Author

Hello Bogdan,

I circumvented this issue by setting publish_on_trying to 1.
According to the RFC, trying is a valid state and yes, we should send a PUBLISH at the beginning of each call with state trying.

The questions/issues here are:

  1. Why do we have this parameter if trying is a valid state? Are there phones that are having issues handling 'trying'? Maybe we should get rid of this parameter.
  2. If we really need to keep this parameter, then the default value should be 1.
  3. If we keep this parameter, then we need to fix the error and avoid sending a PUBLISH.

Regards,
Ovidiu Sas

@bogdan-iancu
Copy link
Member

HI Ovidiu,

I will talk to Damien (@dsandras) as it seems that param was introduced by him (see commit f7dae1d) .

Regards,
Bogdan

@dsandras
Copy link
Contributor

Hi Bogdan,

The main reason why I added that setting in the first place was to workaround the SIP PUBLISH problem in OpenSIPS. We wanted to minimize the number of PUBLISH being sent.

For example, when you call a device that is BUSY, you will be notified through a PUBLISH that there was a call attempt, then that the call was finished, instantly. With the unordered PUBLISH problems, the BLF was most of the time stuck in a wrong state.

In general, even if that specific problem has been fixed, I think that it is preferable to minimize state changes on servers with many users, and many BLFs, It is probably best to be notified as soon as the call is ringing state, but not before to prevent such rapid state changes. I would keep the option, but I would apply Ovidiu's suggesting : when the publish_on_trying parameter is disabled, we should not try publishing the terminated state.

Moreover, the RFC 4235 states:

3.10. Rate of Notifications

For reasons of congestion control, it is important that the rate of
notifications not be excessive. It is RECOMMENDED that the server
not generate notifications for a single subscriber faster than once
every 1 second.

So I think that "trying" -> "terminated" instant transitions should be prevented, and that is what the publish_on-trying option is trying to do.

@ovidiusas
Copy link
Member Author

Hello Damien,

Please update the documentation with what you described here.
I suggest to set the default value of the parameter to '1' and let the admin to choose not to signal/publish the 'trying' state.
Can you also take a look and fix the issue that was introduced by this new parameter: the bogus error probe?

Thanks,
Ovidiu

@dsandras
Copy link
Contributor

Hi Ovidiu,

I can change the default value, and adapt the documentation.

Please note that the trying state corresponds to dialoginfo_set being called in the dialplan. Not to the remote peer answering with a 100 Trying.

However, fixing the bogus error probe seems difficult. I can't see a way in pua_dialoginfo to prevent publishing the "terminated" state if there was no previous publication (ie if we did not reach the "trying" state when publish_on_trying is set to 0). There is actually no way to determine the state before the callback is triggered.

Do you, or Bogdan, have an implementation suggestion ?

@bogdan-iancu
Copy link
Member

Hi,

Damien, two aspects:

  1. going back to Ovidiu's question: I understand what the RFC says, but it is wise from user perspective "no to publish" at all while he is in a call with trying state ?
  2. as a fix when "publish_on_trying" is not used, we can either look at the "old state" when TERMINATING (to see from which state with switched to "terminated") or we can use a dlg val as marker to remember if a prev publish was sent or not - I can look more into this.

Regards,
Bogdan

@dsandras
Copy link
Contributor

Hi Bogdan,

  1. I think it is wise. Actually, the "early" state of the dialog is notified (180 Ringing, or even, I think 100 Trying). The only state that is not notified is the "trying" state. This trying state does not correspond to a specific SIP Message or dialog state, but to the fact that pua_dialoginfo_set has been called from the routing script. I think it is preferable in most cases to notify only when there is some kind of SIP message transiting over the network.

  2. OK

@ovidiusas
Copy link
Member Author

@dsandras
The trying state correspond to the INVITE being sent out and therefor there is a message transiting the network.

@bogdan-iancu
I was looking into a solution for this and we should aim for a generic one. My idea was to use a flag to mark when a PUBLISH was sent and to send a new PUBLISH out with a '0' expiry only if a previous PUBLISH was sent. This doesn't need to keep track of the state and it is more aligned with presence (I would like to avoid specific dialog details in the PUBLISH sending decision logic, for this particular '0' expiry case).

Regards,
Ovidiu Sas

@dsandras
Copy link
Contributor

@ovidiusas : Yes, I meant a response from the remote peer transiting over the network.

@ovidiusas
Copy link
Member Author

Just to avoid any confusion, the doc should be updated with a diagram:

For a successful call we have:

UAC       proxy       UAS     presence server
 |--INVITE->|          |            |
 |<-100-----|--INVITE->|            |
 |          |--PUBLISH(trying)----->|
 |          |<-100-----|            |
 |          |          |            |
 |          |<-18x-----|            |
 |<-18x-----|--PUBLISH(early)------>|
 |          |          |            |
 |          |<-200-----|            |
 |<-200-----|--PUBLISH(confirmed)-->|
 |--ACK---->|          |            |
 |          |--ACK---->|            |
 |          |          |            |

If we choose to avoid the "trying" state, then "early" will be triggered on the first 18x provisional reply.
Something like this:

UAC       proxy       UAS     presence server
 |--INVITE->|          |            |
 |<-100-----|--INVITE->|            |
 |          |<-100-----|            |
 |          |          |            |
 |          |<-18x-----|            |
 |<-18x-----|--PUBLISH(early)------>|
 |          |          |            |
 |          |<-200-----|            |
 |<-200-----|--PUBLISH(confirmed)-->|
 |--ACK---->|          |            |
 |          |--ACK---->|            |
 |          |          |            |

The issue is with rejected calls. If we don't have a 18x provisional reply, then we have this:

UAC       proxy       UAS     presence server
 |--INVITE->|          |            |
 |<-100-----|--INVITE->|            |
 |          |--PUBLISH(trying)----->|
 |          |<-100-----|            |
 |          |          |            |
 |          |<-456xx---|            |
 |          |--PUBLISH(terminated)->|
 |<-456xx---|--ACK---->|            |
 |--ACK---->|          |            |

If we don't have a 18x provisional reply and we avoid the "trying" state, then we don't have any PUBLISH at all:

UAC       proxy       UAS     presence server
 |--INVITE->|          |            |
 |<-100-----|--INVITE->|            |
 |          |<-100-----|            |
 |          |          |            |
 |          |<-456xx---|            |
 |<-456xx---|--ACK---->|            |
 |--ACK---->|          |            |

ovidiusas added a commit that referenced this issue Jan 27, 2014
@ovidiusas
Copy link
Member Author

Fixed by commits:

@dsandras
Copy link
Contributor

Thanks for fixing this !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants