How can Srf client know about the loss of TCP/WS connection from SIP client to Drachtio server #138

jyotigroomefy · 2022-12-25T15:46:34Z

We have an application that uses drachtio srf to connect to the drachtio server.
We have situation when the TCP/WS connection of SIP client to Drachtio server is broken but it appears that there is no way drachtio server communicate that connection loss to the drachtio srf client. After this we have dialogs with disconnected sip client alive the in srf client . Now when we send sip message using these dialogs then drachtio server crashes with segmentation fault. If there is way to get destroy events with all the dialogs associated with the disconnected TCP/WS connections, it would be very helpful for application using drachtio srf.

davehorton · 2022-12-25T21:06:39Z

what version of drachtio server are you running? If not the latest, could you try with 0.8.20 and let me know if the server still crashes in this instance? If it does, I would appreciate seeing the drachtio server logs with sofia-loglevel=4

jyotigroomefy · 2022-12-30T09:34:44Z

Thanks for your prompt response Dave.

I have collected the logs. I have attached here.
I am using Drachtio version 0.8.20

filtered4.txt
filtered2.txt

davehorton · 2022-12-30T13:06:15Z

Thanks this log is useful. Reviewing it now. How easy is this for you to recreate?

jyotigroomefy · 2023-01-04T07:59:27Z

filtered5.txt
Here is one more log file for above crashes.

davehorton · 2023-01-04T12:52:45Z

Thanks. Again -- is this easy for you to recreate? If so, I would like to ask your help in testing a fix

davehorton · 2023-01-04T16:24:17Z

Would it be possible for you to test with v0.8.21-rc1 and to forward the logs again when you recreate? I have added some logging that I hope will help me troubleshoot this further

davehorton · 2023-01-04T19:57:32Z

Also, would it possible to get more history leading up to the crash? I would like to be able to see the wss connection from when it is first made, and the logs so far all start after the client has established the wss socket connection. In any case, please just provide as much log history as possible leading up to the seg fault.

jyotigroomefy · 2023-01-06T08:07:34Z

we are testing the latest one. we will share logs if we found seg fault issue.

davehorton · 2023-01-06T12:36:18Z

ok great.

jyotigroomefy · 2023-01-13T09:07:12Z

Hi we have tested latest one v0.8.21-rc1. We have found same crashes. Sharing logs here.

filtered1.txt

davehorton · 2023-01-13T20:15:48Z

I looked over the logs but I need more history. I have to be able to track the entire history of when this specific client connected to drachtio. This log covers the period 2023-01-12 17:03:09 - 2023-01-12 17:05:15. Thats only a few minutes. Have you got the logs just preceeding this time peroid? If so please provide.

In this log, I can see the issue has to do with reference counting, but the reference counts for this connection are already off by one at the start of the log, so the issue I need to investigate and fix happened at an earlier time, The specific client in this case was connecting from wss/192.168.1.125:21268 and I need to go back to the log where he first connected

I really want to fix this bug but I need your help to preserve and provide all of the logs for a much longer period of time,

davehorton · 2023-01-13T20:16:21Z

by the way we can leave the issue open here, but it is more properly a drachtio server bug

davehorton · 2023-01-18T19:39:24Z

any chance you've been able to make this happen again, and get more log history @jyotigroomefy ?

jyotigroomefy · 2023-01-22T15:41:31Z

Yes. i am able to take full log history. Please find in the attachment.

drachtioCompleteFiltered.txt

davehorton · 2023-01-22T18:20:22Z

Excellent thanks, will have a look

davehorton · 2023-01-24T18:13:19Z

Your code is doing one thing wrong which needs to be corrected. You are responding to an incoming REGISTER with a 202 Accepted. This is not correct, a success response to a REGISTER is a 200 OK, not a 202. (A 202 is the success response for a SUBSCRIBE).

2023-01-21T06:46:55.28341336Z stdout F [0] SIP/2.0 202 Accepted
2023-01-21T06:46:55.283418339Z stdout F [0] Via: SIP/2.0/WSS consort.org;branch=z9hG4bK1356925;received=192.168.1.125;rport=22661
2023-01-21T06:46:55.283423296Z stdout F [0] From: <sip:mcx.cons-857.cons@consort.org>;tag=9f2pefdtd3
2023-01-21T06:46:55.283428524Z stdout F [0] To: <sip:mcx.cons-857.cons@consort.org>;tag=ea4vQS96QNZ7D
2023-01-21T06:46:55.2834335Z stdout F [0] Call-ID: rljqtset8cartjdbv5f6r1
2023-01-21T06:46:55.283438734Z stdout F [0] CSeq: 7942 REGISTER
2023-01-21T06:46:55.283453612Z stdout F [0] Contact: <sip:mcx.cons-857.cons@consort.org>;+g.3gpp.icsi-ref="urn%3Aurn-7%3A3gpp-service.ims.icsi.mcptt";+g.3gpp.mcptt
2023-01-21T06:46:55.283460235Z stdout F [0] Expires: 0
2023-01-21T06:46:55.283465531Z stdout F [0] Content-Length: 0
2023-01-21T06:46:55.283470693Z stdout F [0]  
2023-01-21T06:46:55.283475785Z stdout F [0] 2023-01-21 12:16:55.275963 tport.c:2345 tport_set_secondary_timer() tport(0x563333420000): reset timer

Could you please change your code to respond 200 OK when you want to respond with a success result to a REGISTER. Also please upgrade to drachtio-server version v0.8.21-rc3. After making both changes please retest and let me know the results.

v0.8.21-rc3 has a fix that may resolve this issue (but it also requires that you properly respond 200 instead of 202 to successful register requests)

jyotigroomefy · 2023-01-25T04:16:59Z

thank you :) I will change response 200 OK. I will use v0.8.21-rc3 and let you know the results.

jyotigroomefy · 2023-01-27T06:58:45Z

Hi
we have changed response to 200 OK and used latest version in our application. We have tested it. now frequently crashes is not happening but got same crash yesterday. i have attached log file here.

We have gone through your code and noticed one thing in "sip-dialog-controller.cpp" file.
'pData" pointer variable is redeclared same as argument variable. i have attached code snapshot here.

drachtio_log.txt

davehorton · 2023-01-27T12:30:07Z

Thanks I will take a look at your log.

As far as that statement defining a local var named pData, that is not a problem. It is not re-declaring an existing variable it is creating a new variable of the same name so in this scope that variable is what is used when we refer to pData

davehorton · 2023-01-29T20:10:26Z

you are still incorrectly sending 202 responses to REGISTER requests. Could you fix that, and provide a new log if you still experience crashes?

2023-01-25T13:57:19.295093506Z stdout F [0] 2023-01-25 19:27:19.294716 Client::read_handler read: c6d15884-7839-4f24-9883-67c92a303060|sip|73aa8742-80d4-4848-98a7-ce183f85b68f|
2023-01-25T13:57:19.295110268Z stdout F [0] SIP/2.0 202 Accepted
2023-01-25T13:57:19.295119837Z stdout F [0] Call-ID: idc38nsv5vb7096e9v5pv6
2023-01-25T13:57:19.295128731Z stdout F [0] cseq: 904 REGISTER
2023-01-25T13:57:19.295145116Z stdout F [0] from: <sip:mcx.cons-487.cons@consort.org>;tag=i175v71a1f
2023-01-25T13:57:19.295154496Z stdout F [0] to: <sip:mcx.cons-487.cons@consort.org>
2023-01-25T13:57:19.295163473Z stdout F [0] Content-Length: 0
2023-01-25T13:57:19.295172128Z stdout F [0] contact: <sip:mcx.cons-487.cons@consort.org>;+g.3gpp.icsi-ref="urn%3Aurn-7%3A3gpp-service.ims.icsi.mcptt";+g.3gpp.mcptt
2023-01-25T13:57:19.295181279Z stdout F [0] expires: 0
2023-01-25T13:57:19.2951902Z stdout F [0] 
2023-01-25T13:57:19.295198971Z stdout F [0]

davehorton · 2023-02-02T14:29:39Z

can you give me an updated status on this? Are you having any issues after the change above?

jyotigroomefy · 2023-02-02T16:30:46Z

I am really sorry for sending logs without changing 202 status code. We have now fixed 202 status code responses to REGISTER requests and good thing we have not found any issue for last 3 days. we are continue doing testing. I will update here if we found any issue. Thank you for helping us :)

davehorton · 2023-02-02T16:57:04Z

no worries at all. Please let me know how it goes, whether you experience any crashes or hangs, and also how memory utilization looks over time

davehorton · 2023-02-17T19:08:36Z

@jyotigroomefy can you give me an update on where this stands? Is your issue fixed or is it still occurring?

davehorton · 2023-02-20T17:54:27Z

closed do to lack of response; presumed fixed.

jyotigroomefy · 2023-02-24T08:09:28Z

@davehorton it's working fine. Our issue is fixed. It hasn't occur again. Thank you :)

miklinson mentioned this issue Feb 15, 2023

Getting informed when a client websocket closes #57

Open

davehorton closed this as completed Feb 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How can Srf client know about the loss of TCP/WS connection from SIP client to Drachtio server #138

How can Srf client know about the loss of TCP/WS connection from SIP client to Drachtio server #138

jyotigroomefy commented Dec 25, 2022

davehorton commented Dec 25, 2022

jyotigroomefy commented Dec 30, 2022

davehorton commented Dec 30, 2022

jyotigroomefy commented Jan 4, 2023

davehorton commented Jan 4, 2023

davehorton commented Jan 4, 2023

davehorton commented Jan 4, 2023

jyotigroomefy commented Jan 6, 2023

davehorton commented Jan 6, 2023

jyotigroomefy commented Jan 13, 2023

davehorton commented Jan 13, 2023 •

edited

Loading

davehorton commented Jan 13, 2023

davehorton commented Jan 18, 2023

jyotigroomefy commented Jan 22, 2023

davehorton commented Jan 22, 2023

davehorton commented Jan 24, 2023 •

edited

Loading

jyotigroomefy commented Jan 25, 2023

jyotigroomefy commented Jan 27, 2023

davehorton commented Jan 27, 2023

davehorton commented Jan 29, 2023

davehorton commented Feb 2, 2023

jyotigroomefy commented Feb 2, 2023

davehorton commented Feb 2, 2023

davehorton commented Feb 17, 2023

davehorton commented Feb 20, 2023

jyotigroomefy commented Feb 24, 2023

How can Srf client know about the loss of TCP/WS connection from SIP client to Drachtio server #138

How can Srf client know about the loss of TCP/WS connection from SIP client to Drachtio server #138

Comments

jyotigroomefy commented Dec 25, 2022

davehorton commented Dec 25, 2022

jyotigroomefy commented Dec 30, 2022

davehorton commented Dec 30, 2022

jyotigroomefy commented Jan 4, 2023

davehorton commented Jan 4, 2023

davehorton commented Jan 4, 2023

davehorton commented Jan 4, 2023

jyotigroomefy commented Jan 6, 2023

davehorton commented Jan 6, 2023

jyotigroomefy commented Jan 13, 2023

davehorton commented Jan 13, 2023 • edited Loading

davehorton commented Jan 13, 2023

davehorton commented Jan 18, 2023

jyotigroomefy commented Jan 22, 2023

davehorton commented Jan 22, 2023

davehorton commented Jan 24, 2023 • edited Loading

jyotigroomefy commented Jan 25, 2023

jyotigroomefy commented Jan 27, 2023

davehorton commented Jan 27, 2023

davehorton commented Jan 29, 2023

davehorton commented Feb 2, 2023

jyotigroomefy commented Feb 2, 2023

davehorton commented Feb 2, 2023

davehorton commented Feb 17, 2023

davehorton commented Feb 20, 2023

jyotigroomefy commented Feb 24, 2023

davehorton commented Jan 13, 2023 •

edited

Loading

davehorton commented Jan 24, 2023 •

edited

Loading