Opensips 2.3 - BUG - shutdown timeout triggered #1172
Comments
I have two core dumps. Below is the bt / bt full output.
Core dump 1
Core dump 2
Hi, @apsaras! Is there anything specific in the configuration of your usrloc / mid_registrar modules? I have yet to trigger a contact expiry crash - maybe some module config details can help me understand your scenario better.
Hello Liviu,
My configuration is a bit unusual. I am using registrar to register a PBX with OpenSIPS and then, over the established connection, forwarding registration requests from remote users to the PBX using mid_registrar. The registrar and mid_registrar module configuration follows.
----- mid_registrar -----
modparam("mid_registrar", "mode", 2)
----- registrar params -----
modparam("registrar", "max_contacts", 10)
Ok, that explains it. This is quite an unexpected scenario - by design, registrar and mid_registrar are currently mutually exclusive, which is why the PBX contact's "mid_registrar" data is NULL, causing the crash on expiry. I am thinking of a way to solve this.
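To illustrate the failure mode described above, here is a minimal sketch of an expiry callback that skips contacts carrying no mid_registrar data instead of dereferencing a NULL pointer. All type, field and function names are hypothetical stand-ins, not the real OpenSIPS structures, and this is not necessarily how the problem is addressed in the actual code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real OpenSIPS types. */
struct mid_reg_info_sketch {
    int expires_out;   /* made-up field: upstream expiry, in seconds */
};

struct contact_sketch {
    const char *uri;
    /* NULL when the contact was saved by the plain registrar module */
    struct mid_reg_info_sketch *mid_reg_data;
};

/*
 * Hypothetical contact-expiry callback.
 * Returns 1 when attached data was present and could be read,
 * 0 when the contact was skipped.
 */
int on_contact_expired_sketch(const struct contact_sketch *ct)
{
    if (ct == NULL || ct->mid_reg_data == NULL)
        return 0;  /* not a mid_registrar contact: skip, do not dereference */

    /* From here on it is safe to read the attached data. */
    (void)ct->mid_reg_data->expires_out;
    return 1;
}
```

With a guard like this, a PBX contact saved by registrar (no attached data) would simply be ignored by the callback, while contacts created through mid_registrar would still be processed.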
Is there any other way to implement the above scenario? The PBX is behind NAT with a dynamic IP, and remote users should register on the PBX via the proxy. The only reason I need the PBX registration is for authentication, location and NAT. I also have another, rather strange issue, and I am not sure whether it is related to this bug: in some cases remote IP phones register, receive OK, and then send a de-registration (a registration with 0 expiration). When that happens, OpenSIPS returns 400.
Hello Liviu. Do you have an estimate for a fix? Can you suggest any workaround? Thank you.
Just pushed a fix for this - please let me know if you encounter any more issues when running both registrars concurrently. PS: We analyzed your scenario, and couldn't come up with a better way of solving it :)
Hello Liviu. Thank you very much for your prompt fix. I compiled the new revision and I am now getting the following core dumps.
Core dump 1
Core dump 2
Let me put together a full testing setup first - hopefully it will allow me to figure out what's missing.
I can give you access to my system to test and trace.
Please email the connection details to liviu@opensips.org - I can't seem to get it to crash over here yet.
Just sent it. Please check, and if you need anything, contact me via email.
Hello.
I am testing the latest OpenSIPS 2.3 on 64-bit Debian 7 running in a XenServer VM, and it crashes randomly with no core dump generated. Below are the logs at log level 6.
version: opensips 2.3.1 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
git revision: c33f23f
main.c compiled on 17:30:58 Aug 9 2017 with gcc 4.7
Aug 9 19:27:08 myhostname /sbin/opensips[24788]: DBG:nathelper:nh_timer: resolving next hop: '1.1.1.1'
Aug 9 19:27:09 myhostname /sbin/opensips[24782]: DBG:httpd:answer_to_connection: START *** cls=(nil), connection=0x1a6df20, url=/json/sip_trace, method=GET, versio=HTTP/1.1, upload_data[0]=(nil), con_cls=(nil)
Aug 9 19:27:09 myhostname /sbin/opensips[24782]: DBG:httpd:getConnectionHeader: Accept=/*
Aug 9 19:27:09 myhostname /sbin/opensips[24782]: DBG:httpd:answer_to_connection: accept_type=[-1]
Aug 9 19:27:09 myhostname /sbin/opensips[24782]: DBG:httpd:answer_to_connection: normalised_url=[/sip_trace]
Aug 9 19:27:09 myhostname /sbin/opensips[24782]: DBG:mi_json:mi_json_answer_to_connection: START *** cls=(nil), connection=0x1a6df20, url=/sip_trace, method=GET, versio=HTTP/1.1, upload_data[0]=(nil), *con_cls=(nil)
Aug 9 19:27:09 myhostname /sbin/opensips[24782]: ERROR:mi_json:mi_json_answer_to_connection: unable to find mi command [sip_trace]
Aug 9 19:27:09 myhostname /sbin/opensips[24782]: DBG:httpd:answer_to_connection: MHD_create_response_from_data [0x7fe85a5e06e0:56]
Aug 9 19:27:12 myhostname /sbin/opensips[24787]: DBG:usrloc:run_ul_callbacks: contact=0x7fe85f988d18, callback type 8/15, id 0 entered
Aug 9 19:27:12 myhostname /sbin/opensips[24787]: DBG:mid_registrar:mid_reg_ct_event: Contact callback (8): contact='sip:s@1.1.1.24:14128' | param=(0x7fe85f988e48 -> (nil)) | data[0]=((nil))
Aug 9 19:27:12 myhostname /sbin/opensips[24826]: DBG:core:io_wait_loop_epoll: [TCP_main] EPOLLHUP on IN ->connection closed by the remote peer!
Aug 9 19:27:12 myhostname /sbin/opensips[24826]: CRITICAL:core:receive_fd: EOF on 22
Aug 9 19:27:12 myhostname /sbin/opensips[24826]: DBG:core:handle_worker: dead child 6, pid 24787 (shutting down?)
Aug 9 19:27:12 myhostname /sbin/opensips[24826]: DBG:core:io_watch_del: [TCP_main] io_watch_del op on index 5 22 (0x8d3f00, 22, 5, 0x0,0x1) fd_no=35 called
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: DBG:core:handle_sigs: status = 11
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: INFO:core:handle_sigs: child process 24787 exited by a signal 11
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: INFO:core:handle_sigs: core was not generated
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: INFO:core:handle_sigs: terminating due to SIGCHLD
Aug 9 19:27:12 myhostname /sbin/opensips[24826]: INFO:core:sig_usr: signal 15 received
......
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: INFO:core:cleanup: cleanup
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: DBG:uac_auth:mod_destroy: done
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: DBG:core:pool_remove: connection still kept in the pool
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: DBG:core:pool_remove: connection still kept in the pool
Aug 9 19:27:12 myhostname /sbin/opensips[24776]: DBG:httpd:httpd_proc_destroy: destroying module ...
Aug 9 19:28:12 myhostname /sbin/opensips[24776]: CRITICAL:core:sig_alarm_abort: BUG - shutdown timeout triggered, dying...