Kamailio crashes on TCP errors (invalid fd -1) #748

Closed
mbike2000ru opened this issue Aug 17, 2016 · 13 comments

Comments

@mbike2000ru

Hi,
Kamailio crashes with many CRITICAL errors like these:

Aug 16 15:09:48 kam-fe1 kamailio[8485]: CRITICAL: [tcp_read.c:1654]: handle_io(): io_watch_del failed for 0x7f3e610a3a70 id 105 fd -1, state -1, flags 4028, main fd -1, refcnt -2117 ([178.209.103.226]:40855 -> [178.209.103.226]:5060)
Aug 16 15:09:48 kam-fe1 kamailio[8479]: CRITICAL: [io_wait.h:594]: io_watch_del(): invalid fd -1, not in [0, 23)
Aug 16 15:09:48 kam-fe1 kamailio[8481]: WARNING: [tcp_read.c:1629]: handle_io(): F_TCPCONN connection marked as bad: 0x7f3e608454e8 id 78 refcnt -257
Aug 16 15:09:48 kam-fe1 kamailio[8379]: INFO: <script>: 52373459-303f6c6b@192.168.2.2|log|source 128.204.55.172:5160
Aug 16 15:09:48 kam-fe1 kamailio[8485]: WARNING: [tcp_read.c:1629]: handle_io(): F_TCPCONN connection marked as bad: 0x7f3e610a3a70 id 105 refcnt -2117
Aug 16 15:09:48 kam-fe1 kamailio[8479]: CRITICAL: [tcp_read.c:1654]: handle_io(): io_watch_del failed for 0x7f3e63c73d90 id 984 fd -1, state -1, flags 5022, main fd -1, refcnt -126097 ([81.211.54.218]:51642 -> [81.211.54.218]:5060)
Aug 16 15:09:48 kam-fe1 kamailio[8481]: CRITICAL: [io_wait.h:594]: io_watch_del(): invalid fd -1, not in [0, 2)
Aug 16 15:09:48 kam-fe1 kamailio[8379]: INFO: <script>: 52373459-303f6c6b@192.168.2.2|log|from sip:Menedzher1_UYHPQ@29161.ztpbx.ru
Aug 16 15:09:48 kam-fe1 kamailio[8485]: CRITICAL: [io_wait.h:594]: io_watch_del(): invalid fd -1, not in [0, 2)
Aug 16 15:09:48 kam-fe1 kamailio[8377]: ALERT: [main.c:728]: handle_sigs(): child process 8504 exited by a signal 11

+++++++++++++++++++++++++++++++++
kamailio -v
version: kamailio 4.3.5 (x86_64/linux)
flags: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 17:51:55 Mar 8 2016 with gcc 4.4.7
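
For triage, the "io_watch_del failed" lines above can be reduced to the few fields that matter (connection id, fd, refcnt). The following is a minimal Python sketch, not part of Kamailio: the regular expression is inferred only from the sample log lines in this report, and the names `parse_io_watch_del` and `suspicious` are hypothetical helpers introduced for illustration.

```python
import re

# Field layout inferred from the sample log lines in this report only;
# it is an assumption, not taken from the Kamailio source.
IO_WATCH_DEL_RE = re.compile(
    r"io_watch_del failed for (?P<conn>0x[0-9a-fA-F]+) id (?P<id>\d+) "
    r"fd (?P<fd>-?\d+), state (?P<state>-?\d+), flags (?P<flags>[0-9a-fA-F]+), "
    r"main fd (?P<main_fd>-?\d+), refcnt (?P<refcnt>-?\d+)"
)


def parse_io_watch_del(line):
    """Return a dict of fields for an 'io_watch_del failed' line, else None."""
    m = IO_WATCH_DEL_RE.search(line)
    if m is None:
        return None
    return {
        "conn": m.group("conn"),
        "id": int(m.group("id")),
        "fd": int(m.group("fd")),
        "state": int(m.group("state")),
        "flags": m.group("flags"),  # kept as text; the numeric base is not known
        "main_fd": int(m.group("main_fd")),
        "refcnt": int(m.group("refcnt")),
    }


def suspicious(records):
    """Keep records whose fd or reference count is negative."""
    return [r for r in records if r["fd"] < 0 or r["refcnt"] < 0]
```

Feeding a syslog file line by line through `parse_io_watch_del` and then `suspicious` gives a quick list of connections that were already closed (fd -1) or whose reference count had gone negative by the time the error was logged.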

@miconda
Member

miconda commented Aug 17, 2016

What do you mean by crashes? Does it stop running, or does it just print error messages? If it stops, provide the logs from the shutdown.

@mbike2000ru
Author

mbike2000ru commented Aug 17, 2016

It stops.

tcp_logs.txt
I will provide a backtrace as soon as I get one.

@mbike2000ru
Author

mbike2000ru commented Aug 19, 2016

kamailio_backtrace.txt
kamailio_start_of_critical_alarms.txt
kamailio_backtrace2.txt

Hello,
I got backtraces from 2 core dump files:
gdb /usr/sbin/kamailio core.kamailio.495.1471534091.11080
gdb /usr/sbin/kamailio core.kamailio.495.1471534091.11081
++++++++++++++++++++++

Log in this case:
Aug 18 15:28:13 kam-fe1 kamailio[11105]: CRITICAL: [pass_fd.c:275]: receive_fd(): EOF on 131
Aug 18 15:28:13 kam-fe1 kamailio[10981]: INFO: <script>: isbc89cdadhb00stta9s6s87ibhh77c766cb@SoftX3000|log|external reply 200
Aug 18 15:28:13 kam-fe1 kamailio[11061]: INFO: <script>: unhandled AMQP event, payload: { "Register-Overwrite-Notify": false, "Suppress-Unregister-Notifications": true, "Custom-Channel-Vars": { "Username": "00001", "Realm": "4995069521.vats.gobaza.ru", "Account-ID": "4c17761edf730103d6cb9732b6f1b8c4", "Authorizing-ID": "ab9596bb62cd00e97a3efe7c4cac111e", "Authorizing-Type": "device", "Owner-ID": "e36703e3650331f27a2adb538ab0bc38", "Account-Realm": "4995069521.vats.gobaza.ru", "Account-Name": "4995069521", "Suppress-Unregister-Notifications": true, "Register-Overwrite-Notify": false }, "Auth-Password": "xgBDSfnt", "Auth-Method": "password", "Node": "whistle_apps@logic3.vpbx.local", "Msg-ID": "aa16f225-90a9-4968-8d6a-e28efb9cfdfd", "App-Version": "0.4.2", "App-Name": "registrar", "Event-Name": "authn_resp", "Event-Category": "directory", "Server-ID": "consumer://1/" }
Aug 18 15:28:13 kam-fe1 kamailio[10979]: ALERT: [main.c:728]: handle_sigs(): child process 11081 exited by a signal 11
Aug 18 15:28:13 kam-fe1 kamailio[10979]: ALERT: [main.c:731]: handle_sigs(): core was generated
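
To find which child processes died and with which signal, the `handle_sigs()` ALERT lines can be scanned directly. This is a minimal Python sketch, assuming the message wording shown in the log above; `crashed_children` is a hypothetical helper name, not a Kamailio tool.

```python
import re
import signal

# Message wording taken from the handle_sigs() ALERT lines quoted in this report.
EXIT_RE = re.compile(r"child process (?P<pid>\d+) exited by a signal (?P<sig>\d+)")


def crashed_children(lines):
    """Return (pid, signal_name) pairs for children reported as killed by a signal."""
    result = []
    for line in lines:
        m = EXIT_RE.search(line)
        if m is None:
            continue
        signum = int(m.group("sig"))
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number unknown on this platform; fall back to the raw number.
            name = "signal %d" % signum
        result.append((int(m.group("pid")), name))
    return result
```

On Linux, signal 11 maps to SIGSEGV, so the line for child process 11081 above indicates a segmentation fault; the PID also matches the suffix of the second core file name.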

@mbike2000ru
Author

Private memory was 4 MB. I increased it to 32 MB, which may help resolve the issue.

@mbike2000ru
Author

Increasing private memory did not help. Kamailio crashed yesterday.

@mbike2000ru
Author

mbike2000ru commented Aug 26, 2016

Hi,
almost all "bt full" outputs from gdb contain this frame:
#13 0x00000000005f916c in parse_disposition (s=0xa75468 <tcp_reader_ltimer+228904>, disp=<optimized out>) at parser/parse_disposition.c:60
disp_p =
new_p =
state = 2
saved_state = 2
tmp = 0x7f161b681fc0 ""
end = 0x4020dd8800000000 <error: Cannot access memory at address 0x4020dd8800000000>
func = "parse_disposition"

Maybe this helps to solve the issue?

@miconda
Member

miconda commented Aug 26, 2016

Did you install from source code or from packages? Symbols are missing in most of the backtrace frames, so the frames look invalid and are not useful for tracking down the issue.

@mbike2000ru
Author

mbike2000ru commented Aug 26, 2016

The core dumps were generated on the production server.
I do not know how to check whether the kamailio-dbg rpm is installed on that production server.

Is it safe to install this rpm on the production server? It is under high load, so it would probably have to be done at night.

I ran gdb on another server that has kamailio-dbg installed.

@mbike2000ru
Author

The production Kamailio was installed from the Kazoo repository (2600hz rpm).


@mbike2000ru
Author

mbike2000ru commented Aug 28, 2016

Package kazoo-kamailio-debuginfo-4.3.4-8.el6.x86_64 already installed and latest version

So all core dumps were taken with this package installed, on the same server where the problem occurred.
The backtrace is attached:
bt_full.txt

miconda added a commit that referenced this issue Aug 31, 2016
@miconda
Member

miconda commented Aug 31, 2016

Can you try with the master branch, or backport the patch referenced above into your clone? If all is OK, I will backport it to the stable branches in the kamailio repo.

miconda added a commit that referenced this issue Aug 31, 2016
@miconda
Member

miconda commented Aug 31, 2016

Also use the second patch referenced above, because your version is older than the one I first looked at.

@miconda
Member

miconda commented Sep 7, 2016

Reopen if the latest patches don't fix it.

@miconda miconda closed this as completed Sep 7, 2016
miconda added a commit that referenced this issue Sep 8, 2016
…a debug logs

- reported by GH #748

(cherry picked from commit 71b9765)
miconda added a commit that referenced this issue Sep 8, 2016
…or case

- related to GH #748

(cherry picked from commit 4819554)
miconda added a commit that referenced this issue Jun 13, 2017
…a debug logs

- reported by GH #748

(cherry picked from commit 71b9765)
(cherry picked from commit cfa3a6f)
miconda added a commit that referenced this issue Jun 13, 2017
…or case

- related to GH #748

(cherry picked from commit 4819554)
(cherry picked from commit f450fea)