Hi,
This is easily reproducible over RDMA. The server side uses a single xio thread that accepts more than 100 connections. Eventually no further connections are accepted and the thread can no longer make progress. Backtraces from that thread follow.
(gdb) t 121
[Switching to thread 121 (Thread 0x7f1da37fe700 (LWP 6324))]
#0 0x00007f1ebb083b7e in xio_nexus_release_cb (data=<optimized out>) at ../common/xio_nexus.c:1096
1096 ../common/xio_nexus.c: No such file or directory.
(gdb) bt
#0 0x00007f1ebb083b7e in xio_nexus_release_cb (data=<optimized out>) at ../common/xio_nexus.c:1096
#1 0x00007f1ebb053eed in xio_ev_loop_exec_scheduled (loop=loop@entry=0x7f1d9400fa20) at xio/xio_ev_loop.c:368
#2 0x00007f1ebb053f83 in xio_ev_loop_run_helper (loop_hndl=0x7f1d9400fa20, timeout=timeout@entry=-1) at xio/xio_ev_loop.c:412
#3 0x00007f1ebb0542fa in xio_ev_loop_run (loop_hndl=<optimized out>) at xio/xio_ev_loop.c:514
#4 0x00007f1ebb0567b5 in xio_context_run_loop (ctx=0x7f1d940082f0, timeout_ms=timeout_ms@entry=-1) at xio/xio_context.c:504
On another node with the same problem:
(gdb) t 131
[Switching to thread 131 (Thread 0x7fe15bfff700 (LWP 10981))]
#0 0x00007fe374969e89 in INIT_LIST_HEAD (list=<optimized out>) at ./linux/list.h:59
59 ./linux/list.h: No such file or directory.
(gdb) bt
#0 0x00007fe374969e89 in INIT_LIST_HEAD (list=<optimized out>) at ./linux/list.h:59
#1 list_del_init (entry=<optimized out>) at ./linux/list.h:166
#2 xio_ev_loop_remove_event (evt=0x7fda44f4fb38) at xio/xio_ev_loop.c:332
#3 0x00007fe374969ee7 in xio_ev_loop_exec_scheduled (loop=loop@entry=0x7fe11c009810) at xio/xio_ev_loop.c:361
#4 0x00007fe374969f83 in xio_ev_loop_run_helper (loop_hndl=0x7fe11c009810, timeout=timeout@entry=-1) at xio/xio_ev_loop.c:412
#5 0x00007fe37496a2fa in xio_ev_loop_run (loop_hndl=<optimized out>) at xio/xio_ev_loop.c:514
#6 0x00007fe37496c7b5 in xio_context_run_loop (ctx=0x7fe11c0095f0, timeout_ms=timeout_ms@entry=-1) at xio/xio_context.c:504
More info included:
Please let me know if you need more info. Thanks.