zeromq can hang indefinitely in zmq::ctx_t::terminate() despite LINGER=0 #2586

Closed

tcwalther opened this issue May 24, 2017 · 35 comments

tcwalther commented May 24, 2017

I have an edge-case in a Python program where ZeroMQ can hang indefinitely. I have an extremely hard time isolating the bug; the most reliable way for me to reproduce it is to repeatedly run our entire test suite until that suite hangs (every 10th time on average). I've already posted a corresponding bug report at zeromq/pyzmq#1003, but I think I've found out a bit more now which hopefully justifies creating an issue here.

The basic problem is that I have created contexts and sockets all with LINGER=0 and BLOCKY=False, and still, sometimes it hangs forever in zmq::ctx_t::terminate(). I am using zeromq 4.2.1. Looking at the GDB output, I can see that it hangs in https://github.com/zeromq/libzmq/blob/v4.2.1/src/ctx.cpp#L194. Here is the GDB output:

(gdb) frame
#4  0x00007f1d61ffcccd in zmq::ctx_t::terminate (this=0x4984290) at src/ctx.cpp:194
194	        int rc = term_mailbox.recv (&cmd, -1);
(gdb) p *this
$13 = {tag = 2880293630, sockets = {items = std::vector of length 48, capacity 64 = {0x48c6860, 0x5ca93d0,
      0x49d4bc0, 0x5c93cc0, 0x5bf5ff0, 0x4941160, 0x5ce6780, 0x5daee30, 0x5d03450, 0x5d0e030, 0x5dbf220, 0x49ccb20,
      0x49026d0, 0x4f375c0, 0x5d0ae60, 0x5e60cd0, 0x5d1b370, 0x5b00010, 0x4a14470, 0x4b0fb40, 0x4f247b0, 0x5c4fd60,
      0x48dd020, 0x5c4ed80, 0x4988070, 0x5d11ca0, 0x5c7b840, 0x4976780, 0x5aed1b0, 0x5cb16e0, 0x4944f50, 0x5d34bb0,
      0x4d79b60, 0x4a2fc90, 0x4c3b020, 0x5e64e60, 0x4eedc40, 0x4f87570, 0x5e5cd50, 0x4f41b40, 0x5e1ea80, 0x5afacd0,
      0x4acf650, 0x48c4640, 0x4f40a10, 0x5b57c60, 0x5b38b80, 0x5e66cc0}},
  empty_slots = std::vector of length 975, capacity 1024 = {1025, 1024, 1023, 1022, 1021, 1020, 1019, 1018, 1017,
    1016, 1015, 1014, 1013, 1012, 1011, 1010, 1009, 1008, 1007, 1006, 1005, 1004, 1003, 1002, 1001, 1000, 999, 998,
    997, 996, 995, 994, 993, 992, 991, 990, 989, 988, 987, 986, 985, 984, 983, 982, 981, 980, 979, 978, 977, 976,
    975, 974, 973, 972, 971, 970, 969, 968, 967, 966, 965, 964, 963, 962, 961, 960, 959, 958, 957, 956, 955, 954,
    953, 952, 951, 950, 949, 948, 947, 946, 945, 944, 943, 942, 941, 940, 939, 938, 937, 936, 935, 934, 933, 932,
    931, 930, 929, 928, 927, 926, 925, 924, 923, 922, 921, 920, 919, 918, 917, 916, 915, 914, 913, 912, 911, 910,
    909, 908, 907, 906, 905, 904, 903, 902, 901, 900, 899, 898, 897, 896, 895, 894, 893, 892, 891, 890, 889, 888,
    887, 886, 885, 884, 883, 882, 881, 880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 870, 869, 868, 867, 866,
    865, 864, 863, 862, 861, 860, 859, 858, 857, 856, 855, 854, 853, 852, 851, 850, 849, 848, 847, 846, 845, 844,
    843, 842, 841, 840, 839, 838, 837, 836, 835, 834, 833, 832, 831, 830, 829, 828, 827, 826...}, starting = false,
  terminating = true, slot_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
        __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}, reaper = 0x5db58c0,
  io_threads = std::vector of length 1, capacity 1 = {0x5dda470}, slot_count = 1026, slots = 0x5dd5a80,
  term_mailbox = {<zmq::i_mailbox> = {_vptr.i_mailbox = 0x7f1d622633f0 <vtable for zmq::mailbox_t+16>},
    cpipe = {<zmq::ypipe_base_t<zmq::command_t>> = {
        _vptr.ypipe_base_t = 0x7f1d62263390 <vtable for zmq::ypipe_t<zmq::command_t, 16>+16>}, queue = {
        begin_chunk = 0x495c3c0, begin_pos = 0, back_chunk = 0x495c3c0, back_pos = 0, end_chunk = 0x495c3c0,
        end_pos = 1, spare_chunk = {ptr = 0x0}}, w = 0x495c3c0, r = 0x495c3c0, f = 0x495c3c0, c = {ptr = 0x0}},
    signaler = {w = 17, r = 17, pid = 9671}, sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
          __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
        __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
        __size = "\001\000\000", __align = 1}}, active = false}, endpoints = std::map with 0 elements,
  pending_connections = std::multimap with 0 elements, endpoints_sync = {mutex = {__data = {__lock = 0, __count = 0,
        __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}, static max_socket_id = {value = 100}, max_sockets = 1023,
  max_msgsz = 2147483647, io_thread_count = 1, blocky = false, ipv6 = false, thread_priority = -1,
  thread_sched_policy = -1, opt_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
        __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}, pid = 9671, crypto_sync = {mutex = {__data = {__lock = 0, __count = 0,
        __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}}

And the first 30 frames of the stack trace:

#0  0x00007f1d68dcc84d in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1d62025eca in poll (__timeout=-1, __nfds=1, __fds=0x7ffda9fc7dd0)
    at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
#2  zmq::signaler_t::wait (this=this@entry=0x4984398, timeout_=timeout_@entry=-1) at src/signaler.cpp:228
#3  0x00007f1d6200d0b5 in zmq::mailbox_t::recv (this=this@entry=0x4984330, cmd_=cmd_@entry=0x7ffda9fc7e80,
    timeout_=timeout_@entry=-1) at src/mailbox.cpp:81
#4  0x00007f1d61ffcccd in zmq::ctx_t::terminate (this=0x4984290) at src/ctx.cpp:194
#5  0x00007f1d614a4ec8 in __pyx_f_3zmq_7backend_6cython_7context_7Context__term (__pyx_v_self=0x7f1d0232cdd8)
    at zmq/backend/cython/context.c:2210
#6  __pyx_pf_3zmq_7backend_6cython_7context_7Context_6term (__pyx_v_self=0x7f1d0232cdd8)
    at zmq/backend/cython/context.c:2323
#7  __pyx_pw_3zmq_7backend_6cython_7context_7Context_7term (__pyx_v_self=0x7f1d0232cdd8, unused=<optimized out>)
    at zmq/backend/cython/context.c:2289
#8  0x00007f1d6226ae29 in __Pyx_CyFunction_CallAsMethod (func=0x7f1d626838e8, args=0x7f1d4edfcac8, kw=0x0)
    at zmq/backend/cython/error.c:2013
#9  0x0000000000453fd2 in _PyObject_FastCallDict ()
#10 0x0000000000542cc5 in ?? ()
#11 0x000000000054692c in _PyEval_EvalFrameDefault ()
#12 0x0000000000541f01 in ?? ()
#13 0x000000000054bc86 in _PyFunction_FastCallDict ()
#14 0x000000000045411f in _PyObject_FastCallDict ()
#15 0x000000000045421b in _PyObject_Call_Prepend ()
#16 0x0000000000453fd2 in _PyObject_FastCallDict ()
#17 0x00000000004ca6d1 in ?? ()
#18 0x000000000043e2b1 in ?? ()
#19 0x000000000043ed89 in ?? ()
#20 0x000000000043f581 in _PyObject_GC_NewVar ()
#21 0x00000000004bcfd5 in PyTuple_New ()
#22 0x0000000000545665 in _PyEval_EvalFrameDefault ()
#23 0x0000000000541f01 in ?? ()
#24 0x0000000000543097 in ?? ()
#25 0x000000000054692c in _PyEval_EvalFrameDefault ()
#26 0x0000000000542b71 in ?? ()
#27 0x000000000054bbaa in _PyFunction_FastCallDict ()
#28 0x000000000045411f in _PyObject_FastCallDict ()
#29 0x000000000045421b in _PyObject_Call_Prepend ()
#30 0x0000000000453df0 in PyObject_Call ()

Unfortunately, I have not yet been able to reproduce this bug in a debug build of Python, hence the nondescript stack trace in the Python part.

Looking at the terminate() function, I don't understand the rationale of letting int rc = term_mailbox.recv (&cmd, -1) wait forever. It is my understanding that I specifically set LINGER=0 to avoid exactly this behaviour. What am I missing?

Update: I thought it might be useful to include thread information. There are 11 threads in total; the backtrace above is from thread 1. Threads 4-11 are from Tensorflow and can probably be neglected. I don't know what threads 2 and 3 are for. Given that term_mailbox.recv waits for the reaper thread to terminate all sockets (according to the comment above that line of code), I wonder which of these threads that would be.

(gdb) info threads
  Id   Target Id         Frame
  11   Thread 0x7f1d2dc6a700 (LWP 9675) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  10   Thread 0x7f1d2d469700 (LWP 9676) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  9    Thread 0x7f1d2cc68700 (LWP 9677) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  8    Thread 0x7f1d2c467700 (LWP 9678) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  7    Thread 0x7f1d2bc66700 (LWP 9679) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  6    Thread 0x7f1d2b465700 (LWP 9680) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  5    Thread 0x7f1d2ac64700 (LWP 9681) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  4    Thread 0x7f1d2a463700 (LWP 9682) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  3    Thread 0x7f1ce0fff700 (LWP 9990) "py.test" 0x00007f1d68dda283 in epoll_wait ()
    at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0x7f1ce8dfb700 (LWP 9991) "py.test" 0x00007f1d68dda283 in epoll_wait ()
    at ../sysdeps/unix/syscall-template.S:81
* 1    Thread 0x7f1d6a023740 (LWP 9671) "py.test" 0x00007f1d68dcc84d in poll ()
    at ../sysdeps/unix/syscall-template.S:81

The backtraces of threads 4 to 11 are identical:

(gdb) thread 4
[Switching to thread 4 (Thread 0x7f1d2a463700 (LWP 9682))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185	../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f1d61b864dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007f1d39a79fc7 in Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) ()
   from /sonalytic/.venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so
#3  0x00007f1d39a7a847 in std::_Function_handler<void (), Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::NonBlockingThreadPoolTempl(int, tensorflow::thread::EigenEnvironment)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /sonalytic/.venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so
#4  0x00007f1d39a79950 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
   from /sonalytic/.venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so
#5  0x00007f1d61b89a60 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f1d69bfc184 in start_thread (arg=0x7f1d2a463700) at pthread_create.c:312
#7  0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Threads 2 and 3's backtraces are identical too, but their *this variables in frame 1 differ significantly:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7f1ce8dfb700 (LWP 9991))]
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
81	../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x5ad0220) at src/epoll.cpp:168
#2  0x00007f1d6203b345 in thread_routine (arg_=0x5ad02a0) at src/thread.cpp:100
#3  0x00007f1d69bfc184 in start_thread (arg=0x7f1ce8dfb700) at pthread_create.c:312
#4  0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) up
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x5ad0220) at src/epoll.cpp:168
168	            timeout ? timeout : -1);
(gdb) p *this
$22 = {<zmq::poller_base_t> = {_vptr.poller_base_t = 0x7f1d62262bf0 <vtable for zmq::epoll_t+16>, clock = {
      last_tsc = 93651029260469, last_time = 29439022}, timers = std::multimap with 48 elements = {[29443487] = {
        sink = 0x7f1cf40a2390, id = 1}, [29443499] = {sink = 0x7f1cf40aee80, id = 1}, [29443561] = {
        sink = 0x7f1cf4011170, id = 1}, [29443566] = {sink = 0x7f1cf40a9090, id = 1}, [29443594] = {
        sink = 0x7f1cf4002030, id = 1}, [29443600] = {sink = 0x7f1cf40b0990, id = 1}, [29443607] = {
        sink = 0x7f1cf40b05e0, id = 1}, [29443607] = {sink = 0x7f1cf4012cc0, id = 1}, [29443620] = {
        sink = 0x7f1cf4017c70, id = 1}, [29443628] = {sink = 0x7f1cf40a0f00, id = 1}, [29443653] = {
        sink = 0x7f1cf4016eb0, id = 1}, [29443666] = {sink = 0x7f1cf40a46f0, id = 1}, [29443667] = {
        sink = 0x7f1cf40a4ed0, id = 1}, [29443668] = {sink = 0x7f1cf40b0120, id = 1}, [29443671] = {
        sink = 0x7f1cf40adfd0, id = 1}, [29443674] = {sink = 0x7f1cf40afc60, id = 1}, [29443697] = {
        sink = 0x7f1cf40b13b0, id = 1}, [29443712] = {sink = 0x7f1cf40a1a20, id = 1}, [29443719] = {
        sink = 0x7f1cf40af6c0, id = 1}, [29443739] = {sink = 0x7f1cf417c780, id = 1}, [29443749] = {
        sink = 0x7f1cf40a6700, id = 1}, [29443762] = {sink = 0x7f1cf4013070, id = 1}, [29443780] = {
        sink = 0x7f1cf40b1d30, id = 1}, [29443782] = {sink = 0x7f1cf40ae380, id = 1}, [29443787] = {
        sink = 0x7f1cf40a6af0, id = 1}, [29443797] = {sink = 0x7f1cf40a7ab0, id = 1}, [29443803] = {
        sink = 0x7f1cf40a1ed0, id = 1}, [29443819] = {sink = 0x7f1cf40a3b20, id = 1}, [29443823] = {
        sink = 0x7f1cf40b0f20, id = 1}, [29443845] = {sink = 0x7f1cf4012820, id = 1}, [29443854] = {
        sink = 0x7f1cf406b700, id = 1}, [29443864] = {sink = 0x7f1cf40a2f50, id = 1}, [29443865] = {
        sink = 0x7f1cf40a9440, id = 1}, [29443879] = {sink = 0x7f1cf40af310, id = 1}, [29443895] = {
        sink = 0x7f1cf40b1870, id = 1}, [29443912] = {sink = 0x7f1cf40a4300, id = 1}, [29443922] = {
        sink = 0x7f1cf4010780, id = 1}, [29443955] = {sink = 0x7f1cf40ae9f0, id = 1}, [29443959] = {
        sink = 0x7f1cf40b21f0, id = 1}, [29443965] = {sink = 0x7f1cf40a7ea0, id = 1}, [29443979] = {
        sink = 0x7f1cf40a3340, id = 1}, [29443997] = {sink = 0x7f1cf4010b30, id = 1}, [29443998] = {
        sink = 0x7f1cf40b2850, id = 1}, [29444011] = {sink = 0x7f1cf40b2eb0, id = 1}, [29444020] = {
        sink = 0x7f1cf40a56b0, id = 1}, [29444076] = {sink = 0x7f1cf4012130, id = 1}, [29444106] = {
        sink = 0x7f1cf40adaf0, id = 1}, [29444111] = {sink = 0x7f1cf40a0aa0, id = 1}}, load = {value = 1}},
  ctx = @0x4984290, epoll_fd = 30, retired = std::vector of length 0, capacity 64, stopping = false, worker = {
    tfn = 0x7f1d62008c60 <zmq::epoll_t::worker_routine(void*)>, arg = 0x5ad0220, descriptor = 139762142787328},
  retired_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0,
        __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}}
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f1ce0fff700 (LWP 9990))]
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
81	in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x4b2c290) at src/epoll.cpp:168
#2  0x00007f1d6203b345 in thread_routine (arg_=0x4b2c310) at src/thread.cpp:100
#3  0x00007f1d69bfc184 in start_thread (arg=0x7f1ce0fff700) at pthread_create.c:312
#4  0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) up
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x4b2c290) at src/epoll.cpp:168
168	            timeout ? timeout : -1);
(gdb) p *this
$21 = {<zmq::poller_base_t> = {_vptr.poller_base_t = 0x7f1d62262bf0 <vtable for zmq::epoll_t+16>, clock = {
      last_tsc = 93281918731244, last_time = 29311916}, timers = std::multimap with 0 elements, load = {value = 1}},
  ctx = @0x4984290, epoll_fd = 28, retired = std::vector of length 0, capacity 2, stopping = false, worker = {
    tfn = 0x7f1d62008c60 <zmq::epoll_t::worker_routine(void*)>, arg = 0x4b2c290, descriptor = 139762010683136},
  retired_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0,
        __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}}
tcwalther changed the title from "zeromq can hang in zmq::ctx_t::terminate()" to "zeromq can hang indefinitely in zmq::ctx_t::terminate() despite LINGER=0" on May 24, 2017
bjovke (Contributor) commented May 25, 2017

@tcwalther LINGER=0 is not enough; you also have to close all open sockets before terminating the context.
The documentation states this clearly:

After interrupting all blocking calls, zmq_ctx_destroy() shall block until the following conditions are satisfied:
All sockets open within context have been closed with zmq_close().

Are you sure you have closed all sockets before trying to destroy the context?
The line on which you say ZMQ hangs indicates that you haven't.
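
For illustration, the shutdown order described above looks roughly like this in pyzmq (a minimal sketch; the REQ socket type, the endpoint, and the socket count are placeholders, not taken from this issue):

import zmq

ctx = zmq.Context()
sockets = [ctx.socket(zmq.REQ) for _ in range(3)]
for sock in sockets:
    sock.connect("tcp://localhost:5555")  # hypothetical endpoint

# ... use the sockets ...

for sock in sockets:
    sock.setsockopt(zmq.LINGER, 0)  # drop unsent messages on close
    sock.close()                    # every socket must be closed first

ctx.term()  # only now should this return promptly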

tcwalther (Author) commented May 25, 2017

@bjovke thanks for your reply. This would mean that I would have to further debug this in pyzmq. As far as I understand it, all sockets are explicitly closed. In my Python code, I have:

class Connection:
    def __init__(self, max_n_sockets=50):
        self.context = zmq.asyncio.Context()
        self.context.setsockopt(zmq.LINGER, 0)
        self.context.set(zmq.BLOCKY, False)

        self.sockets = asyncio.Queue()
        self.used_sockets = 0
        self.max_n_sockets = max_n_sockets

    def __del__(self):
        self.context.destroy(linger=0)

    # below here up to `max_n_sockets` sockets are created
    # ....

where self.context.destroy actively closes all sockets: https://github.com/zeromq/pyzmq/blob/master/zmq/backend/cython/context.pyx#L236

I guess I'll have to continue trying to reproduce this in my Python debug build to see the full Python stack trace, and thus understand why the socket has not been closed.

As a last question: is there any way that I can configure ZeroMQ to avoid this behaviour? Having a program hang indefinitely is really not in my interest.

bjovke (Contributor) commented May 25, 2017

@tcwalther Make sure that you're not calling self.context.destroy(linger=0) while there are sockets opened by a different thread. Sockets are not thread safe, and each socket must be used from one thread only.
So if there's a socket open in some other thread, close it from that thread and then destroy the context.
Unfortunately, I don't think ZeroMQ has a feature to close and clean up everything by itself.
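
As a rough sketch of that ownership rule (the endpoint, REQ socket type, and threading structure are illustrative assumptions, not taken from this issue), the socket is created, used, and closed entirely inside the thread that owns it, and the context is destroyed only afterwards:

import threading
import zmq

ctx = zmq.Context()
stop = threading.Event()

def worker():
    # This thread owns the socket: it creates, uses, and closes it.
    sock = ctx.socket(zmq.REQ)
    sock.setsockopt(zmq.LINGER, 0)
    sock.connect("tcp://localhost:5555")  # hypothetical endpoint
    stop.wait()                           # ... work until asked to stop ...
    sock.close()

t = threading.Thread(target=worker)
t.start()

stop.set()   # ask the worker to close its socket
t.join()     # wait until it has done so
ctx.term()   # safe: no sockets remain open in any thread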

tcwalther (Author) commented May 25, 2017

Yes, thanks for the advice. It's one of the reasons I chose the asyncio module, which runs single-threaded in an event loop. There should be no sockets in other threads.

Right now, I'm stuck trying to recreate the race condition in an environment where I can backtrace the garbage collection. It would be great if ZeroMQ had a built-in way to avoid this deadlock in the first place; it's easy to detect that a program has crashed, but hard to detect that a program is hanging. Note, also, that if I send a SIGTERM to the program (the Python test suite in this case), it stops hanging and continues to work normally. This may be expected, as the SIGTERM is probably caught and swallowed by ZeroMQ.

bjovke (Contributor) commented May 25, 2017

SIGTERM probably unblocks a select() call inside some socket's I/O thread. What ZMQ socket types are you using? For "inproc" sockets a deadlock could happen if the other end never connected.
It would be a good idea if you could debug ZeroMQ, reach the blocking code and see what is being waited for and in which socket.

tcwalther (Author) commented:

Thanks for your reply, @bjovke. I don't think that it's a select() call. Looking at the backtraces I posted above, the candidates are poll() and epoll_wait() (2x), and I assume it's blocking in the poll() call. I don't use any inproc sockets, only tcp.

tcwalther (Author) commented:

Sorry, to ask the question more precisely: is it possible that ZeroMQ is indeed not waiting on a socket? There doesn't seem to be a thread that is currently inside a socket. If not, what is it waiting for?

As another point of information: I just went through all 48 sockets in the context. In frame 4, thread 1 - see stack above - I printed p this->sockets, where this is a pointer to the context, and then went through all the addresses in this list via p (*(zmq::socket_base_t *) ADDRESS).

Going through the output, I see that all have linger = 0, all are connected via tcp and not inproc. ctx_terminated and destroyed are both false.

Here is the output:

(gdb) p (*(zmq::socket_base_t *) 0x48c6860)
$50 = {
  <zmq::own_t> = {
    <zmq::object_t> = {
      _vptr.object_t = 0x7f1d62264cd0 <vtable for zmq::req_t+16>,
      ctx = 0x4984290,
      tid = 3
    },
    members of zmq::own_t:
    options = {
      sndhwm = 1000,
      rcvhwm = 1000,
      affinity = 0,
      identity_size = 0 '\000',
      identity = "\000\000\000\000\000\000\000\000\320\257\004\000\000\000\000\200s0\003", '\000' <repeats 12 times>, "\006\000\000\000\000\000\000\000\001", '\000' <repeats 15 times>, "p\000\000\000\000\000\000\000 \000\000\000\000\000\000\000P\207\365\004\000\000\000\000\320\212\336\005\000\000\000\000\220\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\000W\233\004\000\000\000\000\b\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000Identity\000V\263\005\000\000\000\000\001\001\000\000\000\000\000\000\200\212\336\005\000\000\000\000\300\336\315\005", '\000' <repeats 12 times>, "\002\000\000\000\004", '\000' <repeats 11 times>, "\260\\\205:\035\177", '\000' <repeats 18 times>...,
      rate = 100,
      recovery_ivl = 10000,
      multicast_hops = 1,
      multicast_maxtpdu = 1500,
      sndbuf = -1,
      rcvbuf = -1,
      tos = 0,
      type = 3,
      linger = 0,
      connect_timeout = 0,
      tcp_maxrt = 0,
      reconnect_ivl = 100,
      reconnect_ivl_max = 5000,
      backlog = 100,
      maxmsgsize = -1,
      rcvtimeo = -1,
      sndtimeo = -1,
      ipv6 = false,
      immediate = 0,
      filter = false,
      invert_matching = false,
      recv_identity = false,
      raw_socket = false,
      raw_notify = true,
      socks_proxy_address = "",
      tcp_keepalive = -1,
      tcp_keepalive_cnt = -1,
      tcp_keepalive_idle = -1,
      tcp_keepalive_intvl = -1,
      tcp_accept_filters = std::vector of length 0, capacity 0,
      zap_ipc_creds = 64,
      ipc_uid_accept_filters = std::set with 0 elements,
      ipc_gid_accept_filters = std::set with 0 elements,
      ipc_pid_accept_filters = std::set with 0 elements,
      mechanism = 0,
      as_server = 0,
      zap_domain = "",
      plain_username = "",
      plain_password = "",
      curve_public_key = '\000' <repeats 31 times>,
      curve_secret_key = '\000' <repeats 31 times>,
      curve_server_key = '\000' <repeats 31 times>,
      gss_principal = "",
      gss_service_principal = "",
      gss_plaintext = false,
      socket_id = 5,
      conflate = false,
      handshake_ivl = 30000,
      connected = false,
      heartbeat_ttl = 0,
      heartbeat_interval = 0,
      heartbeat_timeout = -1,
      use_fd = -1
    },
    terminating = false,
    sent_seqnum = {
      value = 1
    },
    processed_seqnum = 1,
    owner = 0x0,
    owned = std::set with 1 elements = {
      [0] = 0x5b9b090
    },
    term_acks = 0
  },
  <zmq::array_item_t<0>> = {
    _vptr.array_item_t = 0x7f1d62264e18 <vtable for zmq::req_t+344>,
    array_index = 0
  },
  <zmq::i_poll_events> = {
    _vptr.i_poll_events = 0x7f1d62264e38 <vtable for zmq::req_t+376>
  },
  <zmq::i_pipe_events> = {
    _vptr.i_pipe_events = 0x7f1d62264e70 <vtable for zmq::req_t+432>
  },
  members of zmq::socket_base_t:
  connect_rid = "",
  endpoints = std::multimap with 1 elements = {
    ["tcp://localhost:52839"] = {
      first = 0x5b9b090,
      second = 0x5ad7e40
    }
  },
  inprocs = std::multimap with 0 elements,
  tag = 3135106223,
  ctx_terminated = false,
  destroyed = false,
  mailbox = 0x4f3ef70,
  pipes = {
    items = std::vector of length 1, capacity 1 = {0x5ad7e40}
  },
  poller = 0x0,
  handle = 0x0,
  last_tsc = 93306064684022,
  ticks = 26,
  rcvmore = false,
  clock = {
    last_tsc = 93281919039832,
    last_time = 29311916
  },
  monitor_socket = 0x0,
  monitor_events = 0,
  last_endpoint = "tcp://localhost:52839",
  thread_safe = false,
  reaper_signaler = 0x0,
  sync = {
    mutex = {
      __data = {
        __lock = 0,
        __count = 0,
        __owner = 0,
        __nusers = 0,
        __kind = 1,
        __spins = 0,
        __elision = 0,
        __list = {
          __prev = 0x0,
          __next = 0x0
        }
      },
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>,
      __align = 0
    },
    attr = {
      __size = "\001\000\000",
      __align = 1
    }
  },
  monitor_sync = {
    mutex = {
      __data = {
        __lock = 0,
        __count = 0,
        __owner = 0,
        __nusers = 0,
        __kind = 1,
        __spins = 0,
        __elision = 0,
        __list = {
          __prev = 0x0,
          __next = 0x0
        }
      },
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>,
      __align = 0
    },
    attr = {
      __size = "\001\000\000",
      __align = 1
    }
  }
}

bjovke (Contributor) commented May 26, 2017

ZeroMQ creates an I/O thread for each created socket. From that point on the thread is autonomous in receiving/sending messages.
When you close a ZMQ socket, zmq_close() just sends a message to its I/O thread to terminate and returns right away.
The I/O thread sends a message to the reaper thread when it has finished executing. When you call zmq_ctx_term() (destroy context), this function waits indefinitely for the reaper thread to finish, and the reaper thread finishes when the last I/O thread finishes executing.
In your case this means that the reaper thread is waiting for some of the I/O threads to finish, and I/O threads block for only one reason: waiting for an event on a system socket, or possibly for a message to terminate.

Try to find all threads when there's an infinite wait on context destroy. Besides the main thread, one thread is the reaper and the other ones are I/O threads. The I/O threads that still exist are the ones blocking your program.

bjovke (Contributor) commented May 26, 2017

Also, I've noticed terminating = false in your last output, which means that close has not been issued for that socket.
Maybe you can try adding code to your program to close all of the sockets yourself before executing self.context.destroy(linger=0). I'm really not familiar with the Python library, so I'm not sure what is going on there.

tcwalther (Author) commented:

The Python library closes all sockets before calling term, see: https://github.com/zeromq/pyzmq/blob/master/zmq/backend/cython/context.pyx#L236

bjovke (Contributor) commented May 26, 2017

I know that, but just as a test you can try closing them yourself.
And while it runs, keep an eye on the number of threads created. The number of sockets closed should match the number of I/O threads.

bluca (Member) commented May 27, 2017

By default there is only one I/O thread, not one per socket.

bjovke (Contributor) commented May 28, 2017

@bluca Yes, you're right, where has my brain gone to...
In that case it's easier. The I/O thread probably blocks on a select() or recv() system socket call.
The one thread you noticed waiting is your main thread, waiting for the reaper to terminate, which in turn is waiting for the I/O thread to terminate.
You just need to find out which sockets the select() is being done on and with which timeout, or which socket the recv() is waiting on.

tcwalther (Author) commented May 30, 2017

Looking at the stack traces, I don't see any select() or recv(), just epoll_wait().

If I'm not mistaken, the issue seems related to #1279 - the backtrace looks identical, and the fact that it also happened in 1 out of 10 cases in that bug report matches the behaviour I see here.

I don't think I'm using zmq_socket_monitor in my code, though - I don't call PyZMQ's socket.monitor anywhere in my code. And unlike the person in #1279, I can also reproduce this bug with ZeroMQ 4.2.2.

bluca (Member) commented May 30, 2017

The I/O thread uses epoll by default rather than select/poll (it's configurable).

bjovke (Contributor) commented May 30, 2017

Ok, whatever it is, epoll() or select(), there's a wait on file descriptor(s) which are probably sockets.
If you reach epoll_wait() with the debugger, you will see the value of the timeout and what is being waited on.
One wait is the mailbox.recv() with timeout -1, waiting for the reaper thread to finish; the other one is waiting on some system socket(s), which translate to ZMQ socket(s). There you will see how many of them are being waited on and what the timeout is, which should be 0 (LINGER=0).
We can't test this because we don't have the complete project you're working on, and I personally don't get along with Python.

tcwalther (Author) commented May 31, 2017

Update: all backtraces above were taken in the same Docker container with the same process being debugged, but from this comment onward I was unfortunately forced to start a new Docker container with a new process. As such, there may be small differences between the previous backtraces and the information from here on.

Original comment below.


Let's see if I can add more information.

  • I have 28 file descriptors of type anon_inode.
  • I have 21 sockets in my context. They have file descriptors from 30 upwards (each of type anon_inode). All of them have linger=0.
  • That leaves 7 anon_inodes with file descriptors < 30 that aren't assigned to those sockets.
  • Thread 2, for example, is stuck in zmq::epoll_t::loop (see backtraces above), and it has a zmq::epoll_t::epoll_fd = 29, so probably that's where the other 7 fds are used. Thread 3 waits on epoll_fd = 27.

Does that help in any way?

I don't know which socket epoll_wait() waits on (only the file descriptor). The timeout value itself is "optimised out". Calling the function that sets the timeout variable returns 0, but I don't know if the function is stateful (and thus whether calling it again would return something else). I would therefore expect the timeout to be -1 (from timeout ? timeout : -1).

Given an epoll_t, how do I find the socket that it is waiting for, if not by the file descriptor?
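
One crude way to at least see which descriptors are epoll instances and which are TCP sockets is to walk /proc (a Linux-only sketch; it only maps fd numbers to inode strings and cannot map an fd back to a particular zmq socket object, which is exactly the open question here):

import os

def dump_fds(pid):
    # Print each fd of the process and what it points to.
    # epoll fds show up as "anon_inode:[eventpoll]", TCP sockets as "socket:[<inode>]".
    fd_dir = "/proc/%d/fd" % pid
    for name in sorted(os.listdir(fd_dir), key=int):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd vanished between listdir and readlink
        print(name, "->", target)

dump_fds(os.getpid())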

bjovke (Contributor) commented May 31, 2017

@tcwalther
Stack traces for threads 2 - 11 show this:
#7 0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
#4 0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

If you're cloning your process and using the same ZMQ sockets from different processes, then this could be the cause of your troubles.

bjovke (Contributor) commented May 31, 2017

@tcwalther On second thought, clone() might just be the thread implementation in the background, with the CLONE_THREAD parameter used. So it is probably completely normal to see clone(), as long as you didn't call it yourself to clone the process.
Also, as I recall, multi-threading in Python can actually clone the whole process. Are you using Python threads and multiple processes?

bluca (Member) commented May 31, 2017

Yes, that's how a pthread backtrace will start.

bjovke (Contributor) commented May 31, 2017

@bluca Ah, you're back after being scared that GitHub is dead and all data is lost. :)

bjovke (Contributor) commented May 31, 2017

@tcwalther For the socket count, you said that you have 21 sockets, but in the trace for thread 2 you have timers = std::multimap with 48 elements, which would correspond to 24 sockets. That leaves 3 sockets unaccounted for.

tcwalther (Author) commented:

@bjovke oh yes, indeed. I had to restart my machine, and thus lost my old docker container (with the state of 24 sockets) right before my last comment (I've updated the comment accordingly).

As such, all previous backtraces were from the same process, but from the comment 2 hours ago onwards, I was forced to debug a new process in a new container.

In the new process, timers = std::multimap with 21 elements.

bjovke (Contributor) commented May 31, 2017

@tcwalther But this still does not explain the blocking wait in epoll_wait(), regardless of the number of ZMQ sockets. poll() is not the culprit, because that's your main thread waiting for the reaper to terminate.
If you're using a poller somewhere in your code, it might be that the poller is already blocked in epoll_wait() in some other thread before you called self.context.destroy(linger=0). In that case setting LINGER=0 might have no effect.
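
If a poll loop really is sitting in epoll_wait(), one common mitigation (purely an illustrative sketch; the SUB socket and endpoint are placeholders, not taken from this issue) is to poll with a finite timeout so the loop can notice a shutdown request, close its socket, and let the context terminate:

import zmq

ctx = zmq.Context()
sock = ctx.socket(zmq.SUB)
sock.setsockopt(zmq.LINGER, 0)
sock.setsockopt(zmq.SUBSCRIBE, b"")
sock.connect("tcp://localhost:5556")  # hypothetical endpoint

poller = zmq.Poller()
poller.register(sock, zmq.POLLIN)

for _ in range(50):  # in real code, loop until a shutdown flag is set
    events = dict(poller.poll(timeout=100))  # milliseconds; never blocks forever
    if sock in events:
        sock.recv()  # handle the message here

sock.close()
ctx.term()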

bjovke (Contributor) commented May 31, 2017

@tcwalther On second thought, I think there should be one extra FD created internally by ZMQ and added to the polling set exactly for the purpose of unblocking the epoll_wait() call. That should be the mailbox FD added to each socket.

tcwalther (Author) commented:

How would you recommend going forward then? Is there other information you'd like me to provide?

bjovke (Contributor) commented May 31, 2017

@tcwalther Well, the "easiest" way is to debug ZMQ and just follow the code of zmq_ctx_term(). There you can see the values inside each socket, including associated fds, and so on.
With mixed C++ and Python code I don't know how it can be done.

bjovke (Contributor) commented May 31, 2017

@tcwalther Even then it will be hard to catch, because you said that it happens only every 10th time.
Ideally we would have a native (C/C++) example which experiences this on every run.

bjovke (Contributor) commented May 31, 2017

@tcwalther There's another way: attach a debugger to the Python process, with the source code of the exact same version of ZMQ you're using and that ZMQ compiled with debug symbols. That way you can break into the blocking wait and inspect the values of all variables.

tcwalther (Author) commented:

There's another way: attach a debugger to the Python process, with the source code of the exact same version of ZMQ you're using and that ZMQ compiled with debug symbols. That way you can break into the blocking wait and inspect the values of all variables.

@bjovke I am already doing that. All the backtraces and variable inspections pasted above come from this.

bjovke (Contributor) commented Jun 1, 2017

@tcwalther Good then. I was under the impression that you had very limited abilities to analyze the process.
I'm not familiar with Linux debuggers and what can be done there.
You should be able to see the values of the variables at the time the infinite wait happened.
You have two options:

  • Debug ZMQ yourself and find the issue.
  • Create a minimal test case which experiences the infinite wait on every run, so other people can work on it.

It's almost impossible for anyone else to run your complete software and try to catch an issue that appears only on every 10th execution.

ovanes commented Aug 2, 2017

@tcwalther I have a similar issue in C++ and I've described it here: zeromq/cppzmq#139.

I see in your Python code that you call self.context.set(zmq.BLOCKY, False). If I do that in C++ I always receive -1 as a result, which means an error. I am not sure that the BLOCKY option can be set in libzmq at all.

tcwalther (Author) commented Aug 21, 2017

@ovanes interesting - pyzmq doesn't return a value for the set method, so I could never check.

In the end, I solved it by using a patched Python context object. This may not help you in your C++ code, but maybe it gives you some inspiration. Have a look at the solution here: zeromq/pyzmq#1003 (comment)
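
The linked pyzmq comment has the actual fix; purely to illustrate the general idea of guarding against a silent hang (a hypothetical helper, not the patch itself), term() can also be run under a watchdog so a test suite at least reports the problem instead of blocking forever:

import threading
import zmq

def term_with_watchdog(ctx, timeout=5.0):
    # Run ctx.term() in a daemon thread and complain if it doesn't finish in time.
    t = threading.Thread(target=ctx.term, daemon=True)
    t.start()
    t.join(timeout)
    if t.is_alive():
        print("warning: zmq context failed to terminate within %.1f s" % timeout)

ctx = zmq.Context()
# ... create, use, and close sockets ...
term_with_watchdog(ctx)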

stale bot commented Nov 4, 2018

This issue has been automatically marked as stale because it has not had activity for 365 days. It will be closed if no further activity occurs within 56 days. Thank you for your contributions.

stale bot added the stale label Nov 4, 2018
stale bot closed this as completed Dec 31, 2018
tcwalther (Author) commented:

I've run into this issue again, but this time using regular contexts, not asyncio. I can reproduce this regression in PyZMQ 18.1, and it doesn't happen in 18.0.x.

Again, it's very hard to isolate the bug into a small example script. Were there any significant changes in 18.1 that could lead to ZeroMQ not terminating on exit?
