2.0.7/2.0.8 crash/hang/assert when calling zmq_close on a blocked socket #53

chuckremes · 2010-08-28T17:26:26Z

I finally have a somewhat reproducible case for this problem. Unfortunately it is in Ruby, but the code should be easy to convert to C for someone who knows C better than I do.

Here are the steps behind the crash (sometimes it asserts on object.cpp line 342 [process_pipe_term]).

1. Set up 2 threads.
2. Each thread should create its own context. (The example below reproduces the bug with a single context though, so this isn't a hard requirement.)
3. Each thread should create a socket and bind/connect to the same transport.
4. One thread should send some data to the other. IMPORTANT: traffic must pass through each socket at least once, otherwise the bug does not reproduce.
5. One or both sockets should block forever on their next send/recv operation.
6. Outside of the thread(s), call zmq_close on one or both sockets. 7 out of 10 times I see it crash or assert on object.cpp line 342 (2.0.8 release).

BTW, calling zmq_term succeeds every time. Internally it is probably calling zmq_close on all associated sockets but it doesn't trigger a crash as far as I can see.

Sometimes it crashes with:

pure virtual method called
terminate called without an active exception
Abort trap

Sample code in Ruby.

require 'rubygems'
require 'ffi-rzmq'

# defined outside of the thread closures so they are visible globally
sock = nil
ctx = ZMQ::Context.new(1)

# must pass some traffic for the crash/hang/assert to show up
recv_thread = Thread.new do
  sock = ctx.socket(ZMQ::REP)
  sock.bind('tcp://127.0.0.1:2200')

  msg = sock.recv_string 
  puts "got msg [#{msg}]"
  sock.send_string 'goodbye'
  puts "sent"
  sock.recv_string # blocks forever
  puts "should never get here"
end

sock2 = ctx.socket(ZMQ::REQ)
sock2.connect('tcp://127.0.0.1:2200')

sock2.send_string 'hello'
sock2.recv_string

# when we get here, we know sock is blocked waiting for
# the next message

puts "closing socket"
sock.close
puts "closed"

chuckremes · 2010-08-28T19:24:06Z

If calling zmq_close on a socket owned by another thread is not safe, what is the suggested technique for cleaning up a socket from a finalizer?

Languages with garbage collection may reap objects containing sockets. It makes sense that the destructors/finalizers should call zmq_close on those sockets to clean them up. These reaper threads are always different from the thread where a socket was allocated.

sustrik · 2010-08-30T10:11:37Z

This will be (is) fixed by the socket migration feature in 0MQ/2.1 (see the trunk).
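
For releases without socket migration, one possible workaround is to never close a socket from a foreign thread at all: the outside thread only sets a stop flag, and the thread that created the socket notices the flag and closes the socket itself. The sketch below illustrates that pattern in plain Ruby under loud assumptions: StubSocket, Worker, deliver, and recv_nonblock are hypothetical stand-ins invented for illustration, not part of ffi-rzmq or 0MQ, and a real version would need a non-blocking or polled receive on the actual socket so the loop can observe the flag.

```ruby
# Hypothetical stand-in for a 0MQ socket; it records which thread closed it.
class StubSocket
  attr_reader :closed_by

  def initialize
    @inbox = Queue.new
    @closed_by = nil
  end

  # Simulates a peer delivering a message to this socket.
  def deliver(msg)
    @inbox << msg
  end

  # Non-blocking receive: returns nil when nothing is pending.
  def recv_nonblock
    @inbox.pop(true)
  rescue ThreadError
    nil
  end

  def close
    @closed_by = Thread.current
  end
end

# Hypothetical worker: creates, uses, and closes the socket on one thread.
class Worker
  attr_reader :socket, :thread, :received

  def initialize
    @stop = false
    @received = []
    ready = Queue.new
    @thread = Thread.new do
      @socket = StubSocket.new # created on the owning thread
      ready << true
      until @stop
        msg = @socket.recv_nonblock
        if msg
          @received << msg
        else
          sleep 0.01 # nothing pending; check the stop flag again shortly
        end
      end
      @socket.close # closed on the owning thread, never from outside
    end
    ready.pop # wait until the socket exists
  end

  # Called from any other thread: only sets the flag and waits.
  def shutdown
    @stop = true
    @thread.join
  end
end

worker = Worker.new
worker.socket.deliver('hello')
sleep 0.01 until worker.received.any?
worker.shutdown
puts worker.socket.closed_by == worker.thread
```

With this shape, a finalizer or any other outside thread would call shutdown rather than close, so the close always runs on the thread that owns the socket. The cost is that the owning thread must wake up periodically instead of blocking forever in recv.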

csrl pushed a commit to exosite-archive/zeromq2 that referenced this issue Dec 22, 2012

Merge pull request zeromq#53 from hurtonm/fix_issue_264

5d3b188

Fix issue zeromq#264

drahosp pushed a commit to LuaDist/libzmq that referenced this issue Feb 13, 2014

Merge pull request zeromq#53 from hintjens/master

f56c6fa

Fixed formatting on zmq_getsockopt.txt

LutzWeischerFujitsu mentioned this issue Mar 10, 2021

make test fails on AArch64, Fedora 33 #4157

Open

benjdero pushed a commit to benjdero/libzmq that referenced this issue Feb 20, 2023

Merge pull request zeromq#53 from vperron/victor

41cc2c6

Victor

This issue was closed.
