Problem: Terminating ctx safely is hard #378
Solution: Add shutdown(). This function is required for clean termination of the zmq context in multi-threaded applications where sockets are used in threads, in particular when blocking operations are used and sockets can be created at any time.
Solution: Add a shutdown_guard that runs in a thread and repeatedly calls shutdown() on a context until notified to stop. This class makes terminating a context in multi-threaded code safe and easy.
Looks like I have to make the tests a bit more robust with respect to race conditions.
The background on this is that in our code we were sometimes experiencing crashes and hangs. When the code was run with ThreadSanitizer, it pointed to a data race in close() in context_t. We added a timed mutex around the context to synchronize access, and that fixed most of the issues, but some data races were still occasionally reported. This PR appears to fix those remaining issues. Unrelated to this PR, I'm still getting a data race between a thread connecting to/receiving from a socket and a "ZMQbg/IO/1" thread. I need to investigate that further. EDIT: It looks like this has been investigated before (zeromq/libzmq#3309), so I will not worry about it for now.
std::unique_lock<std::mutex> lock{mtx};
if (cv.wait_for(lock, iv, [this]{ return do_stop; }))
    return; // stop requested
}
I don't quite see the use case for this. Why does this need to be a loop that might call shutdown multiple times? When would a single call to shutdown not be sufficient?
This might not be required if at least one socket is created before shutdown; see zeromq/libzmq#3792. Maybe a better workaround/hack is to create a temporary socket in context_t::shutdown before calling zmq_ctx_shutdown?
Closing this since shutdown has been fixed in libzmq (zeromq/libzmq#3794).