
Conversation

@gummif (Member) commented Jan 21, 2020

Solution: A shutdown_guard running in a thread
that calls shutdown() on a context until
notified to stop. This class makes terminating
a context in multi-threaded code safe and easy.

Solution: Add shutdown(). This function is required
for clean termination of the zmq context in multi-threaded
applications where sockets are used in threads, in particular
if blocking operations are used and sockets can be created
at any time.
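
For reference, `zmq_ctx_shutdown` already exists in libzmq; it unblocks pending socket operations with ETERM without destroying the context, so a later `close()` can complete. A minimal free-function sketch of the wrapper being proposed (assumed shape and error handling, not the PR's exact member function):

```cpp
#include <zmq.h>
#include <zmq.hpp>

// Sketch only (not the exact PR diff): shutdown() wraps
// zmq_ctx_shutdown, which causes blocking socket operations on the
// context to return ETERM but, unlike zmq_ctx_term, does not wait
// for open sockets to be closed.
void shutdown_context(zmq::context_t &ctx)
{
    int rc = zmq_ctx_shutdown(static_cast<void *>(ctx));
    if (rc != 0)
        throw zmq::error_t(); // assumption: the PR may assert instead
}
```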
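A hedged sketch of what such a shutdown_guard might look like (the member names `mtx`, `cv`, `iv`, and `do_stop` are inferred from the diff excerpt quoted later in this thread; the interval default is an assumption):

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <zmq.hpp>

// Sketch of the proposed shutdown_guard: a worker thread calls
// shutdown() on the context at a fixed interval until the destructor
// signals it to stop. shutdown() is the member added by this PR.
class shutdown_guard
{
  public:
    explicit shutdown_guard(zmq::context_t &ctx,
                            std::chrono::milliseconds iv =
                                std::chrono::milliseconds(100))
        : worker([this, &ctx, iv] {
              for (;;) {
                  ctx.shutdown(); // repeated on purpose, see review below
                  std::unique_lock<std::mutex> lock{mtx};
                  if (cv.wait_for(lock, iv, [this] { return do_stop; }))
                      return; // stop requested
              }
          })
    {
    }

    ~shutdown_guard()
    {
        {
            std::lock_guard<std::mutex> lock{mtx};
            do_stop = true;
        }
        cv.notify_one();
        worker.join();
    }

  private:
    std::mutex mtx;
    std::condition_variable cv;
    bool do_stop = false;
    std::thread worker; // declared last so mtx/cv/do_stop exist first
};
```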
@gummif (Member, Author) commented Jan 22, 2020

Looks like I have to make the tests a bit more robust with respect to race conditions.

@coveralls commented
Pull Request Test Coverage Report for Build 319

  • 21 of 23 (91.3%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.3%) to 86.813%

| File (missing coverage) | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| zmq.hpp | 4 | 5 | 80.0% |
| zmq_addon.hpp | 17 | 18 | 94.44% |

Totals (change from base Build 316: +0.3%):

  • Covered Lines: 711
  • Relevant Lines: 819

💛 - Coveralls

@gummif (Member, Author) commented Jan 23, 2020

The background on this is that in our code we were sometimes experiencing crashes and hangs. When the code was run with ThreadSanitizer, it pointed to a data race in close() in context_t. We added a timed mutex around the context to synchronize, which fixed most of the issues, but some data races were still occasionally reported. This appears to fix those remaining issues.

Not related to this PR, but I'm still getting a data race between a thread connecting to/receiving from a socket and a "ZMQbg/IO/1" thread. I need to investigate that further. EDIT: It looks like this has been investigated before (zeromq/libzmq#3309), so I will not worry about it for now.

```cpp
// (excerpt from shutdown_guard's worker loop)
std::unique_lock<std::mutex> lock{mtx};
if (cv.wait_for(lock, iv, [this] { return do_stop; }))
    return; // stop requested
} // closes the surrounding shutdown loop
```
A reviewer (Member) commented on this excerpt:
I somehow don't see the use case for this. Why does this need to be a loop that might call shutdown multiple times? When would a single call to shutdown not be sufficient?

@gummif (Member, Author) replied:
This might not be required if at least one socket is created before shutdown, see zeromq/libzmq#3792. A better workaround/hack might be to create a temporary socket in context_t::shutdown before calling zmq_ctx_shutdown?
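
A sketch of that workaround (an assumption for illustration, never merged; the loop in shutdown_guard would then be unnecessary):

```cpp
#include <zmq.h>
#include <zmq.hpp>

// Hack discussed above: creating and closing a throwaway socket
// guarantees the context has seen at least one socket, so
// zmq_ctx_shutdown takes effect even if the application has not
// created any sockets yet (see zeromq/libzmq#3792).
void shutdown_with_hack(zmq::context_t &ctx)
{
    {
        zmq::socket_t tmp(ctx, ZMQ_PUSH); // closed at scope exit
    }
    int rc = zmq_ctx_shutdown(static_cast<void *>(ctx));
    if (rc != 0)
        throw zmq::error_t();
}
```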

@gummif (Member, Author) commented Jan 26, 2020

Closing this since shutdown has been fixed in libzmq (zeromq/libzmq#3794).

@gummif closed this Jan 26, 2020