Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for deadlocks brought out by the GroupPresentation test. #2886

Merged
merged 4 commits into from Aug 7, 2021

Conversation

pmesnier
Copy link
Contributor

@pmesnier pmesnier commented Aug 1, 2021

This PR focuses on two potential deadlock scenarios, both in the subscriber impl code, the function to get a list of current data readers and a function to notify datareaders. The problem is that multiple threads are passing through these functions and must mediate the sharing of resources using locks. However there is a lock on the SubscriberImpl object that is grabbed a the start of these functions, then during processing a second lock on the DataReaderImpl object is repeatedly taken while evaluating individual data readers. This triggers a deadlock when a thread is holding the DataReaderImpl instance lock and waiting for the SubscriberImpl lock, but another thread is holding the SI lock and wants the particular DataReaderImpl instance lock currently held by the first thread.
This solution is to make a local copy of the data reader collection to be searched then release the SI lock before iterating the data reader collection. This avoids the deadlock condition.

@ClaytonCalabrese
Copy link
Contributor

Has anyone seen the tests/DCPS/LivelinessTimeout/run_test.pl rtps_disc failure on test_m10_o1d0_sec before, or could that have been introduced by this PR?

dds/DCPS/SubscriberImpl.cpp Outdated Show resolved Hide resolved
dds/DCPS/SubscriberImpl.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@simpsont-oci simpsont-oci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be curious what ThreadSanitizer reports for the GroupPresentation tests, but that may need to come later...

@pmesnier pmesnier changed the title possible fixes to the group presentation deadlocks Fix for deadlocks brought out by the GroupPresentation test. Aug 4, 2021
@mitza-oci
Copy link
Member

These test failures in GitHub actions seem to be new to this PR:

tests/security/attributes/run_test.pl Partitions_Reordered
tools/modeling/tests/MultiTopic/run_test.pl

@mitza-oci mitza-oci merged commit 09bf478 into master Aug 7, 2021
@mitza-oci mitza-oci deleted the gpdeadlock branch August 7, 2021 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants