Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix deadlock #79

Merged
merged 1 commit into from
Mar 17, 2015
Merged

fix deadlock #79

merged 1 commit into from
Mar 17, 2015

Conversation

vrabaud
Copy link
Member

@vrabaud vrabaud commented Feb 1, 2015

THAT WAS NOT FUN ! I SPENT A DAY ON THAT !
But it makes navigation work with tf2 so it's ok :)

Here is what's happening in some navigation tests I am switching to tf2.
In buffer_core.cpp, line 1298, a callback is being called in
testTransformableRequests after locking transformable_requests_mutex_
line 1256. This is actually defined through a message_filter. It is a
transformable that is being executed: it therefore locks
messages_mutex_ line 449 in message_filter.h. At the same time, a
message is added and the same mutex is locked line 308 in add but
line 321, a transformable is also added to the list in buffer_core
which calls a lock on transformable_requests_mutex_ line 1164 in
addTransformableRequest. Hence a deadlock.

Here is what's happening in some navigation tests I am switching to tf2.
In buffer_core.cpp, line 1298, a callback is being called in
testTransformableRequests after locking transformable_requests_mutex_
line 1256. This is actually defined through a message_filter. It is a
transformable that is being executed: it therefore locks
messages_mutex_ line 449 in message_filter.h. At the same time, a
message is added and the same mutex is locked line 308 in add but
line 321, a transformable is also added to the list in buffer_core
which calls a lock on transformable_requests_mutex_ line 1164 in
addTransformableRequest. Hence a deadlock.
@vrabaud
Copy link
Member Author

vrabaud commented Feb 1, 2015

please merge after review and I'll then finish up #76

@vrabaud
Copy link
Member Author

vrabaud commented Feb 2, 2015

and I might have found another one so please hold on. My fix might not be a full fix.

@vrabaud
Copy link
Member Author

vrabaud commented Feb 2, 2015

ok, the bug was in navigation, feel free to merge this one.

@mikeferguson
Copy link

A bug in navigation? no... that never happens. :)

@vrabaud
Copy link
Member Author

vrabaud commented Feb 2, 2015

actually, not a bug maybe. I opened up #80

@tfoote
Copy link
Member

tfoote commented Mar 17, 2015

This does look to break the deadlock on the message_mutex_ however that was both protecting the message list as well as the target_frames_ list. With the change as proposed the iterator over the target_frames is unprotected.

I'm looking at extending the target_frames_strings_ mutex to cover the list as well, but I think it is actually the issue, since it's interated both in the transformable method as well as in the add.

tfoote added a commit that referenced this pull request Mar 17, 2015
@tfoote
Copy link
Member

tfoote commented Mar 17, 2015

@vrabaud can you look at 10370e5 and possibly test. I think it keeps avoiding the deadlock, but keeps the target_frames_ iterators safe as a follow on to this patch.

@tfoote tfoote added in progress and removed ready labels Mar 17, 2015
@vrabaud
Copy link
Member Author

vrabaud commented Mar 17, 2015

ok, @tfoote I approve your fix and most of all, it works for navigation :)

@mikeferguson
Copy link

Yay!

@tfoote tfoote merged commit 1f77209 into ros:indigo-devel Mar 17, 2015
@bricerebsamen bricerebsamen mentioned this pull request Apr 2, 2015
@vrabaud vrabaud deleted the deadlock_fix branch May 11, 2015 06:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants