This repository was archived by the owner on Sep 30, 2022. It is now read-only.

@hjelmn hjelmn commented Jun 28, 2016

:bot:assign: @bosilca
:bot:label:bug
:bot:milestone:v2.0.0

hjelmn and others added 3 commits June 28, 2016 09:16
This commit fixes a race condition discovered by @artpol84. The race
happens when a signalling thread decrements the sync count to 0 then
goes to sleep. If the waiting thread runs and detects the count == 0
before going to sleep on the condition variable it will destroy the
condition variable while the signalling thread is potentially still
processing the completion. The fix is to add a non-atomic member to
the sync structure that indicates another thread is handling
completion. Since the member will only be set to false by the
initiating thread and the completing thread the variable does not need
to be protected. When destroying a condition variable the waiting
thread needs to wait until the signalling thread is finished.

Thanks to @artpol84 for tracking this down.

Fixes #1813

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

(cherry picked from open-mpi/ompi@fb455f0)
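
For readers following the commit message above, here is a minimal sketch of the pattern it describes, not the code from this PR. All names (`sketch_sync_t`, `sketch_sync_complete`, `sketch_sync_release`, and so on) are hypothetical. One deliberate departure: the commit describes a non-atomic member, while this sketch uses a C11 `atomic_bool` for the flag so the example stays within the C11 memory model. The key point is that the completing thread clears the flag as its very last access to the structure, and the waiting thread spins on that flag before destroying the condition variable.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sync structure mentioned in the commit. */
typedef struct {
    atomic_int      count;      /* outstanding completions */
    atomic_bool     signaling;  /* true while a completer may still touch us */
    pthread_mutex_t lock;
    pthread_cond_t  cond;
} sketch_sync_t;

static void sketch_sync_init(sketch_sync_t *s, int count)
{
    assert(count >= 0);
    atomic_init(&s->count, count);
    /* If there is nothing to wait for, nobody will ever signal. */
    atomic_init(&s->signaling, count != 0);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
}

/* Completing (signalling) side: called once per finished operation. */
static void sketch_sync_complete(sketch_sync_t *s)
{
    if (atomic_fetch_sub(&s->count, 1) != 1) {
        return;  /* other completions are still outstanding */
    }
    /* count is now 0: the waiter may already have noticed and woken up. */
    pthread_mutex_lock(&s->lock);
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
    /* Last access to *s by this thread; after this the waiter may destroy it. */
    atomic_store(&s->signaling, false);
}

/* Waiting side: block until every completion has been counted. */
static void sketch_sync_wait(sketch_sync_t *s)
{
    pthread_mutex_lock(&s->lock);
    while (atomic_load(&s->count) != 0) {
        pthread_cond_wait(&s->cond, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);
}

/* Waiting side: do not destroy until the completer has finished signalling. */
static void sketch_sync_release(sketch_sync_t *s)
{
    while (atomic_load(&s->signaling)) {
        sched_yield();
    }
    pthread_cond_destroy(&s->cond);
    pthread_mutex_destroy(&s->lock);
}

static void *sketch_completer(void *arg)
{
    sketch_sync_complete((sketch_sync_t *)arg);
    return NULL;
}

/* Runs one waiter/completer pair; returns the remaining count, -1 on error. */
int sketch_run_demo(void)
{
    sketch_sync_t s;  /* lives on the waiter's stack, as a short-lived sync would */
    pthread_t t;

    sketch_sync_init(&s, 1);
    if (pthread_create(&t, NULL, sketch_completer, &s) != 0) {
        pthread_cond_destroy(&s.cond);
        pthread_mutex_destroy(&s.lock);
        return -1;
    }
    sketch_sync_wait(&s);
    sketch_sync_release(&s);
    int remaining = atomic_load(&s.count);
    pthread_join(t, NULL);
    return remaining;
}
```

Build with `-pthread` and call `sketch_run_demo()`. Without the spin in `sketch_sync_release`, the waiter could see `count == 0` without ever sleeping and destroy the condition variable while the completer is still inside `pthread_cond_signal`, which is the race this commit describes.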

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from open-mpi/ompi@8d011ea)

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(request handling related)

(cherry picked from open-mpi/ompi@5417155)

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
@ompiteam-bot ompiteam-bot added this to the v2.0.0 milestone Jun 28, 2016
@mellanox-github

Test FAILed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1814/ for details.

@hppritcha
Member

@jsquyres please merge when you'd like

@jsquyres jsquyres merged commit 440f73f into open-mpi:v2.x Jun 28, 2016