server: Fix PMIx_server_finalize hang
The hang is quite rare and results from a race condition between the
PMIx progress thread and the main thread calling
PMIx_server_finalize.

The following sequence is possible:

| main thread   | Progress thread   |
|               | while(ev_active){ |
| ev_active=0   |                   |
| ev_break_loop |                   |
|               | ev_loop()         |

According to the libevent manual, libevent ignores ev_break_loop in
this situation, because the progress thread was not inside the loop
when ev_break_loop() was called (see (b) in the libevent excerpt below).
The progress thread then enters the loop and hangs.

To fix this, use event_base_loopexit, which has the desired behavior
(see (a) in the excerpt below).

**excerpt from the libevent manual**:
```
...
Note also that event_base_loopexit(base,NULL) and event_base_loopbreak(base)
act differently when no event loop is running:

(a) loopexit schedules the next instance of the event loop to stop right
after the next round of callbacks are run (as if it had been invoked with
EVLOOP_ONCE)

(b) whereas loopbreak only stops a currently running loop, and has no
effect if the event loop isn’t running.
...
```
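
The difference between (a) and (b) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `toy_base_t`, `toy_loopbreak`, `toy_loopexit`, `toy_loop` and `run_scenario` are hypothetical names invented for illustration, not libevent or PMIx APIs, and the model is single-threaded, replaying the interleaving from the table above in a fixed order. The `max_rounds` bound stands in for a loop that would otherwise block forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of an event base. This is NOT libevent's real
 * data structure; it only mirrors the documented (a)/(b) behavior. */
typedef struct {
    bool break_requested; /* set by toy_loopbreak(), cleared on loop entry */
    bool exit_scheduled;  /* set by toy_loopexit(), kept until a loop sees it */
} toy_base_t;

/* Models (b): the request is forgotten if no loop is running, because
 * toy_loop() clears the flag when it starts. */
static void toy_loopbreak(toy_base_t *b) { b->break_requested = true; }

/* Models (a): the request stays pending for the next loop instance. */
static void toy_loopexit(toy_base_t *b) { b->exit_scheduled = true; }

/* Runs rounds of "callbacks" until asked to stop. A real loop would block
 * indefinitely; max_rounds bounds the sketch so it always terminates.
 * Returns the number of rounds executed. */
static int toy_loop(toy_base_t *b, int max_rounds)
{
    assert(max_rounds > 0);
    int rounds = 0;
    b->break_requested = false;        /* stale break requests are dropped */
    while (rounds < max_rounds) {
        rounds++;                      /* one round of callbacks */
        if (b->break_requested || b->exit_scheduled) {
            break;
        }
    }
    b->exit_scheduled = false;         /* the exit request is consumed */
    return rounds;
}

/* Replays the interleaving from the table: the main thread issues its stop
 * request after the progress thread has checked ev_active but before it
 * enters the loop. */
static int run_scenario(bool use_loopexit, int max_rounds)
{
    toy_base_t base = { false, false };
    bool ev_active = true;
    int rounds = 0;

    if (ev_active) {                   /* progress thread: while(ev_active) */
        ev_active = false;             /* main thread: ev_active = 0 */
        if (use_loopexit) {
            toy_loopexit(&base);
        } else {
            toy_loopbreak(&base);
        }
        rounds = toy_loop(&base, max_rounds); /* progress thread: ev_loop() */
    }
    return rounds;
}
```

In this model, `run_scenario(false, 5)` runs all 5 rounds (the stand-in for the hang), while `run_scenario(true, 5)` stops after a single round.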

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
artpol84 committed May 13, 2019
1 parent 294eae0 commit f652ae1
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion src/runtime/pmix_progress_threads.c
@@ -3,6 +3,8 @@
  * Copyright (c) 2015 Cisco Systems, Inc. All rights reserved.
  * Copyright (c) 2017-2019 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
+ * Copyright (c) 2019 Mellanox Technologies, Inc.
+ * All rights reserved.
  * $COPYRIGHT$
  *
  * Additional copyrights may follow
@@ -239,7 +241,7 @@ static void stop_progress_engine(pmix_progress_tracker_t *trk)
     trk->ev_active = false;
     /* break the event loop - this will cause the loop to exit upon
        completion of any current event */
-    pmix_event_base_loopbreak(trk->ev_base);
+    pmix_event_base_loopexit(trk->ev_base);

     pmix_thread_join(&trk->engine, NULL);
 }