[docker-wait-any]: Exit worker thread if main thread is expected to exit (#12255)

There's an odd crash that intermittently happens after the teamd container
exits, and a signal is raised to the main thread to exit. This thread (watching
teamd) continues execution because it's in a `while True`. The subsequent wait
call on the teamd container very likely returns immediately, and it calls
`is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these
calls, there is sometimes a crash in the transition from C code back to Python
code (after the function has executed). Python sees that this thread got a signal
to exit, because the main thread is exiting, and tells pthread to exit the
thread.  However, during the stack unwinding, _something_ is telling the
unwinder to call `std::terminate`.  The reason is unknown.

This then results in a python3 SIGABRT, and systemd then doesn't call the stop
script to actually stop the container (possibly because the main process exited
with a SIGABRT, so it's a hard crash). This means that the container doesn't
actually get stopped or restarted, resulting in an inconsistent state
afterwards.

The workaround appears to be that once this thread has signalled the main
thread to exit, it just returns here instead of continuing execution. This at
least tries to keep the thread out of the problematic code path. However, it's
still possible to get a SIGABRT, depending on thread/process timings (e.g.
teamd exits and signals the main thread to exit, then syncd exits and the
thread watching syncd calls one of the two C functions, potentially hitting
the issue).

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

saiarcot895 committed Oct 6, 2022
1 parent 3686454 commit 9251d4b
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions files/image_config/misc/docker-wait-any
@@ -61,6 +61,7 @@ def wait_for_container(docker_client, container_name):

         # Signal the main thread to exit
         g_thread_exit_event.set()
+        return


 def main():
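
The effect of the added `return` can be illustrated with a minimal, self-contained sketch. This is not the actual `docker-wait-any` script: the `wait_fn` and `should_keep_watching` parameters, the `calls` list, and `run_demo` are hypothetical stand-ins for the blocking container wait and the warm-restart/fast-reboot checks, and the way the main thread waits on the event is an assumption. Only `g_thread_exit_event`, `wait_for_container`, and the set-then-return step come from the diff above.

```python
import threading

# Module-level event, named as in the diff; everything else is illustrative.
g_thread_exit_event = threading.Event()


def wait_for_container(wait_fn, should_keep_watching, container_name, calls):
    """Watch one container and return once the main thread was told to exit."""
    while True:
        wait_fn(container_name)        # hypothetical stand-in for the blocking wait
        calls.append(container_name)   # record each pass through the loop

        # Hypothetical stand-in for the warm-restart / fast-reboot checks.
        if should_keep_watching(container_name):
            continue

        # Signal the main thread to exit
        g_thread_exit_event.set()
        # Leave the loop; without this the thread would wait and check again
        # while the main thread is already shutting down.
        return


def run_demo():
    calls = []
    worker = threading.Thread(
        target=wait_for_container,
        args=(lambda name: None, lambda name: False, "teamd", calls),
        daemon=True,
    )
    worker.start()
    signalled = g_thread_exit_event.wait(timeout=5)  # assumed main-thread wait
    worker.join(timeout=5)
    return signalled, worker.is_alive(), calls
```

In this sketch the worker makes exactly one pass through the loop and then ends, so the checks are not re-entered after the exit signal. As the commit message notes, this narrows the window rather than closing it, since another watcher thread can still be inside one of the checks when the main thread exits.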