Do not send redundant SIGQUITs #2073

brsakai-csco · 2023-06-09T13:20:06Z

If the worker reports itself as shutting down, do not second another SIGQUIT to the worker. It already knows it is shutting down, and this can cause a race with signal handlers that triggers coredumps.

Summary

Full writeup in #2046
The short version:

When a worker receives a SIGQUIT, it begins to shut down gracefully.
During graceful shutdown, the worker notifies the manager that it is shutting down.
If the manager sent the initial SIGQUIT, it already knows the worker is shutting down and takes no action.
If the manager did not send the initial SIGQUIT, it marks the worker for graceful shutdown, which causes the manager to send a second SIGQUIT.
This second SIGQUIT can arrive after Perl has already de-registered the QUIT signal handler, so the worker performs the default QUIT action (core dump).

This patch increments the worker's quit state in the manager whenever the manager learns about the graceful shutdown from the worker itself, so that the manager does not send a second SIGQUIT to that worker.
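The bookkeeping described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the code in Prefork.pm: the function names, the `graceful` and `quit` fields, and the `@sent` log are assumptions made for illustration, and no real signals are sent.

```perl
use strict;
use warnings;

# Hypothetical, simplified model of the manager's per-worker state.
# Names are illustrative and are not copied from Prefork.pm.
my @sent;    # records the signals the manager would have sent

sub new_worker { return {graceful => 0, quit => 0} }

# The manager itself decides that the worker should stop.
sub manager_requests_shutdown { $_[0]{graceful} = 1 }

# The worker reported that it is already shutting down.
sub worker_reported_shutdown {
  my ($w) = @_;
  $w->{graceful} = 1;
  $w->{quit}++;    # idea of the patch: treat QUIT as already delivered
}

# One pass of the manager loop: send QUIT at most once per worker.
sub manage {
  my ($pid, $w) = @_;
  push @sent, "QUIT:$pid" if $w->{graceful} && !$w->{quit}++;
}

my %pool = (100 => new_worker(), 200 => new_worker());
manager_requests_shutdown($pool{100});    # manager-initiated shutdown
worker_reported_shutdown($pool{200});     # worker-initiated shutdown

# Two passes: worker 100 gets exactly one QUIT, worker 200 gets none.
manage($_, $pool{$_}) for sort keys %pool;
manage($_, $pool{$_}) for sort keys %pool;

print "@sent\n";    # prints "QUIT:100"
```

Without the increment in `worker_reported_shutdown`, the first pass would also record `QUIT:200`, which corresponds to the redundant signal this change avoids.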

Motivation

This behavior was causing coredumps in our application, since we were using SIGQUIT to gracefully end workers.
I understand from #1883/#1449 that there are also other ways for the worker to initiate graceful shutdown, which can trigger the same race condition.

References

Tested via the test app uploaded to #2046
Attempts to fix the same issue seen in #1883

lib/Mojo/Server/Prefork.pm

marcusramberg · 2023-06-09T13:34:06Z

I like this fix better, except for the if duplication/perltidy.

kraih · 2023-06-09T17:25:36Z

Please squash your commits.

brsakai-csco · 2023-06-09T17:34:58Z

@kraih Done, I think. Let me know if that doesn't look how you expect

kraih · 2023-06-09T17:51:27Z

Is "do not second another SIGQUIT to the worker" (from the commit message) correct English?

If the worker reports itself as shutting down, do not send a SIGQUIT to the worker. It already knows it is shutting down, and this can cause a race with signal handlers that can trigger core dumps.

brsakai-csco · 2023-06-09T18:02:01Z

I didn't even see that. Must've switched the words in my head without realizing. Something like "do not send a second" > "do not send another" > "do not second another".

Updated the commit message 👍

jixam · 2023-06-09T20:37:54Z

I like this 👍

For completeness, just a note that this does not remove the race condition. It is still possible to hit it by sending two SIGQUIT signals close to each other or by sending a single signal when a worker is already shutting down.

To fully fix the race, $SIG{QUIT} = 'IGNORE'; is needed before exit 0; in the worker. However, this is not trivial to do because the handler is localized.
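As a rough illustration of the signal-disposition idea only, here is a minimal standalone sketch. It is not a patch for Prefork.pm: `run_worker` and `$finished` are hypothetical names, the process signals itself instead of being signalled by a manager, and it does not address the difficulty of placing the assignment in the real worker code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a worker's lifetime within a single process.
sub run_worker {
  my $finished = 0;
  {
    # Localized handler: only in effect until the end of this block
    local $SIG{QUIT} = sub { $finished = 1 };
    kill 'QUIT', $$;            # first SIGQUIT requests graceful shutdown
    sleep 1 until $finished;    # wait until the handler has run
  }

  # The localized handler is gone here, so the previous disposition
  # (by default: terminate with a core dump) is back in effect.
  # Ignoring QUIT closes that window before the process exits.
  $SIG{QUIT} = 'IGNORE';
  kill 'QUIT', $$;              # a late SIGQUIT is now discarded

  return $finished;
}

print "worker survived a late SIGQUIT\n" if run_worker();
```

In a real worker the `exit 0;` would follow the `IGNORE` assignment; the sketch returns instead so that it can be run and checked as a plain script.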

kraih reviewed Jun 9, 2023

kraih requested review from a team, kraih, jhthorsen and christopherraa June 9, 2023 13:22

jixam mentioned this pull request Jun 9, 2023

Fix potential core dump during worker process shutdown #1883

Closed

brsakai-csco force-pushed the patch-1 branch from dee5146 to a0cfd85 June 9, 2023 17:33

Do not send redundant SIGQUITs

e39dc9f

If the worker reports itself as shutting down, do not send a SIGQUIT to the worker. It already knows it is shutting down, and this can cause a race with signal handlers that can trigger core dumps.

brsakai-csco force-pushed the patch-1 branch from a0cfd85 to e39dc9f June 9, 2023 18:00

kraih approved these changes Jun 9, 2023

marcusramberg approved these changes Jun 9, 2023

mergify bot merged commit 14d2875 into mojolicious:main Jun 9, 2023
10 checks passed

brsakai-csco deleted the patch-1 branch June 13, 2023 11:59
