Fix SIGCHLD race in signal handler setup #5206

Closed
bitoku wants to merge 1 commit into opencontainers:main from bitoku:fix-sigchld-race

Conversation


bitoku commented Mar 31, 2026

Add an initial reap in forward() before entering the signal loop.
Since signal.Notify runs in a goroutine, a fast-exiting process
could trigger SIGCHLD before signal registration completes, causing
the signal to be silently discarded. This left forward() blocking
forever on a signal that would never arrive.

By calling reap() once before the loop, we catch any process that
already exited during the setup window. If the process is still
running, the reap is a no-op and SIGCHLD arrives normally via the
registered signal handler.

Fix: #5208

@lifubang
Member

@bitoku Thanks for the report! You're absolutely right -- there’s clearly a race condition here. I should have caught this when I opened test PR #5197.

That said, I believe the direction of your patch may be misguided. As proposed, it would undo the performance improvement introduced in #4654.

Instead, I think we should check whether the container process is still alive -- something along these lines:

+++ b/signals.go
@@ -70,6 +70,21 @@ func (h *signalHandler) forward(process *libcontainer.Process, tty *tty, detach
                return -1, err
        }
 
+       // We need to check if the process is still alive before we start forwarding signals,
+       // otherwise we might miss the exit of the process and wait indefinitely for a SIGCHLD
+       // that will never come. This can happen if the process exits before we start forwarding
+       // signals, which can happen in cases where the process is very short lived or if there
+       // is a race between the process exiting and us starting to forward signals. By sending
+       // a signal 0, we can check if the process is still alive without actually sending a
+       // signal to it. If the process is not alive, we can return immediately with its exit
+       // status.
+       if err := process.Signal(unix.Signal(0)); err != nil {
+               if state, err := process.Wait(); err == nil {
+                       return state.ExitCode(), nil
+               }
+               return -1, err
+       }
+
        if h.notifySocket != nil {
                if detach {
                        _ = h.notifySocket.run(pid1)

@lifubang
Member

By the way, it would be great if you could add an integration test to cover this regression.

lifubang added the backport/1.4-todo, backport/1.5-todo, and backport/1.3-todo labels (PRs in the main branch which need to be backported to release-1.4, release-1.5, and release-1.3) on Mar 31, 2026
bitoku force-pushed the fix-sigchld-race branch 2 times, most recently from a10eb07 to 7ebfd5b on March 31, 2026 11:36
@bitoku
Author

bitoku commented Mar 31, 2026

@lifubang
Thank you!
OK, so it offsets the cost by overlapping with runc init. I didn't check your PR, sorry.

I don't think the suggested approach works, because kill -0 on a zombie process still succeeds.
Instead I reused h.reap. It does a non-blocking wait and performs the same handling as when it receives SIGCHLD.
Let me know how it looks.

By the way, it would be great if you could add an integration test to cover this regression.

I added one, though it passes even without the fix.
I encountered this issue in a really busy environment, but it's hard to reproduce in CI, I guess.
Do we want to keep the integration test, or should I remove it?

@lifubang

This comment was marked as outdated.

@bitoku

This comment was marked as outdated.

Add an initial reap in forward() before entering the signal loop.
Since signal.Notify runs in a goroutine, a fast-exiting process
could trigger SIGCHLD before signal registration completes, causing
the signal to be silently discarded. This left forward() blocking
forever on a signal that would never arrive.

By calling reap() once before the loop, we catch any process that
already exited during the setup window. If the process is still
running, the reap is a no-op and SIGCHLD arrives normally via the
registered signal handler.

Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
bitoku force-pushed the fix-sigchld-race branch from 44fe5e2 to e64a48e on March 31, 2026 13:11
@lifubang
Member

Thanks for the update. Could you please verify if this patch resolves your issue? I believe it offers a more robust solution:

diff --git a/signals.go b/signals.go
index 936d751f..649d3bcf 100644
--- a/signals.go
+++ b/signals.go
@@ -26,14 +26,18 @@ func newSignalHandler(enableSubreaper bool, notifySocket *notifySocket) chan *si
                }
        }
        handler := make(chan *signalHandler)
+
+       // ensure that we have a large buffer size so that we do not miss any
+       // signals in case we are not processing them fast enough.
+       s := make(chan os.Signal, signalBufferSize)
+
        // signal.Notify is actually quite expensive, as it has to configure the
        // signal mask and add signal handlers for all signals (all ~65 of them).
        // So, defer this to a background thread while doing the rest of the io/tty
        // setup.
+       // But `SIGCHLD` is very important, so we need to handle it immediately.
+       signal.Notify(s, unix.SIGCHLD)
        go func() {
-               // ensure that we have a large buffer size so that we do not miss any
-               // signals in case we are not processing them fast enough.
-               s := make(chan os.Signal, signalBufferSize)
                // handle all signals for the process.
                signal.Notify(s)
                handler <- &signalHandler{

As I mentioned in the linked comment, I don't think adding a BATS test is necessary. Since I couldn't reproduce the issue even after 100,000 iterations, writing a test for it would be futile. It simply won't trigger in a realistic timeframe.

@kolyshkin
Contributor

Yes, I think this is the simplest possible fix. I tested it by adding time.Sleep(10 * time.Millisecond) to the beginning of the go func() in newSignalHandler, which allows me to reproduce the issue 100% of the time on my machine.

@lifubang can you please open a PR (and acknowledge @bitoku's contribution via Reported-by: Ayato Tokubi <atokubi@redhat.com>)?

@kolyshkin
Contributor

Yet another way to fix this would be something like this:

diff --git a/utils_linux.go b/utils_linux.go
index d7f51e20..a5b0a630 100644
--- a/utils_linux.go
+++ b/utils_linux.go
@@ -299,6 +299,7 @@ func (r *runner) run(config *specs.Process) (int, error) {
                }
        }
        handler := <-handlerCh
+       handler.signals <- unix.SIGCHLD
        status, err := handler.forward(process, tty, detach)
        if err != nil {
                r.terminate(process)

This should have the same/similar effect as the fix in this PR but I like this approach better.

PS this reminds me we should try to finish the review of #4661.

@lifubang
Member

lifubang commented Apr 1, 2026

@lifubang can you please open a PR (and acknowledge @bitoku's contribution via Reported-by: Ayato Tokubi <atokubi@redhat.com>)?

Thanks @bitoku, opened #5210, PTAL.

@bitoku
Author

bitoku commented Apr 1, 2026

closing in favor of #5210

bitoku closed this Apr 1, 2026

Labels

backport/1.3-todo: A PR in main branch which needs to be backported to release-1.3
backport/1.4-todo: A PR in main branch which needs to be backported to release-1.4
backport/1.5-todo: A PR in main branch which needs to be backported to release-1.5
regression


Development

Successfully merging this pull request may close these issues.

a SIGCHLD race in signal handler setup (regression since runc 1.3.0)

3 participants