clh: isClhRunning waits for full timeout when clh exits
isClhRunning uses signal 0 to test whether the process is
still alive. This doesn't work because the process is a
direct child of the shim: once it dies, it becomes a zombie.
Since no one waits for it, the zombie lingers until its parent
dies and init reaps it. Hence sending signal 0 in isClhRunning
always succeeds, whether the process is dead or not.

This patch calls wait4 with WNOHANG to reap the process. If that
succeeds, the process was our child and has exited; if not, we
fall back to sending the signal.

Fixes: kata-containers#9431

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
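
The behaviour described in the commit message can be reproduced outside of Kata with a minimal sketch like the one below. It assumes a Linux host with a `true` binary on the PATH; the helper names `signalSaysAlive`, `reapIfExited` and `spawnZombie` are hypothetical and do not exist in the Kata code base. The child is started and deliberately never waited on, so it turns into a zombie once it exits.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// signalSaysAlive reports whether kill(pid, 0) succeeds. It also returns true
// for a zombie, because a zombie still occupies a process-table entry.
func signalSaysAlive(pid int) bool {
	return syscall.Kill(pid, syscall.Signal(0)) == nil
}

// reapIfExited performs a non-blocking wait on pid. It returns true only when
// pid is our child and has exited, in which case the call has just reaped it.
func reapIfExited(pid int) bool {
	waitedPid, err := syscall.Wait4(pid, nil, syscall.WNOHANG, nil)
	return err == nil && waitedPid == pid
}

// spawnZombie starts a short-lived child and never waits on it, leaving a
// zombie behind once the child has exited.
func spawnZombie() (int, error) {
	cmd := exec.Command("true") // assumption: `true` is available on PATH
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	// Give the child time to exit; a fixed sleep keeps the sketch simple.
	time.Sleep(500 * time.Millisecond)
	return cmd.Process.Pid, nil
}

func main() {
	pid, err := spawnZombie()
	if err != nil {
		fmt.Println("failed to start child:", err)
		return
	}
	fmt.Println("signal 0 succeeds on zombie:", signalSaysAlive(pid))
	fmt.Println("wait4 reaped the child:     ", reapIfExited(pid))
	fmt.Println("signal 0 succeeds after reap:", signalSaysAlive(pid))
}
```

The first probe should succeed even though the child has already exited, which is the false positive the commit describes. Once the non-blocking wait has reaped the child, the signal probe is expected to fail.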
alex-matei authored and Redent0r committed Apr 23, 2024
1 parent b4c814c commit c48e1bc
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion src/runtime/virtcontainers/clh.go
@@ -1536,7 +1536,12 @@ func (clh *cloudHypervisor) isClhRunning(timeout uint) (bool, error) {
 	timeStart := time.Now()
 	cl := clh.client()
 	for {
-		err := syscall.Kill(pid, syscall.Signal(0))
+		waitedPid, err := syscall.Wait4(pid, nil, syscall.WNOHANG, nil)
+		if waitedPid == pid && err == nil {
+			return false, nil
+		}
+
+		err = syscall.Kill(pid, syscall.Signal(0))
 		if err != nil {
 			return false, nil
 		}