Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clh: isClhRunning waits for full timeout when clh exits #182

Merged
merged 3 commits into from
May 2, 2024

Conversation

Redent0r
Copy link

@Redent0r Redent0r commented Apr 23, 2024

Merge Checklist
  • Followed patch format from upstream recommendation: https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md#patch-format
    • Included a single commit in a given PR - at least unless there are related commits and each makes sense as a change on its own.
  • Aware about the PR to be merged using "create a merge commit" rather than "squash and merge" (or similar)
  • genPolicy only: Ensured the tool still builds on Windows
  • genPolicy only: Updated sample YAMLs' policy annotations, if applicable
  • The upstream-missing label (or upstream-not-needed) has been set on the PR.
Summary

Cherry picked from upstream kata-containers@5492316

isClhRunning uses signal 0 to test whether the process is still alive or not. This doesn't work because the process is a direct child of the shim. Once it is dead the process becomes zombie.
Since no one waits for it the process lingers until its parent dies and init reaps it. Hence sending signal 0 in isClhRunning will always return success whether the process is dead or not.
This patch calls wait to reap the process, if it succeeds that means it is our child process, if not we send the signal.

Fixes: kata-containers#9431

Test Methodology

https://dev.azure.com/mariner-org/mariner/_build/results?buildId=557064&view=results

@Redent0r Redent0r added the upstream/merged PRs that have been merged upstream label Apr 23, 2024
The PID needs to be initialized before calling isClhRunning.
waitVMM() uses isClhRunning and is called by launchClh() just
before returning from function.

Fixes: kata-containers#9230

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
isClhRunning uses signal 0 to test whether the process is
still alive or not. This doesn't work because the process is a
direct child of the shim. Once it is dead the process becomes
zombie.
Since no one waits for it the process lingers until
its parent dies and init reaps it. Hence sending signal 0 in
isClhRunning will always return success whether the process is
dead or not.
This patch calls wait to reap the process, if it succeeds that
means it is our child process, if not we send the signal.

Fixes: kata-containers#9431

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
@Redent0r Redent0r marked this pull request as ready for review May 1, 2024 18:14
@Redent0r Redent0r requested review from a team as code owners May 1, 2024 18:14
@Redent0r Redent0r merged commit 597200d into msft-main May 2, 2024
42 of 54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
upstream/merged PRs that have been merged upstream
Projects
None yet
4 participants