agnhost: fix sigterm shutdown #110212
@aojea: This issue is currently awaiting triage.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: aojea
https://storage.googleapis.com/k8s-triage/index.html?sig=network#7ccacb7807a6b800df8c Tests that use the healthz endpoint for readiness started to fail. The change broke the previous behavior: the process never exits on SIGTERM.
os.Exit(0)
}()
}
}
// SIGTERM Exit Code 143
os.Exit(143)
I have doubts about the exit codes. I've verified that, before this change, sending a SIGTERM resulted in exit code 143:
$ ./agnhost netexec
I0525 12:19:24.160983 1873100 log.go:184] Started HTTP server on port 8080
I0525 12:19:24.161146 1873100 log.go:184] Started UDP server on port 8081
Terminated
$ echo $?
143
If that is correct, the previous exit code of 0 was wrong.
My Google search says that the right way is to set it to 0.
For reference, see this comment from @rphillips: https://bugzilla.redhat.com/show_bug.cgi?id=1395663#c10
f2c911e to 2b82216
/lgtm
/test pull-kubernetes-unit
/kind bug
/kind failing-test
/kind flake
/kind regression
When adding the new readyz handler to the agnhost netexec functionality in #110174, we forgot to handle the case where a SIGTERM is received but no shutdown delay is set. This causes the process to hang forever without exiting, which is making some tests fail: https://storage.googleapis.com/k8s-triage/index.html?sig=network#7ccacb7807a6b800df8c
Since we now always handle the SIGTERM signal, just exit immediately if no shutdown-delay option is set.
How to repro
In another terminal, send a SIGTERM to the process:
kill -SIGTERM
The process never dies.