add readyz handling to netexec #110174
Conversation
@deads2k: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
```go
mux.HandleFunc("/", rootHandler)
mux.HandleFunc("/clientip", clientIPHandler)
mux.HandleFunc("/header", headerHandler)
mux.HandleFunc("/dial", dialHandler)
mux.HandleFunc("/echo", echoHandler)
mux.HandleFunc("/exit", func(w http.ResponseWriter, req *http.Request) { exitHandler(w, req, exitCh) })
mux.HandleFunc("/healthz", healthzHandler)
mux.HandleFunc("/readyz", readyzHandler(termCh))
```
Can you add a description of the new handler? See L93 for an example:
https://github.com/kubernetes/kubernetes/pull/110174/files#diff-f26ab21963f05c932bece947c19bcff504d1f6a4d53e4e33f2029908126bd12aR93
```go
		return
	default:
		if serverReady.get() {
```
If we set the serverReady.get() variable to unready based on the term signal, will that not work for both the healthz and readyz handlers and simplify the code here?
> if we set the serverReady.get() variable as unready based on the term signal, will that not work for both healthz and readyz handler and simplify the code here?
No, we want to remain healthy during something like termination, but we also want readyz to be false. So healthz and readyz will have different values. The term signal should turn readyz=false and leave healthz=true.
/assign @aojea

@andrewsykim hmm, this handler was added to be able to test the "termination endpoints" functionality IIRC. I need to check carefully if this is going to break it. The problem is that changes to the images are not tested directly in CI; the images have to be promoted and the k/k code updated with the new image so that CI picks them up... There is a trick to build the image locally, publish it to a personal registry, and add a temporary commit to test the new image in CI.

/hold
There are more changes needed for the image, like updating the version. The whole process is documented here: https://github.com/kubernetes/kubernetes/tree/master/test/images#updating-the-tests-images

A trick for testing your changes in CI is to build your custom image and publish it to a personal repo, for example:

```shell
cd test/images
REGISTRY=quay.io/aojea make all-push WHAT=agnhost
```

and then "hijack" the image by replacing it in the manifest used to configure the e2e registries (kubernetes/test/utils/image/manifest.go, line 236 at aa49dff), or just add the image URL directly to the tests.
Updated for version and comments.
Attempting proof in #110176
/retest

Tests with the new image pass. @deads2k feel free to unhold; remember you still need 2 additional steps: https://github.com/kubernetes/kubernetes/tree/master/test/images#promoting-images
Acknowledged.

/hold cancel
Next step here: kubernetes/k8s.io#3759
```go
	close(sigTermReceived)
}()

if delayShutdown > 0 {
```
This is the problem: if delayShutdown is not set, the process keeps hanging on SIGTERM.
> this is the problem, if not delayShutdown is set the process keeps hanging on sigterm
Which particular line does the process hang on? I don't mind reverting or fixing, but this line existed gating off this go func before.
Before, we didn't capture the SIGTERM signal if delayShutdown was 0. Now we always capture the SIGTERM signal, but when delayShutdown == 0 we don't execute the os.Exit() and we don't signal the process to exit.
Testing services and endpoint handling requires readyz handling during termination of a pod. Chasing a possible bug, I found this gap in the server where we have no external signal for a kubelet to mark ready=false.
/kind bug
/priority important-soon
/sig network