-
Notifications
You must be signed in to change notification settings - Fork 39k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Readiness probe stops too early at eviction #124648
Comments
/sig node |
/triage accepted |
/assign |
I would expect readiness probes to continue to work after a pod is terminated - the ready condition is orthogonal to pod deletion status. Once the ready condition becomes false, the probe can be stopped. If the ready condition is already false, we can stop the probe right away. External system components can use deletion timestamp to make a decision independent of ready (and historically do), but since we added unreadyEndpoints it is appropriate for kubelet to probe ready status until such a time as the probe returns false (or if the pod never started, etc). |
For eviction, you could maybe argue that stopping the probe is justified, but not for graceful shutdown. I think that code is simply out of date and a relic of a period before the kubelet pod workers had a more central role in the lifecycle of the pod. |
What happened?
When kubelet evicts a pod, the ready condition doesn’t get
NotReady
during the pod termination even if areadinessProbe
is configured.What did you expect to happen?
A readiness probe works during a pod termination so that the pod gets
NotReady
as early as possible.How can we reproduce it (as minimally and precisely as possible)?
Use this
readiness.yaml
:Create resources and wait an eviction happens:
When deleting this pod, the readiness probe works during termination:
Anything else we need to know?
I guess this issue is caused as follows:
At eviction, a pod phase is set to
PodFailed
internally inpodStatusFn
before stopping containers in the pod:kubernetes/pkg/kubelet/kubelet.go
Lines 2026 to 2029 in 0d8f996
kubernetes/pkg/kubelet/eviction/eviction_manager.go
Lines 605 to 608 in 0d8f996
Because the internal pod phase is
PodFailed
, the probe worker finishes working without probing containers at termination:kubernetes/pkg/kubelet/prober/worker.go
Lines 203 to 216 in dd68c5f
Kubernetes version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)
The text was updated successfully, but these errors were encountered: