Readiness probe stops too early at eviction #124648

Open
hshiina opened this issue Apr 30, 2024 · 6 comments · May be fixed by #125558
Labels: kind/bug, priority/backlog, sig/node, triage/accepted

Comments

hshiina commented Apr 30, 2024

What happened?

When kubelet evicts a pod, the Ready condition does not become NotReady during pod termination, even if a readinessProbe is configured.

What did you expect to happen?

The readiness probe should keep working during pod termination so that the pod becomes NotReady as early as possible.

How can we reproduce it (as minimally and precisely as possible)?

Use this readiness.yaml:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: readiness-script-configmap
data:
  readiness-script.sh: |
    #!/bin/sh
    handler() {
      rm /tmp/ready
      sleep 20
    }
    touch /tmp/ready
    trap handler SIGTERM
    while true; do
      sleep 1
    done
---
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: readiness-container
  name: readiness-test
spec:
  containers:
  - command:
    - sh
    - /script/readiness-script.sh
    image: busybox
    name: readiness
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/ready
      initialDelaySeconds: 3
      periodSeconds: 3
    volumeMounts:
    - name: readiness-script
      mountPath: /script
  - command:
    - sh
    - -c
    - for i in `seq 100`; do dd if=/dev/random of=file${i} bs=1048576 count=1 2>/dev/null; sleep .1; done; while true; do sleep 5; done
    name: disk-consumer
    image: busybox
    resources:
      limits:
        ephemeral-storage: "100Mi"
      requests:
        ephemeral-storage: "100Mi"
  volumes:
  - name: readiness-script
    configMap:
      name: readiness-script-configmap

Create the resources and wait for an eviction to happen (the disk-consumer container writes about 100 MiB against its 100Mi ephemeral-storage limit, which triggers the eviction):

$ kubectl create -f readiness.yaml; kubectl get pods readiness-test -w
configmap/readiness-script-configmap created
pod/readiness-test created
NAME             READY   STATUS              RESTARTS   AGE
readiness-test   0/2     ContainerCreating   0          0s
readiness-test   1/2     Running             0          3s
readiness-test   2/2     Running             0          7s
readiness-test   0/2     Error               0          46s
readiness-test   0/2     Error               0          47s

When this pod is deleted instead, the readiness probe keeps working during termination, so READY drops from 2/2 to 1/2 once the handler removes /tmp/ready, well before the containers exit:

$ kubectl create -f readiness.yaml; (sleep 15; kubectl delete pod readiness-test) & kubectl get pods -w
configmap/readiness-script-configmap created
pod/readiness-test created
[1] 137999
NAME             READY   STATUS              RESTARTS   AGE
readiness-test   0/2     ContainerCreating   0          0s
readiness-test   1/2     Running             0          2s
readiness-test   2/2     Running             0          6s
pod "readiness-test" deleted
readiness-test   2/2     Terminating         0          15s
readiness-test   1/2     Terminating         0          21s
readiness-test   0/2     Terminating         0          45s
readiness-test   0/2     Terminating         0          45s
readiness-test   0/2     Terminating         0          45s
readiness-test   0/2     Terminating         0          45s

Anything else we need to know?

I believe this issue is caused as follows:

At eviction, the pod phase is set to PodFailed internally via podStatusFn before the containers in the pod are stopped:

// kubelet: the status override (podStatusFn) is applied and pushed to the
// status manager before the pod's containers have finished stopping.
if podStatusFn != nil {
    podStatusFn(&apiPodStatus)
}
kl.statusManager.SetPodStatus(pod, apiPodStatus)

// eviction manager: the override passed to killPodFunc marks the pod as
// PodFailed as soon as the eviction starts.
err := m.killPodFunc(pod, true, &gracePeriodOverride, func(status *v1.PodStatus) {
    status.Phase = v1.PodFailed
    status.Reason = Reason
    status.Message = evictMsg
})

Because the cached pod phase is already PodFailed, the probe worker exits without probing the containers during termination:

status, ok := w.probeManager.statusManager.GetPodStatus(w.pod.UID)
if !ok {
    // Either the pod has not been created yet, or it was already deleted.
    klog.V(3).InfoS("No status for pod", "pod", klog.KObj(w.pod))
    return true
}
// Worker should terminate if pod is terminated.
if status.Phase == v1.PodFailed || status.Phase == v1.PodSucceeded {
    klog.V(3).InfoS("Pod is terminated, exiting probe worker",
        "pod", klog.KObj(w.pod), "phase", status.Phase)
    return false
}
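
For comparison, here is a rough sketch (mine, not necessarily what #125558 does) of one way this check could be relaxed: keep the worker alive while the probed container is still reported as running, and only exit once the container has actually stopped. It assumes the podutil.GetContainerStatus helper used elsewhere in the worker; other names follow the quoted code.

// Illustrative sketch only, not the actual fix: keep probing while the probed
// container is still running, so readiness can flip to NotReady during the
// termination grace period instead of freezing at its last value.
c, found := podutil.GetContainerStatus(status.ContainerStatuses, w.container.Name)
stillRunning := found && c.State.Running != nil

if (status.Phase == v1.PodFailed || status.Phase == v1.PodSucceeded) && !stillRunning {
    // Only stop the worker once the container itself has stopped.
    klog.V(3).InfoS("Pod is terminated and container has stopped, exiting probe worker",
        "pod", klog.KObj(w.pod), "phase", status.Phase)
    return false
}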

Kubernetes version

$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2

Cloud provider

None. I reproduced this on my laptop.

OS version

Install tools

kind

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

hshiina added the kind/bug label on Apr 30, 2024
k8s-ci-robot added the needs-sig and needs-triage labels on Apr 30, 2024
hshiina (Author) commented Apr 30, 2024

/sig node

k8s-ci-robot added the sig/node label and removed the needs-sig label on Apr 30, 2024
SergeyKanzhelev added this to Triage in SIG Node Bugs on May 1, 2024
AnishShah (Contributor) commented:

/triage accepted
/priority backlog

k8s-ci-robot added the triage/accepted and priority/backlog labels and removed the needs-triage label on May 1, 2024
AnishShah moved this from Triage to Triaged in SIG Node Bugs on May 1, 2024
bitoku (Contributor) commented May 2, 2024

/assign

aojea (Member) commented May 15, 2024

In any case, it seems this behavior will leave the pod out of the endpoints (#110255), so it does not seem related to #116965.

smarterclayton (Contributor) commented:

I would expect readiness probes to continue to work after a pod is terminated - the ready condition is orthogonal to pod deletion status. Once the ready condition becomes false, the probe can be stopped. If the ready condition is already false, we can stop the probe right away.

External system components can use the deletion timestamp to make a decision independently of readiness (and historically do), but since we added unreadyEndpoints it is appropriate for the kubelet to keep probing ready status until the probe returns false (or the pod never started, etc.).
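
A rough sketch of that idea against the worker check quoted above (illustrative only; w.probeType, the readiness constant, and podutil are assumed from the surrounding prober code, and this is not the actual implementation):

// Illustrative sketch only: keep the readiness worker alive during termination
// until the container has been observed NotReady, then let it exit.
if status.Phase == v1.PodFailed || status.Phase == v1.PodSucceeded {
    if w.probeType != readiness {
        // Liveness and startup workers can still stop as soon as the pod is terminal.
        return false
    }
    ready := false
    if c, found := podutil.GetContainerStatus(status.ContainerStatuses, w.container.Name); found {
        ready = c.Ready
    }
    if !ready {
        // Readiness has already been observed as false; safe to stop the worker now.
        return false
    }
    // Otherwise keep probing so the container can be marked NotReady promptly.
}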

smarterclayton (Contributor) commented:

For eviction, you could maybe argue that stopping the probe is justified, but not for graceful shutdown. I think that code is simply out of date and a relic of a period before the kubelet pod workers had a more central role in the lifecycle of the pod.
