Status manager does not normalize ephemeral container statuses #124873

Open · hshiina opened this issue May 14, 2024 · 6 comments · May be fixed by #124901
Labels: kind/bug · priority/important-longterm · sig/node · triage/accepted

Comments

hshiina commented May 14, 2024

What happened?

EphemeralContainerStatuses is not normalized in normalizeStatus() in the status manager:

// update container statuses
for i := range status.ContainerStatuses {
	cstatus := &status.ContainerStatuses[i]
	normalizeContainerState(&cstatus.State)
	normalizeContainerState(&cstatus.LastTerminationState)
}
// Sort the container statuses, so that the order won't affect the result of comparison
sort.Sort(kubetypes.SortedContainerStatuses(status.ContainerStatuses))
// update init container statuses
for i := range status.InitContainerStatuses {
	cstatus := &status.InitContainerStatuses[i]
	normalizeContainerState(&cstatus.State)
	normalizeContainerState(&cstatus.LastTerminationState)
}
// Sort the container statuses, so that the order won't affect the result of comparison
kubetypes.SortInitContainerStatuses(pod, status.InitContainerStatuses)
return status
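
For context, what normalizeContainerState() does to the timestamps is drop sub-second precision. A rough sketch of that helper, assuming metav1 is k8s.io/apimachinery/pkg/apis/meta/v1 (illustrative, not a verbatim copy of the kubelet code):

// normalizeTimeStamp truncates a timestamp to RFC 3339 (second) precision,
// so that status comparisons are not tripped up by sub-second noise.
func normalizeTimeStamp(t *metav1.Time) {
	*t = t.Rfc3339Copy()
}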

This does not seem to cause any user-facing problem, because EphemeralContainerStatuses is sorted before being passed to the status manager and timestamps are normalized when the patch is marshaled:

sort.Sort(kubetypes.SortedContainerStatuses(containerStatuses))

newPod, patchBytes, unchanged, err := statusutil.PatchPodStatus(context.TODO(), m.kubeClient, pod.Namespace, pod.Name, pod.UID, pod.Status, mergedStatus)
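
To illustrate the marshaling point with a standalone sketch (not kubelet code, and assuming the k8s.io/apimachinery module is available): metav1.Time serializes at second precision, so two in-memory timestamps that differ only below one second produce identical JSON and therefore no difference in the generated patch.

package main

import (
	"encoding/json"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// The cached status carries a second-precision timestamp, while the
	// freshly generated status keeps nanoseconds (as in the diff below).
	cached := metav1.NewTime(time.Date(2024, 5, 14, 20, 9, 35, 0, time.UTC))
	actual := metav1.NewTime(time.Date(2024, 5, 14, 20, 9, 35, 697513091, time.UTC))

	a, _ := json.Marshal(&cached)
	b, _ := json.Marshal(&actual)

	// Both print "2024-05-14T20:09:35Z", so the patch sees no change.
	fmt.Println(string(a), string(b), string(a) == string(b))
}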

However, the unnormalized timestamps cause unexpected behavior internally. If the kubelet log verbosity is 3 or higher, the following messages are logged periodically when an ephemeral container exists:

May 14 20:10:33 kind-control-plane kubelet[720]: I0514 20:10:33.437155 	720 status_manager.go:230] "Syncing all statuses"
May 14 20:10:33 kind-control-plane kubelet[720]: I0514 20:10:33.438512 	720 status_manager.go:984] "Pod status is inconsistent with cached status for pod, a reconciliation should be triggered" pod="default/ephemeral-demo" statusDiff=<
May 14 20:10:33 kind-control-plane kubelet[720]:     	  &v1.PodStatus{
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	... // 8 identical fields
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	PodIPs:            	{{IP: "10.244.0.5"}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	StartTime:         	s"2024-05-14 20:09:21 +0000 UTC",
May 14 20:10:33 kind-control-plane kubelet[720]:     	-     	InitContainerStatuses: nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	+     	InitContainerStatuses: []v1.ContainerStatus{},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	ContainerStatuses: 	{{Name: "ephemeral-demo", State: {Running: &{StartedAt: {Time: s"2024-05-14 20:09:23 +0000 UTC"}}}, Ready: true, Image: "registry.k8s.io/pause:3.1", ...}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	QOSClass:          	"BestEffort",
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	EphemeralContainerStatuses: []v1.ContainerStatus{
May 14 20:10:33 kind-control-plane kubelet[720]:     	              	{
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	Name: "debugger-qk87g",
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	State: v1.ContainerState{
May 14 20:10:33 kind-control-plane kubelet[720]:     	                              	Waiting:	nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	-                             	Running:	&v1.ContainerStateRunning{StartedAt: v1.Time{Time: s"2024-05-14 20:09:35 +0000 UTC"}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	+                             	Running:	&v1.ContainerStateRunning{StartedAt: v1.Time{Time: s"2024-05-14 20:09:35.697513091 +0000 UTC"}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	                              	Terminated: nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	},
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	LastTerminationState: {},
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	Ready:            	false,
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	... // 7 identical fields
May 14 20:10:33 kind-control-plane kubelet[720]:     	              	},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	Resize:            	"",
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	ResourceClaimStatuses: nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	  }
May 14 20:10:33 kind-control-plane kubelet[720]:  >

This message is emitted here:

klog.V(3).InfoS("Pod status is inconsistent with cached status for pod, a reconciliation should be triggered",
	"pod", klog.KObj(pod),
	"statusDiff", cmp.Diff(podStatus, &status))
return true

After needsReconcile() returns true, syncPod() is called. This does not look very harmful because unchanged eventually becomes true:

newPod, patchBytes, unchanged, err := statusutil.PatchPodStatus(context.TODO(), m.kubeClient, pod.Namespace, pod.Name, pod.UID, pod.Status, mergedStatus)
klog.V(3).InfoS("Patch status for pod", "pod", klog.KObj(pod), "podUID", uid, "patch", string(patchBytes))
if err != nil {
	klog.InfoS("Failed to update status for pod", "pod", klog.KObj(pod), "err", err)
	return
}
if unchanged {
	klog.V(3).InfoS("Status for pod is up-to-date", "pod", klog.KObj(pod), "statusVersion", status.version)

The API is still called once unnecessarily in syncPod(), though:

pod, err := m.kubeClient.CoreV1().Pods(status.podNamespace).Get(context.TODO(), status.podName, metav1.GetOptions{})

What did you expect to happen?

It might be better to normalize ephemeral container statuses, at least their timestamps, in the status manager to avoid this unexpected behavior, for example with a loop like the one sketched below.
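
A minimal sketch of such a loop, mirroring the existing handling of regular and init container statuses in normalizeStatus() (illustrative only; whether the statuses also need sorting here is left open, since they already arrive sorted):

// update ephemeral container statuses
for i := range status.EphemeralContainerStatuses {
	cstatus := &status.EphemeralContainerStatuses[i]
	normalizeContainerState(&cstatus.State)
	normalizeContainerState(&cstatus.LastTerminationState)
}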

How can we reproduce it (as minimally and precisely as possible)?

Set the kubelet log verbosity to 3 or higher. Then, create an ephemeral container (for example, with kubectl debug).

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: v1.29.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2

Cloud provider

None

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@hshiina hshiina added the kind/bug Categorizes issue or PR as related to a bug. label May 14, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 14, 2024

hshiina commented May 14, 2024

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 14, 2024
zhifei92 commented

@hshiina This is a bug. Can you submit a PR to fix it? If not, I'd be happy to work on it


hshiina commented May 15, 2024

@zhifei92
If you are interested, please feel free to assign yourself.

zhifei92 commented

@zhifei92 If you are interested, please feel free to assign yourself.

thx

zhifei92 commented

/assign

ffromani (Contributor) commented

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 16, 2024
@SergeyKanzhelev SergeyKanzhelev added this to Triage in SIG Node Bugs May 22, 2024
@haircommander haircommander moved this from Triage to Triaged in SIG Node Bugs May 22, 2024