Summary
With Node Problem Detector v0.8.20 on EKS, the node reports conflicting status conditions: NPD sets KubeletUnhealthy to True while the node is Ready and kubelet is running normally.
Environment
- Node Problem Detector v0.8.20
- EKS Optimized AL2 AMI v20250519
Details
The following monitors are currently enabled in the NPD helm chart:
# node-problem-detector/values_my.yaml
settings:
  log_monitors:
    - /config/kernel-monitor.json
    - /config/docker-monitor.json
    - /config/readonly-monitor.json
    # An example of activating a custom log monitor definition in
    # Node Problem Detector
    # - /custom-config/docker-monitor-filelog.json
  custom_plugin_monitors:
    - /config/health-checker-kubelet.json
Output of kubectl get node:
NAME                                               STATUS   ROLES    AGE   VERSION
ip-10-xxx-xx-xxx.ap-northeast-2.compute.internal   Ready    <none>   16m   v1.32.3-eks-473151a
Node conditions:
Conditions:
Type                   Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
----                   ------  -----------------                ------------------               ------                      -------
ReadonlyFilesystem     False   Wed, 16 Jul 2025 15:57:17 +0900  Wed, 16 Jul 2025 15:52:15 +0900  FilesystemIsNotReadOnly     Filesystem is not read-only
KubeletUnhealthy       True    Wed, 16 Jul 2025 16:04:16 +0900  Wed, 16 Jul 2025 15:59:15 +0900  KubeletUnhealthy            kubelet:kubelet was found unhealthy; repair flag : true
CorruptDockerOverlay2  False   Wed, 16 Jul 2025 16:04:16 +0900  Wed, 16 Jul 2025 15:59:15 +0900  NoCorruptDockerOverlay2     docker overlay2 is functioning properly
KernelDeadlock         False   Wed, 16 Jul 2025 15:57:17 +0900  Wed, 16 Jul 2025 15:52:15 +0900  KernelHasNoDeadlock         kernel has no deadlock
MemoryPressure         False   Wed, 16 Jul 2025 16:02:49 +0900  Wed, 16 Jul 2025 15:51:36 +0900  KubeletHasSufficientMemory  kubelet has sufficient memory available
DiskPressure           False   Wed, 16 Jul 2025 16:02:49 +0900  Wed, 16 Jul 2025 15:51:36 +0900  KubeletHasNoDiskPressure    kubelet has no disk pressure
PIDPressure            False   Wed, 16 Jul 2025 16:02:49 +0900  Wed, 16 Jul 2025 15:51:36 +0900  KubeletHasSufficientPID     kubelet has sufficient PID available
Ready                  True    Wed, 16 Jul 2025 16:02:49 +0900  Wed, 16 Jul 2025 15:51:54 +0900  KubeletReady                kubelet is posting ready status
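The conflict in the conditions above (KubeletUnhealthy is True while Ready is True) can also be checked programmatically. The following is only a minimal sketch: the function name `conflicting_kubelet_conditions` and the `sample_node` dictionary are hypothetical, and the sample is trimmed to the two relevant conditions rather than being a full node object. It assumes the usual `status.conditions` layout returned by `kubectl get node <name> -o json`.

```python
import json
import sys


def conflicting_kubelet_conditions(node: dict) -> bool:
    """Return True when KubeletUnhealthy and Ready are both "True"."""
    conditions = {
        c["type"]: c["status"]
        for c in node.get("status", {}).get("conditions", [])
    }
    return (
        conditions.get("KubeletUnhealthy") == "True"
        and conditions.get("Ready") == "True"
    )


# Hypothetical sample, trimmed to the two conditions that disagree.
sample_node = {
    "status": {
        "conditions": [
            {"type": "KubeletUnhealthy", "status": "True",
             "reason": "KubeletUnhealthy"},
            {"type": "Ready", "status": "True", "reason": "KubeletReady"},
        ]
    }
}

if __name__ == "__main__":
    # With piped input, read real node JSON; otherwise use the sample.
    node = sample_node if sys.stdin.isatty() else json.load(sys.stdin)
    print(conflicting_kubelet_conditions(node))
```

If saved as, say, `check_conditions.py` (a hypothetical file name), it could be fed real data with `kubectl get node <name> -o json | python3 check_conditions.py`.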
Accessing the node directly via node-shell:
$ which systemctl
/usr/bin/systemctl
$ systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubelet-args.conf, 30-kubelet-extra-args.conf
Active: active (running) since Wed 2025-07-16 07:53:47 UTC; 54min ago
Docs: https://github.com/kubernetes/kubernetes
Main PID: 4013 (kubelet)
CGroup: /runtime.slice/kubelet.service
└─4013 /usr/bin/kubelet --config /etc/kubernetes/kubelet/kubelet-config.json --kubeconfig /var/lib/kubelet/kubeconfig --container-runtime-endpoin...
Jul 16 08:47:43 ip-xx-xxx-xx-xxx.ap-northeast-2.compute.internal kubelet[4013]: I0716 08:47:43.657399 4013 util.go:30] "No sandbox for pod can be fo...gwao"
Checking directly on the node shows that the systemctl command exists at /usr/bin/systemctl and that it reports kubelet as active (running), so the KubeletUnhealthy=True condition set by NPD does not match the actual kubelet state.
How can I solve this issue?
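One further check that might help narrow this down is to probe kubelet's healthz endpoint from the same place the health checker runs. The sketch below rests on assumptions: that kubelet serves healthz on its default address `127.0.0.1:10248`, and that the NPD health-checker plugin relies on this endpoint (the plugin configuration is not shown in this issue, so this is not confirmed here). The function name `probe_healthz` is hypothetical. If the probe succeeds on the host but fails inside the NPD container, the difference may come from the pod's network namespace rather than from kubelet itself.

```python
import urllib.error
import urllib.request
from typing import Tuple

# Assumption: kubelet's default healthz address and port.
DEFAULT_HEALTHZ_URL = "http://127.0.0.1:10248/healthz"


def probe_healthz(url: str = DEFAULT_HEALTHZ_URL,
                  timeout: float = 3.0) -> Tuple[bool, str]:
    """Return (healthy, detail) for a single GET against the given URL."""
    # Bypass any proxy from the environment; this is a loopback probe.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace").strip()
            return resp.status == 200, body
    except (urllib.error.URLError, OSError) as exc:
        # Covers connection refused, timeouts, and non-2xx HTTP errors.
        return False, f"request failed: {exc}"


if __name__ == "__main__":
    healthy, detail = probe_healthz()
    print(f"healthy={healthy} detail={detail}")
```

Running it once on the host (for example through node-shell) and once inside the NPD container would show whether both see the same result.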