We are using EFS from both Pods and Nodes (AL2) in AWS EKS (v1.29). In the Pods, we access EFS directories through the EFS CSI plugin (v1.5.8); on the Nodes, we install amazon-efs-utils (v2.1.0) and add fstab entries to mount the directories. TLS is enabled for both the Pod and the Node mounts. Using the EFS CSI driver alone causes no issues. When efs-utils is also used on the Nodes, access is normal at first, but after roughly a dozen hours directory access errors begin to occur:
# ls -al /var/run/efs/
total 8
drwxr-xr-x 3 root root 100 Jan 16 14:46 .
drwxr-xr-x 30 root root 1000 Jan 17 06:55 ..
-rw-r--r-- 1 root root 1149 Jan 16 14:46 fs-059735545feda3989.var.lib.kubelet.pods.37f75393-a9db-4513-b474-28c2235b6ddc.volumes.kubernetes.io~csi.pv-efs-data-log.mount.20117
drwxr-xr-x 4 root root 160 Jan 16 08:01 fs-059735545feda3989.var.lib.kubelet.pods.37f75393-a9db-4513-b474-28c2235b6ddc.volumes.kubernetes.io~csi.pv-efs-data-log.mount.20117+
-rw-r--r-- 1 root root 793 Jan 16 08:01 stunnel-config.fs-059735545feda3989.var.lib.kubelet.pods.37f75393-a9db-4513-b474-28c2235b6ddc.volumes.kubernetes.io~csi.pv-efs-data-log.mount.20117
We found some errors in the amazon-efs-watchdog log:
2025-01-17 03:27:10 UTC - ERROR - Unable to parse json in /var/run/efs/fs-059735545feda3989.var.lib.kubelet.pods.37f75393-a9db-4513-b474-28c2235b6ddc.volumes.kubernetes.io~csi.pv-efs-data-log.mount.20117
Traceback (most recent call last):
  File "/usr/bin/amazon-efs-mount-watchdog", line 1155, in check_efs_mounts
    state = json.load(f)
  File "/usr/lib64/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib64/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.7/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 1149 (char 1148)
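For context on the error itself: `Extra data` means the parser read one complete JSON document and then found something other than whitespace after it. The listing above shows the state file is 1149 bytes and the error points at char 1148, so the stray content appears to be a single trailing character. One way this can happen (an assumption, not confirmed from the efs-utils source) is a second writer overwriting the file in place with a slightly shorter document and not truncating it. A minimal sketch that reproduces the same error class, using a hypothetical `pid` key and a temporary file rather than a real state file:

```python
import json
import os
import tempfile


def split_valid_prefix(raw: str):
    """Parse the first JSON document in raw and return (object, leftover text)."""
    obj, end = json.JSONDecoder().raw_decode(raw)
    return obj, raw[end:]


def demo() -> str:
    # Hypothetical contents; a real watchdog state file holds more keys.
    longer = json.dumps({"pid": 12345})
    shorter = json.dumps({"pid": 1234})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "state.json")
        with open(path, "w") as f:
            f.write(longer)
        # "r+" writes from offset 0 and keeps old bytes past the new end.
        with open(path, "r+") as f:
            f.write(shorter)
        try:
            with open(path) as f:
                json.load(f)
        except json.JSONDecodeError as exc:
            return f"{exc.msg} at char {exc.pos}"
    return "parsed cleanly"


if __name__ == "__main__":
    print(demo())
```

`split_valid_prefix` can be pointed at the contents of a damaged file to see which part still parses and what was left behind, which may help tell a stale tail apart from two documents written back to back.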
Hi @Ranler, it looks like the error is caused by amazon-efs-utils being installed on worker nodes where the EFS CSI driver is also running. This spins up two watchdog processes, which conflict when managing TLS configurations and state files. Installing efs-utils on EKS nodes is not recommended, so could you please uninstall efs-utils and check whether the issue persists?
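To check whether a node really has more than one watchdog running, something like the following could be run on the host. It is only a sketch: it assumes the process can be recognized by the name `amazon-efs-mount-watchdog` (taken from the traceback path above) appearing in its command line, and that processes started inside the CSI driver container are visible in the host's `/proc`. The `find_watchdogs` helper and its `proc_root` parameter are illustrative names, not part of efs-utils.

```python
import os
from typing import List

WATCHDOG_NAME = "amazon-efs-mount-watchdog"  # name taken from the traceback path


def find_watchdogs(proc_root: str = "/proc") -> List[int]:
    """Return sorted PIDs whose command line mentions the watchdog script."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                argv = [a.decode(errors="replace") for a in f.read().split(b"\0") if a]
        except OSError:
            continue  # process exited or is not readable
        if any(os.path.basename(arg) == WATCHDOG_NAME for arg in argv):
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    found = find_watchdogs()
    print(f"{len(found)} watchdog process(es): {found}")
```

If this reports two or more PIDs on a node, that would be consistent with the explanation above of two watchdogs managing the same `/var/run/efs` state.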
Additional symptoms from the report: dmesg shows errors; some stunnel5 or efs-proxy processes are running, but the stunnel-config is missing; and the JSON format in the stunnel-config is bad.