Random "Failed to create pod sandbox" errors for all pods on nodes
#250
Comments
I can't see much useful information in the logs, but I suspect it might be due to a conflict between the CNI plugin mode and the AWS CNI plugin. I may need to investigate further in a real environment.
@kebe7jun given the randomness, some kind of race with the AWS VPC CNI is possible.
OK, we will try to add this option.
We have reproduced a similar problem. Immediately after the merbridge DaemonSet is launched, everything works fine, but after a while new pods lose network access, and those that have init containers or otherwise use the network at startup end up in CrashLoopBackOff. Restarting the merbridge pod on the affected node helps for a while. Environment:
Bug Description
Pods are failing to start with the following error on certain nodes. I could not find any obvious pattern explaining why pods work fine on some nodes and fail on others. These symptoms may indicate a race condition.
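To look for a per-node pattern, one option is to tally sandbox-creation failures by node. The following is a minimal sketch, not something taken from this report: it assumes the events were exported with something like `kubectl get events -A -o json`, that the kubelet reports these failures with the reason `FailedCreatePodSandBox`, and that the node name is in `source.host`. The function name and the sample data (`node-a`, `node-b`) are hypothetical.

```python
from collections import Counter


def sandbox_failures_by_node(event_list):
    """Count FailedCreatePodSandBox events per node.

    `event_list` is assumed to be the parsed JSON of a Kubernetes
    core/v1 EventList (for example the output of
    `kubectl get events -A -o json` passed through json.load).
    """
    counts = Counter()
    for event in event_list.get("items", []):
        if event.get("reason") != "FailedCreatePodSandBox":
            continue
        # Assumption: kubelet events carry the node name in source.host.
        node = (event.get("source") or {}).get("host") or "<unknown>"
        # `count` may be missing; treat a missing value as one occurrence.
        counts[node] += event.get("count") or 1
    return counts


# Hypothetical sample data, only to show the expected input shape.
sample = {
    "items": [
        {"reason": "FailedCreatePodSandBox",
         "source": {"component": "kubelet", "host": "node-a"}, "count": 3},
        {"reason": "FailedCreatePodSandBox",
         "source": {"component": "kubelet", "host": "node-a"}},
        {"reason": "Scheduled",
         "source": {"component": "default-scheduler"}},
        {"reason": "FailedCreatePodSandBox",
         "source": {"component": "kubelet", "host": "node-b"}, "count": 1},
    ]
}

print(sandbox_failures_by_node(sample).most_common())
# prints [('node-a', 4), ('node-b', 1)]
```

Comparing these counts with instance type, node age, or the time the merbridge pod last restarted on each node might help confirm or rule out a race.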
Helm values
1st failing node details
OS: linux (arm64)
OS Image: Bottlerocket OS 1.11.1 (aws-k8s-1.24)
Kernel version: 5.15.59
Container runtime: containerd://1.6.8+bottlerocket
Kubelet version: v1.24.6-eks-4360b32
AWS EC2 instance type: t4g.small
Merbridge logs
2nd failing node details
OS: linux (arm64)
OS Image: Bottlerocket OS 1.11.1 (aws-k8s-1.24)
Kernel version: 5.15.59
Container runtime: containerd://1.6.8+bottlerocket
Kubelet version: v1.24.6-eks-4360b32
AWS EC2 instance type: m6gd.medium
Merbridge logs
Version
Probably related to #218