Thousands of deploy-job pods in pending state #755
Comments
What k8s version were you using? Was this k8s 1.11?
No, this one was 1.10.3.
Having a look at the logs of the kubelet and the CNI install related containers will help to solve your problem.
rke/k8s will keep scheduling the CNI plugin pod if it failed.
It should clean up an old pod before it starts a new one. And after a certain number of tries it should report an error without continuing to start new pods.
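To make the behavior requested above concrete, here is a minimal Python sketch of "clean up the failed attempt, then retry, and give up after a fixed number of tries". It is not RKE's or Kubernetes' actual code: `run_with_bounded_retries`, `start`, `cleanup`, and `max_attempts` are hypothetical names chosen for illustration only.

```python
from typing import Callable


def run_with_bounded_retries(
    start: Callable[[], bool],
    cleanup: Callable[[], None],
    max_attempts: int = 3,
) -> int:
    """Run `start` until it succeeds, cleaning up after each failure.

    Returns the attempt number that succeeded. Raises RuntimeError once
    `max_attempts` attempts have failed, instead of retrying forever.
    All names here are hypothetical; this is an illustration only.
    """
    for attempt in range(1, max_attempts + 1):
        if start():
            return attempt
        # Remove the failed attempt before another one may be started,
        # so failed attempts never accumulate.
        cleanup()
    raise RuntimeError(f"deploy job failed after {max_attempts} attempts")
```

With this shape, at most one attempt exists at any time and the caller gets an explicit error after the last try. Kubernetes Jobs expose a `backoffLimit` field in their spec for a similar purpose; whether it would have helped in this particular case is not something this thread establishes.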
@HighwayofLife This is a known issue with k8s 1.10.x. It's fixed in 1.10.5. Using version 0.1.8 with the default k8s 1.10.5-rancher1-1 should resolve this.
@HighwayofLife Let me know if you start using k8s v1.10.5 and still have these issues.
I'm using 1.10.5-rancher1 and still seeing this on occasion. I haven't had a chance to investigate. Will do that next week.
@HighwayofLife Are you still seeing this with v0.1.9?
No, I have not seen this reappear in 0.1.9.
Cool. I will close this issue for now.
RKE version: rke version v0.1.8-rc11
Docker version (docker version, docker info preferred): 1.12
Operating system and kernel (cat /etc/os-release, uname -r preferred): CoreOS
Type/provider of hosts (VirtualBox/Bare-metal/AWS/GCE/DO): Azure Private Cloud
I just ran the newest version of RKE against an existing cluster that had previously been provisioned by RKE, and the rke-network-plugin-deploy-job failed during the run. When I checked the node, CPU was at 95%, disk writes were very high, and the kubelet was consuming a huge amount of CPU. It turned out that 19,000 pods for the rke-network-plugin-deploy-job had been created and were all in Pending state.
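For anyone trying to confirm the same symptom, a rough Python sketch like the one below can tally Pending pods per Job from the JSON that `kubectl get pods --all-namespaces -o json` prints. It assumes the pods carry the `job-name` label that the Job controller sets on the pods it creates; the function name `count_pending_by_job` and the sample data are made up for illustration and are not taken from the affected cluster.

```python
from collections import Counter


def count_pending_by_job(pod_list: dict) -> Counter:
    """Count Pending pods per Job in a `kubectl get pods -o json` style document."""
    counts: Counter = Counter()
    for pod in pod_list.get("items", []):
        if pod.get("status", {}).get("phase") != "Pending":
            continue
        # Assumption: pods created by a Job carry a "job-name" label.
        labels = pod.get("metadata", {}).get("labels", {})
        counts[labels.get("job-name", "<no job>")] += 1
    return counts


# Made-up sample shaped like kubectl's JSON output (illustration only).
sample = {
    "items": [
        {"metadata": {"labels": {"job-name": "rke-network-plugin-deploy-job"}},
         "status": {"phase": "Pending"}},
        {"metadata": {"labels": {"job-name": "rke-network-plugin-deploy-job"}},
         "status": {"phase": "Pending"}},
        {"metadata": {"labels": {"job-name": "rke-network-plugin-deploy-job"}},
         "status": {"phase": "Succeeded"}},
        {"metadata": {"labels": {}}, "status": {"phase": "Running"}},
    ]
}

pending = count_pending_by_job(sample)
```

To use it against a real cluster, load the kubectl output with `json.load(sys.stdin)` and pass the result to `count_pending_by_job`. A count in the thousands for a single deploy job would match what is described above.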