Add-on jobs do not complete before timeout with nodes in vSphere #2535

Closed
bmdepesa opened this issue May 13, 2021 · 2 comments

Comments

@bmdepesa
Member

RKE version: 1.3.0-rc1

With nodes provisioned in vSphere (4 CPUs, 8 GB RAM, Ubuntu), the add-on jobs that run after rke-network-plugin time out before completing, which prevents the cluster from becoming active.

nodes:
- address: 172.16.x.x
  user: ubuntu
  role: [controlplane,etcd,worker]

The rke-network-plugin job always completes, but the following have failed repeatedly:

FATA[0172] Failed to get job complete status for job rke-metrics-addon-deploy-job in namespace kube-system
FATA[0152] Failed to get job complete status for job rke-ingress-controller-deploy-job in namespace kube-system
FATA[0242] Failed to get job complete status for job rke-coredns-addon-deploy-job in namespace kube-system
INFO[0159]

This was seen while testing #2439

@superseb
Contributor

Please report whether raising addon_job_timeout to 120 or running rke up again solves the issue. If neither does, please share the setup or the output of:

kubectl -n kube-system get pods -l job-name=rke-metrics-addon-deploy-job --no-headers -o custom-columns=NAME:.metadata.name | xargs -L1 kubectl -n kube-system logs
kubectl -n kube-system get pods -l job-name=rke-ingress-controller-deploy-job --no-headers -o custom-columns=NAME:.metadata.name | xargs -L1 kubectl -n kube-system logs
kubectl -n kube-system get pods -l job-name=rke-coredns-addon-deploy-job --no-headers -o custom-columns=NAME:.metadata.name | xargs -L1 kubectl -n kube-system logs
kubectl get nodes -o go-template='{{range .items}}{{$node := .}}{{range .status.conditions}}{{$node.metadata.name}}{{": "}}{{.type}}{{":"}}{{.status}}{{"\n"}}{{end}}{{end}}'
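
For anyone trying the first suggestion, here is a minimal sketch of where the timeout would go in cluster.yml, reusing the node entry from the original report. The value 120 is the one suggested above. That the option is a top-level key measured in seconds is my understanding of RKE's configuration, not something stated in this thread, so check it against the docs for your RKE version.

```yaml
# cluster.yml (fragment, not a complete configuration)
nodes:
- address: 172.16.x.x          # placeholder address, as in the original report
  user: ubuntu
  role: [controlplane,etcd,worker]

# Assumption: top-level option, value in seconds.
# How long RKE waits for each add-on deploy job (for example
# rke-coredns-addon-deploy-job) to complete before failing.
addon_job_timeout: 120
```

After changing the value, run rke up again so the add-on jobs are retried with the longer timeout.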

@slickwarren

I was unable to reproduce this on the latest (v1.3.0-rc10), but did reproduce it on the version reported in this ticket (v1.3.0-rc1).
