
Unable to add new master/etcd node to cluster #3471

Closed · adoo123 opened this issue Oct 8, 2018 · 22 comments

adoo123 commented Oct 8, 2018

Current:

master1
master2
master3

[etcd]
master1
master2
master3

[kube-node]
node1
node2
node3

After adding master4 and master5 to the master and etcd groups:

master1
master2
master3
master4
master5

[etcd]
master1
master2
master3
master4
master5

[kube-node]
node1
node2
node3

Now I have 3 masters/etcd and 45 nodes. I've already referenced #1122 but couldn't fix it. I extended etcd successfully, but the master extension failed. kubectl shows this error:

Unable to connect to the server: x509: certificate is valid for "new master ip"

And my extend command is:

ansible-playbook -i inventory/mycluster/host.ini cluster.yml -l master1,master2,master3,master4,master5

My Kubernetes cluster version is 1.9.3. How can I fix this?


ykfq commented Apr 1, 2019

The feature for scaling master nodes seems imperfect, but it's possible to scale the etcd cluster separately: just add the etcd nodes under [etcd] and rerun cluster.yml.

@juliohm1978 (Contributor)

I'm facing the same issue with adding new masters. I'm using Kubespray v2.10.x and the reason it fails is that Kubespray does not update the apiserver certificates to add the new master to the SAN list.

You can check your certificate with

openssl x509 -text -noout -in /etc/kubernetes/ssl/apiserver.crt

... and the new master IP and hostname should be listed in the Subject Alternative Name section.

X509v3 Subject Alternative Name: 
                DNS:infra00-lab, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:localhost, DNS:infra00-lab, DNS:lb-apiserver.kubernetes.local, IP Address:10.233.0.1, IP Address:172.31.134.110, IP Address:172.31.134.110, IP Address:10.233.0.1, IP Address:127.0.0.1, IP Address:172.31.134.110

The execution of cluster.yml adds the new master IP and hostname to /etc/kubernetes/kubeadm-config.yaml as expected. It seems, however, that Kubespray is not calling kubeadm to replace the certificate before trying to join the new master node. We fixed this by using kubeadm manually to recreate the certificate.

NOTE: This works for v2.10.x. I never tested it in older versions of Kubespray.

In your first master, recreate the apiserver certificate.

cd /etc/kubernetes/ssl
mv apiserver.crt apiserver.crt.old
mv apiserver.key apiserver.key.old

cd /etc/kubernetes
kubeadm init phase certs apiserver --config kubeadm-config.yaml

If you are doing this after you ended up with a broken master, be sure to run reset.yml using the parameter --limit=<broken_master_hostname> before continuing. If you take the precaution of recreating the certificate before adding the new master node, you won't need this.
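
Spelled out, that reset call looks roughly like this (the inventory path is just an example, and depending on the Kubespray version you may also be prompted for a reset confirmation):

# wipe only the broken master before re-adding it
ansible-playbook -i inventory/mycluster/hosts.ini reset.yml --limit=<broken_master_hostname>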

Run cluster.yml to include the new master node. You should end up with a working cluster.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 10, 2019
@ppcololo (Contributor)

is it possible to add master? or replace failed master for new one?


juliohm1978 commented Sep 10, 2019

You should be able to. In the past, we managed to replace all nodes in the cluster: master, etcd and workers. But... there are some missteps you need to be careful to avoid along the way. After a lot of experiments and retries in our lab environment, we came up with a few guidelines.

Adding/replacing a master node

1) Recreate apiserver certs manually to include the new master node in the cert SAN field.

For some reason, Kubespray will not update the apiserver certificate.

Edit /etc/kubernetes/kubeadm-config.yaml, include new host in certSANs list.
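
A quick way to sanity-check that edit before regenerating the cert (the -A window is arbitrary; adjust as needed):

# the new master's hostname and IP should show up in this list
grep -A 10 certSANs /etc/kubernetes/kubeadm-config.yaml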

Use kubeadm to recreate the certs.

cd /etc/kubernetes/ssl
mv apiserver.crt apiserver.crt.old
mv apiserver.key apiserver.key.old

cd /etc/kubernetes
kubeadm init phase certs apiserver --config kubeadm-config.yaml

Check the certificate; the new host needs to be there.

openssl x509 -text -noout -in /etc/kubernetes/ssl/apiserver.crt

2) Run cluster.yml

Add the new host to the inventory and run cluster.yml.
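
For example (the inventory path is illustrative, not from this thread):

ansible-playbook -i inventory/mycluster/hosts.ini cluster.yml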

3) Restart kube-system/nginx-proxy

On all hosts, restart the nginx-proxy pod. This pod is a local proxy for the apiserver. Kubespray will update its static config, but the pod needs to be restarted to pick it up.

# run in every host
docker ps | grep k8s_nginx-proxy_nginx-proxy | awk '{print $1}' | xargs docker restart

4) Remove old master nodes

If you are replacing a node, remove the old one from the inventory and remove it from the cluster runtime.

kubectl drain --force --ignore-daemonsets --grace-period 300 --timeout 360s --delete-local-data NODE_NAME

kubectl delete node NODE_NAME

After that, the old node can be safely shut down. Also, make sure to restart nginx-proxy on all remaining nodes (step 3).

From any active master that remains in the cluster, re-upload kubeadm-config.yaml

kubeadm config upload from-file --config /etc/kubernetes/kubeadm-config.yaml

Adding/replacing a worker node

This should be the easiest.

1) Add the new node to the inventory.

2) Run upgrade-cluster.yml

You can use --limit=node1 to limit Kubespray to avoid disturbing other nodes in the cluster.
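
For example (the inventory path and node name are placeholders):

ansible-playbook -i inventory/mycluster/hosts.ini upgrade-cluster.yml --limit=node1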

3) Drain the node that will be removed

kubectl drain --force=true --grace-period=10 --ignore-daemonsets=true --timeout=0s --delete-local-data NODE_NAME

4) Run the remove-node.yml playbook

With the old node still in the inventory, run remove-node.yml. You need to pass -e node=NODE_NAME to the playbook to limit the execution to the node being removed.
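
Roughly (the inventory path is a placeholder):

ansible-playbook -i inventory/mycluster/hosts.ini remove-node.yml -e node=NODE_NAME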

5) Remove the node from the inventory

That's it.

Adding/Replacing an etcd node

You need to make sure there is always an odd number of etcd nodes in the cluster, which means this is always either a replace or a scale-up operation: either add two new nodes or remove an old one.

1) Add the new node by running cluster.yml.

Update the inventory and run cluster.yml passing --limit=etcd,kube-master -e ignore_assert_errors=yes.

Run upgrade-cluster.yml also passing --limit=etcd,kube-master -e ignore_assert_errors=yes. This is necessary to update all etcd configuration in the cluster.
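
Spelled out, the two runs look something like this (the inventory path is a placeholder):

# join the new etcd member
ansible-playbook -i inventory/mycluster/hosts.ini cluster.yml --limit=etcd,kube-master -e ignore_assert_errors=yes

# propagate the new etcd member to the rest of the cluster configuration
ansible-playbook -i inventory/mycluster/hosts.ini upgrade-cluster.yml --limit=etcd,kube-master -e ignore_assert_errors=yes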

At this point, you will have an even number of nodes. Everything should still be working, and you should only have problems if the cluster decides to elect a new etcd leader before you remove a node. Even so, running applications should continue to be available.

2) Remove an old etcd node

With the node still in the inventory, run remove-node.yml passing -e node=NODE_NAME as the name of the node that should be removed.

3) Make sure the remaining etcd members have their config updated

In each etcd host that remains in the cluster:

cat /etc/etcd.env | grep ETCD_INITIAL_CLUSTER

Only active etcd members should be in that list.

4) Remove old etcd members from the cluster runtime

Acquire a shell prompt into one of the etcd containers and use etcdctl to remove the old member.

# list all members
etcdctl member list 

# remove old member
etcdctl member remove MEMBER_ID

# careful!!! if you remove a wrong member you will be in trouble

# note: these command lines are actually much bigger, since you need to pass all certificates to etcdctl.
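
For reference, a fuller sketch of those calls, assuming the etcd v3 API and the certificate locations Kubespray normally uses (your endpoint, file names and member ID will differ):

export ETCDCTL_API=3

# list all members (the same --cacert/--cert/--key flags apply to "member remove")
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/member-<hostname>.pem \
  --key=/etc/ssl/etcd/ssl/member-<hostname>-key.pem \
  member list

# then, with the same flags:
#   etcdctl ... member remove MEMBER_ID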

5) Make sure the apiserver config is correctly updated.

In every master node, edit /etc/kubernetes/manifests/kube-apiserver.yaml. Make sure only active etcd nodes are still present in the apiserver command line parameter --etcd-servers=....
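
A quick check (the manifest path is the standard static-pod location used in this thread):

# every URL listed here should point to a live etcd member
grep etcd-servers /etc/kubernetes/manifests/kube-apiserver.yaml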

6) Shut down the old instance

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 10, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


yujunz commented Nov 26, 2019

@qvicksilver (Contributor)

@yujunz Not sure, haven't really tried that use case. Also I'm a bit unsure of the state of that playbook. Haven't had time to add it to CI. But please do try.


holmesb commented Jan 21, 2020

The procedure to add/remove masters belongs in the readme, not hidden away in a comment in this issue.


floryut commented Apr 10, 2020

The procedure to add/remove masters belongs in the readme, not hidden away in a comment in this issue.

To be sure everybody sees this: this was handled in PR #5570, and you can now find it here: https://kubespray.io/#/docs/nodes


maxisam commented Jul 3, 2020

docker ps | grep k8s_nginx-proxy_nginx-proxy | awk '{print $1}' | xargs docker restart

I think this line doesn't work anymore; there is no k8s_nginx-proxy_nginx-proxy pod.

@olegsidokhmetov

[quotes juliohm1978's add/replace-node guide from above in full]

Hello!
I have an issue with these commands:

quersys@node1:/etc/kubernetes$ sudo kubeadm init phase certs apiserver --config kubeadm-config.yaml
W0810 11:08:48.479307 31818 utils.go:26] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
W0810 11:08:48.479525 31818 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[certs] Using existing apiserver certificate and key on disk

@juliohm1978 (Contributor)

What version of K8s are you using? It's been almost a year since I posted. Did something change in kubeadm since then?

I would start by searching for official instructions on how to renew and recreate certs.

https://kubernetes.io/docs/tasks/administer-cluster/


olegsidokhmetov commented Aug 10, 2020

What version of K8s are you using? It's been almost a year since I posted. Did something change in kubeadm since then?

I would start by searching for official instructions on how to renew and recreate certs.

https://kubernetes.io/docs/tasks/administer-cluster/

quersys@node1:/etc/kubernetes$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:58:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:51:04Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

I tried to find information for about 6 hours :(

@juliohm1978 (Contributor)

W0810 11:08:48.479307 31818 utils.go:26] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
W0810 11:08:48.479525 31818 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]

Those look like warnings. Most people seem to ignore them. Are you sure no error messages appear as well? Does it hang and never return? If that's the case, I'd wait for a timeout to hopefully get some actual error messages.


olegsidokhmetov commented Aug 10, 2020 via email

@juliohm1978 (Contributor)

Sounds like a connectivity problem or something that leads to it. If you can provide any further logs and relevant messages, it would be helpful.


olegsidokhmetov commented Aug 11, 2020

Thanks!!!

I have my new node IP in apiserver.crt:

DNS:node1, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:localhost, DNS:node1, DNS:node3, DNS:lb-apiserver.kubernetes.local, DNS:node1.cluster.local, DNS:node3.cluster.local, IP Address:10.233.0.1, IP Address:172.26.1.225, IP Address:172.26.1.225, IP Address:10.233.0.1, IP Address:127.0.0.1, IP Address:172.26.1.225, IP Address:172.26.1.130

but when I run ansible-playbook -i inventory/quersyscluster/hosts.yml cluster.yml, I get a connection "timeout" problem

@juliohm1978 (Contributor)

Please post relevant log messages for more context. At this level, "connection timeout" is a broad error message.


dagorka commented Oct 29, 2020

Hi,
I am interested in replacing the first master (and the others) in a Kubernetes cluster using the Kubespray scripts. Is it possible?

Story:
I have built a k8s cluster using the Kubespray scripts on OpenStack with an old CentOS 7 image. Next I want to upgrade the OS, e.g. from 7.7 to 7.8. I have a newer OS image prepared on OpenStack. I am able to deploy new masters and new workers with the newer OS image, but there is a problem with the first master: I need to delete the whole VM and bring up a new one with the new OS. Did you have a similar problem?

I tried to force master2 to be the first one, but when I run the join task on a new master (e.g. master4), it looks like kubeadm still wants to connect to master1 (6.0.1.57):

kubeadm join --config kubeadm-controlplane.yaml --ignore-preflight-errors=all
W1028 12:37:17.050916    1666 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
        [WARNING FileExisting-ebtables]: ebtables not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get config map: Get https://6.0.1.57:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s: dial tcp 6.0.1.57:6443: connect: no route to host
To see the stack trace of this error execute with --v=5 or higher

Present, e.g.:
master1, centos7.7
master2, centos7.7
master3, centos7.7

worker1, centos7.7
worker2, centos7.7
worker3, centos7.7

Expected:
master2, centos7.8 - master2 becomes the first one
master3, centos7.8
master4, centos7.8

worker1, centos7.8
worker2, centos7.8
worker3, centos7.8

How did you manage to recreate the first master?

@juliohm1978, maybe you can help?

Thanks!
