how to renew the certificate when apiserver cert expired? #581

Closed

zalmanzhao opened this issue Nov 30, 2017 · 51 comments

@zalmanzhao


Versions

kubeadm version (use kubeadm version): 1.7.5

Environment:

  • Kubernetes version (use kubectl version): 1.7.5
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Others:

What happened?

What you expected to happen?

How to reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

@errordeveloper
Member

Duplicate of #206.

@kachkaev

kachkaev commented Aug 3, 2018

@zalmanzhao did you manage to solve this issue?

I created a kubeadm v1.9.3 cluster just over a year ago and it was working fine all this time. I went to update one deployment today and realised I was locked out of the API because the cert had expired. I can't even run kubeadm alpha phase certs apiserver, because I get failure loading apiserver certificate: the certificate has expired (my kubeadm version is currently 1.10.6 since I want to upgrade).

Adding insecure-skip-tls-verify: true to clusters[0].cluster in ~/.kube/config does not help either – I see You must be logged in to the server (Unauthorized) when trying kubectl get pods (kubernetes/kubernetes#39767).

The cluster is working, but it lives its own life until it self-destructs or until things get fixed 😅 Unfortunately, I could not find a solution for my situation in #206 and am wondering how to get out of it. The only relevant material I could dig out was a blog post called ‘How to change expired certificates in kubernetes cluster’, which looked promising at first glance. However, it did not fit in the end because my master machine did not have an /etc/kubernetes/ssl/ folder (only /etc/kubernetes/pki/) – either I have a different k8s version or I simply deleted that folder without noticing.
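For reference, assuming the default kubeadm layout under /etc/kubernetes/pki/, the expiry dates of the affected certs can be read directly from disk with openssl, even while the API is unreachable:

# print the notAfter date of the certs kubeadm manages for the API server
sudo openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -enddate
sudo openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -enddate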

@errordeveloper could you please recommend something? I'd love to fix things without kubeadm reset and payload recreation.

@davidcomeyne

@kachkaev Did you have any luck renewing the certs without resetting kubeadm?
If so, please share; I'm having the same issue here with k8s 1.7.4, and I can't seem to upgrade ($ kubeadm upgrade plan) because the error pops up again, telling me that the certificate has expired and that it cannot list the masters in my cluster:

[ERROR APIServerHealth]: the API Server is unhealthy; /healthz didn't return "ok"
[ERROR MasterNodesReady]: couldn't list masters in cluster: Get https://172.31.18.88:6443/api/v1/nodes?labelSelector=node-role.kubernetes.io%2Fmaster%3D: x509: certificate has expired or is not yet valid

@kachkaev

kachkaev commented Sep 6, 2018

Unfortunately, I gave up in the end. The solution was to create a new cluster, restore all the payload on it, switch DNS records and finally delete the original cluster 😭 At least there was no downtime because I was lucky enough to have healthy pods on the old k8s during the transition.

@davidcomeyne

Thanks @kachkaev for responding. I will nonetheless give it another try.
If I find something I will make sure to post it here...

@danroliver

danroliver commented Sep 14, 2018

If you are using a version of kubeadm prior to 1.8 (where, as I understand it, certificate rotation #206 was put in place as a beta feature), or your certs have already expired, then you will need to manually update your certs (or recreate your cluster, which it appears some people, not just @kachkaev, end up resorting to).

You will need to SSH into your master node. If you are using kubeadm >= 1.8, skip to step 2.

  1. Update Kubeadm, if needed. I was on 1.7 previously.
$ sudo curl -sSL https://dl.k8s.io/release/v1.8.15/bin/linux/amd64/kubeadm > ./kubeadm.1.8.15
$ chmod a+rx kubeadm.1.8.15
$ sudo mv /usr/bin/kubeadm /usr/bin/kubeadm.1.7
$ sudo mv kubeadm.1.8.15 /usr/bin/kubeadm
  2. Backup old apiserver, apiserver-kubelet-client, and front-proxy-client certs and keys.
$ sudo mv /etc/kubernetes/pki/apiserver.key /etc/kubernetes/pki/apiserver.key.old
$ sudo mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.crt.old
$ sudo mv /etc/kubernetes/pki/apiserver-kubelet-client.crt /etc/kubernetes/pki/apiserver-kubelet-client.crt.old
$ sudo mv /etc/kubernetes/pki/apiserver-kubelet-client.key /etc/kubernetes/pki/apiserver-kubelet-client.key.old
$ sudo mv /etc/kubernetes/pki/front-proxy-client.crt /etc/kubernetes/pki/front-proxy-client.crt.old
$ sudo mv /etc/kubernetes/pki/front-proxy-client.key /etc/kubernetes/pki/front-proxy-client.key.old
  3. Generate new apiserver, apiserver-kubelet-client, and front-proxy-client certs and keys.
$ sudo kubeadm alpha phase certs apiserver --apiserver-advertise-address <IP address of your master server>
$ sudo kubeadm alpha phase certs apiserver-kubelet-client
$ sudo kubeadm alpha phase certs front-proxy-client
  4. Backup old configuration files
$ sudo mv /etc/kubernetes/admin.conf /etc/kubernetes/admin.conf.old
$ sudo mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.old
$ sudo mv /etc/kubernetes/controller-manager.conf /etc/kubernetes/controller-manager.conf.old
$ sudo mv /etc/kubernetes/scheduler.conf /etc/kubernetes/scheduler.conf.old
  5. Generate new configuration files.

There is an important note here. If you are on AWS, you will need to explicitly pass the --node-name parameter in this request. Otherwise you will get an error like Unable to register node "ip-10-0-8-141.ec2.internal" with API server: nodes "ip-10-0-8-141.ec2.internal" is forbidden: node ip-10-0-8-141 cannot modify node ip-10-0-8-141.ec2.internal in your kubelet logs (sudo journalctl -u kubelet --all | tail), and the Master Node will report that it is Not Ready when you run kubectl get nodes.

Please be certain to replace the values passed in --apiserver-advertise-address and --node-name with the correct values for your environment.

$ sudo kubeadm alpha phase kubeconfig all --apiserver-advertise-address 10.0.8.141 --node-name ip-10-0-8-141.ec2.internal
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"

  6. Ensure that your kubectl is looking in the right place for your config files.
$ mv .kube/config .kube/config.old
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ sudo chmod 777 $HOME/.kube/config
$ export KUBECONFIG=.kube/config
  7. Reboot your master node
$ sudo /sbin/shutdown -r now
  8. Reconnect to your master node and grab your token, and verify that your Master Node is "Ready". Copy the token to your clipboard. You will need it in the next step.
$ kubectl get nodes
$ kubeadm token list

If you do not have a valid token, you can create one with:

$ kubeadm token create

The token should look something like 6dihyb.d09sbgae8ph2atjw

  9. SSH into each of the slave nodes and reconnect them to the master
$ sudo curl -sSL https://dl.k8s.io/release/v1.8.15/bin/linux/amd64/kubeadm > ./kubeadm.1.8.15
$ chmod a+rx kubeadm.1.8.15
$ sudo mv /usr/bin/kubeadm /usr/bin/kubeadm.1.7
$ sudo mv kubeadm.1.8.15 /usr/bin/kubeadm
$ sudo kubeadm join --token=<token from step 8> <ip of master node>:<port (6443 is the default)> --node-name <same node name as in step 5>

  10. Repeat step 9 for each connecting node. From the master node, you can verify that all slave nodes have connected and are ready with:
$ kubectl get nodes

Hopefully this gets you where you need to be @davidcomeyne.
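Once the control plane is back up, a quick way to confirm the API server is actually serving the regenerated certificate (assuming the default port 6443 on the master):

$ echo -n | openssl s_client -connect localhost:6443 2>/dev/null | openssl x509 -noout -dates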

@davidcomeyne

Thanks a bunch @danroliver !
I will definitely try that and post my findings here.

@ivan4th

ivan4th commented Oct 22, 2018

@danroliver Thanks! Just tried it on an old single-node cluster, so did steps up to 7. It worked.

@dmellstrom

@danroliver Worked for me. Thank you.

@davidcomeyne

Did not work for me, had to set up a new cluster. But glad it helped others!

@fovecifer

Thank you @danroliver, it works for me.
My kubeadm version is 1.8.5.

@kvchitrapu

kvchitrapu commented Nov 1, 2018

Thanks @danroliver for putting together the steps. I had to make small additions to them. My cluster is running v1.9.3 and it is in a private datacenter, disconnected from the Internet.

On the Master

  1. Prepare a kubeadm config.yml.
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
api:
  advertiseAddress: <master-ip>
kubernetesVersion: 1.9.3
  2. Backup certs and conf files
# in /etc/kubernetes
mkdir ~/conf-archive/
for f in `ls *.conf`;do mv $f ~/conf-archive/$f.old;done

# in /etc/kubernetes/pki
mkdir ~/pki-archive
for f in `ls apiserver* front-*client*`;do mv $f ~/pki-archive/$f.old;done
  3. The kubeadm commands on the master had --config config.yml, like this:
kubeadm alpha phase certs apiserver --config ./config.yml 
kubeadm alpha phase certs apiserver-kubelet-client --config ./config.yml 
kubeadm alpha phase certs front-proxy-client --config ./config.yml
kubeadm alpha phase kubeconfig all --config ./config.yml --node-name <master-node>
reboot
  4. Create token

On the minions

I had to move these files and re-join each minion:

mv /etc/kubernetes/pki/ca.crt ~/archive/
mv /etc/kubernetes/kubelet.conf ~/archive/
systemctl stop kubelet
kubeadm join --token=eeefff.55550009999b3333 --discovery-token-unsafe-skip-ca-verification <master-ip>:6443
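Back on the master, the re-joined minions should show up as Ready after a minute or so:

kubectl get nodes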

@borispf

borispf commented Nov 14, 2018

Thanks @danroliver! On my single-node cluster it was enough to follow steps 1-6 (no reboot) and then send a SIGHUP to kube-apiserver. I just found the container id with docker ps and sent the signal with docker kill -s HUP <container id>.
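In other words, something like the following (a sketch that assumes a Docker runtime and kubeadm's usual k8s_kube-apiserver_* static pod container naming):

# find the kube-apiserver container id and send it SIGHUP, as described above
docker kill -s HUP $(docker ps --format '{{.ID}} {{.Names}}' | grep k8s_kube-apiserver | awk '{print $1}')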

@BastienL

BastienL commented Dec 5, 2018

Thanks a lot @danroliver! On our single-master/multi-worker cluster, doing steps 1 to 7 was enough; we did not have to reconnect every worker node to the master (which was the most painful part).

@kcronin

kcronin commented Dec 12, 2018

Thanks for this great step-by-step, @danroliver! I'm wondering how this process might be applied to a multi-master cluster (bare metal, currently running 1.11.1), and preferably without downtime. My certs are not yet expired, but I am trying to learn how to regenerate/renew them before that happens.

@neolit123
Member

@kcronin
please take a look at this new document:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
hope that helps.

@radanliska

@danroliver: Thank you very much, it's working.

It's not necessary to reboot the servers.
It's enough to recreate the kube-system pods (apiserver, scheduler, ...) with these two commands:

systemctl restart kubelet
for i in $(docker ps | egrep 'admin|controller|scheduler|api|fron|proxy' | rev | awk '{print $1}' | rev);
do docker stop $i; done

@pmcgrath

pmcgrath commented Mar 11, 2019

I had to deal with this on a 1.13 cluster as well; in my case the certificates were about to expire, so the situation was slightly different.
I was also dealing with a single master/control-plane instance on premises, so I did not have to worry about an HA setup or AWS specifics.
I have not included the backup steps, as others have covered them above.

Since the certs had not expired, the cluster already had workloads which I wanted to keep working.
I did not have to deal with etcd certs at this time either, so I have omitted them.

So at a high level I had to

  • On the master
    • Generate new certificates on the master
    • Generate new kubeconfigs with embedded certificates
    • Generate new kubelet certificates - client and server
    • Generate a new token for the worker node kubelets
  • For each worker
    • Drain the worker first on the master
    • ssh to the worker, stop the kubelet, remove files and restart the kubelet
    • Uncordon the worker on the master
  • On master at the end
    • Delete token
# On master - See https://kubernetes.io/docs/setup/certificates/#all-certificates

# Generate the new certificates - you may have to deal with AWS - see above re extra certificate SANs
sudo kubeadm alpha certs renew apiserver
sudo kubeadm alpha certs renew apiserver-etcd-client
sudo kubeadm alpha certs renew apiserver-kubelet-client
sudo kubeadm alpha certs renew front-proxy-client

# Generate new kube-configs with embedded certificates - Again you may need extra AWS specific content - see above
sudo kubeadm alpha kubeconfig user --org system:masters --client-name kubernetes-admin  > admin.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > controller-manager.conf
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-scheduler > scheduler.conf

# chown and chmod so they match existing files
sudo chown root:root {admin,controller-manager,kubelet,scheduler}.conf
sudo chmod 600 {admin,controller-manager,kubelet,scheduler}.conf

# Move to replace existing kubeconfigs
sudo mv admin.conf /etc/kubernetes/
sudo mv controller-manager.conf /etc/kubernetes/
sudo mv kubelet.conf /etc/kubernetes/
sudo mv scheduler.conf /etc/kubernetes/

# Restart the master components
sudo kill -s SIGHUP $(pidof kube-apiserver)
sudo kill -s SIGHUP $(pidof kube-controller-manager)
sudo kill -s SIGHUP $(pidof kube-scheduler)

# Verify master component certificates - should all be 1 year in the future
# Cert from api-server
echo -n | openssl s_client -connect localhost:6443 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from controller manager
echo -n | openssl s_client -connect localhost:10257 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from scheduler
echo -n | openssl s_client -connect localhost:10259 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not

# Generate kubelet.conf
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
sudo chown root:root kubelet.conf
sudo chmod 600 kubelet.conf

# Drain
kubectl drain --ignore-daemonsets $(hostname)
# Stop kubelet
sudo systemctl stop kubelet
# Delete files
sudo rm /var/lib/kubelet/pki/*
# Copy file
sudo mv kubelet.conf /etc/kubernetes/
# Restart
sudo systemctl start kubelet
# Uncordon
kubectl uncordon $(hostname)

# Check kubelet
echo -n | openssl s_client -connect localhost:10250 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not

Let's create a new token for nodes re-joining the cluster (after kubelet restart)

# On master
sudo kubeadm token create

Now for each worker - one at a time

kubectl drain --ignore-daemonsets --delete-local-data WORKER-NODE-NAME

ssh to worker node

# Stop kubelet
sudo systemctl stop kubelet

# Delete files
sudo rm /etc/kubernetes/kubelet.conf
sudo rm /var/lib/kubelet/pki/*

# Alter the bootstrap token
new_token=TOKEN-FROM-CREATION-ON-MASTER
sudo sed -i "s/token: .*/token: $new_token/" /etc/kubernetes/bootstrap-kubelet.conf

# Start kubelet
sudo systemctl start kubelet
		
# Check kubelet certificate
echo -n | openssl s_client -connect localhost:10250 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -text -noout | grep Not
sudo openssl x509 -in /var/lib/kubelet/pki/kubelet.crt -text -noout | grep Not

Back to master and uncordon the worker

kubectl uncordon WORKER-NODE-NAME

After all workers have been updated, remove the token; it will expire in 24h anyway, but let's get rid of it

On master
sudo kubeadm token delete TOKEN-FROM-CREATION-ON-MASTER

@rodrigc

rodrigc commented Apr 6, 2019

@pmcgrath Thanks for posting those steps. I managed to follow them and renew my certificates, and get a working cluster.

@desdic

desdic commented May 27, 2019

> (quoting @danroliver's step-by-step instructions from above)

This is what I need, only for 1.14.2 .. any hints on how to?

> (quoting @pmcgrath's 1.13 walkthrough from above)

I know this issue is closed, but I have the same problem on 1.14.2. The guide gives no errors, but I cannot connect to the cluster or reissue the token (I get auth failed).

@terrywang

A k8s cluster created using kubeadm v1.9.x experienced the same issue (apiserver-kubelet-client.crt expired on 2 July) while running v1.14.1, lol.

I had to refer to 4 different sources to renew the certificates, regenerate the configuration files and bring the simple 3 node cluster back.

@danroliver gave very good and structured instructions, very close to the below guide from IBM.
Renewing Kubernetes cluster certificates, from IBM, wow! (https://www.ibm.com/support/knowledgecenter/en/SSCKRH_1.1.0/platform/t_certificate_renewal.html)

NOTE: IBM Financial Crimes Insight with Watson private is powered by k8s, never knew that.

Problems with step 3 and step 5:

Step 3 should NOT have phase in the command:

$ sudo kubeadm alpha certs renew apiserver
$ sudo kubeadm alpha certs renew apiserver-kubelet-client
$ sudo kubeadm alpha certs renew front-proxy-client

Step 5 should use the command below; kubeadm alpha does not have kubeconfig all, that is a kubeadm init phase subcommand instead:

# kubeadm init phase kubeconfig all
I0705 12:42:24.056152   32618 version.go:240] remote version is much newer: v1.15.0; falling back to: stable-1.14
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file

@anapsix

anapsix commented Jan 20, 2020

Note about tokens in K8s 1.13.x (possibly other K8s versions)
If you've ended up re-generating your CA certificate (/etc/kubernetes/pki/ca.crt), your tokens (see kubectl -n kube-system get secret | grep token) might embed the old CA and will have to be regenerated. Troubled tokens included kube-proxy-token and coredns-token in my case (among others), which caused cluster-critical services to be unable to authenticate with the K8s API.
To regenerate tokens, delete the old ones and they will be recreated.
The same goes for any services talking to the K8s API, such as PV provisioners, ingress controllers, cert-manager, etc.
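A sketch of what that looks like in practice (the exact secret name below is only an example and will differ per cluster; the pod label assumes kubeadm's default k8s-app=kube-proxy):

# list the service account token secrets that may still embed the old CA
kubectl -n kube-system get secret | grep token
# delete a stale one; the token controller recreates it against the new CA
kubectl -n kube-system delete secret kube-proxy-token-xxxxx
# restart the pods that mount it so they pick up the recreated token
kubectl -n kube-system delete pod -l k8s-app=kube-proxy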

@shortsteps

> (quoting @kcronin's earlier question about multi-master clusters)

Hi @kcronin, how did you solve this with a multi-master config? I don't know how to proceed with --apiserver-advertise-address, as I have 3 IPs and not only one.

Thanks

@SuleimanWA

@pmcgrath In case I have 3 masters, should I repeat the steps on each master, or what is the case?

@anapsix

anapsix commented Feb 27, 2020

@SuleimanWA, you can copy admin.conf over, as well as the CA file if the CA was regenerated.
For everything else, you should repeat the steps to regenerate the certs (for etcd, kubelet, scheduler, etc.) on every master.

@realshuting

realshuting commented Mar 3, 2020

@anapsix
I'm running a 1.13.x cluster, and apiserver is reporting Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid] after I renewed the certs by running kubeadm alpha certs renew all.

To regenerate tokens, delete old ones, and they will be recreated.

Which token are you referring to in this case? Is it the one generated by kubeadm, and how can I delete the token?

-----UPDATE-----
I figured out it's the secret itself. In my case the kube-controller was not up so the secret was not auto-generated.

@yanpengfei

On newer versions, use:

kubeadm alpha certs renew all

@leh327

leh327 commented Nov 22, 2020

When the first master node's kubelet is down (systemctl stop kubelet), the other master nodes can't contact the CA on the first master node. This results in the following message until the kubelet on the original master node is brought back online:

kubectl get nodes
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get nodes)

Is there a way to have the CA role transfer to the other master nodes while the kubelet on the original CA node is down?

@sumitKash

> (quoting @realshuting's comment above)

Hi, I have done this task, but not on version 1.13. May I ask a few things, since you have done this already?
So basically I will be doing:
kubeadm alpha certs renew all (which updates the control plane certs under the pki/ folder on the masters).
kubeadm init phase kubeconfig to update the kubeconfig files (on master and workers).
Restart kubelet on all nodes.

Do I still need to create a token and run join on the worker nodes? If possible, can you share the steps you performed?

@lisenet

lisenet commented Apr 8, 2021

@pmcgrath thanks a bunch for your comment, I used the instructions to update certificates on my Kubernetes 1.13 cluster.

@usamacheema786

The simplest way to update your k8s certs:

kubeadm alpha certs check-expiration


kubeadm alpha certs renew all
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
systemctl daemon-reload&&systemctl restart kubelet

@neolit123
Member

neolit123 commented Jun 3, 2021

sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf

you might also want to symlink the cert / key to these files if the kubelet client cert is enabled (it is by default):

client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
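To confirm which client certificate the kubelet is currently using and when it expires (this is the same file @pmcgrath checks above):

sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -enddate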

@gemfield

For k8s 1.15 ~ 1.18, this may be helpful: https://zhuanlan.zhihu.com/p/382605009

@crrazyman

For Kubernetes v1.14 I find this procedure proposed by @desdic the most helpful:

$ cd /etc/kubernetes/pki/
$ mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} ~/
$ kubeadm init phase certs all --apiserver-advertise-address <IP>
  • backup and re-generate all kubeconfig files:
$ cd /etc/kubernetes/
$ mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} ~/
$ kubeadm init phase kubeconfig all
$ reboot
  • copy new admin.conf:
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

Hello,

After following this ^ everything is OK (kubectl get nodes shows both nodes Ready), BUT a lot of pods (in kube-system and in all other namespaces) are stuck in the ContainerCreating state.
Also, in kube-system:

root@kube-master:~# kubectl -n kube-system get pods
NAME                                                               READY   STATUS              RESTARTS   AGE
calico-kube-controllers-6fc9b4f7d9-kmt9q                           0/1     Running             1          26m
calico-node-hb6rm                                                  1/1     Running             5          380d
calico-node-vtt6l                                                  1/1     Running             11         384d
coredns-74c9d4d795-l6x9h                                           0/1     ContainerCreating   0          5m9s
coredns-74c9d4d795-m6lgf                                           0/1     ContainerCreating   0          5m9s
dns-autoscaler-576b576b74-stnrz                                    0/1     ContainerCreating   0          5m9s
kube-apiserver-kube-master.domain.com                               1/1     Running             1654       84m
kube-controller-manager-kube-master.domain.com                      1/1     Running             77         385d
kube-proxy-7bgjn                                                   1/1     Running             6          380d
kube-proxy-pq4wr                                                   1/1     Running             5          380d
kube-scheduler-kube-master.domain.com                               1/1     Running             72         385d
kubernetes-dashboard-7c547b4c64-qhk2k                              0/1     Error               0          380d
nginx-proxy-kube-node01.domain.com                                  1/1     Running             2704       380d
nodelocaldns-6c5v2                                                 1/1     Running             5          380d
nodelocaldns-nmtg6                                                 1/1     Running             11         384d

The thing is that now I think nobody can talk to the kube-apiserver:

root@kube-master:~# kubectl -n kube-system logs kube-apiserver-kube-master.domain.com
[....]
I1014 11:51:19.817069       1 log.go:172] http: TLS handshake error from 10.18.74.25:59948: remote error: tls: bad certificate
I1014 11:51:19.819972       1 log.go:172] http: TLS handshake error from 10.18.74.25:59952: remote error: tls: bad certificate
I1014 11:51:20.733394       1 log.go:172] http: TLS handshake error from 127.0.0.1:43966: remote error: tls: bad certificate
I1014 11:51:20.734734       1 log.go:172] http: TLS handshake error from 127.0.0.1:43968: remote error: tls: bad certificate
I1014 11:51:20.823670       1 log.go:172] http: TLS handshake error from 10.18.74.25:59978: remote error: tls: bad certificate
I1014 11:51:20.823905       1 log.go:172] http: TLS handshake error from 10.18.74.25:59974: remote error: tls: bad certificate

All the pods that are stuck in ContainerCreating show this in their description:

  Normal   SandboxChanged          10m (x13 over 16m)    kubelet, kube-node01.domain.com  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  92s (x18 over 11m)    kubelet, kube-node01.domain.com  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f0e95a2fec2492c63c6756e3d71be3ee911a4c872a83f6f20c98cfc261045e7f" network for pod "www-mongo-0": NetworkPlugin cni failed to set up pod "www-mongo-0_default" network: Get https://[10.233.0.1]:443/api/v1/namespaces/default: dial tcp 10.233.0.1:443: i/o timeout

I have a cluster of 2 nodes:

  • kube-master
  • kube-node01 - 10.18.74.25

@neolit123
Member

Normal SandboxChanged 10m (x13 over 16m) kubelet, kube-node01.domain.com Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 92s (x18 over 11m) kubelet, kube-node01.domain.com (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f0e95a2fec2492c63c6756e3d71be3ee911a4c872a83f6f20c98cfc261045e7f" network for pod "www-mongo-0": NetworkPlugin cni failed to set up pod "www-mongo-0_default" network: Get https://[10.233.0.1]:443/api/v1/namespaces/default: dial tcp 10.233.0.1:443: i/o timeout

that seems like a CNI plugin problem.
you could try removing Calico with kubectl delete -f .... and adding it again.

@crrazyman

This cluster is made with kubespray, so I cannot delete calico and add it again. Also, I don't think this is a problem with the CNI. Why does kube-apiserver log http: TLS handshake error from 10.18.74.25:59948: remote error: tls: bad certificate?
Again, the cluster was working perfectly until I renewed the certificates.
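One diagnostic worth trying here, given @anapsix's note above about service account tokens minted against an old CA (a sketch; the secret name below is only an example, pick one that actually exists in your cluster):

# fingerprint of the CA the control plane is using now
sudo openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -fingerprint
# fingerprint of the CA embedded in a service account token secret - they should match
kubectl -n kube-system get secret calico-node-token-xxxxx -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -fingerprint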

@kruserr

kruserr commented May 4, 2022

For anyone who stumbles upon this in the future and is running a newer version of Kubernetes (>1.17), this is probably the simplest way to renew your certs.

The following renews all certs, restarts kubelet, takes a backup of the old admin config and applies the new admin config:

kubeadm certs renew all
systemctl restart kubelet
cp /root/.kube/config /root/.kube/.old-$(date --iso)-config
cp /etc/kubernetes/admin.conf /root/.kube/config
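On the versions where kubeadm certs renew all is available, the matching check-expiration subcommand can be used to confirm the new expiry dates:

kubeadm certs check-expiration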

@amiyaranjansahoo

@danroliver,
Thanks danroliver for the detailed instructions. They worked fine for a single-master k8s cluster (1 master + 3 workers).
However, I have a multi-master k8s cluster (3 masters + 5 workers), so do you think I should follow the same approach to renew the certificates, or would any additional steps be required?
FYI, I am on v1.12.10.
Thanks in advance.

@titaneric

For this case, you may still need to ssh into each of the 3 master nodes and update the certificates with the commands provided, because each master node has its own API server.

@amiyaranjansahoo

Thank you @titaneric, understood, I need to recreate/renew the certificates on each master node separately.

What about step 4 and step 5?

Step 4 - moving the old files below:

/etc/kubernetes/admin.conf
/etc/kubernetes/kubelet.conf
/etc/kubernetes/controller-manager.conf
/etc/kubernetes/scheduler.conf

Step 5 - generating admin.conf, kubelet.conf, controller-manager.conf and scheduler.conf

using the command below:
sudo kubeadm alpha phase kubeconfig all --apiserver-advertise-address A.B.C.D

I ask because I can see that only the cksum value of admin.conf is the same across all master nodes, while the cksums of the rest of the files (kubelet.conf, controller-manager.conf and scheduler.conf) differ across the master nodes.

@Akash3221

> (quoting @kruserr's commands above)

Hi @kruserr,
After updating the certificates successfully using the above commands, when I delete a namespace it gets stuck in the Terminating state. Does anybody have clarity on this?

@kruserr

kruserr commented May 17, 2023

> (quoting @Akash3221's question above about a namespace stuck in Terminating after renewing the certs)

Can't say, have not seen that before.
Do you see anything in the events?
