
Forcibly renewing apiserver.crt, admin.conf, etc. certs; along with kubelet PEM (and restricting lifetime/duration) #1826

Closed
IAXES opened this issue Oct 8, 2019 · 18 comments
Labels
area/security, kind/support

Comments

@IAXES

IAXES commented Oct 8, 2019

Is this a request for help?

Yes.

What keywords did you search in kubeadm issues before filing this one?

  • Certificates
  • Cert rotation
  • PKI
  • kubelet
  • kubeadm

Is this a BUG REPORT or FEATURE REQUEST?

  • Bug report: unexpected functionality.

Versions

1.15.4-00

Environment:

  • Kubernetes version: 1.15.4-00.
  • Cloud provider or hardware configuration: kubeadm on an Intel x86_64 box.
  • OS: Ubuntu Server 18.04 LTS x86_64.
  • Kernel: 5.0.0-29-generic.

What happened?

I configured the kube-controller-manager static pod to use a runtime flag that forces signed certificates to expire every 15 minutes instead of every 365 days.

sudo cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep -B20 -A5 duration

spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --use-service-account-credentials=true
    - --experimental-cluster-signing-duration=0h15m0s
    image: k8s.gcr.io/kube-controller-manager:v1.15.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
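
To confirm the flag actually took effect after the kubelet re-created the static pod, I check the running pod spec (an illustrative check, assuming the usual kubeadm component=kube-controller-manager label and admin kubectl access):
kubectl -n kube-system get pods -l component=kube-controller-manager -o yaml | grep signing-duration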

I then re-create the kubelet PEM cert on the master and all nodes (or I can wait for the existing certs to expire; the PEM rotation/updates work either way):

# Optional: this will happen automatically on the master and nodes (eventually).
# Kubelet client cert (run on master and all nodes) so kubelet can talk to the API server.
# Working.
sudo rm -rf "/var/lib/kubelet/pki-backup"
sudo mv "/var/lib/kubelet/pki" "/var/lib/kubelet/pki-backup"
sudo systemctl restart kubelet
# Don't skip this delay. The new PEM file takes a while to appear.
sleep 30

I can confirm that the PEM cert has been updated. Polling with the following command on the master and nodes shows that it is renewed periodically:
sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -text | grep -A2 "Validity"

Validity
            Not Before: Oct  8 17:06:00 2019 GMT
            Not After : Oct  8 17:21:00 2019 GMT
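
To watch the rotation as it happens, a convenience one-liner (not required for the rotation itself) is:
watch -n 60 'sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -enddate'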

However, other certs (e.g. API server cert, and basically all the certs in /etc/kubernetes/pki on the master) have two issues:

They appear to remain unchanged. This is confirmed via:
sudo openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A2 "Validity"

        Validity
            Not Before: Oct  7 21:08:23 2019 GMT
            Not After : Oct  7 14:56:02 2020 GMT

...and:
sudo kubeadm alpha certs check-expiration

CERTIFICATE                EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
admin.conf                 Oct 07, 2020 14:56 UTC   364d            no
apiserver                  Oct 07, 2020 14:56 UTC   364d            no
apiserver-etcd-client      Oct 07, 2020 14:56 UTC   364d            no
apiserver-kubelet-client   Oct 07, 2020 14:56 UTC   364d            no
controller-manager.conf    Oct 07, 2020 14:56 UTC   364d            no
etcd-healthcheck-client    Oct 07, 2020 14:56 UTC   364d            no
etcd-peer                  Oct 07, 2020 14:56 UTC   364d            no
etcd-server                Oct 07, 2020 14:56 UTC   364d            no
front-proxy-client         Oct 07, 2020 14:56 UTC   364d            no
scheduler.conf             Oct 07, 2020 14:56 UTC   364d            no

Second, config files (e.g. /etc/kubernetes/admin.conf) appear to refer to out-of-date certs. This is confirmed via:
sudo cat /etc/kubernetes/admin.conf | grep "certificate-authority-data" | cut -d ':' -f2- | sed "s/^\s\+//g" | base64 -d | openssl x509 -noout -text -in - | grep -A2 "Validity"

However, the certs don't get (forcibly) updated on their own, so I execute the following:
sudo kubeadm alpha certs renew all

certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healtcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed

Now, checking again:
sudo kubeadm alpha certs check-expiration

CERTIFICATE                EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
admin.conf                 Oct 07, 2020 17:10 UTC   364d            no
apiserver                  Oct 07, 2020 17:10 UTC   364d            no
apiserver-etcd-client      Oct 07, 2020 17:10 UTC   364d            no
apiserver-kubelet-client   Oct 07, 2020 17:10 UTC   364d            no
controller-manager.conf    Oct 07, 2020 17:10 UTC   364d            no
etcd-healthcheck-client    Oct 07, 2020 17:10 UTC   364d            no
etcd-peer                  Oct 07, 2020 17:10 UTC   364d            no
etcd-server                Oct 07, 2020 17:10 UTC   364d            no
front-proxy-client         Oct 07, 2020 17:10 UTC   364d            no
scheduler.conf             Oct 07, 2020 17:10 UTC   364d            no

The certs are updated, but the duration/lifetime is still a year rather than 15 minutes. Additionally, it appears that /etc/kubernetes/admin.conf is not updated at all:
sudo cat /etc/kubernetes/admin.conf | grep "certificate-authority-data" | cut -d ':' -f2- | sed "s/^\s\+//g" | base64 -d | openssl x509 -noout -text -in - | grep -A2 "Validity"

        Validity
            Not Before: Oct  7 21:08:23 2019 GMT
            Not After : Oct  4 21:08:23 2029 GMT

What you expected to happen?

  1. All the /etc/kubernetes/pki/ certs would be updated (good so far), and would have a 15 minute lifetime (didn't happen; is the --experimental-cluster-signing-duration expected to impact the config files and certs, or just the PEM file used by kubelet?).
  2. The various config files (including /etc/kubernetes/admin.conf) would be (forcibly) updated to reflect new certificates being generated.

How to reproduce it (as minimally and precisely as possible)?

CLI examples provided above.

Anything else we need to know?

I'm attempting to implement a recurring job that will execute either once every 2 hours or once every week, for a set of isolated kubeadm-based bare-metal and VM-based clusters on air-gapped private networks used by teams of students. For some labs I need to be able to forcibly update all the certificates periodically (i.e. much more frequently than once per year). Since these machines have no internet access, relying on the upgrade option to automatically handle cert renewal is not practical/viable. However, I'm now concerned that both manually and automatically driven certificate updates might not update the config files and the /etc/kubernetes/pki/ certs, and I would like to be able to forcibly update them and reduce their duration/lifetime (as with the kubelet PEM files) to confirm all certs are updated as expected.
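
For illustration, the kind of job I have in mind is nothing more than a root crontab entry on each master (hypothetical schedule and log path; a restart of the control-plane components would presumably still be needed afterwards):
0 3 * * 0 /usr/bin/kubeadm alpha certs renew all >> /var/log/kubeadm-cert-renew.log 2>&1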

Thank you for your time and assistance.

References:

  1. "Improvement for k8s.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/ #15292", <https://github.com/kubernetes/website/issues/15292>, last accessed 2019-10-08.
  2. "kubeadm alpha phase certs renew all should also update certs in KubeConfig files kubeadm alpha phase certs renew all should also update certs in KubeConfig files #1361", <https://github.com/kubernetes/kubeadm/issues/1361>, last accessed 2019-10-08.
  3. "Kubelet fails to authenticate to apiserver due to expired certificate #65991", <https://github.com/kubernetes/kubernetes/issues/65991>, last accessed 2019-10-08.
@neolit123
Member

All the /etc/kubernetes/pki/ certs would be updated (good so far), and would have a 15 minute lifetime (didn't happen; is the --experimental-cluster-signing-duration expected to impact the config files and certs, or just the PEM file used by kubelet?).

the kubeadm certificate lifespan is not linked to --experimental-cluster-signing-duration=0h15m0s, which is the duration the controller manager uses when signing certificates. the kubeadm certificate lifespan is hardcoded to 1 year in the kubeadm binary itself.

you can still rotate the kubeadm certs on any period of time you prefer, but the value of 1 year cannot be changed (by design).

i do not recommend using --experimental-cluster-signing-duration
see:
kubernetes/kubernetes#65991

once this period expires you need to rotate the cluster CA, which isn't an easy task, especially if you have workloads.

The various config files (including /etc/kubernetes/admin.conf) would be (forcibly) updated to reflect new certificates being generated.

the only way for this to happen is for the cluster operator (kubeadm user) to rotate the kubeadm certificates using sudo kubeadm alpha certs renew all.

I'm attempting to implement a recurring job that will execute either once every 2 hours or once every week, for a set of isolated kubeadm-based bare-metal and VM-based clusters on air-gapped private networks used by teams of students. For some labs I need to be able to forcibly update all the certificates periodically (i.e. much more frequently than once per year). Since these machines have no internet access, relying on the upgrade option to automatically handle cert renewal is not practical/viable. However, I'm now concerned that both manually and automatically driven certificate updates might not update the config files and the /etc/kubernetes/pki/ certs, and I would like to be able to forcibly update them and reduce their duration/lifetime (as with the kubelet PEM files) to confirm all certs are updated as expected.

as pointed out, you can rotate the kubeadm-originated certificates on any period of time using the renew command. you don't have to upgrade: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#automatic-certificate-renewal

this is by design and not a kubeadm bug.
/area security
/triage support
/close

please feel free to continue the discussion if you want.

@k8s-ci-robot
Contributor

@neolit123: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the area/security and kind/support labels on Oct 8, 2019
@IAXES
Author

IAXES commented Oct 8, 2019

@neolit123 Thank you for the clarification and help.

The only outstanding question that comes to mind is why the base64-encoded x509 data in /etc/kubernetes/admin.conf wasn't updated despite the /etc/kubernetes/pki/*.crt files being updated. Is there a way to manually trigger an immediate update of the .conf files too?

@neolit123
Member

the .conf files should be updated by sudo kubeadm alpha certs renew <file>.conf or all. if you are seeing a bug please log a separate issue.
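
for example, to renew only the client cert embedded in the admin kubeconfig:
sudo kubeadm alpha certs renew admin.conf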

@IAXES
Author

IAXES commented Oct 8, 2019

@neolit123 Lastly, am I correct in assuming that, at least for 1.15.4-00 and onward, all certs (i.e. /etc/kubernetes/pki/, /var/lib/kubelet/pki/, etc.), on the master and nodes, are automatically rotated (i.e. no manual intervention required), or do I still need a K8S job that periodically rotates the certs (i.e. via sudo kubeadm alpha certs renew all)?

@IAXES
Author

IAXES commented Oct 8, 2019

the .conf files should be updated by sudo kubeadm alpha certs renew <file>.conf or all. if you are seeing a bug please log a separate issue.

I'll investigate that now and follow up here (and with another issue/ticket if there's a problem).

@neolit123
Member

1.15 is still in the support skew, but maybe we merged fixes for the .conf files in 1.16. i cannot recall.

I still need a K8S job that periodically rotates the certs (i.e. via sudo kubeadm alpha certs renew all).

kubeadm certs are only auto-rotated on upgrade, but that's still a manual trigger, so either way you are going to need a job to run periodically.

@IAXES
Author

IAXES commented Oct 8, 2019

1.15 is still in the support skew, but maybe we merged fixes for the .conf files in 1.16. i cannot recall.

I still need a K8S job that periodically rotates the certs (i.e. via sudo kubeadm alpha certs renew all).

kubeadm certs are only auto-rotated on upgrade, but that's still a manual trigger, so either way you are going to need a job to run periodically.

Excellent. Thanks!

@IAXES
Author

IAXES commented Oct 8, 2019

@neolit123 I stand corrected. client-certificate-data and client-key-data were being updated correctly, whereas certificate-authority-data remains unchanged when renewing certs (expected). My earlier observation was due to a local script bug (now fixed). All is well.
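
For reference, the corrected check I'm using now (just my local one-liner) inspects the embedded client certificate rather than the CA data:
sudo grep "client-certificate-data" /etc/kubernetes/admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -text | grep -A2 "Validity"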

@neolit123
Member

neolit123 commented Oct 8, 2019

great. thanks for confirming.

@IAXES
Author

IAXES commented Oct 8, 2019

@neolit123 Another question on the topic: do the services that use these certs (i.e. in /etc/kubernetes/pki) need to be restarted in order to be aware of the update and use the new files? If so, is there a "best" way to ensure all services using the certs are properly/fully restarted?

Thank you.

@neolit123
Member

the certificate rotation is important on control-plane nodes.
so you can just reboot the control-plane nodes, as most users online tend to suggest.

an alternative approach would be:

# temp move all static pod manifests
mv /etc/kubernetes/manifests/ /etc/kubernetes/manifests-backup
sleep 20 # enough time for the kubelet to find there are no static pods
# bring them back
mv /etc/kubernetes/manifests-backup/ /etc/kubernetes/manifests
# verify everything comes back after a while
watch kubectl get po -A

@IAXES
Author

IAXES commented Oct 9, 2019

@neolit123 Last question on the topic: would it be reasonable to expect kubeadm to work normally (more or less) if I did a source dive and changed the /etc/kubernetes/pki/ certs to expire after 30 minutes? I'd like to set up an automated regression/worst-case test to break my test setups.

Thank you.

@neolit123
Member

the constant you want to change is here:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/constants/constants.go#L48
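
for illustration, assuming the constant in that file is still declared as CertificateValidity = time.Hour * 24 * 365, a quick source tweak before rebuilding kubeadm could look like this (paths relative to a kubernetes/kubernetes checkout):
grep -n "CertificateValidity =" cmd/kubeadm/app/constants/constants.go
sed -i 's|CertificateValidity = time.Hour \* 24 \* 365|CertificateValidity = time.Minute * 30|' cmd/kubeadm/app/constants/constants.go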

@IAXES
Author

IAXES commented Jan 20, 2020

@neolit123

the certificate rotation is important on control-plane nodes.
so you can just reboot the control-plane nodes, as most users online tend to suggest.

an alternative approach would be:

# temp move all static pod manifests
mv /etc/kubernetes/manifests/ /etc/kubernetes/manifests-backup
sleep 20 # enough time for the kubelet to find there are no static pods
# bring them back
mv /etc/kubernetes/manifests-backup/ /etc/kubernetes/manifests
# verify everything comes back after a while
watch kubectl get po -A

Good day,

When the static pods are brought down in this manner (i.e. to rotate the certs), does it necessarily take out other pods, deployments, services, etc. in the rest of the cluster, assuming a single-master, multi-node cluster (i.e. does it terminate and drain all running K8S pods)? Or does the cluster just keep churning along until the static pods come back up on the master (provided health checks or other conditions don't take out the non-static pods)? I can test this out myself, but I need to confirm the intended/documented way kubeadm handles this.

Lastly, would it be practical/viable to have kubeadm support rotating the certs without requiring a restart? If so, I could go ahead and raise a ticket/request on the topic.

Thank you.

@neolit123
Member

neolit123 commented Jan 20, 2020

When the static pods are brought down in this manner (i.e. to rotate the certs), does it necessarily take out other pods, deployments, services, etc. in the rest of the cluster, assuming a single-master, multi-node cluster (i.e. does it terminate and drain all running K8S pods)? Or does the cluster just keep churning along until the static pods come back up on the master (provided health checks or other conditions don't take out the non-static pods)?

restarting the control-plane component static pods should not affect the rest of the pods and services in the cluster, but there could be some minimal downtime until the control plane is fully restarted.

that is why it is recommended to do upgrades and/or certificate rotation during a maintenance window.

I can test this out myself, but I need to confirm the intended/documented way kubeadm handles this.

our certificate rotation documentation is here:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/

we even lack the restart recommendation in the "Manual certificate renewal" section.
that is because, given that static pods mount the certificates from the host, rotating the certificates on the host (using the "renew" command) means that the static pods see the rotated files. but you can still perform a restart to make sure the component responds to the new cert on your demand.

there are some other details too:

  • *ca.* and sa.* are not rotated!
  • the kubelet.conf is not rotated but it will auto rotate if you restart the kubelet service (systemctl restart kubelet)

for the kubelet.conf, see this note about it in the above link:

On nodes created with kubeadm init, prior to kubeadm version 1.17, there is a bug where you manually have to modify the contents of kubelet.conf. After kubeadm init finishes, you should update kubelet.conf to point to the rotated kubelet client certificates, by replacing client-certificate-data and client-key-data with...
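
a rough sketch of that manual fix, assuming the standard paths from the linked page (verify them on your nodes first):
sudo sed -i \
  -e 's|client-certificate-data:.*|client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem|' \
  -e 's|client-key-data:.*|client-key: /var/lib/kubelet/pki/kubelet-client-current.pem|' \
  /etc/kubernetes/kubelet.conf
sudo systemctl restart kubelet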

@IAXES
Author

IAXES commented Oct 8, 2020

@neolit123 Good day. I raised a related topic to this one. If time permits, could I ping you for help/feedback this week? Thank you.

@neolit123
Member

hi, i will respond to the new ticket.
