Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubeadm: fix certs renewal during upgrade #76862

Merged

Conversation

@fabriziopandini
Copy link
Member

commented Apr 20, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
as of today, kubeadm does not implement a consistent approach for certs renewal during upgrades:

  • certificates signed by etcd-ca are renewed every upgrade
  • apiserver certificate is upgraded only after 180 days after creation (and in post upgrade, so probably not picked up by the api server :-(
  • apiserver-kubelet-client and front-proxy-client certificate are not renewed at all

This PR fixes this by implementing a consistent approach by renewing all the certificates used by one component before upgrading the component itself.
If the component e.g. etcd does not change version during upgrades, related certificates are not updated.

Certificate renewal during kubeadm upgrade is skipped in case of external-ca (because kubeadm can't do certificate renewal without the ca key)

Which issue(s) this PR fixes:
Rif kubernetes/kubeadm#1361

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

kubeadm: kubeadm upgrade now renews all the certificates used by one component before upgrading the component itself, with the exception of certificates signed by external CAs. User can eventually opt-out from certificate renewal during upgrades by setting the new flag --certificate-renewal to false.

/sig cluster-lifecycle
/area kubeadm
/priority important-soon
@kubernetes/sig-cluster-lifecycle-pr-reviews

@neolit123
Copy link
Member

left a comment

/lgtm
/hold
for more eyes on the change.

If the component e.g. etcd does not change version during upgrades, related certificates are not updated.

i guess this is the best we can do for upgrade.
in my opinion we need to make sure the Upgrade output is clear about what we are doing.
e.g.

  • Upgrading componentX
  • [1] Renewing the certificate Y for componentX...
  • [1] Renewing the certificate Y for componentX...

i can't find any Printf() for [1], but i might have missed them.

@fabriziopandini fabriziopandini force-pushed the fabriziopandini:fix-upgrade-certs-renew branch from 60010b6 to 655a3b9 Apr 22, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Apr 22, 2019

@fabriziopandini

This comment has been minimized.

Copy link
Member Author

commented Apr 22, 2019

@neolit123 added Printf() for [1]

@neolit123

This comment has been minimized.

Copy link
Member

commented Apr 22, 2019

/retest

@mauilion

This comment has been minimized.

Copy link
Contributor

commented Apr 23, 2019

In the name of safety we should

  • validate that the old and new sans are the same.
    If they are not recommend that the user make use of the phases to adjust the san name or check if the kubeadm.conf has been provided. I am concerned that the user will clobber certs in the upgrade and not realize it.
  • validate that the certs being rotated are signed by the ca we picked up.
    I am concerned that the user is using an external ca via the csr method. In this case there may be a ca.crt and key available but it was not the one that signed the apiserver. Also if the user is replacing the CA it should probably not be done via this upgrade method.
  • Check the age of the certs. or make this an optional thing.
    If the certs were recently rotated no reason to rotate them again. Perhaps the user used a phase to renew the certs prior to upgrade.
@neolit123

This comment has been minimized.

Copy link
Member

commented Apr 25, 2019

thanks @mauilion
adding some comments to the comments from @mauilion for the sake of moving the decision forward:

validate that the old and new sans are the same.
If they are not recommend that the user make use of the phases to adjust the san name or check if the kubeadm.conf has been provided. I am concerned that the user will clobber certs in the upgrade and not realize it.

this is a valid concern.
i think renew should stop an accumulate a list of ERRORS for the certs the change SANs.
we do have a -f flag for apply already which can be utilized here.

if -f is provided such certs SANs can be overwritten.

validate that the certs being rotated are signed by the ca we picked up.
I am concerned that the user is using an external ca via the csr method. In this case there may be a ca.crt and key available but it was not the one that signed the apiserver. Also if the user is replacing the CA it should probably not be done via this upgrade method.

i think the PR covers this case?
external CA certs should be skipped, we just need verbosity in the output.

Check the age of the certs. or make this an optional thing.
If the certs were recently rotated no reason to rotate them again. Perhaps the user used a phase to renew the certs prior to upgrade.

my vote would be to print the age using the proposed ways using plan and/or e.g kubeadm certs status.
but always renew during upgrade unless there is a --experimental-skip-certs-renew.

this flag can be removed once we have phases for upgrades.

@fabriziopandini

This comment has been minimized.

Copy link
Member Author

commented Apr 26, 2019

@mauilion @neolit123 @timothysc

validate that the old and new sans are the same.

This is currently satisfied because renewal reads the current certificates as the authoritative source for certificates attributes
Nb. as discussed in kubeadm office hours the 24th of April, this has downsides as well. see kubernetes/kubeadm#1540 for tracking

validate that the certs being rotated are signed by the ca we picked up.
... or make this an optional thing.

kubeadm already detect external CA (when CA key are not provided) and skips certificate renewal
Additionally, I added a new commit to this PR, making certificate renewal optional so the user can opt out from automatic renewal.

Check the age of the certs.

during the discussion, it was decided to always renew certs; additionally, in following PRs, I will provide also a utility for checking certificate age (and possibly integrate this in the kubeadm upgrade output)
@mauilion please let me know if there are counter-arguments

cmd/kubeadm/app/cmd/upgrade/apply.go Outdated Show resolved Hide resolved
cmd/kubeadm/app/cmd/upgrade/node.go Outdated Show resolved Hide resolved

@fabriziopandini fabriziopandini force-pushed the fabriziopandini:fix-upgrade-certs-renew branch from 70b0404 to 3fcf9b7 Apr 27, 2019

@fabriziopandini fabriziopandini force-pushed the fabriziopandini:fix-upgrade-certs-renew branch from 3fcf9b7 to 137137c Apr 27, 2019

@fabriziopandini

This comment has been minimized.

Copy link
Member Author

commented Apr 27, 2019

@neolit123 comments addressed
PR rebased and commits squashed too

@neolit123
Copy link
Member

left a comment

thanks @fabriziopandini
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm label Apr 27, 2019

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Apr 27, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabriziopandini, neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fabriziopandini

This comment has been minimized.

Copy link
Member Author

commented Apr 27, 2019

@neolit123 is it possible to unhold this one now?

@neolit123

This comment has been minimized.

Copy link
Member

commented Apr 27, 2019

/hold cancel

@k8s-ci-robot k8s-ci-robot merged commit c88b7cd into kubernetes:master Apr 27, 2019

20 checks passed

cla/linuxfoundation fabriziopandini authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details
@mauilion

This comment has been minimized.

Copy link
Contributor

commented Apr 27, 2019

Hi @fabriziopandini
I don't think this case is addressed cleanly by the PR.

validate that the certs being rotated are signed by the ca we picked up.
I am concerned that the user is using an external ca via the csr method. In this case there may be a ca.crt and key available but it was not the one that signed the apiserver. Also if the user is replacing the CA it should probably not be done via this upgrade method.

I think that we should validate that the internal or external CA is the CA that signed the existing certs before renewing them.

There will be cases where someone has minted the certs for public facing apis and hasn't figured a way to configure kubeadm such that these specific certs are not clobbered.

I'd like for us to explicitly only renew certs if the CA that we have is the one that signed the old cert.

On this case:

Check the age of the certs. or make this an optional thing.
If the certs were recently rotated no reason to rotate them again. Perhaps the user used a phase to renew the certs prior to upgrade.

My concern is that we have more high value and still very valid certs lying around on disk. If the cert is not set to expire soon and we back it up we should at least inform the user that there are high value tls assets that are still active. With the current TLS implementation there is no Certificate Revocation possible. So any valid cert will continue to extend trust until it is expired.

@mauilion

This comment has been minimized.

Copy link
Contributor

commented Apr 27, 2019

woo missed the cut off :)

@fabriziopandini

This comment has been minimized.

Copy link
Member Author

commented Apr 28, 2019

@mauilion don't worry
I will try to follow up and address your concern in separated PRs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.