Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kubelet server certificate bootstrap and rotation #602

Conversation

jcbsmpsn
Copy link

@jcbsmpsn jcbsmpsn commented May 5, 2017

No description provided.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 5, 2017
@ericchiang
Copy link
Contributor

@kubernetes/sig-auth-proposals @kubernetes/sig-node-proposals

@k8s-ci-robot
Copy link
Contributor

@ericchiang: These labels do not exist in this repository: sig/auth, sig/node.

In response to this:

@kubernetes/sig-auth-proposals @kubernetes/sig-node-proposals

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Copy link
Contributor

@ericchiang ericchiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to cover kubernetes/enhancements#266 in this proposal too since they'll be using a lot of the same mechanisms and refactors?


### As Expiration Approaches

Rotating a Kubelet server certificate will work by generating a new private key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason it's regenerating a new private key rather then reusing the old one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally, you want young keys... the older a key, the longer the exposure to factoring. if you have an auto-renewal system, using a new key each time is good.

rotation
- When cert/key files are specified in kubeconfig, or with the
`--tls-cert-file` and `--tls-private-key-file`:
- These values be used only if there are no updated cert/key pairs due
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having trouble parsing this section. Does this imply that the kubelet will try to request new serving certs even if you've configured them through flags?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect either an explicitly specified cert or an auto-rotated cert, not both

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were pre-existing flags and I'm trying to find a way to nicely incorporate their behavior.

The way I see this, section to to use this cert/key pair if it is there, and be able to move on to rotation when the time is right.

I could do as Jordan suggests, but it makes it harder to transition an existing cluster. If someone has a cluster, with this cert/key specified, then these are the valid authentication mechanisms for connecting to the API server in order to do a rotation. However, once there is a rotation, that cert/key pair will be newer, and these parameters will be ignored. Analogous to how bootstrapping works, a cert/key pair specified in the kube config file is valid, as long as there is no more up to date cert/key pair.

certificates.
1. Have the CertificateManager repeat the CSR process as certificate expiration
approaches.
- New certificates will be requested when the configured duration threshold
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually configurable through a command line flag or something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duration is chosen by the Certificate Request Signing API. There is currently no flag for changing the duration that will be assigned, but I've been thinking of adding one to facilitate testing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now a flag (--experimental-cluster-signing-duration) on the API server to set the certificate duration for signed certificates.

kubernetes/kubernetes#46058

pair.
- Delete kubelet-server-current.pem
- Move kubelet-server-updated.pem to kubelet-server-current.pem
1. Currently the Certificate Request Signing API is hard coded to issue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Certificate Request Signing API/Certificate Request Signing controller/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this doesn't seem like a step in the renewal process. Maybe a "future work" section that could include testing and approval?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you go about loading the new key? Is the application watching the file? Or is there some other part of this where we HUP the process when after the move to kubelet-server-current.pem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coresolve it's discussed in the section about the "CertificateManager".

Go can dynamically present a certificate on a per connection basis. The certificate manager code reads the cert/key from disk and kubelet will start using it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

kubelet-server-<timestamp>.pem.
- Write kubelet-server-updated.pem symlink to point to the new cert/key
pair.
- Delete kubelet-server-current.pem
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Delete is redundant? e.g. the move already does this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@luxas
Copy link
Member

luxas commented May 9, 2017

@kubernetes/sig-cluster-lifecycle-feature-requests @timothysc @jbeda

@k8s-ci-robot
Copy link
Contributor

@luxas: These labels do not exist in this repository: sig/cluster-lifecycle.

In response to this:

@kubernetes/sig-cluster-lifecycle-feature-requests @timothysc @jbeda

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

- generate a key
- create a certificate signing request
- request a signed certificate from the API server
- wait for a response
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't get approved, what will it do? What if it doesn't get an answer?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't proceed.


### As Expiration Approaches

Rotating a Kubelet server certificate will work by generating a new private key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional or required? I suspect optional, but I'd like it called out as an optional feature at the top level.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean certificate rotation should be optional? Currently it is gated by a feature flag due to it's alpha status. Would you like it gated by a command line parameter and rotation of expiring certificates would be an optional feature of a kubelet?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even once the rotation feature is no longer alpha, not all deployments will use cert rotation. I would expect to be able to continue providing a cert/key explicitly to the kubelet and have it simply use it.

any given moment (because of a rotation in progress).
- When cert/key data is directly embedded in the kubelet config file
(client-certificate-data, client-key-data)
- The values be used only if there are no updated cert/key pairs due to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This override behavior seems backwards. I'd expect the explicit values in a kubeconfig to win and specifying both to be an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This override behavior seems backwards. I'd expect the explicit values in a kubeconfig to win and specifying both to be an error.

I guess explicit is relative. Do our CLI flags work this way too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is all under a heading of serving-certificate management... serving certs are not embedded in kubeconfig files anywhere

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to remove the reference to keys in the kubeconfig files.

rotation, `RotateKubeletServerCertificate`
1. Centralize certificate access within the kubelet code to the
CertificateManager. The CertificateManager will be responsible for:
- Providing the correct certificate to use for establishing TLS connections.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the CA bundle used to verify a connection to the kube-apiserver?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the certificate that is used for any incoming connections to port 10250. CAs are not affected by this proposal.

current certificate approaches expiry.
- Since certificates can rotate at any time, all other parts of the kubelet
should ask the CertificateManager for the correct certificate each time a
certificate is used. No certificate caching except by the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will you prevent the use of the "normal" connection caches we have in the client?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the client side the certificate manager overrides the restclient.Config GetClientCertificate method[0].

[0] https://github.com/kubernetes/kubernetes/pull/41912/files#diff-e0160f895346669c2e4495acfd858ea7R466

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making the client certificate dynamic is problematic for the transport cache at https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/transport/cache.go#L41

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see that #41912 actually added the GetClientCertificate field to the TLS config.

Does it work for the cache to ignore configs that set this new field? I'm under the impression that the kubelet is a lot better about keeping a centralized client instead of constantly creating new ones like kubectl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, those have to be uncacheable. I'm not sure how many transports get created with a call to something like clientset.NewForConfig.

(client-certificate-data, client-key-data)
- The values be used only if there are no updated cert/key pairs due to
rotation
- When cert/key files are specified in kubeconfig, or with the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serving certs are only specified via --tls-cert-file/--tls-private-key-file

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this.

@luxas
Copy link
Member

luxas commented Jun 5, 2017

Can this be merged now when the code has been merged to master?

@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 15, 2017
@k8s-github-robot
Copy link

This PR hasn't been active in 134 days. Closing this PR. Please reopen if you would like to work towards merging this change, if/when the PR is ready for the next round of review.

cc @ericchiang @jcbsmpsn @liggitt @mikedanese

You can add 'keep-open' label to prevent this from happening again, or add a comment to keep it open another 90 days

@mikedanese mikedanese reopened this Oct 18, 2017
@fejta fejta added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. keep-open labels Dec 15, 2017
@kubernetes kubernetes deleted a comment from k8s-github-robot Dec 15, 2017
@k8s-github-robot k8s-github-robot added the kind/design Categorizes issue or PR as related to design. label Feb 6, 2018
@gkarthiks
Copy link

hi @jcbsmpsn Since we have a KEP process, can we please merge this soon or close this and create a KEP?

@mikedanese
Copy link
Member

/close

@k8s-ci-robot
Copy link
Contributor

@mikedanese: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet