KCP should introspect machine certificate expiry information from node to enable automatic certificate renewal #7342

Closed
ykakarap opened this issue Oct 4, 2022 · 6 comments · Fixed by #7355
Assignees
Labels
area/control-plane Issues or PRs related to control-plane lifecycle management kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@ykakarap
Contributor

ykakarap commented Oct 4, 2022

User Story

Recently a feature was introduced in KCP to enable automatic renewal of certificates by doing a repave: #6983

This feature only works for control plane machines created using Cluster API v1.3 or above.
For older control plane machines, the user has to intervene manually and provide the actual certificate expiry date of each machine.

This issue is to explore ideas around how we can introspect the certificate expiry time from the node without needing manual intervention from the user.

Detailed Description

The certificate expiry time cannot be reliably calculated from the creationTimestamp of the Machine object. Example: if the Machine object is restored from a backup, its creationTimestamp will have changed.

One possible way to calculate the certificate expiry time:

Cluster API / KCP introspects these values by temporarily running an exec pod on the node.

xref: #7268 (comment)

/kind feature
/area control-plane

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. area/control-plane Issues or PRs related to control-plane lifecycle management needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 4, 2022
@k8s-ci-robot
Contributor

@ykakarap: This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sbueringer
Member

I wonder if we should really introduce this mechanism in our controller. It's only needed for old machines, yet we would probably never be able to remove the code again (we will have to assume for a very long time that old machines still exist).

Apart from that, I wonder if it's worth investing the effort to implement this. It's non-trivial, considering that it probably has to work on both Windows and Linux, requires privileged access to control plane machines, and so on.

@sbueringer
Member

sbueringer commented Oct 4, 2022

I wonder if there might be some other/better ways to determine the node creation date and I/we just missed them:

I verified the "inspect the kube-apiserver" approach:

# Create kind cluster
kind create cluster

# Get server address and port
kubectl config view --minify

# Check cert (stdin comes from /dev/null so s_client exits after the handshake)
openssl s_client -showcerts -connect 127.0.0.1:41913 </dev/null 2>/dev/null | openssl x509 -text
...
        Issuer: CN = kubernetes
        Validity
            Not Before: Sep 30 16:27:20 2022 GMT
            Not After : Sep 30 16:27:20 2023 GMT
...

@sbueringer
Member

@vincepri WDYT?

@fabriziopandini
Member

I'm +1 to explore this idea, but we should find a solution that does not require adding new binaries to the controller image (i.e. do something equivalent in Go code).

@sbueringer
Member

/assign
