Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CA and Server certificate potentially get updated before ValidatingWebhookConfiguration #13

Open
dvob opened this issue Oct 14, 2020 · 14 comments

Comments

@dvob
Copy link

dvob commented Oct 14, 2020

Is it not a problem that the ValidatingWebhookConfiguration and the Secret are updated independently from each other? I think this can lead to the condition where the CA is already renewed but the ValidatingWebhookConfiguration still have the old CA and thus calls to the webhook would fail?

I did not really had problems with this. I only looked into the code and thought that this might become a problem. Or do I miss something?

@adrianludwin
Copy link
Contributor

Yeah this could definitely happen. But the secret's usually only generated if it's missing or when it expires (which is every N years) and so I doubt this is a significant problem. A few seconds of downtime a year (or even a month) is still many 9s of reliability.

Also, I'm not sure how this can be avoided. The webhook configs need the public keys, and the pods need the private keys, and there's no way to update both of those atomically. But most things in K8s can just be retried (yay, declarative APIs!) so rare failures shouldn't have much impact.

Does this sound reasonable?

@dvob
Copy link
Author

dvob commented Oct 14, 2020

I'm not sure how this can be avoided.

I think this should be possible by creating a new CA some time ahead before the existing CA expires but keeping the old CA around until it expires. This way you would have two trusted CAs for a certain period of time. I don't have tested if this works with the ValidatingWebhookConfiguration but it definitely should.

I understand that this makes the logic a bit more complex. But especially on big clusters with Gatekeeper where almost every API call also goes to the validation webhook, it's quite likely that one will run into this problem. Even if it is only for a couple of seconds I think it would be nice if this could be avoided completely.

@adrianludwin
Copy link
Contributor

adrianludwin commented Oct 14, 2020 via email

@dvob
Copy link
Author

dvob commented Oct 14, 2020

Or can a single caBundle include multiple acceptable certs?

That's what I would expect. But haven't tested it.

@dvob
Copy link
Author

dvob commented Oct 14, 2020

https://github.com/kubernetes/api/blob/master/admissionregistration/v1/types.go#L526

The name caBundle implies to me that it can contain multiple CAs.

@adrianludwin
Copy link
Contributor

adrianludwin commented Oct 14, 2020 via email

@maxsmythe
Copy link
Contributor

+1 to everything in this discussion.

I was also under the impression that only one root cert could be provided at any given time. If it's possible to just create a super bundle by concatenating the certs, zero-downtime rotation would be ideal.

@maxsmythe
Copy link
Contributor

Looking deeper into it, concatenation may work.

Note that the CA certs are self-signed, so there should be no intermediate certs.

We could test this by:

  • Launching a copy of HNC or Gatekeeper
  • Saving the secret contents
  • Deleting the secret
  • Saving the contents of the newly created secret.
  • Concatenating the two CA PEMs (they will likely need to be base64 decoded from the secret before we can concatenate)
  • Setting up two HNC /GK pods, one mounted with one copy of each secret
  • Creating a VWH config with the concatenated CA bundle
  • Point the VWH at one pod, see if it fails, point it at the second pod, see if it fails

Not a trivial amount of work, but worthwhile. I think the biggest benefit here is that each pod could maintain its own CA/key pair. This would avoid the need for all cert controllers across all versions to agree on their validity tests / generated cert contents and would remove cert generation as a SPOF.

We are already seeing a split WRT generated cert contents from #9

@dvob
Copy link
Author

dvob commented Oct 14, 2020

I tested this and it does work. I created two CA certificates (ca1 and ca2). Then I created one certificate (tls) and signed it with ca1 and another (tls-new) and signed it with ca2. Then I started the a webhook with the tls certificate and then switched to the tls-new certificate and both did work.
My validating webhook configuraiton does look like this:

---
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: validating-webhook-configuration
webhooks:
- clientConfig:
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJmakNDQVNTZ0F3SUJBZ0lSQVBtSytySFJmZU9kWXpEenFzRnVjbjR3Q2dZSUtvWkl6ajBFQXdJd0RqRU0KTUFvR0ExVUVBeE1EWTJFeE1CNFhEVEl3TVRBeE5ERTJORGd3TjFvWERUSXhNVEF4TkRFMk5EZ3dOMW93RGpFTQpNQW9HQTFVRUF4TURZMkV4TUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFYlFHaTN5YWN1Zml6Ck9LMmErLzVZWDZSMWMxa1ZJMklvLzg2eisvVmFnWWZHRDlxZUJmR3BQd3pvQWphWTQzRC9tQUs2dHFtM2llbTEKejdYOEZaRzMyYU5qTUdFd0RnWURWUjBQQVFIL0JBUURBZ0VHTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3SFFZRApWUjBPQkJZRUZOQVlrY0lBS3A0aHkzQzVMaVV3UURSNUk4VUtNQjhHQTFVZEl3UVlNQmFBRk5BWWtjSUFLcDRoCnkzQzVMaVV3UURSNUk4VUtNQW9HQ0NxR1NNNDlCQU1DQTBnQU1FVUNJUURuUVFtUmxkbHB0TDdTdmZjeHQxSFMKa2RMeW1GTlhhNkNYcFh2bHJsdUgwZ0lnYjBLN2JuYWI0Y2pyMUhHY2dDbWc2QTQ2Y3V3SFpZV3hWVjJkMTlacwp1bkU9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0KLS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJmVENDQVNTZ0F3SUJBZ0lSQUpOZmVlUTJSYThRazIyM2dkNTNlaDh3Q2dZSUtvWkl6ajBFQXdJd0RqRU0KTUFvR0ExVUVBeE1EWTJFeU1CNFhEVEl3TVRBeE5ERTJORGd4TmxvWERUSXhNVEF4TkRFMk5EZ3hObG93RGpFTQpNQW9HQTFVRUF4TURZMkV5TUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFQkFkOXpzU1FzVm4wCnFva21uZ054SnpGSjhEZDFRSC9uaWJ6V2UybkNmTzYyRmFwNFEzRWdDTjhHNTk1SlNNdTNLQ29vK2ZIZVlnVk4KK1J1d3Y4b21PcU5qTUdFd0RnWURWUjBQQVFIL0JBUURBZ0VHTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3SFFZRApWUjBPQkJZRUZNaEpqQzJXeWwyZnRZNWpLWHZKejdSdlBhY3JNQjhHQTFVZEl3UVlNQmFBRk1oSmpDMld5bDJmCnRZNWpLWHZKejdSdlBhY3JNQW9HQ0NxR1NNNDlCQU1DQTBjQU1FUUNJRmp3QVdYcC9DdjZhdjY3M0FuS2p6dFQKbmxWcjlubTVMZ1NrWXBYUVZ6TEdBaUEzRndXY0hTVUNES2FiTnN6cW1KZmVRN21ucVR4NDFqOUQ5ZXg0MVRUNQordz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    url: https://192.168.39.1:8443/validate
  failurePolicy: Fail
  name: vcm.kb.io
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    - UPDATE
    resources:
    - configmaps

@adrianludwin
Copy link
Contributor

adrianludwin commented Oct 14, 2020 via email

@dvob
Copy link
Author

dvob commented Oct 18, 2020

Currently I don't have time to contribute a patch.

@adrianludwin
Copy link
Contributor

adrianludwin commented Oct 19, 2020 via email

@rcorre
Copy link

rcorre commented Nov 27, 2023

EDIT: I was wrong, it totally works!


I tried to test this myself.

  1. kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io -ojson my-webhook | jq .webhooks[0].clientConfig.caBundle -r | base64 -d > /tmp/old-ca.pem
  2. Rotate the certificate, updating the secret and webhook
  3. kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io -ojson my-webhook | jq .webhooks[0].clientConfig.caBundle -r | base64 -d > /tmp/new-ca.pem
  4. cat /tmp/old-ca.pem /tmp/new-ca.pem | base64 -w0
  5. Edit the ValidatingWebhoookConfiguration to use the concatenated result.

I now get:

tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "x509: invalid signature: parent certificate cannot sign this kind of certificate" while trying to verify candidate authority certificate

This lines up with @adrianludwin's initial hunch that caBundle is a certificate chain rather than a list of separate leaf certificates.

@dvob do you have any more details on how you got it to work?

@rcorre
Copy link

rcorre commented Nov 28, 2023

Oops! I'm not sure what I was doing wrong yesterday, but I tested again today. Appending the old and new CA's in the webhook CA bundle seems to work. Requests to a webhook using either the old or the new certificate work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants