Webhook known problems, solutions and debug techniques
Signed-off-by: Richard Wall <richard.wall@jetstack.io>
wallrj committed Oct 7, 2020
1 parent 4d26c1d commit 28b6487
Showing 1 changed file with 71 additions and 0 deletions.
71 changes: 71 additions & 0 deletions content/en/docs/concepts/webhook.md
@@ -42,7 +42,78 @@ following two Secrets:
- `secret/cert-manager-webhook-tls`: A TLS certificate issued by the root CA
above, served by the webhook.

## Known Problems and Solutions

### Webhook connection problems on GKE private cluster

If you see webhook errors even though the webhook Pod is running, the webhook
is most likely not reachable from the API server. In this case, ensure that the
API server can communicate with the webhook by following the [GKE private
cluster explanation](../../installation/compatibility/#gke).

### Webhook connection problems on AWS EKS

When using a custom CNI (such as Weave or Calico) on EKS, the webhook cannot be reached by the Kubernetes API server.
This happens because the EKS control plane cannot be configured to run on a custom CNI,
so the CNIs differ between control plane and worker nodes.
The solution is to [run the webhook in the host network](../../installation/compatibility/#aws-eks) so that the API server can reach it.

### Webhook connection problems shortly after cert-manager installation

When you first install cert-manager, it will take a few seconds before the cert-manager API is usable.
This is because the cert-manager API requires the cert-manager webhook server, which takes some time to start up.
Here's why:

* The webhook server performs a leader election at startup, which may take a few seconds.
* The webhook server may take a few seconds to start up, generate its self-signed CA and serving certificate, and publish them to a Secret.
* `cainjector` performs a leader election at startup, which can take a few seconds.
* `cainjector`, once started, will take a few seconds to update the `caBundle` in all the webhook configurations.

For these reasons, after installing cert-manager and when performing post-installation cert-manager API operations,
you will need to check for temporary API configuration errors and retry.

You could also add a post-installation check which performs server-side dry-run operations (`kubectl apply --dry-run=server`) on the cert-manager API.
Alternatively, you could add a post-installation check which automatically retries the [Installation Verification](../../installation/kubernetes/#verifying-the-installation) steps until they succeed.
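
As a rough sketch of such a retry check, the shell function below re-runs a command until it succeeds or a maximum number of attempts is reached. The function name `retry` and its argument order are illustrative assumptions, not part of cert-manager.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits 0,
# sleeping DELAY_SECONDS between attempts.
retry() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example usage (the manifest file name is a placeholder):
# retry 30 2 kubectl apply --dry-run=server -f test-resources.yaml
```

A server-side dry run is sent to the API server and passes through the admission webhooks, so it only succeeds once the webhook is reachable and its `caBundle` has been injected.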

## Diagnosing Other Webhook Problems

### Check the webhook TLS certificates

When the TLS connection is established, the Kubernetes API server loads the CA content from the webhook configuration and uses it to verify the serving certificate presented by the webhook server.

Get the webhook configuration and check the `caBundle` value.
For example, to check the `ValidatingWebhookConfiguration`:

```
kubectl get validatingwebhookconfigurations cert-manager-webhook -o yaml | grep caBundle
```

NOTE: If the value is empty, there may be a problem with `cainjector`.
The `caBundle` value is set by `cainjector`, as described in [Injecting CA data from a Secret resource](../ca-injector/#injecting-ca-data-from-a-secret-resource).
Check that the `cainjector` Pod is running and check the `cainjector` logs for errors.
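
A minimal sketch of that check is shown below. It assumes the default static-manifest names: the `cert-manager` namespace, the Pod label `app=cainjector` and a Deployment called `cert-manager-cainjector`. These may differ in your installation, for example with a custom Helm release name.

```shell
# Hypothetical helper: list the cainjector Pods and show recent log lines.
# NAMESPACE, the label selector and the Deployment name are assumptions;
# adjust them to match your installation.
check_cainjector() {
  local namespace="${NAMESPACE:-cert-manager}"
  kubectl -n "$namespace" get pods -l app=cainjector
  kubectl -n "$namespace" logs deployment/cert-manager-cainjector --tail=50
}

# Example usage:
# check_cainjector
```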

Next, check that the `caBundle` contains a valid CA certificate.

```
echo '<CA BUNDLE VALUE>' | base64 -d | openssl x509 -noout -text
```
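
To avoid copying the value by hand, the two steps above can be combined. The sketch below defines a small helper, `decode_ca_bundle` (a hypothetical name), which reads a base64-encoded bundle on standard input and prints a summary of the first certificate in it. The JSONPath in the usage comment assumes that the first entry in `webhooks` is the one you want to inspect.

```shell
# Hypothetical helper: decode a base64-encoded caBundle from stdin and
# print the subject, issuer and expiry date of its first certificate.
decode_ca_bundle() {
  base64 -d | openssl x509 -noout -subject -issuer -enddate
}

# Example usage (assumes the first webhook entry is the relevant one):
# kubectl get validatingwebhookconfigurations cert-manager-webhook \
#   -o jsonpath='{.webhooks[0].clientConfig.caBundle}' | decode_ca_bundle
```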

Then compare that with the certificates that are being used by the webhook server:

```
kubectl -n cert-manager get secrets cert-manager-webhook-ca -o yaml
```

You should be able to decode the `ca.crt` x509 content from that Secret and see that the CA matches the one in the webhook configuration.

You should also find that the `tls.crt` content is a certificate signed by that same CA.
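
One way to perform this comparison is to save both fields to files and let `openssl verify` check the signature. This is only a sketch: the helper name `verify_signed_by` and the output file names are illustrative assumptions.

```shell
# Hypothetical helper: verify_signed_by CA_FILE CERT_FILE
# Exits 0 if CERT_FILE verifies against the CA certificate in CA_FILE.
verify_signed_by() {
  openssl verify -CAfile "$1" "$2"
}

# Example usage: extract both fields from the Secret, then verify.
# kubectl -n cert-manager get secret cert-manager-webhook-ca \
#   -o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt
# kubectl -n cert-manager get secret cert-manager-webhook-ca \
#   -o jsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
# verify_signed_by ca.crt tls.crt
```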

NOTE: This process can also be repeated for the `caBundle` field in `MutatingWebhookConfiguration` and `CustomResourceDefinition` resources.

#### Temporarily work around webhook TLS problems

If necessary, you can manually add or update the TLS certificates in the `ValidatingWebhookConfiguration`, the `MutatingWebhookConfiguration`,
and each of the cert-manager `CustomResourceDefinition` resources.
Set the `caBundle` value to the content of the `ca.crt` field of the `cert-manager-webhook-ca` Secret.

NOTE: This should only be used as a temporary measure, while you investigate the root cause of `cainjector` failing to update the fields automatically.
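
As an illustration of the manual workaround for the `ValidatingWebhookConfiguration`, the sketch below builds a JSON patch that sets the `caBundle` of the first webhook entry. The helper name `ca_bundle_patch` is hypothetical, and the patch path assumes a single entry in `webhooks`; a configuration with several entries needs one patch per index.

```shell
# Hypothetical helper: ca_bundle_patch BASE64_CA
# Prints a JSON patch that sets caBundle on the first webhook entry.
# The "add" operation also overwrites the field if it already exists.
ca_bundle_patch() {
  printf '[{"op":"add","path":"/webhooks/0/clientConfig/caBundle","value":"%s"}]' "$1"
}

# Example usage: the Secret data is already base64-encoded, which is
# the encoding that the caBundle field expects.
# CA="$(kubectl -n cert-manager get secret cert-manager-webhook-ca \
#   -o jsonpath='{.data.ca\.crt}')"
# kubectl patch validatingwebhookconfiguration cert-manager-webhook \
#   --type=json -p "$(ca_bundle_patch "$CA")"
```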
