Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[helm] certificate-manager: avoid duplicate certificate generation and rate limiting #1715

Merged
merged 1 commit into from
Aug 24, 2021

Conversation

jschaul
Copy link
Member

@jschaul jschaul commented Aug 23, 2021

Problem description:

Before this change, when using the 'jetstack/cert-manager' alongside
'wire-server' and 'nginx-ingress-services' helm charts (with
useCertManager: true), then each subsequent 'helm upgrade' command
re-generated a fresh 'certificaterequest' object that retrieved a fresh
certificate from let's encrypt; even if the existing certificate was
still valid.
This is probably due to both the 'Certificate' resource managing the
'Secret' resource, as well as helm itself also creating the 'Secret'
resource (with empty strings)
This is a problem because:

  • it wastes let's encrypt resources
  • it quickly leads to duplicate certificate rate limiting (5/week)
    causing deployments to fail.

Symptoms:

Using this command:

kubectl -n wire get certificaterequest --sort-by=.metadata.creationTimestamp

one can see that two fresh certificate request resources (one for the
regular ingress, one for the federator ingress and the federator itself) are created on
each fresh deployment. After more than 5 deployments in a given 7-day
period, deployments fail with errors such as:

E0823 13:29:08.667867       1 sync.go:211] cert-manager/controller/orders "msg"="failed to create Order resource due to bad request, marking Order as failed" "error"="429 urn:ietf:params:acme:error:rateLimited: Error creating new order :: too many certificates (5) already issued for this exact set of domains in the last 168 hours: DOMANNAME: see https://letsencrypt.org/docs/rate-limits/" "resource_kind"="Order" "resource_name"="DOMAINNAME-csr-9j8hq-1640003472" "resource_namespace"="wire" "resource_version"="v1"

resolution

With this PR, we only create the Secret resource manually if certificate
manager is disabled. After one initial more fresh certificate request, all subsequent
deployments no longer create fresh certificate requests, solving this
issue. Thanks @akshaymankar for your hunch of what the issue could be.

Drive-by improvement:

  • be explicit about client auth capabilities for the federator
    certificate

Useful tools when debugging let's encrypt issues

https://crt.sh/
https://letsdebug.net
https://tools.letsdebug.net/cert-search
https://check-your-website.server-daten.de/

Check for certificate certificaterequest order and challenge objects in kubernetes.

Checklist

  • The PR Title explains the impact of the change.
  • The PR description provides context as to why the change should occur and what the code contributes to that effect. This could also be a link to a JIRA ticket or a Github issue, if there is one.
  • Section Unreleased of CHANGELOG-draft.md contains the following bits of information:
    • A line with the title and number of the PR in one or more suitable sub-sections.

…d rate limiting

Problem description:

Before this change, when using the 'jetstack/cert-manager' alongside
'wire-server' and 'nginx-ingress-services' helm charts (with
useCertManager: true), then each subsequent 'helm upgrade' command
re-generated a fresh 'certificaterequest' object that retrieved a fresh
certificate from let's encrypt; even if the existing certificate was
still valid.
This is probably due to both the 'Certificate' resource managing the
'Secret' resource, as well as helm itself also creating the 'Secret'
resource (with empty strings)
This is a problem because:
* it wastes let's encrypt resources
* it quickly leads to duplicate certificate rate limiting (5/week)
  causing deployments to fail.

Symptoms:

Using this command:

kubectl -n wire get certificaterequest --sort-by=.metadata.creationTimestamp

one can see that two fresh certificate request resources (one for the
regular ingress, one for the federator ingress and the federator itself) are created on
each fresh deployment. After more than 5 deployments in a given 7-day
period, deployments fail with errors such as:

E0823 13:29:08.667867       1 sync.go:211] cert-manager/controller/orders "msg"="failed to create Order resource due to bad request, marking Order as failed" "error"="429 urn:ietf:params:acme:error:rateLimited: Error creating new order :: too many certificates (5) already issued for this exact set of domains in the last 168 hours: DOMANNAME: see https://letsencrypt.org/docs/rate-limits/" "resource_kind"="Order" "resource_name"="DOMAINNAME-csr-9j8hq-1640003472" "resource_namespace"="wire" "resource_version"="v1"

With this PR, we only create the Secret resource manually if certificate
manager is disabled. After one initial more fresh certificate request, all subsequent
deployments no longer create fresh certificate requests, solving this
issue.

Drive-by improvement:

* be explicit about client auth capabilities for the federator
  certificate
@jschaul jschaul force-pushed the letsencrypt-keep-certificates-around branch from f29db4c to e5d385e Compare August 24, 2021 14:09
@jschaul jschaul changed the title WIP [skip-ci] don't delete the let's encrypt certificate [helm] certificate-manager: avoid duplicate certificate generation and rate limiting Aug 24, 2021
Copy link
Member

@akshaymankar akshaymankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@akshaymankar
Copy link
Member

BTW, this makes it necessary for ingress helm chart to be installed before wire-server helm chart. This is not the case with our docs, so we should also fix that. Maybe let's also add that in the CHANGELOG, but it will only affect new installations.

@jschaul jschaul merged commit fa2936c into develop Aug 24, 2021
@jschaul jschaul deleted the letsencrypt-keep-certificates-around branch August 24, 2021 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants