This repository has been archived by the owner on Feb 9, 2022. It is now read-only.

cert manager certificates are not valid #532

Closed
timuckun opened this issue May 9, 2019 · 11 comments

timuckun commented May 9, 2019

I have the following ingress for GKE

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ${APP_NAME}-ingress
  namespace: ${KUBE_NAMESPACE}
  labels:
    name: ${APP_NAME}
  annotations:
    kubernetes.io/ingress.class: "nginx"
    deploy_date: "$(date)"
    external-dns.alpha.kubernetes.io/hostname: ${APP_HOST_NAME}
    kubernetes.io/tls-acme: "true"
    certmanager.k8s.io/cluster-issuer: "letsencrypt-staging"
    #For keycloak we might need this if the env proxy setting doesn't work.
    #nginx.ingress.kubernetes.io/ssl-redirect: "true"

spec:
  tls:
    - hosts:
        - ${APP_HOST_NAME}
      secretName: ${APP_NAME}-tls
  rules:
    - host:  ${APP_HOST_NAME}
      http:
        paths:
          - path: /
            backend:
              serviceName: ${APP_NAME}-service
              servicePort: 8080

The cert manager issues a certificate for the deployment. The log files don't have any errors in them but the certificate is invalid.

When I visit the URL I get

NET::ERR_CERT_AUTHORITY_INVALID

Subject: HOST_NAME_HERE

Issuer: Fake LE Intermediate X1

Expires on: Aug 5, 2019

Current date: May 9, 2019

project-bot added this to Inbox in BKPR May 9, 2019
@sameersbn
Contributor

@timuckun The certificate is issued from the LE staging environment because of the annotation certmanager.k8s.io/cluster-issuer: "letsencrypt-staging".
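
A quick way to confirm which environment produced a certificate is to look at the issuer string the server presents. The following is only a sketch: `classify_issuer` and `served_issuer` are hypothetical helpers, not part of BKPR or cert-manager, and the patterns are simply the issuer strings that show up in this thread.

```shell
#!/bin/sh
# classify_issuer ISSUER_STRING
# Hypothetical helper: maps an issuer string to the likely source of the cert.
classify_issuer() {
  case "$1" in
    *"Fake LE"*)
      echo "letsencrypt-staging" ;;
    *"Kubernetes Ingress Controller Fake Certificate"*)
      # Self-signed default served by the nginx ingress controller,
      # not a certificate issued through cert-manager.
      echo "ingress-default" ;;
    *"Let's Encrypt"*)
      echo "letsencrypt-prod" ;;
    *)
      echo "unknown" ;;
  esac
}

# served_issuer HOST
# Prints the issuer line of the certificate served on port 443.
# Needs network access and the openssl CLI.
served_issuer() {
  echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null \
    | openssl x509 -noout -issuer
}
```

For example, `classify_issuer "$(served_issuer blog.my-domain.com)"` would print the environment for a live host (the host name here is a placeholder).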

@timuckun
Author

timuckun commented May 9, 2019

I changed it to letsencrypt-prod and I get the same error.

NET::ERR_CERT_AUTHORITY_INVALID
Subject: HOST_NAME_HERE
Issuer: Fake LE Intermediate X1

Expires on: Aug 5, 2019

Current date: May 9, 2019

The cert-manager logs show it created a cert:

I  ingress-shim controller: syncing item 'keycloak-server-development/keycloak-server-development-ingress' 
I  Certificate "keycloak-server-development-tls" for ingress "keycloak-server-development-ingress" already exists 
I  ingress-shim controller: Finished processing work item "keycloak-server-development/keycloak-server-development-ingress" 
I  ingress-shim controller: syncing item 'keycloak-server-development/keycloak-server-development-ingress' 
I  Certificate "keycloak-server-development-tls" for ingress "keycloak-server-development-ingress" already exists 
I  Certificate "keycloak-server-development-tls" for ingress "keycloak-server-development-ingress" is up to date 
I  ingress-shim controller: Finished processing work item "keycloak-server-development/keycloak-server-development-ingress" 
I  certificates controller: syncing item 'keycloak-server-development/keycloak-server-development-tls' 
I  Certificate keycloak-server-development/keycloak-server-development-tls scheduled for renewal in 1429 hours 
I  certificates controller: Finished processing work item "keycloak-server-development/keycloak-server-development-tls" 

@timuckun
Author

This remains a problem whether I use letsencrypt-prod or letsencrypt-staging. The TLS certs set up for grafana, prometheus etc. work fine, but any new certs issued via annotations in the ingress do not work.

This is a pretty urgent problem. If the cert manager isn't working it's pretty disastrous.

@sameersbn
Contributor

sameersbn commented May 10, 2019

It turns out cert-manager does not automatically request a new certificate when the issuer is updated. The good news is that you can delete the secret associated with the certificate (kubectl delete secret {NAME OF THE SECRET NAMED ON THE CERTIFICATE HERE}), and that will prompt cert-manager to queue a new request.

Please let me know if this resolves the issue for you.

related: cert-manager/cert-manager#813 (comment)
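
As a rough sketch, the delete-and-recheck step could be wrapped like this. `reissue_certificate` is a hypothetical helper, and it assumes the Certificate resource has the same name as its secret, which is how the ingress-created certificates in this thread are named. `KUBECTL` is an override variable introduced here so the commands can be previewed with `KUBECTL=echo`.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview commands.
KUBECTL="${KUBECTL:-kubectl}"

# reissue_certificate NAMESPACE SECRET_NAME
# Hypothetical helper: deletes the TLS secret so cert-manager requests a
# new certificate, then shows the Certificate resource for inspection.
reissue_certificate() {
  ns="$1"
  secret="$2"
  # Deleting the secret prompts cert-manager to queue a new request.
  $KUBECTL delete secret "$secret" -n "$ns"
  # Assumption: the Certificate resource is named after the secret.
  $KUBECTL describe certificate "$secret" -n "$ns"
}
```

Usage would look like `reissue_certificate keycloak-server-development keycloak-server-development-tls`; the status conditions in the describe output show whether the new order succeeded.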

@timuckun
Author

I am sorry but this did not work. The cert manager did indeed recreate the certificate but I get the same error.

The error doesn't say the certificate is bad; it says the cert authority is invalid, which suggests to me that the problem may be with the cluster issuer?

@timuckun
Author

Is there any reason why the certs for the apps installed by kube-prod are valid but not the ones installed by cert manager after the install?

@sameersbn
Contributor

By default the certificates are issued by the letsencrypt production environment. The only reason a certificate can be issued from the staging environment is if the issuer is set to letsencrypt-staging.

To debug the issue, I installed the stable/wordpress helm chart with the ingress annotation 'certmanager.k8s.io/cluster-issuer': 'letsencrypt-staging'. This resulted in a certificate being issued by the Fake LE issuer.

# curl -vkI https://blog.my-domain.com/
...
* Server certificate:
*  subject: CN=blog.my-domain.com
*  start date: May 13 08:51:13 2019 GMT
*  expire date: Aug 11 08:51:13 2019 GMT
*  issuer: CN=Fake LE Intermediate X1
...

Next I listed the ingresses with,

# kubectl get ing
NAME             HOSTS                ADDRESS          PORTS     AGE
blog-wordpress   blog.my-domain.com   35.200.214.186   80, 443   8m48s

and certificates with

# kubectl get certificates
NAME                  READY   SECRET                AGE
wordpress.local-tls   True    wordpress.local-tls   9m

To switch the issuer to letsencrypt prod environment, I edited the blog-wordpress ingress with the following command:

# kubectl edit ing blog-wordpress

and updated the annotation to the following:

certmanager.k8s.io/cluster-issuer: letsencrypt-prod

When the ingress manifest is updated, the certificate manifest will automatically be updated. To verify I opened the manifest for the wordpress.local-tls certificate resource

kubectl edit certificate wordpress.local-tls

There I saw the issuer was updated.

spec:
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-prod

Finally to trigger the request for a new certificate, I deleted the secret associated with the certificate.

kubectl delete secret wordpress.local-tls

After a while a new certificate was issued from the LE production env

# curl -vkI https://blog.my-domain.com/
...
* Server certificate:
*  subject: CN=blog.my-domain.com
*  start date: May 13 09:04:14 2019 GMT
*  expire date: Aug 11 09:04:14 2019 GMT
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
...
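
The interactive `kubectl edit ing` step can also be done non-interactively with `kubectl annotate --overwrite`. This is a sketch only: `switch_issuer` is a hypothetical helper and `KUBECTL` is an override variable introduced here for previewing the command. The TLS secret still has to be deleted afterwards to trigger a new request.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview commands.
KUBECTL="${KUBECTL:-kubectl}"

# switch_issuer NAMESPACE INGRESS ISSUER
# Hypothetical helper: points the ingress at a different ClusterIssuer by
# overwriting the cert-manager annotation used in this thread.
switch_issuer() {
  ns="$1"
  ingress="$2"
  issuer="$3"
  $KUBECTL annotate ingress "$ingress" -n "$ns" --overwrite \
    "certmanager.k8s.io/cluster-issuer=$issuer"
}
```

For the example above this would be `switch_issuer default blog-wordpress letsencrypt-prod`, assuming the chart was installed in the default namespace.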

@timuckun
Author

Following along..

curl -vkI https://mydomain.com

* Server certificate:
*  subject: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate
*  start date: Apr 18 00:24:20 2019 GMT
*  expire date: Apr 17 00:24:20 2020 GMT
*  issuer: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56549bfe2900)

So in my case cert-manager issued a fake certificate.

I see the ingress there.

kubectl get ingress -n sample-rails-app-development
NAME                                   HOSTS                                               ADDRESS         PORTS     AGE
sample-rails-app-development-ingress   sample-rails-app-development.mydomain.com  x.x.x.x   80, 443   11m

I see the cert there

kubectl get certificates -n sample-rails-app-development
NAME                               AGE
sample-rails-app-development-tls   13m

here is the ingress.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
    deploy_date: Tue May 14 00:30:02 UTC 2019
    external-dns.alpha.kubernetes.io/hostname: sample-rails-app-development.mydomain.com
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"certmanager.k8s.io/cluster-issuer":"letsencrypt-prod","deploy_date":"Tue May 14 00:30:02 UTC 2019","external-dns.alpha.kubernetes.io/hostname":"sample-rails-app-development.mydomain.com","kubernetes.io/ingress.class":"nginx","kubernetes.io/tls-acme":"true"},"labels":{"name":"sample-rails-app-development"},"name":"sample-rails-app-development-ingress","namespace":"sample-rails-app-development"},"spec":{"rules":[{"host":"sample-rails-app-development.mydomain.com","http":{"paths":[{"backend":{"serviceName":"sample-rails-app-development-service","servicePort":3000},"path":"/"}]}}],"tls":[{"hosts":["sample-rails-app-development.mydomain.com"],"secretName":"sample-rails-app-development-tls"}]}}
    kubernetes.io/ingress.class: nginx
    kubernetes.io/tls-acme: "true"
  creationTimestamp: 2019-05-14T00:30:07Z
  generation: 1
  labels:
    name: sample-rails-app-development
  name: sample-rails-app-development-ingress
  namespace: sample-rails-app-development
  resourceVersion: "11948384"
  selfLink: /apis/extensions/v1beta1/namespaces/sample-rails-app-development/ingresses/sample-rails-app-development-ingress
  uid: 6ca34647-75df-11e9-b011-42010a980039
spec:
  rules:
  - host: sample-rails-app-development.mydomain.com
    http:
      paths:
      - backend:
          serviceName: sample-rails-app-development-service
          servicePort: 3000
        path: /
  tls:
  - hosts:
    - sample-rails-app-development.mydomain.com
    secretName: sample-rails-app-development-tls
status:
  loadBalancer:
    ingress:
    - ip: x.x.x.x

I edit the cert and I get this

  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-prod
  secretName: sample-rails-app-development-tls
status:
  acme:
    order:
      url: ""
  conditions:
  - lastTransitionTime: 2019-05-14T00:48:59Z
    message: Order validated
    reason: OrderValidated
    status: "False"
    type: ValidateFailed
  - lastTransitionTime: null
    message: 'Failed to finalize order: acme: urn:ietf:params:acme:error:rateLimited:
      Error finalizing order :: too many certificates already issued for exact set
      of domains: sample-rails-app-development.mydomain.com see https://letsencrypt.org/docs/rate-limits/'
    reason: IssueError
    status: "False"
    type: Ready

Oops, a rate limit. This makes no sense, as I am supposed to get 20 certs per week per domain.
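
Rather than opening the Certificate in an editor, the Ready condition can be pulled out directly and checked for the ACME rate-limit error type. This is a minimal sketch: `ready_message` and `is_rate_limited` are hypothetical helpers, `KUBECTL` is an override variable introduced here, and the status layout is assumed to match the one pasted above.

```shell
#!/bin/sh
# KUBECTL can be overridden if kubectl lives elsewhere.
KUBECTL="${KUBECTL:-kubectl}"

# ready_message NAMESPACE CERTIFICATE
# Prints the message of the Certificate's Ready condition.
ready_message() {
  $KUBECTL get certificate "$2" -n "$1" \
    -o 'jsonpath={.status.conditions[?(@.type=="Ready")].message}'
}

# is_rate_limited MESSAGE
# Succeeds when the message carries the ACME rateLimited error type.
is_rate_limited() {
  case "$1" in
    *urn:ietf:params:acme:error:rateLimited*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A check could then read `is_rate_limited "$(ready_message sample-rails-app-development sample-rails-app-development-tls)" && echo "rate limited"`.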

So I switch to letsencrypt-staging.

spec:
  acme:
    config:
    - domains:
      - sample-rails-app-development.mydomain.com
      http01:
        ingress: ""
        ingressClass: nginx
  dnsNames:
  - sample-rails-app-development.mydomain.com
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-staging
  secretName: sample-rails-app-development-tls
status:
  acme:
    order:
      url: https://acme-staging-v02.api.letsencrypt.org/acme/order/9217179/33818672
  conditions:
  - lastTransitionTime: 2019-05-14T01:00:47Z
    message: Order validated
    reason: OrderValidated
    status: "False"
    type: ValidateFailed
  - lastTransitionTime: 2019-05-14T01:00:51Z
    message: Certificate issued successfully
    reason: CertIssued
    status: "True"
    type: Ready

Ok the staging cert doesn't have that error.

curl says 

* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=sample-rails-app-development.mydomain.com
*  start date: May 14 00:00:50 2019 GMT
*  expire date: Aug 12 00:00:50 2019 GMT
*  issuer: CN=Fake LE Intermediate X1
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x559afbcb5900)

So here there is an error about not getting the local issuer certificate.  Could that be the problem?


I am pretty sure you should be able to use the staging cert without it giving an error in the browser.

@sameersbn
Contributor

I am pretty sure you should be able to use the staging cert without it giving an error in the browser.

This is not true. Staging certificates are not trusted by browsers, so you should expect to see the certificate warning.

@sameersbn
Contributor

sameersbn commented May 14, 2019

message: 'Failed to finalize order: acme: urn:ietf:params:acme:error:rateLimited:
      Error finalizing order :: too many certificates already issued for exact set
      of domains: sample-rails-app-development.mydomain.com see https://letsencrypt.org/docs/rate-limits/'

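The error above is Let's Encrypt's duplicate-certificate limit, which applies to certificates for the exact same set of hostnames (5 per week in production) and is separate from the per-registered-domain limit. The staging environment is meant for testing against the Let's Encrypt API and has much higher rate limits than production. It's generally a good idea to use the staging environment for short-lived certificates or while experimenting with TLS in development. This will help you stay clear of the production rate limits.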

@gustavovalverde

This seems like it can be closed.

Thanks for the solution though

BKPR automation moved this from Inbox to Done Oct 31, 2019