ingress-gce: ACME certificates fail to issue for the first time #1343

Closed
munnerz opened this issue Feb 8, 2019 · 17 comments · Fixed by #1392
Labels
area/acme/http01 Indicates a PR modifies ACME HTTP01 provider code area/acme Indicates a PR directly modifies the ACME Issuer code area/ingress-shim Indicates a PR or issue relates to the ingress-shim 'auto-certificate' component kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@munnerz
Member

munnerz commented Feb 8, 2019

Is your feature request related to a problem? Please describe.

When using ingress-shim to automatically generate a Certificate resource for a GCE ingress using v0.6.0, if a Secret containing a signed keypair does not already exist, ingress-gce enters a state where it won't update the GCLB through the Google API to add the HTTP01 challenge solver paths.

Users will see an error such as:

Events:
  Type     Reason             Age                 From                     Message
  ----     ------             ----                ----                     -------
  Normal   ADD                22m                 loadbalancer-controller  cdn-how-to/my-ingress
  Normal   CreateCertificate  22m                 cert-manager             Successfully created Certificate "test1.rimusz.xyz"
  Warning  Sync               14m                 loadbalancer-controller  Error during sync: Error running backend syncing routine: unable to find nodeport {cdn-how-to/cm-acme-http-solver-dqvf5/8089 32735 8089 HTTP 8089 false <nil>} in any service
  Normal   CREATE             14m                 loadbalancer-controller  ip: 35.201.110.7
  Warning  Sync               33s (x33 over 21m)  loadbalancer-controller  Error during sync: Error running load balancer syncing routine: Cert creation failures - k8s-ssl-0fd30a7e225361e2-e3b0c44298fc1c14--1cea66e2afee452f Error:googleapi: Error 400: Invalid value for field 'resource.certificate': ''. A certificate must be specified for SSL certificate creation., invalid

when running kubectl describe on their ingress resource.

Describe the solution you'd like

This is a new bug introduced in v0.6 - notably because as part of v0.6, we now generate a private key for the certificate before obtaining the signed certificate. This means we leave the Secret resource with only a tls.key entry, and no tls.crt.

This leads ingress-gce to throw errors, because tls.crt is empty. It then refuses to update the paths, which prevents the HTTP01 challenge passing.

We should automatically generate a self-signed certificate if only a private key has been generated and no certificate exists yet. This is probably useful behaviour for any Issuer type that sometimes has to persist a private key before the signed certificate is returned.
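A minimal Go sketch of the proposed behaviour, under stated assumptions: it is not cert-manager's actual implementation. The function names (generateKeyPEM, fillMissingCert, verifySelfSigned), the ECDSA key type, the placeholder common name, and the one-hour lifetime are all hypothetical choices made for illustration. Only the tls.key and tls.crt entry names come from the description above.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// generateKeyPEM stands in for the private key that is written to tls.key
// before any certificate exists. ECDSA is used only to keep the demo fast.
func generateKeyPEM() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	der, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: der}), nil
}

// fillMissingCert adds a short-lived self-signed tls.crt when the Secret data
// holds a tls.key but no certificate. Existing certificates are left alone.
func fillMissingCert(data map[string][]byte, dnsNames []string) error {
	if len(data["tls.crt"]) > 0 {
		return nil
	}
	block, _ := pem.Decode(data["tls.key"])
	if block == nil {
		return errors.New("tls.key does not contain PEM data")
	}
	key, err := x509.ParseECPrivateKey(block.Bytes)
	if err != nil {
		return err
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: serial,
		Subject:      pkix.Name{CommonName: "temporary-placeholder"}, // hypothetical name
		DNSNames:     dnsNames,
		NotBefore:    now,
		NotAfter:     now.Add(time.Hour), // illustrative lifetime
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	// Passing tmpl as both template and parent makes the result self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return err
	}
	data["tls.crt"] = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	return nil
}

// verifySelfSigned checks that the certificate's signature validates against
// its own public key.
func verifySelfSigned(certPEM []byte) error {
	block, _ := pem.Decode(certPEM)
	if block == nil {
		return errors.New("no PEM block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return err
	}
	return cert.CheckSignature(cert.SignatureAlgorithm, cert.RawTBSCertificate, cert.Signature)
}

func main() {
	keyPEM, err := generateKeyPEM()
	if err != nil {
		panic(err)
	}
	data := map[string][]byte{"tls.key": keyPEM}
	if err := fillMissingCert(data, []string{"example.com"}); err != nil {
		panic(err)
	}
	fmt.Println("tls.crt populated:", len(data["tls.crt"]) > 0)
	fmt.Println("self-signed check error:", verifySelfSigned(data["tls.crt"]))
}
```

With a placeholder certificate in tls.crt, a consumer that requires a non-empty certificate has something to load while the real one is still being issued.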

Describe alternatives you've considered

In the meantime, users can either:

  1. create the Secret resource ahead of time, containing some kind of dummy keypair
  2. temporarily remove the ingress.spec.tls[] entry and manually create the Certificate resource instead. ingress-gce will not try to enable TLS on the LB until you add this entry back, which allows the Certificate to be issued. At renewal there will already be an existing Secret, so this problem won't be hit again.

Additional context

This is a regression from v0.5, and whilst it doesn't break existing deployments, it does cause problems for new users deploying v0.6 for the first time.

Environment details (if applicable):

  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): GKE
  • cert-manager version (e.g. v0.4.0): v0.6.x

/kind bug
/priority important-soon
/milestone v0.7
/area acme
/area acme/http01
cc @rimusz @ahmetb

@jetstack-bot jetstack-bot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 8, 2019
@jetstack-bot jetstack-bot added this to the v0.7 milestone Feb 8, 2019
@jetstack-bot jetstack-bot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/acme Indicates a PR directly modifies the ACME Issuer code area/acme/http01 Indicates a PR modifies ACME HTTP01 provider code labels Feb 8, 2019
@munnerz munnerz added this to To do in v0.7 Feb 8, 2019
@munnerz munnerz added the area/ingress-shim Indicates a PR or issue relates to the ingress-shim 'auto-certificate' component label Feb 8, 2019
@ahmetb

ahmetb commented Feb 8, 2019

ingress-gce will enter a state where it won't update the GCLB in the google api to add the HTTP01 challenge solver paths.

If you have a reliable repro (like a very small manual Ingress file), I recommend filing that to the ingress-gce repo as an issue.

@munnerz
Member Author

munnerz commented Feb 8, 2019

I'll put something more formal together, but the repro is basically creating a Secret that contains only a tls.key with no tls.crt, and referencing that secret in the ingress resource.

An error is then returned by the GCP API, which smells to me like ingress-gce isn't ever checking len(secret.Data["tls.crt"]) > 0 or similar.
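As a rough illustration of the kind of guard described above, here is a small Go sketch. It is not ingress-gce code: hasKeyPair is a hypothetical helper, and the plain map stands in for a Secret's Data field.

```go
package main

import "fmt"

// hasKeyPair reports whether Secret-style data holds both a non-empty
// tls.crt and a non-empty tls.key. A missing key reads as a nil slice,
// so len() covers the "absent" and "empty" cases alike.
func hasKeyPair(data map[string][]byte) bool {
	return len(data["tls.crt"]) > 0 && len(data["tls.key"]) > 0
}

func main() {
	// The repro case: a Secret that contains only a private key.
	keyOnly := map[string][]byte{"tls.key": []byte("placeholder key bytes")}
	complete := map[string][]byte{
		"tls.key": []byte("placeholder key bytes"),
		"tls.crt": []byte("placeholder cert bytes"),
	}
	fmt.Println(hasKeyPair(keyOnly))  // false: skip the certificate upload
	fmt.Println(hasKeyPair(complete)) // true
}
```

A caller could use such a check to skip the certificate upload for that Secret while still syncing the rest of the load balancer configuration.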

Whilst this could be seen as a bug on their end, I think we should probably do something to mitigate this case anyway? Interested to know what you think 😄

@ahmetb

ahmetb commented Feb 8, 2019

I think ingress-gce shouldn't be making a GCE API request when tls.crt is missing. So I think it's probably a legit bug, especially if it's not correcting itself later on.

That said, even after the bug is fixed, it may take several months for the fix to ship. So I'd say maybe suggest that GKE users use an older version of cert-manager?

@munnerz
Member Author

munnerz commented Feb 9, 2019

Rather than have GKE users run an older version, I think we'll modify cert-manager to generate a self-signed cert or similar in the meantime, or otherwise advise users to just create the Certificate resource themselves without using ingress-shim.

Advising users to use a version we don't support anymore seems like bad practice 😬, especially given the fairly substantial differences in v0.6.

@munnerz munnerz changed the title ingress-gce: ACME certificates fail to issue for the first time when using ingress-shim ingress-gce: ACME certificates fail to issue for the first time Feb 13, 2019
@KristinaHus

KristinaHus commented Feb 18, 2019

Hi, I'm having exactly the same error when issuing a certificate for the first time.
http://prntscr.com/mmk5nc
I use the dns01 challenge solver. This is the output of kubectl describe ingress:

Events:
  Type     Reason             Age               From                     Message
  ----     ------             ----              ----                     -------
  Normal   ADD                12m               loadbalancer-controller  default/my-project
  Normal   CreateCertificate  12m               cert-manager             Successfully created Certificate "certificate-prod"
  Warning  Sync               1m (x3 over 11m)  loadbalancer-controller  Error during sync: Error running load balancer syncing routine: Cert creation failures - k8s-ssl-ff35a261db2c0666-e3b0c44298fc1c14--e03d90bc9b95698f Error:googleapi: Error 400: Invalid value for field 'resource.certificate': ''. A certificate must be specified for SSL certificate creation., invalid

Is there any solution for now? I'm new to Kubernetes, so can someone explain to me what the steps are to fix it?

@mikesparr

I had the same issue. I generated a self-signed cert, signed with the existing tls.key, placed both in the Secret, and the errors stopped. I found out, however, that signing was not the real issue: the hosts list was. A couple of my domains were not pointing to the static IP, so the cert-manager order never completed. Once I removed those invalid hosts and re-applied the ingress manifest, it worked fine.

Hope that helps. The tl;dr is: make sure every domain in the ingress file resolves to the IP served by the ingress (I use external DNS with GoDaddy).
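The check suggested above can be sketched in Go as follows. This is only an illustration: hostsNotPointingAt and lookupFunc are hypothetical names, the hostnames and addresses are documentation placeholders, and the demo injects a fake resolver so it runs without network access. Passing net.LookupHost instead would perform real DNS lookups.

```go
package main

import (
	"fmt"
	"net"
)

// lookupFunc has the same signature as net.LookupHost, so either the real
// resolver or a fake can be passed in.
type lookupFunc func(host string) ([]string, error)

// Compile-time check that net.LookupHost fits the lookupFunc type.
var _ lookupFunc = net.LookupHost

// hostsNotPointingAt returns the hosts that fail to resolve or whose
// resolved addresses do not include wantIP.
func hostsNotPointingAt(hosts []string, wantIP string, lookup lookupFunc) []string {
	var bad []string
	for _, host := range hosts {
		addrs, err := lookup(host)
		if err != nil {
			bad = append(bad, host)
			continue
		}
		found := false
		for _, addr := range addrs {
			if addr == wantIP {
				found = true
				break
			}
		}
		if !found {
			bad = append(bad, host)
		}
	}
	return bad
}

func main() {
	// Fake resolver: only ok.example.com points at the ingress IP.
	fake := func(host string) ([]string, error) {
		if host == "ok.example.com" {
			return []string{"203.0.113.10"}, nil
		}
		return []string{"198.51.100.7"}, nil
	}
	hosts := []string{"ok.example.com", "stale.example.com"}
	fmt.Println(hostsNotPointingAt(hosts, "203.0.113.10", fake))
}
```

Any host returned by this check would be a candidate to fix in DNS or to remove from the ingress hosts list before re-applying the manifest.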

@kushwahashiv

kushwahashiv commented Oct 31, 2019

Hi All,
I'm getting the following error on GKE, and my domain is hosted on godaddy.com:

Error during sync: error running load balancer syncing routine: loadbalancer default-example-ingress--106e498635002265 does not exist: Cert creation failures - k8s-ssl-ee6cef5931e4877a-e3b0c44298fc1c14--106e498635002265 Error:googleapi: Error 400: Invalid value for field 'resource.certificate': ''. A certificate must be specified for SSL certificate creation., invalid

What am I missing here?
/Shiv

@archonic

archonic commented Jan 2, 2020

I've got just one domain in my ingress hosts list and I know it's pointing to the correct IP. I'm still getting these events. Error during sync: error running load balancer syncing routine: loadbalancer example-nginx--0a84289edb80f53e does not exist: Cert creation failures - k8s-ssl-fe4dc2b03cd5fcb9-e3b0c44298fc1c14--0a84289edb80f53e Error:googleapi: Error 400: Invalid value for field 'resource.certificate': ''. A certificate must be specified for SSL certificate creation., invalid

If I'm reading that correctly, it needs a certificate to be able to create a certificate. Is that correct? I'm on v0.11 with ingress-nginx.

@kdemon1011

I am using version 0.12 and got the error below:
Error during sync: error running load balancer syncing routine: loadbalancer default-get-poshaq-com--903840200498a8bf does not exist: Cert creation failures - k8s-ssl-d60f1cb5e8dc7def-e3b0c44298fc1c14--903840200498a8bf Error:googleapi: Error 400: Invalid value for field 'resource.certificate': ''. A certificate must be specified for SSL certificate creation., invalid

@w9jds

w9jds commented Jan 15, 2020

I'm having this same issue. I tried it with the ingress shim and with my own dns01 issuer and I get the same problem.

@joshnewlinatclearobject

I'm also having this same issue within my k8s cluster. On 0.12.0

@ashitikov

ashitikov commented Feb 13, 2020

Error 400: Invalid value for field 'resource.certificate': ''. A certificate must be specified for SSL certificate creation., invalid

I had the same problem with ingress-gce on the GKE platform and solved it as follows:

  1. Clear all old certificates and remove your current ingress
  2. Annotate your ingress with: acme.cert-manager.io/http01-edit-in-place: "true"
  3. Your issuer's solvers block should use ingress/name instead of ingress/class:
    solvers:
    - http01:
        ingress:
          name: your-ingress-name
  4. Apply the new configurations

After some time the ingress will pull the new certificate from Let's Encrypt and start using it.
Note: I'm not using the Certificate kind.

@kdemon1011

kdemon1011 commented Feb 14, 2020

Error 400: Invalid value for field 'resource.certificate': ''. A certificate must be specified for SSL certificate creation., invalid

I had the same problem with ingress-gce on GKE platform, solved by:

  1. Clear all old certificates, remove your current ingress
  2. Annotate your ingress with: acme.cert-manager.io/http01-edit-in-place: "true"
  3. Your issuer's solvers block should use ingress/name instead of ingress/class:
    solvers:
    - http01:
        ingress:
          name: your-ingress-name
  4. Apply new configurations

After some time ingress will pull new certificate from Let's Encrypt and start using it.
Note: I'm not using Certificate kind.

I did not get the 3rd point. In which file do I need to edit this block? @ashitikov, a suggestion, please.

@gustavovalverde

@munnerz seems like the regression is "back"? Or are the steps in this comment the official way to solve it? #1343 (comment)

@ashitikov

I did not get the 3rd point. In which file do I need to edit this block? @ashitikov, a suggestion, please.

I use ClusterIssuer kind in this way:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your@email.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01: 
        ingress: 
          name: api-ingress

and the Ingress manifest's annotations section should be:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/ingress.global-static-ip-name: global-ip
    kubernetes.io/ingress.allow-http: "false"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    acme.cert-manager.io/http01-edit-in-place: "true"

I don't fully understand how cert-manager acquires certificates, but as far as I can tell:

  1. cert-manager.io/cluster-issuer specifies which issuer to use to acquire certificates for this ingress
  2. the issuer's solvers block with ingress/name, combined with acme.cert-manager.io/http01-edit-in-place: "true", tells cert-manager to modify the rules of this ingress so that Let's Encrypt can solve the http01 challenge through it.

More info about http01-edit-in-place annotation: https://cert-manager.io/docs/usage/ingress/
acme.cert-manager.io/http01-edit-in-place: "true": this controls whether the ingress is modified ‘in-place’, or a new one is created specifically for the HTTP01 challenge. If present, and set to “true”, the existing ingress will be modified. Any other value, or the absence of the annotation assumes “false”. This annotation will also add the annotation "cert-manager.io/issue-temporary-certificate": "true" onto created certificates which will cause a temporary certificate to be set on the resulting Secret until the final signed certificate has been returned. This is useful for keeping compatibility with the ingress-gce component.
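The annotation semantics quoted above can be restated as a small Go sketch. This is not cert-manager's code: editInPlace and certificateAnnotations are hypothetical helpers, and only the two annotation keys and the rule that solely the exact value "true" enables the behaviour are taken from the quoted documentation.

```go
package main

import "fmt"

const (
	editInPlaceAnnotation = "acme.cert-manager.io/http01-edit-in-place"
	tempCertAnnotation    = "cert-manager.io/issue-temporary-certificate"
)

// editInPlace reports whether the existing ingress should be modified for the
// HTTP01 challenge. Only the exact value "true" counts; any other value, or a
// missing annotation, is treated as false.
func editInPlace(ingressAnnotations map[string]string) bool {
	return ingressAnnotations[editInPlaceAnnotation] == "true"
}

// certificateAnnotations returns the annotations to place on the created
// Certificate: the temporary-certificate marker when edit-in-place is on.
func certificateAnnotations(ingressAnnotations map[string]string) map[string]string {
	out := map[string]string{}
	if editInPlace(ingressAnnotations) {
		out[tempCertAnnotation] = "true"
	}
	return out
}

func main() {
	on := map[string]string{editInPlaceAnnotation: "true"}
	off := map[string]string{editInPlaceAnnotation: "yes"}
	fmt.Println(editInPlace(on), certificateAnnotations(on))
	fmt.Println(editInPlace(off), certificateAnnotations(off))
}
```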

@VGerris

VGerris commented May 7, 2020

Hi,

Why was this issue closed?
I honestly don't understand why this is not addressed when multiple people, including myself have this issue.
Please reopen it, thank you.

@joerayme

joerayme commented Jun 3, 2020

We just came across this issue too, running 0.13.0. I managed to fix it by swapping the issuer out for the self-signed one, waiting for the Ingress sync and then switching it back to use the acme issuer.
