GKE HTTP01 infinite failure loop if creating dual IPv4 and IPv6 ingresses for same host at same time #1157

@abevoelker

Describe the bug:

When creating dual IPv4 and IPv6 Ingresses for the same host with the following manifest:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: webapp-ingress-ipv4
  annotations:
    kubernetes.io/ingress.global-static-ip-name: webapp-ipv4
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
    certmanager.k8s.io/acme-http01-edit-in-place: "true"
spec:
  rules:
  - host: captioned-images.duckdns.org
    http:
      paths:
      - path: /*
        backend:
          serviceName: webapp
          servicePort: 3000
  tls:
  - secretName: webapp-tls
    hosts:
    - captioned-images.duckdns.org
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: webapp-ingress-ipv6
  annotations:
    kubernetes.io/ingress.global-static-ip-name: webapp-ipv6
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
    certmanager.k8s.io/acme-http01-edit-in-place: "true"
spec:
  rules:
  - host: captioned-images.duckdns.org
    http:
      paths:
      - path: /*
        backend:
          serviceName: webapp
          servicePort: 3000
  tls:
  - secretName: webapp-tls
    hosts:
    - captioned-images.duckdns.org

cert-manager seems to first create an Ingress rule / UrlMap just for the IPv4 Ingress:

$ kubectl describe ing
Name:             webapp-ingress-ipv4
Namespace:        default
Address:          35.244.211.136
Default backend:  default-http-backend:80 (10.4.2.8:8080)
TLS:
  webapp-tls terminates captioned-images.duckdns.org
Rules:
  Host                          Path  Backends
  ----                          ----  --------
  captioned-images.duckdns.org  
                                /*                                                                        webapp:3000 (<none>)
                                /.well-known/acme-challenge/W50w7IcM1NSrDUlXNnF0c6_h86_e4GXHLT-uhSzvIkk   cm-acme-http-solver-r5x22:8089 (<none>)
Annotations:
  kubernetes.io/ingress.global-static-ip-name:       webapp-ipv4
  certmanager.k8s.io/acme-http01-edit-in-place:      true
  certmanager.k8s.io/cluster-issuer:                 letsencrypt-prod
  ingress.kubernetes.io/backends:                    {"k8s-be-30660--4285d79a798c4100":"HEALTHY","k8s-be-30760--4285d79a798c4100":"HEALTHY","k8s-be-31522--4285d79a798c4100":"Unknown"}
  ingress.kubernetes.io/forwarding-rule:             k8s-fw-default-webapp-ingress-ipv4--4285d79a798c4100
  ingress.kubernetes.io/target-proxy:                k8s-tp-default-webapp-ingress-ipv4--4285d79a798c4100
  ingress.kubernetes.io/url-map:                     k8s-um-default-webapp-ingress-ipv4--4285d79a798c4100
  kubectl.kubernetes.io/last-applied-configuration:  {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"certmanager.k8s.io/acme-http01-edit-in-place":"true","certmanager.k8s.io/cluster-issuer":"letsencrypt-prod","kubernetes.io/ingress.global-static-ip-name":"webapp-ipv4"},"name":"webapp-ingress-ipv4","namespace":"default"},"spec":{"rules":[{"host":"captioned-images.duckdns.org","http":{"paths":[{"backend":{"serviceName":"webapp","servicePort":3000},"path":"/*"}]}}],"tls":[{"hosts":["captioned-images.duckdns.org"],"secretName":"webapp-tls"}]}}

Events:
  Type     Reason  Age               From                     Message
  ----     ------  ----              ----                     -------
  Warning  Sync    2m (x84 over 3h)  loadbalancer-controller  Could not find TLS certificates. Continuing setup for the load balancer to serve HTTP. Note: this behavior is deprecated and will be removed in a future version of ingress-gce


Name:             webapp-ingress-ipv6
Namespace:        default
Address:          2600:1901:0:ed32::
Default backend:  default-http-backend:80 (10.4.2.8:8080)
TLS:
  webapp-tls terminates captioned-images.duckdns.org
Rules:
  Host                          Path  Backends
  ----                          ----  --------
  captioned-images.duckdns.org  
                                /*   webapp:3000 (<none>)
Annotations:
  ingress.kubernetes.io/target-proxy:                k8s-tp-default-webapp-ingress-ipv6--4285d79a798c4100
  ingress.kubernetes.io/url-map:                     k8s-um-default-webapp-ingress-ipv6--4285d79a798c4100
  kubectl.kubernetes.io/last-applied-configuration:  {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"certmanager.k8s.io/acme-http01-edit-in-place":"true","certmanager.k8s.io/cluster-issuer":"letsencrypt-prod","kubernetes.io/ingress.global-static-ip-name":"webapp-ipv6"},"name":"webapp-ingress-ipv6","namespace":"default"},"spec":{"rules":[{"host":"captioned-images.duckdns.org","http":{"paths":[{"backend":{"serviceName":"webapp","servicePort":3000},"path":"/*"}]}}],"tls":[{"hosts":["captioned-images.duckdns.org"],"secretName":"webapp-tls"}]}}

  kubernetes.io/ingress.global-static-ip-name:   webapp-ipv6
  certmanager.k8s.io/acme-http01-edit-in-place:  true
  certmanager.k8s.io/cluster-issuer:             letsencrypt-prod
  ingress.kubernetes.io/backends:                {"k8s-be-30660--4285d79a798c4100":"HEALTHY","k8s-be-30760--4285d79a798c4100":"HEALTHY"}
  ingress.kubernetes.io/forwarding-rule:         k8s-fw-default-webapp-ingress-ipv6--4285d79a798c4100
Events:
  Type     Reason  Age               From                     Message
  ----     ------  ----              ----                     -------
  Warning  Sync    6m (x22 over 3h)  loadbalancer-controller  Could not find TLS certificates. Continuing setup for the load balancer to serve HTTP. Note: this behavior is deprecated and will be removed in a future version of ingress-gce

After the GCLB path / UrlMap has been online long enough for the 5 successful requests to happen:

screenshot from 2018-12-18 13-33-28

The challenge is then passed to Let's Encrypt. Unfortunately, the Let's Encrypt validation request comes in over IPv6, and the IPv6 Ingress doesn't have the challenge path / cm-acme-http-solver backend, so the request gets passed to my internal application and 404s (note the request IP 2600:1901:0:ed32::):

screenshot from 2018-12-18 13-20-51

The challenge then fails (acme: authorization for identifier captioned-images.duckdns.org is invalid), and the process repeats with a new challenge/path and fails again, ad infinitum, or at least until Let's Encrypt rate limits me.
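To make the mismatch concrete, here is a minimal Python sketch that flags Ingresses serving a host without any ACME challenge path. It is only an illustration of the state shown in the `kubectl describe ing` output, not cert-manager code; the function `ingresses_missing_challenge` and the helper `_ingress` are hypothetical names, and the dicts are trimmed to the fields the check needs.

```python
from typing import List

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def ingresses_missing_challenge(ingresses: List[dict], host: str) -> List[str]:
    """Return names of Ingresses that serve `host` but have no ACME challenge path."""
    missing = []
    for ing in ingresses:
        for rule in ing.get("spec", {}).get("rules", []):
            if rule.get("host") != host:
                continue
            paths = rule.get("http", {}).get("paths", [])
            has_challenge = any(
                p.get("path", "").startswith(CHALLENGE_PREFIX) for p in paths
            )
            if not has_challenge:
                missing.append(ing["metadata"]["name"])
    return missing


def _ingress(name: str, paths: List[str]) -> dict:
    """Build a minimal Ingress-shaped dict (hypothetical helper for this example)."""
    return {
        "metadata": {"name": name},
        "spec": {
            "rules": [
                {
                    "host": "captioned-images.duckdns.org",
                    "http": {"paths": [{"path": p} for p in paths]},
                }
            ]
        },
    }


# State mirroring the describe output: only the IPv4 Ingress got the solver path.
ipv4 = _ingress(
    "webapp-ingress-ipv4",
    ["/*", "/.well-known/acme-challenge/W50w7IcM1NSrDUlXNnF0c6_h86_e4GXHLT-uhSzvIkk"],
)
ipv6 = _ingress("webapp-ingress-ipv6", ["/*"])

print(ingresses_missing_challenge([ipv4, ipv6], "captioned-images.duckdns.org"))
# prints ['webapp-ingress-ipv6']
```

On a live cluster, the same check could presumably be run against the `items` list from `kubectl get ing -o json`, which has the same `metadata` / `spec.rules` shape.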

Expected behaviour:
I expect cert-manager to be amazing and successfully create certificates for both Ingresses 😄

Steps to reproduce the bug:
I'm following the instructions at https://github.com/ahmetb/gke-letsencrypt with a fresh cluster, only changing Kubernetes version to 1.11.5-gke.4 and enabling private networking by checking "Enable VPC-native (using alias IP)" when creating the GKE cluster, and using the above sample manifest for the Ingresses.

Anything else we need to know?:

If I delete the IPv6 Ingress and wait, the process still seems to fail. I'm guessing that's because I still have an AAAA record published, and Let's Encrypt must default to IPv6 if it finds one.
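The guess above can be sketched as a small model. This is an assumption about validator behaviour (prefer IPv6 whenever the host has any IPv6 address), not Let's Encrypt's actual implementation; `group_by_family` and `likely_validation_targets` are hypothetical names, and the addresses are the two Ingress addresses from the describe output.

```python
import ipaddress
from typing import Dict, List


def group_by_family(addresses: List[str]) -> Dict[str, List[str]]:
    """Split textual IP addresses into IPv4 and IPv6 buckets."""
    families: Dict[str, List[str]] = {"ipv4": [], "ipv6": []}
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        families["ipv6" if ip.version == 6 else "ipv4"].append(addr)
    return families


def likely_validation_targets(addresses: List[str]) -> List[str]:
    """Assumption: the validator uses IPv6 whenever any IPv6 address is published."""
    families = group_by_family(addresses)
    return families["ipv6"] or families["ipv4"]


# Addresses of the two Ingresses reported in this issue.
addrs = ["35.244.211.136", "2600:1901:0:ed32::"]
print(likely_validation_targets(addrs))
# prints ['2600:1901:0:ed32::']
```

Under that assumption, the validation request lands on the IPv6 address as long as the AAAA record exists, whether or not the IPv6 Ingress is still there, which would match the failure seen after deleting it.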

Environment details:

  • Kubernetes version (e.g. v1.10.2):

v1.11.5-gke.4

  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc):

GKE

  • cert-manager version (e.g. v0.4.0):

v0.5.2

  • Install method (e.g. helm or static manifests):

cert-manager installed via helm; my application installed via static manifests

/kind bug
