cert-manager is not able to create CertificateRequest (OVHCloud Managed Kubernetes) #4418

Closed
vanillathunder1337 opened this issue Aug 28, 2021 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@vanillathunder1337

Problem description:
Regardless of the kind of Issuer/ClusterIssuer I'm using, the certificates won't be created.
Currently I'm using a self-signed ClusterIssuer to troubleshoot the certificate process.

It already gets stuck before the issuer can take action. I'm getting the following error message when executing

kubectl describe certificate XXX

Events:
  Type     Reason         Age                   From          Message
  ----     ------         ----                  ----          -------
  Normal   Issuing        28m                   cert-manager  Issuing certificate as Secret does not exist
  Normal   Generated      28m                   cert-manager  Stored new private key in temporary Secret resource "wildcard-apps-fainin-dev-selfsigned-ld4jr"
  Warning  RequestFailed  3m40s (x55 over 28m)  cert-manager  Failed to create CertificateRequest: admission webhook "webhook.cert-manager.io" denied the request: [spec.uid: Forbidden: uid identity must be that of the requester, spec.username: Forbidden: username identity must be that of the requester, spec.groups: Forbidden: groups identity must be that of the requester]

After that I increased the debug level of cert-manager to 6 and deployed it again. In the logs of the main cert-manager pod I found the following messages, which are interesting as well:

I0828 13:19:06.744025       1 round_trippers.go:443] POST https://10.3.0.1:443/apis/cert-manager.io/v1/namespaces/default/certificaterequests 406 Not Acceptable in 15 milliseconds
E0828 13:19:06.744444       1 controller.go:158] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="admission webhook \"webhook.cert-manager.io\" denied the request: [spec.uid: Forbidden: uid identity must be that of the requester, spec.username: Forbidden: username identity must be that of the requester, spec.groups: Forbidden: groups identity must be that of the requester]" "key"="default/wildcard-apps-fainin-dev-selfsigned"

So it looks like the pod is getting a 406 when trying to create a certificaterequest.

I also checked if the pod is generally able to communicate with the api:

I0828 13:18:59.570250       1 round_trippers.go:443] POST https://10.3.0.1:443/api/v1/namespaces/default/events 201 Created in 53 milliseconds
I0828 13:18:59.629036       1 round_trippers.go:443] POST https://10.3.0.1:443/api/v1/namespaces/default/secrets 201 Created in 22 milliseconds
...
I0828 13:19:03.913371       1 round_trippers.go:443] GET https://10.3.0.1:443/api/v1/namespaces/kube-system/configmaps/cert-manager-controller 200 OK in 11 milliseconds
I0828 13:19:03.928215       1 round_trippers.go:443] PUT https://10.3.0.1:443/api/v1/namespaces/kube-system/configmaps/cert-manager-controller 200 OK in 14 milliseconds
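
To pick the failing API calls out of verbose logs like these, a small filter along the following lines may help. It is a sketch only: the regular expression is modelled on the `round_trippers.go` lines quoted above, and `failed_requests` is a hypothetical helper name, not part of cert-manager or client-go.

```python
import re

# Matches client-go round_trippers lines such as:
#   ... round_trippers.go:443] POST https://host/path 406 Not Acceptable in 15 milliseconds
LINE_RE = re.compile(
    r"round_trippers\.go:\d+\] (?P<method>[A-Z]+) (?P<url>\S+) "
    r"(?P<status>\d{3}) (?P<reason>.+?) in \d+ milliseconds"
)

def failed_requests(log_text):
    """Return (method, url, status, reason) for every response with status >= 400."""
    failures = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and int(match.group("status")) >= 400:
            failures.append((
                match.group("method"),
                match.group("url"),
                int(match.group("status")),
                match.group("reason"),
            ))
    return failures

# Two of the log lines from this report, used as sample input.
SAMPLE = """\
I0828 13:18:59.629036       1 round_trippers.go:443] POST https://10.3.0.1:443/api/v1/namespaces/default/secrets 201 Created in 22 milliseconds
I0828 13:19:06.744025       1 round_trippers.go:443] POST https://10.3.0.1:443/apis/cert-manager.io/v1/namespaces/default/certificaterequests 406 Not Acceptable in 15 milliseconds
"""

for failure in failed_requests(SAMPLE):
    print(failure)
```

With the sample above, only the `certificaterequests` POST is reported, which matches the observation that the other API calls succeed.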

Expected behaviour:
Clean creation of the CertificateRequests

Steps to reproduce the bug:

  1. Create an OVHcloud Managed Kubernetes cluster
  2. Configure kubectl
  3. Install cert-manager
  4. Configure a ClusterIssuer or Issuer
  5. Create a Certificate
  6. Run kubectl describe certificate and check the events

ClusterIssuer and Certificate YAML:
clusterissuer.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-cluster-issuer
spec:
  selfSigned: {}

(!! Tried many different issuers - but the error message isn't different)

certificate.yaml

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: wildcard-selfsigned
spec:
  secretName: wildcard-tls-selfsigned
  renewBefore: 240h
  dnsNames:
  - '*.mydomain.com'
  issuerRef:
    name: selfsigned-cluster-issuer
    kind: ClusterIssuer

(!! Also tried many different ways of creating certificates - but this one is the easiest)

Environment details:

  • Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:18:45Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:20:00Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud-provider/provisioner:
    OVH Cloud Managed Kubernetes
  • cert-manager version:
    currently 1.3.1
    also tried 1.5.3
  • Install method:
helm upgrade -i cert-manager jetstack/cert-manager --namespace cert-manager  --create-namespace  --version v1.3.1

I also tried modifying some values in values.yaml that may be related to this issue, but I'm getting the same errors as with the default configuration.

My guesses
I honestly think there is a problem with RBAC or some other kind of access restriction.
I'm rather tired of searching for error messages that return literally no results anywhere on the internet. Maybe one of you has already faced this issue and can help me with it.

Thank you in advance for any help.

@irbekrm
Contributor

irbekrm commented Aug 31, 2021

Thanks for creating the issue.

[spec.uid: Forbidden: uid identity must be that of the requester, spec.username: Forbidden: username identity must be that of the requester, spec.groups: Forbidden: groups identity must be that of the requester]

This error comes from the cert-manager validating webhook here. The identity fields (uid, username, groups) should have been populated by the mutating webhook here just before the validating webhook gets called. For a CertificateRequest that gets created from a Certificate, as in your case, they should have been the UID, name and groups of the service account assigned to the cert-manager controller pod.
It would appear that in your case there was somehow a create call to the validating webhook with identity fields different from those in the original call to the mutating webhook. I am not entirely sure at the moment how that may have happened.
It may or may not be an OVH-specific issue.
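
The check described above can be sketched roughly as follows. This is an illustration in Python, not cert-manager's actual Go code: `validate_identity`, the dictionary shapes, and the sample identity values are assumptions, and only the error wording is taken from the message quoted above.

```python
def validate_identity(spec, user_info):
    """Return Forbidden-style errors for spec identity fields that differ from the requester's."""
    errors = []
    # Compare each identity field stored on the CertificateRequest spec
    # with the identity of whoever sent the CREATE request.
    for field in ("uid", "username", "groups"):
        if spec.get(field) != user_info.get(field):
            errors.append(
                f"spec.{field}: Forbidden: {field} identity must be that of the requester"
            )
    return errors

# Hypothetical requester identity (placeholder values).
requester = {
    "uid": "example-uid",
    "username": "system:serviceaccount:cert-manager:cert-manager",
    "groups": ["system:serviceaccounts", "system:authenticated"],
}

# Spec as it would look after the mutating webhook copied the identity in.
mutated_spec = dict(requester, request="LS0t...")
print(validate_identity(mutated_spec, requester))

# Spec whose identity fields were never populated.
print(validate_identity({"request": "LS0t..."}, requester))
```

The second call yields the same three errors as in the report, which is consistent with the identity fields being missing or different by the time validation runs.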

Have you got any CertificateRequest successfully created and if so do the identity fields on them match the cert-manager service account's identity?

helm upgrade -i cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --version v1.3.1

How do you upgrade the CRDs? Are there any other values that are configured for the deployment?

also tried 1.5.3

You should probably upgrade to v1.5.3; v1.3.1 is no longer supported.

I wonder if we should print the actual not-matching identity fields to those validating webhook error messages, so it's easier to debug.

@vanillathunder1337
Author

Hi @irbekrm
thank you for your detailed review. I already saw the mentioned code you linked here. But without reverse engineering the entire project, I won't be able to debug it effectively.

Have you got any CertificateRequest successfully created and if so do the identity fields on them match the cert-manager service account's identity?

No, I tried many different issuers and every one of them fails before that point with the above error message.

How do you upgrade the CRDs? Are there any other values that are configured for the deployment?

I tried both ways: setting the installCRDs parameter in values.yaml, or installing them from a raw YAML like this.

I wonder if we should print the actual not-matching identity fields to those validating webhook error messages, so it's easier to debug.

That would be very helpful!

I'm still thinking that this is an OVH problem. That would explain the uniqueness of the issue.

Thank you for all your work and in advance for any further

@kdasme

kdasme commented Sep 3, 2021

Hey @vanillathunder1337 - just hit the same issue, and you guessed it — on OVH k8s as well! :)
Don't really know what has changed, but after I recreated ClusterIssuer (which I use ZeroSSL for) it is no longer an issue.

@vanillathunder1337
Author

Hmm, I'm still having problems with Let's Encrypt, even if I reinstall the entire cluster.

@zakov-kara

Hi @vanillathunder1337 , I had the same problem.

I contacted OVH support and they told me to restart the control plane :
https://eu.api.ovh.com/console/#/cloud/project/{serviceName}/kube/{kubeId}/restart#POST

After reinstalling cert-manager with Helm, everything worked fine.

@maelvls
Member

maelvls commented Sep 29, 2021

/retitle cert-manager is not able to create CertificateRequest (OVH Cloud Managed Kubernetes)

@maelvls maelvls added the kind/bug Categorizes issue or PR as related to a bug. label Sep 29, 2021
@vanillathunder1337
Author

Hi @zakov-kara , it worked for me as well. I'm still wondering why but whatever ...

@maelvls I think the "bug" is solved but some enhanced logging would be nice.

Regards

@maelvls maelvls changed the title from "cert-manager is not able to create CertificateRequest" to "cert-manager is not able to create CertificateRequest (OVHCloud Managed Kubernetes)" Oct 6, 2021
@maelvls
Member

maelvls commented Oct 6, 2021

Thanks for the update @zakov-kara and @vanillathunder1337!

I agree, the error message is somewhat cryptic. For example, it doesn't say what the username of the "requester" is (the one who tries to create the CertificateRequest; in this case, that would be cert-manager itself). A message with "expected/got" might be more helpful:

Failed to create CertificateRequest: admission webhook "webhook.cert-manager.io" denied the request:
the identity (uid, username, and groups fields) of the creator of this CertificateRequest doesn't
match the identity of the CertificateRequest: the username on the CertificateRequest is '...' but
the username of the creator of this CertificateRequest is '...'.
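
Such an "expected/got" message could be assembled along these lines. This is only a sketch of the idea in Python: `describe_mismatch` is a hypothetical helper, the sample values are made up, and the wording is not what cert-manager prints.

```python
def describe_mismatch(spec, user_info, fields=("uid", "username", "groups")):
    """Build one 'on the CertificateRequest ... but the creator ...' clause per differing field."""
    clauses = []
    for field in fields:
        on_request = spec.get(field)
        of_creator = user_info.get(field)
        if on_request != of_creator:
            clauses.append(
                f"the {field} on the CertificateRequest is {on_request!r} "
                f"but the {field} of the creator is {of_creator!r}"
            )
    return "; ".join(clauses)

message = describe_mismatch(
    {"username": "someone-else"},  # hypothetical value found on the CertificateRequest
    {"username": "system:serviceaccount:cert-manager:cert-manager"},
    fields=("username",),
)
print(message)
```

Printing both values side by side would show immediately whether a field is empty, stale, or belongs to a different account.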

I am still confused as to how all this should be interpreted though.

@maelvls
Member

maelvls commented Oct 7, 2021

I'll close the issue for now since a fix has been found. I'll try to have a PR with better error messages.

@maelvls maelvls closed this as completed Oct 7, 2021
@maelvls
Member

maelvls commented Oct 7, 2021

I decided to dig a bit more, since I find it surprising that OVHCloud's apiserver needs to be restarted.

When the CertificateRequest is created, the apiserver makes two calls to cert-manager-webhook:

  1. /mutate,
  2. /validate.

/mutate

(I transformed the HTTP request body into YAML and removed some fields to make it more legible)

# POST /mutate?timeout=10s HTTP/1.1
# Host: localhost:8081
# User-Agent: kube-apiserver-admission
# Content-Length: 3617
# Accept: application/json, */*
# Content-Type: application/json
# Accept-Encoding: gzip

kind: AdmissionReview
apiVersion: admission.k8s.io/v1
request:
  operation: CREATE
  userInfo:
    username: "system:serviceaccount:cert-manager:cert-manager"
    uid: 8a21a8aa-7c47-4051-9640-3730ef8c5e9a
    groups:
      - "system:serviceaccounts"
      - "system:serviceaccounts:cert-manager"
      - "system:authenticated"
    extra:
      authentication.kubernetes.io/pod-name:
        - cert-manager-64df6fbb55-sdfcw
      authentication.kubernetes.io/pod-uid:
        - 7fa154c0-196f-46f5-8936-c5c1359d8360
  object:
    apiVersion: cert-manager.io/v1
    kind: CertificateRequest
    spec:
      issuerRef:
        group: cert-manager.io
        kind: Issuer
        name: letsencrypt
      request: LS0tLS1CRUd...LS0tCg==
      usages:
        - "digital signature"
        - "key encipherment"
    status: {}
  options:
    kind: CreateOptions
    apiVersion: meta.k8s.io/v1

The webhook adds the uid, username, etc. as a result of mutate:

# HTTP/1.1 200 OK
# Date: Thu, 07 Oct 2021 15:44:48 GMT
# Content-Type: text/plain; charset=utf-8
# Transfer-Encoding: chunked
# content-length: 4737

kind: AdmissionReview
apiVersion: admission.k8s.io/v1
request:
  # Same as in the HTTP request.
response:
  uid: 5bcca4f4-6204-4085-bf6b-1ab3cd6fd637
  allowed: true
  patch: # This is normally base64-encoded, I decoded it for readability.
    [
      {
        "op": "add",
        "path": "/spec/extra",
        "value":
          {
            "authentication.kubernetes.io/pod-name":
              ["cert-manager-64df6fbb55-sdfcw"],
            "authentication.kubernetes.io/pod-uid":
              ["7fa154c0-196f-46f5-8936-c5c1359d8360"],
          },
      },
      {
        "op": "add",
        "path": "/spec/groups",
        "value":
          [
            "system:serviceaccounts",
            "system:serviceaccounts:cert-manager",
            "system:authenticated",
          ],
      },
      {
        "op": "add",
        "path": "/spec/uid",
        "value": "8a21a8aa-7c47-4051-9640-3730ef8c5e9a",
      },
      {
        "op": "add",
        "path": "/spec/username",
        "value": "system:serviceaccount:cert-manager:cert-manager",
      },
    ]
  patchType: JSONPatch
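
For illustration, a patch like the one in this response could be built and encoded as follows. This is a sketch in Python under assumptions: `build_identity_patch` is a hypothetical name, and the real webhook is written in Go and may construct or order the operations differently.

```python
import base64
import json

def build_identity_patch(user_info):
    """Create JSONPatch 'add' operations copying the requester's identity into spec."""
    mapping = (
        ("extra", "/spec/extra"),
        ("groups", "/spec/groups"),
        ("uid", "/spec/uid"),
        ("username", "/spec/username"),
    )
    return [
        {"op": "add", "path": path, "value": user_info[key]}
        for key, path in mapping
        if key in user_info  # skip identity fields the apiserver did not send
    ]

# userInfo values taken from the /mutate request shown above (extra omitted).
user_info = {
    "username": "system:serviceaccount:cert-manager:cert-manager",
    "uid": "8a21a8aa-7c47-4051-9640-3730ef8c5e9a",
    "groups": [
        "system:serviceaccounts",
        "system:serviceaccounts:cert-manager",
        "system:authenticated",
    ],
}

patch = build_identity_patch(user_info)
# The AdmissionReview response carries the patch as base64-encoded JSON.
encoded = base64.b64encode(json.dumps(patch).encode("utf-8")).decode("ascii")
print(encoded)
```

Decoding `encoded` with `base64.b64decode` and `json.loads` gives back the list of operations, which is how the readable form shown above was obtained from the raw response.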

After /mutate, the CertificateRequest is validated:

/validate

# POST /validate?timeout=10s HTTP/1.1
# Host: localhost:8081
# User-Agent: kube-apiserver-admission
# Content-Length: 4096
# Accept: application/json, */*
# Content-Type: application/json
# Accept-Encoding: gzip

kind: AdmissionReview
apiVersion: admission.k8s.io/v1
request:
  operation: CREATE
  userInfo:
    username: "system:serviceaccount:cert-manager:cert-manager"
    uid: 8a21a8aa-7c47-4051-9640-3730ef8c5e9a
    groups:
      - "system:serviceaccounts"
      - "system:serviceaccounts:cert-manager"
      - "system:authenticated"
    extra:
      authentication.kubernetes.io/pod-name:
        - cert-manager-64df6fbb55-sdfcw
      authentication.kubernetes.io/pod-uid:
        - 7fa154c0-196f-46f5-8936-c5c1359d8360
  object:
    apiVersion: cert-manager.io/v1
    kind: CertificateRequest
    spec:
      extra:
        authentication.kubernetes.io/pod-name:
          - cert-manager-64df6fbb55-sdfcw
        authentication.kubernetes.io/pod-uid:
          - 7fa154c0-196f-46f5-8936-c5c1359d8360
      groups:
        - "system:serviceaccounts"
        - "system:serviceaccounts:cert-manager"
        - "system:authenticated"
      issuerRef:
        group: cert-manager.io
        kind: Issuer
        name: letsencrypt
      request: LS0tLS1C...VC0tLS0tCg==
      uid: 8a21a8aa-7c47-4051-9640-3730ef8c5e9a
      usages:
        - "digital signature"
        - "key encipherment"
      username: "system:serviceaccount:cert-manager:cert-manager"

Since the uid, username, groups and extras all match, the validation succeeds:

# HTTP/1.1 200 OK
# Date: Thu, 07 Oct 2021 15:44:48 GMT
# Content-Type: text/plain; charset=utf-8
# Transfer-Encoding: chunked
# content-length: 4503

kind: AdmissionReview
apiVersion: admission.k8s.io/v1
request:
  # Same as in the request.
response:
  uid: 32169f3b-65f3-4fe0-a48a-ec17eeae73ef
  allowed: true

My guess with OVHCloud is that, somehow, the initial /mutate doesn't get called, which makes the second call (/validate) fail.
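
That guess can be illustrated with a toy model of the two admission steps. This is a sketch, assuming a simplified `mutate`/`validate` pair written in Python; it does not reproduce the apiserver or the cert-manager webhook, only the order dependency between the two calls.

```python
IDENTITY_FIELDS = ("uid", "username", "groups")

def mutate(spec, user_info):
    """Mimic /mutate: return a copy of spec with the requester's identity filled in."""
    updated = dict(spec)
    for field in IDENTITY_FIELDS:
        updated[field] = user_info[field]
    return updated

def validate(spec, user_info):
    """Mimic /validate: allowed only if every identity field matches the requester."""
    return all(spec.get(field) == user_info[field] for field in IDENTITY_FIELDS)

def admit(spec, user_info, call_mutate=True):
    """Run the admission chain, optionally skipping the mutating step."""
    if call_mutate:
        spec = mutate(spec, user_info)
    return validate(spec, user_info)

user_info = {
    "uid": "8a21a8aa-7c47-4051-9640-3730ef8c5e9a",
    "username": "system:serviceaccount:cert-manager:cert-manager",
    "groups": ["system:serviceaccounts", "system:authenticated"],
}
# A freshly created CertificateRequest spec has no identity fields yet.
new_request = {"issuerRef": {"kind": "Issuer", "name": "letsencrypt"}}

print(admit(new_request, user_info))                     # mutate ran first
print(admit(new_request, user_info, call_mutate=False))  # mutate skipped
```

In this model the request is allowed when `mutate` runs first and denied when it is skipped, because the identity fields are simply absent at validation time.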
