Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cert-manager does not allow us to create certs for 64+ bytes name ksvc #214

Closed
nak3 opened this issue May 18, 2021 · 24 comments
Closed

cert-manager does not allow us to create certs for 64+ bytes name ksvc #214

nak3 opened this issue May 18, 2021 · 24 comments
Assignees
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@nak3
Copy link
Contributor

nak3 commented May 18, 2021

Step to reproduce

1. Create ksvc domain with long name.

e.g.

$ kn service create xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx --image=gcr.io/knative-samples/helloworld-go

$ kubectl get ksvc
NAME                                        URL                                                                              LATESTCREATED                                       LATESTREADY                                         READY   REASON
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   http://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.default.18.136.81.50.sslip.io   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-kqxyw-1   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-kqxyw-1   True

As you can see, http does not become https

2. kcert becomes ReconcileFailed

$ kubectl get kcert
NAME                                         READY     REASON
route-d15c50c3-c3e6-4a1b-b49b-4ebdc7f1a363   Unknown   ReconcileFailed

$  kubectl get kcert -o yaml route-d15c50c3-c3e6-4a1b-b49b-4ebdc7f1a363
apiVersion: networking.internal.knative.dev/v1alpha1
kind: Certificate
metadata:
  annotations:
    networking.knative.dev/certificate.class: cert-manager.certificate.networking.knative.dev
    serving.knative.dev/creator: kubecfg-knakayam
    serving.knative.dev/lastModifier: kubecfg-knakayam
  creationTimestamp: "2021-05-18T06:29:32Z"
  generation: 1
  labels:
    serving.knative.dev/route: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  name: route-d15c50c3-c3e6-4a1b-b49b-4ebdc7f1a363
  namespace: default
  ownerReferences:
  - apiVersion: serving.knative.dev/v1
    blockOwnerDeletion: true
    controller: true
    kind: Route
    name: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    uid: d15c50c3-c3e6-4a1b-b49b-4ebdc7f1a363
  resourceVersion: "35202"
  uid: f9303095-1e9c-4869-92ad-60aa898a599c
spec:
  dnsNames:
  - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.default.18.136.86.50.sslip.io
  secretName: route-d15c50c3-c3e6-4a1b-b49b-4ebdc7f1a363
status:
  conditions:
  - lastTransitionTime: "2021-05-18T06:29:32Z"
    message: Cert-Manager certificate has not yet been reconciled.
    reason: ReconcileFailed
    status: Unknown
    type: Ready
  observedGeneration: 1

3. certmanager pod produces following error.

$ kubectl -n knative-serving logs networking-certmanager-8956c788f-n
{"severity":"INFO","timestamp":"2021-05-18T06:17:21.708101824Z","logger":"certcontroller.certificate-controller.event-broadcaster","caller":"record/event.go:278","message":"Event(v1.ObjectReference{Kind:\"Certificate\", Namespace:\"serving-tests\", Name:\"route-089742a5-58c7-4e70-85e7-f512c3f5fca2\", UID:\"72818c29-1ae3-4200-935e-ef9b9aaf168e\", APIVersion:\"networking.internal.knative.dev/v1alpha1\", ResourceVersion:\"24711\", FieldPath:\"\"}): type: 'Warning' reason: 'InternalError' failed to create Cert-Manager Certificate: admission webhook \"webhook.cert-manager.io\" denied the request: spec.commonName: Too long: must have at most 64 bytes","commit":"d43cafb","knative.dev/controller":"certificate-controller"}

It says spec.commonName: Too long: must have at most 64 bytes".

Additional Info

Normal name ksvc could create kcert as expected.

$ kn service create hello-example --image=gcr.io/knative-samples/helloworld-go

$ kubectl get ksvc
NAME                                        URL                                                                              LATESTCREATED                                       LATESTREADY                                         READY   REASON
hello-example                               https://hello-example.default.18.136.86.50.sslip.io                              hello-example-zjdmt-1                               hello-example-zjdmt-1                               True

$  kubectl get kcert
NAME                                         READY     REASON
route-6842ba27-5b1c-478f-8d42-426519471e3b   True
route-d15c50c3-c3e6-4a1b-b49b-4ebdc7f1a363   Unknown   ReconcileFailed

cert-manager generate certificates.cert-manager.io which has spec.commonName with the domain name.

$ kubectl get certificates.cert-manager.io -o yaml
apiVersion: v1
items:
- apiVersion: cert-manager.io/v1
  kind: Certificate
  metadata:
    annotations:
      networking.knative.dev/certificate.class: cert-manager.certificate.networking.knative.dev
      serving.knative.dev/creator: kubecfg-knakayam
      serving.knative.dev/lastModifier: kubecfg-knakayam
    creationTimestamp: "2021-05-18T05:54:31Z"
    generation: 1
    labels:
      serving.knative.dev/route: hello-example
    name: route-6842ba27-5b1c-478f-8d42-426519471e3b
    namespace: default
    ownerReferences:
    - apiVersion: networking.internal.knative.dev/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: Certificate
      name: route-6842ba27-5b1c-478f-8d42-426519471e3b
      uid: c2a97c97-0461-448e-94a0-2ff35ab2bc22
    resourceVersion: "5267"
    uid: 8bb09d87-6919-4f37-b316-9ec222ae8a7b
  spec:
    commonName: hello-example.default.18.136.86.50.sslip.io
    dnsNames:
    - hello-example.default.18.136.86.50.sslip.io
    issuerRef:
      kind: ClusterIssuer
      name: ca-issuer
    secretName: route-6842ba27-5b1c-478f-8d42-426519471e3b
    ...
@dprotaso
Copy link
Contributor

Is there a reason why cert manager has this 64 byte limit?

@nak3
Copy link
Contributor Author

nak3 commented May 20, 2021

Yes, cert-manager/cert-manager#1462 has the discussion.

It seems CA has the limitation as per the comment linked in above issue.

The Common Name field in certificates is limited to 64 characters. Conveniently, it’s been deprecated for 15 20 years. Inconveniently, CAs still use it.
https://community.letsencrypt.org/t/a-certificate-for-a-63-character-domain/78870

@github-actions
Copy link

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 19, 2021
@nak3 nak3 removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 19, 2021
@dprotaso
Copy link
Contributor

dprotaso commented Aug 24, 2021

@nak3 is there anything we can do regarding this issue?

Thinking of a few things

  1. Document this limitation on the website
  2. Ensure we surface this error message in our status conditions properly

@nak3
Copy link
Contributor Author

nak3 commented Aug 25, 2021

/assign

Yes, I think these two suggestions are what we can do now. I will work on it.

@github-actions
Copy link

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 23, 2021
@LeonardAukea
Copy link

LeonardAukea commented Mar 16, 2022

@nak3 is there anything we can do regarding this issue?

Thinking of a few things

  1. Document this limitation on the website
  2. Ensure we surface this error message in our status conditions properly

@nak3 Why not just ensure that the commonName is less than 64 bytes, by setting some uid for commonName ?

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: route-5178950c-94db-4563-9e3f-bd7583570bd8
  namespace: knative-serving
spec:
  commonName: service-name.namespace.example.com
  dnsNames:
  - service-name.namespace.example.com
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt
  secretName: route-5178950c-94db-4563-9e3f-bd7583570bd8

you can just as easily read from dnsNames what this cert is for. Keeping it as is will entail that this limit will be hit very fast.

@LeonardAukea
Copy link

LeonardAukea commented Mar 16, 2022

@nak3 nak3 reopened this Mar 17, 2022
@nak3 nak3 added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 17, 2022
@nak3 nak3 removed their assignment Mar 17, 2022
@nak3
Copy link
Contributor Author

nak3 commented Mar 17, 2022

/cc @ZhiminXiang @itsmurugappan

Could you please check the Leonard's comment #214 (comment) ?

@ZhiminXiang
Copy link

I think reverting #188 may break the support to venafi.

@itsmurugappan could you double check?

@itsmurugappan
Copy link
Contributor

yes, reverting would break venafi support.

@dprotaso
Copy link
Contributor

Does it make sense to make setting the common name configurable? It's unclear to me if Venafi is an outlier so I don't know what the default should behaviour should be.

@LeonardAukea
Copy link

Does it make sense to make setting the common name configurable? It's unclear to me if Venafi is an outlier so I don't know what the default should behaviour should be.

I think the default behaviour should be to try to stick to standards for certificates. Venafi should consider changing their implementation to support standards as well.

@evankanderson
Copy link
Contributor

Can people comment on #380? This is breaking at least one customer who does end up with longer names, and shorter commonNames would un-break their use-case.

@ZhiminXiang
Copy link

ZhiminXiang commented Mar 24, 2022

There are two facts we need to be aware:

  1. LetsEncrypt requires commonName to be set. The length of commonName needs to be smaller than 63.
  2. when interacting with LetsEncrypt/ACME CA, cert-manager will set the first value of DNSNames as the commonName if the commonName is not set per CN was longer than 64 bytes cert-manager/cert-manager#1462 (comment).

Considering this, I would say the error @nak3 reported is expected no matter if we set commonName or not on our end as there is only one DNSNames which is longer than 63 char, and will be used as commonName by cert-manager when interacting with LE.

What we can do better is to handle the case where there are multiple values in the DNSNames field. For this case, we could pick up the shortest DNSNames as the commonName to meet the requirement of 63 char assuming that the short one is shorter than 63 char.

@carlisia carlisia added this to the v1.6.0 milestone Jun 10, 2022
@ianonavy
Copy link

As an end user, I'd like to add my perspective for how I would expect this to work with AutoTLS. Instead of automatically using top-level domain (which as mentioned here could lead to unexpected security issues), KNative could expose a common-name-template config. It could default to "" which falls back to the current behavior, otherwise the example config-network could recommend "{{.Domain}}" or "placeholder.{{.Domain}}". Then, there should be a clear section in the documentation about CNs longer than 63 chars explaining the nuance of using the top-level domain vs. some other placeholder and what the requirements should be in each case.

@evankanderson
Copy link
Contributor

I can make this configurable, but I think my default would be a.{{.Domain}} or maybe k.{{.Domain}}, because our expectation is that *.{{.Domain}} is mapped in DNS (e.g. via a wildcard record) to the cluster Ingress. We don't assume that {{.Domain}} itself is mapped.

@KauzClay
Copy link
Contributor

/assign

@KauzClay
Copy link
Contributor

KauzClay commented Jan 9, 2023

I've started a new set of PRs to address this issue. The main one for net-certmanager is here: #472

It links to the other dependent PRs.

Any feedback on those would be appreciated!

@dprotaso dprotaso removed this from the v1.6.0 milestone Jan 19, 2023
@dprotaso
Copy link
Contributor

dprotaso commented Jan 23, 2023

Following up - I've been testing Clay's changes.

Rebased PR+test fixes: #478
Serving Changes: knative/serving#13621

Few observations about the k.{domain} trick

  • The route reconciler creates the modifies the Route's Knative Ingress's to route the HTTP01 challenges to the cert manager pods. This is what the HTTP01 challenges on the certificate is used for. The serving route reconciler logic filters HTTP01 challenges that don't match the route full domain. This guard needs to be dropped (which I've done in my testing PR)
  • Once the initial k.{domain} certificate is issued subsequent certs with this DNS name don't perform the challenge. This causes issues for net-certmanager because it typically looks for a K8s service created by cert-manager which doesn't exist
  • Serving HTTP01 tests weren't running & working (Don't explicitly require a service account key for autotls dns tests knative/serving#13627) at all

The second issue is obvious when using route tags and I think we have a few options from the top of my head:

  • Scan list of existing cert manager certs with Ready=True and see if the missing domain is present
  • Use a unique commonName for route/cert - maybe use knative's cert's uuid and make it shorter by changing the base/alpha used.

@dprotaso
Copy link
Contributor

Scanning a list of certs isn't so bad (0c452ab) I just realize we have to do it across all namespaces.

@dprotaso
Copy link
Contributor

Serving tests passed here: https://prow.knative.dev/view/gs/knative-prow/pr-logs/pull/knative_serving/13621/istio-latest-no-mesh-tls_serving_main/1617586794230976512

This tested the change where we scan other 'Ready' certs

@dprotaso
Copy link
Contributor

Talked to @KauzClay we'll go ahead an land a fix using the cert look up - though it is uncertain whether it works when renewing certs (which is already broken #416).

Thus we'll land this fix in 1.9 and do a subsequent point release that swiches the code over to attempt unique commonNames and fixing renewal in the serving code.

knative-prow bot pushed a commit that referenced this issue Mar 28, 2023
* fix: don't use childname

* childname puts the hash at the end of the domain, making the domain invalid for the challenge
* this means that it is still possible for a cert to be created with a common name longer than 63 bytes if the domain itself is really large
* however, it should be less likely. hopefully that can be addressed completely if subsequent commits

* fix: make sure checking for existing cert excludes itself

* add logging to renewing codepath

* wip: create unique common name

* still can't guarantee that it will be a valid length, though this will hopefully greatly reduce that possibility

* wip: don't need to look for other certs

* if we always create unique common names for the certs, we shouldn't need to find another cert with the same common name
* don't need to log for this anymore either

* use UID as prefix to common name when original is too long

* respond with appropriate errors in different cases where shortening doesn't work
* return status conditions and update kcert accordingly

* run update codegen

* remove extra status, just update ready status instead

* use trimsuffix
knative-prow bot pushed a commit that referenced this issue Mar 30, 2023
* fix: don't use childname

* childname puts the hash at the end of the domain, making the domain invalid for the challenge
* this means that it is still possible for a cert to be created with a common name longer than 63 bytes if the domain itself is really large
* however, it should be less likely. hopefully that can be addressed completely if subsequent commits

* fix: make sure checking for existing cert excludes itself

* add logging to renewing codepath

* wip: create unique common name

* still can't guarantee that it will be a valid length, though this will hopefully greatly reduce that possibility

* wip: don't need to look for other certs

* if we always create unique common names for the certs, we shouldn't need to find another cert with the same common name
* don't need to log for this anymore either

* use UID as prefix to common name when original is too long

* respond with appropriate errors in different cases where shortening doesn't work
* return status conditions and update kcert accordingly

* run update codegen

* remove extra status, just update ready status instead

* use trimsuffix

---------

Co-authored-by: Clay Kauzlaric <ckauzlaric@vmware.com>
@dprotaso
Copy link
Contributor

dprotaso commented Apr 4, 2023

Patch release is out I'm going to close this out
https://github.com/knative-sandbox/net-certmanager/releases/tag/knative-v1.9.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Projects
None yet
9 participants