
requestmanager_controller got stuck in a loop and stopped generating new certificates afterward #3565

Closed
ragoragino opened this issue Jan 13, 2021 · 24 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ragoragino

Describe the bug:
At some point, it seems that communication between cert-manager-cainjector and the API server stopped working (we received a few EOF logs and subsequently "Successfully Reconciled" logs in cert-manager-cainjector). However, after communication was restored, we started receiving:

1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item  due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="default/stan-client-tls"

After another while (roughly 10s), the controller moved on with processing items, but emitted this log for each of the previously reported keys:

I0112 15:48:53.799058       1 requestmanager_controller.go:196] cert-manager/controller/CertificateRequestManager "msg"="Multiple matching CertificateRequest resources exist, delete one of them. This is likely an error and should be reported on the issue tracker!" "key"="default/stan-client-tls"

Afterward, the generation of this certificate stopped altogether.

In the Kubernetes environment, we could see that multiple CertificateRequest objects had been generated for the stan-client-tls Certificate with the same revision number. So the client interface (https://github.com/jetstack/cert-manager/blob/cdc53b65cbd344dbef64f0c5c22e6070e79c5b5c/pkg/controller/certificates/requestmanager/requestmanager_controller.go#L339) was probably fully working and creating new instances, while the certificateRequestLister was unable to see the proper current state (https://github.com/jetstack/cert-manager/blob/cdc53b65cbd344dbef64f0c5c22e6070e79c5b5c/pkg/controller/certificates/requestmanager/requestmanager_controller.go#L165).
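
To make the suspected race concrete, here is a minimal, self-contained sketch (the helper names createRequest, listerHasRequest and waitForRequestInCache are hypothetical stand-ins, not the actual cert-manager code) of how a create that succeeds on the API server, combined with a lister cache that lags behind, can pile up duplicate CertificateRequests:

```go
// Hypothetical sketch of the failure mode: the create goes straight to the
// API server, but the follow-up check reads the informer's local cache, which
// can lag behind and never show the new object within the timeout.
package main

import (
	"errors"
	"fmt"
	"time"
)

// createRequest stands in for the API-server write that creates a new
// CertificateRequest; in this sketch it always succeeds.
func createRequest(name string) error { return nil }

// listerHasRequest stands in for the shared-informer cache lookup; while the
// cache lags behind the API server it keeps returning false.
func listerHasRequest(name string) bool { return false }

// waitForRequestInCache polls the local cache until the timeout, mirroring the
// "failed whilst waiting for CertificateRequest to exist" path.
func waitForRequestInCache(name string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if listerHasRequest(name) {
			return nil
		}
		time.Sleep(200 * time.Millisecond)
	}
	return errors.New("failed whilst waiting for CertificateRequest to exist")
}

func main() {
	// Each retry creates a *new* CertificateRequest on the API server, then
	// gives up waiting for the stale cache, so duplicates accumulate.
	for attempt := 1; attempt <= 3; attempt++ {
		name := fmt.Sprintf("stan-client-tls-%d", attempt)
		_ = createRequest(name)
		if err := waitForRequestInCache(name, time.Second); err != nil {
			fmt.Println("re-queuing item due to error processing:", err)
			continue
		}
		break
	}
}
```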

Expected behaviour:
The controller should probably delete the unused CertificateRequest objects and continue creating new ones until one of them succeeds.

Environment details:

  • Kubernetes version: 1.17.9
  • Cloud-provider/provisioner: Azure
  • cert-manager version: 1.0.3
  • Install method: Helm

/kind bug

@jetstack-bot jetstack-bot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 13, 2021
@smacl

smacl commented Feb 18, 2021

We recently observed a similar issue where multiple CertificateRequests were created by cert-manager (not cainjector in our case) for the same Certificate.

First,

I0216 08:07:52.779649       1 logs.go:179] cert-manager/controller/build-context "msg"="Event(v1.ObjectReference{Kind:\"Certificate\", Namespace:\"cert-manager\", Name:\"mycert\", UID:\"b939275f-915d-444e-b6ef-87defb6e7070\", APIVersion:\"cert-manager.io/v1\", ResourceVersion:\"12914\", FieldPath:\"\"}): type: 'Normal' reason: 'Requested' Created new CertificateRequest resource \"mycert-jtfbl\""

Then a retry:

E0216 08:07:57.779474       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="cert-manager/mycert"

And another CertificateRequest,

I0216 08:07:58.403286       1 logs.go:179] cert-manager/controller/build-context "msg"="Event(v1.ObjectReference{Kind:\"Certificate\", Namespace:\"cert-manager\", Name:\"mycert\", UID:\"b939275f-915d-444e-b6ef-87defb6e7070\", APIVersion:\"cert-manager.io/v1\", ResourceVersion:\"12914\", FieldPath:\"\"}): type: 'Normal' reason: 'Requested' Created new CertificateRequest resource \"mycert-nmgx7\""

After which there are repeated errors during processing of both requests:

I0216 08:07:58.503826       1 requestmanager_controller.go:196] cert-manager/controller/CertificateRequestManager "msg"="Multiple matching CertificateRequest resources exist, delete one of them. This is likely an error and should be reported on the issue tracker!" "key"="cert-manager/mycert"

Eventually both requests seemingly succeeded:

I0216 08:09:17.853062       1 ca.go:124] cert-manager/controller/certificaterequests-issuer-ca/sign "msg"="certificate issued" "resource_kind"="CertificateRequest" "resource_name"="mycert-nmgx7" "resource_namespace"="cert-manager" "resource_version"="v1"
I0216 08:09:17.853183       1 conditions.go:222] Found status change for CertificateRequest "mycert-nmgx7" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-02-16 08:09:17.853177395 +0000 UTC m=+324.734716872
I0216 08:09:17.853342       1 logs.go:179] cert-manager/controller/build-context "msg"="Event(v1.ObjectReference{Kind:\"CertificateRequest\", Namespace:\"cert-manager\", Name:\"mycert-nmgx7\", UID:\"88ec3dc9-c65d-4d84-842a-1219bd42ede1\", APIVersion:\"cert-manager.io/v1\", ResourceVersion:\"12999\", FieldPath:\"\"}): type: 'Normal' reason: 'CertificateIssued' Certificate fetched from issuer successfully"
I0216 08:09:17.854980       1 ca.go:124] cert-manager/controller/certificaterequests-issuer-ca/sign "msg"="certificate issued" "resource_kind"="CertificateRequest" "resource_name"="mycert-jtfbl" "resource_namespace"="cert-manager" "resource_version"="v1"
I0216 08:09:17.855206       1 conditions.go:222] Found status change for CertificateRequest "mycert-jtfbl" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-02-16 08:09:17.855198125 +0000 UTC m=+324.736737702
I0216 08:09:17.855343       1 logs.go:179] cert-manager/controller/build-context "msg"="Event(v1.ObjectReference{Kind:\"CertificateRequest\", Namespace:\"cert-manager\", Name:\"mycert-jtfbl\", UID:\"b4fcca1e-9429-4876-a3f6-5446d6de650f\", APIVersion:\"cert-manager.io/v1\", ResourceVersion:\"12998\", FieldPath:\"\"}): type: 'Normal' reason: 'CertificateIssued' Certificate fetched from issuer successfully"

After which they were considered Ready:

I0216 08:09:17.938477       1 sync.go:59] cert-manager/controller/certificaterequests-issuer-ca "msg"="certificate request Ready condition true so skipping processing" "resource_kind"="CertificateRequest" "resource_name"="mycert-nmgx7" "resource_namespace"="cert-manager" "resource_version"="v1"
I0216 08:09:17.941633       1 sync.go:59] cert-manager/controller/certificaterequests-issuer-ca "msg"="certificate request Ready condition true so skipping processing" "resource_kind"="CertificateRequest" "resource_name"="mycert-jtfbl" "resource_namespace"="cert-manager" "resource_version"="v1"

However, no secret was created:

E0216 08:11:57.808340       1 sync.go:83] cert-manager/controller/clusterissuers "msg"="error setting up issuer" "error"="secret \"mycert\" not found" "resource_kind"="ClusterIssuer" "resource_name"="mycert" "resource_namespace"="" "resource_version"="v1"
E0216 08:11:57.808374       1 controller.go:158] cert-manager/controller/clusterissuers "msg"="re-queuing item due to error processing" "error"="secret \"mycert\" not found" "key"="mycert"

A couple of other points:

  • The issuer of the certificate in question wasn't itself ready until after the "Multiple CertificateRequest resources" errors began.
  • Between 5-20 seconds before the duplicate request creation, the logs indicate several communication issues with the API server (connection resets, Retry-After responses).

@ragoragino
Author

ragoragino commented Mar 2, 2021

As the problem continues to occur for us, I started digging into the possible root cause and a solution to see if I can be of any help here.

The issue seems to occur after a failure of API server communication that is shortly followed by a certificate issuance. At that particular moment, there can be a delay between creating an object on the API server and the subsequent local-cache update from the shared informer. Creating a new CertificateRequest in the CertificateRequestManager may therefore return an error because the local cache wasn't updated within the specified timeout. As a result, subsequent retries of processing (i.e. ProcessItem) won't find the previous CertificateRequests (with identical revisions) and will keep creating new CertificateRequests, even though the original CertificateRequest object exists on the API server. The consequence of this behaviour is that the CertificateIssuing controller won't move any further and renewal of the given Certificate stops forever. The relevant piece of code is here:

https://github.com/jetstack/cert-manager/blob/master/pkg/controller/certificates/requestmanager/requestmanager_controller.go#L344

  1. One possible solution would be to always check the API server before creating a new CertificateRequest (see the sketch after this list).
  2. Another way to mitigate the issue might be to remove this check here: https://github.com/jetstack/cert-manager/blob/cdc53b65cbd344dbef64f0c5c22e6070e79c5b5c/pkg/controller/certificates/issuing/issuing_controller.go#L217
  3. A final option would be to deal with the issue when it is detected (i.e. here https://github.com/jetstack/cert-manager/blob/master/pkg/controller/certificates/requestmanager/requestmanager_controller.go#L192); however, I am not sure about possible repercussions for other controllers at that point.
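
For illustration, here is a minimal sketch of option 1, with hypothetical helper names rather than the real cert-manager functions: consult the API server directly (bypassing the informer cache) before creating another CertificateRequest for the same revision.

```go
// Hypothetical sketch of option 1: a live (uncached) read guards the create,
// so a stale informer cache cannot cause duplicate CertificateRequests.
package main

import "fmt"

type certificateRequest struct {
	Name     string
	Revision string
}

// listFromAPIServer stands in for a live List call against the API server,
// e.g. filtered to the requests owned by the given Certificate.
func listFromAPIServer(namespace, certName string) []certificateRequest {
	return nil // assume no matching requests exist yet in this sketch
}

// createRequest stands in for the API-server Create call.
func createRequest(namespace, certName, revision string) error {
	fmt.Printf("creating CertificateRequest for %s/%s at revision %s\n", namespace, certName, revision)
	return nil
}

// ensureRequest only creates a new CertificateRequest when no request for the
// current revision is visible on the API server.
func ensureRequest(namespace, certName, revision string) error {
	for _, cr := range listFromAPIServer(namespace, certName) {
		if cr.Revision == revision {
			return nil // an up-to-date request already exists; nothing to do
		}
	}
	return createRequest(namespace, certName, revision)
}

func main() {
	_ = ensureRequest("default", "stan-client-tls", "2")
}
```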

What do you think?

Thanks.

@marinoborges

I'm facing the same issue on OVH. Multiple CertificateRequests are created, and the cert-manager pod log contains multiple entries like this:

E0330 17:07:08.164430       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="cert-manager-test/selfsigned-cert" 

@frabe1579

Same issue for me: infinite creation of CertificateRequests, one every 30-40 seconds; no Orders and no Challenges are created.
Every CertificateRequest has zero events and no status.
A few days earlier I had created three certificates successfully.

This is a sample log from cert-manager:

W0405 14:31:53.353021       1 warnings.go:67] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
I0405 14:35:36.753993       1 conditions.go:173] Setting lastTransitionTime for Certificate "tls-cert" condition "Issuing" to 2021-04-05 14:35:36.753984598 +0000 UTC m=+1191568.543064875
I0405 14:35:36.754048       1 conditions.go:173] Setting lastTransitionTime for Certificate "tls-cert" condition "Ready" to 2021-04-05 14:35:36.754044213 +0000 UTC m=+1191568.543124449
E0405 14:35:36.864108       1 controller.go:158] cert-manager/controller/CertificateReadiness "msg"="re-queuing item due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"tls-cert\": the object has been modified; please apply your changes to the latest version and try again" "key"="evo-seven/tls-cert"
I0405 14:35:36.864205       1 conditions.go:173] Setting lastTransitionTime for Certificate "tls-cert" condition "Ready" to 2021-04-05 14:35:36.864201839 +0000 UTC m=+1191568.653282102
E0405 14:35:37.098787       1 controller.go:158] cert-manager/controller/CertificateKeyManager "msg"="re-queuing item due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"tls-cert\": the object has been modified; please apply your changes to the latest version and try again" "key"="evo-seven/tls-cert"
E0405 14:35:42.134625       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:35:48.154531       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:35:55.184975       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:36:04.213143       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:36:17.238978       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:36:38.258443       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:37:13.303129       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:37:48.337118       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:38:23.372114       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:38:58.527014       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
W0405 14:39:33.357965       1 warnings.go:67] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
E0405 14:39:33.550267       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:40:08.575066       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:40:44.057994       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:41:19.102416       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:41:54.138793       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:42:29.167208       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:43:04.189638       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:43:39.488843       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:44:14.517500       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:44:50.404372       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:45:25.445090       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:46:00.480316       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:46:35.577328       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
W0405 14:46:46.361263       1 warnings.go:67] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
E0405 14:47:10.598888       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:47:45.876135       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:48:21.060154       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:48:56.403248       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:49:31.791474       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:50:07.610210       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:50:42.690564       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:51:17.790211       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:51:52.813480       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:52:29.144392       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:53:04.905653       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:53:40.435520       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:54:16.079388       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:54:51.104798       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
W0405 14:54:58.364606       1 warnings.go:67] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
E0405 14:55:26.209327       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:56:01.233293       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:56:36.265480       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:57:11.321807       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:57:46.349915       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:58:21.374360       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:58:56.406887       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 14:59:31.430364       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:00:06.458617       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:00:41.821982       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:01:17.112874       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:01:52.158603       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
W0405 15:01:56.367271       1 warnings.go:67] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
E0405 15:02:27.200101       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:03:02.262696       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:03:37.368137       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:04:12.416447       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:04:47.666610       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:05:22.699331       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:05:57.726778       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"
E0405 15:06:32.751335       1 controller.go:158] cert-manager/controller/CertificateRequestManager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="evo-seven/tls-cert"

cainjector is not involved; there is nothing unusual in its logs.

@devlifealways

Same issue on the OVH cloud provider: the cert-manager controller keeps spawning new CertificateRequest objects without ever detecting them.

Is there any progress?

@imbrou

imbrou commented Aug 30, 2021

Exact same issue here too, on the same OVH cloud provider. Any ideas?

Thank you in advance for any response; I'm still investigating!

@cuttingedge1109

No Order is created from the CertificateRequest, so it keeps creating new CertificateRequest objects, eventually exceeding the etcd storage quota.

@ikallali

ikallali commented Oct 1, 2021

+1

@Juankimr

Juankimr commented Oct 8, 2021

Hello everyone. I'm having the same problem with the OVH provider. Has anyone found a way to solve this?

@maelvls
Member

maelvls commented Oct 8, 2021

Hi! I was not able to reproduce the issue (yet). I created a cluster using the OVHcloud Managed Kubernetes offering and then created a lot of certificates (5000), hoping to trigger the "this may indicate an apiserver running slowly" message; instead, I hit a quota limit ("The OVHcloud storage quota has been reached").

My guess is that (somehow) the apiserver, or etcd instance, runs slowly, leading to the informer cache being updated too late (more than 5 seconds). I don't think the issue comes from cert-manager itself, but rather the fact that 5 seconds might be too short for some Kubernetes clusters.
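
For illustration, here is a minimal sketch of the kind of bounded wait being described, using the wait helpers from k8s.io/apimachinery; whether cert-manager uses exactly this helper or this timeout is an assumption here, but the shape is the same: the controller gives the informer cache a fixed window to catch up with the API server and re-queues the item once that window is exceeded.

```go
// Sketch only: a bounded poll of the informer cache. On a slow apiserver/etcd
// the cache may not catch up within the window, producing the "failed whilst
// waiting for CertificateRequest to exist" error seen in the logs above.
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

func main() {
	cacheHasRequest := func() (bool, error) {
		// Stand-in for the lister lookup; a lagging cache keeps returning
		// false for longer than the timeout.
		return false, nil
	}

	// Poll the cache every 100ms, but only for 5 seconds in total.
	err := wait.PollImmediate(100*time.Millisecond, 5*time.Second, cacheHasRequest)
	if err != nil {
		fmt.Println("failed whilst waiting for CertificateRequest to exist:", err)
	}
}
```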

Related:

@pdesgarets

For the record, for other OVH users hitting this issue and looking for a workaround/quick fix:

@justinrush

justinrush commented Nov 19, 2021

I'm seeing this using a CA issuer. We've only got one Certificate on this cluster, so I doubt it's a throughput problem, and we shouldn't be reaching out to any external CAs.

We're running on Azure AKS, Kubernetes v1.20.9.

Logs attached - the interesting bits with my comments are here:

# we see the cert has to be renewed
I1118 21:18:46.014267       1 trigger_controller.go:189] cert-manager/controller/certificates-trigger "msg"="Certificate must be re-issued" "key"="default/eds" "message"="Renewing certificate as renewal was scheduled at 2021-11-18 21:18:46 +0000 UTC" "reason"="Renewing"

# we set the status on the cert to issuing
I1118 21:18:46.181084       1 conditions.go:182] Setting lastTransitionTime for Certificate "eds" condition "Issuing" to 2021-11-18 21:18:46.18104088 +0000 UTC m=+741577.142669150

# we had some error talking to our backend - we didn't clean up the CertificateRequest
E1118 21:19:00.105292       1 controller.go:164] cert-manager/controller/certificates-trigger "msg"="re-queuing item due to error processing" "error"="rpc error: code = Unavailable desc = transport is closing" "key"="default/eds" 

# we see the cert has to be renewed
I1118 21:19:01.586256       1 trigger_controller.go:189] cert-manager/controller/certificates-trigger "msg"="Certificate must be re-issued" "key"="default/eds" "message"="Renewing certificate as renewal was scheduled at 2021-11-18 21:18:46 +0000 UTC" "reason"="Renewing"

# we set the status to issuing
I1118 21:19:01.785024       1 conditions.go:182] Setting lastTransitionTime for Certificate "eds" condition "Issuing" to 2021-11-18 21:19:01.785003802 +0000 UTC m=+741592.746632072

# We start an endless loop
I1118 21:19:14.894253       1 requestmanager_controller.go:198] cert-manager/controller/certificates-request-manager "msg"="Multiple matching CertificateRequest resources exist, delete one of them. This is likely an error and should be reported on the issue tracker!" "key"="default/eds"

To fix it, I had to delete both existing CertificateRequests. I first tried deleting just the oldest one, and a new CertificateRequest was immediately created, so I deleted the other old one as well and the Certificate was renewed.
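
For anyone who prefers to script this cleanup, here is a rough sketch using the generated cert-manager clientset; the import path and method names are assumed from the cert-manager v1.x codebase, and deleting the stuck CertificateRequests with kubectl works just as well.

```go
// Sketch (assumed clientset layout): delete every CertificateRequest owned by
// the stuck Certificate so the next reconcile recreates a single fresh one.
package main

import (
	"context"
	"fmt"

	cmclient "github.com/jetstack/cert-manager/pkg/client/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig and build a cert-manager client.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cm, err := cmclient.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	namespace, certName := "default", "eds" // the namespace/Certificate from the logs above

	crs, err := cm.CertmanagerV1().CertificateRequests(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, cr := range crs.Items {
		for _, ref := range cr.OwnerReferences {
			if ref.Kind == "Certificate" && ref.Name == certName {
				fmt.Println("deleting CertificateRequest", cr.Name)
				if err := cm.CertmanagerV1().CertificateRequests(namespace).Delete(ctx, cr.Name, metav1.DeleteOptions{}); err != nil {
					panic(err)
				}
			}
		}
	}
}
```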

cert-manager-double-approved-csr-issue.log

@arkdevuk

arkdevuk commented Dec 5, 2021

Having the same issue with OVH as well.

@pdesgarets' solution fixed the infinite loop (thanks 💪🏼).

But this issue needs to be addressed properly; a sync error shouldn't result in... that.

@ikallali

Hi, I am hitting the same issue with OVH for the second time...
Is there any progress?

@jetstack-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale

@jetstack-bot jetstack-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 30, 2022
@justinrush

/remove-lifecycle stale

@jetstack-bot jetstack-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 30, 2022
@frabe1579

frabe1579 commented Apr 29, 2022

Same problem here; I gave up and will use Traefik instead.

@ikallali

ikallali commented Jun 9, 2022

Is there any progress?

@ptsk5

ptsk5 commented Aug 1, 2022

I want to report the same (bad) behaviour. Are there any plans to fix this, please? (The same question was posted here.)

I0801 10:01:36.376261       1 requestmanager_controller.go:210] cert-manager/certificates-request-manager "msg"="Multiple matching CertificateRequest resources exist, delete one of them. This is likely an error and should be reported on the issue tracker!" "key"="my-ns/my-cert"

@khatrig

khatrig commented Sep 23, 2022

Checking the k8s API server logs clarified things further. For me, this happened because the cert-manager-webhook pod was not up.

@jetstack-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale

@jetstack-bot jetstack-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 22, 2022
@jetstack-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale

@jetstack-bot jetstack-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 21, 2023
@jetstack-bot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

@jetstack-bot
Contributor

@jetstack-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
