Order stuck in errored state #4441
Comments
This should have been fixed in #4130 which was released in v1.5.0 - would you be able to upgrade and let us know if it got fixed?
@irbekrm Sure we can check after our next upgrade cycle for cert-manager.
Please re-open if you experience the issue when cert-manager is up to date.
@jakexks: Closing this issue. In response to this:
From a brief look #2765 is about not retrying to finalize orders that have already been finalized. The issue described here is also caused by retrying to finalize already finalized
Do you have some logs that you could add to #2765? Also the status of the
@irbekrm Sorry, I don't have the exact logs and resource states anymore. The issue is now resolved for us.
I needed to run this command as we have set the
Describe the bug:
We have a certificate that is supposed to be refreshed with Let's Encrypt once a week. Back in June, the Order failed mysteriously:
The Order then remained in the "errored" state for about 3 months, until the certificate itself finally expired. It seems that it is incorrectly trying to reuse the broken Order instead of starting a new one.
This particular certificate had been renewed many times previously. We did notice similar issues with other CertificateRequests/Orders, but not all at the same time. Nonetheless, the results were identical. If an Order fails due to some transient issue, cert-manager incorrectly tries to reuse that broken CertificateRequest forever rather than making a new one, and then eventually the Certificate expires.
The only way to fix it is to manually delete the offending CertificateRequest, and then wait an hour for it to try again.
Expected behaviour:
It should have fixed itself automatically without any manual intervention.
Steps to reproduce the bug:
We really don't know what specifically caused the Order to fail in the first place, but once it does, this will happen.
Anything else we need to know?:
Environment details:
kubectl apply
/kind bug