Slow healthz and livez endpoints cause liveness and readiness probe failures #5137
Labels
kind/bug
Categorizes issue or PR as related to a bug.
lifecycle/rotten
Denotes an issue or PR that has aged beyond stale and will be auto-closed.
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone
Describe the bug:
After upgrading from 1.1 to 1.7.2 I have noticed increased webhook timeouts. We use fluxcd and that causes frequent reconciliation using server-side dry-run.
After looking at the pod, I can see frequent probe failures. And even restarts with exit code 0.
Exploring the probes, I manually started probing using
kubectl port-forward
. I did notice that with some requests, the response could be pretty delayed, by a few seconds. The default probe timeout is1s
, so I imagine that is the cause.Expected behaviour:
Faster probe response times.
Steps to reproduce the bug:
Install cert-manager 1.7.2, trigger a webhook validation dry run every minute.
Anything else we need to know?:
Environment details::
/kind bug
The text was updated successfully, but these errors were encountered: