This repository has been archived by the owner on Nov 9, 2022. It is now read-only.
Worker routines get stuck #99
Labels
area/robustness
Robustness, reliability, resilience related
kind/bug
Bug
priority/3
Priority (lower number equals higher priority)
How to categorize this issue?
/area robustness
/kind bug
/priority normal
What happened:
We have observed some situations where grm gets stuck reconciling a specific managed resource and does not act upon it anymore.
In all cases I observed, it happened either in conjunction with a longer period of downtime of the source or target API server (before #95) or with a large amount of secret data in the target cluster (as described in #92).
What you expected to happen:
grm should not get stuck and reconcile all managed resources with the given sync interval.
How to reproduce it (as minimally and precisely as possible):
Not sure yet.
My guess would be that the worker goroutines get stuck in some WaitForCacheSync call when the API server is unavailable for a longer period of time or the amount of watched data is too big.
Anything else we need to know?:
Environment: