Klog panic leads to controller WaitForNamedCacheSync never returning
#106534
Comments
@astraw99: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This issue probably belongs in https://github.com/kubernetes/klog/ instead
Got it, new PR: kubernetes/klog#272
If the logger panicked, why did that not kill the process? IMHO a logger implementation should never panic, and if it does, that's so fatal that the code using it doesn't need to handle it gracefully.
Yes, updated it to abort the process.
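For illustration, here is a minimal sketch of the pattern being discussed; this is illustrative only, not necessarily what kubernetes/klog#272 merged, and callLoggerOrDie is a hypothetical helper name: invoke the injected logger under a deferred recover and abort the process if it panics, rather than leaving an internal mutex locked.

```go
// Illustrative sketch only; callLoggerOrDie is a hypothetical helper, not
// klog code, and not necessarily what kubernetes/klog#272 merged.
package main

import (
	"fmt"
	"os"
)

func callLoggerOrDie(log func(msg string), msg string) {
	defer func() {
		if r := recover(); r != nil {
			// A panicking logger is treated as fatal: crash loudly instead of
			// deadlocking every subsequent log call.
			fmt.Fprintf(os.Stderr, "logger panicked: %v\n", r)
			os.Exit(255)
		}
	}()
	log(msg)
}

func main() {
	callLoggerOrDie(func(string) { panic("broken logger") }, "hello")
}
```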
Side note: as the warning above says, this is unsupported skew. /sig instrumentation
The server version is unsupported (1.18), and this seems like a klog bug rather than a Kubernetes bug, as this is a custom controller. Closing. /close
@ehashman: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What happened?
In our K8s cluster, we called klog.SetLogger(selfLogger), then observed our controller pod running but logging only "Waiting for caches to sync for ..." and never returning. After debugging, we found that the klog Error func (implemented by our logger) panicked, leaving the logr mutex locked and never released:
https://github.com/kubernetes/kubernetes/blob/b8af116327cd5d8e5411cbac04e7d4d11d22485d/vendor/k8s.io/klog/v2/klog.go#L931
Go pprof debugging indicates the logger lock is waiting to be unlocked.
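For illustration, a simplified sketch of the failure mode (this is not klog's actual code, just the same locking pattern reduced to a few lines): the mutex is taken before calling the injected logger without a deferred unlock, so once the logger panics the mutex stays locked and every later log call blocks.

```go
// Simplified illustration of the failure mode; not klog's actual code.
package main

import (
	"fmt"
	"sync"
)

type fakeKlog struct {
	mu     sync.Mutex
	logger func(msg string) // stands in for the injected logr logger
}

func (l *fakeKlog) output(msg string) {
	l.mu.Lock()
	// No deferred Unlock: if l.logger panics, the mutex is never released.
	l.logger(msg)
	l.mu.Unlock()
}

func main() {
	l := &fakeKlog{logger: func(string) { panic("logger bug") }}

	func() {
		defer func() { _ = recover() }() // the caller recovers, so the process survives
		l.output("first message")        // panics while holding l.mu
	}()

	fmt.Println("the next call now blocks forever on l.mu:")
	l.output("Waiting for caches to sync for ...") // deadlock
}
```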
What did you expect to happen?
The logging code (klog) should handle the panic and release the lock immediately.
How can we reproduce it (as minimally and precisely as possible)?
Call klog.SetLogger(selfLogger) with a logger whose Error func panics; the issue described above can then be observed (see the sketch below).
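A hypothetical minimal reproduction, assuming klog v2.30-era behavior (before the fix tracked in kubernetes/klog#272) and the logr v1 LogSink API; names such as panicSink are illustrative and not taken from the original report, and with a current klog release the deadlock may no longer reproduce.

```go
package main

import (
	"fmt"
	"time"

	"github.com/go-logr/logr"
	"k8s.io/klog/v2"
)

// panicSink stands in for a custom logger whose Error implementation panics.
type panicSink struct{}

func (panicSink) Init(logr.RuntimeInfo)                    {}
func (panicSink) Enabled(int) bool                         { return true }
func (panicSink) Info(int, string, ...interface{})         {}
func (panicSink) Error(error, string, ...interface{})      { panic("broken Error impl") }
func (s panicSink) WithValues(...interface{}) logr.LogSink { return s }
func (s panicSink) WithName(string) logr.LogSink           { return s }

func main() {
	klog.SetLogger(logr.New(panicSink{}))

	// The panic escapes klog while klog's internal mutex is still held;
	// a controller that recovers panics keeps running in this broken state.
	func() {
		defer func() {
			if r := recover(); r != nil {
				fmt.Println("recovered from logger panic:", r)
			}
		}()
		klog.Errorf("first error log") // calls panicSink.Error under klog's lock
	}()

	done := make(chan struct{})
	go func() {
		// Roughly what WaitForNamedCacheSync does: log and wait.
		klog.Infof("Waiting for caches to sync for %s", "demo")
		close(done)
	}()

	select {
	case <-done:
		fmt.Println("log call returned (deadlock not reproduced)")
	case <-time.After(3 * time.Second):
		fmt.Println("klog.Infof is still blocked: deadlock reproduced")
	}
}
```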
Anything else we need to know?
No response
Kubernetes version