Event spam in some storage Operation Generator functions #74988
Comments
fixed for mount here: #71581 |
Hi @davidz627, I would like to work on this issue. I think I'll apply an approach similar to #71581. |
SGTM |
/reopen |
@msau42: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle rotten |
/remove-lifecycle stale |
/assign |
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@fejta-bot: Closing this issue. |
/reopen |
/lifecycle frozen |
@msau42: Reopened this issue. |
Per @msau42's suggestion, for mount success events we could disable events only for remountable volumes (secrets, configmaps, etc., and CSI volumes with this field set). We should also consider adding failure events for other volume operations, most importantly teardown operations such as unmount and detach. Currently, failure events exist only for attach, mount, map, expand, and expandInUse. |
The original issue tracks a problem where we generate events outside of the operation executor, which means that any errors that happen there cause event spam without backoff. |
/triage needs-information |
Seems like in some Operation Generator functions (I saw this in `GenerateAttachVolumeFunc`), if an error is encountered in the "set up" part outside of the actual `AttachVolumeFunc`, there is no rate limiting and the `GenerateAttachVolumeFunc` will be retried over and over, hundreds of times a second. This causes significant event spam and can also slow down the controller.
/sig storage
/kind bug
/cc @msau42 @saad-ali @jingxu97 @verult
How to reproduce it (as minimally and precisely as possible):
Cause some long-term error in these lines:
https://github.com/davidz627/kubernetes/blob/f7a6b0a8602e02d67fabaa85458a94c0f14599a5/pkg/volume/util/operationexecutor/operation_generator.go#L305-L334
Observe function being retried without rate limiting.