[release-4.10] Bug 2052017: restart pod on non-retriable failures when deleting stale objects #945
Signed-off-by: Flavio Fernandes <flaviof@redhat.com> (cherry picked from commit 44d06f5)
Signed-off-by: Flavio Fernandes <flaviof@redhat.com> (cherry picked from commit 4e9e424)
findSwitch only sets the UUID in the provided parameter, so rename it to findSwitchUUID.
Signed-off-by: Flavio Fernandes <flaviof@redhat.com> (cherry picked from commit d92eab2)
On startup, failures when syncing the OVN DB with Kubernetes should be considered fatal. This change still introduces retry logic to minimize pod restarts.
Conflicts: go-controller/pkg/ovn/pods.go
Signed-off-by: Flavio Fernandes <flaviof@redhat.com> (cherry picked from commit af27b80)
@flavio-fernandes: This pull request references Bugzilla bug 2052017, which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: flavio-fernandes The full list of commands accepted by this bot can be found here.
/bugzilla refresh
@flavio-fernandes: This pull request references Bugzilla bug 2052017, which is invalid:
/bugzilla refresh
@flavio-fernandes: This pull request references Bugzilla bug 2052017, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 6 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
/hold
Waiting for additional changes to retry (cc @tssurya)
Holding this PR to cherry-pick the commits from @tssurya: ovn-org/ovn-kubernetes#2787
/retest-required
@flavio-fernandes: The following tests failed, say
This is now folded into #994
@flavio-fernandes: This pull request references Bugzilla bug 2052017. The bug has been updated to no longer refer to the pull request using the external bug tracker. In response to this:
In cases where we currently do not retry the removal of stale
objects, it is better to restart the pod than to simply log an error
and bring the pod up. This change applies that behavior to functions
that run early during pod startup.
Signed-off-by: Flavio Fernandes <flaviof@redhat.com>