OCPBUGS-174: [release-4.10] Fix race when adding and removing pod with same name #1250
Trivial change to the pod retry logs so that a failed deletion attempt reports the error value itself. When @ricky-rav moved logic from /pkg/ovn/pods_retry.go to /pkg/ovn/obj_retry.go in 4.11, he already took care of showing the error message in the log.
https://github.com/ricky-rav/ovn-kubernetes/blob/b4738c77138b1f332d41c88046d29e7b558f6683/go-controller/pkg/ovn/obj_retry.go#L787
Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
@flavio-fernandes: No Bugzilla bug is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@flavio-fernandes: This pull request references [Jira Issue OCPBUGS-174](https://issues.redhat.com//browse/OCPBUGS-174), which is invalid:
/hold
…dicate
This small change pulls FindLogicalSwitchesWithPredicate from release 4.11 and newer, to facilitate the backporting of other changes that depend on this function. The function was initially introduced as part of a much bigger PR. This commit is just a small portion of it:
openshift#1049
openshift@6d60741#r81458726
Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
…e informer cache
When processing an object in terminal state, there is a chance that it was already removed from the API server. Since delete events for objects in terminal state are skipped, delete it here.
Conflicts:
go-controller/pkg/ovn/obj_retry.go --> where code lives in 4.11 and newer
go-controller/pkg/ovn/ovn.go --> where code lives in 4.10 and older
Signed-off-by: Patryk Diak <pdiak@redhat.com>
(cherry picked from commit f1be8d2)
cd2a1cb to 697267f
/remove-hold
697267f to b3ff64d
/test e2e-aws-ovn
/bugzilla valid-bug JIRA OCPBUGS-174 tracks this and depends on the 4.11 bug which was open in Bugzilla.
Adding and removing a pod back to back while it changes nodes can end up in a race where the corresponding logical switch port remains in the wrong logical switch and never gets properly removed. For this to happen, the logical switch port has to have the same name, which is <namespace>_<podName>.
Conflicts:
go-controller/pkg/ovn/pods.go
go-controller/pkg/ovn/pods_test.go
Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
Co-authored-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit be8786a)
b3ff64d to b62dba2
@trozet Please add the missing labels, if it all looks right to you.
/test e2e-aws-ovn-windows
/test e2e-vsphere-ovn
/jira refresh
@flavio-fernandes: This pull request references [Jira Issue OCPBUGS-174](https://issues.redhat.com//browse/OCPBUGS-174), which is invalid:
/test e2e-vsphere-ovn
/lgtm
/test e2e-openstack-ovn
label /backport-risk-assessed
/approve
/label backport-risk-assessed lol
/jira refresh
@flavio-fernandes: This pull request references [Jira Issue OCPBUGS-174](https://issues.redhat.com//browse/OCPBUGS-174), which is invalid:
/label cherry-pick-approved
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: flavio-fernandes, jcaamano, knobunc, kyrtapz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@flavio-fernandes: The following test failed, say
@flavio-fernandes: All pull requests linked via external trackers have merged: [Jira Issue OCPBUGS-174](https://issues.redhat.com//browse/OCPBUGS-174) has been moved to the MODIFIED state. In response to this:
Adding and removing a pod back to back while it changes nodes can end up in a race where
the corresponding logical switch port remains in the wrong logical switch and never gets
properly removed. For this to happen, the logical switch port has to have
the same name, which is <namespace>_<podName>.
This PR backports the 2 fixes needed to address this race.
They were merged to D/S master via PR #1237
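
To make the race described above easier to picture, here is a minimal in-memory sketch in Go. It is not the code from this PR: `LogicalSwitch`, `portName`, `findSwitches`, and `removeStalePort` are hypothetical names invented for illustration, and `findSwitches` only loosely mirrors the idea behind FindLogicalSwitchesWithPredicate without reproducing its real signature. The sketch assumes one logical switch per node and shows one possible cleanup: when a pod's port is configured on its current switch, any other switch still holding a port with the same name gets that stale port removed.

```go
package main

import "fmt"

// LogicalSwitch is a hypothetical in-memory stand-in for an OVN logical switch.
type LogicalSwitch struct {
	Name  string
	Ports map[string]bool // set of logical switch port names on this switch
}

// portName builds the port name in the <namespace>_<podName> form.
func portName(namespace, podName string) string {
	return namespace + "_" + podName
}

// findSwitches returns every switch for which pred returns true.
// Hypothetical helper; not the real libovsdb-based lookup.
func findSwitches(switches []*LogicalSwitch, pred func(*LogicalSwitch) bool) []*LogicalSwitch {
	var out []*LogicalSwitch
	for _, sw := range switches {
		if pred(sw) {
			out = append(out, sw)
		}
	}
	return out
}

// removeStalePort deletes port from every switch except keep and
// returns the names of the switches that were cleaned up.
func removeStalePort(switches []*LogicalSwitch, port, keep string) []string {
	stale := findSwitches(switches, func(sw *LogicalSwitch) bool {
		return sw.Name != keep && sw.Ports[port]
	})
	var cleaned []string
	for _, sw := range stale {
		delete(sw.Ports, port)
		cleaned = append(cleaned, sw.Name)
	}
	return cleaned
}

func main() {
	port := portName("ns1", "web-0")
	node1 := &LogicalSwitch{Name: "node1", Ports: map[string]bool{port: true}}
	node2 := &LogicalSwitch{Name: "node2", Ports: map[string]bool{}}
	switches := []*LogicalSwitch{node1, node2}

	// The pod is recreated on node2 before its removal from node1 is processed,
	// so the same port name now exists on two switches.
	node2.Ports[port] = true

	cleaned := removeStalePort(switches, port, "node2")
	fmt.Println(port, "removed from", cleaned)
}
```

The predicate-based lookup is the design point here: because the port name stays the same across nodes, the stale copy can only be found by searching the other switches for that name rather than by trusting the pod's current node.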