New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 2079012: Fix egressIP object deletion if the node is deleted first #1143
Conversation
Currently gatewayCleanup doesn't delete rtoj-* ports which is fine since it deletes the router and probably on the OVN side the ports on the router are also removed as they are dependent on the router. However from a testing perspective its nice to have the gatewayCleanup remove these ports explictly just like multiJoinSwitchGatewayCleanup does. Corresponding gatewayCleanup unit tests are also updated. We also add a unit test to egressip testing where we test reassignment upon egress node deletion, specifically where the node has already gotten deleted by the node watcher. This test uncovers a bug we have in egressIP code where the reassignment will not be successful and will fail with the error E0608 09:42:53.781497 971577 obj_retry.go:1513] Failed to delete resource object node1 of type *factory.egressNode, error: Re-assignment for EgressIP: egressip failed, unable to update object, err: unable to retrieve gateway IP for node: node1, protocol is IPv6: false, err: attempt at finding node gateway router network information failed, err: unable to find router port rtoj-GR_node1: object not found Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> (cherry picked from commit 9d0aebe)
There are a few instances in egressIP code where we shouldn't fail hard if objects are not found, example if a node gets deleted before egressIP watcher takes action. In this case, we can log a warning saying objects couldn't be deleted and move on since its technically not an error if the object was deleted by some other code path. In oder to ensure we don't leave behind stale reroute policy entries, we also add the logic of removing policies that have nexthop == joinIP of the node that's getting deleted into the gateway cleanup code. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> (cherry picked from commit 95ab759)
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> (cherry picked from commit 35b4793)
@tssurya: This pull request references Bugzilla bug 2079012, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
/lgtm |
/retest |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kyrtapz, trozet, tssurya The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest-required |
@tssurya: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/test e2e-metal-ipi-ovn-dualstack |
@tssurya: All pull requests linked via external trackers have merged: Bugzilla bug 2079012 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cherry-pick release-4.10 |
@tssurya: #1143 failed to apply on top of branch "release-4.10":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Clean cherry-pick from ovn-org/ovn-kubernetes#3008.
NO conflicts