Bug 2039099: EgressIP fixes for 4.10 #917
Conversation
@alexanderConstantinescu: This pull request references Bugzilla bug 2039099, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/hold |
On CloudPrivateIPConfig deletion there can still be a finalizer left on the CR, so instead of checking whether any finalizers are left, check whether new is nil. Signed-off-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit 391bbe2)
When ovnkube-master restarts we might have multiple object modifications being performed from WatchEgressNodes and WatchEgressIP, and they can happen in quick succession: we need to make sure we always act on the latest state of the object, which might not be the state in the informer cache. Hence: always retrieve the object from the API server. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit 2fa8386)
When deleting an egressip.status.items entry we are better off utilizing the fact that the setup made in the NB DB gets tagged with the egress IP name in the external_ids fields of the LRP and NAT objects. Doing this fixes two issues: 1. If ovnkube-master isn't running while an egress-IP-matching pod gets deleted, we'll still be able to clean up the setup made for that pod when we restart. 2. If an EgressIP gets deleted we start deleting CloudPrivateIPConfigs. However, with the previous implementation we needed to fetch the EgressIP object and its podSelector / namespaceSelector to figure out which pods it matched. That isn't possible once the EgressIP object has been deleted, so we were left with stale setups for all pods which matched the EgressIP object. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit 4eb926e)
The pod assignment cache logic was completely off, leading to incorrect setups in many cases. This fixes the logic Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit 24dbb96)
The reconciliation logic for egress IP was a bit off in the sense that we were processing our own updates, leading to unnecessary and also incorrect changes for each object. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit db0e0ae)
Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit baaea66)
There are two caches used for egress IP: the allocator cache, tracking which egress IP has been assigned to which node, and the podAssignment cache, tracking what set up has been performed on a per-pod basis. The logic in reconcileEgressIP only warmed up the cache when new assignments were actually being performed, overlooking the fact that we need to add everything that is already existing and which won't be re-assigned to the cache as well. Moreover, when running in a cloud environment we also never cleared the `statusToRemove` from the allocator cache before possibly trying to re-assign new IPs. This fixes that too Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit 994ab9d)
Annotate the CloudPrivateIPConfig with the EgressIP owner reference. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit 1e50941)
Pre-this-patch-series: we always deleted the external GW setup for each pod matching an egress IP and proceeded to wire it up for egress IP. That can happen around 20 times in a short time span, since pod updates are atomic and many controllers change many fields on the pod when it's created. With this series of patches: we now track what has been performed for each pod and do not repeat work. This is also needed since we have several different handlers which react to changes and can perform the egress IP setup: WatchEgressIP, WatchEgressNodes, etc. We need to be sure that the external GW setup created in addLogicalPort has actually been performed by the time the egress IP code is supposed to delete it. That was never guaranteed before, but it wasn't required given that we always re-did the same work on every pod update and were bound to "finish last" at some point. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit 9353e94)
Pods should not match multiple EgressIP objects: that is considered a user error and the behavior is undefined. However, in such events iterate over all EgressIPs and clean up as much as possible. Iterating over all EgressIPs also covers the case where a pod has its labels changed from matching one EgressIP to another, e.g. EgressIP1 matching "blue pods" and EgressIP2 matching "red pods". If a pod with a red label gets changed to a blue label we need to add and remove the setup for both EgressIP objects, and since we can't be sure which EgressIP object we process first, always iterate over all of them. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit a75838a)
Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit e920435)
Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit 6d66ec2)
WatchEgressIPNamespaces and WatchEgressIPPods only use the informer cache to retrieve the egress IPs when determining whether namespaces/pods match. It is thus better to initialize them first and let WatchEgressNodes / WatchEgressIP initialize afterwards, since those handlers might change the assignments of the existing objects. If we do the inverse and start WatchEgressIPNamespaces / WatchEgressIPPods last, we risk performing a bunch of modifications on the EgressIP objects when we restart and then have these handlers act on stale data when they sync. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com> (cherry picked from commit eca24d0)
/hold cancel |
/test e2e-gcp-ovn |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alexanderConstantinescu, trozet The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@trozet: Overrode contexts on behalf of trozet: ci/prow/e2e-metal-ipi-ovn-dualstack In response to this:
/retest-required Please review the full test history for this PR and help us cut down flakes. |
4 similar comments
/override ci/prow/e2e-metal-ipi-ovn-dualstack |
@trozet: Overrode contexts on behalf of trozet: ci/prow/e2e-metal-ipi-ovn-dualstack In response to this:
/retest-required Please review the full test history for this PR and help us cut down flakes. |
6 similar comments
/skip |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/skip |
/test e2e-aws-ovn-upgrade |
/override ci/prow/e2e-metal-ipi-ovn-dualstack |
@trozet: Overrode contexts on behalf of trozet: ci/prow/e2e-metal-ipi-ovn-dualstack In response to this:
@alexanderConstantinescu: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
aws-ovn-upgrade already passed twice on this commit. The previous run's only failure was: Jan 21 13:05:30.373 - 119s W alert/etcdHighNumberOfLeaderChanges ns/openshift-etcd pod/etcd-ip-10-0-155-187.ec2.internal ALERTS{alertname="etcdHighNumberOfLeaderChanges", alertstate="firing", endpoint="etcd-metrics", job="etcd", namespace="openshift-etcd", pod="etcd-ip-10-0-155-187.ec2.internal", prometheus="openshift-monitoring/k8s", service="etcd", severity="warning"} Unrelated to this PR. /override ci/prow/e2e-aws-ovn-upgrade |
@trozet: Overrode contexts on behalf of trozet: ci/prow/e2e-aws-ovn-upgrade In response to this:
@alexanderConstantinescu: All pull requests linked via external trackers have merged: Bugzilla bug 2039099 has been moved to the MODIFIED state. In response to this:
/hold
until #915 and openshift/cluster-network-operator#1285 get in
/assign @trozet
- What this PR does and why is it needed
- Special notes for reviewers
- How to verify it
- Description for the changelog