Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 2117078: [release-4.7] Multiple ExGW cache validation/improvements #1241

Merged
merged 4 commits into from Aug 26, 2022

Conversation

trozet
Copy link
Contributor

@trozet trozet commented Aug 9, 2022

Minor conflicts in the 2nd and 3rd commit

On every pod add we assemble a new slice with all of the gatewayInfos.
This had no capacity, so every append was an underlying array copy.
Attempt to allocate at least some predictable capacity to avoid this.

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit 87368a7)
(cherry picked from commit e65ab58)
(cherry picked from commit 6101853)
(cherry picked from commit dd5bc6f)
@openshift-ci openshift-ci bot added bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Aug 9, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 9, 2022

@trozet: This pull request references Bugzilla bug 2117078, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.z) matches configured target release for branch (4.7.z)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 2070762 is in the state CLOSED (ERRATA), which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE))
  • dependent Bugzilla bug 2070762 targets the "4.8.z" release, which is one of the valid target releases: 4.8.0, 4.8.z
  • bug has dependents

Requesting review from QA contact:
/cc @anuragthehatter

In response to this:

Bug 2117078: [release-4.7] Multiple ExGW cache validation/improvements

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 9, 2022
@trozet
Copy link
Contributor Author

trozet commented Aug 9, 2022

/assign @dcbw

@trozet
Copy link
Contributor Author

trozet commented Aug 17, 2022

/retest

@trozet
Copy link
Contributor Author

trozet commented Aug 18, 2022

/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Aug 18, 2022
Copy link
Contributor

@tssurya tssurya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COMMIT1: CLEAN PICK LGTM
COMMIT2: PTAL my comment AND should we be bringing in 988fbb1#diff-3696280f7dbfece2e1136d5fa06a3339a05a8a3067710f7510ecdefba51bcb29R194 ? I moved the parseRoutingExternalGWAnnotation into utils I see.. but do we need to bring it down?

go-controller/pkg/ovn/egressgw.go Outdated Show resolved Hide resolved
ipTracker := make(map[string]string)
for podName, gwInfo := range podGWs {
for gwIP := range gwInfo.gws {
if foundPod, ok := ipTracker[gwIP]; ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self note: line different from 4.8 pick since we don't need this to be gwIP.String() since we converted these variables into strings as part of that conntrack backport.

Copy link
Contributor

@tssurya tssurya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COMMIT 3 & 4 LGTM

foundGws, ok := nsInfo.routingExternalPodGWs[pod]
delete(nsInfo.routingExternalPodGWs, pod)
foundGws, ok := nsInfo.routingExternalPodGWs[podGWKey]
delete(nsInfo.routingExternalPodGWs, podGWKey)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assuming conflict was here somewhere? But code change lgtm based on comparision with 18c8afe.

Adds some checking to ensure user provided IPs are correct as well as
detect any cache issues.

Changes Include:
 - Ensure on exgw namespace annotation there are not duplicate IPs
 - Ensure for exgw pod addition, there is not already another pod with
   the same IP
 - If exgw pod cache becomes corrupt with duplicate IPs, emit a warning
   during pod add

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit dd6fe83)
(cherry picked from commit 6853ff5)
(cherry picked from commit 7c0893d)
(cherry picked from commit 988fbb1)
When pods are added to the cache as exgws for a namespace, only the
pod's name is used as the key. This breaks a scenario where 2 pods with
the same name are serving as exgws for the same namespace. Consider this
example:

1. app pod is created in ns foo
2. exgwAPod is created in ns exgw1 (172.0.1.1), serving ns foo
3. exgwAPod is created in ns exgw2 (172.0.1.2), serving ns foo

In the above example, the app pod will only have one ECMP route for
172.0.1.2, because the cache is keyed only on pod name.

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit a801777)
(cherry picked from commit 2172c74)
(cherry picked from commit cf3adb2)
(cherry picked from commit 18c8afe)
The map is keyed by namespace_pod, but was accidentally adding the new
pod by pod name only. If this was a retry pod add it could result in the
an error like:

duplicate IP found in ECMP Pod route cache! IP: "192.168.222.33",
first pod: "dep-serving-33-4-serving-job-66f7948f5d-dzx52", second pod:
"serving-ns-33_dep-serving-33-4-serving-job-66f7948f5d-dzx52"

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit 10bf713)
(cherry picked from commit 0ade239)
(cherry picked from commit 618eb5f)
(cherry picked from commit baf254c)
@tssurya
Copy link
Contributor

tssurya commented Aug 19, 2022

/lgtm

@tssurya
Copy link
Contributor

tssurya commented Aug 19, 2022

/retest

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 19, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 19, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: trozet, tssurya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@anuragthehatter
Copy link

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Aug 22, 2022
@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 2 against base HEAD b1abcda and 8 for PR HEAD 1a9f8fb in total

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 1 against base HEAD b1abcda and 7 for PR HEAD 1a9f8fb in total

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD b1abcda and 6 for PR HEAD 1a9f8fb in total

@trozet
Copy link
Contributor Author

trozet commented Aug 26, 2022

/retest-required

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 26, 2022

@trozet: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 41d32ea into openshift:release-4.7 Aug 26, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 26, 2022

@trozet: All pull requests linked via external trackers have merged:

Bugzilla bug 2117078 has been moved to the MODIFIED state.

In response to this:

Bug 2117078: [release-4.7] Multiple ExGW cache validation/improvements

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants