Bug 2117078: [release-4.7] Multiple ExGW cache validation/improvements #1241
On every pod add we assemble a new slice with all of the gatewayInfos. The slice was created with no capacity, so appends could repeatedly reallocate and copy the underlying array as it grew. Attempt to allocate at least some predictable capacity to avoid this.

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit 87368a7)
(cherry picked from commit e65ab58)
(cherry picked from commit 6101853)
(cherry picked from commit dd5bc6f)
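A minimal sketch of the preallocation idea described in this commit message, not the actual change from the PR. The `gatewayInfo` struct, the `collectGateways` function, and its parameters are hypothetical stand-ins; only the use of `make` with an explicit capacity is the point.

```go
package main

import "fmt"

// gatewayInfo is a hypothetical stand-in for the real type used in the PR.
type gatewayInfo struct {
	gws []string
}

// collectGateways gathers namespace-level and pod-level gateway infos into
// one slice. The capacity is sized up front, so the appends below never have
// to grow (and copy) the backing array.
func collectGateways(nsGWs []*gatewayInfo, podGWs map[string]*gatewayInfo) []*gatewayInfo {
	all := make([]*gatewayInfo, 0, len(nsGWs)+len(podGWs))
	all = append(all, nsGWs...)
	for _, gw := range podGWs {
		all = append(all, gw)
	}
	return all
}

func main() {
	ns := []*gatewayInfo{{gws: []string{"172.0.0.1"}}}
	pods := map[string]*gatewayInfo{
		"exgw1_exgwAPod": {gws: []string{"172.0.1.1"}},
		"exgw2_exgwAPod": {gws: []string{"172.0.1.2"}},
	}
	all := collectGateways(ns, pods)
	fmt.Println(len(all), cap(all)) // 3 3
}
```

Because the final length is known before the loop, the capacity can be exact here. When it is only an estimate, a reasonable lower bound still removes most of the regrowth.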
@trozet: This pull request references Bugzilla bug 2117078, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 6 validation(s) were run on this bug
/assign @dcbw
/retest
/label backport-risk-assessed
COMMIT1: CLEAN PICK LGTM
COMMIT2: PTAL at my comment. Also, should we be bringing in 988fbb1#diff-3696280f7dbfece2e1136d5fa06a3339a05a8a3067710f7510ecdefba51bcb29R194? I see parseRoutingExternalGWAnnotation moved into utils, but do we need to bring it down?
ipTracker := make(map[string]string)
for podName, gwInfo := range podGWs {
	for gwIP := range gwInfo.gws {
		if foundPod, ok := ipTracker[gwIP]; ok {
Self note: this line differs from the 4.8 pick. We don't need gwIP.String() here, since these variables were converted to strings as part of the conntrack backport.
COMMIT 3 & 4 LGTM
- foundGws, ok := nsInfo.routingExternalPodGWs[pod]
- delete(nsInfo.routingExternalPodGWs, pod)
+ foundGws, ok := nsInfo.routingExternalPodGWs[podGWKey]
+ delete(nsInfo.routingExternalPodGWs, podGWKey)
Assuming the conflict was here somewhere? But the code change LGTM based on comparison with 18c8afe.
Adds some checking to ensure user provided IPs are correct as well as detect any cache issues.

Changes Include:
- Ensure on exgw namespace annotation there are not duplicate IPs
- Ensure for exgw pod addition, there is not already another pod with the same IP
- If exgw pod cache becomes corrupt with duplicate IPs, emit a warning during pod add

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit dd6fe83)
(cherry picked from commit 6853ff5)
(cherry picked from commit 7c0893d)
(cherry picked from commit 988fbb1)
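The checks listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: `findDuplicateIP` and `checkPodGWs` are hypothetical names, IPs are compared as plain strings, and the cache is simplified to a map from pod key to a list of gateway IPs.

```go
package main

import "fmt"

// findDuplicateIP returns the first IP that occurs more than once in ips,
// for example in the list parsed from a namespace annotation.
func findDuplicateIP(ips []string) (string, bool) {
	seen := make(map[string]struct{}, len(ips))
	for _, ip := range ips {
		if _, ok := seen[ip]; ok {
			return ip, true
		}
		seen[ip] = struct{}{}
	}
	return "", false
}

// checkPodGWs walks a simplified pod gateway cache and returns an error if
// the same gateway IP is claimed by more than one entry.
func checkPodGWs(podGWs map[string][]string) error {
	ipTracker := make(map[string]string) // gateway IP -> pod key that owns it
	for podKey, ips := range podGWs {
		for _, ip := range ips {
			if foundPod, ok := ipTracker[ip]; ok {
				return fmt.Errorf("duplicate IP %q found for pods %q and %q", ip, foundPod, podKey)
			}
			ipTracker[ip] = podKey
		}
	}
	return nil
}

func main() {
	if ip, dup := findDuplicateIP([]string{"172.0.1.1", "172.0.1.2", "172.0.1.1"}); dup {
		fmt.Println("annotation has duplicate IP:", ip)
	}
	cache := map[string][]string{
		"exgw1_exgwAPod": {"172.0.1.1"},
		"exgw2_exgwBPod": {"172.0.1.1"},
	}
	if err := checkPodGWs(cache); err != nil {
		fmt.Println("warning:", err)
	}
}
```

Map iteration order in Go is not fixed, so which of the two pods is reported as the first owner of a duplicated IP can vary between runs.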
When pods are added to the cache as exgws for a namespace, only the pod's name is used as the key. This breaks a scenario where 2 pods with the same name are serving as exgws for the same namespace. Consider this example:

1. app pod is created in ns foo
2. exgwAPod is created in ns exgw1 (172.0.1.1), serving ns foo
3. exgwAPod is created in ns exgw2 (172.0.1.2), serving ns foo

In the above example, the app pod will only have one ECMP route for 172.0.1.2, because the cache is keyed only on pod name.

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit a801777)
(cherry picked from commit 2172c74)
(cherry picked from commit cf3adb2)
(cherry picked from commit 18c8afe)
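The scenario above can be reproduced with two plain maps. This is a sketch only: the `podGWKey` helper is hypothetical (the diff in this PR uses a variable of that name, and a later commit message describes the key as namespace_pod), and the cache is reduced to a map from key to a single IP.

```go
package main

import "fmt"

// podGWKey builds a namespace_name cache key. Hypothetical helper: the exact
// construction in the PR is assumed, based on the "namespace_pod" description.
func podGWKey(namespace, name string) string {
	return namespace + "_" + name
}

func main() {
	pods := []struct{ ns, name, ip string }{
		{"exgw1", "exgwAPod", "172.0.1.1"},
		{"exgw2", "exgwAPod", "172.0.1.2"},
	}

	byName := map[string]string{} // old behavior: keyed on pod name only
	byKey := map[string]string{}  // new behavior: keyed on namespace_name

	for _, p := range pods {
		byName[p.name] = p.ip // the second pod overwrites the first
		byKey[podGWKey(p.ns, p.name)] = p.ip
	}

	fmt.Println(len(byName), len(byKey)) // 1 2
	fmt.Println(byName["exgwAPod"])      // 172.0.1.2
}
```

Kubernetes namespace names cannot contain an underscore, so the `_` separator cannot make two different namespace and pod pairs produce the same key.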
The map is keyed by namespace_pod, but the new pod was accidentally being added by pod name only. If this was a retry pod add, it could result in an error like:

duplicate IP found in ECMP Pod route cache! IP: "192.168.222.33", first pod: "dep-serving-33-4-serving-job-66f7948f5d-dzx52", second pod: "serving-ns-33_dep-serving-33-4-serving-job-66f7948f5d-dzx52"

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit 10bf713)
(cherry picked from commit 0ade239)
(cherry picked from commit 618eb5f)
(cherry picked from commit baf254c)
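One way to guard against this kind of mixed keying is to route every map access through a single key function, so no caller can insert by bare pod name. The sketch below is an assumption about how that could look, not the fix applied in the commit; `podGWCache`, `add`, and `remove` are hypothetical names.

```go
package main

import "fmt"

// podGWCache wraps the map so that callers never build keys themselves.
// Hypothetical type, not from the PR.
type podGWCache struct {
	gws map[string][]string // namespace_name -> gateway IPs
}

func newPodGWCache() *podGWCache {
	return &podGWCache{gws: make(map[string][]string)}
}

// key is the only place where a cache key is constructed.
func key(namespace, name string) string {
	return namespace + "_" + name
}

// add stores the gateway IPs for a pod. A retried add for the same pod
// overwrites its existing entry instead of creating a second one.
func (c *podGWCache) add(namespace, name string, ips []string) {
	c.gws[key(namespace, name)] = ips
}

// remove deletes the pod's entry and returns what was stored, if anything.
func (c *podGWCache) remove(namespace, name string) ([]string, bool) {
	k := key(namespace, name)
	ips, ok := c.gws[k]
	delete(c.gws, k)
	return ips, ok
}

func main() {
	c := newPodGWCache()
	c.add("serving-ns-33", "serving-job", []string{"192.168.222.33"})
	c.add("serving-ns-33", "serving-job", []string{"192.168.222.33"}) // retry
	fmt.Println(len(c.gws))                                           // 1
	ips, ok := c.remove("serving-ns-33", "serving-job")
	fmt.Println(ips, ok, len(c.gws))
}
```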
/lgtm
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: trozet, tssurya
/label cherry-pick-approved
/retest-required
@trozet: all tests passed!
@trozet: All pull requests linked via external trackers have merged. Bugzilla bug 2117078 has been moved to the MODIFIED state.
Minor conflicts in the 2nd and 3rd commit