New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[reelase-4.11] OCPBUGS-772: Fix NP upgrade bug: unexpectedly found multiple equivalent ACLs (arp v/s arp||nd) #1259
[reelase-4.11] OCPBUGS-772: Fix NP upgrade bug: unexpectedly found multiple equivalent ACLs (arp v/s arp||nd) #1259
Conversation
This PR does two things: 1) Loops through all the ACLs that have `ARPallowPolicy` in their names and have " && arp" in their match expression and deletes it. This is because we did https://github.com/openshift/ovn-kubernetes/pull/1043/files where we changed the match but didn't remove acls on the older match which causes problems like: 2022-06-01T08:17:44.635401164Z E0601 08:17:44.634509 1 ovn.go:753] Failed to create network policy mdh-old/allow-from-other-namespaces, error: failed to create default port groups and acls for policy: mdh-old/allow-from-other-namespaces, error: unexpectedly found multiple equivalent ACLs: [{UUID:3bc36c0e-ee1a-4609-a240-c211f14f379b Action:allow Direction:to-lport ExternalIDs:map[default-deny-policy-type:Ingress] Label:0 Log:false Match:outport == @a13985064446031893020_ingressDefaultDeny && arp Meter:0xc003245210 Name:0xc003245220 Options:map[] Priority:1001 Severity:0xc003245230} {UUID:41702278-fb49-4418-abca-6425912d1e63 Action:allow Direction:to-lport ExternalIDs:map[default-deny-policy-type:Ingress] Label:0 Log:false Match:outport == @a13985064446031893020_ingressDefaultDeny && (arp || nd) Meter:0xc0032452a0 Name:0xc0032452b0 Options:map[] Priority:1001 Severity:0xc0032452c0}] on upgrades 2) When we create the default deny ACLs for a namespace with at least one network policy, we name it "namespace_np-name". This doesn't make sense since the ACL is applicable to all the network policies in that namespace. The policy that get's created first becomes the lucky winner. This PR updates the names of default deny ACLs to "namespace_egressDefaultDeny" OR "namespace_ingressDefaultDeny" so that "we can stop being stupid" :) and stop errors like: 2022-06-09T17:15:00.952174381Z E0609 17:15:00.952154 1 ovn.go:753] Failed to create network policy oit-ssi-fluentd/fluentd-input, error: failed to create default port groups and acls for policy: oit-ssi-fluentd/fluentd-input, error: unexpectedly found multiple equivalent ACLs: [{UUID:5b98f17d-789f-4de1-9beb-36741bfa40d1 Action:drop Direction:from-lport ExternalIDs:map[default-deny-policy-type:Egress] Label:0 Log:false Match:inport == @a12933912868060780448_egressDefaultDeny Meter:0xc001388aa0 Name:0xc001388ab0 Options:map[apply-after-lb:true] Priority:1000 Severity:0xc001388ac0} {UUID:b9349cb8-99b0-4130-92cf-ab11b4874a11 Action:drop Direction:from-lport ExternalIDs:map[default-deny-policy-type:Egress] Label:0 Log:false Match:inport == @a12933912868060780448_egressDefaultDeny Meter:0xc001388c90 Name:0xc001388ca0 Options:map[apply-after-lb:true] Priority:1000 Severity:0xc001388cb0}] (NOTE: not sure how the user ended up with two default ACLs in the same namespace, but they did!) Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> (cherry picked from commit 5fea420) Conflicts in 4.11: test/e2e/acl_logging.go had minor conflicts since openshift@e1f7da0 is missing in 4.11
We weren't removing reference of ACL from PG before deleting it. Unfortuntely unit tests don't catch this. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> (cherry picked from commit 4d446ba)
@tssurya: No Bugzilla bug is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@tssurya: This pull request references Jira Issue OCPBUGS-772, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@tssurya: No Bugzilla bug is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@tssurya: This pull request references Jira Issue OCPBUGS-772, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 6 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cherry-pick release-4.10 |
@tssurya: once the present PR merges, I will cherry-pick it on top of release-4.10 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
hmm we might need openshift/origin#27312 to go into 4.11 as well |
/retest |
@tssurya: No Bugzilla bug is referenced in the title of this pull request. Retaining the bugzilla/valid-bug label as it was manually added. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
/lgtm |
/retest-required |
/approve |
dumb bot listen to tim's approve... |
/lgtm |
@tssurya: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: npinaeva, trozet, tssurya The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@anuragthehatter : PTAL |
/assign @anuragthehatter |
/label cherry-pick-approved |
@tssurya: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/test e2e-aws-ovn-upgrade |
@tssurya: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-772 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@tssurya: #1259 failed to apply on top of branch "release-4.10":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift/ovn-kubernetes#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patrk Diak <pdiak@redhat.com>
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift/ovn-kubernetes#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com>
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift/ovn-kubernetes#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com>
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit b10ed7b)
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit b10ed7b) (cherry picked from commit 54381b0)
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. NOTE: In 4.11 openshift@44ad754#diff-5b9331449265f93da0d6fac90800eb68b7cb28f72b3f55eb01ad7026fc2b6089R68 is missing so we are directly using libovsdbops.BuildACL. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit b10ed7b) (cherry picked from commit 54381b0)
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. NOTE: In 4.11 openshift@44ad754#diff-5b9331449265f93da0d6fac90800eb68b7cb28f72b3f55eb01ad7026fc2b6089R68 is missing so we are directly using libovsdbops.BuildACL. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit b10ed7b) (cherry picked from commit 54381b0)
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. NOTE: In 4.11 openshift@44ad754#diff-5b9331449265f93da0d6fac90800eb68b7cb28f72b3f55eb01ad7026fc2b6089R68 is missing so we are directly using libovsdbops.BuildACL. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit b10ed7b) (cherry picked from commit 54381b0)
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. NOTE: In 4.11 openshift@44ad754#diff-5b9331449265f93da0d6fac90800eb68b7cb28f72b3f55eb01ad7026fc2b6089R68 is missing so we are directly using libovsdbops.BuildACL. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit b10ed7b) (cherry picked from commit 54381b0) (cherry picked from commit a039f90) Conflicts in 4.10: go-controller/pkg/ovn/policy.go had to remove passing `ACL.options` into BuildACL because in 4.10 we have: openshift@9b723ec versus in 4.11 we had: openshift@14bf1f2 So the function signature for libovsdbops.BuildACL has changed between versions.
We didn't consider this tiny hack that we do https://github.com/openshift/ovn-kubernetes/blob/44ad75466e486cce605e39513a3ecd9e0b306e7d/go-controller/pkg/libovsdbops/acl.go#L60 there when we wrote openshift#1259 and unit tests don't actually scream loudly for longer names. Without this PR we break users who have namespace names longer than 45 characters. NOTE: In 4.11 openshift@44ad754#diff-5b9331449265f93da0d6fac90800eb68b7cb28f72b3f55eb01ad7026fc2b6089R68 is missing so we are directly using libovsdbops.BuildACL. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit b10ed7b) (cherry picked from commit 54381b0) (cherry picked from commit a039f90) Conflicts in 4.10: go-controller/pkg/ovn/policy.go had to remove passing `ACL.options` into BuildACL because in 4.10 we have: openshift@9b723ec versus in 4.11 we had: openshift@14bf1f2 So the function signature for libovsdbops.BuildACL has changed between versions.
Minor conflict in commit 1 on the e2e file since e1f7da0 is missing in 4.11