Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1873311: 8 27 2020 merge #243

Merged
merged 16 commits into from Aug 30, 2020

Conversation

trozet
Copy link
Contributor

@trozet trozet commented Aug 27, 2020

Includes fixes for Network policy and egress IP.
Note the revert for allow-related will lower performance for pod<->pod traffic without network policy.

rsevilla87 and others added 16 commits August 23, 2020 16:51
Signed-off-by: Raul Sevilla <rsevilla@redhat.com>
Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Keep `getPodIPs` true to its functional name

`getPodIPs` was doing additional operations not related
to "getting the pod IPs". Move that to `addPodEgressIP`

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
Rename nodeLogicalRouterIP with IP version for egress IP tests

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Fix variable spelling mistake

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
The initial implementation of egress IP in shared gateway mode
was dependent on having the bug: https://bugzilla.redhat.com/show_bug.cgi?id=1861294
resolved. This is now the case and the fix is in the latest Fedora 31 release.
This commit thus switches to the correct shared gateway mode implementation.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
Tag egress IP related OVN transactions with external_id

This patch deals with avoiding inconsistent state when a pod matches multiple
egress IPs. Before this patch: the lr-policy-add / lr-nat-add commands failed once the second
egress IP setup happened. This wasn't a problem in itself, but if that egress IP was later
deleted: it then proceeded to delete the egress setup for the first egress IP. It goes like this:

1) EgressIP-1 is created matching pod X, setup happens correctly
2) EgressIP-2 is created matching pod X, setup fails because that exact same setup (nat and policy rules) already exists
3) EgressIP-2 is deleted, deleting EgressIP-1's setup.
4) OVN state is now screwed up and pod egress functionality will no longer work even though EgressIP-1 still exists.

To avoid this we have the possibility to tag the policy and nat objects with the external_id,
which in this case is the EgressIP.Name. As such we will only try deleting whatever setup we find
for that particular object. The OVN API for doing this is not as straightforward though, and we use
an extra OVN transaction for the delete, but we win in correctness.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Expanding on the docstring for policyAlreadyExistsMsg

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Simplify logical router policy filter for egress IP

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Check if nat/lr-policy exists before creating it

As to avoid duplicate objects in the OVN database, check that
the NAT / logical router policy object which is about to be created
does not already exist

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
The policy command in local gateway mode (specifying pkt_mark) was incorrect.
It is not a column in the logical_router_policy table, it's an option.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
The egress IP code was modifying the API server object directly. Leading
to data races when running with the race detector set to "on". Instead copy
the object, modify it and update the API server object with the copied one.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Remove unecessary locking in egress tests.

Egress tests were locking eIPAllocatorMutex as to avoid the data races, which
have now been fixed by the top level commit.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Encapsulate all assignEgressIPs operations in lock

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Rename utility functions in egress*_test.go to reflect its new behaviour
This PR introduces a fix for the issue addressed in
ovn-org/ovn-kubernetes#1562
As of now, running hybrid-overlay-node as a Windows service
relies on os.Exit() to stop the service, this PR removes
abrupt exit by instead cancelling the context on svc.Stop.

Signed-off-by: mansikulkarni96 <mankulka@redhat.com>
Moved installation of j2cli via PIP after parsing the arguments.
Parsing command line arguments should be the first statement in the script so that if someone is just providing
the "-h" flag we print out the help  without doing any check. By moving the above statement as mentioned
we have ensured "-h" flag works even if say pip is not installed in the system.

Signed-off-by: Balaji Varadaraju <bvaradar@redhat.com>
This reverts commit fd47587.

We found that this causes problems with network policy ACLs when using
multi-node setups where the network policy ACL only targets a portgroup
of the destination, and in effect only places the ACL there with
allow-related. This causes return traffic destined to another node to be
dropped, because that node does not have any allow-related ACL, and thus
never able to determine via CT if this is reply traffic.

Signed-off-by: Tim Rozet <trozet@redhat.com>
…ange

Revert "Fix default allow mgmt ACL using conntrack"
@trozet
Copy link
Contributor Author

trozet commented Aug 27, 2020

@dcbw
Copy link
Member

dcbw commented Aug 27, 2020

/lgtm

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcbw, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 27, 2020
@trozet
Copy link
Contributor Author

trozet commented Aug 27, 2020

/retest

1 similar comment
@dcbw
Copy link
Member

dcbw commented Aug 27, 2020

/retest

@dcbw
Copy link
Member

dcbw commented Aug 27, 2020

rhel8 RPM mirrors are broken and cannot download necessary RPMs:

Errors during downloading metadata for repository 'rhel-8-server-ose':
  - Status code: 404 for http://base-4-6-rhel8.ocp.svc/rhel-8-server-ose/repodata/b8df5faabb92041ca5365f3e48729158d42777d26e6af0a50a8040a8197a4f9a-filelists.xml.gz (IP: 172.30.186.152)
  - Status code: 404 for http://base-4-6-rhel8.ocp.svc/rhel-8-server-ose/repodata/dcab78e42cb4e94713200f9b8901b9059b1b09bac1e14f7c04963ada4281cf3b-primary.xml.gz (IP: 172.30.186.152)
Error: Failed to download metadata for repo 'rhel-8-server-ose': Yum repo downloading error: Downloading error(s): repodata/b8df5faabb92041ca5365f3e48729158d42777d26e6af0a50a8040a8197a4f9a-filelists.xml.gz - Cannot download, all mirrors were already tried without success; repodata/dcab78e42cb4e94713200f9b8901b9059b1b09bac1e14f7c04963ada4281cf3b-primary.xml.gz - Cannot download, all mirrors were already tried without success
error: build error: running 'yum install -y  	selinux-policy && 	yum clean all' failed with exit code 1

@dcbw
Copy link
Member

dcbw commented Aug 27, 2020

/retest

@trozet trozet changed the title 8 27 2020 merge Bug 1873311: 8 27 2020 merge Aug 27, 2020
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Aug 27, 2020
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

24 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci-robot
Copy link
Contributor

@trozet: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-openstack da39713 link /test e2e-openstack
ci/prow/e2e-vsphere da39713 link /test e2e-vsphere
ci/prow/e2e-ovn-hybrid-step-registry da39713 link /test e2e-ovn-hybrid-step-registry

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 15877ff into openshift:master Aug 30, 2020
@openshift-ci-robot
Copy link
Contributor

@trozet: All pull requests linked via external trackers have merged:

Bugzilla bug 1873311 has been moved to the MODIFIED state.

In response to this:

Bug 1873311: 8 27 2020 merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants