Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upstream merge 2020-02-18 #93

Merged
merged 36 commits into from
Feb 21, 2020

Conversation

dcbw
Copy link
Contributor

@dcbw dcbw commented Feb 18, 2020

@openshift/networking @danwinship

trozet and others added 30 commits February 6, 2020 21:12
When config is read, the config file is parsed first. After these
settings are loaded, the CLI settings are parsed and override any
settings read from the file. This is fine, except for the CLI settings
may be defaulted, which will just override any specified config file
settings with defaults.

This patch addresses the problem by checking during config override if
the the configuration option has a default specified, and if so, ignores
overriding the current config option.

Additionally, this patch adds real default values for settings.

Signed-off-by: Tim Rozet <trozet@redhat.com>
With this change, any config setting is checked against its default to
determine if the value is valid when parsed. The only case where empty
string values are considered "unset" is with environment variables.

Signed-off-by: Tim Rozet <trozet@redhat.com>
This commit implements the clustered database as a statefulsets with
replicas set to 3. The PodManagementPolicy of OrderReady is too
restrictive for how RAFT works, so instead Parallel policy is used.
The existing ovnkube-db service acts as the headless Service for the
statefulset. The pods are scheduled on the nodes with labels
`ovn.org/ovnkube-db=true`.

Implementation notes:

1. ovnkube-db-0, the first pod in the statefulset creates the initial
cluster DB. If an existing standalone DB is present, then it converts it
into the clustered DB.

2. ovnkube-db-1 ... ovnkube-db-{n-1} pods wait until the ovnkube-db-0
has successfully started and then only join the cluster.

3. ovnkube-db-{n-1}, the last pod in the statefulset will ensure that
all of the pods in the statefulet are up and running and then only
publish the ovnkube-db endpoints to the ovnkube-db service.

4. Like before, all of the ovnkube-nodes and ovnkube-master are waiting
on the endpoints to be published. Once the endpoints are published, they
proceed further.

Signed-off-by: Yun Zhou <yunz@nvidia.com>
Signed-off-by: Girish Moodalbail <gmoodalbail@nvidia.com>
It's useful to see successful logs too.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Clustered database (aka RAFT) support for OVN DBs
Girish identified a race where the initial object Add events that are
sent when a handler is attached to an informer were not properly locked
against incoming events to that informer from the apiserver.

To fix this send the initial Add events while the informer's lock is held.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Previously OVN Kubernetes assumed the node's name in kubernetes was the node's
FQDN, which isn't necessarily true, with this change we read it from the
kubernetes API, which is more reliable.

Signed-off-by: Juan Luis de Sousa-Valadas Castaño <jdesousa@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>
Signed-off-by: Girish Moodalbail <gmoodalbail@nvidia.com>
7358508 changed GatewayReady() from using ovn-nbctl to using
ovs-ofctl, but checked for an IPv4-specific rule. Fix it to use IPv6
syntax when checking for IPv6 CIDRs.

Signed-off-by: Dan Winship <danw@redhat.com>
We're not ready until the messages channel is closed.

Signed-off-by: Dan Winship <danw@redhat.com>
kube-proxy is not necessary for OVN.

Signed-off-by: Tim Rozet <trozet@redhat.com>
Address sets were being deleted before the port group. The port group
contained an ACL which referenced the address set. This would cause OVN
controller to issue:

2020-02-17T16:31:09.160Z|00032|lflow|WARN|error parsing match "!ct.new
&& ct.est && !ct.rpl && ct_label.blocked == 0 && (ip4.src ==
{$a2376695646506571724, $a6566950664295833471} && outport ==
@a5584531700272134285)": Syntax error at `$a6566950664295833471'
expecting address set name.

With this patch address set is deleted after the associated ACL in the
port group is removed.

See ovn-org/ovn#35

Signed-off-by: Tim Rozet <trozet@redhat.com>
Fix delete order for network policy
…ults

Fixes accidental override of file conf settings with defaults
Signed-off-by: Dan Winship <danw@redhat.com>
other-config:subnet takes a CIDR string, but other-config:ipv6_prefix
just takes an IP address (and just assumes a /64 prefix). Fix this.
(Fixes dynamic assignment of IPv6 addresses.)

Also belatedly fix syncNodes which was always looking for
"other-config:subnet" even on IPv6.

Since other-config:subnet and other-config:ipv6_prefix don't behave
the same way and so need separate code (and because they're used less
now than when Russell originally wrote that patch anyway), remove
config.OtherConfigSubnet().

Signed-off-by: Dan Winship <danw@redhat.com>
add a -no-host-subnet-nodes flag which takes a label and causes ovn-kubernetes to
not assign a hostsubnet to that node. This will be used in cases where ovn is not supported
and nodes can manage there own subnets

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
add the ability for ovn-kubernetes to ignore a node by label
config: don't let environment affect test results
Signed-off-by: Dan Winship <danw@redhat.com>
The annotator provides a better-encapsulated interface for setting
or removing multiple annotations on a node from disconnected functions
since it bundles them all up into a single set and/or delete call.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Doesn't check for OVS/OVN utilities; to be used for
components that don't need those.

Signed-off-by: Dan Williams <dcbw@redhat.com>
@dcbw
Copy link
Contributor Author

dcbw commented Feb 18, 2020

/retest

@dcbw
Copy link
Contributor Author

dcbw commented Feb 19, 2020

Registry returns 500 when pushing the ovnkube image.

/retest

@danwinship
Copy link
Contributor

We're still getting upstream CI. Didn't we agree to delete/revert that in the merge?

@dcbw
Copy link
Contributor Author

dcbw commented Feb 19, 2020

We're still getting upstream CI. Didn't we agree to delete/revert that in the merge?

@danwinship not that I recall but I may have been on PTO at that time. But we can kill it since the openshift e2e tests should be more comprehensive.

Copy link
Contributor

@pecameron pecameron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally /lgtm
There are some upstream changes that we don't use. Just as good to keep things in sync.

dist/images/ovnkube.sh Show resolved Hide resolved
@dcbw
Copy link
Contributor Author

dcbw commented Feb 19, 2020

[Conformance][Area:Networking][Feature:Router] The HAProxy router [Top Level] [Conformance][Area:Networking][Feature:Router] The HAProxy router should set Forwarded headers appropriately [Suite:openshift/conformance/parallel/minimal]

/test e2e-gcp-ovn

@dcbw
Copy link
Contributor Author

dcbw commented Feb 19, 2020

/test e2e-aws-ovn

@dcbw
Copy link
Contributor Author

dcbw commented Feb 19, 2020

level=fatal msg="failed to fetch Terraform Variables: failed to fetch dependency of \"Terraform Variables\": failed to fetch dependency of \"Bootstrap Ignition Config\": failed to fetch dependency of \"Common Manifests\": failed to generate asset \"DNS Config\": getting public zone for \"origin-ci-int-aws.dev.rhcloud.com\": listing hosted zones: Throttling: Rate exceeded\n\tstatus code: 400, request id: 4d0d3929-15d4-4cc6-a508-4c2d70c649f1"
/test e2e-aws-ovn

@dcbw
Copy link
Contributor Author

dcbw commented Feb 20, 2020

/retest

4 similar comments
@vrutkovs
Copy link
Member

/retest

@vrutkovs
Copy link
Member

/retest

@vrutkovs
Copy link
Member

/retest

@vrutkovs
Copy link
Member

/retest

@vrutkovs
Copy link
Member

/test e2e-gcp-ovn

2 similar comments
@dcbw
Copy link
Contributor Author

dcbw commented Feb 21, 2020

/test e2e-gcp-ovn

@dcbw
Copy link
Contributor Author

dcbw commented Feb 21, 2020

/test e2e-gcp-ovn

@pecameron
Copy link
Contributor

@dcbw There are OOMKilled problems that the node team is working on. See slack. Its not likely anything will pass until they are done.

@danwinship
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 21, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, dcbw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit a4f8586 into openshift:master Feb 21, 2020
@openshift-ci-robot
Copy link
Contributor

@dcbw: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-gcp-ovn d2e186b link /test e2e-gcp-ovn

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.