[DownstreamMerge] 8-25-2022 #1253
Signed-off-by: Zenghui Shi <zshi@redhat.com>
In the DPU 2-clusters design, a host with DPU will join the cluster with ovn-controller running in the host first, then it will be moved to the DPU. Therefore, we will have 2 chassis with the same hostname in the ovn sbdb. With this patch, the master will try to remove the stale chassis when the Chassis ID is inconsistent with the value of node annotation 'k8s.ovn.org/node-chassis-id'. Signed-off-by: Peng Liu <pliu@redhat.com>
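The stale-chassis check described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the `Chassis` struct, the `staleChassis` helper, and the `annotatedID` map (hostname to the value of 'k8s.ovn.org/node-chassis-id') are hypothetical stand-ins for the real southbound DB and node objects.

```go
package main

import "fmt"

// Chassis is a simplified, hypothetical stand-in for a southbound DB chassis row.
type Chassis struct {
	Name     string // chassis ID
	Hostname string
}

// staleChassis returns every chassis whose hostname belongs to a known node
// but whose ID differs from the chassis ID annotated on that node.
func staleChassis(chassis []Chassis, annotatedID map[string]string) []Chassis {
	var stale []Chassis
	for _, ch := range chassis {
		want, ok := annotatedID[ch.Hostname]
		if ok && want != "" && ch.Name != want {
			stale = append(stale, ch)
		}
	}
	return stale
}

func main() {
	// Two chassis share one hostname: the original host entry and the DPU entry.
	chassis := []Chassis{
		{Name: "host-id", Hostname: "node1"},
		{Name: "dpu-id", Hostname: "node1"},
	}
	annotated := map[string]string{"node1": "dpu-id"}
	for _, ch := range staleChassis(chassis, annotated) {
		fmt.Println("stale chassis:", ch.Name)
	}
}
```

Nodes without an annotation are skipped on purpose in this sketch, so a chassis is never treated as stale just because the annotation has not been written yet.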
Bump libovsdb to include ovn-org/libovsdb#321
Signed-off-by: Numan Siddique <numans@ovn.org>
Currently, stopwatch metrics are reported in milliseconds. Prometheus mandates that time values are reported in seconds. Signed-off-by: Martin Kennelly <mkennell@redhat.com>
OVN-K Metrics: Ensure stopwatch metrics are reported in seconds
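The unit change amounts to a simple conversion before the value is exported. A minimal sketch, assuming the raw stopwatch values arrive as millisecond floats (the `millisToSeconds` helper is hypothetical and not taken from the commit):

```go
package main

import (
	"fmt"
	"time"
)

// millisToSeconds converts a duration reported in milliseconds to seconds.
func millisToSeconds(ms float64) float64 {
	return ms / 1000.0
}

func main() {
	fmt.Println(millisToSeconds(250))
	// When the value is already a time.Duration, Seconds() does the same job.
	fmt.Println((1500 * time.Millisecond).Seconds())
}
```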
Report OVN controller connection status via a metric from each OVN kube node. The metric is updated every 2 minutes in order to avoid polling overhead, which is not insignificant. The Prometheus scrape interval is usually 30 seconds, which is also the value on OCP. The reason we report this from every node, instead of reading it from the SB DB (via the nb_cfg trigger), is to spare the SB DB from having every OVN controller write to it. This may be an issue when scaling up the number of nodes. Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Add metrics to observe core events related to EgressIP:
- name: egress_ips_assign_latency_seconds -- Histogram **
  desc: The latency of egress IP assignment to ovn nb database
- name: egress_ips_unassign_latency_seconds -- Histogram **
  desc: The latency of egress IP unassignment from ovn nb database
- name: egress_ips_node_unreachable_total -- Counter
  desc: The total number of times assigned egress IP(s) were unreachable
- name: egress_ips_rebalance_total -- Counter
  desc: The total number of times assigned egress IP(s) needed to be moved to a different node
Add flags to explicitly enable the histogram metrics, since we only see value in having them when scale testing egress ips. The flag introduced here is: --metrics-enable-eip-scale
Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
This reverts commit 503de2f. Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
…ube-node" This reverts commit 1a225ea. Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
…metric OVN-K metrics: Add OVN controller southbound database connection
Bumps [@actions/core](https://github.com/actions/toolkit/tree/HEAD/packages/core) from 1.2.6 to 1.9.1.
- [Release notes](https://github.com/actions/toolkit/releases)
- [Changelog](https://github.com/actions/toolkit/blob/main/packages/core/RELEASES.md)
- [Commits](https://github.com/actions/toolkit/commits/HEAD/packages/core)
---
updated-dependencies:
- dependency-name: "@actions/core"
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Remove stale chassis for hosts that run ovnkube-node on DPU
Revert EndpointSlice commits
This deletes any stale IP from the node allocations map before assignment, so that an update to the same egress IP object assigns the new IP address to an assignable node regardless of timing. Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
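The ordering described above (drop stale entries first, then assign) can be sketched as follows. The `allocations` type, `releaseStale`, and `assign` are hypothetical simplifications, and the real node allocations map in ovn-kubernetes may be shaped differently.

```go
package main

import "fmt"

// allocations maps node name -> egress IP -> name of the owning EgressIP object.
// This layout is an assumption made for illustration only.
type allocations map[string]map[string]string

// releaseStale removes every IP recorded for owner that is no longer desired.
// Entries that belong to other EgressIP objects are left untouched.
func releaseStale(alloc allocations, owner string, desired []string) {
	want := make(map[string]struct{}, len(desired))
	for _, ip := range desired {
		want[ip] = struct{}{}
	}
	for _, ips := range alloc {
		for ip, o := range ips {
			if o != owner {
				continue
			}
			if _, ok := want[ip]; !ok {
				delete(ips, ip) // deleting during range is safe in Go
			}
		}
	}
}

// assign records ip for owner on the given node.
func assign(alloc allocations, node, owner, ip string) {
	if alloc[node] == nil {
		alloc[node] = map[string]string{}
	}
	alloc[node][ip] = owner
}

func main() {
	alloc := allocations{"node1": {"10.0.0.1": "eip-a"}}
	// The object is updated from 10.0.0.1 to 10.0.0.2: clean up, then assign.
	releaseStale(alloc, "eip-a", []string{"10.0.0.2"})
	assign(alloc, "node1", "eip-a", "10.0.0.2")
	fmt.Println(len(alloc["node1"]))
}
```

Running the cleanup before the assignment means the old address never lingers in the map, whatever order the update events arrive in.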
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
fedora Dockerfile : Switch to f36 and OVN 22.06.
…hub/actions/cleanup-action/actions/core-1.9.1 Bump @actions/core from 1.2.6 to 1.9.1 in /.github/actions/cleanup-action
egressip: add metrics
libovsdb not being able to reset ACL severity to nil), use nil to reset. Unify ACL logging setup (ovn/acl.go) so that ACL severity is properly set and reset (since severity is a *string, this can be error-prone). Add unified handling for ACLs whose logging is always disabled (like arpAllowACL). Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
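Because the severity is a `*string`, setting and resetting it through one helper avoids half-updated ACLs. A minimal sketch, assuming a simplified `ACL` struct with `Log` and `Severity` fields; the `setLogging` helper is hypothetical and is not the function in ovn/acl.go:

```go
package main

import "fmt"

// ACL is a simplified stand-in for the northbound ACL model.
type ACL struct {
	Log      bool
	Severity *string
}

// setLogging enables logging with the given severity, or disables logging
// and resets the severity to nil when severity is empty.
func setLogging(acl *ACL, severity string) {
	if severity == "" {
		acl.Log = false
		acl.Severity = nil
		return
	}
	acl.Log = true
	acl.Severity = &severity
}

func main() {
	acl := &ACL{}
	setLogging(acl, "info")
	fmt.Println(acl.Log, *acl.Severity)
	setLogging(acl, "")
	fmt.Println(acl.Log, acl.Severity == nil)
}
```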
Add egress firewall external ID constant. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
namespace update handler will only update them if len(nsInfo.networkPolicies) > 0, which is not the case during the first ns policy creation. This can lead to the default ACLs' log level not being updated. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Signed-off-by: Pardhakeswar Pacha <ppacha@nvidia.com> (cherry picked from commit 1a225ea)
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com> (cherry picked from commit 503de2f)
The selector in the endpointslice informer now correctly chooses only endpointslices with a non-empty "kubernetes.io/service-name" label and without "service.kubernetes.io/headless" label. Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
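The selection rule can be expressed as a plain predicate over an object's labels. This is only an illustration of the rule: the hypothetical `selected` function works on a bare map, whereas the actual informer presumably uses a Kubernetes label selector.

```go
package main

import "fmt"

const (
	serviceNameLabel = "kubernetes.io/service-name"
	headlessLabel    = "service.kubernetes.io/headless"
)

// selected reports whether an endpointslice with these labels should be
// watched: it needs a non-empty service-name label and no headless label.
func selected(labels map[string]string) bool {
	if name, ok := labels[serviceNameLabel]; !ok || name == "" {
		return false
	}
	_, headless := labels[headlessLabel]
	return !headless
}

func main() {
	fmt.Println(selected(map[string]string{serviceNameLabel: "web"}))
	fmt.Println(selected(map[string]string{serviceNameLabel: "web", headlessLabel: ""}))
	fmt.Println(selected(map[string]string{serviceNameLabel: ""}))
}
```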
Remove default ACL severity
OCPBUGS-417: Fix informer selector for endpointslices
When the logical port is already deleted, deleteLogicalPort should not return an error. Otherwise, the delete will be retried up to maxFailedAttempts times (currently set to 15). This issue was introduced in commit be8786a. Signed-off-by: Flavio Fernandes <flaviof@redhat.com> Co-authored-by: Tim Rozet <trozet@redhat.com> Reported-at: https://issues.redhat.com/browse/OCPBUGS-469
pods: deleteLogicalPort should not fail when port is already gone
39ec00e to 2174b09
Delete stale egress ip before assigning new ip
/retest
/payload 4.12 ci blocking
@ricky-rav: trigger 4 job(s) of type blocking for the ci release of OCP 4.12
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6b3a7870-250d-11ed-82de-9d24899b85bd-0
trigger 6 job(s) of type blocking for the nightly release of OCP 4.12
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6b3a7870-250d-11ed-82de-9d24899b85bd-1
@ricky-rav could you please get this latest commit ovn-org/ovn-kubernetes@a86eaf4 as well?
2174b09 to 826f36f
/retest
@ricky-rav: The following tests failed, say
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ricky-rav, trozet
@trozet PTAL