[DownstreamMerge] 8-25-2022 #1253

Merged

Conversation

ricky-rav
Contributor

@trozet PTAL

zshi-redhat and others added 30 commits August 10, 2022 08:06
Signed-off-by: Zenghui Shi <zshi@redhat.com>
In the DPU 2-clusters design, a host with DPU will join the cluster
with ovn-controller running in the host first, then it will be moved
to the DPU. Therefore, we will have 2 chassis with the same hostname
in the ovn sbdb.

With this patch, the master will try to remove the stale chassis when
the Chassis ID is inconsistent with the value of node annotation
'k8s.ovn.org/node-chassis-id'.

Signed-off-by: Peng Liu <pliu@redhat.com>
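
The stale-chassis check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the `Chassis` struct and the `staleChassis` helper are hypothetical stand-ins, and the annotated ID is assumed to have already been read from the 'k8s.ovn.org/node-chassis-id' node annotation.

```go
package main

import "fmt"

// Chassis is a hypothetical, simplified view of a chassis row in the OVN SB DB.
type Chassis struct {
	Name     string // chassis ID
	Hostname string
}

// staleChassis returns the chassis that share the node's hostname but whose ID
// differs from the ID recorded in the node annotation.
func staleChassis(all []Chassis, hostname, annotatedID string) []Chassis {
	var stale []Chassis
	if annotatedID == "" {
		// Without an annotation we cannot tell which chassis is current.
		return stale
	}
	for _, c := range all {
		if c.Hostname == hostname && c.Name != annotatedID {
			stale = append(stale, c)
		}
	}
	return stale
}

func main() {
	sb := []Chassis{
		{Name: "host-chassis", Hostname: "node1"}, // registered while ovn-controller ran on the host
		{Name: "dpu-chassis", Hostname: "node1"},  // registered after the move to the DPU
		{Name: "other", Hostname: "node2"},
	}
	for _, c := range staleChassis(sb, "node1", "dpu-chassis") {
		fmt.Println("would remove stale chassis:", c.Name)
	}
}
```

Returning nothing when the annotation is empty is a design assumption of this sketch: it avoids deleting a chassis before the node has published its ID.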
Signed-off-by: Numan Siddique <numans@ovn.org>
Currently, they are reported in milliseconds.
Prometheus mandates time values are reported in seconds.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
OVN-K Metrics: Ensure stopwatch metrics are reported in seconds
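
As a small illustration of the unit change, assuming the stopwatch values arrive as floating-point milliseconds (the helper name `msToSeconds` is hypothetical and not taken from the patch):

```go
package main

import (
	"fmt"
	"time"
)

// msToSeconds converts a value reported in milliseconds into seconds,
// the base unit Prometheus expects for time values.
func msToSeconds(ms float64) float64 {
	return ms / 1000.0
}

func main() {
	// A raw millisecond reading converted by hand.
	fmt.Println(msToSeconds(250))

	// When the value is already a time.Duration, Seconds() does the conversion.
	elapsed := 1500 * time.Millisecond
	fmt.Println(elapsed.Seconds())
}
```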
Report OVN controller connection status via a metric from each
OVN kube node.

The metric is updated every 2 minutes in order to avoid the
polling overhead, which is not insignificant.

The Prometheus scrape interval is usually 30 seconds, which is
also the value used on OCP.

We report this from every node, instead of reading it from the
SB DB (via an nb_cfg trigger), to spare the SB DB from having
every OVN controller write to it.
That may become an issue when scaling the number of nodes.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
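
The polling pattern described above might look roughly like this. It is a sketch under stated assumptions: `connGauge` is a stand-in for a Prometheus gauge, the `check` callback is a hypothetical probe of the OVN controller connection, and the interval is shortened from 2 minutes so the example finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// connGauge is a stand-in for a Prometheus gauge: 1 means connected, 0 means not.
type connGauge struct{ v int64 }

func (g *connGauge) Set(connected bool) {
	var n int64
	if connected {
		n = 1
	}
	atomic.StoreInt64(&g.v, n)
}

func (g *connGauge) Value() int64 { return atomic.LoadInt64(&g.v) }

// pollConnStatus runs check once immediately and then once per interval,
// storing the result in g, until ctx is cancelled.
func pollConnStatus(ctx context.Context, interval time.Duration, check func() bool, g *connGauge) {
	g.Set(check())
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			g.Set(check())
		}
	}
}

func main() {
	g := &connGauge{}
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	// Hypothetical probe that always reports "connected".
	pollConnStatus(ctx, 10*time.Millisecond, func() bool { return true }, g)
	fmt.Println("connected gauge:", g.Value())
}
```

Because the gauge is refreshed less often than it is scraped, several consecutive scrapes can return the same cached value; that is the trade-off the commit message accepts to limit polling overhead.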
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Add metrics to observe core events related to EgressIP:

- name: egress_ips_assign_latency_seconds -- Histogram **
  desc: The latency of egress IP assignment to ovn nb database

- name: egress_ips_unassign_latency_seconds -- Histogram **
  desc: The latency of egress IP unassignment from ovn nb database

- name: egress_ips_node_unreachable_total -- Counter
  desc: The total number of times assigned egress IP(s) were unreachable

- name: egress_ips_rebalance_total -- Counter
  desc: The total number of times assigned egress IP(s) needed to be moved to a different node

Add flags to explicitly enable the histogram metrics, since we only see value
in having them when scale testing egress ips. The flag introduced here is:
    --metrics-enable-eip-scale

Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
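
A rough sketch of the flag gating, using only the standard library `flag` package. The metric names and the flag name come from the commit message; the `metricsToRegister` helper and the use of `flag` are assumptions made for illustration and do not reflect how the project actually parses its options.

```go
package main

import (
	"flag"
	"fmt"
)

// metricsToRegister returns the EgressIP metric names to register. The two
// counters are always present; the two histograms are added only when the
// scale flag is set.
func metricsToRegister(enableEIPScale bool) []string {
	names := []string{
		"egress_ips_node_unreachable_total",
		"egress_ips_rebalance_total",
	}
	if enableEIPScale {
		names = append(names,
			"egress_ips_assign_latency_seconds",
			"egress_ips_unassign_latency_seconds",
		)
	}
	return names
}

func main() {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	enable := fs.Bool("metrics-enable-eip-scale", false, "enable EgressIP histogram metrics")
	// Arguments are hard-coded here so the example is reproducible.
	if err := fs.Parse([]string{"--metrics-enable-eip-scale"}); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, n := range metricsToRegister(*enable) {
		fmt.Println(n)
	}
}
```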
This reverts commit 503de2f.

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
…ube-node"

This reverts commit 1a225ea.

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
…metric

OVN-K metrics: Add OVN controller southbound database connection
Bumps [@actions/core](https://github.com/actions/toolkit/tree/HEAD/packages/core) from 1.2.6 to 1.9.1.
- [Release notes](https://github.com/actions/toolkit/releases)
- [Changelog](https://github.com/actions/toolkit/blob/main/packages/core/RELEASES.md)
- [Commits](https://github.com/actions/toolkit/commits/HEAD/packages/core)

---
updated-dependencies:
- dependency-name: "@actions/core"
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Remove stale chassis for hosts that run ovnkube-node on DPU
This deletes any stale IP from the node allocations map before assignment,
so that an update to the same egress IP object assigns the new IP address
to an assignable node regardless of timing.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
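
The cleanup step could be sketched as below. The `allocations` layout (node name to egress IP to owning object name) and the `deleteStale` helper are hypothetical; the actual allocator data structures are not shown in this conversation.

```go
package main

import "fmt"

// allocations maps node name -> egress IP -> name of the owning EgressIP object.
// This layout is an assumption made for the example.
type allocations map[string]map[string]string

// deleteStale removes every allocation owned by eipName whose IP is no longer
// in the desired set, so that a later assignment starts from a clean state.
func deleteStale(a allocations, eipName string, desired []string) {
	want := make(map[string]bool, len(desired))
	for _, ip := range desired {
		want[ip] = true
	}
	for _, ips := range a {
		for ip, owner := range ips {
			if owner == eipName && !want[ip] {
				delete(ips, ip) // deleting during range is allowed for Go maps
			}
		}
	}
}

func main() {
	a := allocations{
		"node1": {"10.0.0.1": "eip-a", "10.0.0.2": "eip-b"},
	}
	// eip-a was updated from 10.0.0.1 to 10.0.0.9: drop the old entry first.
	deleteStale(a, "eip-a", []string{"10.0.0.9"})
	fmt.Println(len(a["node1"]), "allocation(s) left on node1")
}
```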
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
fedora Dockerfile : Switch to f36 and OVN 22.06.
…hub/actions/cleanup-action/actions/core-1.9.1

Bump @actions/core from 1.2.6 to 1.9.1 in /.github/actions/cleanup-action
libovsdb not being able to reset ACL severity to nil), use nil to reset.
Unify ACL logging setup (ovn/acl.go) so that ACL severity is properly
set and reset (severity is a *string, which is error-prone).
Add unified handling for ACLs whose logging is always disabled
(like arpAllowACL).
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
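
Why a `*string` severity is easy to get wrong can be seen in a small sketch. The `ACL` struct and `setACLLogging` helper below are hypothetical simplifications, not the code in ovn/acl.go: the point is that "no severity" is represented by nil rather than by an empty string, and a single helper keeps the `Log` and `Severity` fields in step.

```go
package main

import "fmt"

// ACL is a hypothetical, trimmed-down stand-in for an OVN NB ACL row.
type ACL struct {
	Log      bool
	Severity *string
}

// setACLLogging enables logging at the given level, or disables it and resets
// Severity to nil when level is empty.
func setACLLogging(acl *ACL, level string) {
	if level == "" {
		acl.Log = false
		acl.Severity = nil // reset with nil, not with a pointer to ""
		return
	}
	acl.Log = true
	acl.Severity = &level // level is a per-call copy, so the pointer is safe to keep
}

func main() {
	acl := &ACL{}
	setACLLogging(acl, "info")
	fmt.Println(acl.Log, *acl.Severity)

	setACLLogging(acl, "")
	fmt.Println(acl.Log, acl.Severity == nil)
}
```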
Add egress firewall external ID constant.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
namespace update handler will only update them if
len(nsInfo.networkPolicies) > 0, which is not the case during the first
ns policy creation. This can leave the log level of the default ACLs
not updated.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Signed-off-by: Pardhakeswar Pacha <ppacha@nvidia.com>
(cherry picked from commit 1a225ea)
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
(cherry picked from commit 503de2f)
The selector in the endpointslice informer now correctly chooses only endpointslices with a non-empty "kubernetes.io/service-name" label and without the "service.kubernetes.io/headless" label.

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
OCPBUGS-417: Fix informer selector for endpointslices
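
The selection rule can be written as a plain predicate over a label map. This is a standard-library-only sketch of the rule as stated above; the actual fix configures a label selector on the informer, which is not reproduced here, and the function name `selectsEndpointSlice` is hypothetical.

```go
package main

import "fmt"

const (
	serviceNameLabel = "kubernetes.io/service-name"
	headlessLabel    = "service.kubernetes.io/headless"
)

// selectsEndpointSlice reports whether an endpointslice with the given labels
// should be selected: the service-name label must be present and non-empty,
// and the headless label must be absent.
func selectsEndpointSlice(lbls map[string]string) bool {
	if lbls[serviceNameLabel] == "" {
		return false
	}
	_, headless := lbls[headlessLabel]
	return !headless
}

func main() {
	fmt.Println(selectsEndpointSlice(map[string]string{serviceNameLabel: "web"}))
	fmt.Println(selectsEndpointSlice(map[string]string{serviceNameLabel: "web", headlessLabel: ""}))
	fmt.Println(selectsEndpointSlice(map[string]string{serviceNameLabel: ""}))
}
```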
When the logical port is already deleted, deleteLogicalPort should
not return an error. Otherwise, the delete will be retried up to
maxFailedAttempts times (currently set to 15).

This issue was introduced in commit be8786a.

Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
Co-authored-by: Tim Rozet <trozet@redhat.com>
Reported-at: https://issues.redhat.com/browse/OCPBUGS-469
pods: deleteLogicalPort should not fail when port is already gone
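
The idempotent-delete idea can be sketched with a sentinel error. The `store` type, `errNotFound`, and `deletePort` below are hypothetical and only illustrate the pattern; they are not the libovsdb calls used by the real deleteLogicalPort.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel for "the port does not exist".
var errNotFound = errors.New("logical port not found")

// store is a toy stand-in for the database holding logical ports.
type store map[string]struct{}

func (s store) remove(name string) error {
	if _, ok := s[name]; !ok {
		return errNotFound
	}
	delete(s, name)
	return nil
}

// deletePort treats "already gone" as success, so callers do not retry a
// delete that has nothing left to do. Any other error is still returned.
func deletePort(s store, name string) error {
	err := s.remove(name)
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}

func main() {
	s := store{"pod-a": {}}
	fmt.Println(deletePort(s, "pod-a")) // first delete removes the port
	fmt.Println(deletePort(s, "pod-a")) // second delete is a no-op, not a failure
}
```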
Delete stale egress ip before assigning new ip
@trozet
Contributor

trozet commented Aug 25, 2022

/retest

@ricky-rav
Contributor Author

/payload 4.12 ci blocking
/payload 4.12 nightly blocking

@openshift-ci
Contributor

openshift-ci bot commented Aug 26, 2022

@ricky-rav: trigger 4 job(s) of type blocking for the ci release of OCP 4.12

  • periodic-ci-openshift-release-master-ci-4.12-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.12-upgrade-from-stable-4.11-e2e-azure-sdn-upgrade
  • periodic-ci-openshift-release-master-ci-4.12-e2e-aws-sdn-serial

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6b3a7870-250d-11ed-82de-9d24899b85bd-0

trigger 6 job(s) of type blocking for the nightly release of OCP 4.12

  • periodic-ci-openshift-release-master-nightly-4.12-e2e-aws-sdn-upgrade
  • periodic-ci-openshift-release-master-ci-4.12-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.12-upgrade-from-stable-4.11-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.12-e2e-aws-sdn-serial
  • periodic-ci-openshift-release-master-nightly-4.12-e2e-metal-ipi-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.12-e2e-metal-ipi-sdn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6b3a7870-250d-11ed-82de-9d24899b85bd-1

@pperiyasamy
Member

@ricky-rav could you please pick up the latest commit, ovn-org/ovn-kubernetes@a86eaf4, as well?

@ricky-rav
Contributor Author

/retest

@openshift-ci
Contributor

openshift-ci bot commented Aug 26, 2022

@ricky-rav: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-hypershift | 826f36f | link | false | /test e2e-hypershift |
| ci/prow/e2e-vsphere-windows | 826f36f | link | false | /test e2e-vsphere-windows |
| ci/prow/okd-e2e-gcp-ovn | 826f36f | link | false | /test okd-e2e-gcp-ovn |
| ci/prow/e2e-azure-ovn | 826f36f | link | false | /test e2e-azure-ovn |
| ci/prow/e2e-openstack-ovn | 826f36f | link | false | /test e2e-openstack-ovn |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@trozet
Contributor

trozet commented Aug 26, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 26, 2022
@openshift-ci
Contributor

openshift-ci bot commented Aug 26, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ricky-rav, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 26, 2022
@openshift-merge-robot openshift-merge-robot merged commit 9592807 into openshift:master Aug 26, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.