Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-24219,OCPBUGS-22221,OCPBUGS-25670: Downstream Merge 31st Jan 2024 #2038

Merged
merged 28 commits into from Feb 1, 2024

Conversation

tssurya
Copy link
Contributor

@tssurya tssurya commented Jan 31, 2024

npinaeva and others added 28 commits January 10, 2024 16:06
Policies do add hostNetwork pods to their own address sets. An attempt
to unify this behaviour ended up breaking customers relying on the
previous logic, so we have to keep it.
The logic on when to use namespace address sets was restored. It is
applied when podSelector is nil. That means that peer
```
to:
    - namespaceSelector: <ns selector>
```
won't include hostNetwork pods, while peer
```
to:
    - namespaceSelector: <ns selector>
      podSelector: {}
```
will.
You can find 2 new unit tests verifying this behaviour.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
We used to build new acl and update it to make it "stale". That means
we have to change stale acls generation with every current acl
format change.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
names.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
in a structure to avoid changing function calls for every new
data tweak.
Part 1: update policy ACLs builder.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
in a structure to avoid changing function calls for every new
data tweak.
Part 2: update defalt deny ACLs builder.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
OVN drops the first packet from a given client
for every new session which is undesirable and
a regression compared to how SDN behaves (it can
be argued that this is an implementation detail
but ultimately users want same behaviour and no
drops). This is a complicated fix and will take
time.
The OVN fix will take time and in K8s there is no
affinity without a timeout unfortunately. So in
OVNKube we will introduce permanent session affinity
as an alternative. Hence if user sets 86000 which is
the highest timeout value it would mean no timeout.
This is not how the K8s definition works, this will be
an ovnkube implementation detail.
There is no harm in changing what a 1 day timeout means
from it being a day to it being infinite affinity because
either ways OVN only supports upto 18hours of timeout value
max (UINT_MAX). So currently range 18-24 is not used and
is configured as 18 itself in OVN.
So from this change forward, if timeout is set to 86000
by the user, they will get permanent session affinity.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Trying to bringup a k8s cluster with kubeadm
with ovn-kubernetes repo needs the following
minimum changes to README.md.

Signed-off-by: Guru Shetty <gurus@nvidia.com>
currently if hybrid overlay is being disabled only the nodes annotations
get cleaned up. This will lead to erroneous entries in the ovn database
that will lead to potentially one pod on every node (the one allocated
the reroute address for hybrid overlay) to not being able to send/receive
network traffic.

Correctly clean up hybrid overlay ports and logical router policies on
node add. A unit test was amended to include this cleanup.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
The latest metallb skips nodes having label node.kubernetes.io/exclude-from-external-load-balancers
(particularly set on the control plane nodes) from advertising bgp routes for
external service.

But this breaks ovnk service related tests. Hence this commit removes the label
from those nodes after metallb deployment on kind cluster, so that these nodes
can also act as bgp speaker for external services.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
…ight

Update netpol namespace address sets usage to the old ways
Remove exclude external lb label from control-plane nodes
checkout@v3, cache@v3 and setup-go@v3 are using outdated Node.js 16
which is now deprecated in GHA [1], so these actions may stop working
soon.

Updating to most recent major versions with Node.js 20.  This stops GHA
from throwing warnings in every build.

[1] https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/

While at it also updating upload-artifact and download-artifact to
the latest versions.

Note: golangci/golangci-lint-action doesn't have a version based
      on Node.js 20 yet, so this warning will remain for now.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
These docker actions are using outdated Node.js 16 which is now
deprecated in GHA [1], so they may stop working soon.

Updating to most recent major versions with Node.js 20.  This stops
GHA from throwing warnings on every build.

[1] https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/

I'm not very sure why some of these were pinned to particular minor
releases, but it doesn't matter much, since we have to use the most
recent versions in order to get Node 20.  I'm also not sure why some
of these followed 'master' versions of the actions, presumably that
was because they didn't have support for certain features at a time
of introduction.  Pinning those to the most recent releases as well.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
'set-output' was deprecated [1] long ago and supposed to be completely
removed last summer, but GitHub is keeping it for now due to
widespread use.  It throws a warning on every build and GitHub may
decide to actually stop supporting these at any time.

Update the workflow to use a modern GITHUB_OUTPUT environment file
instead.

[1] https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
'skip-go-installation' is not supported since golangci-lint-action@v3.
It requires explicit setup-go action instead.

Remove the option to fix the warning printed on every run:

  Unexpected input(s) 'skip-go-installation', valid inputs are [
    'version', 'args', 'working-directory', 'github-token',
    'only-new-issues', 'skip-cache', 'skip-pkg-cache',
    'skip-build-cache', 'install-mode'
  ]

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
New setup-go@v5 attempts to cache dependencies by default.  However,
the default path it uses is go.sum in the root directory.  This
triggers a warning, since it doesn't exist:

  Restore cache failed: Dependencies file is not found in
  /home/runner/work/ovn-kubernetes/ovn-kubernetes.
  Supported file pattern: go.sum

Specify a path to all .sum files we have in the repository to make
the setup-go happy.  This should in theory make the builds a touch
faster.

Repo checkout moved before setup-go, so the paths are available.

Also removed the strange last_run_status logic.  There is no such
step anywhere.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Currently the condition for 'Set up Go' is using an output variable
MASTER_IMAGE_RESTORED that doesn't exist, i.e. the step
is_master_image_build_needed is never setting it up.  It sets two
other variables instead.  Also, all the other steps are checking
both of these variables making the condition prone to errors.

Removing the MASTER_IMAGE_RESTORED_FROM_GHCR and setting up
MASTER_IMAGE_RESTORED whenever any cache was hit to simplify the
logic and avoid setting up Go when no build is needed.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This job does nothing useful.  It sets up Go and downloads one
binary.  Filesystem is not preserved between jobs, so this is
practically useless, IMO.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
It downloads a kind binary from aojea's repo.  This binary is not
used since the later 'kind setup' step will install a proper official
version of kind.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When OVN-Kubernetes lists all datapaths with 'ovs-appctl dpctl/dump-dps',
it stores both a datapath's name and type. But it then continues to use
only each datapath's name when querying 'ovs-appctl dpctl/show'.

For system datapaths this works because 'ovs-appctl dpctl/show' will
assume the 'system' type if no type has been set. However, this breaks
when using other datapath types such as 'netdev', causing 'ovs-appctl'
to fail with:

  ovs-vswitchd: opening datapath ovs-netdev failed (No such device)

This change keeps using a datapath's name as an identifier for metrics
but includes the type when querying ovs-appctl.

Whether a datapath's name without a type is sufficient for identifying
datapaths uniquely in OpenShift and Open vSwitch is left for future
work.

Signed-off-by: Jakob Meng <code@jakobmeng.de>
github: Update all the deprecated actions and features.
correctly clean up hybrid-overlay on during node add
@tssurya tssurya requested a review from dcbw as a code owner January 31, 2024 07:12
@openshift-ci-robot openshift-ci-robot added the jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. label Jan 31, 2024
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Jan 31, 2024
@openshift-ci-robot
Copy link
Contributor

@tssurya: This pull request references Jira Issue OCPBUGS-24219, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

The bug has been updated to refer to the pull request using the external bug tracker.

This pull request references Jira Issue OCPBUGS-22221, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

@npinaeva @JacobTanenbaum @pperiyasamy

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 31, 2024
@npinaeva
Copy link
Member

/retitle OCPBUGS-24219,OCPBUGS-22221,OCPBUGS-25670: Downstream Merge 31st Jan 2024

@openshift-ci openshift-ci bot changed the title OCPBUGS-24219,OCPBUGS-22221: Downstream Merge 31st Jan 2024 OCPBUGS-24219,OCPBUGS-22221,OCPBUGS-25670: Downstream Merge 31st Jan 2024 Jan 31, 2024
@openshift-ci-robot
Copy link
Contributor

@tssurya: This pull request references Jira Issue OCPBUGS-24219, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

The bug has been updated to refer to the pull request using the external bug tracker.

This pull request references Jira Issue OCPBUGS-22221, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

The bug has been updated to refer to the pull request using the external bug tracker.

This pull request references Jira Issue OCPBUGS-25670, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

@npinaeva @JacobTanenbaum @pperiyasamy

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@tssurya
Copy link
Contributor Author

tssurya commented Jan 31, 2024

/retest

@pperiyasamy
Copy link
Member

bc05711 is LGTM

Copy link
Contributor

openshift-ci bot commented Jan 31, 2024

@tssurya: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/security 2ec69db link false /test security
ci/prow/okd-e2e-gcp-ovn 2ec69db link false /test okd-e2e-gcp-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@tssurya
Copy link
Contributor Author

tssurya commented Jan 31, 2024

/assign @trozet

@jcaamano
Copy link
Contributor

jcaamano commented Feb 1, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 1, 2024
Copy link
Contributor

openshift-ci bot commented Feb 1, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcaamano, tssurya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 0ef25f9 into openshift:master Feb 1, 2024
30 of 32 checks passed
@openshift-ci-robot
Copy link
Contributor

@tssurya: Jira Issue OCPBUGS-24219: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-24219 has been moved to the MODIFIED state.

Jira Issue OCPBUGS-22221: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-22221 has been moved to the MODIFIED state.

Jira Issue OCPBUGS-25670: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-25670 has been moved to the MODIFIED state.

In response to this:

@npinaeva @JacobTanenbaum @pperiyasamy

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-ovn-kubernetes-base-container-v4.16.0-202402011241.p0.g0ef25f9.assembly.stream for distgit ovn-kubernetes-base.
All builds following this will include this PR.

@openshift-merge-robot
Copy link
Contributor

Fix included in accepted release 4.16.0-0.nightly-2024-02-02-002725

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet