SDN-4168: Add IPsec e2e tests #28658

Merged
merged 6 commits into from
May 7, 2024

Conversation

pperiyasamy
Member

@pperiyasamy pperiyasamy commented Mar 15, 2024

This adds IPsec e2e tests that validate both the control plane and the data plane for east-west and north-south traffic scenarios.

Run IPsec tests with command: ./openshift-tests run openshift/network/ipsec

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 15, 2024
@openshift-ci openshift-ci bot requested review from bparees and knobunc March 15, 2024 15:25
@openshift-ci openshift-ci bot added the vendor-update Touching vendor dir or related files label Mar 15, 2024
@openshift-trt-bot

Job Failure Risk Analysis for sha: 571f86e

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd IncompleteTests
Tests for this run (26) are below the historical average (395): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

@pperiyasamy
Member Author

/assign @yuvalk @jcaamano

@openshift-trt-bot

Job Failure Risk Analysis for sha: 8f2cb8c

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-upgrade High
[sig-apps] job-upgrade
This test has passed 100.00% of 23 runs on jobs ['periodic-ci-openshift-release-master-ci-4.16-e2e-aws-ovn-upgrade'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-serial High

@openshift-trt-bot

Job Failure Risk Analysis for sha: 119a4ed

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6 IncompleteTests
Tests for this run (14) are below the historical average (1144): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn IncompleteTests
Tests for this run (17) are below the historical average (1491): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

@openshift-trt-bot

Job Failure Risk Analysis for sha: d10b3a1

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6 IncompleteTests
Tests for this run (99) are below the historical average (1149): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn-upgrade IncompleteTests
Tests for this run (27) are below the historical average (590): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn-rt-upgrade IncompleteTests
Tests for this run (27) are below the historical average (526): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

@pperiyasamy pperiyasamy force-pushed the ipsec-e2e-tests branch 3 times, most recently from 2583612 to e079638 Compare April 3, 2024 14:22
@pperiyasamy pperiyasamy changed the title [WiP] Add IPsec e2e tests Add IPsec e2e tests Apr 3, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 3, 2024
@openshift-trt-bot

Job Failure Risk Analysis for sha: e079638

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node Medium
[sig-node][invariant] alert/TargetDown should not be at or above info in ns/kube-system
This test has passed 84.78% of 46 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node'] in the last 14 days.

Open Bugs
Help needed understanding 4.16 upgrade duration increase on metal

@openshift-trt-bot

Job Failure Risk Analysis for sha: 587ab3c

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-serial IncompleteTests
Tests for this run (102) are below the historical average (781): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial Low
[sig-arch] events should not repeat pathologically for ns/openshift-etcd-operator
This test has passed 28.79% of 66 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days.

@pperiyasamy pperiyasamy force-pushed the ipsec-e2e-tests branch 2 times, most recently from d043f01 to 8a66229 Compare April 4, 2024 13:45
@pperiyasamy
Member Author

There are two issues to be addressed with the latest change.

  • Each IPsec test takes around 30-45 minutes to complete, and e2e-aws-ovn-serial expects the serial test suite to finish within 4 hours. As a result, some IPsec tests and other serial tests never get executed.
  • The north-south IPsec test fails on a 4.16 cluster because the IPsec config is not loaded via nmstate. This may be due to a libreswan bug; the same test runs fine on a 4.15 cluster. Working with @sabinaaledort on this.

@openshift-trt-bot

Job Failure Risk Analysis for sha: 8a66229

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-gcp-ovn-upgrade IncompleteTests
Tests for this run (15) are below the historical average (585): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn-rt-upgrade IncompleteTests
Tests for this run (15) are below the historical average (558): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn-builds IncompleteTests
Tests for this run (14) are below the historical average (599): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn IncompleteTests
Tests for this run (14) are below the historical average (1549): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-csi IncompleteTests
Tests for this run (14) are below the historical average (627): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-serial IncompleteTests
Tests for this run (29) are below the historical average (765): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd IncompleteTests
Tests for this run (14) are below the historical average (542): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

@pperiyasamy pperiyasamy changed the title Add IPsec e2e tests SDN-4168: Add IPsec e2e tests Apr 8, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 8, 2024
@openshift-ci-robot

openshift-ci-robot commented Apr 8, 2024

@pperiyasamy: This pull request references SDN-4168 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this:

This adds relevant ipsec e2e tests to validate both control plane and dataplane for both east west and north south traffic scenarios.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-trt-bot

Job Failure Risk Analysis for sha: d0e4f09

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-gcp-ovn High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 18 runs on jobs ['periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 37 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-fips High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 21 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-fips'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-cgroupsv2 High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 99.41% of 1182 runs on release 4.16 [Overall] in the last week.
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial Low
Undiagnosed panic detected in pod
This test has passed 72.73% of 22 runs on release 4.16 [amd64 aws ovn serial single-node] in the last week.

@@ -862,6 +864,20 @@ func areMachineConfigPoolsReadyWithIPsec(oc *exutil.CLI) (bool, error) {
return masterWithIPsec && workerWithIPsec, nil
}

func areMachineConfigPoolsReadyWithoutIPsec(oc *exutil.CLI) (bool, error) {
Contributor

There is something in this code that makes it very difficult to read. Some suggestions:

  • Remove method areWorkerMachineConfigPoolsReady, just as you don't have an areMasterMachineConfigPoolsReady
  • s/areMachineConfigPoolReadyWithoutMachineConfig/areMachineConfigPoolsReadyWithoutMachineConfig
  • s/isMachineConfigReadyInPools/areMachineConfigPoolsReadyWithMachineConfig
  • Avoid the mustExist flag. Have a getMachineConfigPoolsByLabel(oc *exutil.CLI, mcpSelectorLabel string), correctly handle MatchExpression there as well, and do something like
func areMasterMachineConfigPoolsWithIPsec(oc *exutil.CLI) (bool, error) {
	pools, err := getMachineConfigPoolsByLabel(...)
	if err != nil {
		return false, err
	}
	return areMachineConfigPoolsReadyWithMachineConfig(pools, ...)
}

and so on...

Member Author

This looks perfect and clean. Done.
Thanks @jcaamano .

@@ -193,22 +193,19 @@ func ensureIPsecDisabled(oc *exutil.CLI) error {
return false, err
}
}
if done {
Contributor

Would be good to follow the same order here and in ensureIPsecEnabled on the machine config and cluster operator checks.

Also, I wouldn't reuse ensureIPsecMachineConfigRolloutComplete in ensureIPsecEnabled. Just call areMachineConfigPoolsReadyWithIPsec directly.

Member Author

sure, done.

@@ -46,6 +46,7 @@ require (
golang.org/x/crypto v0.16.0
golang.org/x/net v0.19.0
golang.org/x/oauth2 v0.10.0
golang.org/x/sync v0.5.0
Contributor

I am still confused. Why is go mod vendor bringing in the machineconfig dependency if it is not referred from go.mod? Ideally, with no changes in go.mod, go mod vendor should not bring anything new? I am probably missing something.

I would keep doing the go mod tidy afterward, however.

@pperiyasamy pperiyasamy force-pushed the ipsec-e2e-tests branch 2 times, most recently from 3201250 to 125282d Compare May 3, 2024 16:08
@openshift-trt-bot

Job Failure Risk Analysis for sha: 125282d

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-gcp-ovn High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 25 runs on jobs ['periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 51 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-fips High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 30 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-fips'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd IncompleteTests
Tests for this run (25) are below the historical average (588): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6 Medium
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 97.83% of 46 runs on release 4.16 [amd64 ha metal-ipi ovn] in the last week.

This is needed to consume machine config APIs for IPsec related e2e tests.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
@jcaamano
Contributor

jcaamano commented May 6, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 6, 2024
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD ab28660 and 2 for PR HEAD accf68f in total

@pperiyasamy
Member Author

/retest-required

@openshift-trt-bot

Job Failure Risk Analysis for sha: accf68f

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-gcp-ovn High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 28 runs on jobs ['periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 46 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-cgroupsv2 High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 99.71% of 2054 runs on release 4.16 [Overall] in the last week.
pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd IncompleteTests
Tests for this run (25) are below the historical average (577): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

@openshift-trt-bot

Job Failure Risk Analysis for sha: accf68f

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6 High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 31 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ipi-ovn-ipv6'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-gcp-ovn High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 27 runs on jobs ['periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 46 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-fips High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 100.00% of 47 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-fips'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-cgroupsv2 High
[sig-network] external gateway address when using openshift ovn-kubernetes should match the address family of the pod [Suite:openshift/conformance/parallel]
This test has passed 99.71% of 2054 runs on release 4.16 [Overall] in the last week.
pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd IncompleteTests
Tests for this run (25) are below the historical average (588): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

This adds relevant ipsec e2e tests to validate both control plane
and dataplane for east west traffic scenario.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
This test applies IPsec configuration on two chosen worker nodes
via the IPsec north-south mechanism and ensures that traffic between
those two nodes is ESP encrypted.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
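
For context on what "ESP encrypted" means at the packet level, here is a minimal sketch that classifies a raw IPv4 packet. It is not how the tests in this PR inspect traffic; the helper names are invented. It relies on two well-known facts: ESP is IP protocol 50, and with NAT traversal ESP is carried in UDP on port 4500, where four zero bytes after the UDP header mark an IKE message rather than ESP. Fragmentation and IPv6 are ignored.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	protoUDP = 17
	protoESP = 50
	natTPort = 4500
)

// isESP reports whether an IPv4 packet carries ESP, either natively
// (IP protocol 50) or UDP-encapsulated on port 4500.
func isESP(pkt []byte) (bool, error) {
	if len(pkt) < 20 {
		return false, errors.New("packet shorter than a minimal IPv4 header")
	}
	if pkt[0]>>4 != 4 {
		return false, errors.New("not an IPv4 packet")
	}
	ihl := int(pkt[0]&0x0f) * 4 // header length in bytes
	if ihl < 20 || len(pkt) < ihl {
		return false, errors.New("invalid IPv4 header length")
	}
	switch pkt[9] { // protocol field
	case protoESP:
		return true, nil
	case protoUDP:
		if len(pkt) < ihl+8 {
			return false, errors.New("truncated UDP header")
		}
		src := binary.BigEndian.Uint16(pkt[ihl:])
		dst := binary.BigEndian.Uint16(pkt[ihl+2:])
		if src != natTPort && dst != natTPort {
			return false, nil
		}
		payload := pkt[ihl+8:]
		if len(payload) < 4 {
			return false, nil // e.g. a one-byte NAT keepalive
		}
		// A zero marker means IKE; a non-zero value is the ESP SPI.
		return binary.BigEndian.Uint32(payload) != 0, nil
	}
	return false, nil
}

// ipv4 builds a bare 20-byte IPv4 header (no checksum) plus payload.
func ipv4(proto byte, payload []byte) []byte {
	hdr := make([]byte, 20)
	hdr[0] = 0x45 // version 4, header length 5 words
	hdr[9] = proto
	return append(hdr, payload...)
}

// udp builds a bare UDP header (no length or checksum) plus payload.
func udp(srcPort, dstPort uint16, payload []byte) []byte {
	hdr := make([]byte, 8)
	binary.BigEndian.PutUint16(hdr[0:], srcPort)
	binary.BigEndian.PutUint16(hdr[2:], dstPort)
	return append(hdr, payload...)
}

func main() {
	native := ipv4(protoESP, []byte{0, 0, 0, 1})
	encap := ipv4(protoUDP, udp(natTPort, natTPort, []byte{0x12, 0x34, 0x56, 0x78}))
	tcp := ipv4(6, nil)
	for name, pkt := range map[string][]byte{"native": native, "udp-encap": encap, "tcp": tcp} {
		ok, err := isESP(pkt)
		fmt.Println(name, ok, err)
	}
}
```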
It is appropriate to generate certificates offline with certutil
and butane, then import those certificates into worker nodes via
machine config. Hence this commit makes use of such a machine
config to import certificates instead of doing it from a pod.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
This commit adds and updates relevant tests for validating node traffic
for both east west and north south IPsec configurations.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
This commit has fixes for issues found with ipsec tests while testing
with a 4.16 cluster running in AWS.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label May 6, 2024
@pperiyasamy
Member Author

/retest

@jcaamano
Contributor

jcaamano commented May 7, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 7, 2024
Contributor

openshift-ci bot commented May 7, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin, jcaamano, pperiyasamy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pperiyasamy
Member Author

/test e2e-aws-ovn-serial

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 15c3521 and 2 for PR HEAD 06e240d in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 75f7e06 and 1 for PR HEAD 06e240d in total

Contributor

openshift-ci bot commented May 7, 2024

@pperiyasamy: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-single-node-upgrade 06e240d link false /test e2e-aws-ovn-single-node-upgrade

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 539491f into openshift:master May 7, 2024
22 of 23 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. vendor-update Touching vendor dir or related files
9 participants