SDN-4384: Move to new IPsec API for >=4.15 #50690

Closed
wants to merge 1 commit into from

Conversation

pperiyasamy
Member

This aligns with the CNO API changes for enabling IPsec and supporting N-S (north-south) traffic.

(cherry picked from commit a54aaca)
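
For context, here is a minimal sketch of the version gate the PR title implies. It assumes that the "new IPsec API" is the `ipsecConfig.mode` field accepted by the cluster network operator configuration in OpenShift 4.15 and later, and that earlier releases enable IPsec with an empty `ipsecConfig` object. The diff is not shown in this conversation, so the function names and the returned shape are illustrative only.

```python
def parse_minor(version: str) -> tuple:
    """Turn a release string such as '4.15' or '4.15.0' into (4, 15)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def ipsec_config_for(version: str) -> dict:
    """Return a hypothetical ipsecConfig stanza for the given release.

    Assumption: releases >= 4.15 take an explicit mode, while older
    releases take an empty object.
    """
    if parse_minor(version) >= (4, 15):
        return {"ipsecConfig": {"mode": "Full"}}
    return {"ipsecConfig": {}}
```

Comparing integer tuples rather than raw strings avoids the lexical pitfall where "4.9" would sort after "4.15".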

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 8, 2024
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 8, 2024

@pperiyasamy: This pull request references SDN-4384 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this:

this is to align with CNO API changes on how to enable ipsec and support N-S

(cherry picked from commit a54aaca)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@pperiyasamy: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-svt-master-reliability-v2-azure-4.15-nightly-x86-reliability-v2-10h | openshift/svt | presubmit | Registry content changed |
| pull-ci-openshift-svt-master-reliability-v2-azure-4.15-nightly-x86-reliability-v2-1h | openshift/svt | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.17-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.16-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.15-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.14-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.13-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.12-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.11-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.10-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.9-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.8-e2e-metal-ipi-ovn-ipv6-ipsec | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-master-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.17-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.16-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.15-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.14-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.13-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.12-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.11-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.10-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.9-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.8-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-network-operator-release-4.7-e2e-ovn-ipsec-step-registry | openshift/cluster-network-operator | presubmit | Registry content changed |

A total of 205 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here
Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse abort to abort all active rehearsals

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.
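
The command list above can be summarized as a small classifier. This is an illustrative sketch only: the limits (5, 10, 25) and command words come from the bot comment, while the function name, constants, and return shape are hypothetical and not taken from the pj-rehearse plugin itself.

```python
# Limits as listed in the bot comment; names here are illustrative.
DEFAULT_LIMIT = 5
RUN_LIMITS = {"more": 10, "max": 25}
CONTROL_ACTIONS = {"skip", "ack", "reject", "abort"}


def parse_pj_rehearse(comment: str):
    """Classify a PR comment as (action, detail).

    Returns None when the comment is not a /pj-rehearse command.
    """
    words = comment.strip().split()
    if not words or words[0] != "/pj-rehearse":
        return None
    args = words[1:]
    if not args:
        return ("run", DEFAULT_LIMIT)
    if len(args) == 1:
        arg = args[0]
        if arg in RUN_LIMITS:
            return ("run", RUN_LIMITS[arg])
        if arg == "auto-ack":
            # Runs the default number of rehearsals and acks on success.
            return ("auto-ack", DEFAULT_LIMIT)
        if arg in CONTROL_ACTIONS:
            return (arg, None)
    # Anything else is treated as a space-separated list of test names.
    return ("run-named", args)
```

For example, `parse_pj_rehearse("/pj-rehearse max")` yields `("run", 25)`, and a comment naming one or more jobs yields `("run-named", [...])` with the job names in order.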

@openshift-ci openshift-ci bot requested review from dougbtv and pliurh April 8, 2024 10:17
Contributor

openshift-ci bot commented Apr 8, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pperiyasamy
Once this PR has been reviewed and has the lgtm label, please assign dcbw for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pperiyasamy
Member Author

/pj-rehearse

@openshift-ci-robot
Contributor

@pperiyasamy: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@pperiyasamy
Member Author

The ci/rehearse/openshift/svt/master/reliability-v2-azure-4.15-nightly-x86-reliability-v2-1h CI lane fails because the ovn-ipsec-host pods on two of the 12 nodes are crashlooping with the error below, even though the machine configs rolled out correctly.

2024-04-08T15:21:43.064202187Z + counter=0
2024-04-08T15:21:43.064202187Z + '[' -f /etc/cni/net.d/10-ovn-kubernetes.conf ']'
2024-04-08T15:21:43.064360389Z ovnkube-node has configured node.
2024-04-08T15:21:43.064374189Z + echo 'ovnkube-node has configured node.'
2024-04-08T15:21:43.064374189Z + pgrep pluto
2024-04-08T15:21:43.077693637Z + echo 'pluto is not running, enable the service and/or check system logs'
2024-04-08T15:21:43.077745538Z pluto is not running, enable the service and/or check system logs
2024-04-08T15:21:43.077758338Z + exit 2

Running the test again to see if the issue is consistent.
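
To check whether later runs hit the same signature, a small log scanner can help. The sketch below assumes pod logs in the format shown above (a timestamp ending in `Z`, then either a bash xtrace line starting with `+` or the script's own output); the function name and return shape are hypothetical.

```python
import re

# Each line starts with a timestamp ending in 'Z'; xtrace lines then begin with '+'.
LINE_RE = re.compile(r"^(\S+Z)\s+(.*)$")
FAILURE_TEXT = "pluto is not running"


def find_pluto_failure(log_text: str):
    """Return (timestamp, exit_code) for the first pluto failure report, else None.

    The exit code comes from a later '+ exit N' xtrace line, or is None
    if no such line follows the failure message.
    """
    failure_ts = None
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        timestamp, message = match.groups()
        is_trace = message.startswith("+")
        if failure_ts is None:
            # Match the script's real output, not the xtrace echo of the command.
            if not is_trace and FAILURE_TEXT in message:
                failure_ts = timestamp
        elif is_trace:
            exit_match = re.fullmatch(r"\+ exit (\d+)", message)
            if exit_match:
                return failure_ts, int(exit_match.group(1))
    return (failure_ts, None) if failure_ts else None
```

Run against the excerpt above, this should report the timestamp of the `pluto is not running` line together with exit code 2.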

@pperiyasamy
Member Author

/retest

@pperiyasamy
Member Author

/assign @yuvalk @jcaamano @jluhrsen

@pperiyasamy
Member Author

/pj-rehearse pull-ci-openshift-svt-master-reliability-v2-azure-4.15-nightly-x86-reliability-v2-1h

@openshift-ci-robot
Contributor

@pperiyasamy: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@pperiyasamy
Member Author

/pj-rehearse pull-ci-openshift-qe-ocp-qe-perfscale-ci-main-azure-4.15-nightly-x86-data-path-ipsec-9nodes

@openshift-ci-robot
Contributor

@pperiyasamy: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@pperiyasamy
Member Author

/retest

@pperiyasamy
Member Author

/pj-rehearse

@openshift-ci-robot
Contributor

@pperiyasamy: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@pperiyasamy
Member Author

The ci/rehearse/openshift/svt/master/reliability-v2-azure-4.15-nightly-x86-reliability-v2-1h CI lane fails because the ovn-ipsec-host pods on two of the 12 nodes are crashlooping with the error below, even though the machine configs rolled out correctly.

2024-04-08T15:21:43.064202187Z + counter=0
2024-04-08T15:21:43.064202187Z + '[' -f /etc/cni/net.d/10-ovn-kubernetes.conf ']'
2024-04-08T15:21:43.064360389Z ovnkube-node has configured node.
2024-04-08T15:21:43.064374189Z + echo 'ovnkube-node has configured node.'
2024-04-08T15:21:43.064374189Z + pgrep pluto
2024-04-08T15:21:43.077693637Z + echo 'pluto is not running, enable the service and/or check system logs'
2024-04-08T15:21:43.077745538Z pluto is not running, enable the service and/or check system logs
2024-04-08T15:21:43.077758338Z + exit 2

Running the test again to see if the issue is consistent.

The issue showed up again in the latest run, https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/50690/rehearse-50690-pull-ci-openshift-qe-ocp-qe-perfscale-ci-main-azure-4.15-nightly-x86-control-plane-ipsec-24nodes/1780216294851743744. It seems we may have to revisit the isIPsecMachineConfigActive logic in CNO. I created a bug, https://issues.redhat.com/browse/OCPBUGS-32347, to track this.

@jluhrsen
Contributor

/test e2e-aws-ovn-ipsec

Contributor

openshift-ci bot commented Apr 17, 2024

@jluhrsen: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test app-ci-config-dry
  • /test boskos-config
  • /test boskos-config-generation
  • /test build-clusters
  • /test build01-dry
  • /test build02-dry
  • /test build03-dry
  • /test build04-dry
  • /test build05-dry
  • /test build09-dry
  • /test build10-dry
  • /test check-gh-automation
  • /test check-gh-automation-tide
  • /test ci-operator-config
  • /test ci-operator-config-metadata
  • /test ci-operator-registry
  • /test ci-secret-bootstrap-config-validation
  • /test ci-testgrid-allow-list
  • /test clusterimageset-validate
  • /test config
  • /test core-valid
  • /test deprecate-templates
  • /test generated-config
  • /test generated-dashboards
  • /test hosted-mgmt-dry
  • /test jira-lifecycle-config
  • /test openshift-image-mirror-mappings
  • /test ordered-prow-config
  • /test owners
  • /test pr-reminder-config
  • /test prow-config
  • /test prow-config-filenames
  • /test prow-config-semantics
  • /test pylint
  • /test release-config
  • /test release-controller-config
  • /test secret-generator-config-valid
  • /test services-valid
  • /test stackrox-stackrox-stackrox-stackrox-check
  • /test step-registry-metadata
  • /test step-registry-shellcheck
  • /test sync-rover-groups
  • /test vsphere02-dry
  • /test yamllint

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-release-master-build-clusters
  • pull-ci-openshift-release-master-ci-operator-config
  • pull-ci-openshift-release-master-ci-operator-registry
  • pull-ci-openshift-release-master-core-valid
  • pull-ci-openshift-release-master-deprecate-templates
  • pull-ci-openshift-release-master-owners
  • pull-ci-openshift-release-master-release-controller-config
  • pull-ci-openshift-release-master-step-registry-metadata
  • pull-ci-openshift-release-master-step-registry-shellcheck
  • pull-ci-openshift-release-openshift-image-mirror-mappings
  • pull-ci-openshift-release-yamllint

In response to this:

/test e2e-aws-ovn-ipsec

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jluhrsen
Contributor

/pj-rehearse pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-ipsec

@openshift-ci-robot
Contributor

@jluhrsen: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

@jluhrsen: job(s): pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-ipsec either don't exist or were not found to be affected, and cannot be rehearsed

@jluhrsen
Contributor

/pj-rehearse pull-ci-openshift-cluster-network-operator-master-e2e-ovn-ipsec-step-registry

@openshift-ci-robot
Contributor

@jluhrsen: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@pperiyasamy
Member Author

pperiyasamy commented Apr 18, 2024

Thanks Jamo. This run now hits an issue with the IPsec configuration on a particular worker node, which might be another reason why the cluster operators are not coming up. I created a Jira issue, OCPBUGS-32402, to track this.

@huiran0826
Contributor

/pj-rehearse pull-ci-openshift-cluster-network-operator-master-e2e-ovn-ipsec-step-registry

@openshift-ci-robot
Contributor

@huiran0826: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@huiran0826
Contributor

/pj-rehearse pull-ci-openshift-qe-ocp-qe-perfscale-ci-main-azure-4.15-nightly-x86-control-plane-ipsec-24nodes

@openshift-ci-robot
Contributor

@huiran0826: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Contributor

openshift-ci bot commented Apr 22, 2024

@pperiyasamy: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@pperiyasamy
Member Author

Closing this in favor of #50740.

@pperiyasamy pperiyasamy closed this May 6, 2024
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.