METAL-897: Use nmcli instead of legacy network scripts #1631

elfosardo · 2024-02-06T17:03:50Z

No description provided.

elfosardo · 2024-02-07T14:52:38Z

/retest
ofcir failure

elfosardo · 2024-02-07T16:10:10Z

/retest

elfosardo · 2024-02-08T09:18:11Z

/retest

elfosardo · 2024-02-08T13:44:17Z

/retest

elfosardo · 2024-02-08T13:57:31Z

/retest
galaxy error, interesting!

elfosardo · 2024-02-08T17:45:58Z

/retest

elfosardo · 2024-02-08T18:13:23Z

/retest

elfosardo · 2024-02-09T08:36:00Z

/retest

elfosardo · 2024-02-09T10:01:03Z

/retest

elfosardo · 2024-02-09T12:00:36Z

/retest
more ansible galaxy error, scary

elfosardo · 2024-02-09T13:19:55Z

/retest

mkowalski · 2024-02-13T10:52:50Z

Hey, the only thing that concerns me (though the concern may be completely invalid): how am I supposed to "upgrade" after this commit merges? Should I run make clean (or make realclean) before pulling from master and only then use it, or does it not matter?

I feel that without a clean before git pull I may end up with unwanted stuff in my /etc, but honestly I'm not sure how this is handled.

What I am trying to say: maybe in host_cleanup.sh we should keep a non-failing sudo rm -f /etc/sysconfig/network-scripts/ifcfg-[...] to handle systems that used old dev-scripts in the past?

elfosardo · 2024-02-13T13:32:28Z

@mkowalski that sounds like a good idea, I'll update the PR

elfosardo · 2024-02-14T12:58:36Z

/retest
CI is not really ok at the moment

elfosardo · 2024-02-14T13:33:18Z

/retest

mkowalski · 2024-02-14T13:50:25Z

host_cleanup.sh

+if [ -e /etc/sysconfig/network-scripts/ifcfg-${INT_IF} ]; then
+    sudo rm -f /etc/sysconfig/network-scripts/ifcfg-${INT_IF}
+fi
+


If you add systemctl restart network.service || true here, you will allow people to keep using the master branch on Stream 8 hosts bootstrapped with very old dev-scripts.

we can add this in a follow up as part of cleaning
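A minimal sketch of how the two suggestions above could fit together, and not the code that was merged in this PR. The function name cleanup_legacy_ifcfg and the NETWORK_SCRIPTS_DIR and SUDO variables are assumptions introduced here so the snippet can be tried against a scratch directory; only INT_IF, the ifcfg path, and the systemctl restart network.service || true idea come from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT the merged host_cleanup.sh code.
# NETWORK_SCRIPTS_DIR and SUDO are assumed knobs for trying this out safely.
NETWORK_SCRIPTS_DIR="${NETWORK_SCRIPTS_DIR:-/etc/sysconfig/network-scripts}"
SUDO="${SUDO-sudo}"

cleanup_legacy_ifcfg() {
    local iface="$1"
    local cfg="${NETWORK_SCRIPTS_DIR}/ifcfg-${iface}"

    # Hosts that never used the legacy network scripts have nothing to clean.
    [ -e "${cfg}" ] || return 0

    # Remove the leftover legacy configuration for this interface.
    ${SUDO} rm -f "${cfg}"

    # network.service may not exist on newer hosts, so a failed restart
    # is deliberately ignored (the "|| true" suggested in the review).
    ${SUDO} systemctl restart network.service || true
}

# Example call, using the variable name from the quoted diff:
#   cleanup_legacy_ifcfg "${INT_IF}"
```

The restart only runs when a legacy file was actually found, so hosts bootstrapped after the switch to nmcli are left untouched.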

mkowalski · 2024-02-14T13:50:48Z

/lgtm
Whenever CI passes, good to go

elfosardo · 2024-02-14T15:17:45Z

/retest

elfosardo · 2024-02-14T15:25:35Z

/retest
ansible galaxy issue

elfosardo · 2024-02-14T15:42:01Z

/retest

elfosardo · 2024-02-14T16:52:26Z

/retest

elfosardo · 2024-02-15T07:46:23Z

/retest

elfosardo · 2024-02-15T09:52:13Z

/retest

elfosardo · 2024-02-15T13:01:11Z

/retest

elfosardo · 2024-02-15T16:27:44Z

/retest
failure is not related to this change

elfosardo · 2024-02-16T08:44:10Z

/retest

elfosardo · 2024-02-16T13:36:02Z

/retest
wow CI is so foobar at the moment

mkowalski · 2024-02-16T16:34:17Z

I am not sure if this error is really important here. I looked, and the cluster deploys, but something somewhere fails afterwards:

 INFO[2024-02-16T15:01:57Z] Step e2e-metal-ipi-bm-baremetalds-devscripts-setup succeeded after 1h17m5s. 
INFO[2024-02-16T15:01:57Z] Step phase pre succeeded after 1h19m20s.     
INFO[2024-02-16T15:01:57Z] Running multi-stage phase test               
INFO[2024-02-16T15:01:57Z] Running step e2e-metal-ipi-bm-baremetalds-e2e-test. 
INFO[2024-02-16T16:17:19Z] Logs for container test in pod e2e-metal-ipi-bm-baremetalds-e2e-test: 
INFO[2024-02-16T16:17:19Z] time="2024-02-16T16:11:40Z" level=info msg="processed event" event="{{ } {foo-crd.17b463c4bc2ffd43  e2e-horizontal-pod-autoscaling-6430  ed339db1-94bc-4127-9842-38a3ebf7f32d 258089 0 2024-02-16 16:10:55 +0000 UTC <nil> <nil> map[] map[monitor.openshift.io/observed-recreation-count: monitor.openshift.io/observed-update-count:1] [] [] [{kube-controller-manager Update v1 2024-02-16 16:11:40 +0000 UTC FieldsV1 {\"f:count\":{},\"f:firstTimestamp\":{},\"f:involvedObject\":{},\"f:lastTimestamp\":{},\"f:message\":{},\"f:reason\":{},\"f:reportingComponent\":{},\"f:source\":{\"f:component\":{}},\"f:type\":{}} }]} {HorizontalPodAutoscaler e2e-horizontal-pod-autoscaling-6430 foo-crd a2e65cc7-43f1-4f19-a3bf-7a965e1ceb46 autoscaling/v2 257669 } FailedGetResourceMetric failed to get cpu utilization: did not receive metrics for targeted pods (pods might be unready) {horizontal-pod-autoscaler } 2024-02-16 16:10:55 +0000 UTC 2024-02-16 16:11:40 +0000 UTC 4 Warning 0001-01-01 00:00:00 +0000 UTC nil  nil horizontal-pod-autoscaler }" 

[...]

 Cleaning up.
found errors fetching in-cluster data: [failed to list files in disruption event folder on node host2.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host3.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host4.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host5.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host6.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource] 

[...]

 Failing tests:
[sig-cli] oc adm node-logs [Suite:openshift/conformance/parallel]
environment: line 123:   320 Killed                  openshift-tests run "${TEST_SUITE}" ${TEST_ARGS:-} --provider "${TEST_PROVIDER:-}" -o "${ARTIFACT_DIR}/e2e.log" --junit-dir "${ARTIFACT_DIR}/junit"
++ date +%s
+ echo 1708100239
{"component":"entrypoint","error":"wrapped process failed: exit status 137","file":"k8s.io/test-infra/prow/entrypoint/run.go:84","func":"k8s.io/test-infra/prow/entrypoint.Options.internalRun","level":"error","msg":"Error executing test process","severity":"error","time":"2024-02-16T16:17:19Z"}
error: failed to execute wrapped command: exit status 137 
INFO[2024-02-16T16:17:19Z] Step e2e-metal-ipi-bm-baremetalds-e2e-test failed after 1h15m22s.

I can't see how this change would suddenly make the cluster fail conformance (if it really failed) without breaking the installation.
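As a side note on the exit status 137 in the log above: shells report a process terminated by a signal as 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL), which matches the "Killed" line. The helper below is a hypothetical sketch for decoding such statuses; the name describe_exit_status is an assumption, and the log does not say why the test process was killed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: describe a shell-style exit status.
# Assumption: a status above 128 means "terminated by signal (status - 128)".
# A program can also exit with such a value on its own, so treat the
# result as a hint rather than proof.
describe_exit_status() {
    local status="$1"
    if [ "${status}" -gt 128 ]; then
        local signum=$(( status - 128 ))
        # kill -l translates a signal number into its name (without "SIG").
        echo "killed by signal ${signum} (SIG$(kill -l "${signum}"))"
    else
        echo "exited with status ${status}"
    fi
}

describe_exit_status 137   # prints "killed by signal 9 (SIGKILL)"
```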

elfosardo · 2024-02-20T08:31:10Z

@mkowalski thank you for checking that
it's weird that the error is showing up now as the CI was 100% passing last week, so I don't think the issue is due to this change
I'm going to retest once more and see

elfosardo · 2024-02-20T08:31:15Z

/retest

elfosardo · 2024-02-20T13:01:36Z

/retest
yet another unrelated failure

elfosardo · 2024-02-21T08:44:37Z

/retest

elfosardo · 2024-02-21T10:51:37Z

/retest

elfosardo · 2024-02-21T13:42:48Z

/retest

elfosardo · 2024-02-21T13:55:33Z

/retest

derekhiggins · 2024-02-22T14:09:42Z

/approve
tested on CS9 with both ipv4 and ipv6

openshift-ci · 2024-02-22T14:09:48Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekhiggins

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [derekhiggins]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

elfosardo changed the title ~~Use nmcli instead of legacy network scripts~~ [WIP] Use nmcli instead of legacy network scripts Feb 6, 2024

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2024

openshift-ci bot requested review from bfournie and cybertron February 6, 2024 17:04

elfosardo force-pushed the use-nmcli branch 7 times, most recently from 1473b94 to a5989d6 Compare February 7, 2024 14:29

elfosardo force-pushed the use-nmcli branch from a5989d6 to 900f225 Compare February 8, 2024 14:17

elfosardo changed the title ~~[WIP] Use nmcli instead of legacy network scripts~~ METAL-897: Use nmcli instead of legacy network scripts Feb 9, 2024

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 9, 2024

elfosardo force-pushed the use-nmcli branch from 900f225 to db237f1 Compare February 9, 2024 09:19

elfosardo force-pushed the use-nmcli branch 2 times, most recently from 4adb9b2 to 3b4db99 Compare February 13, 2024 13:42

Use nmcli instead of legacy network scripts

5f7a197

elfosardo force-pushed the use-nmcli branch from 3b4db99 to 5f7a197 Compare February 14, 2024 09:42

mkowalski reviewed Feb 14, 2024

View reviewed changes

openshift-ci bot assigned mkowalski Feb 14, 2024

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 14, 2024

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 22, 2024

openshift-merge-bot bot merged commit 5f47779 into openshift-metal3:master Feb 22, 2024
12 checks passed
