OSSM-3276: Add metallb installer for clusters without external IPs #580

jewertow · 2023-05-29T14:35:25Z

No description provided.

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

openshift-ci · 2023-05-29T14:35:28Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

pkg/metallb/install.go

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

luksa · 2023-05-30T09:48:30Z

pkg/util/shell/shell_helper.go

@@ -34,7 +34,7 @@ func ExecuteWithEnvAndInput(t test.TestHelper, env []string, cmd string, input s
 	t.T().Helper()
 	output, err := execShellCommand(cmd, env, input)
 	if err != nil {
-		t.Fatalf("Command failed: %s\n%serror: %s", cmd, appendNewLine(output), err)
+		t.Logf("Command failed: %s\n%serror: %s", cmd, appendNewLine(output), err)


This now prevents tests from failing when the command fails. If you need to check the command's output even if the command fails, we need to find a better way to achieve this. We do have some other places where we expect the command to fail, so this is clearly a valid use-case. We just need to find a good way to tell this function whether to fail on command failures or not.

This command is useless for now. I don't understand its purpose. It's not the first time I've struggled with fatal errors that cannot be retried.

Why would it be useless? It's being used in basically all tests. In 99% of cases, if the command fails, the test should fail immediately and not on the next assertion. Not failing fast is one of the worst things in software, as it almost always leads to a lot of wasted time.

Because for some errors kubectl returns an error and for others it does not. For example, when I execute kubectl -n metallb-system... and the namespace does not exist, I get a fatal error. If the namespace exists but the deployment does not, I don't get an error. 🤯

I am not able to verify all the tests for now, so I will revert this change and will use exec.Command.

=== Failed
=== FAIL: pkg/tests/ossm-federation (0.00s)
    test_helper_setup.go:60: Check if MetalLB operator already exists
    test_helper_setup.go:64: FATAL: Command failed: oc -n metallb-system get deployments/metallb-operator-controller-manager
        Error from server (NotFound): namespaces "metallb-system" not found
        error: exit status 1
panic: FailNow

You can use the shell.Execute("command || true") pattern for now.
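Editor's note: the "command || true" workaround can be illustrated outside the test suite. The sketch below is not the suite's shell.Execute; run is a hypothetical helper that shells out through sh -c, and it only shows why the fatal path is no longer hit once the shell itself exits with status 0.

```go
package main

import (
	"fmt"
	"os/exec"
)

// run is a hypothetical stand-in for a shell helper: it executes cmd through
// "sh -c" and returns the combined stdout/stderr plus the error, if any.
func run(cmd string) (string, error) {
	out, err := exec.Command("sh", "-c", cmd).CombinedOutput()
	return string(out), err
}

func main() {
	// A failing command surfaces as a non-nil error (exit status 1).
	_, err := run("echo 'not found' >&2; exit 1")
	fmt.Println("without || true, failed:", err != nil)

	// With "|| true" the shell exits with 0, so the caller sees no error
	// but still receives whatever the failing command printed.
	out, err := run("(echo 'not found' >&2; exit 1) || true")
	fmt.Println("with || true, failed:", err != nil)
	fmt.Printf("captured output: %q\n", out)
}
```

Note that the pattern hides every failure, not only NotFound: a connection or authentication error is swallowed in the same way, so the caller has to inspect the output to tell the cases apart.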

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

pkg/metallb/install.go

fjglira

LGTM. My only question: will this operator version work with OCP versions 4.9 through 4.13? Or do we need to add something to validate that it works against every supported version first?

luksa · 2023-05-30T16:02:30Z

pkg/metallb/install.go

+func checkIfMetalLbOperatorExists(t test.TestHelper) bool {
+	t.Log("Check if MetalLB operator already exists")
+	// pattern "cmd || true" is used to avoid getting fatal error
+	output := shell.Execute(t, fmt.Sprintf("oc get deployments -n %s metallb-operator-controller-manager || true", ns.MetalLB))


Suggested change

-	output := shell.Execute(t, fmt.Sprintf("oc get deployments -n %s metallb-operator-controller-manager || true", ns.MetalLB))
+	output := shell.Executef(t, "oc get deployments -n %s metallb-operator-controller-manager || true", ns.MetalLB)

Replaced with oc.ResourceExists().

luksa · 2023-05-30T16:02:42Z

pkg/metallb/install.go

+func checkIfMetalLbControllerExists(t test.TestHelper) bool {
+	t.Log("Check if MetalLB controller already exists")
+	// pattern "cmd || true" is used to avoid getting fatal error
+	output := shell.Execute(t, fmt.Sprintf("oc get deployments -n %s controller || true", ns.MetalLB))


Suggested change

-	output := shell.Execute(t, fmt.Sprintf("oc get deployments -n %s controller || true", ns.MetalLB))
+	output := shell.Executef(t, "oc get deployments -n %s controller || true", ns.MetalLB)

Replaced with oc.ResourceExists().

luksa · 2023-05-30T16:02:51Z

pkg/metallb/install.go

+func checkIfIPAddressPoolExists(t test.TestHelper) bool {
+	t.Log("Check if MetalLB controller already exists")
+	// pattern "cmd || true" is used to avoid getting fatal error
+	output := shell.Execute(t, fmt.Sprintf("oc get ipaddresspools -n %s worker-internal-ips || true", ns.MetalLB))


Suggested change

-	output := shell.Execute(t, fmt.Sprintf("oc get ipaddresspools -n %s worker-internal-ips || true", ns.MetalLB))
+	output := shell.Executef(t, "oc get ipaddresspools -n %s worker-internal-ips || true", ns.MetalLB)

Replaced with oc.ResourceExists().

luksa · 2023-05-30T16:06:39Z

pkg/metallb/install.go

+	oc.ApplyString(t, ns.MetalLB, metallbOperator)
+	retry.UntilSuccess(t, func(t test.TestHelper) {
+		// pattern "cmd || true" is used to avoid getting fatal error
+		shell.Execute(t, fmt.Sprintf("oc get deployments -n %s metallb-operator-controller-manager || true", ns.MetalLB),


This won't work properly when the oc passed to this function is not the default oc.

And this is exactly why I said in the other PR that using shell.Execute to run a custom oc command is not okay.

Yeah... I didn't catch this, because I was testing the missing operator, MetalLB and IPAddressPool cases on the cluster that uses the default kubeconfig.

I will have to pass kubeconfigs to each of these commands.

How about creating oc.ResourceExists(t, ns, kind, name)? Seems like we need this in several places.

jewertow · 2023-05-30T16:08:06Z

Will this operator version work with OCP versions 4.9 through 4.13?

I guess it will work only on 4.12... I have only tested it on 4.12 so far, but I am aware of this issue. I'm going to check the other OCP versions tomorrow, because I have to download crc 4.9-4.13 and check which metallb versions are available. Then I will add startingCSV: {{ .MetalLbCsv }} in the metallb-operator.yaml

luksa · 2023-05-30T17:10:03Z

BTW why do we even need a LoadBalancer for gateway api tests? Can't we use the cluster IP of the service created by the Gateway?

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

jewertow · 2023-05-31T20:41:31Z

BTW why do we even need a LoadBalancer for gateway api tests?

Because the deployment controller creates LoadBalancer services for Gateway objects by default.

Can't we use the cluster IP of the service created by the Gateway?

I think we can, but we need a load balancer anyway. I also believe that LoadBalancer services are more future-proof, as the router will be deprecated at some point in the future, so it's worth testing.

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

fjglira · 2023-06-01T09:48:41Z

pkg/metallb/install.go

+	oc.ApplyTemplateString(t, ns.MetalLB, metallbOperator, map[string]string{"Version": metallbVersion})
+	retry.UntilSuccess(t, func(t test.TestHelper) {
+		if !oc.ResourceExists(t, ns.MetalLB, "deployments", "metallb-operator-controller-manager") {
+			t.Log("metallb-operator-controller-manager not found - waiting until exists")


For this retry to work, this needs to be t.Fatal instead of t.Log, because the retry needs a failure to trigger another attempt. Right now it only logs the message and continues with the rest of the steps.

Oh, right. Thanks!

fjglira · 2023-06-01T09:49:43Z

pkg/metallb/install.go

+	oc.ApplyString(t, ns.MetalLB, metallb)
+	retry.UntilSuccess(t, func(t test.TestHelper) {
+		if !oc.ResourceExists(t, ns.MetalLB, "deployments", "controller") {
+			t.Log("MetalLB controller not found - waiting until exists")


Same here: for this retry to work, use t.Fatal to force the failure and trigger the retry.

fjglira · 2023-06-01T09:49:50Z

pkg/metallb/install.go

+func createAddressPool(t test.TestHelper, oc oc.OC) {
+	t.Log("Check if MetalLB controller already exists")
+	if oc.ResourceExists(t, ns.MetalLB, "ipaddresspools", "worker-internal-ips") {
+		t.Log("IPAddressPool already exists - skip applying IPAddressPool")


Same here: for this retry to work, use t.Fatal to force the failure and trigger the retry.

It's not retried :)

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

OSSM-3276: Add metallb installer for clusters without external IPs

697ef26

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

openshift-ci bot added the do-not-merge/work-in-progress label May 29, 2023

openshift-ci bot added the size/L label May 29, 2023

luksa reviewed May 29, 2023


luksa approved these changes May 29, 2023

Refactor MetalLB installer

049c6a0

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

jewertow force-pushed the OSSM-3276 branch from d3ebc1f to 049c6a0 Compare May 30, 2023 09:24

Revert changes in federation tests

8f59db6

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

jewertow marked this pull request as ready for review May 30, 2023 09:34

openshift-ci bot removed the do-not-merge/work-in-progress label May 30, 2023

luksa reviewed May 30, 2023

luksa self-requested a review May 30, 2023 09:49

Revert change in shell.Exec

2888b10

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

luksa reviewed May 30, 2023

jewertow requested a review from fjglira May 30, 2023 15:54

fjglira approved these changes May 30, 2023

luksa reviewed May 30, 2023

jewertow force-pushed the OSSM-3276 branch from 5ab8a25 to 2888b10 Compare May 31, 2023 11:49

jewertow added 3 commits May 31, 2023 14:58

Add function oc.ResourceExists

03c127f

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

Refactor metallb installer

796738f

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

Make metallb-operator version configurable

f0ce3e8

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

Disable lint check

6e3cad3

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

jewertow requested review from luksa and fjglira May 31, 2023 20:48

fjglira reviewed Jun 1, 2023

Return errors in retry functions

e98b57d

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

jewertow requested a review from fjglira June 1, 2023 10:19

fjglira approved these changes Jun 1, 2023

jewertow added the okay to merge label Jun 1, 2023

openshift-merge-robot merged commit 8cbbfe0 into maistra:main Jun 1, 2023
