Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Egress firewall fix retry #3449

Merged
merged 3 commits into from Mar 9, 2023
Merged

Conversation

npinaeva
Copy link
Member

@npinaeva npinaeva commented Feb 28, 2023

Egress firewall creation error was overridden by the status update
error and never returned. That would prevent retry on failure.
It worked before because the error was reassigned locally https://github.com/openshift/ovn-kubernetes/blob/release-4.11/go-controller/pkg/ovn/obj_retry.go#L681.

Fix egress firewall unit tests gatewayMode setup
update "egress firewall with node selector updates during node update"
to work for both gateway modes.

@npinaeva
Copy link
Member Author

/retest

@coveralls
Copy link

coveralls commented Feb 28, 2023

Coverage Status

Coverage: 51.364% (-0.02%) from 51.386% when pulling 2292961 on npinaeva:egress-firewall-get-ns-as into 396d8ee on ovn-org:master.

@npinaeva npinaeva force-pushed the egress-firewall-get-ns-as branch 3 times, most recently from 2c8895b to a256b38 Compare March 3, 2023 10:21
asIndex := getNamespaceAddrSetDbIDs(egressFirewall.Namespace, oc.controllerName)
as, err := oc.addressSetFactory.EnsureAddressSet(asIndex)
as, err := oc.addressSetFactory.GetAddressSet(asIndex)
if err != nil {
return fmt.Errorf("cannot ensure addressSet for namespace %s: %v", egressFirewall.Namespace, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you change the error aka cannot get addressSet for namespace...

@@ -187,10 +188,17 @@ var _ = ginkgo.Describe("OVN EgressFirewall Operations", func() {
ObjectMeta: newObjectMeta(node1Name, ""),
},
},
},
&v1.NamespaceList{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of adding the namespace and everything that goes with it could we just add the namespaces addressSet to the db?

@npinaeva
Copy link
Member Author

npinaeva commented Mar 6, 2023

Hey @JacobTanenbaum I fixed the comments, ptal :)

@jcaamano jcaamano self-assigned this Mar 6, 2023
@JacobTanenbaum
Copy link
Contributor

/LGTM

Copy link
Contributor

@trozet trozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2nd commit I'm not sure is necessary. This was intentional so that egress firewall wouldn't fail it processed the event before the namespace handler fired. It doesn't hurt to create the egress firewall before does it?

3rd commit we dont need to move the test case for node selector. It's just with local gw mode it wont add the extra logical route policies on ovn_cluster_router to reroute secondary known host addresses. An easy fix here would be to just make NodeIP == GatewayRouterIP I think. Although that may result in a lower amount of test coverage. Alternatively, in LGW mode just remove the extra LRP on ovn_cluster_router.

@npinaeva
Copy link
Member Author

npinaeva commented Mar 6, 2023

2nd commit I'm not sure is necessary. This was intentional so that egress firewall wouldn't fail it processed the event before the namespace handler fired. It doesn't hurt to create the egress firewall before does it?

well logically egress firewall shouldn't create namespace-owned object, because it won't clean it up in case namespace was actually already deleted (I know it is quite difficult to imagine why in that case egress firewall create event is handled). But it is also good to not wait 30 second for egress firewall retry if the namespace address set was created later, so maybe we don't need it

error and never returned. That would prevent retry on failure.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
update "egress firewall with node selector updates during node update"
to work for both gateway modes.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
@npinaeva
Copy link
Member Author

npinaeva commented Mar 8, 2023

@trozet I removed the second commit and simplified the node selector test, so it should work for both gw modes now without any extra conditions

first argument is nil.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
@npinaeva
Copy link
Member Author

npinaeva commented Mar 9, 2023

found one more error handling problem while writing steps to reproduce for this bug :)

Copy link
Contributor

@trozet trozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch on the 3rd commit!

@trozet trozet merged commit 9d38bf1 into ovn-org:master Mar 9, 2023
21 checks passed
@npinaeva npinaeva deleted the egress-firewall-get-ns-as branch March 10, 2023 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants