New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Egress firewall fix retry #3449
Conversation
/retest |
2c8895b
to
a256b38
Compare
asIndex := getNamespaceAddrSetDbIDs(egressFirewall.Namespace, oc.controllerName) | ||
as, err := oc.addressSetFactory.EnsureAddressSet(asIndex) | ||
as, err := oc.addressSetFactory.GetAddressSet(asIndex) | ||
if err != nil { | ||
return fmt.Errorf("cannot ensure addressSet for namespace %s: %v", egressFirewall.Namespace, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could you change the error aka cannot get addressSet for namespace...
@@ -187,10 +188,17 @@ var _ = ginkgo.Describe("OVN EgressFirewall Operations", func() { | |||
ObjectMeta: newObjectMeta(node1Name, ""), | |||
}, | |||
}, | |||
}, | |||
&v1.NamespaceList{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of adding the namespace and everything that goes with it could we just add the namespaces addressSet to the db?
a256b38
to
0044672
Compare
Hey @JacobTanenbaum I fixed the comments, ptal :) |
/LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2nd commit I'm not sure is necessary. This was intentional so that egress firewall wouldn't fail it processed the event before the namespace handler fired. It doesn't hurt to create the egress firewall before does it?
3rd commit we dont need to move the test case for node selector. It's just with local gw mode it wont add the extra logical route policies on ovn_cluster_router to reroute secondary known host addresses. An easy fix here would be to just make NodeIP == GatewayRouterIP I think. Although that may result in a lower amount of test coverage. Alternatively, in LGW mode just remove the extra LRP on ovn_cluster_router.
well logically egress firewall shouldn't create namespace-owned object, because it won't clean it up in case namespace was actually already deleted (I know it is quite difficult to imagine why in that case egress firewall create event is handled). But it is also good to not wait 30 second for egress firewall retry if the namespace address set was created later, so maybe we don't need it |
0044672
to
012eaf5
Compare
error and never returned. That would prevent retry on failure. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
update "egress firewall with node selector updates during node update" to work for both gateway modes. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
012eaf5
to
26f8f66
Compare
@trozet I removed the second commit and simplified the node selector test, so it should work for both gw modes now without any extra conditions |
first argument is nil. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
found one more error handling problem while writing steps to reproduce for this bug :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch on the 3rd commit!
Egress firewall creation error was overridden by the status update
error and never returned. That would prevent retry on failure.
It worked before because the error was reassigned locally https://github.com/openshift/ovn-kubernetes/blob/release-4.11/go-controller/pkg/ovn/obj_retry.go#L681.
Fix egress firewall unit tests gatewayMode setup
update "egress firewall with node selector updates during node update"
to work for both gateway modes.