Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Egressfirewall use port groups #3968

Merged
merged 2 commits into from
Nov 3, 2023

Conversation

npinaeva
Copy link
Member

@npinaeva npinaeva commented Oct 17, 2023

We already have namespace port groups that are created by multicast and updated by pod handlers. That implementation has some races because port group is created conditionally. First commit changes that to create namespace port groups similar to namespace address sets on namespace initialization and update multicast handler to use that.
Second commit updates egress firewall to use namespaced port group instead of cluster port group + address set match to reduce the number of OVS flows.

Namespace port groups can be created conditionally, but that would require namespace locking for every port group-related operation, and may be worse for performance than 1 extra port group per namespace

@coveralls
Copy link

coveralls commented Oct 18, 2023

Coverage Status

coverage: 50.527% (+0.1%) from 50.422% when pulling b4d82c2 on npinaeva:egressfirewall-portgroup into 9ccf859 on ovn-org:master.

@npinaeva npinaeva changed the title Egressfirewall portgroup Egressfirewall use port groups Oct 18, 2023
@npinaeva npinaeva marked this pull request as ready for review October 18, 2023 10:43
Copy link
Contributor

@jcaamano jcaamano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@npinaeva could you please add a brief explanation or comparison to the PR message on how this reduces the number of OVS flows, for posterity.

go-controller/pkg/ovn/base_network_controller_namespace.go Outdated Show resolved Hide resolved
// to add pod's IP to the namespace's address set.
func (oc *DefaultNetworkController) addPodToNamespace(ns string, ips []*net.IPNet) (*gatewayInfo, map[string]gatewayInfo, []ovsdb.Operation, error) {
// to add pod's IP to the namespace's address set and port group.
func (oc *DefaultNetworkController) addPodToNamespace(ns string, ips []*net.IPNet, portUUID string) (*gatewayInfo, map[string]gatewayInfo, []ovsdb.Operation, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I would like to avoid caching and moving around the port UUID. I would like for libovsdbops to be able to take the port name instead. But this is more a note to self than a comment for this PR.

@@ -247,6 +277,15 @@ func (bnc *BaseNetworkController) ensureNamespaceLockedCommon(ns string, readOnl
if err != nil {
return nil, nil, fmt.Errorf("failed to create address set for namespace: %s, error: %v", ns, err)
}

// namespace port groups are only used by egress firewall and multicast for now
if bnc.multicastSupport || config.OVNKubernetesFeature.EnableEgressFirewall {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want a second opinion, I wouldn't bother with this and just have the port at all times. But maybe you have your reasons.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first, I thought about microshift to avoid extra db load (and I don't even know if those have multicast and EF disabled). But then I realized that this helps me to not rewrite all unit tests, because namespace handler would create a port group for every namespace, and all our tests check the db (might be a good reason to improve the unit test framework of course, but for now it requires a lot of changes). So I thought it doesn't make anything worse, but not a strong opinion :)

go-controller/pkg/ovn/namespace.go Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall.go Outdated Show resolved Hide resolved
Copy link
Member Author

@npinaeva npinaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the review @jcaamano , I resolved what was obvious and left some comments to discuss more complicated cases, ptal

go-controller/pkg/ovn/base_network_controller_namespace.go Outdated Show resolved Hide resolved
@@ -247,6 +277,15 @@ func (bnc *BaseNetworkController) ensureNamespaceLockedCommon(ns string, readOnl
if err != nil {
return nil, nil, fmt.Errorf("failed to create address set for namespace: %s, error: %v", ns, err)
}

// namespace port groups are only used by egress firewall and multicast for now
if bnc.multicastSupport || config.OVNKubernetesFeature.EnableEgressFirewall {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first, I thought about microshift to avoid extra db load (and I don't even know if those have multicast and EF disabled). But then I realized that this helps me to not rewrite all unit tests, because namespace handler would create a port group for every namespace, and all our tests check the db (might be a good reason to improve the unit test framework of course, but for now it requires a lot of changes). So I thought it doesn't make anything worse, but not a strong opinion :)

go-controller/pkg/ovn/egressfirewall.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/namespace.go Show resolved Hide resolved
@npinaeva npinaeva force-pushed the egressfirewall-portgroup branch 3 times, most recently from 4a75f10 to 3a655ec Compare October 25, 2023 10:10
@npinaeva
Copy link
Member Author

caught a new unit test failure :( #3977

@npinaeva npinaeva force-pushed the egressfirewall-portgroup branch 3 times, most recently from dcee783 to 5027f7a Compare October 26, 2023 13:48
go-controller/pkg/util/batching/batch.go Outdated Show resolved Hide resolved
go-controller/pkg/util/batching/batch.go Outdated Show resolved Hide resolved
@jcaamano
Copy link
Contributor

reminder that this might have been missed:
#3968 (review)

@npinaeva
Copy link
Member Author

reminder that this might have been missed: #3968 (review)

Oh I did update the commit message here 5027f7a with
Previously, all EF ACLs were applied to the clusterPortGroup, and matched source on address set. In that case, all node will have the same set of OVS flows, matching sources that are not scheduled on that node at all. When using port group as a source match, only flow for local ports will be present on every node.
but now I am thinking you meant some example with specific numbers?

@jcaamano
Copy link
Contributor

reminder that this might have been missed: #3968 (review)

Oh I did update the commit message here 5027f7a with Previously, all EF ACLs were applied to the clusterPortGroup, and matched source on address set. In that case, all node will have the same set of OVS flows, matching sources that are not scheduled on that node at all. When using port group as a source match, only flow for local ports will be present on every node. but now I am thinking you meant some example with specific numbers?

Ah, sorry I was just looking and the PR message.

I think this explanation is enough to justify the change. If you have numbers for an example, that would satisfy my curiosity. But if you don't have them, that's ok. I guess it would depend on the content of the address set.

@npinaeva npinaeva force-pushed the egressfirewall-portgroup branch 3 times, most recently from 9299909 to d772347 Compare October 30, 2023 09:02
Comment on lines 203 to 279
for namespace, acls := range batchNsACLs {
if namespace != "" {
pgName := oc.getNamespacePortGroupName(namespace)
// delete stale ACLs from namespaced port group
if !existingEFNamespaces[namespace] {
ops, err = libovsdbops.DeleteACLsFromPortGroupOps(oc.nbClient, ops, pgName, acls...)
} else {
// re-attach from ClusterPortGroupNameBase to namespaced port group.
// port group should exist, because namespace handler will create it.
ops, err = libovsdbops.AddACLsToPortGroupOps(oc.nbClient, ops, pgName, acls...)
}
if err != nil {
return fmt.Errorf("failed to build cleanup ops: %w", err)
}
}
// delete all EF ACLs from ClusterPortGroupNameBase
ops, err = libovsdbops.DeleteACLsFromPortGroupOps(oc.nbClient, ops,
oc.getClusterPortGroupName(types.ClusterPortGroupNameBase), acls...)
if err != nil {
return fmt.Errorf("failed to build cleanup from ClusterPortGroup ops: %w", err)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have one PG for the namespace, and if that PG becomes stale because the namespace was deleted, then the PG will be deleted by the namespace sync function and thus the ACLs will be garbage collected with it, we wouldn't need to explicitly delete those ACLs from here.

You might still be doing this for the UTs, but when talking about potential massive ops I would rather handle it specifically in the UTs than in the real code. In any case it might be appropriate to add a comment.

So overall I am looking at the added startup time here because of unconditional massive ops that we might end up doing. Would it make sense to read the cluster PG, and just move the EF ACLs that are still attached to it? Ideally once this thing, that is costing us time, has been done once, we wouldn't need to do it again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's almost right, but the problem is that a namespace may not only be deleted, but also just get an egress firewall deleted, in that case it won't get to existingEFNamespaces and a cleanup is needed. That is why the most part will have to stay, but I added an extra condition for clusterPG to reduce the number of ops.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay! My final nit would be to reorganize the method a bit. It contains several migrations from several different versions and the thing is getting big and confusing. I would probably split them out to their own methods for clarity. But I know I am pushing it now, so up to you!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense :) hope I didn't break anything

@npinaeva npinaeva force-pushed the egressfirewall-portgroup branch 2 times, most recently from 3710ab2 to b4d82c2 Compare October 31, 2023 10:42
jcaamano
jcaamano previously approved these changes Nov 2, 2023
@jcaamano
Copy link
Contributor

jcaamano commented Nov 2, 2023

This flake might be new

--- FAIL: TestNetworkPolicyV2Conformance/AdminNetworkPolicyEgressSCTP/Should_support_an_'deny-egress'_policy_for_SCTP_protocol;_ensure_rule_ordering_is_respected (3.27s)

https://github.com/ovn-org/ovn-kubernetes/actions/runs/6731387973/job/18297017440?pr=3968

namespace address set. Existing port groups don't need any change, since
port group name always was a namespace name.
Set ErrNotFound to false for delete acls and ports from port group ops.
Namespace handler will create, update and delete namespace port group
similar to the namespace address set.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
cluster port group + address set match to reduce the number of OVS
flows.
Previously, all EF ACLs were applied to the clusterPortGroup, and
matched source on address set. In that case, all node will have the
same set of OVS flows, matching sources that are not scheduled on
that node at all. When using port group as a source match, only
flow for local ports will be present on every node.
The port group is created, updated and deleted by namespace handler
similar to the namespace address set.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
@jcaamano jcaamano merged commit f94332f into ovn-org:master Nov 3, 2023
29 checks passed
@npinaeva npinaeva deleted the egressfirewall-portgroup branch December 1, 2023 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants