Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nodeSelector to Egress Firewall #3002

Merged
merged 1 commit into from Feb 28, 2023
Merged

Conversation

trozet
Copy link
Contributor

@trozet trozet commented May 25, 2022

This allows destination addresses to be chosen via node selector for
egress firewall rules and addresses the use case where a user wants to
open holes in the firewall to specific nodes. For example, if a user
wants to enable kapi service accesss to host networked kapi server
endpoints.

Signed-off-by: Tim Rozet trozet@redhat.com

- What this PR does and why is it needed

- Special notes for reviewers

- How to verify it

- Description for the changelog

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 52.041% when pulling 9069a79 on trozet:ef_node_select into 1d976e4 on ovn-org:master.

go-controller/pkg/ovn/egressfirewall.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall.go Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall.go Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall_test.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/egressfirewall_test.go Outdated Show resolved Hide resolved
go-controller/pkg/factory/factory.go Outdated Show resolved Hide resolved
// +kubebuilder:validation:Pattern=^([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.?$
DNSName string `json:"dnsName,omitempty"`
// nodeSelector will allow/deny traffic to the Kubernetes node IP of selected nodes. If this is, set
// cidrSelector and DNSName must be unset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this condition validated? I havent seen it so far. Are we rejecting when this condition isnt true otherwise the user will see unexpected results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do u need // +optional or this new field , also indicate empty selector "Default" allow/deny to any node

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will add @msherif1234

@martinkennelly I wasn't really validating it, just ignoring nodeSelector if one of the other 2 were set. This is how the previous egress firewall interacted between DNSName and CIDRSelector

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order right now is:

  1. If dns name provided, use that
  2. else if cidr , use that
  3. else use node selector

I could add some validation to make sure only 1 is set. Seems like a good idea to me. wdyt?

// +kubebuilder:validation:Pattern=^([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.?$
DNSName string `json:"dnsName,omitempty"`
// nodeSelector will allow/deny traffic to the Kubernetes node IP of selected nodes. If this is, set
// cidrSelector and DNSName must be unset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do u need // +optional or this new field , also indicate empty selector "Default" allow/deny to any node

go-controller/pkg/ovn/egressfirewall.go Outdated Show resolved Hide resolved
return nil
} else if len(egressFirewallACLs) > 1 {
klog.Errorf("Duplicate ACL found for egress firewall: %s", name)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to return here with the above error ? in case retry help if it was transient error ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we find there are more than one ACL with the same name, which there shouldn't be, we just delete them all. IMO delete shouldn't fail here, we should just log that there is some error where we configured duplicates.

go-controller/pkg/ovn/obj_retry.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/obj_retry.go Outdated Show resolved Hide resolved
@msherif1234
Copy link
Contributor

can u extend egfw e2e with nodeSelector test case

@trozet
Copy link
Contributor Author

trozet commented Feb 15, 2023

revived! latest update is just a rebase. I'll address everyone's comments tmrw. Thanks for all the feedback.

@trozet trozet force-pushed the ef_node_select branch 2 times, most recently from cffedaf to 5fe0052 Compare February 15, 2023 18:57
@npinaeva
Copy link
Member

I don't know if you've considered an option to create address set per node selector, and only update it instead of acls?
That will require some more objects, but should be better from the nbdb side, and events handling too.
in general that should be similar to the EgressDNS we have now, but for node selectors.

@trozet
Copy link
Contributor Author

trozet commented Feb 23, 2023

I don't know if you've considered an option to create address set per node selector, and only update it instead of acls? That will require some more objects, but should be better from the nbdb side, and events handling too. in general that should be similar to the EgressDNS we have now, but for node selectors.

This does seem cleaner because then operations would simply be mutate to an address set rather than update an ACL. Let me see if I can easily change it.

@trozet
Copy link
Contributor Author

trozet commented Feb 27, 2023

Talking with @npinaeva today we decided to defer doing an address set to hold node IPs. The ideal case with an address set is to create one per node selector so that we could minimize resources and share where needed. However there are some limitations today with mapping external ids in OVN to the node selector. Once that issue is more generally solved within the project we will move the node selector in EF to use address set.

trozet added a commit to trozet/cluster-network-operator that referenced this pull request Feb 27, 2023
Follows the upstream implementation here:
ovn-org/ovn-kubernetes#3002

Signed-off-by: Tim Rozet <trozet@redhat.com>
Copy link
Member

@npinaeva npinaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic lgtm, had some comments for tests

@@ -608,6 +615,257 @@ var _ = ginkgo.Describe("OVN EgressFirewall Operations", func() {
gomega.Expect(err).NotTo(gomega.HaveOccurred())

})
ginkgo.It("egress firewall with node selector updates during node update", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we either should add gwMode in the name here, or if you don't need to test that for gateway modes, move it outside of for _, gwMode := range []config.GatewayMode loop
same for the second test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was just something I missed when I rebased. I think in the past the tests were split into local and shared and for create vs update and it didnt make sense. iirc you fixed it. I'll update the test to use the right gw mode.

@@ -608,6 +615,257 @@ var _ = ginkgo.Describe("OVN EgressFirewall Operations", func() {
gomega.Expect(err).NotTo(gomega.HaveOccurred())

})
ginkgo.It("egress firewall with node selector updates during node update", func() {
config.Gateway.Mode = config.GatewayModeShared
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

config.Gateway.Mode = config.GatewayModeShared
config.Gateway.NodeportEnable = true
const (
namespaceName = "namespace1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused

node1Name string = "node1"
)

expectedOVNClusterRouter := &nbdb.LogicalRouter{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use functions like newOVNClusterRouter() newRouterPortGroup() etc, check hybrid_test for examples

SBData: []libovsdbtest.TestData{datapath},
}

selector := metav1.LabelSelector{MatchLabels: map[string]string{"name": "test"}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe make "name" and "test" constants?

buildEgressFwAclName("namespace1", t.EgressFirewallStartPriority),
nbdb.ACLDirectionToLport,
t.EgressFirewallStartPriority,
"(ip4.dst == 9.9.9.9) && ip4.src == $a10481622940199974102",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.OvnACLLoggingMeter,
"",
false,
map[string]string{"egressFirewall": "namespace1"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"namespace1" -> namespace1.Name

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't work? :D

_, err = fakeOVN.fakeClient.KubeClient.CoreV1().Nodes().Patch(context.TODO(), node1Name, types.MergePatchType, patchData, metav1.PatchOptions{})
gomega.Expect(err).NotTo(gomega.HaveOccurred())
expectedClusterPortGroup.ACLs = []string{}
expectedDatabaseState = append(expectedDatabaseState)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Comment on lines 1308 to 1309
node1Name string = "node1"
node2Name string = "node2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used

to:
cidrSelector: %s
`, f.Namespace.Name, exFWPermitTcpDnsDest, singleIPMask, exFWPermitCIDR, f.Namespace.Name, labelMatch, exFWDenyCIDR)
framework.Logf("Egress Firewall CR generated: %s", egressFirewallConfig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it was useful because then I can see the labelMatch string generated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is more of a test debugging right? once you know the CR is correct, you don't need the log?

This allows destination addresses to be chosen via node selector for
egress firewall rules and addresses the use case where a user wants to
open holes in the firewall to specific nodes. For example, if a user
wants to enable kapi service accesss to host networked kapi server
endpoints.

Signed-off-by: Tim Rozet <trozet@redhat.com>
Copy link
Member

@npinaeva npinaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some more renames would be nice, but that is lgtm anyway

@@ -611,6 +618,238 @@ var _ = ginkgo.Describe("OVN EgressFirewall Operations", func() {
gomega.Expect(err).NotTo(gomega.HaveOccurred())

})
ginkgo.It(fmt.Sprintf("egress firewall with node selector updates during node update, gateway mode %s", gwMode), func() {
config.Gateway.Mode = gwMode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need that, it is set in the beginning of the loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this whole for loop doesn't work properly with setting up gwMode, but I can update it in another pr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gomega.Expect(err).NotTo(gomega.HaveOccurred())
asHash, _ := getNsAddrSetHashNames(namespace1.Name)
ipv4ACL := libovsdbops.BuildACL(
buildEgressFwAclName("namespace1", t.EgressFirewallStartPriority),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"namespace1" -> namespace1.Name

t.OvnACLLoggingMeter,
"",
false,
map[string]string{"egressFirewall": "namespace1"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't work? :D

@trozet trozet merged commit 2136c9e into ovn-org:master Feb 28, 2023
Copy link
Member

@tssurya tssurya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only reviewed the CRD parts in this PR since I am reviewing the CNO PR.

@@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.11.3
controller-gen.kubebuilder.io/version: v0.7.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was the client-gen version downgraded here?

pattern: ^([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.?$
type: string
nodeSelector:
description: nodeSelector will allow/deny traffic to the
Kubernetes node IP of selected nodes. If this is, set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If this is set, cidrSelector and DNSName must be unset

type: object
minProperties: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we removed min and max properties here? Even if we added a new nodeSelector, the rule still holds that we need at least one of them set (nodeselector/dnsname/cidrselector) and that only max one can be set which is clear from the description of each field.
Removing kubebuilder rules breaks backwards compatibility of how API would have complained around this and now it would stop complaining if more than one is set OR none are set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol I know why this was removed, its because we never had it in the first place! Seems like it was hardcoded:
We might need to change the types.go file to actually reflect and incorporate this:

// to is the target that traffic is allowed/denied to
// +kubebuilder:validation:MinProperties:=1
// +kubebuilder:validation:MaxProperties:=1
To EgressFirewallDestination `json:"to"`

Reason being API will now start behaving differently than before if we remove these validation hooks.

trozet added a commit to trozet/cluster-network-operator that referenced this pull request Mar 1, 2023
Follows the upstream implementation here:
ovn-org/ovn-kubernetes#3002

Signed-off-by: Tim Rozet <trozet@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants