Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use OVN component templates for NodePort services. #3430

Merged
merged 10 commits into from Apr 5, 2023

Conversation

dceara
Copy link
Contributor

@dceara dceara commented Feb 18, 2023

- What this PR does and why is it needed

This changes the way NodePort services configure OVN.

Instead of generating a load balancer per service per node we now generate at most four (template) load balancers (IPv4/IPv6 applicable to gateway routers or applicable to node switches). Each template load balancer uses as VIP a template variable denoting the node IP, e.g., NODEIP_IPv4 and as backend another template variable, e.g., SVC_node_port_svc__UDP__30957__node__router__template__IPv4. These template variables have potentially different values on different chassis. For example, NODEIP_IPv4 will expand to chassis-X's IP on chassis-X and to chassis-Y's IP on chassis-Y. Similarly, SVC_node_port_svc__UDP__30957__node__router__template__IPv4 will expand on chassis-X to the set of backends that are required by the NodePort service on chassis-X and that might be different than the set of backends required on chassis-Y.

Due to the fact that now we split NodePort OVN load balancers into router and switch load balancers we also need to provision two new load balancer groups allowing us to apply router specific load balancers to routers and switch specific ones to switches respectively:

  • clusterRouterLBGroup: containing all GW logical routers.
  • clusterSwitchLBGroup: containing all node logical switches.

This significantly reduces NB size and load in ovn-northd in scenarios where large numbers of NodePort services are created in clusters at scale.

We currently focused on NodePort services that don't use session affinity timeout (not supported on Template LBs in OVN) and that don't have ETP=local. The latter can probably be implemented as a follow up.

The first 5 commits do some of the initial work required for using templates (bumping the OVN schema, tracking node chassisIDs, etc.) and also enable unit tests for LB groups (which are the current default way of provisioning services in OVN).

The next commit in this PR introduces a new model_client mutator type, useful when mutating OVSDB map values. According to RFC 7047, section 5.1., mutations that update existing values must first delete those values from the map. In order to avoid always performing blind deletes before update we provide the ability to the client to specify whether the delete is required (e.g., at first insert) or not (e.g., when updating an existing value).

The last commit in this PR changes the implementation for node port services. We now build a third type of load balancer configs (template configs) filtering out those that can currently use templates (as mentioned above: NodePort w/o session affinity and w/o ETP/ITP local).

- Special notes for reviewers

Note: there's a small TODO left in the code, to properly document what the buildTemplateLBs() function does.

- How to verify it

All service related relevant unit tests have been adapted to reflect the new way of provisioning load balancers. All conformance and control-plane tests should pass.

- Description for the changelog
Use OVN component templates for NodePort services.

@dceara
Copy link
Contributor Author

dceara commented Feb 18, 2023

CC: @dcbw @trozet @tssurya @flavio-fernandes please take a look when you get the chance.

@flavio-fernandes
Copy link
Contributor

flavio-fernandes commented Feb 20, 2023 via email

@dceara
Copy link
Contributor Author

dceara commented Feb 27, 2023

/retest

@coveralls
Copy link

coveralls commented Feb 27, 2023

Coverage Status

Coverage: 52.37% (+0.2%) from 52.169% when pulling 61e0379 on dceara:node-port-component-templates into 8a616fd on ovn-org:master.

Copy link
Contributor

@flavio-fernandes flavio-fernandes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great @dceara . I need to resume the review, but thought of dropping my initial comments here.
I also rebased thes PR to the latest master branch here. You may use it if it helps.
To be continued... :)

go-controller/pkg/libovsdbops/model_client_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@flavio-fernandes flavio-fernandes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few silly nits and question. I will pore over this pr one final time, but everything looks great to me already. Great stuff!

Copy link
Contributor

@flavio-fernandes flavio-fernandes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great stuff, @dceara! A few suggestions and comments, ptal.

go-controller/pkg/libovsdbops/loadbalancer.go Outdated Show resolved Hide resolved
go-controller/pkg/libovsdbops/loadbalancer.go Outdated Show resolved Hide resolved
go-controller/pkg/libovsdbops/model_client.go Outdated Show resolved Hide resolved
go-controller/pkg/libovsdbops/template_var.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/controller/services/repair.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/master.go Outdated Show resolved Hide resolved
@tssurya
Copy link
Member

tssurya commented Mar 6, 2023

/assign @tssurya

@dceara
Copy link
Contributor Author

dceara commented Mar 10, 2023

As discussed with @kyrtapz I'm going to drop the first commit from this PR, dacdcec ("gateway_init: Remove unused 'sctpSupport' argument.") and track that via a new issue #3484

@tssurya
Copy link
Member

tssurya commented Mar 10, 2023

As discussed with @kyrtapz I'm going to drop the first commit from this PR, dacdcec ("gateway_init: Remove unused 'sctpSupport' argument.") and track that via a new issue #3484

thanks! I was about to suggest this as well to keep that separate since this work is already big and that cleanup can be a new PR in itself.

@jcaamano
Copy link
Contributor

@dceara A head's up for the needed rebase, in case you missed it.

@dceara
Copy link
Contributor Author

dceara commented Mar 10, 2023

@dceara A head's up for the needed rebase, in case you missed it.

On it, thanks! Unit tests just passed locally. New version incoming.

@dceara dceara force-pushed the node-port-component-templates branch from 1b9fe2f to c276a07 Compare March 10, 2023 15:48
@dceara dceara requested review from flavio-fernandes and jcaamano and removed request for flavio-fernandes and jcaamano March 13, 2023 23:26
@dceara
Copy link
Contributor Author

dceara commented Mar 14, 2023

@jcaamano @flavio-fernandes @tssurya I tried using the "request review" feature in github but I couldn't make it say that I "request review from everyone". So.. I'm doing that now through this comment:

I think this PR is ready for a new round of review.

Thanks,
Dumitru

Copy link
Member

@tssurya tssurya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dceara for doing this! I know this is long overdue haha ; I appreciate your persistence and efforts on getting this complex scale fix in!
Might seem like lots of comments at first glance but not many changes in general. Main Logic wise LGTM.
nit: wished to see more details on the last commit's message or maybe a link to the actual OVN PR where we can understand how a template would then percolate into a flow on OVS side.

Nevertheless for my own sanity so that I remember things, here is how this PR looks like on an end product:

_uuid               : 440925ac-2fa0-4a4b-904a-32c15fa3767e
load_balancer       : [c466f525-37ae-401e-b3f9-5b599c316df3]
name                : clusterSwitchLBGroup

_uuid               : ee183dca-8094-48f4-8237-2acec50acbe8
load_balancer       : [550ed435-a306-432f-bd5d-9af296c9a4e6, 9ff61978-1c29-4108-a01b-7cd9b191be2e]
name                : clusterLBGroup

_uuid               : a566a8d3-c129-4b25-8922-c9d8e9ae5bf4
load_balancer       : [884f0d41-b463-4b90-8874-af421cc013e7]
name                : clusterRouterLBGroup
_uuid               : 884f0d41-b463-4b90-8874-af421cc013e7
external_ids        : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="default/example-service-hello-world"}
health_check        : []
ip_port_mappings    : {}
name                : "Service_default/example-service-hello-world_TCP_node_router_template_IPv4"
options             : {address-family=ipv4, event="false", hairpin_snat_ip="169.254.169.5 fd69::5", neighbor_responder=none, reject="true", skip_snat="false", template="true"}
protocol            : tcp
selection_fields    : []
vips                : {"^NODEIP_IPv4:32399"="^Service__default_example_service_hello_world__TCP__32399__node__router__template__IPv4"}

_uuid               : c466f525-37ae-401e-b3f9-5b599c316df3
external_ids        : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="default/example-service-hello-world"}
health_check        : []
ip_port_mappings    : {}
name                : "Service_default/example-service-hello-world_TCP_node_switch_template_IPv4"
options             : {address-family=ipv4, event="false", hairpin_snat_ip="169.254.169.5 fd69::5", neighbor_responder=none, reject="true", skip_snat="false", template="true"}
protocol            : tcp
selection_fields    : []
vips                : {"^NODEIP_IPv4:32399"="^Service__default_example_service_hello_world__TCP__32399__node__switch__template__IPv4"}
_uuid               : dd89d5a1-2d47-4eeb-8f41-6c28d3a95657
chassis             : "f1a3702b-252b-42a7-9939-968b9ea7f4d0"
external_ids        : {}
variables           : {NODEIP_IPv4="172.19.0.3", Service__default_example_service_hello_world__TCP__32399__node__router__template__IPv4="169.254.169.2:8080", Service__default_example_service_hello_world__TCP__32399__node__switch__template__IPv4="172.19.0.3:8080"}

_uuid               : 27ea0802-7376-4770-bd24-a738fbe66fce
chassis             : "99fe5f0c-60cf-4a2a-9e5d-aa78111307a8"
external_ids        : {}
variables           : {NODEIP_IPv4="172.19.0.4", Service__default_example_service_hello_world__TCP__32399__node__router__template__IPv4="172.19.0.3:8080", Service__default_example_service_hello_world__TCP__32399__node__switch__template__IPv4="172.19.0.3:8080"}

_uuid               : 2ad1f543-13e3-4332-9afc-3e7a94223e06
chassis             : "07a79270-10ba-40c4-a5d8-64d1d32c2222"
external_ids        : {}
variables           : {NODEIP_IPv4="172.19.0.2", Service__default_example_service_hello_world__TCP__32399__node__router__template__IPv4="172.19.0.3:8080", Service__default_example_service_hello_world__TCP__32399__node__switch__template__IPv4="172.19.0.3:8080"}
  table=5 (lr_in_defrag       ), priority=110  , match=(ip && ip4.dst == ^NODEIP_IPv4 && tcp), action=(reg0 = ^NODEIP_IPv4; reg9[16..31] = tcp.dst; ct_dnat;)
  table=7 (lr_in_dnat         ), priority=120  , match=(ct.est && !ct.rel && ip4 && reg0 == ^NODEIP_IPv4 && tcp && reg9[16..31] == 32399 && ct_mark.natted == 1), action=(flags.force_snat_for_lb = 1; next;)
  table=7 (lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip4 && reg0 == ^NODEIP_IPv4 && tcp && reg9[16..31] == 32399), action=(flags.force_snat_for_lb = 1; ct_lb_mark(backends=^Service__default_example_service_hello_world__TCP__32399__node__router__template__IPv4; force_snat);)
  table=5 (lr_in_defrag       ), priority=110  , match=(ip && ip4.dst == ^NODEIP_IPv4 && tcp), action=(reg0 = ^NODEIP_IPv4; reg9[16..31] = tcp.dst; ct_dnat;)
  table=7 (lr_in_dnat         ), priority=120  , match=(ct.est && !ct.rel && ip4 && reg0 == ^NODEIP_IPv4 && tcp && reg9[16..31] == 32399 && ct_mark.natted == 1), action=(flags.force_snat_for_lb = 1; next;)
  table=7 (lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip4 && reg0 == ^NODEIP_IPv4 && tcp && reg9[16..31] == 32399), action=(flags.force_snat_for_lb = 1; ct_lb_mark(backends=^Service__default_example_service_hello_world__TCP__32399__node__router__template__IPv4; force_snat);)
  table=5 (lr_in_defrag       ), priority=110  , match=(ip && ip4.dst == ^NODEIP_IPv4 && tcp), action=(reg0 = ^NODEIP_IPv4; reg9[16..31] = tcp.dst; ct_dnat;)
  table=7 (lr_in_dnat         ), priority=120  , match=(ct.est && !ct.rel && ip4 && reg0 == ^NODEIP_IPv4 && tcp && reg9[16..31] == 32399 && ct_mark.natted == 1), action=(flags.force_snat_for_lb = 1; next;)
  table=7 (lr_in_dnat         ), priority=120  , match=(ct.new && !ct.rel && ip4 && reg0 == ^NODEIP_IPv4 && tcp && reg9[16..31] == 32399), action=(flags.force_snat_for_lb = 1; ct_lb_mark(backends=^Service__default_example_service_hello_world__TCP__32399__node__router__template__IPv4; force_snat);)
  table=6 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip4.dst == ^NODEIP_IPv4 && tcp.dst == 32399), action=(reg1 = ^NODEIP_IPv4; reg2[0..15] = 32399; ct_lb_mark;)
  table=12(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == ^NODEIP_IPv4 && tcp.dst == 32399), action=(reg0[1] = 0; ct_lb_mark(backends=^Service__default_example_service_hello_world__TCP__32399__node__switch__template__IPv4);)
  table=6 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip4.dst == ^NODEIP_IPv4 && tcp.dst == 32399), action=(reg1 = ^NODEIP_IPv4; reg2[0..15] = 32399; ct_lb_mark;)
  table=12(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == ^NODEIP_IPv4 && tcp.dst == 32399), action=(reg0[1] = 0; ct_lb_mark(backends=^Service__default_example_service_hello_world__TCP__32399__node__switch__template__IPv4);)
  table=6 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip4.dst == ^NODEIP_IPv4 && tcp.dst == 32399), action=(reg1 = ^NODEIP_IPv4; reg2[0..15] = 32399; ct_lb_mark;)
  table=12(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == ^NODEIP_IPv4 && tcp.dst == 32399), action=(reg0[1] = 0; ct_lb_mark(backends=^Service__default_example_service_hello_world__TCP__32399__node__switch__template__IPv4);)

go-controller/pkg/ovn/controller/services/node_tracker.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/controller/services/node_tracker.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/default_network_controller.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/default_network_controller.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/gateway_init.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/master.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/master.go Outdated Show resolved Hide resolved
@tssurya
Copy link
Member

tssurya commented Mar 14, 2023

@dceara : also looks like you need to rebase.

@dceara dceara force-pushed the node-port-component-templates branch from c276a07 to 1325629 Compare March 15, 2023 23:13
go-controller/pkg/libovsdbops/loadbalancer.go Outdated Show resolved Hide resolved
go-controller/pkg/libovsdbops/template_var.go Outdated Show resolved Hide resolved
go-controller/pkg/libovsdbops/template_var.go Outdated Show resolved Hide resolved
go-controller/pkg/libovsdbops/template_var.go Outdated Show resolved Hide resolved
go-controller/pkg/libovsdbops/template_var.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/gateway_init.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/master.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/controller/services/lb_config.go Outdated Show resolved Hide resolved
go-controller/pkg/ovn/controller/services/lb_config.go Outdated Show resolved Hide resolved
@dceara dceara force-pushed the node-port-component-templates branch 2 times, most recently from 8b6f068 to 99909e4 Compare March 21, 2023 12:18
@dceara
Copy link
Contributor Author

dceara commented Mar 21, 2023

/retest_failed

@dceara
Copy link
Contributor Author

dceara commented Mar 21, 2023

/retest-failed

Copy link
Contributor

@flavio-fernandes flavio-fernandes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@flavio-fernandes
Copy link
Contributor

/retest

@dceara
Copy link
Contributor Author

dceara commented Apr 4, 2023

CI failures are unrelated:

@dceara
Copy link
Contributor Author

dceara commented Apr 4, 2023

/retest-failed

@dceara dceara force-pushed the node-port-component-templates branch from 1a02d9c to d0dc227 Compare April 4, 2023 16:21
That's what we actually deploy nowadays.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
These will be used for per node NB configurations (e.g., load balancer
templates).

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
An upcoming commit will need to make some decisions based on the node
IP address family.  To avoid re-parsing nodeIPs, just store them as
net.IP.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
mergeLBs() was incorrectly only merging switches and routers of load
balancers that can be merged.  Until now that wasn't a problem because
load balancers that used groups were never real candidates for merging
(they were cluster-IP services).  An upcoming commit will take advantage
of this behavior and use LB groups for node port services too.

Fixes: 18d11d4 ("Use Load_Balancer_Groups when supported.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
This better reflects what it's used for.  The version of the packages to
be used at runtime is actually specified in the corresponding
Dockerfiles in dist/images.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
This picks up the new Chassis_Template_Var support in OVN.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
This picks up the following fixes (only mentioning relevant ones):
  2bab96e899b5 ("northd: prevents sending packet to conntrack for router ports")
  d01fdfdb2c97 ("lb: Allow IPv6 template LBs to use explicit backends.")
  6a16c741e5a1 ("controller: lflow: do not use tcp as default IP protocol for ct_snat_to_vip action")
  77384b7fe3f7 ("northd: Drop packets for LBs with no backends")
  81eaa98bbb60 ("northd: Use generic ct.est flows for LR LBs")
  0af110c400cc ("northd: drop ct.inv packets in post snat and lb_aff_learn stages")
  89fc85fa7f2b ("controller: Add config option per LB to enable/disable CT flush")

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Insert OVSDB mutations for columns of type "map" are performed only if
the value didn't already exist in the database (RFC 7047, section 5.1).
That means that mutations that update existing values must first delete
those.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
@dceara dceara force-pushed the node-port-component-templates branch from d0dc227 to 6faf008 Compare April 5, 2023 07:39
Copy link
Contributor

@jcaamano jcaamano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are good to squash

@dceara dceara force-pushed the node-port-component-templates branch 2 times, most recently from 35e0f44 to 6f9c9e2 Compare April 5, 2023 11:10
Copy link
Contributor

@jcaamano jcaamano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On retrospective, it has posed quite a hurdle to track that we were not getting into nasty race conditions with the template variables.

One thing I think this design could have benefited from is updating the node variables in the network controller. That would have removed one source of potential races and I think it would be fine as the service controller should not really be concerned with their values.

Another aspect is tracking that the service controller and the node tracker are not updating the same variables. That's a fact but probably the service controller could benefit from filtering them out or explicitly checking that is not the case in the future in a away that alerts us if we end up doing it by mistake or unexpectedly.

Overall though I have to thank @dceara for the patience he has had with me.

@dceara
Copy link
Contributor Author

dceara commented Apr 5, 2023

On retrospective, it has posed quite a hurdle to track that we were not getting into nasty race conditions with the template variables.

One thing I think this design could have benefited from is updating the node variables in the network controller. That would have removed one source of potential races and I think it would be fine as the service controller should not really be concerned with their values.

Another aspect is tracking that the service controller and the node tracker are not updating the same variables. That's a fact but probably the service controller could benefit from filtering them out or explicitly checking that is not the case in the future in a away that alerts us if we end up doing it by mistake or unexpectedly.

Maybe a different approach like a separate controller to handle template variable reconciliation would make sense as a follow up. Then the node tracker, the network controller and the services controller would just post operations of the form "set variable X's value for chassis C"/"delete variable X"/"delete all variables for chassis C" to the "template variable controller" which in turn would eventually propagate these to the NB. I guess the advantage of that would be that we can take care of template synchronization in a single place and that we can easily reuse templates for other features (not only services). We don't have such use cases yet though.

Overall though I have to thank @dceara for the patience he has had with me.

Thanks @jcaamano @tssurya @flavio-fernandes for all the reviews!

@flavio-fernandes
Copy link
Contributor

/retest-failed

@dceara
Copy link
Contributor Author

dceara commented Apr 5, 2023

The multi-homing CI failures below are unrelated to the changes this PR is making:

 [Fail] Multi Homing multiple pods connected to the same OVN-K secondary network can communicate over the secondary network [It] can communicate over an L3 - routed - secondary network 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/multihoming.go:291

[Fail] Multi Homing multiple pods connected to the same OVN-K secondary network can communicate over the secondary network [It] can communicate over an L2 - switched - secondary network without IPAM 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/multihoming.go:291

[Fail] Multi Homing multiple pods connected to the same OVN-K secondary network can communicate over the secondary network [It] can communicate over an L2 - switched - secondary network without IPAM 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/multihoming.go:291

[Fail] Multi Homing multiple pods connected to the same OVN-K secondary network can communicate over the secondary network [It] can communicate over an L3 - routed - secondary network with IPv6 subnet 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/multihoming.go:291

[Fail] Multi Homing multiple pods connected to the same OVN-K secondary network can communicate over the secondary network [It] can communicate over an L3 - routed - secondary network with IPv6 subnet 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/multihoming.go:291

For now we only use templates for NodePort services that don't have
ETP=local set or affinity timeout.  The latter is not supported in
core OVN but the former can probably be implemented as a follow up
change.

With load balancer templates we're able to reduce the size of the NB and
amount of work ovn-northd has to perform by a factor of N (where N is
the number of nodes).  In particular, for a cluster with N nodes and S
NodePort services:
- before this change we would configure O(N * S) load balancers (two per
  node for every service)
- with this change this gets reduced to O(S) load balancers (four per
  service).  We do have to configure N template variables per service
  but the impact of those on the amount of work ovn-northd has to
  perform is minimal as it will only have to propagate those records
  to the Southbound.

Instead of generating a load balancer per service per node we now
generate at most four (template) load balancers (IPv4/IPv6 applicable to
gateway routers or applicable to node switches).

Each template load balancer uses as VIP a template variable denoting
the node IP, e.g., NODEIP_IPv4 and as backend another template variable,
e.g., SVC_node_port_svc__UDP__30957__node__router__template__IPv4.

These template variables have potentially different values on different
chassis. For example, NODEIP_IPv4 will expand to chassis-X's IP on
chassis-X and to chassis-Y's IP on chassis-Y.

Similarly, SVC_node_port_svc__UDP__30957__node__router__template__IPv4
will expand on chassis-X to the set of backends that are required by
the NodePort service on chassis-X and that might be different than the
set of backends required on chassis-Y.

The node IP template variables are managed by the services controller.
Together with the nodeInfo structs provided by the node tracker they
represent the node-specific information.  A RWMutex, nodeInfoRWLock,
is used to protect access to this node specific information.  When
writing this data (node_tracker) the mutex is taken for writing and
when using it (service processing) the mutex is taken for reading
(to reduce contention between service controller threads).

The alreadyAppliedLock is repurposed into a RWMutex too in order to
further reduce contention between service controller threads.

More information about the way OVN component templates work and
synthetic benchmark results can be found at:
https://mail.openvswitch.org/pipermail/ovs-dev/2022-November/399522.html
and
https://www.openvswitch.org/support/ovscon2022/slides/Horizontal-scaling.pdf

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
In general, template OVN load balancers used for implementing NodePort
services might have completely different sets of backends on different
nodes.

In particular however, there are quite a few cases in which node port
services end up using the same backends on all nodes, e.g., for services
that don't have ETP/ITP=local and that select only OVN networked pods as
backends.

Detect such situations and avoid using templated backends.  Use instead a
shared list of backend addresses.  This reduces NB DB size of the
Chassis_Template_Var table, from O(N * S) to O(N), where N is the
number of nodes and S is the number of node port services that match the
criteria defined above.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
@dceara dceara force-pushed the node-port-component-templates branch from 6f9c9e2 to 61e0379 Compare April 5, 2023 17:03
@jcaamano jcaamano merged commit 473b456 into ovn-org:master Apr 5, 2023
22 checks passed
@flavio-fernandes
Copy link
Contributor

🥳

@dceara dceara deleted the node-port-component-templates branch April 6, 2023 07:21
@tssurya
Copy link
Member

tssurya commented Apr 6, 2023

THANK YOU @dceara for being persistent and determined throughout and all your hard work on this.!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants