
Refactor EIP handling #230

Merged: 1 commit into kubernetes-sigs:master on Mar 21, 2022
Conversation

@detiber (Member) commented Mar 1, 2022

  • Reconcile Service/endpoint updates separately
  • Provide an option to perform the health check against the assigned control plane node's non-EIP external address
  • Attempt to proactively migrate the EIP when a node is deleted or becomes unschedulable (see the sketch below)
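
As a sketch of the third item above (with hypothetical helper names, not the exact code in this PR), the idea is to react to node lifecycle events directly instead of waiting for a failed health check:

    handler := cache.ResourceEventHandlerFuncs{
        UpdateFunc: func(_, newObj interface{}) {
            node, ok := newObj.(*v1.Node)
            // Move the EIP as soon as the node holding it is cordoned.
            if ok && node.Spec.Unschedulable && holdsEIP(node) { // holdsEIP is hypothetical
                migrateEIP(node) // migrateEIP is hypothetical
            }
        },
        DeleteFunc: func(obj interface{}) {
            // Move the EIP immediately when the node holding it is deleted.
            if node, ok := obj.(*v1.Node); ok && holdsEIP(node) {
                migrateEIP(node)
            }
        },
    }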

These changes have been running as part of an automated test suite against my fork of cluster-api-provider-packet. Prior to these changes, the tests involving upgrade workflows or control plane node remediation flaked much more frequently.

A test deployment with these changes can be found at https://raw.githubusercontent.com/detiber/packet-ccm/test/deploy/template/deployment.yaml, which uses a pre-built image containing these changes: quay.io/detiber/cloud-provider-equinix-metal:dev

I suspect there is still room to improve the EIP handling, since there seem to be a few edge cases where the EIP takes a while to migrate, but overall these changes are a substantial improvement over the previous behavior.

Resolved review threads: Dockerfile_builder, metal/bgp.go, metal/cloud.go, metal/config.go, metal/devices.go, metal/eip_controlplane_reconciliation.go (four threads)
Inline review comment on lines 650 to 543:
for _, a := range node.Status.Addresses {
// Find the non EIP external address for the node to use for the health check
if a.Type == v1.NodeExternalIP && a.Address != controlPlaneEndpoint.Address {
controlPlaneHealthURL = fmt.Sprintf("https://%s:%d", a.Address, m.nodeAPIServerPort)
}
}
@detiber (Member, Author) commented:

With Cluster API, we currently set the EIP address as a loopback address on all control plane nodes to work around the bootstrapping issue of the Service/Endpoints not yet being set, so that kubeadm init doesn't fail.

This was causing an issue with the health checks not actually checking the proper endpoints in some circumstances...

@detiber (Member, Author) commented Mar 2, 2022

Rebased now that #231 has been merged, but this PR also includes a fix from #232 as well.

@deitch (Contributor) commented Mar 3, 2022

Heh, a lot smaller now.

I will hold off until we resolve #232 and then we can address this one.

@deitch (Contributor) commented Mar 7, 2022

The merge of #232 should make this even smaller post-rebase?

@detiber changed the title from "[WIP] Refactor EIP handling" to "Refactor EIP handling" on Mar 7, 2022
@detiber (Member, Author) commented Mar 7, 2022

@deitch rebased on latest.

@deitch (Contributor) commented Mar 8, 2022

@detiber I get most of it (looks good) except for the core. The existing approach depends on reconcileNodes() being called once in a while (e.g. every 30 seconds) and then checking the health of the EIP. If that check failed, it went on to check the control plane node IPs and moved the EIP around as needed.

This proposal appears to change that. It only checks health in doHealthCheck(), which is only called by UpdateFunc:, and only if the node on which UpdateFunc was called currently has the EIP assigned.

How does this regularly check the status of the EIP and update it if necessary? I get that it would work when UpdateFunc is called, but why would it update it with any regularity?

@deitch (Contributor) commented Mar 8, 2022

Also, it doesn't look like it checks the EIP address, only the node's?

@davidspek commented Mar 9, 2022

I was just running through another test of my PR that adds kube-vip support to the cluster-api provider, and I noticed that the Equinix Metal CCM is assigning the EIP to one of the control plane nodes. However, I don't think the CCM should be doing this. When using kube-vip to load-balance the control plane, the EIP shouldn't be assigned to one of the nodes, as far as I know. Also, when using Cluster API, the CCM won't be available in time, since the EIP for the control plane must be accessible during kubeadm bootstrapping. Expanding on that, the CCM won't actually be available until the user deploys a CNI.

So I believe there are three routes that could be taken here.

  1. The CCM takes care of deploying kube-vip somehow. However, this might be difficult to do at the right time during cluster bootstrapping.
  2. The CCM stops assigning the control plane EIP to one of the control plane nodes.
  3. We provide a way to disable the CCM from assigning the control plane EIP to one of the control plane nodes. Maybe we could do this by changing the tag on the EIP?

Edit:
It turns out it is easy to keep the CCM from taking control of the control plane EIP, which would be the route to take when using kube-vip with Cluster API.

@deitch (Contributor) commented Mar 9, 2022

> I was just running through another test of detiber/cluster-api-provider-packet#65 that adds kube-vip support to the cluster-api provider, and I noticed that the Equinix Metal CCM is assigning the EIP to one of the control plane nodes. However, I don't think the CCM should be doing this.

It only does this if you pass it the right tag for the EIP. If you do not, the CCM ignores it. That tag is the flag that says either "you should control the EIP" or "you should ignore it".

If you are using kube-vip in its mode of managing the control plane EIP, then the CCM should not have the tag set.
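
(Schematically, the gate amounts to something like this sketch; the eipTag field name is illustrative, not necessarily the actual CPEM config field:)

    // If no EIP tag is configured, skip control plane EIP management
    // entirely, leaving the EIP to external tooling such as kube-vip.
    if config.eipTag == "" {
        return nil
    }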

@davidspek commented:
@deitch Thanks for the explanation, I just found that as well and updated my PR to reflect the change.

@detiber (Member, Author) commented Mar 10, 2022

> @detiber I get most of it (looks good) except for the core. The existing approach depends on reconcileNodes() being called once in a while (e.g. every 30 seconds) and then checking the health of the EIP. If that check failed, it went on to check the control plane node IPs and moved the EIP around as needed.
>
> This proposal appears to change that. It only checks health in doHealthCheck(), which is only called by UpdateFunc:, and only if the node on which UpdateFunc was called currently has the EIP assigned.
>
> How does this regularly check the status of the EIP and update it if necessary? I get that it would work when UpdateFunc is called, but why would it update it with any regularity?

Because the resync interval on the informer is set, every time a resync happens UpdateFunc is called for each of the Nodes. That keeps us doing a health check roughly every minute, which is approximately the same as the old timer loop that fired every 60 seconds.
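
For reference, a minimal sketch of that mechanism (the package layout and checkAndReassignEIP are illustrative, not the code in this PR): a SharedInformerFactory created with a non-zero resync period re-delivers every cached Node to UpdateFunc on each resync, even when nothing has changed:

    package sketch

    import (
        "time"

        v1 "k8s.io/api/core/v1"
        "k8s.io/client-go/informers"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/cache"
    )

    // checkAndReassignEIP stands in for CPEM's health check plus EIP
    // migration logic.
    func checkAndReassignEIP(node *v1.Node) {}

    func watchNodes(clientset kubernetes.Interface, stop <-chan struct{}) {
        // A 60s resync means UpdateFunc fires for every Node roughly once
        // a minute even without real changes, replacing the old timer loop.
        factory := informers.NewSharedInformerFactory(clientset, 60*time.Second)
        nodeInformer := factory.Core().V1().Nodes().Informer()
        nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
            UpdateFunc: func(_, newObj interface{}) {
                if node, ok := newObj.(*v1.Node); ok {
                    checkAndReassignEIP(node)
                }
            },
        })
        factory.Start(stop)
        factory.WaitForCacheSync(stop)
    }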

@detiber (Member, Author) commented Mar 10, 2022

> Also, it doesn't look like it checks the EIP address, only the node's?

Yeah, that was the bit I suspected would get some pushback... In the case of Cluster API, we configure the EIP as a local loopback address: https://github.com/kubernetes-sigs/cluster-api-provider-packet/blob/main/templates/cluster-template.yaml#L30-L36. This was needed to work around kubeadm expecting the address to be resolvable during bootstrapping (the external Service/Endpoints are not configured yet, since CPEM is not yet deployed).

This resulted in issues during upgrades, where the local etcd instance is removed from the cluster before the machine is deleted, leaving the local apiserver unresponsive. That is not necessarily a problem if CPEM is running on the machine being deleted, but it becomes one when the EIP is assigned to a control plane node that isn't running CPEM: CPEM health checks the EIP, which resolves to localhost and passes, so it never attempts to reassign the EIP.
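
For illustration, the node-address health check might look roughly like this (a sketch assuming a plain HTTPS probe of /healthz; the function name and TLS handling are illustrative):

    import (
        "crypto/tls"
        "fmt"
        "net/http"
        "time"
    )

    // healthCheckNode probes the apiserver via the node's non-EIP external
    // address. The EIP itself cannot be trusted for this check because, on
    // control plane nodes, it resolves to the local loopback and therefore
    // always looks healthy.
    func healthCheckNode(controlPlaneHealthURL string) error {
        client := &http.Client{
            Timeout: 5 * time.Second,
            Transport: &http.Transport{
                // The serving certificate is issued for the EIP and cluster
                // names, not the raw node IP, so skip verification here.
                TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
            },
        }
        resp, err := client.Get(controlPlaneHealthURL + "/healthz")
        if err != nil {
            return err // unreachable: trigger EIP reassignment
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            return fmt.Errorf("apiserver unhealthy: %s", resp.Status)
        }
        return nil
    }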

@deitch (Contributor) commented Mar 10, 2022

> Because the resync interval on the informer is set, every time a resync happens UpdateFunc is called for each of the Nodes. That keeps us doing a health check roughly every minute, which is approximately the same as the old timer loop that fired every 60 seconds.

Ah, I didn't realize that. So basically, on every resync interval it will automatically call the health check? Then that makes sense.

@deitch (Contributor) commented Mar 10, 2022

> Yeah, that was the bit I suspected would get some pushback... In the case of Cluster API, we configure the EIP as a local loopback address: https://github.com/kubernetes-sigs/cluster-api-provider-packet/blob/main/templates/cluster-template.yaml#L30-L36. This was needed to work around kubeadm expecting the address to be resolvable during bootstrapping (the external Service/Endpoints are not configured yet, since CPEM is not yet deployed).

Oh yes, I have suffered through that. I do wish kubeadm were a bit more configurable in that regard.

> CPEM health checks the EIP, which resolves to localhost and passes, so it never attempts to reassign the EIP.

So the reason you did it is to avoid a situation where the CPEM health check thinks everything is healthy (because it is checking the EIP, which is on the local loopback) when it really isn't OK when coming from the Internet?

@detiber (Member, Author) commented Mar 10, 2022

> So the reason you did it is to avoid a situation where the CPEM health check thinks everything is healthy (because it is checking the EIP, which is on the local loopback) when it really isn't OK when coming from the Internet?

Exactly

@detiber (Member, Author) commented Mar 10, 2022

@deitch Do you think it would make sense to make whether or not to use the EIP URL a configurable option?

I'm not sure how others may be using CPEM outside of Cluster API to know if it's a broader issue or just a localized one.

@deitch (Contributor) commented Mar 10, 2022

I don't understand.

@detiber (Member, Author) commented Mar 10, 2022

@deitch Rather than just automatically avoiding the use of the EIP for health checks (as this PR does), would it make sense to make that behavior opt-in, with the default continuing to use the EIP URL for health checks?
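
(For concreteness, such an opt-in might look like the following sketch; the env var and variable names are purely hypothetical, and the final name landed in the README:)

    // Opt-in: when set, health check the node's own external address
    // instead of the EIP URL. The env var name here is hypothetical.
    useHostIP := os.Getenv("EIP_HEALTH_CHECK_USE_HOST_IP") == "true"
    healthURL := eipHealthURL
    if useHostIP {
        healthURL = nodeHealthURL // the non-EIP external address URL
    }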

@deitch (Contributor) commented Mar 10, 2022

Would it make sense to split the conversation? This PR changes two things simultaneously: how we invoke health checks, and what the health checks test.

Does this PR have value doing just the first, with a separate PR focused on the second? Or must they go together?

@deitch (Contributor) commented Mar 10, 2022

> @deitch Rather than just automatically avoiding the use of the EIP for health checks (as this PR does), would it make sense to make that behavior opt-in, with the default continuing to use the EIP URL for health checks?

Now that I get it, yes, I think so. My concern is a lack of consistent EIP behaviour by CPEM, but I think that is acceptable.

@detiber (Member, Author) commented Mar 10, 2022

> Would it make sense to split the conversation? This PR changes two things simultaneously: how we invoke health checks, and what the health checks test.
>
> Does this PR have value doing just the first, with a separate PR focused on the second? Or must they go together?

Both are really needed in order to unblock the CAPI v1 support for CAPP, so I'm not sure there is much value in breaking it up.

That said, more than happy to break it up if that is your preference.

@detiber (Member, Author) commented Mar 14, 2022

This should be ready to go.

@deitch (Contributor) commented Mar 16, 2022

Reviewing now. @detiber, can you add the new option's env var to the README?

Resolved review threads: metal/config.go, metal/eip_controlplane_reconciliation.go (two threads)
Inline review comment on lines 103 to 120:
FilterFunc: func(obj interface{}) bool {
n, _ := obj.(*v1.Node)
return m.nodeFilter(n)
},
@deitch (Contributor) commented:

Given that the apiServerPort might not be set the first time a node is added, will this work the next time? Is the FilterFunc called on every sync?

@deitch (Contributor) left a review:

This is rather nicely done, and I think simplifies a lot.

I mostly have some questions, README requests, and some small changes.

@detiber (Member, Author) commented Mar 17, 2022

I still need to address the README updates, but did quite a bit of cleanup based on the other comments.

Working on validating that the Server Side Apply changes didn't break anything, though 😬

@detiber (Member, Author) commented Mar 17, 2022

The Server Side Apply changes seem to be working well; I will pick back up on the README changes tomorrow.

@deitch (Contributor) commented Mar 18, 2022

This all looks solid. It just needs the README updated and that nodeFilter fixed, and we should be good to go.

@detiber (Member, Author) commented Mar 18, 2022

@deitch The README is updated and we are no longer filtering anything.

I also squashed down the commits and cleaned up the commit message.

@detiber self-assigned this on Mar 18, 2022
@deitch (Contributor) left a review:

This all looks good, other than the single bug I think I might have found.

Comment here when that is patched, and I can run another series of tests on it.

Resolved review thread: metal/eip_controlplane_reconciliation.go

Commit message:

- Reconcile Service/endpoint updates separately
- Provide an option to perform the health check against the assigned control plane node's non-EIP external address
- Attempt to proactively migrate the EIP when a node is deleted or becomes unschedulable

Signed-off-by: Jason DeTiberus <detiber@users.noreply.github.com>
@deitch (Contributor) left a review:
LGTM. Let's let CI go green. If you are comfortable with it @detiber, then once green, we can merge in.

@detiber (Member, Author) commented Mar 21, 2022

@deitch sgtm. I've been testing this code pretty regularly from an EIP perspective with the automated testing I've been doing for CAPP.

@deitch merged commit f8c11af into kubernetes-sigs:master on Mar 21, 2022
@deitch (Contributor) commented Mar 21, 2022

Really nice on this one, thank you.
