Machine controller permanently moves Floating IPs over controllers #1265
We have this logic, which means the port will be checked using the floating IP and the port ID.
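As I read it, that check boils down to something like the following (a minimal sketch with hypothetical type and field names; the real controller operates on Neutron floating IP objects):

```go
package main

import "fmt"

// floatingIP models only the field relevant here (hypothetical type;
// the actual code uses the Neutron floating IP resource).
type floatingIP struct {
	portID string // port the FIP is currently bound to; "" if unbound
}

// shouldAssociate mirrors the current logic: associate unless the FIP
// is already bound to *this* machine's own port.
func shouldAssociate(fip floatingIP, machinePortID string) bool {
	return fip.portID != machinePortID
}

func main() {
	fip := floatingIP{portID: "port-A"}
	fmt.Println(shouldAssociate(fip, "port-A")) // false: already ours, no-op
	fmt.Println(shouldAssociate(fip, "port-B")) // true: another machine grabs it
}
```

With a single control plane machine the check is always a no-op after the first association, which is why the problem only shows up with multiple controllers.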
With my settings I can see that the code logic above takes effect.
I may have misunderstood, but from what I see in the current controller code, as soon as a control plane machine is reconciled, it attempts to associate the floating IP with the machine being reconciled. With a single controller this works properly, as the condition you were pointing to is always true. But as soon as another controller VM is reconciled, it sees that the floating IP isn't bound to its own port and performs the association. Here is a complete extract of the logs of my capo controller:
At the beginning we see that the floating IP is associated with machine management-cluster-control-plane-9wjjt:
And indeed, if that machine is reconciled again, it says that the floating IP is already associated.
But as soon as another machine (here management-cluster-control-plane-zmsnn) gets reconciled, since it has a different port ID, it associates the floating IP with itself.
IMHO, it'd be better to test
It sounds like we've got a problem of expressiveness here. We expect all the control plane nodes to be identical and deterministic, but that isn't the case when you've got multiple control plane nodes and only one of them has a floating IP. With our current model we should probably restrict the control plane to a single node when not using a load balancer, because it looks like we can't support multiple nodes. I don't think we can use the
I do think you have a valid use case, though. In fact we have this same use case in OpenShift: we don't use Octavia for the API VIP either. Instead we create a FIP with a port on the control plane network which isn't bound to any of the control plane machines. We then use keepalived to float the VIP dynamically across all 3 control plane nodes. We're also currently looking at adding more API VIP options, including floating the VIP between control plane nodes which don't share an L2, using BGP and ECMP. I would personally be very much in favour of also adding these options to CAPO, but I'm not sure who would work on them. Is this something you might be interested in?
I agree that having a single floating IP pointing to a specific control plane node is definitely not a production-grade solution, and VRRP- or BGP-based options would be much better! For that purpose, I was wondering whether MetalLB could be deployed as a ClusterResourceSet and used for external API access? I haven't yet tried that option, but I'd like to give it a try if it makes sense. Regarding the floating IP issue, when we get to the point where But even if that works, it's true that external API access failover would rely on the machine reconciliation loop, which is not fast, so explicitly limiting to a single control plane node as you suggested would also make sense, for sure.
Unfortunately not directly, that I'm aware of. IIUC MetalLB provides service load balancers, i.e. it requires an apiserver to operate and therefore can't load-balance the API VIP itself. I'd be delighted to be wrong about that, though, so if you know better please chime in!
...
That's the problem. You'd need something to trigger the machine reconciliation loop and, in ideal circumstances, nothing will do that on a static cluster.
I think we should add a check and a warning for such cases (multiple control plane nodes with multiple floating IPs) for now.
That makes sense. Anyway, if it's only a warning (and not a validation constraint), shouldn't we also try to avoid moving the floating IP around controllers in case there are multiple ones, as suggested? Sorry if I misunderstand, but I don't get your point on multiple floating IPs: there can be only one floating IP that carries the API endpoint, right?
I'm not arguing that it would be a suitable failover mechanism, but from what I understand there's a periodic reconciliation every 10 minutes by default. So even on a static cluster, machines would end up being reconciled at some point. (Btw, this is also one of the causes of this bug, as the FIP will never stop being moved over controllers following their respective periodic reconciliations...)
This is correct.
I noticed that the FIP suddenly moved to another node, which does not look like correct behavior to me.
So I think unless we can figure out an out-of-band solution like a VIP etc., we'd better disable the >1 controller case if there is no LB involved.
The replicas of the control plane are defined in KubeadmControlPlane, so I think it's more reasonable to just stick the FIP to the first VM (i.e., check fp != nil).
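The suggested condition could be sketched like this (hypothetical names; `fp != nil` is represented here as "the FIP is already bound to some port"):

```go
package main

import "fmt"

// floatingIP models only the relevant field (hypothetical type; the
// real controller works with Neutron floating IP objects).
type floatingIP struct {
	portID string // "" when the FIP is not bound to any port
}

// shouldAssociate with the proposed fix: only associate when the FIP
// is not yet bound anywhere, so the first reconciled control plane VM
// claims it and every later machine leaves it alone.
func shouldAssociate(fip floatingIP, machinePortID string) bool {
	return fip.portID == ""
}

func main() {
	unbound := floatingIP{}
	bound := floatingIP{portID: "port-A"}
	fmt.Println(shouldAssociate(unbound, "port-A")) // true: first VM claims the FIP
	fmt.Println(shouldAssociate(bound, "port-B"))   // false: FIP stays where it is
}
```

The trade-off discussed above still applies: if the VM holding the FIP dies, nothing proactively moves it, so failover would depend on a later reconciliation.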
/kind bug
I was attempting to use CAPI/CAPO on an OpenStack platform that doesn't provide a load balancer API, and expected to use a floating IP to reach the API server.
For that purpose I was using the following parameters in OpenStackCluster:

```yaml
disableAPIServerFloatingIP: false
apiServerLoadBalancer: {}
```
It turns out that the openstackmachine controller is permanently moving the floating IP over the master nodes, causing API connectivity interruptions. Here is an extract from the capo controller logs:
This issue was observed with capo v0.6.3 and v0.5.3 (I thought this commit could have introduced the bug, but it didn't).
The cause of the issue is quite straightforward: since we don't check whether the floating IP is already associated with a healthy control plane machine, each machine attaches the floating IP to its own port when it is reconciled.
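A toy simulation of the reconcile loops (hypothetical names, not the controller's actual code) shows why the FIP never settles with more than one controller:

```go
package main

import "fmt"

type floatingIP struct{ portID string }

// reconcile imitates one pass of the machine controller for a machine
// with the given port: it grabs the FIP unless it already owns it.
func reconcile(fip *floatingIP, machinePortID string) bool {
	if fip.portID == machinePortID {
		return false // already associated with us, nothing to do
	}
	fip.portID = machinePortID // FIP moves again
	return true
}

func main() {
	fip := &floatingIP{}
	// Periodic resyncs alternate between two control plane machines,
	// so every single pass steals the FIP back.
	for i := 0; i < 4; i++ {
		for _, port := range []string{"port-9wjjt", "port-zmsnn"} {
			if reconcile(fip, port) {
				fmt.Println("FIP moved to", port)
			}
		}
	}
}
```

Each reconciliation of either machine moves the FIP, so the association flaps on every periodic resync indefinitely.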
It even often breaks deployments with multiple controllers, as the floating IP moving to the machine being provisioned races with kubeadm join, which then fails to reach the cluster API.