Get rid of leader election for layer-2 mode #195
[ Quoting <notifications@github.com> in "[google/metallb] Consider getting r..." ]
> This algorithm results in a couple of properties:
This sgtm
(also explains why I see 4 qps of config traffic w/ leader election)
I also agree with allowing a split-brain situation, instead of doing something complex.
/Miek
This all seems reasonable to me. My only big concern would be ending up with hotspots due to whatever the hashing algorithm is doing.
This would definitely solve two of our problems with MetalLB layer2 mode: HA of IP announcing (we only have one master, so we can't leader-elect if the master goes down) and source IP visibility. Any idea what release this is targeted for?
My rough plan is to have BGPv6 support and this bug fixed in 0.7. Unfortunately there's no timeline for when that'll happen, since I'm a lone developer working in my spare time :(
We just encountered this issue today ourselves. The proposal looks good, and I am in favor of just letting ARP/NDP sort it out. This seems to be more in line with Kubernetes as a whole, with workloads and services not being completely dependent on the control plane being available. re: @mdlayher's thoughts on hotspots -- I cannot speak for everyone's use-cases, but for us this is a specific need where we map services to 'edge-nodes'. We are already targeting specific nodes with these deployments, so it's somewhat deterministic already.
With this change alone, all speakers will always announce all layer2 IPs. This will work, kinda-sorta, but it's obviously not right. The followup change to implement a new leader selection algorithm will come separately.
Now, instead of one node owning all layer2 announcements, each service selects one eligible node (i.e. with a local ready pod) as the announcer. There is per-service perturbation such that even multiple services pointing to the same pods will tend to spread their announcers across eligible nodes.
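To make the spreading concrete, here is a minimal, hypothetical sketch in Go (not MetalLB's actual code): every speaker hashes each eligible node together with the service key, and the lowest hash wins, so all speakers independently compute the same announcer without any coordination. The hash function, key format, and separator here are assumptions for illustration only.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// announcerFor picks the node that should answer ARP/NDP for a service:
// each eligible node is hashed together with the service key, and the
// lowest hash wins. Because the service key is part of the hash, two
// services backed by the same nodes will often land on different nodes.
func announcerFor(serviceKey string, eligibleNodes []string) string {
	best := ""
	var bestHash uint64
	for _, node := range eligibleNodes {
		sum := sha256.Sum256([]byte(node + "#" + serviceKey))
		h := binary.BigEndian.Uint64(sum[:8])
		if best == "" || h < bestHash {
			best, bestHash = node, h
		}
	}
	return best
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	// Two services with the same eligible nodes can pick different
	// announcers, because the service key perturbs the hash.
	for _, svc := range []string{"default/web", "default/api"} {
		fmt.Printf("%s -> %s\n", svc, announcerFor(svc, nodes))
	}
}
```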
Does it make sense to also add the service namespace to the nodename + servicename hash?
Should this work now, i.e. getting the real IP of the requester... I am using..
Still getting
@xeor check the ingress access logs first, do they have the real IP address there?
Just checked its log during all my experimenting.. It shows
What is even
Please use the mailing list for support questions, not a closed bug :) |
Sorry, but I wanted to confirm that this was actually fixed and should have worked.. If it should, I'll continue my quest on figuring it out.. :) |
Layer 2 modes (ARP and NDP) require a single owner machine for each service IP. Currently we do this by running leader election and having the winner own all IPs.
This is a little sub-optimal in several ways:
So, here's a proposal: let's get rid of the leader election, and replace it with a deterministic node selection algorithm. The controller logic remains unchanged (allocates an IP). On the speaker, we would do the following:
[x.node for x in endpoints]
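For concreteness, here is a hedged speaker-side sketch of the idea, assuming "deterministic node selection" means lowest-hash-wins over the (node, service) pair; the Endpoint type, helper name, and hash choice are made up for this example and are not the real API.

```go
// Package speaker sketches the proposed per-service announcer decision.
package speaker

import "hash/fnv"

// Endpoint is a simplified stand-in for a service endpoint.
type Endpoint struct {
	Node  string // node hosting the pod
	Ready bool   // whether the pod is ready
}

// shouldAnnounce reports whether this speaker (myNode) should answer
// ARP/NDP requests for the service, given its current endpoints.
func shouldAnnounce(myNode, serviceName string, endpoints []Endpoint) bool {
	// Candidate nodes: [x.node for x in endpoints], keeping only ready pods.
	var candidates []string
	for _, ep := range endpoints {
		if ep.Ready {
			candidates = append(candidates, ep.Node)
		}
	}
	// Deterministic selection: every speaker computes the same winner
	// locally, so no leader election or coordination is needed.
	winner, winnerHash := "", uint64(0)
	for _, node := range candidates {
		h := fnv.New64a()
		h.Write([]byte(node + "#" + serviceName))
		if v := h.Sum64(); winner == "" || v < winnerHash {
			winner, winnerHash = node, v
		}
	}
	return winner == myNode
}
```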
This algorithm results in a couple of properties:
In particular, we could support externalTrafficPolicy: Local for layer2 mode services, which removes one of the major downsides of using ARP/NDP today (no client IP visibility).
@miekg @mdlayher Thoughts on this proposal?