Rework how gratuitous advertisements (ARP/NDP) work #736
@russellb if you want to have a look
@russellb added way more explanations, tell me if this is enough
Note from my testing:
On each config watch expiry, we do a resync, so we spam
how often does that happen?
LGTM overall. The design looks OK to me and things seem to work on my testing.
Added some nitpicks and questions, mainly around comments.
@johananl thanks a lot for the review, see my answers
BTW, to speed things up I'll implement an allow/deny list based on interface name, so we don't spam on each container interface when using Calico or others. I've tried in the past to detect the veth type, but it's really not easy
that's not really related to this change though, right?
@russellb yes, in a separate PR, at some point in the future
LGTM.
With our current approach, we only send gratuitous advertisements on the first SetBalancer() call, and we don't resend them on subsequent calls, to reduce the amount of "spam" in the network. This worked pretty well when we were using the K8S API to make all the decisions. Now that we are also using MemberList status, our decisions are based on eventually consistent information, and there are at least 2 cases where we need to resend gratuitous advertisements even if the information that we have makes us think there were no changes in ownership:

1) Split brain with no ownership changes: 3 nodes A B C, A owns the LoadBalancer IP I, cluster is clean. Now for some reason C can't talk to A and B anymore, and our algorithm in ShouldAnnounce() continues to pick A as the owner of I. As there were no changes for A, A doesn't send any gratuitous advertisement. As C thinks it is alone, it thinks it owns I and sends gratuitous advertisements. Some seconds later C rejoins A & B and stops sending gratuitous advertisements, but A continues to be the owner and doesn't send any gratuitous advertisement. Depending on the switches' inner workings, traffic might continue to go to C for a long time.

2) Race condition on ForceSync: 3 nodes A B C, A owns the LoadBalancer IP I, cluster is clean. A becomes really slow (cpu limits or ...) and memberlist on B and C decides that A is not part of the memberlist cluster anymore. B and C each start a ForceSync(), one of B or C becomes the owner of I and starts gratuitous advertisements for I. A starts to respond again to memberlist and rejoins the cluster, while doing its first ForceSync(). A thinks it was always the owner of I and doesn't send any gratuitous advertisement.

The idea of this patch is to send gratuitous advertisements for 5 seconds from the last SetBalancer() call, instead of from the last time we think we became the owner.

To ensure there is only 1 sender for each IP, we use only one goroutine for all gratuitous advertisement calls. As gratuitous() was using Lock() (i.e. an exclusive lock), we were already sending at most 1 gratuitous advertisement at a time, so we know that this is fine performance-wise, but it might be burstier than before.

This fixes metallb#584

Signed-off-by: Etienne Champetier <echampetier@anevia.com>