
Rework how gratuitous advertisements (ARP/NDP) work #736

Merged
merged 1 commit into metallb:main on Oct 14, 2020

Conversation

@champtar (Contributor) commented Oct 2, 2020

With our current approach, we only send gratuitous advertisements on the first
SetBalancer() call, and we don't resend them on subsequent calls, to reduce the
amount of "spam" on the network.
This worked well when we were using the K8s API to make all the decisions.
Now that we also use memberlist status, our decisions are based on eventually consistent
information, and there are at least 2 cases where we need to resend gratuitous advertisements
even if the information we have makes us think ownership has not changed:

  1. Split brain with no ownership change:
    3 nodes A, B, C; A owns the LoadBalancer IP I; the cluster is clean.
    For some reason C can no longer talk to A and B, and our algorithm
    in ShouldAnnounce() continues to pick A as the owner of I.
    As nothing changed from A's point of view, A doesn't send any gratuitous advertisement.
    As C thinks it is alone, it believes it owns I and sends gratuitous advertisements.
    A few seconds later C rejoins A and B and stops sending gratuitous advertisements,
    but A, still the owner, never sends any gratuitous advertisement.
    Depending on the switches' inner workings, traffic might keep going to C for a long time.

  2. Race condition on ForceSync():
    3 nodes A, B, C; A owns the LoadBalancer IP I; the cluster is clean.
    A becomes really slow (CPU limits or ...), and memberlist on B and C decides that A is no longer
    part of the memberlist cluster. B and C each start a ForceSync(); one of them becomes the
    owner of I and starts sending gratuitous advertisements for I.
    A then starts responding to memberlist again and rejoins the cluster while doing its first ForceSync().
    A thinks it was the owner of I all along and doesn't send any gratuitous advertisement.
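To make the split-brain effect concrete, here is a minimal, self-contained Go sketch. pickOwner is an illustrative stand-in, not MetalLB's actual ShouldAnnounce() logic: each node deterministically elects an owner among the nodes it currently believes are alive, so a partitioned node that only sees itself elects itself while the rest of the cluster keeps its previous choice.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// pickOwner deterministically elects an owner for ip among the nodes this
// node believes are alive. Illustrative only: the real selection logic
// differs, but any deterministic choice shows the same split-brain effect.
func pickOwner(ip string, aliveNodes []string) string {
	sort.Strings(aliveNodes)
	h := fnv.New32a()
	h.Write([]byte(ip))
	return aliveNodes[int(h.Sum32())%len(aliveNodes)]
}

func main() {
	ip := "192.0.2.10"
	// A and B still see the full cluster and agree on a single owner...
	fmt.Println("A/B's view elects:", pickOwner(ip, []string{"A", "B", "C"}))
	// ...while the partitioned C only sees itself, so it elects itself.
	fmt.Println("C's view elects:", pickOwner(ip, []string{"C"}))
}
```

Both views are internally consistent, which is exactly why no node thinks it needs to re-announce once the partition heals.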

The idea of this patch is to send gratuitous advertisements for 5 seconds after the last SetBalancer() call,
instead of after the last time we think we became the owner.
To ensure there is only one sender per IP, we use a single goroutine for all gratuitous advertisement calls.
As gratuitous() was using Lock() (i.e. an exclusive lock), we were already sending at most one gratuitous
advertisement at a time, so we know this is fine performance-wise, though it might be burstier than before.

This fixes #584

@champtar (Contributor, Author) commented Oct 2, 2020

@russellb if you want to have a look

@russellb (Collaborator) commented Oct 5, 2020

> @russellb if you want to have a look

thanks! Can you explain a bit more in the description how this fixes #584? The current description and commit message only talk about the lock change.

@champtar champtar changed the title Rework how gratuitous advertisement (ARP/NDP) works Rework how gratuitous advertisements (ARP/NDP) work Oct 6, 2020
@champtar (Contributor, Author) commented Oct 6, 2020

@russellb added way more explanations, tell me if this is enough

@champtar (Contributor, Author) commented Oct 8, 2020

Note from my testing:

W1008 00:03:37.209528       1 reflector.go:302] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: watch of *v1.ConfigMap ended with: too old resource version: 4356503 (4358042)
{"caller":"main.go:387","configmap":"metallb-system/config","event":"startUpdate","msg":"start of config update","ts":"2020-10-08T00:03:38.213385661Z"}
{"caller":"main.go:411","configmap":"metallb-system/config","event":"endUpdate","msg":"end of config update","ts":"2020-10-08T00:03:38.213495356Z"}
{"caller":"k8s.go:421","configmap":"metallb-system/config","event":"configLoaded","msg":"config (re)loaded","ts":"2020-10-08T00:03:38.213521216Z"}
{"caller":"main.go:268","event":"startUpdate","msg":"start of service update"
...

On each config watch expiry, we do a resync, so we spam gratuitous advertisements

@russellb (Collaborator) commented Oct 9, 2020

> Note from my testing: [log output quoted above]
>
> On each config watch expiry, we do a resync, so we spam

how often does that happen?

@johananl (Member) left a review


LGTM overall. The design looks OK to me and things seem to work in my testing.

Added some nitpicks and questions, mainly around comments.

(Review comments on internal/layer2/announcer.go and internal/layer2/announcer_test.go — resolved.)
@champtar (Contributor, Author) commented Oct 9, 2020

@johananl thanks a lot for the review, see my answers

@champtar (Contributor, Author) commented Oct 9, 2020

BTW, to speed things up I'll implement an allow/deny list based on interface name, so we don't spam every container interface when using Calico or similar. I've tried in the past to detect the veth type, but it's really not easy.

@russellb (Collaborator):

> BTW to speed up things I'll implement an allow/deny list based on interface name, so we don't spam on each container interface when using Calico or others. I've tried in the past to detect veth type but it's really not easy

that's not really related to this change though, right?

@champtar (Contributor, Author):

@russellb yes, in a separate PR, at some point in the future

@johananl (Member) left a second review


LGTM.

(Commit message: identical to the PR description above.)

Signed-off-by: Etienne Champetier <echampetier@anevia.com>
@johananl johananl merged commit 0c1ae8c into metallb:main Oct 14, 2020
2 checks passed
@champtar champtar deleted the gratuitious2 branch Oct 14, 2020
Successfully merging this pull request may close these issues.

L2 mode doesn't resend gratuitous ARP when node is disconnected and reconnected
3 participants