Use hashicorp/memberlist to speedup dead node detection #527

Merged: 3 commits, Mar 11, 2020

champtar
Contributor

This is an early POC to fix #298.
It needs a lot of cleanup, but without any tuning I get failover in around 3s \o/
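
The idea, roughly: each speaker subscribes to cluster membership events and forces a service resync whenever a node joins or leaves, instead of waiting for Kubernetes to notice the dead node. The sketch below uses only the standard library. `NodeEvent`, `watchMembership`, and the `forceSync` callback are hypothetical names standing in for the memberlist event channel and the MetalLB resync hook; this is not the code from this PR.

```go
package main

import "fmt"

// EventKind is a hypothetical stand-in for a membership event type.
type EventKind int

const (
	NodeJoin EventKind = iota
	NodeLeave
)

// NodeEvent is a hypothetical, simplified membership event.
type NodeEvent struct {
	Kind EventKind
	Name string
	Addr string
}

// watchMembership drains events until the channel is closed and calls
// forceSync once per event. It returns the number of resyncs triggered.
func watchMembership(events <-chan NodeEvent, forceSync func(NodeEvent)) int {
	n := 0
	for ev := range events {
		forceSync(ev)
		n++
	}
	return n
}

func main() {
	events := make(chan NodeEvent, 2)
	events <- NodeEvent{Kind: NodeJoin, Name: "node-a", Addr: "192.0.2.1"}
	events <- NodeEvent{Kind: NodeLeave, Name: "node-a", Addr: "192.0.2.1"}
	close(events)

	syncs := watchMembership(events, func(ev NodeEvent) {
		fmt.Printf("node event kind=%d name=%s: forcing resync\n", ev.Kind, ev.Name)
	})
	fmt.Println("resyncs:", syncs)
}
```

In a real speaker the loop would run in its own goroutine for the lifetime of the process; the closed channel here only exists to make the example terminate.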

@schaze

schaze commented Feb 16, 2020

Great idea! I really think using Kubernetes to do the failover handling is not a practical approach. Do you have any indication of how this will scale with e.g. 50 or 80 nodes?

@champtar
Contributor Author

According to this presentation, https://www.hashicorp.com/resources/everybody-talks-gossip-serf-memberlist-raft-swim-hashicorp-consul, HashiCorp's goal is/was 5k nodes, so even if we don't tune the memberlist configuration properly, I think we should be fine with 100 nodes.
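
A rough way to see why cluster size should matter little: in SWIM-style gossip, detection timeouts usually grow with the logarithm of the member count rather than linearly. The sketch below models that with a `log10` scale factor. The formula, the multiplier of 4, and the 1s probe interval are assumptions chosen for illustration, not values taken from this PR or guaranteed by any memberlist version.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// suspicionTimeout models a timeout that grows with log10 of the cluster
// size. The formula is an assumption for illustration, not a guarantee of
// how any particular memberlist version behaves.
func suspicionTimeout(mult, nodes int, probeInterval time.Duration) time.Duration {
	scale := math.Max(1.0, math.Log10(math.Max(1.0, float64(nodes))))
	// Scale via thousandths to keep some precision in integer Duration math.
	return time.Duration(mult) * time.Duration(scale*1000) * probeInterval / 1000
}

func main() {
	for _, n := range []int{4, 50, 80, 100, 5000} {
		fmt.Printf("%5d nodes -> %v\n", n, suspicionTimeout(4, n, time.Second))
	}
}
```

Under this model, going from a handful of nodes to thousands stretches the timeout by a small constant factor, which is consistent with the expectation that 50 to 100 nodes should be fine.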

@champtar
Contributor Author

@danderson when you have time to review, I'd welcome some guidance on how you would like to see this code ;)

@danderson
Contributor

TL;DR: I love it. But I'm trying to onboard more maintainers aside from me, and @daxmc99 wanted to do the review on this, so I'm going to let them do the first pass.

Thank you for making this!

@champtar champtar requested a review from daxmc99 February 18, 2020 23:52
@champtar champtar changed the title [POC] use hashicorp/memberlist to speedup dead node detection Use hashicorp/memberlist to speedup dead node detection Feb 25, 2020
@champtar
Contributor Author

@daxmc99 I've improved the PR a bit. I just need to fix the MemberList logs to use the MetalLB logger, but please review when you can.

@champtar champtar force-pushed the failfast branch 2 times, most recently from 1445f9f to bdfb01d Compare February 26, 2020 23:28
@champtar
Contributor Author

I've fixed the logs, here is an example:

{"branch":"","caller":"main.go:80","commit":"","msg":"MetalLB speaker starting (no version or build info)","ts":"2020-02-26T23:30:08.353241621Z","version":""}
{"caller":"main.go:165","msg":"Node event","node addr":"10.10.52.141","node event":0,"node name":"etienne-ks141","ts":"2020-02-26T23:30:08.360848514Z"}
{"caller":"main.go:166","msg":"Call Force Sync","ts":"2020-02-26T23:30:08.361032071Z"}
{"caller":"announcer.go:103","event":"createARPResponder","interface":"eth0","msg":"created ARP responder for interface","ts":"2020-02-26T23:30:08.362216201Z"}
{"caller":"announcer.go:112","event":"createNDPResponder","interface":"eth0","msg":"created NDP responder for interface","ts":"2020-02-26T23:30:08.362750429Z"}
{"caller":"announcer.go:103","event":"createARPResponder","interface":"cali108c69a7a5b","msg":"created ARP responder for interface","ts":"2020-02-26T23:30:08.363544109Z"}
{"caller":"announcer.go:112","event":"createNDPResponder","interface":"cali108c69a7a5b","msg":"created NDP responder for interface","ts":"2020-02-26T23:30:08.363721254Z"}
{"caller":"announcer.go:103","event":"createARPResponder","interface":"cali62f82a7db1b","msg":"created ARP responder for interface","ts":"2020-02-26T23:30:08.364336672Z"}
{"caller":"announcer.go:112","event":"createNDPResponder","interface":"cali62f82a7db1b","msg":"created NDP responder for interface","ts":"2020-02-26T23:30:08.364634205Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 10.10.52.144:7946","ts":"2020-02-26T23:30:08.403798539Z"}
{"caller":"main.go:165","msg":"Node event","node addr":"10.10.52.142","node event":0,"node name":"etienne-ks142","ts":"2020-02-26T23:30:08.405691549Z"}
{"caller":"main.go:166","msg":"Call Force Sync","ts":"2020-02-26T23:30:08.40577009Z"}
{"caller":"main.go:165","msg":"Node event","node addr":"10.10.52.144","node event":0,"node name":"etienne-ks144","ts":"2020-02-26T23:30:08.405783769Z"}
{"caller":"main.go:166","msg":"Call Force Sync","ts":"2020-02-26T23:30:08.40579436Z"}
{"caller":"main.go:165","msg":"Node event","node addr":"10.10.52.143","node event":0,"node name":"etienne-ks143","ts":"2020-02-26T23:30:08.405802063Z"}
{"caller":"main.go:166","msg":"Call Force Sync","ts":"2020-02-26T23:30:08.40581302Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 10.10.52.141:7946","ts":"2020-02-26T23:30:08.405850847Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:210: [DEBUG] memberlist: Stream connection from=10.10.52.141:43598","ts":"2020-02-26T23:30:08.406190368Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 10.10.52.142:7946","ts":"2020-02-26T23:30:08.407089632Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 10.10.52.143:7946","ts":"2020-02-26T23:30:08.444129524Z"}
{"Memberlist nb join":4,"caller":"main.go:150","error ?":null,"ts":"2020-02-26T23:30:08.446427959Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"default/kubernetes","ts":"2020-02-26T23:30:08.74665996Z"}
{"caller":"main.go:247","event":"endUpdate","msg":"end of service update","service":"default/kubernetes","ts":"2020-02-26T23:30:08.746759896Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"kube-system/coredns","ts":"2020-02-26T23:30:08.746778824Z"}
{"caller":"main.go:247","event":"endUpdate","msg":"end of service update","service":"kube-system/coredns","ts":"2020-02-26T23:30:08.746789656Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"kube-system/kubernetes-dashboard","ts":"2020-02-26T23:30:08.746803624Z"}
{"caller":"main.go:247","event":"endUpdate","msg":"end of service update","service":"kube-system/kubernetes-dashboard","ts":"2020-02-26T23:30:08.746813531Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"test-echo/test-lb","ts":"2020-02-26T23:30:08.746824139Z"}
{"caller":"main.go:251","event":"noConfig","msg":"not processing, still waiting for config","service":"test-echo/test-lb","ts":"2020-02-26T23:30:08.746833965Z"}
{"caller":"main.go:252","event":"endUpdate","msg":"end of service update","service":"test-echo/test-lb","ts":"2020-02-26T23:30:08.74684448Z"}
{"caller":"main.go:356","configmap":"metallb-system/config","event":"startUpdate","msg":"start of config update","ts":"2020-02-26T23:30:08.748325546Z"}
{"caller":"main.go:380","configmap":"metallb-system/config","event":"endUpdate","msg":"end of config update","ts":"2020-02-26T23:30:08.748500961Z"}
{"caller":"k8s.go:395","configmap":"metallb-system/config","event":"configLoaded","msg":"config (re)loaded","ts":"2020-02-26T23:30:08.748526633Z"}
{"caller":"bgp_controller.go:285","event":"nodeLabelsChanged","msg":"Node labels changed, resyncing BGP peers","ts":"2020-02-26T23:30:08.748574971Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"test-echo/test-lb","ts":"2020-02-26T23:30:08.753693673Z"}
{"caller":"main.go:294","event":"endUpdate","msg":"end of service update","service":"test-echo/test-lb","ts":"2020-02-26T23:30:08.753772423Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"default/kubernetes","ts":"2020-02-26T23:30:08.753792833Z"}
{"caller":"main.go:247","event":"endUpdate","msg":"end of service update","service":"default/kubernetes","ts":"2020-02-26T23:30:08.753803373Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"kube-system/coredns","ts":"2020-02-26T23:30:08.753818595Z"}
{"caller":"main.go:247","event":"endUpdate","msg":"end of service update","service":"kube-system/coredns","ts":"2020-02-26T23:30:08.753829001Z"}
{"caller":"main.go:243","event":"startUpdate","msg":"start of service update","service":"kube-system/kubernetes-dashboard","ts":"2020-02-26T23:30:08.753839864Z"}
{"caller":"main.go:247","event":"endUpdate","msg":"end of service update","service":"kube-system/kubernetes-dashboard","ts":"2020-02-26T23:30:08.753849696Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:210: [DEBUG] memberlist: Stream connection from=10.10.52.144:59186","ts":"2020-02-26T23:30:51.954839672Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 10.10.52.144:7946","ts":"2020-02-26T23:30:57.854936649Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:210: [DEBUG] memberlist: Stream connection from=10.10.52.144:59290","ts":"2020-02-26T23:31:21.957591719Z"}
{"caller":"main.go:178","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 10.10.52.144:7946","ts":"2020-02-26T23:31:27.857257754Z"}
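
Since the memberlist messages are now wrapped in the same JSON format and tagged with `"component":"MemberList"`, they can be filtered out of the speaker output mechanically. A minimal sketch, assuming one JSON object per line as in the sample above; the `logLine` type and `memberlistMsg` helper are hypothetical names, not part of MetalLB:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logLine holds the fields of interest from one JSON log line.
type logLine struct {
	Caller    string `json:"caller"`
	Component string `json:"component"`
	Msg       string `json:"msg"`
	TS        string `json:"ts"`
}

// memberlistMsg returns the message and true if the line is valid JSON
// tagged with component "MemberList"; otherwise it returns "", false.
func memberlistMsg(line string) (string, bool) {
	var l logLine
	if err := json.Unmarshal([]byte(line), &l); err != nil {
		return "", false
	}
	if l.Component != "MemberList" {
		return "", false
	}
	return l.Msg, true
}

func main() {
	lines := []string{
		`{"caller":"main.go:166","msg":"Call Force Sync","ts":"2020-02-26T23:30:08.361032071Z"}`,
		`{"caller":"main.go:178","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 10.10.52.144:7946","ts":"2020-02-26T23:30:08.403798539Z"}`,
	}
	for _, line := range lines {
		if msg, ok := memberlistMsg(line); ok {
			fmt.Println(msg)
		}
	}
}
```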

@champtar
Contributor Author

Ok I think this is ready

Contributor

@daxmc99 daxmc99 left a comment

A fairly high-level review:
This looks great and the code makes sense. I'm still getting my bearings on MetalLB, so I would like to wait until Saturday to give this the proper test drive it needs.

@champtar
Contributor Author

@daxmc99 no problem. I just realised I haven't fixed the tests; I'll try to do that before this weekend.

@champtar
Contributor Author

I haven't improved or added tests, but I've fixed the existing ones.

Contributor

@daxmc99 daxmc99 left a comment

This is really great! I also noticed a reduced dead node detection time in my testing!


@mskrocki mskrocki left a comment

@champtar I tried this change in my env, and I am running into:
standard_init_linux.go:211: exec user process caused "permission denied"
from the controller container.

@champtar
Contributor Author

champtar commented Mar 3, 2020

@mskrocki I haven't even tested the new controller myself, as almost nothing changes in it.
Can you check whether the binary was built with "CGO_ENABLED=0" (check with ldd)?
Otherwise, can you share your containers?

@mskrocki

mskrocki commented Mar 3, 2020

I just built the controller from HEAD and I am running into the same issue, so it is my env. Sorry for the confusion.

@champtar
Contributor Author

champtar commented Mar 3, 2020

@mskrocki no problem. Do check whether you built a fully static Go binary, as I remember having a similar issue (you compile and link on a glibc system, then copy the binary into a musl container).
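
For anyone without `ldd` at hand, the same check can be approximated in Go with the standard `debug/elf` package: a statically linked binary normally has no `PT_INTERP` segment and imports no shared libraries. This is only a sketch under that assumption; `inspect` and `describeLinkage` are hypothetical helper names.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// describeLinkage turns the two ELF facts gathered below into a verdict.
func describeLinkage(hasInterpreter bool, libs []string) string {
	if !hasInterpreter && len(libs) == 0 {
		return "static"
	}
	return "dynamic"
}

// inspect opens an ELF binary and reports whether it requests a dynamic
// loader (PT_INTERP segment) and which shared libraries it imports.
func inspect(path string) (bool, []string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, nil, err
	}
	defer f.Close()

	hasInterp := false
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			hasInterp = true
		}
	}
	libs, err := f.ImportedLibraries()
	if err != nil {
		return false, nil, err
	}
	return hasInterp, libs, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: linkage <path-to-binary>")
		return
	}
	hasInterp, libs, err := inspect(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(describeLinkage(hasInterp, libs), libs)
}
```

A "dynamic" verdict that lists a glibc library would match the glibc-to-musl mismatch described above.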

@champtar
Contributor Author

champtar commented Mar 3, 2020

@danderson: @daxmc99 approved, so let me know if you want any changes.

@danderson
Contributor

The only thing I would like to change (not necessarily in this PR), is the static encryption key in the manifest. I have two problems with this:

  • ~Nobody reads the manifest before applying it, so my expectation is that ~all MetalLB deployments will run with the same "secretkeytobechanged" key.
  • memberlist supports online key rotation, and I would love to use that to rotate keys autonomously.

In general, I think a good way to do this would be:

  • give controller permissions to manage a k8s Secret, and give speakers permissions to read the secret.
  • Controller automatically generates random keys and updates the Secret periodically. Speakers see this change, and trigger a key rotation via the memberlist API.

That way, the security aspect of memberlist would Just Work, with no configuration required.
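
The rotation flow proposed above boils down to three phases: install the new key everywhere, switch to it as the primary, then retire the old one. The sketch below models that ordering with a toy in-memory keyring. The `keyring` type and its methods are hypothetical stand-ins written for this example; memberlist's own keyring API is not used here, and propagating each phase to every speaker is left out.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// keyring is a hypothetical stand-in for a gossip keyring: every key can
// decrypt incoming traffic, but only the primary encrypts outgoing traffic.
type keyring struct {
	keys    [][]byte
	primary []byte
}

// add installs a key for decryption without making it primary.
func (k *keyring) add(key []byte) {
	for _, existing := range k.keys {
		if bytes.Equal(existing, key) {
			return
		}
	}
	k.keys = append(k.keys, key)
}

// use promotes an already installed key to primary.
func (k *keyring) use(key []byte) error {
	for _, existing := range k.keys {
		if bytes.Equal(existing, key) {
			k.primary = existing
			return nil
		}
	}
	return errors.New("key not installed")
}

// remove drops a non-primary key.
func (k *keyring) remove(key []byte) error {
	if bytes.Equal(k.primary, key) {
		return errors.New("refusing to remove the primary key")
	}
	kept := k.keys[:0]
	for _, existing := range k.keys {
		if !bytes.Equal(existing, key) {
			kept = append(kept, existing)
		}
	}
	k.keys = kept
	return nil
}

// rotate performs the three phases in order: install, promote, retire.
func rotate(k *keyring, next []byte) error {
	old := k.primary
	k.add(next)
	if err := k.use(next); err != nil {
		return err
	}
	if old == nil || bytes.Equal(old, next) {
		return nil
	}
	return k.remove(old)
}

func main() {
	k := &keyring{}
	oldKey := []byte("0123456789abcdef") // placeholder 16-byte keys
	newKey := []byte("fedcba9876543210")
	for _, key := range [][]byte{oldKey, newKey} {
		if err := rotate(k, key); err != nil {
			fmt.Println("rotation failed:", err)
			return
		}
		fmt.Printf("primary=%q installed=%d\n", k.primary, len(k.keys))
	}
}
```

In a cluster, each phase would have to reach all speakers before the next one starts; otherwise a node still encrypting with the old key could be cut off.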

But that said, this is a huge improvement to failover time, and I don't want to block releasing it on these changes. So instead, can you tweak the installation docs (in the website subdir) to mention that operators should change the manifest secret key before deploying?

@champtar
Contributor Author

champtar commented Mar 4, 2020

@danderson I 100% agree on the "just work" part. I wanted to inject the GUID of the speaker DaemonSet via the downward API, but that's not supported.
Looking at it once more, key rotation is indeed supported, but that was not at all clear to me on my first read of the docs.
My only concern with online key rotation is that on a system with an unstable API server (temporary outage), the key rotation could make the whole system worse, but I think I'll implement it at some point.

@champtar
Contributor Author

champtar commented Mar 4, 2020

Just need to update the kustomize part now

@champtar champtar force-pushed the failfast branch 3 times, most recently from f37ae7d to 6f8d27e Compare March 5, 2020 03:08
@champtar
Contributor Author

champtar commented Mar 5, 2020

I haven't tested the kustomize part. Can someone tell me if it looks OK?

champtar added 2 commits March 5, 2020 10:16
Signed-off-by: Etienne Champetier <echampetier@anevia.com>
… log

Signed-off-by: Etienne Champetier <echampetier@anevia.com>
@champtar champtar force-pushed the failfast branch 2 times, most recently from 76519ed to 7e5c9e7 Compare March 5, 2020 16:24
By default MemberList is disabled, so new behaviour is opt-in on upgrade

This fixes metallb#298

Signed-off-by: Etienne Champetier <echampetier@anevia.com>
@champtar
Contributor Author

champtar commented Mar 5, 2020

This worked

namespace: metallb-system

resources:
  - metallb.yaml

configMapGenerator:
- name: config
  files:
    - configs/config

secretGenerator:
- name: memberlist
  files:
    - configs/secretkey

generatorOptions:
  disableNameSuffixHash: true

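The `secretGenerator` entry above reads the key from `configs/secretkey`, so that file has to be filled with something random before deploying. A minimal sketch for producing such a value; the 32-byte length and the base64 encoding are assumptions made for this example, and `generateSecret` is a hypothetical helper:

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// generateSecret returns n random bytes encoded as standard base64 text.
func generateSecret(n int) (string, error) {
	raw := make([]byte, n)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

func main() {
	secret, err := generateSecret(32)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Redirect stdout to the file referenced by secretGenerator,
	// for example: go run . > configs/secretkey
	fmt.Print(secret)
}
```
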
@danderson I think this should be good now

@champtar
Contributor Author

champtar commented Mar 9, 2020

@danderson friendly ping :)
If you don't have time to do any testing, maybe just release a beta/rc so I can get more feedback from users.

@champtar
Contributor Author

@danderson @daxmc99 pretty please :)

@danderson danderson merged commit 3c5fc63 into metallb:main Mar 11, 2020
@danderson
Contributor

Merged, thank you! If @daxmc99 has other feedback, we can do it in a followup change.

We've been talking about scheduling a release in #metallb-dev on the k8s slack, if you want to drop in there.

johananl added a commit to kinvolk/metallb that referenced this pull request Mar 21, 2020
In metallb#527, HashiCorp memberlist support was added, which requires a
secret to be present in the metallb-system namespace. We need to
create this secret in the dev environment for speakers to converge.
rata pushed a commit that referenced this pull request Mar 30, 2020
In #527, HashiCorp memberlist support was added, which requires a
secret to be present in the metallb-system namespace. We need to
create this secret in the dev environment for speakers to converge.
@rata rata mentioned this pull request Jun 15, 2020
rata pushed a commit that referenced this pull request Jun 27, 2020
In #527, HashiCorp memberlist support was added, which requires a
secret to be present in the metallb-system namespace. We need to
create this secret in the dev environment for speakers to converge.

(cherry picked from commit 500e0d5)
@champtar champtar deleted the failfast branch October 2, 2020 21:47
Development

Successfully merging this pull request may close these issues.

Failover time very high in layer2 mode