Protokube with memberlist gossip DNS doesnt startup #9006

Closed
tvi opened this issue Apr 27, 2020 · 1 comment
tvi commented Apr 27, 2020

1. What kops version are you running? The command kops version will display
this information.

Version 1.17.0-beta.1 (git-32af4ed9b)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

"kubernetesVersion": "1.15.9",

3. What cloud provider are you using?
aws

4. What commands did you run? What is the simplest way to reproduce this issue?

kops create cluster \
--cloud=aws \
--dns=private \
--master-count=3  \
--master-zones=us-west-2a \
--name=XXX.k8s.local \
--topology=private \
--vpc=vpc-XXX \
--zones=us-west-2a

5. What happened after the commands executed?

Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: I0427 23:06:13.708915  153189 cluster.go:145] resolved peers to following addresses peers=10.49.48.93:4000,10.49.22.210:4000,10.49.61.82:4000,10.49.54.191:4000,10.49.60.236:4000,10.49.1.156:4000
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: I0427 23:06:13.710232  153189 cluster.go:157] setting advertise address explicitly addr=10.49.1.156 port=4000
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: panic: duplicate metrics collector registration attempted
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: goroutine 1 [running]:
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0xc00055c1e0, 0xc000834120, 0x9, 0x9)
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/pkg/mod/github.com/prometheus/client_golang@v0.9.2/prometheus/registry.go:391 +0xad
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: github.com/jacksontj/memberlistmesh.(*Peer).register(0xc0000f0c80, 0x38eeea0, 0xc00055c1e0, 0xc000782420, 0x1a)
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/pkg/mod/github.com/jacksontj/memberlistmesh@v0.0.0-20190905163944-93462b9d2bb7/cluster.go:391 +0x84b
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: github.com/jacksontj/memberlistmesh.Create(0x38eeea0, 0xc00055c1e0, 0xc0004e8510, 0xc, 0x0, 0x0, 0xc000048400, 0x6, 0x8, 0x40da01, ...)
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/pkg/mod/github.com/jacksontj/memberlistmesh@v0.0.0-20190905163944-93462b9d2bb7/cluster.go:177 +0x6ef
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: k8s.io/kops/protokube/pkg/gossip/memberlist.NewMemberlistGossiper(0xc0004e8510, 0xc, 0x3381728, 0x3, 0xc000042ea0, 0x13, 0x51b7110, 0x0, 0x0, 0x38bc0e0, ...)
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/src/k8s.io/kops/protokube/pkg/gossip/memberlist/gossip.go:69 +0x209
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: k8s.io/kops/protokube/pkg/gossip/memberlist.init.0.func1(0xc0004e8510, 0xc, 0x3381728, 0x3, 0xc000042ea0, 0x13, 0x51b7110, 0x0, 0x0, 0x38bc0e0, ...)
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/src/k8s.io/kops/protokube/pkg/gossip/memberlist/gossip.go:35 +0xb4
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: k8s.io/kops/protokube/pkg/gossip.GetGossipState(0x338addf, 0xa, 0xc0004e8510, 0xc, 0x3381728, 0x3, 0xc000042ea0, 0x13, 0x51b7110, 0x0, ...)
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/src/k8s.io/kops/protokube/pkg/gossip/gossip.go:90 +0x147
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: main.run(0x38bc7a0, 0xc000010018)
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/src/k8s.io/kops/protokube/cmd/protokube/main.go:324 +0x29c0
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]: main.main()
Apr 27 16:06:13 ip-10-49-1-156 docker[153137]:         /go/src/k8s.io/kops/protokube/cmd/protokube/main.go:57 +0xba
Apr 27 16:06:13 ip-10-49-1-156 dockerd[3409]: time="2020-04-27T16:06:13-07:00" level=info msg="shim reaped" id=c9e539967d888b1265d8002d77da356625f786859555bde8b4032888a6cc3010

6. What did you expect to happen?

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

WIP

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?

I am still working on debugging this issue. Just pasting the error so it can be referenced.
Likely related to #8771

jacksontj commented

We have actually figured this out internally, but explaining here for posterity.

First off, the panic itself is due to duplicate metrics registration. The current memberlistmesh library registers a number of metrics (a carryover from the old alertmanager names, etc.), and these can't be registered twice (which should be fixed: jacksontj/memberlistmesh#1).

Lack of support for duplicate registration shouldn't be an issue, since we only want one gossip implementation (the migration was from mesh -> memberlist). So why is it trying to register twice?

We had already migrated our clusters to memberlistmesh, so our primaryGossip is set to memberlistmesh. That means the upstream config change to include memberlistmesh as a secondary by default causes issues! We had upgraded our configuration but omitted the secondary configuration, as we weren't using it anymore. As a result the new default gets applied, and kops tries to start both a primary and a secondary memberlistmesh (on the same ports!). Unfortunately, the flagsbuilder library excludes empty/default strings by default, so we can't clear an option.

I've created #9008 to address this issue, and with that you can disable the secondary by setting protocol to "", so something like:

gossipConfig:
    ....
    secondary:
        protocol: ""
...

dnsControllerGossipConfig:
    ....
    secondary:
        protocol: ""
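
The reason an empty string could not previously clear the default can be sketched in Go as follows. This is an assumption-laden illustration, not the kops flagsbuilder code or the actual change in #9008: the function names and the flag name are hypothetical. It contrasts a builder that drops empty strings with one that emits any value that was explicitly set.

```go
package main

import "fmt"

// buildFlagSkipEmpty models a builder that omits empty strings. An explicit ""
// is indistinguishable from "not set", so the flag is dropped and the
// receiving program falls back to its own default.
func buildFlagSkipEmpty(name, value string) []string {
	if value == "" {
		return nil
	}
	return []string{fmt.Sprintf("--%s=%s", name, value)}
}

// buildFlagKeepSet models a builder that emits the flag whenever a value was
// set, even to "". A nil pointer means "not set".
func buildFlagKeepSet(name string, value *string) []string {
	if value == nil {
		return nil
	}
	return []string{fmt.Sprintf("--%s=%s", name, *value)}
}

func main() {
	empty := ""
	// Hypothetical flag name, used only for illustration.
	const flag = "gossip-protocol-secondary"

	fmt.Printf("skip-empty, set to \"\": %q\n", buildFlagSkipEmpty(flag, empty))
	fmt.Printf("keep-set,   set to \"\": %q\n", buildFlagKeepSet(flag, &empty))
	fmt.Printf("keep-set,   not set:   %q\n", buildFlagKeepSet(flag, nil))
}
```

In the first case the flag never reaches the process, so the default secondary protocol still applies. In the second the empty value is passed down and can override the default.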

jacksontj added a commit to jacksontj/kops that referenced this issue Apr 28, 2020
This way if you have the value set in config (even as "") it'll get
passed down to allow you to override the default config

Related to kubernetes#9006
tvi closed this as completed May 10, 2020