Node registration failure due to reserved node names #107

Closed
tsmgeek opened this Issue Jan 31, 2019 · 1 comment

tsmgeek commented Jan 31, 2019

I am running the latest Helm chart (0.5.0), but with the Consul images upgraded to the latest available at the time of this ticket.

2019/01/31 16:02:30 [WARN] consul.fsm: EnsureRegistration failed: failed inserting node: Error while renaming Node ID: "7ab8005a-f263-7456-9625-3027875e7142": Node name ip-10-100-136-56.eu-west-1.compute.internal is reserved by node 0a6ad553-f1f2-e86e-83fe-faea5d1b429c with name ip-10-100-136-56.eu-west-1.compute.internal

I noticed that some of my clients started throwing this error, which led me to the ticket linked below.
hashicorp/consul#4741

There is no guarantee that a StatefulSet pod on EKS will remain on the same node. It can move to another node within the same AZ, in which case the node data in the PVC will no longer be correct. It may be worth having the option below enabled by default within the chart.

-disable-host-node-id - Setting this to true will prevent Consul from using information from the host to generate a deterministic node ID, and will instead generate a random node ID which will be persisted in the data directory. This is useful when running multiple Consul agents on the same host for testing. This defaults to false in Consul prior to version 0.8.5 and in 0.8.5 and later defaults to true, so you must opt-in for host-based IDs. Host-based IDs are generated using https://github.com/shirou/gopsutil/tree/master/host, which is shared with HashiCorp's Nomad, so if you opt-in to host-based IDs then Consul and Nomad will use information on the host to automatically assign the same ID in both systems.
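As a rough illustration of the two modes the excerpt above describes, here is a minimal Python sketch. It is not Consul's implementation: the function name `choose_node_id`, the `node-id` file name, and the SHA-512 derivation are assumptions made for this example, and the host-based branch is simplified to recompute the ID on every call.

```python
import hashlib
import uuid
from pathlib import Path


def choose_node_id(data_dir: Path, host_id: str, disable_host_node_id: bool) -> str:
    """Toy model of node ID selection (hypothetical, not Consul's code)."""
    if not disable_host_node_id:
        # Host-based mode: derive a deterministic ID from host information,
        # so the same host always yields the same ID.
        digest = hashlib.sha512(host_id.encode("utf-8")).digest()
        return str(uuid.UUID(bytes=digest[:16]))

    # Random mode: reuse the ID persisted in the data directory if present,
    # otherwise generate a random one and persist it.
    id_file = data_dir / "node-id"  # file name is an assumption
    if id_file.exists():
        return id_file.read_text().strip()
    node_id = str(uuid.uuid4())
    data_dir.mkdir(parents=True, exist_ok=True)
    id_file.write_text(node_id)
    return node_id
```

In this model a random ID follows the data directory (for example a PVC) wherever it is mounted, while a host-based ID follows the machine.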

Member

adilyse commented Mar 1, 2019

Hi @tsmgeek,

In this case, the disable-host-node-id setting won't solve this issue in Kubernetes. It currently defaults to true, which is what causes this issue, but setting it to false would explicitly tie the node ID to the host machine. That means a pod rescheduled to a different node would also throw this error.
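The conflict behind the logged error can be sketched with a toy catalog in Python. This is only a model of the check implied by the log message (a node name held by one node ID cannot be claimed by a different ID). `ToyCatalog` and `NameReservedError` are hypothetical names, not part of Consul.

```python
class NameReservedError(Exception):
    """Raised when a node name is already held by a different node ID."""


class ToyCatalog:
    """Hypothetical stand-in for the catalog's name reservation check."""

    def __init__(self):
        self.name_by_id = {}  # node ID -> node name

    def register(self, node_id, name):
        # Reject the registration if another node ID already owns this name.
        for other_id, other_name in self.name_by_id.items():
            if other_name == name and other_id != node_id:
                raise NameReservedError(
                    f"Node name {name} is reserved by node {other_id}"
                )
        self.name_by_id[node_id] = name


catalog = ToyCatalog()
name = "ip-10-100-136-56.eu-west-1.compute.internal"

# The original agent registers first and reserves the name.
catalog.register("0a6ad553-f1f2-e86e-83fe-faea5d1b429c", name)

# An agent with a different node ID but the same name is then rejected.
try:
    catalog.register("7ab8005a-f263-7456-9625-3027875e7142", name)
except NameReservedError as err:
    print(err)
```

Under this model, any change that alters either the ID or the name independently of the other (a regenerated random ID, or a pod moving to another host) ends in the same rejection.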

Given this, there is currently no mechanism for the Consul/Kubernetes integration to resolve this without it first being fixed in Consul.

Since the issue is already being actively tracked in the Consul repository and no additional work is needed in the Helm chart to support a fix, I'm going to close this. For continued updates, please follow hashicorp/consul#4741.

@adilyse adilyse closed this Mar 1, 2019
