Consul servers in different datacenters with same name causes wan gossip ring to fail. #158

Closed
keyneston opened this issue May 16, 2014 · 2 comments

Comments

@keyneston

Our servers tend to have names like fooX.dc01.justin.tv. When configuring Consul, this naturally mapped to a node name of 'fooX' in datacenter 'dc01'.

Because of the way our datacenters are set up, the same servers foo[1,3,5] in each datacenter happen to be our Consul masters. When they attempt to join the WAN gossip ring they hit a name conflict, since a server with the same node name (for example foo1) from another datacenter is already registered.

Logs (mangled to hide names and IPs):

May 16 13:30:27 foo5 consul: ==> Starting Consul agent...
May 16 13:30:27 foo5 consul: ==> Starting Consul agent RPC...
May 16 13:30:27 foo5 consul: ==> Joining cluster...
May 16 13:30:27 foo5 consul:     Join completed. Synced with 1 initial agents
May 16 13:30:27 foo5 consul: ==> Consul agent running!
May 16 13:30:27 foo5 consul:        Node name: 'foo5'
May 16 13:30:27 foo5 consul:       Datacenter: 'abc01'
May 16 13:30:27 foo5 consul:           Server: true (bootstrap: false)
May 16 13:30:27 foo5 consul:      Client Addr: 0.0.0.0 (HTTP: 8500, DNS: 8600, RPC: 8400)
May 16 13:30:27 foo5 consul:     Cluster Addr: 198.51.100.9 (LAN: 8301, WAN: 8302)
May 16 13:30:27 foo5 consul:
May 16 13:30:27 foo5 consul: ==> Log data will now stream in as it occurs:
May 16 13:30:27 foo5 consul:
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] raft: Node at 198.51.100.9:8300 [Follower] entering Follower state
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [WARN] memberlist: Binding to public address without encryption!
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: foo5 198.51.100.9
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: Attempting re-join to previously known node: foo1: 192.0.2.5:8301
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [WARN] memberlist: Binding to public address without encryption!
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: foo5 198.51.100.9
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: abc01, addr: 198.51.100.9:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: Attempting re-join to previously known node: foo9: 192.0.2.139:8302
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: foo1 192.0.2.5
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: Re-joined to previously known node: foo1: 192.0.2.5:8301
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] agent: (LAN) joining: [foo1.abc01.justin.tv foo5.abc01.justin.tv foo9.abc01.justin.tv foo13.abc01.justin.tv]
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: foo1 192.0.2.146
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: bar8 128.66.12.25
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: foo9 192.0.2.139
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: bar2 128.66.14.36
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: efg01, addr: 192.0.2.146:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: foo13 192.0.2.135
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: baz1 128.66.61.22
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: baz2 128.66.62.19
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: EventMemberJoin: bar1 128.66.17.23
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [ERR] memberlist: Conflicting address for foo5. Mine: 198.51.100.9:8302 Theirs: 199.9.254.149:8302
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [ERR] serf: Node name conflicts with another node at 199.9.254.149:8302. Names must be unique! (Resolution enabled: true)
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] serf: Re-joined to previously known node: foo9: 192.0.2.139:8302
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: xyz01, addr: 128.66.12.25:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: rst01, addr: 192.0.2.139:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: xyz01, addr: 128.66.14.36:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: rst01, addr: 192.0.2.135:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: xyz01, addr: 128.66.61.22:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: xyz01, addr: 128.66.62.19:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] consul: adding server for datacenter: xyz01, addr: 128.66.17.23:8300
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [INFO] agent: (LAN) joined: 1 Err: <nil>
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [ERR] serf: Failed to decode conflict query response: reflect: reflect.Value.SetString using unaddressable value
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [ERR] serf: Failed to decode conflict query response: reflect: reflect.Value.SetString using unaddressable value
May 16 13:30:27 foo5 consul:     2014/05/16 13:30:27 [ERR] serf: Failed to decode conflict query response: reflect: reflect.Value.SetString using unaddressable value
May 16 13:30:28 foo5 consul:     2014/05/16 13:30:28 [ERR] serf: Failed to decode conflict query response: reflect: reflect.Value.SetString using unaddressable value
May 16 13:30:28 foo5 consul:     2014/05/16 13:30:28 [ERR] serf: Failed to decode conflict query response: reflect: reflect.Value.SetString using unaddressable value
May 16 13:30:28 foo5 consul:     2014/05/16 13:30:28 [ERR] serf: Failed to decode conflict query response: reflect: reflect.Value.SetString using unaddressable value
May 16 13:30:28 foo5 consul:     2014/05/16 13:30:28 [ERR] serf: Failed to decode conflict query response: reflect: reflect.Value.SetString using unaddressable value
May 16 13:30:35 foo5 consul:     2014/05/16 13:30:35 [WARN] serf: minority in name conflict resolution, quiting [0 / 1]
May 16 13:30:35 foo5 consul:     2014/05/16 13:30:35 [WARN] Shutdown without a Leave
@armon
Member

armon commented May 16, 2014

Okay, I think the simple fix is to suffix the DC to the node name in the WAN pool. Should be a simple fix!
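
As a rough illustration of that idea (not the code from the actual commit), the Go sketch below appends the datacenter to the node name before the server advertises itself in the WAN pool. The helper name `wanNodeName` and the `.` separator are assumptions made for this example; the datacenter names come from the logs above, and the example assumes both datacenters run a server named foo5.

```go
package main

import "fmt"

// wanNodeName is a hypothetical helper (not taken from the Consul source).
// It builds the name a server would advertise in the WAN pool by appending
// its datacenter, so identical node names in different datacenters no
// longer collide. The "." separator is an assumption for this sketch.
func wanNodeName(nodeName, datacenter string) string {
	return fmt.Sprintf("%s.%s", nodeName, datacenter)
}

func main() {
	// Two datacenters that each have a server called foo5.
	for _, dc := range []string{"abc01", "efg01"} {
		fmt.Println(wanNodeName("foo5", dc))
	}
}
```

The LAN pool can keep the short name, since node names only need to be unique within a single gossip pool; only the WAN pool spans datacenters.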

@armon armon closed this as completed in 1611d98 May 16, 2014
@keyneston
Author

Thanks! That fixed it.

duckhan pushed a commit to duckhan/consul that referenced this issue Oct 24, 2021
* Make catalog package naming consistent with command flags

Rename:
* catalog/from-k8s -> catalog/to-consul
* catalog/from-consul -> catalog/to-k8s

* Refactor catalog/to-consul/resource_test.go

* Move common functionality into helper functions
* Remove duplicate tests
* Remove unnecessary time.Sleep calls

* Update catalog/to-consul/resource_test.go

Co-Authored-By: Luke Kysow <1034429+lkysow@users.noreply.github.com>

* Add docs for test functions and change function names

* remove `test` prefix from helper function
* pass in `t *testing.T` object to helper function
  and assert NoError there rather than in the test
* make test IPs look more like IPs rather than string letters.