Wan federation doesn't work with FQDN node name #12614

Open
rootmout opened this issue Mar 24, 2022 · 1 comment
Labels

  • needs-discussion: Topic needs discussion with the larger Consul maintainers before committing to for a release
  • theme/federation-usability: Anything related to Federation
  • type/bug: Feature does not function as expected

Comments

@rootmout

Related Topic on Hashicorp Discuss

Connection failure in federation between VMs (primary) and kubernetes

Overview of the Issue

I wanted to federate a Consul cluster running in k3s (named dc2) with a Consul cluster running on VMs (named dc1).
The nodes of the VM cluster have node names in FQDN format (e.g. ceph-1.hirsingue.infra.mydomain.fr), which causes the following error in the logs of the dc2 Consul server and prevents the federation from completing:

2022-03-22T08:42:47.170Z [INFO]  agent: (WAN) joined: number_of_nodes=1
2022-03-22T08:42:47.170Z [INFO]  agent: Join cluster completed. Synced with initial agents: cluster=WAN num_agents=1
2022-03-22T08:42:47.170Z [INFO]  agent.server: Handled event for server in area: event=member-join server=ceph-2.hirsingue.infra.mydomain.fr.dc1 area=wan
2022-03-22T08:42:47.657Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 192.168.11.11:8302: node name does not encode a datacenter: ceph-2.hirsingue.infra.mydomain.fr.dc1
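
The error message suggests that the WAN member name is the node name with the datacenter appended (`<node>.<dc>`), and that the receiving side expects exactly one "." when splitting it back apart. Below is a minimal Python sketch of that assumed behavior. It is not Consul's actual code, and `split_wan_name` is a hypothetical helper used only for illustration:

```python
from typing import Tuple


def split_wan_name(wan_name: str) -> Tuple[str, str]:
    """Split '<node>.<dc>' under the ASSUMPTION that exactly one '.' is allowed."""
    parts = wan_name.split(".")
    if len(parts) != 2:
        # Same wording as the log line above; the check itself is an assumption.
        raise ValueError(f"node name does not encode a datacenter: {wan_name}")
    return parts[0], parts[1]


# A short node name splits cleanly into (node, datacenter).
print(split_wan_name("ceph-2.dc1"))  # ('ceph-2', 'dc1')

# An FQDN node name yields more than two parts and is rejected.
try:
    split_wan_name("ceph-2.hirsingue.infra.mydomain.fr.dc1")
except ValueError as err:
    print(err)
```

Under this assumption, any dot inside the node name itself makes the combined name ambiguous, which matches the failure seen with ceph-2.hirsingue.infra.mydomain.fr.dc1.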

Reproduction Steps

This should be similar for any cluster type, but in my case:

  • Have a cluster of VMs up and running with node names in FQDN format.
    e.g.: ceph-1.hirsingue.infra.mydomain.fr
  • Create and distribute TLS certificates on the VM servers.
    e.g.: consul tls cert create -server -dc dc1 -node ceph-1.hirsingue.infra.mydomain.fr
  • Follow the documentation "Federation between VMs (primary) and Kubernetes" to federate a single-node k3s cluster.

Consul info for dc1 and dc2 server

DC2 Server info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 0
build:
	prerelease = 
	revision = 37c7d06b
	version = 1.11.2
consul:
	acl = enabled
	bootstrap = true
	known_datacenters = 2
	leader = true
	leader_addr = 10.42.0.89:8300
	server = true
raft:
	applied_index = 18196
	commit_index = 18196
	fsm_pending = 0
	last_contact = 0
	last_log_index = 18196
	last_log_term = 29
	last_snapshot_index = 16385
	last_snapshot_term = 29
	latest_configuration = [{Suffrage:Voter ID:c6226cd1-b686-5e17-cf23-22bbc5d42e06 Address:10.42.0.89:8300}]
	latest_configuration_index = 0
	num_peers = 0
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Leader
	term = 29
runtime:
	arch = amd64
	cpu_count = 8
	goroutines = 185
	max_procs = 8
	os = linux
	version = go1.17.5
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 29
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 85
	members = 2
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 2533
	members = 4
	query_queue = 0
	query_time = 1
DC1 Server info
root@ceph-1:~ # consul info 
agent:
	check_monitors = 0
	check_ttls = 1
	checks = 2
	services = 2
build:
	prerelease = 
	revision = 37c7d06b
	version = 1.11.2
consul:
	acl = enabled
	bootstrap = false
	known_datacenters = 2
	leader = false
	leader_addr = 192.168.11.12:8300
	server = true
raft:
	applied_index = 445310
	commit_index = 445310
	fsm_pending = 0
	last_contact = 13.990512ms
	last_log_index = 445310
	last_log_term = 7102
	last_snapshot_index = 442448
	last_snapshot_term = 7102
	latest_configuration = [{Suffrage:Voter ID:d893d4cf-d43f-13e2-38b4-ee593d86d829 Address:192.168.11.11:8300} {Suffrage:Voter ID:0781d8ae-72a5-2a4b-9892-2aba24ea38f7 Address:192.168.11.12:8300} {Suffrage:Voter ID:23de9416-f021-b420-a9ef-6dd73313c54b Address:192.168.11.10:8300}]
	latest_configuration_index = 0
	num_peers = 2
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Follower
	term = 7102
runtime:
	arch = amd64
	cpu_count = 8
	goroutines = 177
	max_procs = 8
	os = linux
	version = go1.17.5
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 902
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 14504
	members = 3
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 2533
	members = 4
	query_queue = 0
	query_time = 1

Operating system and Environment details

VMs OS:

Description:	Ubuntu 21.04
Release:	21.04

Thanks for your help!

@Amier3 added the theme/federation-usability and type/bug labels on Mar 25, 2022
@Amier3
Contributor

Amier3 commented Apr 12, 2022

Hey @rootmout

Thanks for bringing this to our attention. It looks like this isn't exactly a conventional "bug". Node names and datacenter names actually can't contain the "." (dot) character, which is why you're experiencing this issue. The rule isn't strictly enforced, for backwards-compatibility reasons, and in most cases you won't experience any issues using the "." character. This just happens to be one of the cases where you will.

I'll leave this open because there's a lot of room to either:

  • Start validating in certain areas like this, or
  • Emit a warning in certain cases to make this clearer
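
Either option comes down to checking the configured names for the "." character. Here is a rough Python sketch of what such a check might look like. `dot_warning` is a hypothetical helper and nothing here corresponds to an existing Consul function or option:

```python
from typing import Optional


def dot_warning(kind: str, name: str) -> Optional[str]:
    """Return a warning string if `name` contains a '.', otherwise None.

    `kind` is a free-form label such as "node name" or "datacenter",
    used only to build the message (illustrative, not Consul's wording).
    """
    if "." in name:
        return f'{kind} "{name}" contains ".", which can break WAN federation'
    return None


for kind, name in [
    ("node name", "ceph-1.hirsingue.infra.mydomain.fr"),
    ("node name", "ceph-1"),
    ("datacenter", "dc1"),
]:
    warning = dot_warning(kind, name)
    if warning is not None:
        print(warning)
```

Returning a message rather than raising keeps the check compatible with the backwards-compatibility concern above: the caller can decide whether to log it as a warning or treat it as a hard validation error.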

@Amier3 added the needs-discussion label on Apr 12, 2022