RPC Advertise Address not Advertisable if -bind 0.0.0.0 #186

Closed
ghost opened this issue Oct 1, 2015 · 13 comments

@ghost

ghost commented Oct 1, 2015

Getting the following:

[root@nomad ~]# nomad agent -server -bootstrap-expect 1 -data-dir /tmp/nomad -bind 0.0.0.0
==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
==> Starting Nomad agent...
==> Error starting agent: server setup failed: Failed to start RPC layer: RPC advertise address is not advertisable: [::]:4647

This happens only when -bind 0.0.0.0 is set; setting an interface IP clears the error. It occurs whether or not IPv6 is enabled. Tested on CentOS 7 and CoreOS Stable (in client-only mode).
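For illustration, the kind of check that would produce this error can be sketched in Go. This is a minimal sketch under stated assumptions, not Nomad's actual code: `isAdvertisable` is a hypothetical helper name, and the sketch only assumes that a wildcard ("unspecified") address such as 0.0.0.0 or [::] is fine to bind to but gives other nodes nothing concrete to dial.

```go
package main

import (
	"fmt"
	"net"
)

// isAdvertisable reports whether hostport names a concrete IP address that
// other nodes could dial. Wildcard addresses such as 0.0.0.0 and [::] are
// valid bind addresses but are rejected here.
// Hypothetical helper for illustration; not taken from Nomad.
func isAdvertisable(hostport string) (bool, error) {
	host, _, err := net.SplitHostPort(hostport)
	if err != nil {
		return false, err
	}
	ip := net.ParseIP(host)
	if ip == nil {
		// Not an IP literal (for example a hostname); this sketch
		// does not attempt to resolve it.
		return false, fmt.Errorf("not an IP literal: %q", host)
	}
	return !ip.IsUnspecified(), nil
}

func main() {
	for _, addr := range []string{"[::]:4647", "0.0.0.0:4647", "10.0.0.5:4647"} {
		ok, err := isAdvertisable(addr)
		fmt.Println(addr, ok, err)
	}
}
```

Under this check, an explicit interface IP passes while both wildcard forms fail, which matches the behavior reported above.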

@LordFPL

LordFPL commented Oct 1, 2015

I hit the same problem: binding to 0.0.0.0 works, but there is a problem with the advertise address... only one can be used.

In Consul, I never encountered this problem... maybe it's implicit... but in Nomad you have to specify it.

Here is a part of my config :

bind_addr = "0.0.0.0"
advertise {
    rpc = "my.ip:4647"
    serf = "my.ip:4648"
}

I have asked in the Google group about a way to specify a dynamic value like $(hostname -i).

@ryanuber
Member

ryanuber commented Oct 1, 2015

Ah, so in Consul I think this worked because we had a function that would scan for a private IP address, and automatically use that. There were a number of different opinions surrounding that option, and although in most cases it made things easier to start, it was intentionally left out so that it would always be clear which address we were binding and advertising. It just gets ambiguous if there are multiple interfaces at play.

I'm marking this as a thinking ticket, because the UX can probably be improved. Thanks for reporting.

@apognu
Contributor

apognu commented Oct 2, 2015

Maybe, in case the specific IP address of the server isn't known in advance, its advertise addresses could be derived from a given subnet. Something along the lines of:

advertise {
  rpc = "10.0.0.0/8:4647"
  serf = "10.0.0.0/8:4648"
}

I already have a patch somewhere that would look for the first interface to have an IP on the given subnet and use it for advertising, if it is deemed interesting.

@HenryTheHamster

+1 to the subnet idea, or something similar to avoid having to know the IP up front

@cbednarski
Contributor

> I already have a patch somewhere that would look for the first interface to have an IP on the given subnet and use it for advertising, if it is deemed interesting.

@apognu That would be welcome!

@apognu
Contributor

apognu commented Oct 6, 2015

Give me a few hours, I'll submit a PR.

@cetex

cetex commented Oct 18, 2015

I'd like to add another thing here: this should behave like Consul does, so -bind=:: should work (bind to any available IPv4 or IPv6 address), as should the command-line option -advertise=

In our Consul environment I set all nodes to -bind=:: and then -advertise=<the "real" ip of the node>

On all masters I set up a secondary IP address of 10.255.255.255 on loopback (this is anycasted through our network), so any client anywhere within our network will just "-retry-join=10.255.255.255" and find the closest running master available.

Serf/Raft seems to take care of the rest: the client finds a master, gets the full list of all other masters, and then just seems to ignore the -retry-join IP.

@c4milo
Contributor

c4milo commented Dec 12, 2015

I'm also running into this. Consul's behavior seems to be what most people would like to see, myself included. Also, instead of giving an IP address, I would prefer to specify the network interface.

@cbednarski
Contributor

Based on the feedback here and feedback that we received in Consul we're considering the following:

  1. Support bind based on named interface (eth1).
  2. Support bind based on CIDR range (10.1.0.0/16). This allows you to get pretty granular if you have multiple interfaces or multiple IPs per interface.

Specifying the IP (as is currently supported) is pretty straightforward. If we do interface or CIDR we end up with a lot of messy edge cases.

  • Some interfaces may have aliases, like eth1:0, on Linux.
  • Some interfaces may have multiple IPs associated with them (e.g. IPv4 and IPv6).
    • Should we use the "first" IP that matches a specific interface or CIDR block instead of using all of them?
    • If so how do we define "first"?
  • IPv6 CIDR is getting into weird territory.
  • Any other cross-platform considerations?

We end up using similar logic in at least 3 places so this is a good opportunity to factor it out:

  1. Binding Nomad APIs when the agent starts
  2. Detecting available networks during fingerprinting
  3. Placing tasks into specific networks

> I'd like to add another thing here: this should behave like Consul does, so -bind=:: should work (bind to any available IPv4 or IPv6 address), as should the command-line option -advertise=
> In our Consul environment I set all nodes to -bind=:: and then -advertise=<the "real" ip of the node>

I'm not sure this is still supported in Consul 0.6.0. Also, since Nomad does not use serf across the entire cluster (only amongst the server nodes) we may not be able to do things exactly the same way that Consul does them.

@jhartman86

Bitten by this issue as well, +1 for binding to a named interface.

@dadgar
Contributor

dadgar commented Mar 22, 2016

Closing this, as #941 lets you bind by interface name.

@dadgar dadgar closed this as completed Mar 22, 2016
@sjwl

sjwl commented Oct 4, 2016

It appears #941 was pulled out in this commit? 079e55e

Is there a corresponding issue or explanation for why the feature was removed?

benbuzbee pushed a commit to benbuzbee/nomad that referenced this issue Jul 21, 2022
Changes user restore API to blocking (since it was blocking internally).
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 19, 2022