Ability to select private/public IP for specific task/port #646

Open
skozin opened this Issue Jan 5, 2016 · 30 comments


skozin commented Jan 5, 2016

Extracted from #209.

We use Nomad with the Docker driver to operate a cluster of machines. Some of them have both public and private interfaces. These two-NIC machines run internal services that should listen only on the private interface, as well as public services that should listen on the public interface.

So we need a way to specify whether a given task should listen on the public or the private IP.

I think this can be generalized to the ability to specify a subnet mask for a specific port:

resources {
    network {
        mbits = 100
        port "http" {
            # Listen on all interfaces that match this mask; the task will not be
            # started on a machine that has no NICs with IPs in this subnet.
            netmask = "10.10.0.1/16"
        }
        port "internal-bus" {
            # The same with static port number
            static = 4050
            netmask = "127.0.0.1/32"
        }
    }
}

This would be the most flexible solution and would cover most, if not all, cases. For example, to listen on all interfaces, as requested in #209, you would just pass the 0.0.0.0/0 netmask, which matches all possible IPs. Maybe it makes sense to make this netmask the default, i.e. bind to all interfaces if no netmask is specified for a port.

I think this is a really important feature, because its absence prevents people from running Nomad in VPC (virtual private cloud) environments, like Amazon VPC, Google Cloud Platform with subnetworks, OVH Dedicated Cloud and many others, as well as in any other environment where some machines are connected to more than one network.


Another solution would be to allow specifying interface name(s), like eth0, but that wouldn't work in our case because:

  1. different machines may bring their network interfaces up in a different order and thus assign them different names;
  2. to make things worse, some machines may have multiple IPs assigned to the same interface, e.g. see DigitalOcean's anchor IP, which is enabled by default on each new machine.

Example for point 1: assume that I want to start some task on all machines in the cluster, and that I want this task to listen only on the private interface to avoid exposing it to the outside world. A Consul agent is a nice example of such a service.

Now, some machines in the cluster are connected to both the public and the private network, and have two NICs:

  • eth0 corresponds to the public network, say, 162.243.197.49/24;
  • eth1 corresponds to my private network 10.10.0.1/24.

But the majority of machines are connected only to the private net, and have only one NIC:

  • eth0 corresponds to the private net 10.10.0.1/24.

This is a fairly typical setup in VPC environments.
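
For illustration only (the host parts of the private addresses below are made up; the networks are the ones described above), this is roughly how the interfaces look on the two kinds of machines:

# two-NIC machine (public + private)
$ ip -4 -br addr
eth0             UP             162.243.197.49/24
eth1             UP             10.10.0.5/24

# single-NIC machine (private only)
$ ip -4 -br addr
eth0             UP             10.10.0.6/24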

You can see that it would be impossible to constrain my service to only the private subnet by specifying an interface name, because eth0 corresponds to different networks on different machines, and eth1 is missing entirely on some machines.


dadgar commented Jan 6, 2016

Hey @skozin,

Thanks for the input. I like the idea of a netmask, and it is a valid way of constraining your task to the correct network. The one major drawback of this approach is that it forces the end user (not the operator) to know the network configuration. What I would like is to allow operators to set up arbitrary key/value metadata for network interfaces so that end users can use a simpler constraint. For example, an operator could tag a network interface with "availability" = "public" and then a user could submit a job with something like this:

port "http" {
     constraint {
           "availability" = "public"
     }
}

Obviously the syntax is just something I came up with right now, but does this seem reasonable to you? With a more generic constraint system (like the one for jobs) we could also support netmasks and other constraints.


skozin commented Jan 6, 2016

Hi @dadgar,

Thanks for the prompt response =) Yes, this seems perfectly reasonable to me.

Your proposal is more elegant and flexible than mine, because it yields more readable and maintainable job definitions and, at the same time, allows for greater precision: for example, it is possible to bind a task to the private interface on a set of machines even if those machines' private interfaces point into different private networks.

But, in order for this to cover the same cases, the implementation should meet two requirements:

  1. if multiple interfaces match the constraint for some port, the task should listen on all matching interfaces' addresses;
  2. if a matching interface has multiple addresses, the task should listen on each of those IPs.

So, if some port matches interfaces A and B, where A has addresses a1 and a2, and B has addresses b1 and b2, then the task should listen on a1, a2, b1 and b2.


I'm not sure I understand the new port[name].constraint construct. Do you propose to move the static attribute there too? I think that should be done for consistency, because static = xxx is, in fact, a constraint, just like availability = "public":

port "http" {
  constraint {
    static = 1234
    interface {
      availability = "public"
    }
  }
}

Or maybe it would be better not to introduce a new inner block that would always be required?

port "http" {
  static = 1234
  interface {
    availability = "public"
  }
}

Another question is how to specify the metadata for network interfaces. Do I get it right that it would go into the client block of the Nomad configuration, like this?

client {
  network {
    interface "eth0" {
      availability = "public"
    }
    interface "eth1" {
      availability = "private"
    }
    interface "docker0" {
      is_docker_bridge = true
    }
  }
}

If that's correct, then the client.network_interface option could be deprecated, because you can achieve the same thing with the new port constraints: just mark the desired interface with some metadata, e.g. use_for_nomad_tasks = true, and add this constraint to all ports. However, that would add complexity to job definitions, so maybe a special meta key could be introduced instead:

client {
  network {
    interface "eth1" {
      # Special meta key that prohibits fingerprinting this interface
      disable = true
    }
    interface "eth0" {
      availability = "public"
    }
  }
}

And the last one: I think it would be useful to support negative meta constraints, e.g.:

# Client config
client {
  network {
    interface "eth0" {
      availability = "public"
    }
    interface "eth1" {
      availability = "private"
    }
    interface "lo" {
      availability = "local"
    }
  }
}

# Task config
# ...
port "http" {
  static = 1234
  interface {
    # would listen on "eth1" and "lo"
    not {
      availability = "public"
    }
  }
}

Or, maybe, regex constraints?

port "http" {
  static = 1234
  interface {
    availability = "/^(?!public)./"
    # or:
    # availability = "/^(private|local)$/"
  }
}

What do you think?


dadgar commented Jan 7, 2016

Yeah, the constraints should have the same expressive power as our other constraints: https://www.nomadproject.io/docs/jobspec/index.html#attribute.

As for binding to all interfaces/IPs, I think that should be configurable. There are cases in which you only need to bind to a single IP/interface.


skozin commented Jan 7, 2016

Ah, completely missed the constraints syntax, sorry. So, the port configuration would look like this, right?

port "http" {
  static = 1234
  constraint {
    # "interface." is a prefix for all interface-related attributes
    # "availability" is a user-defined attribute; a set of system-defined attributes
    # may be added in the future, e.g. "netmask-cidr", "is-default-route", etc.
    attribute = "interface.availability"
    value = "public"
  }
  # there may be multiple constraints applied to the same port
}

Regarding binding to all interfaces that match a port's constraints: of course there are cases where you need to bind to a single interface; I would even say that these are the most common ones. But wouldn't the constraint system already enable you to do that? For example, if there are several interfaces marked with availability = "public" and you want to bind to one specific interface, just mark this interface with an additional metadata attribute and include one more constraint in the port definition. Why wouldn't that work?

As for the case where an interface has several IPs, honestly I have no idea how this can be made configurable without requiring the job author to specify the exact IP or netmask. Of course you could add an option to bind to only one (first/random) IP of an interface, but that would be a fairly useless option, because then you have no idea which address the service will actually listen on, or whether that address would be routable from a given network. The only generic and useful options I can think of are port.ipv4, defaulting to true, and port.ipv6, defaulting to false, which would include/exclude all IPv4/IPv6 addresses.


dadgar commented Jan 7, 2016

Yeah, more or less! And I think that is a fairly clean syntax. The constraint system would let you do it, but then you are breaking Nomad's abstraction a bit by targeting a single IP/interface in particular when you just care about certain attributes. So I think there can be a bind_all_interfaces = true which binds you to all interfaces, and similarly for IPs. As for determining your IP, we inject that via environment variables and it is also available via Consul.


skozin commented Jan 8, 2016

@dadgar, I know that the IP is passed to the task and is available via Consul, but in some cases you really need to listen on a specific IP. A good example is DigitalOcean's floating IPs. Every instance in DO has two IPs attached to its eth0 interface: the first is the regular public IP, and the second is the so-called anchor IP. In order to make a service accessible via a floating IP, you need to listen on the anchor IP. If you listen only on the regular public IP, the service will be inaccessible via the floating IP.

I'm not particularly against the bind_all_interfaces option, but I fail to understand how you would usefully choose the IP/interface to listen on when bind_all_interfaces = false and multiple interfaces match the port, or when the matching interface has multiple IPs.

I suspect that choosing the first/random interface/IP would be useless and confusing in such cases, especially if bind_all_interfaces is false by default. For example, in the DO case you can't be sure that your service will be accessible via the regular/floating IP if you constrain it to eth0, because you can't know whether the regular or the anchor IP will be chosen by Nomad.

Another reason for listening on all interfaces by default is that it is the behavior most developers are used to. For example, in Node.js and many other APIs, including Go's, the socket is bound to all interfaces if you don't explicitly pass an address. It would be intuitive if Nomad exhibited the same behavior: in the absence of constraints, the port is bound to all interfaces, and when you add constraints, you just narrow the set of chosen interfaces (and thus IPs). This is a really simple mental model.


That said, I understand that listening on multiple, but not all, interfaces makes no sense when the driver doesn't support port mapping, because you usually can't tell your program/API to listen on a set of IPs. Usually you have just two choices: bind to all IPs, or bind to one particular IP.

So, maybe the following would be a good compromise:

  • when no constraints are specified for a port, map that port to all interfaces if the driver supports port mapping, and pass the NOMAD_PORT_xxx_IP = 0.0.0.0 env. variable otherwise;
  • when a port has constraints, choose the first matching IP and map the port to that IP, or pass the NOMAD_PORT_xxx_IP = a.b.c.d env. variable;
  • when a port's constraints match multiple interfaces/IPs, print a warning that a random IP will be chosen.

dadgar commented Jan 8, 2016

Yeah, I don't think we disagree very much. I think we can randomly select an IP, and that will be the desired behavior, because the job should have constraints that logically match the desired attributes of the interface or set of IPs. So within that constraint group any IP we choose should be fine. For your DO example, there should be a way for an operator to mark that a CIDR block is of type floating/regular, and if an application cares, it can add a constraint.

As for the environment variable, I think we made a mistake by injecting only one IP variable. Moving forward I think it will be something like NOMAD_IP_port_label="210.123.1.179:4231", so that multiple ports can be on different IPs.


skozin commented Jan 8, 2016

Yup, passing port and IP together in a single variable would be a good choice too. Do I understand correctly that the port_label part of the NOMAD_IP_port_label variable name is a single substitution, e.g. http for port "http" { ... }?

But I think it would be nice to additionally pass the IP and port in separate variables, to avoid parsing when you need them separately, as in e.g. Node's http.listen API. That way, the task would get three variables for each port (a usage sketch follows the list):

  • NOMAD_ADDRESS_portlabel="210.123.1.179:4231"
  • NOMAD_IP_portlabel="210.123.1.179"
  • NOMAD_PORT_portlabel="4231" (this is already present in Nomad).
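
For illustration, assuming these proposed variables existed for a port labeled "http", a task could consume them like this (my-server and its flags are just placeholders, not a real binary):

# IP and port used separately
./my-server --host "$NOMAD_IP_http" --port "$NOMAD_PORT_http"

# or the joined form, where an address:port string is expected
./my-server --listen "$NOMAD_ADDRESS_http"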

Regarding random selection, I would suggest printing a warning anyway, because random IP selection, IMHO, is not what someone usually wants when binding a socket, and is most likely the result of a mistake.


skozin commented Jan 20, 2016

Hi @dadgar, here are some points by @DanielDent against using interface names: #223 (comment)

I don't agree that an interface name is necessarily easily predictable across deployed instances. This is probably even more true for Nomad than it is for Consul where I first raised this issue. Interface names can depend on the order in which interfaces are brought up - on some systems they can depend on some stateful ideas. On a bare metal Debian instance, /etc/udev/rules.d/70-persistent-net.rules is auto-generated during installation.

Interface names often are associated with a network technology. Some machines might be connected to the cluster over a VPN (and have an interface name based on that), while other machines might have the SDN interconnect offloaded to a hypervisor or network-provided routing infrastructure.

Some sites may even be using IP addresses directly for service discovery/bootstrapping purposes. It might involve anycasted IP addresses, IP addresses which are NATted to another IP, an IP address which gets announced to routing infrastructure using a BGP/IGP approach, or a DHCP server cluster which uses MAC addresses/VLAN tags/switch port metadata to assign static IP addresses to some or all nodes.

With heterogeneous infrastructure, an IP subnet approach will work in cases where interface-based approaches won't.

Furthermore, a single network interface can have multiple IP addresses associated with it. This is especially common in IPv6 configurations, but it can come up with IPv4 too. A machine having many IPv6 addresses with a variety of scopes is pretty standard. Allowing users to specify an interface name might not actually help disambiguate the appropriate address on which to bind.

I agree with all of these points; the cases described in that comment are real and occur fairly frequently in production clusters. In our infrastructure, using interface names would not be a problem, because we provision all machines with Ansible, so we could just implement additional logic for configuring Nomad, but not every setup allows this, and in any case it adds unneeded complexity.

I understand that CIDR notation is too low-level and infrastructure-specific, but, as you can see, interface names may be even more so.

What if we allowed infra operators to attach metadata to networks instead of interfaces? This way, job authors would still be able to use high-level attributes to specify port constraints. What do you think?


dadgar commented Jan 20, 2016

Yeah, I think in practice it will be something like that, as interfaces can have many IPs.

tugbabodrumlu commented Apr 26, 2016

Hi, is there any update regarding this issue? Will Nomad support this feature fairly soon?



diptanu commented Apr 27, 2016

@tugbabodrumlu Yeah, it should be possible to use multiple network interfaces with Nomad fairly soon.


CpuID commented Oct 10, 2016

Any updates on this?


CpuID commented Oct 10, 2016

Related:

moby/moby#17750
moby/moby#17796 (comment)
moby/moby#18906 (available as of 1.10.x)

It seems you can use docker create, then docker network connect, and finally docker start to attach multiple networks to a single container, but it can't be done in a single docker run call.
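
A rough sketch of that sequence (image and network names below are placeholders, and both networks are assumed to already exist):

# create the container attached to the first network, attach the second, then start it
docker create --name myapp --net net-a myorg/myapp:latest
docker network connect net-b myapp
docker start myapp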


dadgar commented Oct 10, 2016

Sorry no update yet, priorities got shuffled a bit


kaskavalci commented Dec 12, 2016

We are also moving forward with the Docker driver and need support for multiple interfaces per job or task. Is there any workaround yet? @dadgar, do you have a milestone planned for this feature?


dadgar commented Dec 12, 2016

@kaskavalci It will likely be in a 0.6.X release! The timeline for that is the first half of next year.


Ashald commented Dec 13, 2016

Would love to see this as well!


sheerun commented Jan 3, 2017

It would be best to abstract interfaces into the client configuration instead. For example, put the following in the job description:

port "public_port" {
  static = "80"
}

port "private_port" {
  static = "80"
  network = "private"
}

port "docker_port" {
  static = "80"
  network = "docker"
}

and this in the client configuration:

network_interfaces {
  public = "eth0"
  private = "eth2"
  docker = "overlay1"
}

jovandeginste commented Jan 27, 2017

We have a similar case where every Docker container is assigned an IPv6 address (using Docker's fixed-cidr-v6). Our containers usually don't expose ports on the host (since they are reachable directly via IPv6). This, however, prevents us from letting Nomad register the container in Consul (including health checks), since Nomad only knows about the host's IPv4 address and knows nothing about the ports.
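
For reference, the Docker daemon setup referred to above is typically enabled with flags along these lines (the IPv6 prefix here is just a placeholder):

dockerd --ipv6 --fixed-cidr-v6="2001:db8:1::/64"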

So what we need is either:
a) Nomad to find out which IPv6 address is assigned to the container (after starting it), and use that to register it in Consul; or
b) Nomad to select an IPv6 address before starting a container and assign it while starting (like it does with non-static IPv4 ports).

I thought b) would be easy using Docker's ip and ip6 parameters, but I was sadly mistaken...

Is our use case sufficiently covered here, or should we open a separate feature request for this?



seeder commented Feb 19, 2017

Just to chime in: not being able to have Nomad select an IPv6 instead of an IPv4 address for service registration in Consul should be considered a bug.


robwdux commented Mar 12, 2017

The ability to add additional networks and explicitly specify IP addresses when scheduling Docker containers is something we currently require for direct IP routing of private, public unicast and anycast addresses to containers. We are not quite ready for IPv6, but it would be nice to see that added as requested above.


darren-west commented May 31, 2017

Hi,

Is there any update on this?

Thanks


dadgar commented May 31, 2017

@darren-west No update yet!


lgierth commented Jul 14, 2017

My use case is that I need to bind to specific addresses, both IPv4 and IPv6.

Any tasks I could pick up here? If it's cool I could give bind_all_interfaces a try, or the following:

        port "ovpn" {
            static = 1194
            bind = ["198.51.233.0/24", "2620:2:6000::/48"]
        }

lgierth commented Jul 15, 2017

For the record, I'm currently working around this with iptables port forwarding:

iptables -t nat -A PREROUTING -p udp -d 198.51.233.233 --dport 1194 -j DNAT --to-destination 10.44.3.1:1194
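
One caveat worth noting: the PREROUTING rule only applies to packets arriving from the network, so if processes on the host itself also need to reach the service via the public address, a matching rule in the nat OUTPUT chain (same addresses as above) is the usual companion:

iptables -t nat -A OUTPUT -p udp -d 198.51.233.233 --dport 1194 -j DNAT --to-destination 10.44.3.1:1194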

dhv commented Feb 27, 2018

@dadgar Any update on this?


lzrski commented Jun 3, 2018

Edit: After some more digging I've found this solution, which seems good enough for our case: #209 (comment)

Below is my original comment. I hope, together with the link, it will be helpful to someone else.

Hello. In our case all services should bind only to the private network; we want to use a load balancer to publish them selectively. Is there a way to achieve that?

Our particular configuration is as follows.

We are using Nomad backed by Consul in HA on DigitalOcean machines. Here is our configuration for Nomad servers:

data_dir   = "/var/nomad/data"
bind_addr = "${ self.ipv4_address_private }"

advertise {
  serf = "${ self.ipv4_address_private }:4648"
}

server {
  enabled = true
  bootstrap_expect = 3
}

client {
  enabled = true
}

and Consul:

server = true,
data_dir = "/var/consul/data"
ui = true
bind_addr = "${ self.ipv4_address_private }"
bootstrap = ${ count.index == 0 }

We have a test job like this:

job "echo-service" {
  datacenters = ["dc1"]

  type = "service"

  group "echo" {
    task "webservice" {
      driver = "docker"

      resources {
        network {
          port "http" {}
        }
      }

      config {
        image = "hashicorp/http-echo"
        args = [
          "-text",
          "Hello, Nomad!"
        ]
        port_map {
          http = 5678
        }
      }
    }
  }
}

It always binds to a public network interface. Interestingly, the Web UI shows the address as I would like it to be (probably a bug in its own right):

(screenshot of the Web UI omitted)

Unfortunately it doesn't seem to reflect reality:

$ http 10.133.50.69:27912

http: error: ConnectionError: HTTPConnectionPool(host='10.133.50.69', port=27912): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f49dc372810>: Failed to establish a new connection: [Errno 111] Connection refused',))

The CLI shows the actual (alas, unwanted) binding:

$ nomad alloc status 2aee0c75
ID                  = 2aee0c75
Eval ID             = f7353297
Name                = echo-service.echo[0]
Node ID             = 69c19558
Job ID              = echo-service
Job Version         = 0
Client Status       = running
Client Description  = <none>
Desired Status      = run
Desired Description = <none>
Created             = 13m20s ago
Modified            = 13m7s ago

Task "webservice" is "running"
Task Resources
CPU        Memory           Disk     IOPS  Addresses
0/100 MHz  812 KiB/300 MiB  300 MiB  0     http: 206.189.110.181:27912

Task Events:
Started At     = 2018-06-03T19:24:33Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                  Type        Description
2018-06-03T19:24:33Z  Started     Task started by client
2018-06-03T19:24:31Z  Driver      Downloading image hashicorp/http-echo:latest
2018-06-03T19:24:31Z  Task Setup  Building Task Directory
2018-06-03T19:24:31Z  Received    Task received by client

$ http 206.189.110.181:27912
HTTP/1.1 200 OK
Content-Length: 14
Content-Type: text/plain; charset=utf-8
Date: Sun, 03 Jun 2018 19:42:30 GMT
X-App-Name: http-echo
X-App-Version: 0.2.3

Hello, Nomad!

Any solution, even a hackish workaround, will be welcome.

lzrski added a commit to KDVnet/packer-terraform-digitalocean-playground that referenced this issue Jun 3, 2018


dadgar commented Jun 4, 2018

@lzrski Hm, the UI does look like a bug; we will look into that. Nomad currently does not manage load balancers for you. A common approach is to set up an external load balancer that points to an internal load balancer such as Fabio, and register your services in the internal load balancer.


wrightMatthew commented Jun 27, 2018

Binding to all interfaces via 0.0.0.0 was mentioned in related issue #209, and also in the original question here. Is there any progress on that? I can't figure out how to configure my service to be available on all network interfaces.
