Using "_ec2:publicDns_" for network.publish_host doesn't seem to work properly #76

Open
bodgit opened this Issue Apr 22, 2014 · 13 comments

10 participants

bodgit commented Apr 22, 2014

I'm trying to attach a remote (potentially non-EC2) Tribe node to an EC2 cluster. I raised this on the mailing list, but the thread died after a bit of back and forth, so I'm raising it here as I would still like to get this fixed.

I've created two nodes in EC2 EU region with the following configuration which is as small as possible to illustrate the problem:

network.publish_host: "_ec2:publicDns_"
discovery.type: ec2
discovery.ec2.groups: estest
discovery.ec2.host_type: public_dns
cloud.aws.region: "eu-west-1"
cloud.aws.access_key: abc123
cloud.aws.secret_key: s3cr3t
cloud.node.auto_attributes: true
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false

I'm using the public DNS host type because these names resolve to the private IP address within EC2 but to the public IP address outside of EC2, which should theoretically mean an external node (the Tribe node in this case) can join the cluster. Keeping traffic private within EC2 should yield better performance, since intra-cluster traffic would otherwise have to "hairpin" through the EC2 NAT layer when using the public IP addresses.

Both nodes have ES 1.1.1 and cloud-aws 2.1.1 installed. Both are members of an estest security group which has the following rules:

| Type | Protocol | Port Range | Source |
| --- | --- | --- | --- |
| Custom TCP Rule | TCP | 9300 - 9399 | sg-feedbeef (estest) |
| Custom ICMP Rule | Echo Request | N/A | sg-feedbeef (estest) |
| SSH | TCP | 22 | 1.2.3.4/32 |
| Custom TCP Rule | TCP | 9200 | sg-feedbeef (estest) |
| Custom TCP Rule | TCP | 9200 | 1.2.3.4/32 |
| Custom TCP Rule | TCP | 9300 - 9399 | 1.2.3.4/32 |
| Custom ICMP Rule | Echo Request | N/A | 1.2.3.4/32 |

1.2.3.4 is my IP address for external access. Both nodes have elastic IP addresses assigned to them so they are on "well known" external IP addresses for the purposes of the tribe node configuration.

Both nodes correctly find each other on startup and I get a basic two-node cluster. Even though I'm using discovery.ec2.host_type: public_dns, the external DNS names resolve to the private IP addresses within EC2, so the cluster happily uses the private IP addresses to communicate.

However, if I query one node using http://54.72.215.117:9200/_nodes/transport?pretty I get:

{
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "0nMUUSkXSnqy35ttLEkcqA" : {
      "name" : "Numinus",
      "transport_address" : "inet[/172.31.12.62:9300]",
      "host" : "ip-172-31-12-62",
      "ip" : "172.31.12.62",
      "version" : "1.1.1",
      "build" : "f1585f0",
      "http_address" : "inet[ec2-54-72-137-131.eu-west-1.compute.amazonaws.com/172.31.12.62:9200]",
      "attributes" : {
        "aws_availability_zone" : "eu-west-1a"
      },
      "transport" : {
        "bound_address" : "inet[/0:0:0:0:0:0:0:0%0:9300]",
        "publish_address" : "inet[/172.31.12.62:9300]"
      }
    },
    "-XlaNF-hSAi2U8tcpEFmug" : {
      "name" : "Howard the Duck",
      "transport_address" : "inet[ec2-54-72-215-117.eu-west-1.compute.amazonaws.com/172.31.12.61:9300]",
      "host" : "ip-172-31-12-61",
      "ip" : "172.31.12.61",
      "version" : "1.1.1",
      "build" : "f1585f0",
      "http_address" : "inet[ec2-54-72-215-117.eu-west-1.compute.amazonaws.com/172.31.12.61:9200]",
      "attributes" : {
        "aws_availability_zone" : "eu-west-1a"
      },
      "transport" : {
        "bound_address" : "inet[/0:0:0:0:0:0:0:0:9300]",
        "publish_address" : "inet[ec2-54-72-215-117.eu-west-1.compute.amazonaws.com/172.31.12.61:9300]"
      }
    }
  }
}

Notice that only the node I queried directly (Howard the Duck) has the public DNS name included in its transport address. If I query the other node (Numinus) via http://54.72.137.131:9200/_nodes/transport?pretty I get:

{
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "0nMUUSkXSnqy35ttLEkcqA" : {
      "name" : "Numinus",
      "transport_address" : "inet[ec2-54-72-137-131.eu-west-1.compute.amazonaws.com/172.31.12.62:9300]",
      "host" : "ip-172-31-12-62",
      "ip" : "172.31.12.62",
      "version" : "1.1.1",
      "build" : "f1585f0",
      "http_address" : "inet[ec2-54-72-137-131.eu-west-1.compute.amazonaws.com/172.31.12.62:9200]",
      "attributes" : {
        "aws_availability_zone" : "eu-west-1a"
      },
      "transport" : {
        "bound_address" : "inet[/0:0:0:0:0:0:0:0:9300]",
        "publish_address" : "inet[ec2-54-72-137-131.eu-west-1.compute.amazonaws.com/172.31.12.62:9300]"
      }
    },
    "-XlaNF-hSAi2U8tcpEFmug" : {
      "name" : "Howard the Duck",
      "transport_address" : "inet[/172.31.12.61:9300]",
      "host" : "ip-172-31-12-61",
      "ip" : "172.31.12.61",
      "version" : "1.1.1",
      "build" : "f1585f0",
      "http_address" : "inet[ec2-54-72-215-117.eu-west-1.compute.amazonaws.com/172.31.12.61:9200]",
      "attributes" : {
        "aws_availability_zone" : "eu-west-1a"
      },
      "transport" : {
        "bound_address" : "inet[/0:0:0:0:0:0:0:0%0:9300]",
        "publish_address" : "inet[/172.31.12.61:9300]"
      }
    }
  }
}

Notice that the other node (Numinus) now has the correct transport address. I would expect both nodes to have the same style transport address when queried from either node.
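The inconsistency between the two responses can be checked mechanically. The following is a minimal sketch, not part of Elasticsearch or the cloud-aws plugin: `parse_inet` and `nodes_without_hostname` are hypothetical helper names, and the regular expression assumes the `inet[hostname/ip:port]` rendering shown in the output above. The sample data is trimmed from the first response.

```python
import re

# Assumed format, taken from the output above: inet[<hostname>/<ip>:<port>].
# The hostname part is empty when the node only knows an IP address.
ADDRESS_RE = re.compile(r"^inet\[(?P<host>[^/\]]*)/(?P<ip>.+):(?P<port>\d+)\]$")


def parse_inet(address):
    """Split an address string into (hostname or None, ip, port)."""
    match = ADDRESS_RE.match(address)
    if match is None:
        raise ValueError("unrecognised address: %r" % address)
    return match.group("host") or None, match.group("ip"), int(match.group("port"))


def nodes_without_hostname(nodes_response):
    """Return the names of nodes whose transport publish_address has no hostname."""
    missing = []
    for node in nodes_response["nodes"].values():
        host, _ip, _port = parse_inet(node["transport"]["publish_address"])
        if host is None:
            missing.append(node["name"])
    return sorted(missing)


# Trimmed copy of the response obtained from 54.72.215.117 (Howard the Duck).
SAMPLE = {
    "nodes": {
        "0nMUUSkXSnqy35ttLEkcqA": {
            "name": "Numinus",
            "transport": {"publish_address": "inet[/172.31.12.62:9300]"},
        },
        "-XlaNF-hSAi2U8tcpEFmug": {
            "name": "Howard the Duck",
            "transport": {
                "publish_address": "inet[ec2-54-72-215-117.eu-west-1.compute.amazonaws.com/172.31.12.61:9300]"
            },
        },
    }
}

print(nodes_without_hostname(SAMPLE))  # ['Numinus']
```

Running the same check against the response from the other node would report Howard the Duck instead, which is the asymmetry described above.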

My tribe node (also running ES 1.1.1) has the following minimal configuration:

discovery.zen.ping.multicast.enabled: false
tribe:
  dublin:
    cluster:
      name: elasticsearch
    discovery:
      zen:
        ping:
          unicast:
            hosts:
              - 54.72.215.117
              - 54.72.137.131

The tribe node correctly connects to the external IP address and (I'm guessing) retrieves the node list and then promptly tries to connect to the private IP addresses, and fails.

Now I can change the EC2 cluster configuration to use the public IP addresses like so:

network.publish_host: "_ec2:publicIp_"
discovery.type: ec2
discovery.ec2.groups: estest
discovery.ec2.host_type: public_ip
cloud.aws.region: "eu-west-1"
cloud.aws.access_key: abc123
cloud.aws.secret_key: s3cr3t
cloud.node.auto_attributes: true
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false

However the cluster won't associate until I add the following additional rules to the estest security group:

| Type | Protocol | Port Range | Source |
| --- | --- | --- | --- |
| Custom TCP Rule | TCP | 9300 - 9399 | 54.72.137.131/32 |
| Custom TCP Rule | TCP | 9300 - 9399 | 54.72.215.117/32 |

This isn't good: I can't easily scale the cluster without adding the public IP address of each node to the security group, and performance is worse because every intra-cluster connection hairpins through the EC2 NAT layer.

With the cluster up, I get consistent output from http://54.72.215.117:9200/_nodes/transport?pretty, e.g.

{
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "L4rjg5JNR1yuk-kU_eAthw" : {
      "name" : "Phil Urich",
      "transport_address" : "inet[/54.72.137.131:9300]",
      "host" : "ip-172-31-12-62",
      "ip" : "172.31.12.62",
      "version" : "1.1.1",
      "build" : "f1585f0",
      "http_address" : "inet[/54.72.137.131:9200]",
      "attributes" : {
        "aws_availability_zone" : "eu-west-1a"
      },
      "transport" : {
        "bound_address" : "inet[/0:0:0:0:0:0:0:0%0:9300]",
        "publish_address" : "inet[/54.72.137.131:9300]"
      }
    },
    "WvFPkVyfSi2dpHJ4k3yk8w" : {
      "name" : "Blue Streak",
      "transport_address" : "inet[/54.72.215.117:9300]",
      "host" : "ip-172-31-12-61",
      "ip" : "172.31.12.61",
      "version" : "1.1.1",
      "build" : "f1585f0",
      "http_address" : "inet[/54.72.215.117:9200]",
      "attributes" : {
        "aws_availability_zone" : "eu-west-1a"
      },
      "transport" : {
        "bound_address" : "inet[/0:0:0:0:0:0:0:0:9300]",
        "publish_address" : "inet[/54.72.215.117:9300]"
      }
    }
  }
}

I can then attach a tribe node successfully.

There's a slight issue in that the tribe node must be either directly attached to the internet or have a 1:1 NAT with the external address set as the value for network.publish_host so that the EC2 cluster can talk back to the tribe node, but that's not directly related to this issue.

It may be that a VPN is the only viable solution, in which case you would just use private IP addresses everywhere and let the IPv4 routing layer deal with everything (or use IPv6), and this issue goes away. But the setting as it currently stands seems of little use.

Dchamard May 28, 2014

Running into the same issue

@dadoonet dadoonet added the 2.2.0 label May 29, 2014

@dadoonet dadoonet added this to the 2.2.0 milestone May 29, 2014

@dadoonet dadoonet self-assigned this Jun 20, 2014

@dadoonet dadoonet added the bug label Jun 20, 2014

dadoonet Jun 20, 2014

Member

Thanks for this detailed explanation.

I think this is caused by this line: https://github.com/elasticsearch/elasticsearch-cloud-aws/blob/master/src/main/java/org/elasticsearch/cloud/aws/network/Ec2NameResolver.java#L111-111

That line resolves the hostname to an IP address, so I guess ec2-54-72-137-131.eu-west-1.compute.amazonaws.com is resolved locally on the node to 172.31.12.62 instead of 54.72.137.131.

I might be wrong but the only workaround I can see is to use public_ip as you mentioned as this exposes the right public IP address.

@kimchy WDYT?

@dadoonet dadoonet removed 2.2.0 labels Jun 20, 2014

Dchamard Jun 20, 2014

Hi,

Using public_ip is a big issue: it means the nodes of each cluster have to go through the external IP to talk to each other, which is not what we want for performance or for security group management.

An Amazon public_dns name resolves to the private IP only when you are in the same region, and resolves to the public address when you are outside the region, which is what we need for the multi-region scenario.

This means that Elasticsearch should return the external IP when the request comes from an external cluster. I guess returning the DNS name instead of the IP address would fix that problem.

Please let me know if that makes sense.

dadoonet Jun 20, 2014

Member

So publish_address should be set to public_ip and bound_address to private_ip?

Would this work as expected? If so, any chance you could reproduce this scenario by setting those IPs manually and report back?

msonnabaum Jul 25, 2014

I'm also running into this.

And while it looks like it could be fixed here in the AWS plugin, the same thing appears to happen when not using EC2 discovery and setting network.publish_host to a hostname.

Could the publish_host not be resolved on the node trying to connect to that host, rather than on the host itself?
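The difference between resolving on the publishing host and resolving on the connecting node can be illustrated with a toy model. This is only a sketch under stated assumptions: `DNS_VIEWS`, `publish_resolved_ip` and `connect_target` are hypothetical names, the two DNS "views" simply hard-code the split-horizon answers reported earlier in this thread, and nothing here reflects how Elasticsearch is actually implemented.

```python
NAME = "ec2-54-72-137-131.eu-west-1.compute.amazonaws.com"

# Hypothetical split-horizon DNS: the same name gives a different answer
# depending on where the lookup is performed (values taken from this thread).
DNS_VIEWS = {
    "inside_ec2": {NAME: "172.31.12.62"},
    "outside_ec2": {NAME: "54.72.137.131"},
}


def publish_resolved_ip(hostname, publisher_vantage):
    """Behaviour described in this issue: the publishing node resolves its own
    name and advertises the resulting IP address."""
    return DNS_VIEWS[publisher_vantage][hostname]


def connect_target(published, connector_vantage):
    """Suggested alternative: the connecting node resolves whatever was
    published. A literal IP address is returned unchanged."""
    return DNS_VIEWS[connector_vantage].get(published, published)


# Publishing an IP resolved inside EC2 leaks the private address to outsiders.
advertised_ip = publish_resolved_ip(NAME, "inside_ec2")
print(connect_target(advertised_ip, "outside_ec2"))  # 172.31.12.62

# Publishing the name lets each side pick the address that works for it.
print(connect_target(NAME, "outside_ec2"))  # 54.72.137.131
print(connect_target(NAME, "inside_ec2"))   # 172.31.12.62
```

In this model an external tribe node would reach the public address, while nodes inside EC2 would keep using the private one.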

@dadoonet dadoonet removed this from the 2.3.0 milestone Aug 5, 2014

@dadoonet dadoonet removed their assignment Oct 24, 2014

rparkhunovsky Feb 13, 2015

This issue is very relevant for cross-region cloud environments and is a blocker for using this as a feature. I tested this behaviour on both ES 1.2.4/cloud-aws 2.3.0 and the upgraded ES 1.4.3/cloud-aws 2.4.1, and it was still reproducible. This makes simple, routine things hard. Please fix.

ssuprun Feb 18, 2015

Contributor

I'm not sure if my change will fix this issue, but either way it looks like a bug.
Could somebody review it: #175

clintongormley Apr 26, 2015

Member

@dadoonet with #175 merged, should this be closed? and maybe elastic/elasticsearch#6333 too?

rparkhunovsky Apr 27, 2015

I agree with closing #76. However, I tested the changes against elastic/elasticsearch#6333, and they don't resolve that issue.

clintongormley Apr 27, 2015

Member

thanks @rparkhunovsky - closing this one

taraslayshchuk Sep 2, 2015

Hi. I'm having trouble configuring two nodes in the AWS cloud as a tribe cluster.

Config file of first node:

cluster.name: es_one
node.name: node-t1
plugin.mandatory: "cloud-aws"
network.bound_address: "_ec2:privateDns_"
network.public_host: "_ec2:publicDns_"
transport.bind_host: "_ec2:privateDns_"
transport.publish_host: "_ec2:publicDns_"
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 10s
discovery.zen.ping.multicast.enabled: false

cloud:
     aws:
         access_key: key
         secret_key: value

cloud.aws.region: "us-east-1"
discovery.type: ec2

Log output from first node:

[2015-09-02 15:06:18,828][INFO ][node                     ] [node-t1] version[1.7.1], pid[14686], build[b88f43f/2015-07-29T09:54:16Z]
[2015-09-02 15:06:18,828][INFO ][node                     ] [node-t1] initializing ...
[2015-09-02 15:06:18,925][INFO ][plugins                  ] [node-t1] loaded [cloud-aws], sites []
[2015-09-02 15:06:18,987][INFO ][env                      ] [node-t1] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [5.3gb], net total_space [7.7gb], types [ext4]
[2015-09-02 15:06:22,716][INFO ][node                     ] [node-t1] initialized
[2015-09-02 15:06:22,716][INFO ][node                     ] [node-t1] starting ...
[2015-09-02 15:06:22,790][INFO ][transport                ] [node-t1] bound_address {inet[/172.31.13.137:9301]}, publish_address {inet[ec2-52-3-85-143.compute-1.amazonaws.com/172.31.13.137:9301]}
[2015-09-02 15:06:22,808][INFO ][discovery                ] [node-t1] es_one/qwcnD8D2QEest37d39onow
[2015-09-02 15:06:33,992][INFO ][cluster.service          ] [node-t1] new_master [node-t1][qwcnD8D2QEest37d39onow][ip-172-31-13-137][inet[ec2-52-3-85-143.compute-1.amazonaws.com/172.31.13.137:9301]], reason: zen-disco-join (elected_as_master)
[2015-09-02 15:06:34,021][INFO ][http                     ] [node-t1] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.31.13.137:9200]}
[2015-09-02 15:06:34,022][INFO ][node                     ] [node-t1] started
[2015-09-02 15:06:34,029][INFO ][gateway                  ] [node-t1] recovered [0] indices into cluster_state

Config file of second node:

cluster.name: es_two
node.name: node-t2
plugin.mandatory: "cloud-aws"
network.host: "_ec2:publicDns_"
network.bind_host: "_ec2:publicDns_"
network.public_host: "_ec2:publicDns_"
transport.host: "_ec2:publicDns_"
transport.bind_host: "_ec2:publicDns_"
transport.publish_host: "_ec2:publicDns_"
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 10s
discovery.zen.ping.multicast.enabled: false

cloud:
     aws:
         access_key: key
         secret_key: value

cloud.aws.region: "us-west-2"

discovery.type: ec2

Log output from second node:

[2015-09-02 15:15:09,871][INFO ][node                     ] [node-t2] version[1.7.1], pid[2768], build[b88f43f/2015-07-29T09:54:16Z]
[2015-09-02 15:15:09,872][INFO ][node                     ] [node-t2] initializing ...
[2015-09-02 15:15:09,979][INFO ][plugins                  ] [node-t2] loaded [cloud-aws], sites []
[2015-09-02 15:15:10,051][INFO ][env                      ] [node-t2] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [5.4gb], net total_space [7.7gb], types [ext4]
[2015-09-02 15:15:13,801][INFO ][node                     ] [node-t2] initialized
[2015-09-02 15:15:13,802][INFO ][node                     ] [node-t2] starting ...
[2015-09-02 15:15:13,884][INFO ][transport                ] [node-t2] bound_address {inet[/192.168.1.155:9301]}, publish_address {inet[ec2-52-88-78-52.us-west-2.compute.amazonaws.com/192.168.1.155:9301]}
[2015-09-02 15:15:13,908][INFO ][discovery                ] [node-t2] es_two/UIBV-NoFT8-YNPzIiqO8Jw
[2015-09-02 15:15:25,380][INFO ][cluster.service          ] [node-t2] new_master [node-t2][UIBV-NoFT8-YNPzIiqO8Jw][ip-192-168-1-155][inet[ec2-52-88-78-52.us-west-2.compute.amazonaws.com/192.168.1.155:9301]], reason: zen-disco-join (elected_as_master)
[2015-09-02 15:15:25,407][INFO ][http                     ] [node-t2] bound_address {inet[/192.168.1.155:9200]}, publish_address {inet[ec2-52-88-78-52.us-west-2.compute.amazonaws.com/192.168.1.155:9200]}
[2015-09-02 15:15:25,407][INFO ][node                     ] [node-t2] started
[2015-09-02 15:15:25,408][INFO ][gateway                  ] [node-t2] recovered [0] indices into cluster_state

Local node config file:

node.name: cluster
plugin.mandatory: "cloud-aws"
cloud:
     aws:
         access_key: key
         secret_key: value

network.host: my.localhost.dns
network.bind_host: my.localhost.dns
network.public_host: my.localhost.dns
transport.host: my.localhost.dns
transport.bind_host: my.localhost.dns
transport.publish_host: my.localhost.dns
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 10s
discovery.zen.ping.multicast.enabled: false

tribe:
  es_one:
    cluster.name: es_one
    discovery.zen.ping.multicast.enabled: false
    discovery.type: ec2
    discovery.ec2.host_type: public_dns
    discovery.ec2.groups: sg-7b3c461c

  es_two:
    cluster.name: es_two
    discovery.zen.ping.multicast.enabled: false
    discovery.type: ec2
    discovery.ec2.host_type: public_dns
    discovery.ec2.groups: sg-8e55e0ea

logger:
  level: DEBUG

discovery.type: ec2

Local node log file output:

[2015-09-02 18:09:34,089][INFO ][node                     ] [cluster] version[1.7.1], pid[17251], build[b88f43f/2015-07-29T09:54:16Z]
[2015-09-02 18:09:34,089][INFO ][node                     ] [cluster] initializing ...
[2015-09-02 18:09:34,161][INFO ][plugins                  ] [cluster] loaded [cloud-aws], sites [bigdesk, head]
[2015-09-02 18:09:35,998][INFO ][node                     ] [cluster/es_one] version[1.7.1], pid[17251], build[b88f43f/2015-07-29T09:54:16Z]
[2015-09-02 18:09:35,998][INFO ][node                     ] [cluster/es_one] initializing ...
[2015-09-02 18:09:35,998][INFO ][plugins                  ] [cluster/es_one] loaded [cloud-aws], sites []
[2015-09-02 18:09:37,157][INFO ][node                     ] [cluster/es_one] initialized
[2015-09-02 18:09:37,159][INFO ][node                     ] [cluster/es_two] version[1.7.1], pid[17251], build[b88f43f/2015-07-29T09:54:16Z]
[2015-09-02 18:09:37,159][INFO ][node                     ] [cluster/es_two] initializing ...
[2015-09-02 18:09:37,160][INFO ][plugins                  ] [cluster/es_two] loaded [cloud-aws], sites []
[2015-09-02 18:09:37,845][INFO ][node                     ] [cluster/es_two] initialized
[2015-09-02 18:09:37,857][INFO ][node                     ] [cluster] initialized
[2015-09-02 18:09:37,857][INFO ][node                     ] [cluster] starting ...
[2015-09-02 18:09:37,919][INFO ][transport                ] [cluster] bound_address {inet[/10.131.72.169:9301]}, publish_address {inet[tlaispc.ddns.softservecom.com/10.131.72.169:9301]}
[2015-09-02 18:09:37,925][INFO ][discovery                ] [cluster] elasticsearch/xFarBkexRqKrBw6gezgPdA
[2015-09-02 18:09:37,925][WARN ][discovery                ] [cluster] waited for 0s and no initial state was set by the discovery
[2015-09-02 18:09:37,929][INFO ][http                     ] [cluster] bound_address {inet[/10.131.72.169:9200]}, publish_address {inet[tlaispc.ddns.softservecom.com/10.131.72.169:9200]}
[2015-09-02 18:09:37,929][INFO ][node                     ] [cluster/es_one] starting ...
[2015-09-02 18:09:37,938][INFO ][transport                ] [cluster/es_one] bound_address {inet[/0:0:0:0:0:0:0:0:9302]}, publish_address {inet[/10.131.72.169:9302]}
[2015-09-02 18:09:37,942][INFO ][discovery                ] [cluster/es_one] es_one/DohOLLs9QwqCk4eKt9VDFw
[2015-09-02 18:10:07,942][WARN ][discovery                ] [cluster/es_one] waited for 30s and no initial state was set by the discovery
[2015-09-02 18:10:07,943][INFO ][node                     ] [cluster/es_one] started
[2015-09-02 18:10:07,943][INFO ][node                     ] [cluster/es_two] starting ...
[2015-09-02 18:10:07,951][INFO ][transport                ] [cluster/es_two] bound_address {inet[/0:0:0:0:0:0:0:0:9303]}, publish_address {inet[/10.131.72.169:9303]}
[2015-09-02 18:10:07,953][INFO ][discovery                ] [cluster/es_two] es_two/IJVyDR_lRuOoE3bqlSnpGQ
[2015-09-02 18:10:37,953][WARN ][discovery                ] [cluster/es_two] waited for 30s and no initial state was set by the discovery
[2015-09-02 18:10:37,954][INFO ][node                     ] [cluster/es_two] started
[2015-09-02 18:10:37,954][INFO ][node                     ] [cluster] started

I am using Ubuntu 14.04, Elasticsearch 1.7.1 (build b88f43f/2015-07-29T09:54:16Z), JVM 1.7.0_79 and the cloud-aws plugin v2.7.1.
So what do we have? The nodes see each other but do not form a cluster. I don't know why, but cluster.service did not start before discovery. Can you please help: what am I doing wrong?
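When comparing several startup logs like the ones above, it can help to pull out just the discovery timeouts. The sketch below is a hypothetical helper (`discovery_timeouts` is not an Elasticsearch tool) whose regular expressions assume the log layout and the warning wording shown in this comment. The sample lines are copied from the local node log.

```python
import re

# Assumed layout: [timestamp][LEVEL][component] [node] message
LOG_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\[(?P<level>\w+)\s*\]\[(?P<component>[^\]]+?)\s*\]\s+"
    r"\[(?P<node>[^\]]+)\]\s+(?P<message>.*)$"
)
TIMEOUT_RE = re.compile(r"waited for (\d+)s and no initial state was set by the discovery")


def discovery_timeouts(lines):
    """Return (node, seconds waited) for every discovery timeout warning."""
    found = []
    for line in lines:
        entry = LOG_RE.match(line.strip())
        if entry is None or entry.group("component") != "discovery":
            continue
        timeout = TIMEOUT_RE.search(entry.group("message"))
        if timeout is not None:
            found.append((entry.group("node"), int(timeout.group(1))))
    return found


SAMPLE_LOG = [
    "[2015-09-02 18:09:37,925][WARN ][discovery                ] [cluster] waited for 0s and no initial state was set by the discovery",
    "[2015-09-02 18:09:37,942][INFO ][discovery                ] [cluster/es_one] es_one/DohOLLs9QwqCk4eKt9VDFw",
    "[2015-09-02 18:10:07,942][WARN ][discovery                ] [cluster/es_one] waited for 30s and no initial state was set by the discovery",
    "[2015-09-02 18:10:37,953][WARN ][discovery                ] [cluster/es_two] waited for 30s and no initial state was set by the discovery",
]

for node, seconds in discovery_timeouts(SAMPLE_LOG):
    print(node, seconds)
```

For the log above this lists the tribe node itself and both of its inner clients, es_one and es_two, none of which received an initial cluster state.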

Pryz Oct 6, 2015

I'm also hitting this bug. Any news on this?

elasticsearch-cloud-aws : 2.7.1
elasticsearch : 1.7.1

toleksyn Oct 7, 2015

Also encountered this one with es 1.6. Please fix.
