2.2.0-beta1 and Elasticsearch 2.0 transport error #614

philmes · 2015-11-23T21:19:55Z

I think this is an es-spark problem, but it might also be an ES 2 problem

I have an ES 2.0 cluster which has a custom publish host set:

network.publish_host: _ec2:publicDns_

this produces the following in /_nodes/transport:

{  
   "cluster_name":"es-test-cluster",
   "nodes":{  
      "nnDrgTgVSVqZPLUmP37E8w":{  
         "name":"Living Tribunal",
         "transport_address":"<some ip>:9300",
         "host":"<some ip>.57",
         "ip":"<some ip>.57",
         "version":"2.0.0",
         "build":"de54438",
         "http_address":"<an EC2 public DNS>/<some ip>:9200"
         ...

The http_address is not parsed correctly, and the Rest client attempts to connect to a completely invalid address.

costin · 2015-11-23T21:26:46Z

What's the exception that you are getting? What is the address being used to connect?

philmes · 2015-11-23T21:40:55Z

Stepping through it in the debugger, in RestClient:discoverNodes() it adds each node with:

hosts.add(StringUtils.parseIpAddress(inet).toString());

parseIpAddress doesn't seem to understand the format in the response from the server, which is in the format foo.bar.amazonaws.com/10.10.0.23:9200

I can definitely connect to both the hostname, and the ip, from the Spark cluster.
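
For illustration, here is a minimal Java sketch of one way the `hostname/ip:port` form quoted above could be split into its parts. The class `PublishAddress` and its `parse` method are hypothetical names invented for this example. They are not part of ES-Hadoop and are not the fix that was eventually committed. The sketch assumes an optional hostname before the slash and an IPv4 address with a port after it. IPv6 literals are not considered.

```java
import java.util.Objects;

/** Hypothetical helper (not ES-Hadoop code): holds the parts of "host/ip:port". */
final class PublishAddress {
    final String host; // empty when the address has no hostname part
    final String ip;
    final int port;

    PublishAddress(String host, String ip, int port) {
        this.host = host;
        this.ip = ip;
        this.port = port;
    }

    /** Parses "host/ip:port", "/ip:port" or "ip:port"; rejects anything else. */
    static PublishAddress parse(String address) {
        Objects.requireNonNull(address, "address");
        String s = address.trim();

        // Everything before the first '/' is the hostname (possibly empty).
        int slash = s.indexOf('/');
        String host = slash >= 0 ? s.substring(0, slash) : "";
        String ipAndPort = slash >= 0 ? s.substring(slash + 1) : s;

        // The port follows the last ':' in the remainder.
        int colon = ipAndPort.lastIndexOf(':');
        if (colon <= 0 || colon == ipAndPort.length() - 1) {
            throw new IllegalArgumentException("Expected ip:port but got: " + address);
        }
        String ip = ipAndPort.substring(0, colon);
        int port = Integer.parseInt(ipAndPort.substring(colon + 1));
        return new PublishAddress(host, ip, port);
    }

    public static void main(String[] args) {
        PublishAddress a = parse("foo.bar.amazonaws.com/10.10.0.23:9200");
        System.out.println(a.host + " " + a.ip + " " + a.port);
    }
}
```

A client could then connect to either `host` or `ip` together with `port`, instead of passing the combined string to the socket layer.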

costin · 2015-11-23T21:46:48Z

While useful, the post still doesn't answer the two questions above.

philmes · 2015-11-23T21:55:42Z

Exception below, which I think answers both questions. Addresses/IPs obscured:

Each host is running two ES instances. Four hosts in total, for eight instances. The instances are all happy in one cluster.

If I remove the custom hostname, everything works as it should - the only difference is the output from /_nodes/transport

15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200] failed (Connection refused); selected next node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-2.eu-west-1.compute.amazonaws.com/10.0.0.119:9200] failed (Connection refused); selected next node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201] failed (Connection refused); selected next node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200] failed (Connection refused); selected next node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9201] failed (Connection refused); selected next node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9200] failed (Connection refused); selected next node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200] failed (Connection refused); selected next node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201] failed (Connection refused); selected next node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9201] failed (Connection refused); no other nodes left - aborting...
15/11/23 19:59:02 ERROR Executor: Exception in task 5.3 in stage 7.0 (TID 15286)
org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[ec2-2.eu-west-1.compute.amazonaws.com/10.0.0.119:9200, ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9200, ec2-2.eu-west-1.compute.amazonaws.com/10.0.0.119:9201, ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200, ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201, ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9201, ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200, ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9201]]
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:383)
    at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:391)
    at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:467)
    at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:473)
    at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:473)
    at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:411)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:399)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

costin · 2015-11-23T22:04:36Z

Cheers.

Will look into it. As you mentioned, the parsing likely goes wrong, which causes the underlying socket to not use the hostname (instead of the IP).

jvaralves · 2015-11-27T16:40:20Z

I am having a similar issue. In org.elasticsearch.hadoop.rest.RestClient, the code tries to get the starting index (searching for /) and the ending index (searching for ]) of the IP address. The problem is that the inet string is in the format "/:" without any "]". This causes the ending index to be -1, which then causes an exception on the next line.
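
The failure mode described here can be reproduced with a small Java sketch. This is a hypothetical reconstruction based only on the description above, not the actual RestClient source. The class and method names, and the sample address in `main`, are invented for the example. `String.indexOf` returns -1 when the character is absent, and `String.substring` throws when the end index is smaller than the start index.

```java
/** Hypothetical reconstruction (not ES-Hadoop code) of slash/bracket index parsing. */
final class BracketParseDemo {

    /** Takes the text between the first '/' and the first ']'. */
    static String extractBetweenSlashAndBracket(String inet) {
        int start = inet.indexOf('/') + 1;
        int end = inet.indexOf(']');       // -1 when there is no ']'
        return inet.substring(start, end); // throws if end is -1
    }

    /** Same idea, but falls back to the end of the string when ']' is missing. */
    static String extractDefensively(String inet) {
        int start = inet.indexOf('/') + 1;
        int end = inet.indexOf(']');
        if (end < 0) {
            end = inet.length();
        }
        return inet.substring(start, end);
    }

    public static void main(String[] args) {
        String unwrapped = "example-host/10.0.0.37:9200"; // made-up sample value
        System.out.println(extractDefensively(unwrapped));
        try {
            extractBetweenSlashAndBracket(unwrapped);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring failed: " + e.getMessage());
        }
    }
}
```

Checking the result of `indexOf` before calling `substring` turns the crash into a recoverable case, although a real fix also has to decide what to do with the hostname part.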

For reference I am using elasticsearch-spark_2.10 2.2.0-m1 and my elasticsearch node is version 2.0.0.

I am specifying the node in spark using the following config option.

conf.set("es.nodes", "")

Am I doing anything wrong?

costin · 2015-12-03T21:20:49Z

@jvaralves it's best to open a different ticket and provide more information, such as what the exception is and what your "inet string" is. Furthermore, if you don't specify any value for es.nodes, where do you want ES-Hadoop to connect?

costin · 2015-12-03T22:19:38Z

Fixed in master. Publishing a new dev snapshot as we speak.

costin added bug :Rest :Spark v2.2.0-rc1 labels Nov 23, 2015

costin closed this as completed in 98ab152 Dec 3, 2015
