2.2.0-beta1 and Elasticsearch 2.0 transport error #614

Closed
philmes opened this issue Nov 23, 2015 · 8 comments

@philmes

philmes commented Nov 23, 2015

I think this is an es-spark problem, but it might also be an ES 2 problem.

I have an ES 2.0 cluster with a custom publish host set:

network.publish_host: _ec2:publicDns_

This produces the following in /_nodes/transport:

{  
   "cluster_name":"es-test-cluster",
   "nodes":{  
      "nnDrgTgVSVqZPLUmP37E8w":{  
         "name":"Living Tribunal",
         "transport_address":"<some ip>:9300",
         "host":"<some ip>.57",
         "ip":"<some ip>.57",
         "version":"2.0.0",
         "build":"de54438",
         "http_address":"<an EC2 public DNS>/<some ip>:9200"
         ...

The http_address is not parsed correctly, so the REST client attempts to connect to a completely invalid address.

@costin
Member

costin commented Nov 23, 2015

What's the exception that you are getting? What is the address being used to connect?

@philmes
Author

philmes commented Nov 23, 2015

Stepping through it in the debugger, in RestClient:discoverNodes() it adds each node with:

hosts.add(StringUtils.parseIpAddress(inet).toString());

parseIpAddress doesn't seem to understand the format of the server's response, which looks like foo.bar.amazonaws.com/10.10.0.23:9200

I can definitely connect to both the hostname and the IP from the Spark cluster.
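As an illustration of the format described above, here is a minimal sketch of a parser that accepts the `hostname/ip:port` form. The class `HttpAddress` and its methods are hypothetical names introduced for this example and are not the es-hadoop implementation. It also assumes that older responses wrap the address as `inet[/ip:port]`, and it does not attempt to handle IPv6 literals.

```java
/** Hypothetical helper for illustration; not part of es-hadoop. */
final class HttpAddress {
    final String hostname; // empty when only an IP is published
    final String ip;
    final int port;

    HttpAddress(String hostname, String ip, int port) {
        this.hostname = hostname;
        this.ip = ip;
        this.port = port;
    }

    /** Parses "host/ip:port", "/ip:port", "ip:port", or the assumed "inet[/ip:port]" wrapper. */
    static HttpAddress parse(String raw) {
        String s = raw.trim();
        // Strip the assumed older-style wrapper if present.
        if (s.startsWith("inet[") && s.endsWith("]")) {
            s = s.substring("inet[".length(), s.length() - 1);
        }
        // Everything before the slash (if anything) is the published hostname.
        int slash = s.indexOf('/');
        String hostname = slash > 0 ? s.substring(0, slash) : "";
        String ipAndPort = slash >= 0 ? s.substring(slash + 1) : s;
        // The port follows the last colon.
        int colon = ipAndPort.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("No port in address: " + raw);
        }
        String ip = ipAndPort.substring(0, colon);
        int port = Integer.parseInt(ipAndPort.substring(colon + 1));
        return new HttpAddress(hostname, ip, port);
    }

    /** Address to connect to: the hostname when one is published, otherwise the IP. */
    String connectAddress() {
        return (hostname.isEmpty() ? ip : hostname) + ":" + port;
    }

    public static void main(String[] args) {
        HttpAddress a = parse("foo.bar.amazonaws.com/10.10.0.23:9200");
        System.out.println(a.connectAddress());
    }
}
```

Preferring the hostname over the IP is a design choice made for this sketch. Since both are reachable in the setup described here, either would work; the point is only that the combined string must be split before it is handed to a socket.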

@costin
Member

costin commented Nov 23, 2015

While useful, the post still doesn't answer the two questions above.

@philmes
Author

philmes commented Nov 23, 2015

Exception below, which I think answers both questions. Addresses/IPs obscured:

Each host is running two ES instances. Four hosts in total, for eight instances. The instances are all happy in one cluster.

If I remove the custom hostname, everything works as it should - the only difference is the output from /_nodes/transport

15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200] failed (Connection refused); selected next node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-2.eu-west-1.compute.amazonaws.com/10.0.0.119:9200] failed (Connection refused); selected next node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201] failed (Connection refused); selected next node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200] failed (Connection refused); selected next node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9201] failed (Connection refused); selected next node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9200] failed (Connection refused); selected next node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200] failed (Connection refused); selected next node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9201]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201] failed (Connection refused); selected next node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200]
15/11/23 19:59:02 ERROR NetworkClient: Node [ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9201] failed (Connection refused); no other nodes left - aborting...
15/11/23 19:59:02 ERROR Executor: Exception in task 5.3 in stage 7.0 (TID 15286)
org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[ec2-2.eu-west-1.compute.amazonaws.com/10.0.0.119:9200, ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9200, ec2-2.eu-west-1.compute.amazonaws.com/10.0.0.119:9201, ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9200, ec2-1.eu-west-1.compute.amazonaws.com/10.0.0.37:9201, ec2-3.eu-west-1.compute.amazonaws.com/10.0.0.75:9201, ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9200, ec2-4.eu-west-1.compute.amazonaws.com/10.0.0.57:9201]]
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:383)
    at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:391)
    at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:467)
    at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:473)
    at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:473)
    at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:411)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:399)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

@costin
Member

costin commented Nov 23, 2015

Cheers.

Will look into it. As you mentioned, the parsing most likely goes wrong, which causes the underlying socket to not use the hostname (instead of the IP).

@jvaralves

I am having a similar issue. In org.elasticsearch.hadoop.rest.RestClient, the code tries to find the starting index (searching for /) and the ending index (searching for ]) of the IP address. The problem is that the inet string is in the format "/:" without any "]". This causes the ending index to be -1, which then causes an exception on the next line.
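The failure mode described above can be reproduced in isolation. The sketch below is a reconstruction from that description, not the actual RestClient code, and the class and method names are hypothetical. It shows why a missing `]` makes `indexOf` return -1 and `substring` throw, alongside a tolerant variant that treats a missing `]` as the end of the string.

```java
/** Hypothetical demo reconstructed from the description; not es-hadoop code. */
final class InetBracketDemo {

    /** Naive extraction: the text between the first '/' and the first ']'. */
    static String naiveExtract(String inet) {
        int start = inet.indexOf('/') + 1;
        int end = inet.indexOf(']');       // -1 when there is no ']'
        return inet.substring(start, end); // throws when end is -1
    }

    /** Tolerant variant: a missing ']' means the address runs to the end. */
    static String tolerantExtract(String inet) {
        int start = inet.indexOf('/') + 1; // 0 when there is no '/'
        int end = inet.indexOf(']', start);
        if (end < 0) {
            end = inet.length();
        }
        return inet.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(tolerantExtract("inet[/10.0.0.37:9200]"));
        System.out.println(tolerantExtract("host.example/10.0.0.37:9200"));
        try {
            naiveExtract("host.example/10.0.0.37:9200");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("naive extraction failed on the unbracketed form");
        }
    }
}
```

The sample addresses are made up for the demo; `host.example` stands in for whatever hostname the node publishes.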

For reference I am using elasticsearch-spark_2.10 2.2.0-m1 and my elasticsearch node is version 2.0.0.

I am specifying the node in Spark using the following config option:

conf.set("es.nodes", "")

Am I doing anything wrong?

@costin
Member

costin commented Dec 3, 2015

@jvaralves it's best to open a separate ticket and provide more information, such as what the exception is and what your "inet string" is. Furthermore, if you don't specify any value for es.nodes, where do you want ES-Hadoop to connect?

costin closed this as completed in 98ab152 on Dec 3, 2015
@costin
Member

costin commented Dec 3, 2015

Fixed in master. Publishing a new dev snapshot as we speak.
