Cannot Find Node #436

Closed
Chadwiki opened this issue Apr 27, 2015 · 17 comments

Comments

@Chadwiki

Version: 2.1.0.Beta3
Elastic.version: 1.4.4

Setting: 'es.nodes.client.only' = 'true'

java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot find node with id [KRPdnBpJRc2FkyTJ_oRQpQ](is HTTP enabled?) from shard [Shard[state=STARTED, primary=true, node=KRPdnBpJRc2FkyTJ_oRQpQ, name=0, index=metrics-2015.04.18]] in nodes [[{state=STARTED, primary=true, node=KRPdnBpJRc2FkyTJ_oRQpQ, relocating_node=null, shard=0, index=metrics-2015.04.18}, {state=STARTED, primary=false, node=sPeEIWyBRZWlnHAHevG7Ng, relocating_node=null, shard=0, index=metrics-2015.04.18}]]; layout [[[{state=STARTED, primary=true, node=KRPdnBpJRc2FkyTJ_oRQpQ, relocating_node=null, shard=0, index=metrics-2015.04.18}, {state=STARTED, primary=false, node=sPeEIWyBRZWlnHAHevG7Ng, relocating_node=null, shard=0, index=metrics-2015.04.18}], [{state=STARTED, primary=false, node=MB83hrQtR-CgPEfKfUFYWA, relocating_node=null, shard=1, index=metrics-2015.04.18}, {state=STARTED, primary=true, node=KRPdnBpJRc2FkyTJ_oRQpQ, relocating_node=null, shard=1, index=metrics-2015.04.18}]]]

@costin
Member

costin commented Apr 28, 2015

es.nodes.client.only is supported only in master, not in Beta3. The exception you are getting shows (in a bad way) that the node you specified cannot be found because it is not a data node. Upgrade to the dev snapshot.

@Chadwiki
Author

Here are my node types: Master, Search, Index, Data.
The Data nodes do not have HTTP enabled.
The LB URL for Search would be https://enviro.qa.comp.org/ES.
I can hit the Search nodes via IP which are behind the LB URL - 10.2.20.139:9200, 10.2.20.140:9200

Here is a sample query:
CREATE EXTERNAL TABLE qa_metrics (
date timestamp,
category string,
subcategory string,
application string,
version string,
company_id string,
hostname string,
entity_code string,
datacenter_location string)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource'= 'metrics-2015.04.18/metrics',
'es.nodes' = '10.2.20.139,10.2.20.140',
'es.nodes.client.only' = 'true',
'es.field.read.empty.as.null' = 'true',
'es.mapping.id' = '_id',
'es.mapping.names' = 'date:metrics.@timestamp,
category:metrics.Category,
subcategory:metrics.Subcategory,
application:metrics.Application,
version:metrics.Version,
company_id:metrics.extra_data.company_id,
hostname:metrics.Hostname,
datacenter_location:metrics.datacenter_location,
entity_code:metrics.entity_code');

@costin
Member

costin commented Apr 28, 2015

That's the one, and Beta4 will soon be available in the Maven repo. For data nodes, you need to enable the HTTP port; this is a requirement of the connector. If that's not an option, you can set up client nodes with HTTP enabled and configure the connector to use only those.
Without HTTP, I'm afraid the connector won't work.

@Chadwiki
Author

The client nodes are: 10.2.20.139,10.2.20.140 ---
http.enabled: true
node.data: false
node.master: false

I still receive the error listed above.

@Chadwiki
Author

@costin Should this work with the nodes I listed above and the use of es.nodes.client.only?

@costin
Member

costin commented Apr 29, 2015

The nodes you specify in es.nodes only need to have HTTP access. In a client-only scenario, it is recommended to specify some of the client nodes here. You don't need all of them, since the rest will be discovered automatically.
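
One way to read the advice above, as a sketch rather than a verified configuration: list a single HTTP-enabled client node in es.nodes and leave the rest to discovery. The table name qa_metrics_client_only and the reduced column list are hypothetical; the address, resource, and field mappings are copied from the table definition earlier in this thread and have not been tested against Beta4.

```sql
-- Sketch only: hypothetical table name, columns trimmed for brevity.
-- Only one client node is listed; the connector is expected to discover
-- the other client nodes on its own when es.nodes.client.only is true.
CREATE EXTERNAL TABLE qa_metrics_client_only (
  category string,
  hostname string)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'metrics-2015.04.18/metrics',
  'es.nodes' = '10.2.20.139',
  'es.nodes.client.only' = 'true',
  'es.mapping.names' = 'category:metrics.Category, hostname:metrics.Hostname');
```

The listed node still has to be reachable over HTTP from every Hadoop task, so a load balancer URL that the tasks cannot resolve would not be a substitute here.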

Can you double-check that you are using Beta4 (and that no other jar is available in the classpath) and follow the instructions here? In particular, turn on logging for the REST package and please upload the logs somewhere, for example as a gist.

Cheers,

@Chadwiki
Author

Yes, I removed all files except elasticsearch-hadoop-2.1.0.Beta4.jar.

Here is the hive output:
http://pastebin.com/4fnkwkYR
and
http://pastebin.com/rNPPiHm5

I did notice another error this time. I'm not sure whether it is related:

Error!
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

@Chadwiki
Author

@costin Any ideas?

@Chadwiki
Author

Chadwiki commented May 5, 2015

@costin Any thoughts on this issue?

@costin
Member

costin commented May 5, 2015

Not at the moment, no. When I do, I'll post them - the error does seem conspicuous and likely is caused by an incorrect mapping as explained here. What's your ES mapping?

costin reopened this May 5, 2015
costin added the :Hive label May 5, 2015
@Chadwiki
Author

Chadwiki commented May 5, 2015

My mapping is huge, but the Hive statement I used works in a lower environment, which has HTTP enabled on all data nodes. Both environments use the same mapping.
I also compared my mapping against the one in the failing environment.

@costin
Member

costin commented May 6, 2015

What's the mapping relevant to the fields used by the Hive table? The connector will use those and ignore the rest.

@Chadwiki
Author

Chadwiki commented May 6, 2015

I was able to get this working after I opened up HTTP on the data nodes.
I can retest the above Hive statement, but I need to be able to get es.nodes.client.only working in the next environment.

Understood about the mapping -

  1. Does es.resource require the type, or can it just be an index?
  2. When using es.query, does this change the path for es.mapping.names?

@costin
Member

costin commented May 6, 2015

  1. es.resource requires both an index and a type. You can use wildcards, but in the end it will be resolved to an index and a type; see the documentation.
  2. Not sure what you mean. es.query is the query sent to Elasticsearch, while es.mapping.names handles the mapping between Hive and Elasticsearch names. If your es.query refers to fields in Elasticsearch that are mapped to the Hive table, use the Elasticsearch names. The connector does not translate es.query.
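
To make the second point concrete, here is a minimal sketch, assuming the Elasticsearch field name metrics.Category from the es.mapping.names shown earlier in this thread. The table name qa_metrics_filtered and the value web are hypothetical, and the statement has not been run against this cluster. es.query is written with the Elasticsearch name, while the Hive SELECT uses the Hive column name.

```sql
-- Sketch only: hypothetical table name and filter value.
-- es.query is passed to Elasticsearch as-is, so it uses the Elasticsearch
-- field name (metrics.Category), not the Hive column name (category).
CREATE EXTERNAL TABLE qa_metrics_filtered (
  category string,
  hostname string)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'metrics-2015.04.18/metrics',
  'es.nodes' = '10.2.20.139,10.2.20.140',
  'es.query' = '?q=metrics.Category:web',
  'es.mapping.names' = 'category:metrics.Category, hostname:metrics.Hostname');

-- On the Hive side, queries refer to the Hive column names.
SELECT category, hostname FROM qa_metrics_filtered LIMIT 10;
```

Only documents matching es.query reach Hive, which is what narrows the data returned from Elasticsearch; the field mapping itself is unchanged by it.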


@Chadwiki
Author

Chadwiki commented May 7, 2015

What does es.query provide? How does it help in most cases?
I was trying to use it to isolate the data further, by returning only specific data from Elastic.

To explain my previous question: the mapping of fields is direct when looking at data in Elastic. You use a path/field name to perform a search. However, when performing a search (POST), the returned data uses a different field path.

Example: to search, you can use metric.application on the index. The reference to this field in the returned JSON is data.hits.hits.application.

I was wondering if using es.query transforms the path or mappings in any way.

My other issue, 444, seems to be more relevant now. I have moved my client-nodes-only testing into another environment that has HTTP disabled on all data nodes. This has the latest Hive and Issue. It is a different error from "cannot find node".

If you deem them not related, we can keep each open. My bottom line is that I need to get client-only working.

Thanks so much for your time thus far...

@costin
Member

costin commented May 18, 2015

@webtoed Does this mean you have found the solution to your problem? Does client routing work for you or not? And if so, what was the problem in the first place?
