exclude master/client nodes from data requests #214

Closed
costin opened this issue Jun 13, 2014 · 3 comments

Comments

@costin
Member

costin commented Jun 13, 2014

Double-check the behaviour in clusters with master-only or client/tribe nodes when writing and reading data.

@costin costin added v2.0.2 and removed v2.0.1 labels Aug 15, 2014
@jeffsteinmetz

I found this open issue, which may be related to a question I was going to ask.

In your webinar, you mention that search requests from es-hadoop to Elasticsearch distribute the query across nodes. You also mention that if one node goes down, es-hadoop will move to another node.
My question is this:
We use client nodes (no data) for all our queries (no queries go directly to a data node).
Our environment is 1 dedicated master (no data), 2 or more clients (no data, HTTP enabled) and several data nodes (non-client).

How does es-hadoop distribute load in this case? Does it distribute load across the list of client nodes passed in through "es.nodes" in the SparkConf, or is it doing some type of request routing within the query (through the client node)?

@costin
Member Author

costin commented Apr 28, 2015

@jeffsteinmetz For some reason I've only found this comment now - apologies for the huge delay. The latest Beta (4) has support for client-only nodes - in other words, es-hadoop can be forced to connect to the cluster only through these nodes. Clearly it affects parallelism, since the queries are distributed between these nodes (instead of going to the data nodes directly), but performance doesn't seem to be affected too much - it really depends on the volume and how important locality is.
In other words, if you are doing HUGE bulk reads, you might find it slower; if not, you are unlikely to spot any difference.
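
For the topology described above, the settings involved might look roughly like the sketch below. It assumes an es-hadoop version that provides the `es.nodes.client.only` option (2.1 or later); the host names and the `es_hadoop_settings` helper are hypothetical and only there for illustration.

```python
# Hypothetical client-node addresses; replace with the HTTP-enabled
# client nodes of your own cluster.
CLIENT_NODES = ["client-1:9200", "client-2:9200"]


def es_hadoop_settings(client_nodes):
    """Build es-hadoop settings that send all requests through client nodes.

    Illustrative helper, not part of es-hadoop itself.
    """
    return {
        # Comma-separated list of nodes es-hadoop connects to initially.
        "es.nodes": ",".join(client_nodes),
        # Route requests through client nodes only, instead of talking
        # to the data nodes directly (this reduces parallelism).
        "es.nodes.client.only": "true",
    }


settings = es_hadoop_settings(CLIENT_NODES)
for key, value in sorted(settings.items()):
    print(f"{key}={value}")
```

In PySpark, a dictionary like this could be applied with `SparkConf().setAll(settings.items())`; whether the extra hop through the client nodes matters will depend on read volume, as noted above.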

Cheers,

@costin costin closed this as completed Apr 28, 2015
@costin
Member Author

costin commented Apr 28, 2015

By the way, I've closed the issue since Beta4 was just released. Let me know if you have any issues or questions, either through the mailing list or in another issue.

Cheers!
