LLRC should allow requests to prefer a node #48717

Open
jbaiera opened this issue Oct 30, 2019 · 8 comments
Labels
:Clients/Java Low Level REST Client Minimal dependencies Java Client for Elasticsearch >enhancement Team:Data Management Meta label for data/management team

Comments

@jbaiera
Member

jbaiera commented Oct 30, 2019

The Java Low Level REST Client keeps a pool of hosts and selects from it in round-robin fashion to service requests. A host is selected for a request based on the client's current "node selector" implementation, a pluggable function that can filter the candidate nodes by their roles or by other user-provided attributes.

In some client applications, a request should have an affinity to a particular node when that node is available, instead of always being routed by round-robin logic. This behavior would let distributed applications spread their client load across a cluster more deterministically instead of relying purely on round-robin selection.

One such example is the ES-Hadoop project, where a single client may be shared across multiple threads, each preferring to send requests directly to a node that is known to host a shard. The client accepts that the preferred node may be unreachable and falls back to other nodes in the cluster if required, but in the common case where the preferred node is available, it is used for all requests for a given workload.
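As a rough illustration of the behavior described above (prefer one node, fall back to round robin), here is a minimal sketch. It does not use the real client: `PreferredHostPool`, its `candidates` method, and the plain-string host names are all hypothetical stand-ins, and the set of dead hosts is assumed to be tracked elsewhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the client's host pool; not the real RestClient internals.
final class PreferredHostPool {
    private final List<String> hosts;
    private int next = 0; // round-robin cursor

    PreferredHostPool(List<String> hosts) {
        this.hosts = new ArrayList<>(hosts);
    }

    // Returns the hosts to try, in order: the preferred host first if it is
    // known and not marked dead, then the remaining live hosts in round-robin order.
    synchronized List<String> candidates(String preferred, Set<String> dead) {
        List<String> ordered = new ArrayList<>();
        if (preferred != null && hosts.contains(preferred) && !dead.contains(preferred)) {
            ordered.add(preferred);
        }
        int n = hosts.size();
        for (int i = 0; i < n; i++) {
            String host = hosts.get((next + i) % n);
            if (!dead.contains(host) && !ordered.contains(host)) {
                ordered.add(host);
            }
        }
        if (n > 0) {
            next = (next + 1) % n; // advance the cursor for the next request
        }
        return ordered;
    }
}
```

With hosts `a`, `b`, `c` and a preference for `c`, the first call would try `c` before the others; if `c` is marked dead, the call degrades to plain round robin over the live hosts.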

@jbaiera jbaiera added >enhancement :Clients/Java Low Level REST Client Minimal dependencies Java Client for Elasticsearch labels Oct 30, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Java Low Level REST Client)

@javanna
Member

javanna commented Nov 4, 2019

Would it be possible to implement this through a custom node selector? I am also not sure that a client should have this capability of being aware of where the data is located, etc. Was uneven balancing of requests a problem that occurred in practice? How do other clients handle this, @elastic/es-clients?

@ezimuel
Contributor

ezimuel commented Nov 5, 2019

@javanna we already have the logic to select a node in many Elasticsearch clients. For instance, in elasticsearch-php we have a SelectorInterface and a ConnectionPoolInterface to allow custom implementations. We use a Round Robin algorithm by default for the connection pool. Recently @delvedor and I also proposed a Weighted Connection Pool (here is a JS implementation).

@javanna
Member

javanna commented Nov 5, 2019

That, I believe, is comparable to the node selector construct that the Java client already exposes.

@karmi
Contributor

karmi commented Nov 5, 2019

> that I believe is comparable to the node selector construct that the java client already exposes.

That's correct: choosing nodes selectively, e.g. based on node attributes, is exactly the use case for a custom selector.

@jbaiera
Member Author

jbaiera commented Nov 8, 2019

@javanna The node selector option is a reasonable one when each client is used for a single task, but it is less reasonable when we want to reuse a client across multiple tasks. For instance, a single client instance used for the life of a Spark Streaming job will likely be used by multiple tasks at the same time, each potentially with its own preferred node. If we can provide a node selector per request, we can get around those issues.
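To make the per-request idea concrete, here is a minimal sketch under stated assumptions. `HostSelector` is a simplified stand-in that mimics the shape of the client's node selector (it removes unwanted entries from the iterable it is given); `SketchRequestOptions`, `SketchClient`, `pickHost`, and `prefer` are hypothetical names, and the selector field on the options object is the proposal under discussion, not an existing API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for a node selector: implementations remove the hosts
// they do not want from the iterable they are given.
interface HostSelector {
    void select(Iterable<String> hosts);

    HostSelector ANY = hosts -> { }; // keeps every host
}

// Hypothetical per-request options carrying an optional selector.
final class SketchRequestOptions {
    final HostSelector selector; // null means "use the client default"

    SketchRequestOptions(HostSelector selector) {
        this.selector = selector;
    }
}

// Hypothetical client stand-in; round robin is omitted to keep the sketch short.
final class SketchClient {
    private final List<String> hosts;
    private final HostSelector defaultSelector;

    SketchClient(List<String> hosts, HostSelector defaultSelector) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts);
        this.defaultSelector = defaultSelector;
    }

    // Picks the host for one request: the request's own selector, if any,
    // takes precedence over the client-wide default.
    String pickHost(SketchRequestOptions options) {
        HostSelector selector = (options != null && options.selector != null)
                ? options.selector
                : defaultSelector;
        List<String> candidates = new ArrayList<>(hosts);
        selector.select(candidates);
        if (candidates.isEmpty()) {
            candidates = new ArrayList<>(hosts); // selector rejected everything: fall back
        }
        return candidates.get(0);
    }

    // Selector that narrows the list to the preferred host when it is present
    // and leaves the list untouched otherwise.
    static HostSelector prefer(String preferred) {
        return hosts -> {
            boolean present = false;
            for (String host : hosts) {
                if (host.equals(preferred)) {
                    present = true;
                    break;
                }
            }
            if (!present) {
                return;
            }
            for (Iterator<String> it = hosts.iterator(); it.hasNext();) {
                if (!it.next().equals(preferred)) {
                    it.remove();
                }
            }
        };
    }
}
```

In this sketch, each task would build its own options with `SketchClient.prefer(...)`, so tasks sharing one client do not interfere with each other's preferences.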

@javanna
Member

javanna commented Nov 12, 2019

I see what you mean. Potentially the NodeSelector could be provided with the RequestOptions. On the other hand, I wonder if a node selector implementation could be made aware of the different tasks that the client is used for and choose nodes based on that too.
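One possible reading of a "task-aware" selector is a single shared selector that looks up the calling task's preference. The sketch below uses a `ThreadLocal` for that lookup. Everything in it is hypothetical (`TaskAwareSelector`, `pin`, `unpin`), and it assumes that selection runs on the task's own thread, which may not hold for asynchronous requests or retries.

```java
import java.util.Iterator;

// Hypothetical selector shared by all tasks of one client; each task thread
// records its own preferred host before sending requests.
final class TaskAwareSelector {
    private final ThreadLocal<String> preferred = new ThreadLocal<>();

    // Pins the calling thread's requests to the given host.
    void pin(String host) {
        preferred.set(host);
    }

    // Clears the calling thread's preference.
    void unpin() {
        preferred.remove();
    }

    // Narrows the hosts to the calling thread's preferred host when it is
    // present; leaves them untouched when there is no preference or the
    // preferred host is not among the candidates.
    void select(Iterable<String> hosts) {
        String want = preferred.get();
        if (want == null) {
            return;
        }
        boolean present = false;
        for (String host : hosts) {
            if (host.equals(want)) {
                present = true;
                break;
            }
        }
        if (!present) {
            return;
        }
        for (Iterator<String> it = hosts.iterator(); it.hasNext();) {
            if (!it.next().equals(want)) {
                it.remove();
            }
        }
    }
}
```

The trade-off in this sketch is that the preference lives in mutable state outside the request, so a task has to remember to call `pin` and `unpin` at the right times.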

@jbaiera
Member Author

jbaiera commented Nov 13, 2019

> on the other hand I wonder if a node selector implementation could be made aware of the different tasks that the client is used for and choose nodes based on that too.

Yeah, I purposely worded this issue as a description of the problem because I didn't want to push a specific solution on it. This would be an interesting option, though it seems like it would be a bit more difficult to use through the API, especially if the NodeSelector you want to use might change its state over time (e.g. a task might want to pin to a different node).

@rjernst rjernst added the Team:Data Management Meta label for data/management team label May 4, 2020
6 participants