LLRC should allow requests to prefer a node #48717
Comments
Pinging @elastic/es-core-features (:Core/Features/Java Low Level REST Client)
Would it be possible to implement this through a custom node selector? I am also not sure that a client should be aware of where the data is located. Was uneven balancing of requests a problem that occurred in practice? How do other clients handle this, @elastic/es-clients?
@javanna We already have the logic to select a node in many Elasticsearch clients. For instance, in elasticsearch-php we have a SelectorInterface and a ConnectionPoolInterface to allow custom implementations. We used a Round Robin algorithm by default for the connection pool. Recently @delvedor and I also proposed a Weighted Connection Pool (along with a JS implementation).
that I believe is comparable to the node selector construct that the Java client already exposes.
That's correct: choosing nodes selectively, e.g. based on the node attributes, is exactly the use case for a custom selector.
@javanna The node selector option is a reasonable one when each client is used for a single task, but it is less reasonable when we want to reuse a client across multiple tasks. For instance, if one client instance is used for the life of a Spark Streaming job, it will likely be used by multiple tasks at the same time, each potentially with its own preferred node. If we can provide a node selector per request, we can get around those issues.
I see what you mean. Potentially the NodeSelector could be provided with the RequestOptions. On the other hand, I wonder if a node selector implementation could be made aware of the different tasks that the client is used for and choose nodes based on that too.
Yeah, I purposely worded this issue as a description of the problem because I didn't want to push a specific solution on it. This would be an interesting option, though it seems like it would be a bit more difficult to use through the API, especially if the NodeSelector you want to use might change its state over time (e.g. a task might want to pin to a different node).
The Java Low Level REST Client has a pool of hosts that it selects from in round robin fashion to service requests. A host is selected for a request based on the client's current "node selector" implementation, which is a pluggable function that can filter the nodes to call based on their roles or other user-provided attributes.
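To make the current behavior concrete, here is a minimal, self-contained sketch of round robin selection over a selector-filtered host pool. The `RoundRobinSketch` and `Host` types are hypothetical stand-ins written for this illustration, and the selector is modeled as a plain `Predicate`; none of this is the client's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical illustration only: these are not the real client classes.
class RoundRobinSketch {
    /** Minimal host description: a name plus a single role flag. */
    static final class Host {
        final String name;
        final boolean dataNode;

        Host(String name, boolean dataNode) {
            this.name = name;
            this.dataNode = dataNode;
        }
    }

    private final List<Host> hosts;
    private final Predicate<Host> selector; // stands in for the "node selector"
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSketch(List<Host> hosts, Predicate<Host> selector) {
        this.hosts = hosts;
        this.selector = selector;
    }

    /** Returns the next eligible host, rotating through the filtered pool. */
    Host next() {
        List<Host> eligible = new ArrayList<>();
        for (Host host : hosts) {
            if (selector.test(host)) {
                eligible.add(host);
            }
        }
        if (eligible.isEmpty()) {
            throw new IllegalStateException("selector rejected all hosts");
        }
        // floorMod keeps the index non-negative even if the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), eligible.size());
        return eligible.get(index);
    }
}
```

With a selector such as `h -> h.dataNode`, successive calls to `next()` cycle through the data nodes only, regardless of which caller issued the request.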
In some client applications, there may be cases where a request should have an affinity to a particular node if it is available instead of always selecting a host using round robin logic. This sort of behavior would allow distributed applications to more deterministically spread their client load across a cluster instead of relying on purely randomized round robin communication.
One such example would be the ES-Hadoop project, where a single client may be shared across multiple threads, each preferring to send requests directly to a node that is known to be hosting a shard. The client accepts that the preferred node may be unreachable and falls back to other nodes in the cluster to submit the request if required, but in the common case where the preferred node is available, it will be used for all requests for a given workload.
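The requested behavior could look roughly like the following sketch, which computes the order in which hosts would be tried for a single request. `PreferredNodeSketch`, `attemptOrder`, and the use of plain host-name strings are all hypothetical, chosen only for illustration; the fallback here uses the pool's original order rather than round robin to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not an existing client API.
class PreferredNodeSketch {
    /**
     * Returns the hosts to try for one request: the preferred host first,
     * if it is in the pool and not marked dead, followed by the remaining
     * live hosts in their original order.
     */
    static List<String> attemptOrder(List<String> hosts, Set<String> dead, String preferred) {
        List<String> order = new ArrayList<>();
        if (preferred != null && hosts.contains(preferred) && !dead.contains(preferred)) {
            order.add(preferred);
        }
        for (String host : hosts) {
            // Skip dead hosts and the preferred host (already handled above).
            if (!dead.contains(host) && !host.equals(preferred)) {
                order.add(host);
            }
        }
        return order;
    }
}
```

Because the preference is passed per call, several threads sharing one client could each supply a different preferred host without changing any client-wide state.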