Improved, unified, sniffing heuristics #24871
Comments
@javanna WDYT?
It would also be nice to discuss which nodes should be queried for this data. I would propose that clients:
The idea here is that it would be advisable for users to seed their clients with stable nodes, which in many installations will be the master nodes. It is also advisable that clients try to keep load off the master nodes, hence making them a last resort. There is a risk here, that in large installs, if the data nodes were to all disappear a thundering herd of
This seems like a general discussion on what Elasticsearch REST clients should do compared to what they do now in terms of sniffing. We should hear what @elastic/es-clients think of this proposal.
I thought that the official clients already allowed plugging in custom selectors so that users can choose which nodes get selected. The Java low-level REST client doesn't support this yet, but there's #21888 for it. The default behaviour when sniffing should probably be adjusted, as I think all nodes are selected by the official clients while master-only nodes should be skipped. Apart from that, I am not sure what else should change, provided that the notion of a selector is exposed and pluggable in every client.
@javanna I took a look at the Python and Ruby clients, and I can't find any support for this. Maybe I'm missing something? I also don't believe Beats supports this. Maybe @honzakral @karmi and @andrewkroh can confirm? Perhaps other clients support this? At any rate, the forced omission of master-only nodes would be a great thing to standardize.
Correct, Beats does not currently have sniffing support. @7AC is working on go-elasticsearch which Beats will eventually use. Having support for this in the client library would be nice.
Thanks @GlenRSmith, it looks like the Python client rejects master nodes. That's great! It would still be great if we could standardize on node settings filtering as well. Maybe this has been tackled before, but ideally clients would all have identical sniffing algorithms, which is what I'm trying to get at here.
@andrewvc of course, and that's very much the broad intent - uniformity and parity among the low-level clients. Striving for that on this particular item isn't objectionable.
Currently there is the default behavior (filtering out master-only nodes) and the ability to override the callback used to filter out the nodes. In Python all you have to do is supply your own implementation of that callback [0]. Since this is advanced functionality, I think that is sufficient. So far I haven't had a single request for anything more structured/systematic. This approach is common to all the clients.
0 - https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/transport.py#L11-L28
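As an illustration of the kind of filtering callback described above, here is a minimal sketch. The function name `skip_master_only`, its two-argument signature, and the `roles` field in the node info are assumptions made for this example, not the confirmed API of any particular client.

```python
from typing import Any, Dict, Optional


def skip_master_only(
    node_info: Dict[str, Any], host: Dict[str, Any]
) -> Optional[Dict[str, Any]]:
    """Hypothetical filter callback: drop master-only nodes, keep the rest.

    Assumes node_info carries a "roles" list as reported by the nodes info
    API. Returning None means "do not use this node".
    """
    roles = node_info.get("roles", [])
    if roles == ["master"]:
        # A node whose only role is master should not receive client traffic.
        return None
    return host
```

A user wanting different criteria would swap in their own function with the same shape.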
There's documentation for the sniffing behaviour of the .NET clients, which includes the ability to specify a predicate to determine which nodes in the cluster API calls should be executed on.
The Ruby client doesn't do any filtering right now; there's an old open issue: elastic/elasticsearch-ruby#251. So far nobody has shown any interest in the feature, though, so I've put it on the back burner. I've realized that the
The Ruby client, like all the official clients, supports supplying a custom connection selector, which allows people to programmatically select nodes based on arbitrary criteria, e.g. the node attributes. Regarding the feature itself, I'm not sure I grasp what exactly is being suggested -- it looks to me like a feature request to add a
Having spoken to a variety of client authors, the consensus is not to have this discussion on the ES repo since this isn't an ES issue, closing. |
Currently, sniffing can be problematic due to its reliance on http endpoints meaning the node is sniffable. Part of this has bled through in #12792. It's also complicated in that it's hard or impossible for users to define arbitrary sniffing criteria.
Goals
I think a good set of sniffing heuristics would have the following properties:
The Proposal
I propose that clients alter their sniffing logic to one of two modes:
So, we would use the same node metadata that is normally used for rack awareness to restrict sniffing based on a custom user-defined property. A client, such as the Logstash Elasticsearch output, might expose the following config options:
A client would construct the list of sniffed nodes by:
/_nodes/http API
key == value in the node metadata if present. If the user has not enabled this option then all results will be returned.
I think this algorithm meets all the goals discussed earlier.
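The node-selection steps described above could be sketched roughly as follows. This is only an illustration: the function name `select_sniffed_nodes`, the `attr_key`/`attr_value` parameters, and the `sniff_group` attribute in the usage example are hypothetical, and the response shape (a `nodes` map whose entries carry `attributes` and `http.publish_address`) is assumed to match what the `/_nodes/http` API returns.

```python
from typing import Any, Dict, List, Optional


def select_sniffed_nodes(
    nodes_response: Dict[str, Any],
    attr_key: Optional[str] = None,
    attr_value: Optional[str] = None,
) -> List[str]:
    """Return HTTP publish addresses of nodes passing the optional filter.

    nodes_response is the already-parsed JSON body of a /_nodes/http call
    (assumed shape). If attr_key is None, no attribute filtering is applied.
    """
    selected: List[str] = []
    for node in nodes_response.get("nodes", {}).values():
        http = node.get("http")
        if not http or "publish_address" not in http:
            # Node exposes no HTTP endpoint, so a REST client cannot use it.
            continue
        if attr_key is not None:
            # Keep only nodes whose metadata has key == value.
            if node.get("attributes", {}).get(attr_key) != attr_value:
                continue
        selected.append(http["publish_address"])
    return selected


# Usage with a hand-written sample response (hypothetical attribute name).
sample = {
    "nodes": {
        "a": {"attributes": {"sniff_group": "ingest"},
              "http": {"publish_address": "10.0.0.1:9200"}},
        "b": {"attributes": {},
              "http": {"publish_address": "10.0.0.2:9200"}},
        "c": {"attributes": {"sniff_group": "ingest"}},  # no HTTP endpoint
    }
}
unfiltered = select_sniffed_nodes(sample)
filtered = select_sniffed_nodes(sample, "sniff_group", "ingest")
```

With the filter disabled every HTTP-enabled node is returned, which mirrors the "all results will be returned" behaviour in the proposal.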