Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for extracting node attributes #266

Closed
wants to merge 9 commits into from

Conversation

ShimiTaNaka
Copy link

This PR will allow setting comma separated list of node attributes to use as labels.
If the node will not have the required label, the value will fallback to empty string

@zwopir
Copy link
Member

zwopir commented Jul 10, 2019

Hi Elad,

thanks for the PR. Can you explain the use case for the node attributes? Do they differ over the nodes in a cluster?
I'd only like to merge features into the exporter the majority of users will benefit from. As I don't know node attributes and how widely they are used a short explanation of your use case would help me decide.

@ShimiTaNaka
Copy link
Author

Yes
Mostly, this: https://www.elastic.co/blog/implementing-hot-warm-cold-in-elasticsearch-with-index-lifecycle-management
In this pattern, not all nodes are equal, and this I will probably want to monitor the storage per data tier.
Regardless, in general, it's possibly to alter index routing according to attributes, but, as said, the main incentive is the Hot-Warm-Cold architecture

@zwopir
Copy link
Member

zwopir commented Jul 10, 2019

ok, yes. Make sense. The problem I see with the PR is that the node attributes are only added to the metrics of the node subcollector. The other subcollectors don't receive the node attributes labels. A way out would be an approach comparable to #207 (comment)
and https://www.robustperception.io/how-to-have-labels-for-machine-roles

So I suggest creating a const metric elasticsearch_node_attributes{...,<attr_name>="<attr_value>"} = (0|1) for each of the envvar-specified node attributes. Additionally to the approach linked above, I'd suggest to set the metric value to 0 if the provided node attribute was not found

For example
elasticsearch_node_attributes{node_attr_data="hot"} = 1 for found attributes
elasticsearch_node_attributes{node_attr_data=""} = 0 for not found attributes

To avoid metric label clashes, we should prefix the label name with (for example) node_attr_. Otherwise it's a matter of time a user has clashing attribute/label names which would cause the exporter to crash. IIRC the prometheus go lib doesn't handle duplicate label definitions too gracefully.

@ShimiTaNaka
Copy link
Author

ok, I see what you mean, and then do a join to match the attribute and the rest of the data, that will also greatly affect the grafana dashboard but indeed will make it easier if for any reason the attributes will be required outside the nodes (although that's the only place where they are returned in the response AFAIK)

@zwopir
Copy link
Member

zwopir commented Jul 10, 2019

not having consistent labels (that are specific to a ES node) was actually a problem before and was fixed by the clusterinfo subcollector. So I appreciate you're going the extra mile implementing the feature as proposed

@ShimiTaNaka
Copy link
Author

@zwopir Updated the PR with the additional metrics

@zwopir
Copy link
Member

zwopir commented Jul 11, 2019

the proposal of the const metric was meant as a replacement for the additional labels

@ShimiTaNaka
Copy link
Author

ShimiTaNaka commented Jul 11, 2019

while I agree it can replace it, I don't see how it's different than the node roles which are also set in the node metrics in additional to the const metrics.
Since the attributes are of the node, while we can complicate the prometheus queries a bit for the node metrics, I don't see why we must complicate it, it will ease up filtering GC/OS metrics by node attributes and I assume this is also part of the reason ES put them only in the node status API to begin with

@anjaweigel-tng
Copy link

Any updates on this? For many metrics, we would really like to differentiate between the data tiers (e.g. hot and frozen nodes). Is there some solution in place I am not seeing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants