ElasticSearch agent check should only query the local node. #1181

Merged Dec 18, 2014

jonaf commented Oct 28, 2014

The current ElasticSearch agent check causes every node in an ElasticSearch cluster to stop reporting metrics if a single ElasticSearch node fails or is slow to respond to the request. This is because ElasticSearch queries all nodes in the cluster before providing a response. In practice, Bazaarvoice has seen this produce troubling side effects in Datadog graphs: all the nodes in the cluster suddenly stop reporting metrics. This is particularly problematic when drawing conclusions from historical data. Because the Datadog graphs draw a line between two visible data points, it is rarely obvious that no data was reported during a given period; instead, it looks as though the ElasticSearch nodes were sending very unusual and sometimes frightening metrics.

The easy fix is to tell ElasticSearch to look only at the local node's stats, and to let every other ElasticSearch node do the same for itself.

Tested with ElasticSearch 0.90.13 and later.
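
For readers who want to see the difference concretely, here is a minimal sketch of the two requests, not the agent's actual code. The helper name `node_stats_url` is hypothetical, and the `_nodes/...` path layout is an assumption; some older ElasticSearch releases may expose node stats under a different path. `_local` is the node filter that restricts the answer to the node receiving the request.

```python
from urllib.parse import urljoin


def node_stats_url(base_url, local_only=True):
    """Build a node stats URL (hypothetical helper, for illustration only).

    With local_only=True the `_local` node filter is used, so only the node
    that receives the request answers and a slow peer cannot stall it.
    """
    # Assumed path layout; adjust for the ElasticSearch version in use.
    path = "_nodes/_local/stats" if local_only else "_nodes/stats"
    # Normalize the trailing slash so urljoin appends instead of replacing.
    return urljoin(base_url.rstrip("/") + "/", path)


if __name__ == "__main__":
    print(node_stats_url("http://localhost:9200"))
    print(node_stats_url("http://localhost:9200", local_only=False))
```

The cluster-wide form waits on every node before responding, which is the failure mode described above; the `_local` form depends only on the node being queried.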

remh commented Oct 28, 2014

Thanks a lot for the detailed feedback and the PR, @jonaf! We are going to review it.

remh commented Dec 18, 2014

Thanks a bunch, @jonaf. We're merging it.

remh added a commit that referenced this pull request Dec 18, 2014
ElasticSearch agent check should only query the local node.
remh merged commit 3c21c2a into DataDog:master Dec 18, 2014
olivielpeau added a commit that referenced this pull request Jun 4, 2015
The hostname matching is not needed anymore as:
- since PR #1181 we only ask the _local node for stats when `is_external` is set to `false`
- we don't match the hostname at all when `is_external` is `true`

Matching hostnames can also filter out legitimate data when the
local elasticsearch node reports a different hostname.

See also issue #457
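
The behavior this commit message describes can be sketched roughly as follows. This is an illustration under stated assumptions, not the check's real implementation: `stats_path` and `nodes_to_report` are hypothetical names, the `_nodes/...` paths are assumed, and the response is assumed to carry a top-level `nodes` mapping keyed by node ID.

```python
def stats_path(is_external):
    """Pick the stats path from the `is_external` setting (hypothetical helper)."""
    # External cluster: ask for all nodes. Otherwise ask only the local node.
    return "_nodes/stats" if is_external else "_nodes/_local/stats"


def nodes_to_report(stats_response):
    """Return every node entry in the response, with no hostname filtering.

    When `_local` was used the response already holds only the local node,
    so comparing hostnames would add nothing and could drop valid data.
    """
    return list(stats_response.get("nodes", {}).values())
```

Because the request itself already limits which nodes come back, nothing is filtered afterwards, so a local node that reports an unexpected hostname is still kept.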
olivielpeau added a commit that referenced this pull request Jun 5, 2015
olivielpeau added a commit that referenced this pull request Jun 9, 2015