[Telemetry] Snapshot telemetry for health API as part of _xpack/health #90877

Closed
andreidan opened this issue Oct 13, 2022 · 2 comments · Fixed by #91708

andreidan commented Oct 13, 2022

We'd like to report an overview of health API diagnostics through telemetry so that we can answer the following questions:

  • Is the health API being used?
  • Which statuses does the API return?
  • Which indicator is most often red or yellow?
  • Which diagnosis is most often reported for each indicator?

The output might look like this:

{
  "invocations": {           // Is the health API being used?
    "total": 20,
    "explain_true": 3,
    "explain_false": 17
  },
  "green": 12,               // What are the statuses the API returned?
  "yellow": {
    "total": 6,
    "causes": {
      "slm": 5,              // What is the most encountered R/Y indicator?
      "shards_availability": 1
    }
  },
  "red": {
    "total": 2,
    "causes": {
      "shards_availability": 2
    }
  },
  "shards_availability": {
    "yellow": {
      "elasticsearch:health:shards_availability:increase_shard_limit_index_setting": 1  // most encountered diagnosis per indicator?
    },
    "red": {
      "elasticsearch:health:shards_availability:enable_index_allocations": 2,
      "elasticsearch:health:shards_availability:generic_explain_allocations": 18    // is shards_availability returning a generic diagnosis?
    }
  },
  "slm": {
    "yellow": {
      "elasticsearch:health:slm:not_running": 5
    }
  }
}
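
To make the intent concrete, here is a minimal Python sketch of how a consumer might answer the four questions from a document in the proposed layout. It is only an illustration: the helper names `most_common` and `summarize` are hypothetical, the numbers are copied from the example above, and it assumes every top-level key other than `invocations`, `green`, `yellow`, and `red` is an indicator name.

```python
from typing import Optional

# Telemetry document in the proposed layout (values copied from the example,
# comments dropped so that it is valid data).
telemetry = {
    "invocations": {"total": 20, "explain_true": 3, "explain_false": 17},
    "green": 12,
    "yellow": {"total": 6, "causes": {"slm": 5, "shards_availability": 1}},
    "red": {"total": 2, "causes": {"shards_availability": 2}},
    "shards_availability": {
        "yellow": {
            "elasticsearch:health:shards_availability:increase_shard_limit_index_setting": 1
        },
        "red": {
            "elasticsearch:health:shards_availability:enable_index_allocations": 2,
            "elasticsearch:health:shards_availability:generic_explain_allocations": 18,
        },
    },
    "slm": {"yellow": {"elasticsearch:health:slm:not_running": 5}},
}

STATUS_KEYS = ("green", "yellow", "red")


def most_common(counts: dict) -> Optional[str]:
    """Return the key with the highest count, or None for an empty mapping."""
    return max(counts, key=counts.get) if counts else None


def summarize(doc: dict) -> dict:
    """Answer the four questions from one telemetry document."""
    # Assumption: anything that is not a status or "invocations" is an indicator.
    indicators = {
        key: value
        for key, value in doc.items()
        if key != "invocations" and key not in STATUS_KEYS
    }
    return {
        "used": doc["invocations"]["total"] > 0,
        "statuses": {
            "green": doc["green"],
            "yellow": doc["yellow"]["total"],
            "red": doc["red"]["total"],
        },
        "top_indicator": {
            status: most_common(doc[status]["causes"]) for status in ("yellow", "red")
        },
        "top_diagnosis": {
            name: {status: most_common(ids) for status, ids in by_status.items()}
            for name, by_status in indicators.items()
        },
    }


summary = summarize(telemetry)
```

With the example numbers, `summary["top_indicator"]` names `slm` for yellow and `shards_availability` for red, and the top red diagnosis for `shards_availability` is the generic `generic_explain_allocations` one, which is the kind of signal the last comment in the example is asking about.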

elasticsearchmachine commented

Pinging @elastic/es-data-management (Team:Data Management)

gmarouli commented Nov 16, 2022

Tasks:

kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022
elasticsearchmachine pushed a commit that referenced this issue Nov 18, 2022
This PR introduces the collectors of the health API telemetry. Our
target telemetry has the following shape:

```
{
  "invocations": {
    "total": 22,
    "verbose_true": 12,
    "verbose_false": 10
  },
  "statuses": {
    "green": 10,
    "yellow": 4,
    "red": 8,
    "values": ["green", "yellow", "red"]
  },
  "indicators": {
    "red": {
      "master_stability": 2,
      "ilm": 2,
      "slm": 4,
      "values": ["master_stability", "ilm", "slm"]
    },
    "yellow": {
      "disk": 1,
      "shards_availability": 1,
      "master_stability": 2,
      "values": ["disk", "shards_availability", "master_stability"]
    }
  },
  "diagnoses": {
    "red": {
      "elasticsearch:health:shards_availability:primary_unassigned": 1,
      "elasticsearch:health:disk:add_disk_capacity_master_nodes": 3,
      "values": ["elasticsearch:health:shards_availability:primary_unassigned", "elasticsearch:health:disk:add_disk_capacity_master_nodes"]
    },
    "yellow": {
      "elasticsearch:health:disk:add_disk_capacity_data_nodes": 1,
      "values": ["elasticsearch:health:disk:add_disk_capacity_data_nodes"]
    }
  }
}
```
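
As a rough illustration of how counts in this shape could be accumulated, the Python sketch below tallies individual health responses and renders them in the layout above. It is not the PR's code. The response structure (`status`, `indicators`, and a `diagnosis` list with `id` fields), the function names, the choice to count only red and yellow indicators, and the assumption that each `values` array simply lists the keys that have counts are all assumptions made for the example.

```python
from collections import defaultdict


def new_stats() -> dict:
    """Create empty tallies (hypothetical in-memory layout)."""
    return {
        "invocations": defaultdict(int),
        "statuses": defaultdict(int),
        "indicators": defaultdict(lambda: defaultdict(int)),
        "diagnoses": defaultdict(lambda: defaultdict(int)),
    }


def record(stats: dict, response: dict, verbose: bool) -> None:
    """Tally one health response (assumed response structure)."""
    stats["invocations"]["total"] += 1
    stats["invocations"]["verbose_true" if verbose else "verbose_false"] += 1
    stats["statuses"][response["status"]] += 1
    for name, indicator in response.get("indicators", {}).items():
        status = indicator["status"]
        # Assumption: only red and yellow indicators are counted.
        if status not in ("red", "yellow"):
            continue
        stats["indicators"][status][name] += 1
        for diagnosis in indicator.get("diagnosis", []):
            stats["diagnoses"][status][diagnosis["id"]] += 1


def with_values(counts: dict) -> dict:
    """Copy the counts and add a "values" list naming the counted keys."""
    out = dict(counts)
    out["values"] = list(counts)
    return out


def to_telemetry(stats: dict) -> dict:
    """Render the tallies in the target telemetry shape."""
    return {
        "invocations": dict(stats["invocations"]),
        "statuses": with_values(stats["statuses"]),
        "indicators": {s: with_values(c) for s, c in stats["indicators"].items()},
        "diagnoses": {s: with_values(c) for s, c in stats["diagnoses"].items()},
    }


# Usage: one green response and one yellow response caused by SLM.
stats = new_stats()
record(stats, {"status": "green", "indicators": {"slm": {"status": "green"}}}, verbose=False)
record(
    stats,
    {
        "status": "yellow",
        "indicators": {
            "slm": {
                "status": "yellow",
                "diagnosis": [{"id": "elasticsearch:health:slm:not_running"}],
            }
        },
    },
    verbose=True,
)
telemetry = to_telemetry(stats)
```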

This PR introduces the thread-safe `Counters` class and the
`HealthApiStats` class, which keeps track of the metrics above based on
the health API responses that it encounters. The `HealthApiStatsAction`
collects the `HealthApiStats` of all nodes.
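
For readers unfamiliar with the pattern, here is a minimal Python sketch of a lock-protected counter map and of summing per-node snapshots into one cluster-wide result. It illustrates the general idea only and is not the Java implementation from the PR. The method names `inc` and `snapshot`, the `merge` helper, and the dotted key `invocations.total` are hypothetical.

```python
import threading
from collections import Counter


class Counters:
    """Hypothetical thread-safe counter map keyed by dotted metric names."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Counter = Counter()

    def inc(self, name: str, amount: int = 1) -> None:
        # The lock makes the read-modify-write atomic across threads.
        with self._lock:
            self._counts[name] += amount

    def snapshot(self) -> dict:
        # Return a copy so callers never see a map that is still changing.
        with self._lock:
            return dict(self._counts)


def merge(per_node: list) -> dict:
    """Sum the snapshots reported by each node into one cluster-wide map."""
    total: Counter = Counter()
    for counts in per_node:
        total.update(counts)  # Counter.update adds counts for matching keys
    return dict(total)


def hammer(counters: Counters, name: str, times: int) -> None:
    """Increment one counter repeatedly; used to exercise concurrent access."""
    for _ in range(times):
        counters.inc(name)


# Usage: two simulated nodes, the first one updated from four threads.
node_a, node_b = Counters(), Counters()
threads = [
    threading.Thread(target=hammer, args=(node_a, "invocations.total", 1000))
    for _ in range(4)
]
threads.append(threading.Thread(target=hammer, args=(node_b, "invocations.total", 500)))
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

cluster = merge([node_a.snapshot(), node_b.snapshot()])
```

Keeping a flat map of counters per node makes the cluster-wide view a simple key-by-key sum, which is one plausible reason to collect per-node stats and combine them in a single action.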

Part of: #90877