Last x minutes API output mode for counters #77076
Labels
:Core/Infra/REST API
REST infrastructure and utilities
>enhancement
Team:Core/Infra
Meta label for core/infra team
Most API outputs provide statistical counters of activity since cluster or in other cases, node start. For example, the indices stats API gives you the following statistics for all indexing activity:
Or similarly , node stats will report things like bulk rejections per node:
Currently there are two ways that we can make measurements from these numbers:
_stats?level=shards
and b) there's too much data to process offline , e.g., to send to Elastic Support for analysis. One potential solution for this might be like a diagnostic mode for the monitoring metricbeat.A 3rd option, which would be conducive to diagnostic collection would be for all counters in API output to report a last x minutes count (1,5,15 ~like "load", e.g.). Perhaps the API could be called with a query string parameters that indicates counters should output with last x instead of since uptime.
This type of data will allow us to do complex cluster workload analysis, for example, to detect imbalance due to sharding issues or hardware performance issues The method support currently uses to do this analysis is to have customer capture 2 diagnostics separated by 15 minutes, which end-to-end is rather manual.
The text was updated successfully, but these errors were encountered: