Expose recovery, snapshot, and restore rate limits and throttle times in node stats #91354
Labels
:Distributed Coordination/Snapshot/Restore
Anything directly related to the `_snapshot/*` APIs
:Distributed Indexing/Recovery
Anything around constructing a new shard, either from a local or a remote source.
>enhancement
Supportability
Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better.
Team:Distributed
Meta label for distributed team (obsolete)
Description
The rate limits for recoveries and snapshots can be computed indirectly from other Elasticsearch settings, which makes it hard to ascertain the values actually used at runtime, because:
- `indices.recovery.max_bytes_per_sec` can be computed by configuring three key bandwidth metrics settings.
- `max_restore_bytes_per_sec` is already capped by the recovery rate limit.
- `max_snapshot_bytes_per_sec` is also capped to the recovery limit (when the node bandwidth metrics settings are configured) apart from the existing snapshot speed configuration.

To make observability of these rate limits easier, the proposal is to expose the final effective rate limit values (taking into account any cap, e.g., by the recovery rate limit) in the node stats. Snapshot rate limits will be reported per repository.
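To illustrate what "final used rate limit" could mean, here is a minimal Python sketch of the capping logic described above. It is not Elasticsearch code: the function names, the use of `None` for "unlimited", and the per-repository dictionary shape are all assumptions made for illustration, and the real node stats format may differ.

```python
from typing import Dict, Optional

MB = 1024 * 1024


def effective_limit(configured: Optional[int],
                    recovery_limit: Optional[int]) -> Optional[int]:
    """Return the rate limit (bytes/sec) that would actually apply.

    Assumption: None means "unlimited". The configured value is capped
    by the recovery rate limit whenever the latter is set.
    """
    candidates = [v for v in (configured, recovery_limit) if v is not None]
    return min(candidates) if candidates else None


def snapshot_limits_per_repository(
        repo_limits: Dict[str, Optional[int]],
        recovery_limit: Optional[int]) -> Dict[str, Optional[int]]:
    """Hypothetical per-repository report of effective snapshot limits."""
    return {name: effective_limit(limit, recovery_limit)
            for name, limit in repo_limits.items()}


# Example: a 40 MB/s recovery limit caps the faster repository only.
report = snapshot_limits_per_repository(
    {"backups": 100 * MB, "archive": 20 * MB},
    recovery_limit=40 * MB,
)
```

In this example `report["backups"]` is capped to 40 MB/s while `report["archive"]` keeps its own 20 MB/s limit. Reading the configured settings alone would not reveal this difference, which is the observability gap the proposal targets.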
Apart from the speed, the proposal is to also expose the throttling times more prominently. Specifically: