[Data Tiers] Add telemetry enhancements for data tiers utilization #71204

jethr0null · 2021-04-01T22:44:38Z

Telemetry was added for data tiers in this PR.

Currently collected data:

node_count :: number of nodes with this tier/role
index_count :: number of indices on this tier
total_shard_count :: total number of shards for all nodes in this tier
primary_shard_count :: number of primary shards for all nodes in this tier
doc_count :: number of documents for all nodes in this tier
total_size_bytes :: total number of bytes for all shards for all nodes in this tier
primary_size_bytes :: number of bytes for all primary shards on all nodes in this tier
primary_shard_size_avg_bytes :: average shard size for primary shard in this tier
primary_shard_size_median_bytes :: median shard size for primary shard in this tier
primary_shard_size_mad_bytes :: median absolute deviation of shard size for primary shard in this tier
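To make the last three fields concrete, here is a minimal sketch of how the average, median, and median absolute deviation could be derived from a list of primary shard sizes. The function name and the sample sizes are made up for illustration, and the real implementation may compute these values approximately rather than exactly.

```python
from statistics import mean, median


def primary_shard_size_stats(sizes_bytes):
    """Return avg, median and MAD of a list of primary shard sizes (bytes)."""
    if not sizes_bytes:
        return {"avg": 0, "median": 0, "mad": 0}
    med = median(sizes_bytes)
    # MAD: the median of each shard's absolute distance from the median size.
    mad = median(abs(size - med) for size in sizes_bytes)
    return {"avg": mean(sizes_bytes), "median": med, "mad": mad}


# Hypothetical shard sizes; one large outlier pulls the average up
# while the median and MAD stay close to the typical shard.
stats = primary_shard_size_stats([10, 20, 30, 40, 1000])
```

The outlier case shows why the median and MAD are reported alongside the average: they describe the typical shard even when a few shards are far larger than the rest.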

Challenges with the current data:

The existing telemetry does not enable us to distinguish actual utilization and will wind up reporting things like index_count in multiple tiers if the node is tagged with multiple node roles. In order to be able to accurately report on the actual utilization of each tier, we need to add telemetry which would associate these fields with the role that the data is currently associated with.
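The double counting can be illustrated with a small sketch. Everything here is hypothetical (node names, index names, and the idea of attributing each index to exactly one tier); it only contrasts counting by node role with counting by the tier an index is attributed to.

```python
from collections import defaultdict

# Hypothetical cluster: node-1 carries both the hot and the warm role.
node_roles = {"node-1": ["data_hot", "data_warm"], "node-2": ["data_cold"]}
# Hypothetical placement of indices on nodes.
index_nodes = {"logs-1": ["node-1"], "logs-2": ["node-2"]}
# Hypothetical single tier each index is attributed to.
index_tier = {"logs-1": "data_hot", "logs-2": "data_cold"}


def index_count_by_node_role(node_roles, index_nodes):
    """Count an index once for every tier role of every node that holds it."""
    tiers = defaultdict(set)
    for index, nodes in index_nodes.items():
        for node in nodes:
            for role in node_roles[node]:
                tiers[role].add(index)
    return {tier: len(indices) for tier, indices in tiers.items()}


def index_count_by_index_tier(index_tier):
    """Count each index exactly once, under the tier it is attributed to."""
    tiers = defaultdict(int)
    for tier in index_tier.values():
        tiers[tier] += 1
    return dict(tiers)


by_role = index_count_by_node_role(node_roles, index_nodes)
by_tier = index_count_by_index_tier(index_tier)
```

With role-based counting, `logs-1` shows up under both `data_hot` and `data_warm`; with per-index attribution it is counted once.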

For example, I would expect a query like the following against our telemetry data to return only data that is “actively associated” with the warm tier: stack_stats.xpack.data_tiers.data_warm.index_count > 1

A concrete example of how this data will be used is to report on and visualize the number of unique clusters that have data residing on a given tier (the ability to drill down into more detailed stats such as the doc_count or index_count for the data residing on each tier would also be useful).
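As a rough sketch of that report, the following counts unique clusters with data on a given tier. It assumes telemetry documents shaped like the field path above; the `cluster_uuid` key, the sample documents, and the `threshold` parameter are assumptions made for illustration.

```python
def clusters_with_data_on_tier(docs, tier, threshold=0):
    """Return the set of cluster ids whose index_count on `tier` exceeds threshold."""
    clusters = set()
    for doc in docs:
        tiers = doc.get("stack_stats", {}).get("xpack", {}).get("data_tiers", {})
        if tiers.get(tier, {}).get("index_count", 0) > threshold:
            # `cluster_uuid` is a hypothetical identifier field.
            clusters.add(doc["cluster_uuid"])
    return clusters


def tier_doc(cluster, tier, index_count):
    """Build a hypothetical telemetry document for one cluster and tier."""
    return {
        "cluster_uuid": cluster,
        "stack_stats": {"xpack": {"data_tiers": {tier: {"index_count": index_count}}}},
    }


docs = [
    tier_doc("a", "data_warm", 3),
    tier_doc("a", "data_warm", 4),  # a later report from the same cluster
    tier_doc("b", "data_warm", 0),
    tier_doc("c", "data_hot", 7),
]
warm_clusters = clusters_with_data_on_tier(docs, "data_warm")
```

A result like this is only meaningful if `index_count` reflects indices that actually reside on the tier, which is the point of this request.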

It would also be useful to be able to distinguish whether the tier an index is located on matches its first preference (index.routing.allocation.include._tier_preference). So for example, an index might specify cold as its first preference but if no cold nodes are available it could reside on its tier of second preference (say warm). We could use this distinction to suggest actions to the user such as scaling or enabling autoscaling.
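A minimal sketch of that check, assuming the setting value is a comma-separated list of tiers in preference order and that the tier an index currently resides on is known. Both helper names are hypothetical.

```python
def preference_rank(tier_preference, current_tier):
    """Return the 0-based position of current_tier in the preference list, or None."""
    prefs = [tier.strip() for tier in tier_preference.split(",") if tier.strip()]
    return prefs.index(current_tier) if current_tier in prefs else None


def on_first_preference(tier_preference, current_tier):
    """True when the index resides on the first tier it asked for."""
    return preference_rank(tier_preference, current_tier) == 0


# Hypothetical index that prefers cold but currently resides on warm.
preference = "data_cold,data_warm,data_hot"
rank = preference_rank(preference, "data_warm")
matches = on_first_preference(preference, "data_warm")
```

A rank greater than zero is the signal that could drive a suggestion such as scaling the preferred tier or enabling autoscaling.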

cc @dakrone @sajjadwahmed


elasticmachine · 2021-04-01T22:44:40Z

Pinging @elastic/es-core-features (Team:Core/Features)

dakrone · 2021-04-26T20:21:47Z

> distinguish between what roles a node is capable of acting as versus what role(s) it is actively acting as

Can you explain this one a little more? I don't think I understand what you mean by "actively acting as".

jethr0null · 2021-04-26T20:51:31Z

Sure thing. I updated the original comment for clarity/corrections.

elasticsearchmachine · 2023-11-16T18:13:36Z

Pinging @elastic/es-data-management (Team:Data Management)

jethr0null added >enhancement :Core/Features/Features Team:Data Management Meta label for data/management team labels Apr 1, 2021

jbaiera mentioned this issue Aug 10, 2021

Aggregate data tier index stats separately from node stats #76322

Merged

dakrone added :Data Management/Indices APIs APIs to create and manage indices and templates and removed :Data Management/Other labels Nov 16, 2023
