[Telemetry] Add telemetry around the time it is taking for grabbing the telemetry stats #132233
Summary
Adds new telemetry around the execution duration of grabbing usage:
- the `isReady` state of each collector
- the `fetch` objects from each collector
- a per-collector breakdown of `fetch` and `isReady`
The overall durations show the overall health of the collection mechanism, while the breakdown objects help diagnose specific collectors and improve upon them.
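As a rough illustration of how overall and per-collector durations could be gathered, here is a minimal sketch. It is not the code from this PR: the `SketchCollector` interface, the `timed` and `timeCollectors` helpers, the `name`/`duration` entry keys, and the sequential execution order are all assumptions made for the example. Only the five result field names come from this description.

```typescript
// Hypothetical, simplified collector shape (not Kibana's actual Collector class).
interface SketchCollector {
  type: string;
  isReady: () => Promise<boolean> | boolean;
  fetch: () => Promise<unknown> | unknown;
}

// Hypothetical breakdown entry: one duration (in ms) per collector.
interface DurationEntry {
  name: string;
  duration: number;
}

interface CollectionTimings {
  total_is_ready_duration: number;
  total_fetch_duration: number;
  total_duration: number;
  is_ready_duration_breakdown: DurationEntry[];
  fetch_duration_breakdown: DurationEntry[];
}

// Run one call and return how long it took according to the given clock.
async function timed(fn: () => unknown, now: () => number): Promise<number> {
  const start = now();
  await fn();
  return now() - start;
}

// Time every collector's isReady and fetch, one after another (an assumption;
// a real implementation may run them concurrently).
async function timeCollectors(
  collectors: SketchCollector[],
  now: () => number = () => Date.now()
): Promise<CollectionTimings> {
  const isReadyBreakdown: DurationEntry[] = [];
  const fetchBreakdown: DurationEntry[] = [];

  const start = now();
  for (const collector of collectors) {
    const duration = await timed(() => collector.isReady(), now);
    isReadyBreakdown.push({ name: collector.type, duration });
  }
  const afterIsReady = now();
  for (const collector of collectors) {
    const duration = await timed(() => collector.fetch(), now);
    fetchBreakdown.push({ name: collector.type, duration });
  }
  const end = now();

  return {
    total_is_ready_duration: afterIsReady - start,
    total_fetch_duration: end - afterIsReady,
    total_duration: end - start,
    is_ready_duration_breakdown: isReadyBreakdown,
    fetch_duration_breakdown: fetchBreakdown,
  };
}
```

The totals are measured as wall-clock time around each phase rather than summed from the breakdown, so they stay meaningful even if the per-collector calls were later run in parallel. The injectable `now` clock is only there to make the sketch easy to test.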
Why is this in telemetry and not in CI?
Adding limits and checks in CI is a good idea for catching early issues. Collecting these metrics via telemetry will also help us identify bottlenecks against real-world use cases from Kibanas in the wild.
Changes
- New fields in the `usage_collector_stats` collector: `total_is_ready_duration`, `total_fetch_duration`, `total_duration`, `is_ready_duration_breakdown`, `fetch_duration_breakdown`
- Converts `usage_collector_stats` to a Collector with a proper schema, for a more ergonomic codebase and so that the schema is included automatically in the schema files.
- […] `usage_collector_stats` collector schema and verify it.
- […] `usage_collector_stats` collector
What do the usage collector stats look like?
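The original example output is not preserved here. As a hedged sketch of the shape implied by the field names listed under Changes, the following assumes that the three totals are numbers and that each breakdown is an array of per-collector entries. The `name` and `duration` keys, the interface names, and the `hasDurationFields` helper are hypothetical and are not taken from the actual schema.

```typescript
// Assumed entry shape for the breakdown arrays (hypothetical keys).
interface DurationEntry {
  name: string;
  duration: number;
}

// Field names come from the change list; the value types are assumptions.
interface UsageCollectorStatsDurations {
  total_is_ready_duration: number;
  total_fetch_duration: number;
  total_duration: number;
  is_ready_duration_breakdown: DurationEntry[];
  fetch_duration_breakdown: DurationEntry[];
}

function isDurationEntry(value: unknown): value is DurationEntry {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  return typeof entry.name === "string" && typeof entry.duration === "number";
}

// Check that an unknown payload carries the duration fields in the assumed shape.
function hasDurationFields(value: unknown): value is UsageCollectorStatsDurations {
  if (typeof value !== "object" || value === null) return false;
  const stats = value as Record<string, unknown>;
  const totals = ["total_is_ready_duration", "total_fetch_duration", "total_duration"];
  const breakdowns = ["is_ready_duration_breakdown", "fetch_duration_breakdown"];
  return (
    totals.every((key) => typeof stats[key] === "number") &&
    breakdowns.every((key) => {
      const list = stats[key];
      return Array.isArray(list) && list.every(isDurationEntry);
    })
  );
}
```

A guard like this could be used in a test to confirm that a collected payload carries all five duration fields before asserting on their values.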
Notes
Closes #119468