Feat/cluster metadata manifest age metric #17404
0e97f85 to 6561ddd
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fd6-7ef4-41a6-8262-7230c841fe62:
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fd6-7ef9-4b86-bda0-1ba5447f32ed:
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fd6-7efc-49a9-b41f-1d6b69d03a5b:
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fd6-7ef7-4b95-8ec4-03159352ac39:
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fe8-c3f8-419c-acbf-8fd97656ed7a:
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fe8-c3f5-4b5d-9ed6-473939561df9:
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fe8-c3fb-4b08-bb65-53b6c166ee34:
new failures in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fe8-c3fe-493c-8257-62a4f1562008:
new failures in https://buildkite.com/redpanda/redpanda/builds/46914#018e82ae-5f15-413b-b330-46f460aa8919:
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46874#018e7fe8-c3fb-4b08-bb65-53b6c166ee34
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46914#018e82ae-5f15-413b-b330-46f460aa8919
Looking good!
src/v/cluster/controller_probe.cc
Outdated
auto maybe_manifest_ref
  = _controller.metadata_uploader().manifest();
if (!maybe_manifest_ref.has_value()) {
    return int64_t{-1};
I think 0 will be a more natural value to aggregate over.
Let's also check if the manifest's timestamp is 0, and return 0 then too. It indicates we've never uploaded.
src/v/cluster/controller_probe.cc
Outdated
auto age_s = std::chrono::duration_cast<std::chrono::seconds>(
  now_ts - manifest.upload_time_since_epoch);
return int64_t{age_s.count()};
If you're looking for a place to test this, I think cluster/cloud_metadata/tests/uploader_test.cc would be a good place. The full application is in that test, so we should have metrics.
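The review suggestions in the threads above (return 0 instead of -1 when there is no manifest, and return 0 when the manifest's timestamp is 0) can be sketched roughly as follows. This is a minimal standalone sketch, not the code in the PR: `manifest_stub` and `manifest_age_seconds` are hypothetical names, and the choice of `std::chrono::milliseconds` for `upload_time_since_epoch` is an assumption, since the diff only shows that it is some duration.

```cpp
#include <chrono>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the real manifest type. Only the field that
// appears in the diff is modeled; its duration type is an assumption.
struct manifest_stub {
    std::chrono::milliseconds upload_time_since_epoch{0};
};

// Returns the manifest age in seconds. Returns 0 when there is no manifest
// or when the upload timestamp is 0 (never uploaded), as the reviewers asked.
inline std::int64_t manifest_age_seconds(
  const std::optional<manifest_stub>& maybe_manifest,
  std::chrono::milliseconds now_ts) {
    if (!maybe_manifest.has_value()) {
        return 0;
    }
    if (maybe_manifest->upload_time_since_epoch.count() == 0) {
        return 0;
    }
    auto age_s = std::chrono::duration_cast<std::chrono::seconds>(
      now_ts - maybe_manifest->upload_time_since_epoch);
    return static_cast<std::int64_t>(age_s.count());
}
```

Keeping the calculation in a free function like this would also make it easy to check in isolation, separately from the full-application test suggested above.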
The field _metadata_uploader is a unique_ptr that is initialized with a value only if cloud storage is initialized. Change the return type to reflect the fact that the uploader might be null.
6561ddd to e866d80
Expose how long ago (in seconds) the manifest was uploaded. The metric is added only when the metadata_uploader is available.
e866d80 to 97b4131
@@ -102,6 +104,41 @@ void controller_probe::setup_metrics() {
    "Number of partitions that lack quorum among replicants"))
    .aggregate({sm::shard_label}),
});

if (auto maybe_uploader = _controller.metadata_uploader()) {
Please prefer if (auto x ...; x) instead of relying on the implicit cast to bool.
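For reference, the C++17 if-with-initializer form the reviewer is asking for looks roughly like the sketch below. The types here are hypothetical stand-ins: the actual return type of `metadata_uploader()` is not shown in this conversation, so a raw non-owning pointer is assumed, and `register_age_metric` is an illustrative name only.

```cpp
#include <memory>

// Hypothetical stand-ins; the real controller and uploader types differ.
struct uploader_stub {};

struct controller_stub {
    std::unique_ptr<uploader_stub> _metadata_uploader;

    // Assumed signature: a non-owning pointer that is null when cloud
    // storage (and therefore the uploader) is not initialized.
    uploader_stub* metadata_uploader() const {
        return _metadata_uploader.get();
    }
};

// Returns true when the metric would be registered.
inline bool register_age_metric(const controller_stub& controller) {
    // The initializer declares the variable; the condition after the
    // semicolon is written out separately instead of being implied by
    // the declaration itself.
    if (auto* maybe_uploader = controller.metadata_uploader();
        maybe_uploader) {
        // The metric registration would go here in the real probe.
        return true;
    }
    return false;
}
```

The variable stays scoped to the if statement in both forms; the difference is only that the tested expression is spelled out.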
Define a new metric, redpanda_cluster_latest_cluster_metadata_manifest_age, to track the age (in seconds) of the cluster_metadata_manifest saved in cloud storage.
It is updated by the controller and should produce a sawtooth pattern, rising steadily and dropping to 0 when a new cluster_metadata_manifest is uploaded.
Fixes https://github.com/redpanda-data/core-internal/issues/1207
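The expected sawtooth shape can be illustrated with a toy simulation. This sketch assumes, purely for illustration, that an upload succeeds exactly every `upload_interval_s` seconds starting at t = 0 and that the metric is sampled once per second; `sample_ages` is a hypothetical helper and is not part of the PR.

```cpp
#include <cstdint>
#include <vector>

// Samples the manifest age once per second for duration_s seconds.
// Assumption: an upload succeeds exactly every upload_interval_s seconds,
// with the first upload at t = 0.
inline std::vector<std::int64_t> sample_ages(
  std::int64_t duration_s, std::int64_t upload_interval_s) {
    std::vector<std::int64_t> ages;
    std::int64_t last_upload = 0;
    for (std::int64_t now = 0; now < duration_s; ++now) {
        if (now - last_upload >= upload_interval_s) {
            last_upload = now; // a new manifest was uploaded
        }
        ages.push_back(now - last_upload);
    }
    return ages;
}
```

With a 3-second interval sampled for 7 seconds, the ages are 0, 1, 2, 0, 1, 2, 0. A real series that keeps climbing without dropping back to 0 would suggest that uploads have stopped.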
Backports Required
Release Notes
Feature
redpanda_cluster_latest_cluster_metadata_manifest_age to track the age of the cluster_metadata_manifest in cloud storage