[BUGFIX] Fix reporting of `cortex_prometheus_rule_group_duration_seconds` (cortexproject#3310)

* [BUGFIX] Fix reporting of `cortex_prometheus_rule_group_duration_seconds` metric

Fixes a small bug in the ruler metrics, where we used the wrong name to
match against the upstream metric. This caused the metric to not report
any data at all.
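
To illustrate why a wrong lookup name produces no data rather than an error, here is a minimal, stdlib-only sketch. It is not the Cortex implementation: the `summary` and `registry` types and the `sumSummariesPerUser` helper are hypothetical stand-ins, assuming only that per-user metrics are found by matching the upstream family name.

```go
package main

import "fmt"

// summary is a simplified, hypothetical stand-in for a Prometheus summary sample.
type summary struct {
	sum   float64
	count uint64
}

// registry maps an upstream metric family name to its summary for one user.
type registry map[string]summary

// sumSummariesPerUser returns, per user, the summary stored under name.
// Users whose registry has no family with that name are skipped silently,
// so a misspelled or wrongly prefixed name yields an empty result.
func sumSummariesPerUser(regs map[string]registry, name string) map[string]summary {
	out := make(map[string]summary)
	for user, reg := range regs {
		s, ok := reg[name]
		if !ok {
			continue // no match: nothing is reported and no error is raised
		}
		out[user] = s
	}
	return out
}

func main() {
	regs := map[string]registry{
		"user1": {"prometheus_rule_group_duration_seconds": {sum: 1, count: 1}},
		"user2": {"prometheus_rule_group_duration_seconds": {sum: 10, count: 1}},
	}

	// Looking up the Cortex-facing name finds nothing upstream.
	wrong := sumSummariesPerUser(regs, "cortex_prometheus_rule_group_duration_seconds")
	// Looking up the upstream name finds both users.
	right := sumSummariesPerUser(regs, "prometheus_rule_group_duration_seconds")

	fmt.Println(len(wrong), len(right)) // prints "0 2"
}
```

In this sketch the `cortex_` prefix belongs only on the exported metric, not on the name used to match the upstream family, which mirrors the one-line change in this commit.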

Signed-off-by: gotjosh <josue@grafana.com>

* Changelog entry

Signed-off-by: gotjosh <josue@grafana.com>
gotjosh committed Oct 9, 2020
1 parent bcc8199 commit ef2c7a5
Showing 3 changed files with 23 additions and 7 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -84,6 +84,7 @@
* [BUGFIX] Experimental Alertmanager API: Do not allow empty Alertmanager configurations or bad template filenames to be submitted through the configuration API. #3185
* [BUGFIX] Reduce failures to update heartbeat when using Consul. #3259
* [BUGFIX] When using ruler sharding, moving all user rule groups from ruler to a different one and then back could end up with some user groups not being evaluated at all. #3235
+* [BUGFIX] Fixes the metric `cortex_prometheus_rule_group_duration_seconds` in the Ruler, it wouldn't report any values. #3310

## 1.4.0 / 2020-10-02

2 changes: 1 addition & 1 deletion pkg/ruler/manager_metrics.go
@@ -148,7 +148,7 @@ func (m *ManagerMetrics) Collect(out chan<- prometheus.Metric) {
// If same user is later re-added, all metrics will start from 0, which is fine.

data.SendSumOfSummariesPerUser(out, m.EvalDuration, "prometheus_rule_evaluation_duration_seconds")
-data.SendSumOfSummariesPerUser(out, m.IterationDuration, "cortex_prometheus_rule_group_duration_seconds")
+data.SendSumOfSummariesPerUser(out, m.IterationDuration, "prometheus_rule_group_duration_seconds")

data.SendSumOfCountersPerUser(out, m.IterationsMissed, "prometheus_rule_group_iterations_missed_total")
data.SendSumOfCountersPerUser(out, m.IterationsScheduled, "prometheus_rule_group_iterations_total")
27 changes: 21 additions & 6 deletions pkg/ruler/manager_metrics_test.go
@@ -61,12 +61,27 @@ cortex_prometheus_rule_evaluations_total{rule_group="group_two",user="user2"} 10
cortex_prometheus_rule_evaluations_total{rule_group="group_two",user="user3"} 100
# HELP cortex_prometheus_rule_group_duration_seconds The duration of rule group evaluations.
# TYPE cortex_prometheus_rule_group_duration_seconds summary
-cortex_prometheus_rule_group_duration_seconds_sum{user="user1"} 0
-cortex_prometheus_rule_group_duration_seconds_count{user="user1"} 0
-cortex_prometheus_rule_group_duration_seconds_sum{user="user2"} 0
-cortex_prometheus_rule_group_duration_seconds_count{user="user2"} 0
-cortex_prometheus_rule_group_duration_seconds_sum{user="user3"} 0
-cortex_prometheus_rule_group_duration_seconds_count{user="user3"} 0
+cortex_prometheus_rule_group_duration_seconds{user="user1",quantile="0.01"} 1
+cortex_prometheus_rule_group_duration_seconds{user="user1",quantile="0.05"} 1
+cortex_prometheus_rule_group_duration_seconds{user="user1",quantile="0.5"} 1
+cortex_prometheus_rule_group_duration_seconds{user="user1",quantile="0.9"} 1
+cortex_prometheus_rule_group_duration_seconds{user="user1",quantile="0.99"} 1
+cortex_prometheus_rule_group_duration_seconds_sum{user="user1"} 1
+cortex_prometheus_rule_group_duration_seconds_count{user="user1"} 1
+cortex_prometheus_rule_group_duration_seconds{user="user2",quantile="0.01"} 10
+cortex_prometheus_rule_group_duration_seconds{user="user2",quantile="0.05"} 10
+cortex_prometheus_rule_group_duration_seconds{user="user2",quantile="0.5"} 10
+cortex_prometheus_rule_group_duration_seconds{user="user2",quantile="0.9"} 10
+cortex_prometheus_rule_group_duration_seconds{user="user2",quantile="0.99"} 10
+cortex_prometheus_rule_group_duration_seconds_sum{user="user2"} 10
+cortex_prometheus_rule_group_duration_seconds_count{user="user2"} 1
+cortex_prometheus_rule_group_duration_seconds{user="user3",quantile="0.01"} 100
+cortex_prometheus_rule_group_duration_seconds{user="user3",quantile="0.05"} 100
+cortex_prometheus_rule_group_duration_seconds{user="user3",quantile="0.5"} 100
+cortex_prometheus_rule_group_duration_seconds{user="user3",quantile="0.9"} 100
+cortex_prometheus_rule_group_duration_seconds{user="user3",quantile="0.99"} 100
+cortex_prometheus_rule_group_duration_seconds_sum{user="user3"} 100
+cortex_prometheus_rule_group_duration_seconds_count{user="user3"} 1
# HELP cortex_prometheus_rule_group_iterations_missed_total The total number of rule group evaluations missed due to slow rule group evaluation.
# TYPE cortex_prometheus_rule_group_iterations_missed_total counter
cortex_prometheus_rule_group_iterations_missed_total{user="user1"} 1
