monitoring: fix CephPoolGrowthWarning expression
Prometheus found duplicate series for the match group `pool_id`
and `instance` when evaluating the CephPoolGrowthWarning alert
expression. The expression looks back over a 2-day range (`[2d]`) and
did not account for pod names changing during the Rook migration process.

This commit adds the `pod` label to the set of considered labels.

Signed-off-by: Matej Feder <matej.feder@dnation.cloud>
(cherry picked from commit 6ba084f)
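
To illustrate the failure mode described in the commit message, here is a minimal Python sketch of how a match group is keyed by the labels listed in `on(...)`. It is not Prometheus code. The helper name `duplicate_match_groups` and all label values (instance address, pod names) are hypothetical, chosen only to mimic an old and a new pod exposing the same pool within the 2-day window.

```python
from collections import defaultdict


def duplicate_match_groups(series, on_labels):
    """Return the match groups that contain more than one series.

    `series` is a list of label dicts; `on_labels` plays the role of the
    label list inside on(...). A missing label is treated as an empty value.
    """
    groups = defaultdict(list)
    for labels in series:
        key = tuple(labels.get(name, "") for name in on_labels)
        groups[key].append(labels)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical series on the "one" side of the join: the same pool reported
# by two differently named pods, both with samples inside the [2d] range.
one_side = [
    {"pool_id": "1", "instance": "10.0.0.5:9283", "pod": "rook-ceph-mgr-a-old"},
    {"pool_id": "1", "instance": "10.0.0.5:9283", "pod": "rook-ceph-mgr-a-new"},
]

# Matching on (pool_id, instance) puts both series in one group.
before = duplicate_match_groups(one_side, ["pool_id", "instance"])

# Adding pod to the matching labels gives each series its own group.
after = duplicate_match_groups(one_side, ["pool_id", "instance", "pod"])

print(len(before), len(after))
```

With the original label set both series collapse into a single match group, which is the condition Prometheus rejects; including `pod` separates them.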
matofeder authored and mergify[bot] committed Jun 17, 2024
1 parent d9ca8ee commit 8521286
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion deploy/charts/rook-ceph-cluster/prometheus/localrules.yaml
@@ -533,7 +533,7 @@ groups:
annotations:
description: "Pool '{{ $labels.name }}' will be full in less than 5 days assuming the average fill-up rate of the past 48 hours."
summary: "Pool growth rate may soon exceed capacity"
- expr: "(predict_linear(ceph_pool_percent_used[2d], 3600 * 24 * 5) * on(pool_id, instance) group_right() ceph_pool_metadata) >= 95"
+ expr: "(predict_linear(ceph_pool_percent_used[2d], 3600 * 24 * 5) * on(pool_id, instance, pod) group_right() ceph_pool_metadata) >= 95"
labels:
oid: "1.3.6.1.4.1.50495.1.2.1.9.2"
severity: "warning"
2 changes: 1 addition & 1 deletion deploy/examples/monitoring/localrules.yaml
@@ -537,7 +537,7 @@ spec:
annotations:
description: "Pool '{{ $labels.name }}' will be full in less than 5 days assuming the average fill-up rate of the past 48 hours."
summary: "Pool growth rate may soon exceed capacity"
- expr: "(predict_linear(ceph_pool_percent_used[2d], 3600 * 24 * 5) * on(pool_id, instance) group_right() ceph_pool_metadata) >= 95"
+ expr: "(predict_linear(ceph_pool_percent_used[2d], 3600 * 24 * 5) * on(pool_id, instance, pod) group_right() ceph_pool_metadata) >= 95"
labels:
oid: "1.3.6.1.4.1.50495.1.2.1.9.2"
severity: "warning"