
reef: mgr/dashboard: Show the OSDs Out and Down panels as red whenever an OSD is in Out or Down state in Ceph Cluster grafana dashboard #54538

Merged
nizamial09 merged 1 commit into ceph:reef from aaSharma14:wip-63571-reef
Mar 7, 2024

Conversation

@aaSharma14
Contributor

backport tracker: https://tracker.ceph.com/issues/63571


backport of #53650
parent tracker: https://tracker.ceph.com/issues/62969

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

…OSD is in Out or Down state in Ceph Cluster grafana dashboard

Fixes: https://tracker.ceph.com/issues/62969

Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit a29e6a8)
@aaSharma14 aaSharma14 requested a review from a team as a code owner November 17, 2023 06:51
@aaSharma14 aaSharma14 requested review from cloudbehl and nizamial09 and removed request for a team November 17, 2023 06:51
@aaSharma14 aaSharma14 added this to the reef milestone Nov 17, 2023
@Ejdesgaard
Contributor

I just tried importing the changes, and "all" is not green.
Mon 20 Nov 09:24:21 CET 2023

Also, it looks like the thresholds are percentage-based, which doesn't work across cluster sizes.
Red at 10% is too low for a 24-OSD cluster and too high for a 1000-OSD cluster.
Using an integer value instead would be much better. The simple values could be all-1 = yellow and all-2 = red.
A better option would be all-1 = yellow and all-((osd.count/node.count)-1) = red.
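To make the arithmetic in this comment concrete, here is a minimal sketch that compares a 10% threshold with the proposed integer thresholds. It reads "all" as the total OSD count and uses integer division for osd.count/node.count; both readings are assumptions, as are the function names and the node count of 4 in the usage note. None of this comes from the PR.

```python
def percent_threshold(total_osds: int, percent: float) -> float:
    """Number of OSDs that `percent` of the cluster corresponds to."""
    return total_osds * percent / 100.0


def integer_thresholds(total_osds: int, node_count: int) -> dict:
    """OSD counts at which the panel would turn yellow or red.

    Follows the proposal literally:
      yellow = all - 1
      red    = all - ((osd.count / node.count) - 1)
    Integer division is an assumption made for this sketch.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    osds_per_node = total_osds // node_count
    return {
        "yellow": total_osds - 1,
        "red": total_osds - (osds_per_node - 1),
    }


if __name__ == "__main__":
    # 10% covers very different absolute numbers of OSDs.
    for total in (24, 1000):
        print(total, "OSDs, 10% =", percent_threshold(total, 10))
    # Hypothetical 24-OSD cluster spread over 4 nodes.
    print(integer_thresholds(24, 4))
```

With 24 OSDs, 10% is 2.4 OSDs; with 1000 OSDs, it is 100. For the hypothetical 24-OSD, 4-node cluster, the proposed formula yields yellow at 23 and red at 19.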

@cloudbehl
Contributor

@Ejdesgaard The reason for the change is that we also have alerts that use the exact same value across clusters, whether it's a smaller cluster or a bigger one. This panel should reflect the same, and that's what we try to do with the thresholds. Another reason: Grafana doesn't allow complex calculations for thresholds. It's either by percentage or by exact value.

If we want to change the logic, then the fix also needs to be made in the OSD alerts.
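For readers unfamiliar with the two options mentioned above, the following sketch builds the shape of a Grafana panel `thresholds` object in each mode (`absolute` for exact values, `percentage` for percentages). The helper name and the step values are illustrative assumptions and are not taken from this PR's dashboard JSON.

```python
import json


def make_thresholds(mode: str, red_at: float) -> dict:
    """Build a Grafana-style thresholds object with one green base step and one red step.

    `mode` is either "absolute" (exact value) or "percentage".
    The values passed in below are illustrative only.
    """
    if mode not in ("absolute", "percentage"):
        raise ValueError("mode must be 'absolute' or 'percentage'")
    return {
        "mode": mode,
        "steps": [
            {"color": "green", "value": None},  # base step, no lower bound
            {"color": "red", "value": red_at},
        ],
    }


if __name__ == "__main__":
    # Exact value: red as soon as one OSD is affected (illustrative).
    print(json.dumps(make_thresholds("absolute", 1), indent=2))
    # Percentage: red at 10% (illustrative).
    print(json.dumps(make_thresholds("percentage", 10), indent=2))
```

Either way the threshold is a fixed number, which is why an expression such as all-((osd.count/node.count)-1) cannot be entered directly as a threshold.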

@nizamial09 nizamial09 merged commit 896b854 into ceph:reef Mar 7, 2024


5 participants