ceph: fix CephMonQuorumAtRisk Alert Query #8652

Merged
merged 1 commit into from Sep 7, 2021

anmolsachan commented Sep 7, 2021

The updated query works for single-mon deployments and works better for deployments with five or more mons.

Signed-off-by: Anmol Sachan <anmol13694@gmail.com>

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #8156

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

[skip ci]

@mergify mergify bot added the ceph main ceph tag label Sep 7, 2021
The updated query will work for single mon deployment and will work better for
deployments with five or more mons.

Signed-off-by: Anmol Sachan <anmol13694@gmail.com>
@@ -80,7 +80,7 @@ spec:
severity_level: error
storage_type: ceph
expr: |
-count(ceph_mon_quorum_status{job="rook-ceph-mgr"} == 1) <= ((count(ceph_mon_metadata{job="rook-ceph-mgr"}) % 2) + 1)
+count(ceph_mon_quorum_status{job="rook-ceph-mgr"} == 1) <= (floor(count(ceph_mon_metadata{job="rook-ceph-mgr"}) / 2) + 1)
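
To see what the change does to the alert threshold, the two right-hand sides can be evaluated for a few mon counts. This is a minimal sketch that models only the arithmetic of the two expressions, not Prometheus itself; the function names `old_threshold`, `new_threshold`, and `compare` are hypothetical and do not come from the Rook code base.

```python
def old_threshold(total_mons: int) -> int:
    """Right-hand side of the old query: (count % 2) + 1."""
    return (total_mons % 2) + 1


def new_threshold(total_mons: int) -> int:
    """Right-hand side of the new query: floor(count / 2) + 1."""
    return total_mons // 2 + 1


def compare(mon_counts):
    """Return (total mons, old threshold, new threshold) for each count."""
    return [(n, old_threshold(n), new_threshold(n)) for n in mon_counts]


if __name__ == "__main__":
    # The alert condition is: mons in quorum <= threshold.
    for n, old, new in compare([1, 3, 5, 7]):
        print(f"mons={n}: old fires at <= {old} in quorum, new fires at <= {new}")
```

With the old expression the threshold is 2 for every odd mon count, so with five mons the condition only becomes true once two or fewer mons are in quorum. With the new expression the threshold grows with the cluster size (3 for five mons, 4 for seven), while three-mon clusters are unaffected. For a single mon the new threshold is 1, so the comparison `1 <= 1` still holds while that mon is in quorum; the thread does not say whether this is the intended behavior.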

@travisn @anmolsachan We recently started allowing an even number of mons too. Is this something we need to handle here?

This formula should also work for an even number of mons.
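
The claim about even mon counts can be checked with a small sketch. It assumes that mon quorum requires a strict majority of the mons, and compares the new threshold `floor(count / 2) + 1` against the smallest number of mons that still forms such a majority. The helper names (`alert_threshold`, `has_quorum`, `min_quorum`, `alert_fires`) are hypothetical and only model the arithmetic of the query.

```python
def alert_threshold(total_mons: int) -> int:
    """Right-hand side of the new query: floor(count / 2) + 1."""
    return total_mons // 2 + 1


def has_quorum(in_quorum: int, total_mons: int) -> bool:
    """Assumption: quorum needs a strict majority of all mons."""
    return in_quorum > total_mons / 2


def min_quorum(total_mons: int) -> int:
    """Smallest number of mons that still forms a strict majority."""
    return min(k for k in range(1, total_mons + 1) if has_quorum(k, total_mons))


def alert_fires(in_quorum: int, total_mons: int) -> bool:
    """Alert condition of the new query: mons in quorum <= threshold."""
    return in_quorum <= alert_threshold(total_mons)


if __name__ == "__main__":
    for total in range(1, 9):
        print(
            f"mons={total}: threshold={alert_threshold(total)}, "
            f"minimum quorum={min_quorum(total)}"
        )
```

Under that assumption the threshold equals the minimum quorum size for both odd and even counts. With four mons, for example, the condition is false while all four are in quorum and becomes true once only three remain, which is the point where losing one more mon would lose quorum.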

@travisn travisn merged commit 2983319 into rook:master Sep 7, 2021
leseb added a commit that referenced this pull request Sep 8, 2021
ceph: fix CephMonQuorumAtRisk Alert Query (backport #8652)
Labels
ceph main ceph tag skip-ci
Development

Successfully merging this pull request may close these issues.

PrometheusRule CephMonQuorumAtRisk rule is wrong
5 participants