Bug 2014391: ceph: fixing the queries for alerts 'CephMgrIsAbsent' and 'CephMgrIsM… #301

Merged
merged 1 commit into from Oct 15, 2021

Conversation

agarwal-mudit
Member

…issingReplicas'

CephMgrIsAbsent

This alert initially used the following query:

absent(up{job="rook-ceph-mgr"})

which fires when the 'up' query returns no series, but it had two flaws:
a. it does not fire if 'up' returns a result with a value of ZERO
b. its result carries only the labels from the selector's equality matchers
(here 'job'), so 'namespace' was missing

When the above query was replaced with the following,

up{job="rook-ceph-mgr"} == 0

that query had the following shortcoming:
a. whenever the mgr pod is completely down (for example, 'replicas' is set to ZERO
and 'mgr' is not coming up), the 'up' query returns no result.

Thus we had to combine both queries to get results in both scenarios.
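
A minimal sketch of how the two queries might be combined, assuming PromQL's `or` set operator. The exact expression merged in this PR is not reproduced here. The `label_replace` call and the "openshift-storage" namespace value are assumptions, included only to show one way the missing 'namespace' label could be restored on the absent() branch.

```promql
# Case 1: the mgr target is scraped but reports down (value 0).
up{job="rook-ceph-mgr"} == 0
  or
# Case 2: no 'up' series exists at all (e.g. replicas scaled to ZERO).
# absent() keeps only labels from equality matchers, so 'namespace' is
# added back explicitly; "openshift-storage" is an assumed value.
label_replace(absent(up{job="rook-ceph-mgr"}), "namespace", "openshift-storage", "", "")
```

Because `==` binds more tightly than `or`, the first branch is evaluated as a unit; the second branch contributes a result only when the first returns nothing with a matching label set.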

CephMgrIsMissingReplicas

This query was previously

sum(up{job="rook-ceph-mgr"}) < 1

It had the same structure as the above (Absent) query, but its
intention was to check the number of replicas for ceph mgr.
It now uses a kube query that handles the replica count.
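
A rough sketch of what such a kube query could look like, assuming kube-state-metrics is scraped and exposes `kube_deployment_spec_replicas`. The deployment name pattern is an assumption for illustration, and the expression actually merged may differ.

```promql
# Desired replica count of the mgr Deployment(s), grouped per namespace.
# The name pattern "rook-ceph-mgr-.*" is assumed, not taken from this PR.
sum by (namespace) (kube_deployment_spec_replicas{deployment=~"rook-ceph-mgr-.*"}) < 1
```

Unlike the 'up'-based query, this reads the replica count from the Deployment spec, so it still returns a series (with 'namespace') when the Deployment is scaled to ZERO and no mgr pod exists to be scraped.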

Signed-off-by: Arun Kumar Mohan <amohan@redhat.com>
(cherry picked from commit cfa2c2d)

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@openshift-ci openshift-ci bot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Oct 15, 2021

openshift-ci bot commented Oct 15, 2021

@agarwal-mudit: This pull request references Bugzilla bug 2014391, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

2 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @ebenahar

In response to this:

Bug 2014391: ceph: fixing the queries for alerts 'CephMgrIsAbsent' and 'CephMgrIsM…

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


openshift-ci bot commented Oct 15, 2021

@openshift-ci[bot]: GitHub didn't allow me to request PR reviews from the following users: ebenahar.

Note that only red-hat-storage members and repo collaborators can review this PR, and authors cannot review their own PRs.



openshift-ci bot commented Oct 15, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: agarwal-mudit, leseb

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2021
@leseb leseb merged commit 028aecc into red-hat-storage:release-4.9 Oct 15, 2021

openshift-ci bot commented Oct 15, 2021

@agarwal-mudit: All pull requests linked via external trackers have merged:

Bugzilla bug 2014391 has been moved to the MODIFIED state.

