Export currentOP uptime query metrics #706

Merged: 4 commits merged into percona:main on Sep 15, 2023

Conversation

tregubov-av (Contributor) commented Sep 6, 2023

Resolves: #704

This PR implements a collector that exposes how long each in-progress operation has been running, so that alerts can be raised for queries that run too long.
For example:

apiVersion: v1
data:
  kube-state-metrics-mongodb.rules: |-
    groups:
    - name: kube-state-metrics-mongodb.rules
      rules:
      - alert: MongodbCurrentQueryTime
        expr: (mongodb_currentop_query_uptime > 3e+8) / 1000
        labels:
          severity: critical
        annotations:
          description: "Opid: {{ $labels.opid }}\nDesc: {{ $labels.desc }}\nNs: {{ $labels.ns }}\nOp : {{ $labels.op }}\nUptime : {{ $value }} ms\n"
          summary: "MongoDB\nCurrent slow query on: {{ $labels.endpoint }}"
kind: ConfigMap
metadata:
  labels:
    app: prometheus
    prometheus: kube-prometheus
    release: kube-prometheus-stack
    role: alert-rules
  name: kube-prometheus-exporter-mongodb
  namespace: monitoring
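
The units in the alert expression can be checked with a little arithmetic. The sketch below assumes the metric carries the `microsecs_running` value in microseconds, which is what the `/ 1000` and the `ms` suffix in the description suggest; the helper names are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// thresholdMicros mirrors the 3e+8 literal in the alert expression.
// Assumption: the metric value is in microseconds.
const thresholdMicros = 3e8

// microsToDuration converts a microsecond count into a time.Duration.
func microsToDuration(us float64) time.Duration {
	return time.Duration(us) * time.Microsecond
}

// microsToMillis mirrors the "/ 1000" step of the expression.
func microsToMillis(us float64) float64 {
	return us / 1000
}

func main() {
	fmt.Println(microsToDuration(thresholdMicros)) // 5m0s
	fmt.Println(microsToMillis(thresholdMicros))   // 300000
}
```

Under that assumption the rule fires for operations that have been running for more than five minutes, and `{{ $value }}` reports the running time in milliseconds.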

  • Tests passed.
  • Fix conflicts with target branch.

tregubov-av requested a review from a team as a code owner on September 6, 2023 23:50
tregubov-av requested review from ademidoff and JiriCtvrtka and removed request for a team on September 6, 2023 23:50
it-percona-cla commented Sep 6, 2023

CLA assistant check
All committers have signed the CLA.

BupycHuk (Member) left a comment

Thanks @tregubov-av,

Looks good to me.
Could you please sign the CLA?

Comment on lines 103 to 128
opid, ok := bsonMapElement["opid"].(int32)
if !ok {
	logger.Errorf("Invalid type int32 assertion for 'opid': %t", ok)
	break
}
namespace, ok := bsonMapElement["ns"].(string)
if !ok {
	logger.Errorf("Invalid type string assertion for 'ns': %t", ok)
	break
}
db, collection := splitNamespace(namespace)
op, ok := bsonMapElement["op"].(string)
if !ok {
	logger.Errorf("Invalid type string assertion for 'op': %t", ok)
	break
}
decs, ok := bsonMapElement["desc"].(string)
if !ok {
	logger.Errorf("Invalid type string assertion for 'desc': %t", ok)
	break
}
microsecs_running, ok := bsonMapElement["microsecs_running"].(int64)
if !ok {
	logger.Errorf("Invalid type int64 assertion for 'microsecs_running': %t", ok)
	break
}
Member

do we really need all these assertions?

Contributor Author

I think alerts can only be built flexibly if the metric carries a sufficient number of labels.
To get those values out of the interface, type assertions are needed.
I use the checked form because I don't want the code to panic if an assertion fails.
If there are suggestions on how to do this better, I'm ready to rewrite this part.
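
One possible way to keep the non-panicking behavior while cutting the repetition is a small generic helper. This is only a sketch, not the code that was merged: the name `field` is hypothetical, it assumes the element behaves like a `map[string]any`, and it needs Go 1.18 or later for generics.

```go
package main

import "fmt"

// field fetches key from m and asserts it to T. It returns an error
// instead of panicking when the key is missing or holds another type.
func field[T any](m map[string]any, key string) (T, error) {
	var zero T
	raw, present := m[key]
	if !present {
		return zero, fmt.Errorf("field %q is missing", key)
	}
	v, ok := raw.(T)
	if !ok {
		return zero, fmt.Errorf("field %q: want %T, got %T", key, zero, raw)
	}
	return v, nil
}

func main() {
	// Hypothetical element shaped like one currentOp entry.
	elem := map[string]any{
		"opid":              int32(42),
		"ns":                "db.coll",
		"microsecs_running": int64(1500),
	}

	opid, err := field[int32](elem, "opid")
	fmt.Println(opid, err)

	// "desc" is absent here, so an error comes back and nothing panics.
	_, err = field[string](elem, "desc")
	fmt.Println(err)
}
```

The returned error names the field and the type actually found, which is more informative than logging `ok` with `%t`: in the failure branch `ok` is always `false`.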

tregubov-av (Contributor Author) commented Sep 7, 2023

This is weird...
Locally, all tests pass on every available MongoDB version (4.2, 4.4, 5.0).
What am I doing wrong?
I prepared the pull request following this documentation: https://github.com/percona/mongodb_exporter/blob/main/docs/development-guide.md

P.S.
I think I understand why the integration tests fail.
They are probably launched as soon as docker-compose completes, before the MongoDB cluster has had time to start.
I can reproduce this locally by running make test immediately after make test-cluster.
What led me to this idea is that tests for controllers I did not change also ended with errors.
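
A readiness wait between the two steps is one way to rule this out. The sketch below is an assumption-laden illustration, not part of the repository: `waitForTCP` is a hypothetical helper and `127.0.0.1:27017` is a placeholder address, since the real test cluster's hosts and ports may differ.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitForTCP polls addr until a TCP connection succeeds or timeout elapses.
func waitForTCP(addr string, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("tcp", addr, interval)
		if err == nil {
			conn.Close()
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("%s not reachable after %s: %w", addr, timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Placeholder address (assumption); adjust to the actual test cluster.
	err := waitForTCP("127.0.0.1:27017", 5*time.Second, 500*time.Millisecond)
	if err != nil {
		fmt.Println("cluster not ready:", err)
		return
	}
	fmt.Println("port is accepting connections")
}
```

An open port only shows that the process is listening. It does not prove that a replica set has finished initializing, so a ping through the MongoDB driver would be a stronger check.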

tregubov-av changed the title from "Export currentOP uptime query metrics #704" to "Export currentOP uptime query metrics" on Sep 7, 2023
BupycHuk (Member) commented Sep 8, 2023

Hi @tregubov-av, yes, the problem isn't in your PR. We recently started seeing it on all PRs. Thanks for your input; we will try to fix it.

exporter/currentop_collector.go (review thread: outdated, resolved)
exporter/currentop_collector.go (review thread: resolved)
Co-authored-by: Artem Gavrilov <charlieblackwood7@gmail.com>
artemgavrilov merged commit bedbd21 into percona:main on Sep 15, 2023
6 of 8 checks passed
tregubov-av (Contributor Author) commented

👍

Development

Successfully merging this pull request may close these issues.

Export currentOP uptime query metrics
6 participants