Implement DRBD metrics for disk-state #26

MalloZup · 2019-10-11T08:28:05Z

Description

This PR implements the following metric:


# HELP ha_cluster_drbd_resource_disk_state show per resource name, its role, the volume and disk_state (Diskless,Attaching, Failed, Negotiating, Inconsistent, Outdated, DUnknown, Consistent, UpToDate)
# TYPE ha_cluster_drbd_resource_disk_state gauge
ha_cluster_drbd_resource{disk_state="uptodate",resource_name="1-single-0",role="Secondary",volume="0"} 1
ha_cluster_drbd_resource{disk_state="uptodate",resource_name="1-single-1",role="Secondary",volume="0"} 1
ha_cluster_drbd_resource{disk_state="uptodate",resource_name="vg1",role="Secondary",volume="0"} 1
ha_cluster_drbd_resource{disk_state="uptodate",resource_name="vg2",role="Secondary",volume="0"} 1
ha_cluster_drbd_resource{disk_state="uptodate",resource_name="vg3",role="Secondary",volume="0"} 1
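To make the shape of these samples concrete, here is a minimal sketch of how one such line is composed from a metric name, a label set and a value. It uses only the Go standard library; `formatSample` is a hypothetical helper for illustration and not code from this PR, which presumably relies on the Prometheus client library instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatSample renders one sample in the Prometheus text exposition format,
// with label names sorted alphabetically as in the output above.
// Hypothetical helper: a real exporter would let the client library do this.
func formatSample(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		// %q quotes the value; this is sufficient for simple ASCII label values.
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(pairs, ","), value)
}

func main() {
	fmt.Println(formatSample("ha_cluster_drbd_resource", map[string]string{
		"resource_name": "vg1",
		"role":          "Secondary",
		"volume":        "0",
		"disk_state":    "uptodate",
	}, 1))
}
```

Because the disk state is carried as a label, a change of state shows up as a different series rather than as a different value of the same series.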

Usage:

Check whether the state of the DRBD disks is UpToDate, out of sync, etc.
Monitor or alert on DRBD resource disks.

What is still missing:

- Parse the JSON and populate the types.
- Tests.
- Set the Prometheus exporter metric according to the parsed data.
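As a rough illustration of the parsing step, the sketch below decodes a DRBD status document into typed structs. The JSON field names (`name`, `role`, `devices`, `volume`, `disk-state`) are assumptions based on the output of `drbdsetup status --json`, and the type and function names are hypothetical; the types actually added in this PR may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// drbdDevice and drbdResource mirror only the fields this metric needs.
// The JSON field names are assumptions, not taken from the PR.
type drbdDevice struct {
	Volume    int    `json:"volume"`
	DiskState string `json:"disk-state"`
}

type drbdResource struct {
	Name    string       `json:"name"`
	Role    string       `json:"role"`
	Devices []drbdDevice `json:"devices"`
}

// parseDrbdStatus decodes the raw JSON into typed resources.
func parseDrbdStatus(raw []byte) ([]drbdResource, error) {
	var resources []drbdResource
	if err := json.Unmarshal(raw, &resources); err != nil {
		return nil, fmt.Errorf("could not parse DRBD status: %w", err)
	}
	return resources, nil
}

func main() {
	raw := []byte(`[{"name":"vg1","role":"Secondary","devices":[{"volume":0,"disk-state":"UpToDate"}]}]`)
	resources, err := parseDrbdStatus(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	// One line per (resource, volume) pair: these become the metric labels.
	for _, r := range resources {
		for _, d := range r.Devices {
			fmt.Println(r.Name, r.Role, d.Volume, strings.ToLower(d.DiskState))
		}
	}
}
```

Keeping the parser as a pure function from bytes to structs makes it easy to cover with table-driven tests that need no running DRBD.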

We need to sleep for the same timeout when a metric encounters an error, so that all metrics are always collected at the same interval. For example, if the sbd metric fails and we continue without sleeping, its loop will run 10x or more faster than a normal metric that sleeps for the timeout.

stefanotorresi · 2019-10-14T10:41:00Z

Just a nitpicky remark about naming (yeah, I might be kind of a naming freak):
The _state part in ha_cluster_drbd_resource_disk_state feels somewhat redundant. I mean, do we track anything other than the disk's state?
Also, are there any disks other than "resource disks"? If not, could this be just ha_cluster_drbd_disk?
In general, what other ha_cluster_drbd_* metrics do we have planned? That way we can plan their naming accordingly.

MalloZup · 2019-10-14T12:18:36Z

Good point. I think we can remove the last 2 words

drbd_metrics.go

ha_cluster_exporter.go

Done thx

MalloZup · 2019-10-14T15:51:03Z

@storresi can we merge?

stefanotorresi

sorry @MalloZup I actually forgot to hit the green button 😝

MalloZup · 2019-10-14T18:41:41Z

OK THX! I have added it to the release draft. We will wait a bit this time before releasing a new RPM, since we need to do some refactoring etc.

MalloZup added 12 commits October 10, 2019 22:06

add skeleton of 1st metric

e943fde

Add drbd first minimal implementation

ef196df

Add function to get raw JSON

FIX: fix the case where sbd config file

1735e6d

doesn't have any SBD_DEVICE set. In this case, just catch the error and continue. The rationale is that on some systems the user may forget to set this in the config file, so we don't want the exporter to panic because of an index error
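A minimal sketch of this kind of guard is shown below. It returns an error when no device is configured instead of indexing into an empty result. The function name, the error value, and the assumption that multiple devices are separated by `;` are illustrative and not taken from the commit.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoSbdDevice is a hypothetical sentinel error for this sketch.
var errNoSbdDevice = errors.New("no SBD_DEVICE set in sbd config")

// readSbdDevices extracts the device list from the content of an sbd config
// file. It returns an error instead of indexing into an empty match, so a
// missing SBD_DEVICE cannot cause an index-out-of-range panic.
func readSbdDevices(config string) ([]string, error) {
	for _, line := range strings.Split(config, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "SBD_DEVICE=") {
			continue // also skips commented-out lines such as "#SBD_DEVICE=..."
		}
		value := strings.Trim(strings.TrimPrefix(line, "SBD_DEVICE="), `"'`)
		// Assumption: several devices are separated by semicolons.
		devices := strings.FieldsFunc(value, func(r rune) bool { return r == ';' })
		if len(devices) == 0 {
			return nil, errNoSbdDevice
		}
		return devices, nil
	}
	return nil, errNoSbdDevice
}

func main() {
	if _, err := readSbdDevices("SBD_WATCHDOG_DEV=/dev/watchdog\n"); err != nil {
		// Log and carry on with the other metrics instead of panicking.
		fmt.Println("skipping sbd metrics:", err)
	}
	devices, _ := readSbdDevices(`SBD_DEVICE="/dev/vdc;/dev/vdd"` + "\n")
	fmt.Println(devices)
}
```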

Add types for parsing json and tests

73f99e4

Add missing fields and tests for parsing

0fca535

Implement parser function for drbd

b97ab91

Set metric accordingly to data. Experimental

632dd57

Minor: update todos and remove printf

a0f04d1

Implement resetDrbdMetric function, drbd metric

0b5498b

- add reset for the DRBD metric: this is needed in case we lose a disk. Since a disk is a label, if we didn't reset the metric on each run, it could retain a zombie disk metric
- implement map from value to number
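The zombie-series problem and the state-to-number mapping can be illustrated with a small stand-in for a labelled gauge. Everything here is a sketch under assumptions: `labeledGauge`, `update` and the numbering in `diskStateValue` are hypothetical and do not reproduce `resetDrbdMetric` or the mapping from this commit.

```go
package main

import "fmt"

// diskStateValue maps a DRBD disk state to a number. The numbering is an
// arbitrary choice for this sketch, not taken from the PR.
var diskStateValue = map[string]float64{
	"diskless": 0, "attaching": 1, "failed": 2, "negotiating": 3,
	"inconsistent": 4, "outdated": 5, "dunknown": 6, "consistent": 7,
	"uptodate": 8,
}

// labeledGauge is a tiny stand-in for a labelled gauge: one value per label set.
type labeledGauge struct {
	samples map[string]float64
}

func newLabeledGauge() *labeledGauge {
	return &labeledGauge{samples: make(map[string]float64)}
}

// Reset drops every series, including those of disks that no longer exist.
func (g *labeledGauge) Reset() { g.samples = make(map[string]float64) }

// Set stores the value for one label set.
func (g *labeledGauge) Set(labels string, v float64) { g.samples[labels] = v }

// update rebuilds the gauge from the disks seen in the current collection.
// Resetting first removes the series of any disk that has disappeared.
func update(g *labeledGauge, diskStates map[string]string) {
	g.Reset()
	for disk, state := range diskStates {
		if v, ok := diskStateValue[state]; ok {
			g.Set(disk, v)
		}
	}
}

func main() {
	g := newLabeledGauge()
	update(g, map[string]string{"vg1/0": "uptodate", "vg2/0": "outdated"})
	update(g, map[string]string{"vg1/0": "uptodate"}) // vg2 was lost
	fmt.Println(len(g.samples))                       // only vg1/0 is left
}
```

Without the `Reset` call, the series for `vg2/0` would keep its last value indefinitely, which is the zombie metric the commit message describes.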

Move types from pacemaker to separate file

1e973f8

simplify metric

0bb2b6b

MalloZup mentioned this pull request Oct 14, 2019

refactor pacemaker metrics like DRBD one #30

Closed

MalloZup requested a review from stefanotorresi October 14, 2019 10:11

MalloZup changed the title ~~WIP: Implement DRBD metrics~~ Implement DRBD metrics for disk-state Oct 14, 2019

MalloZup force-pushed the drbd-metric branch from 610d798 to deb2b8b Compare October 14, 2019 13:01

stefanotorresi previously requested changes Oct 14, 2019

drbd_metrics.go Outdated

ha_cluster_exporter.go Outdated

Make name of metric more consistent

79866e9

MalloZup force-pushed the drbd-metric branch from 605da52 to 79866e9 Compare October 14, 2019 14:17

MalloZup requested a review from stefanotorresi October 14, 2019 14:18

stefanotorresi approved these changes Oct 14, 2019

MalloZup merged commit d97a702 into master Oct 14, 2019

MalloZup deleted the drbd-metric branch October 16, 2019 09:45
