support per-drive crit/warning thresholds by adding metrics when needed #374
By default, snclient does not add metrics that do not occur in a condition. It decides this by checking the operand of each condition with the check.HasThreshold() function.

Using used_pct in a condition adds two metrics per drive, '<drive> used' and '<drive> used %'. If it is not present, these metrics are missing.

```
./snclient -vvv --logfile stdout run check_drivesize "drive=/" "warn=used_pct gt 50" show-all
OK - / 428.610 GiB/935.929 GiB (45.8%) |'/ used'=460216111104B;502473211904;904451781427;0;1004946423808 '/ used %'=45.8%;50;90;0;100
```

But adding a bare warn='used_pct gt 90' would affect all drives. To check multiple drives while specifying different thresholds for each drive, we need to add the percentage usage metrics. Metrics are also checked when finalizing the check, and can influence the final state.

```
./snclient -vvv --logfile stdout run check_drivesize "drive=/" "drive=/tmp" "warn='/ used %' gt 30" "crit='/tmp used %' gt 66" show-all
WARNING - / 428.867 GiB/935.929 GiB (45.8%), /tmp 961.945 MiB/31.127 GiB (3.0%) |'/ used'=460492066816B;;;0;1004946423808 '/ used %'=45.8%;30;;0;100 '/tmp used'=1008672768B;;;0;33422544896 '/tmp used %'=3%;;66;0;100
```

This change detects conditions whose operand is named '<drive> used %'. If a condition uses that as its operand, usage metrics are added for that drive as well. This only applies to that drive: since the operand '<drive> used %' is different for each drive, it won't affect other drives' perfdata.
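The detection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `driveFromKeyword` and `needsUsageMetrics` are hypothetical helper names, and conditions are reduced to their keyword strings instead of snclient's real condition type.

```go
package main

import (
	"fmt"
	"strings"
)

// usageSuffix is the suffix of the per-drive percentage metric name.
const usageSuffix = " used %"

// driveFromKeyword is a hypothetical helper (not part of snclient's API).
// It extracts the drive from a keyword such as "/tmp used %".
// The second return value is false when the keyword is not per-drive.
func driveFromKeyword(keyword string) (string, bool) {
	if !strings.HasSuffix(keyword, usageSuffix) {
		return "", false
	}
	drive := strings.TrimSuffix(keyword, usageSuffix)
	if drive == "" {
		return "", false
	}
	return drive, true
}

// needsUsageMetrics reports whether any condition keyword targets the
// given drive, in which case usage metrics would be added for it.
func needsUsageMetrics(drive string, keywords []string) bool {
	for _, keyword := range keywords {
		if d, ok := driveFromKeyword(keyword); ok && d == drive {
			return true
		}
	}
	return false
}

func main() {
	// Keywords as they would appear in the warn/crit conditions above.
	keywords := []string{"/ used %", "/tmp used %"}
	for _, drive := range []string{"/", "/tmp", "/home"} {
		fmt.Printf("%s: %v\n", drive, needsUsageMetrics(drive, keywords))
	}
}
```

Because the match is on the full '<drive> used %' name, a condition for one drive never triggers metrics for another.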
These conditions have their keyword transformed to '<drive> used %' so that the metric name matches the condition name. In condition.String(), check whether the keyword is in the original string; if it isn't, it has likely been changed, so print it out separately.
This also supports '<drive> used_pct' conditions. Their keywords are transformed to '<drive> used %' to match the metric name for usage percentages. Because the converted keywords are checked just like '<drive> used %' in conditions, they trigger adding metrics for a drive. During the metrics check they take effect and can raise warning/critical.

In addition, Condition.String() now checks whether the keyword is contained in the original string. If it is not, most likely because it was transformed at some earlier point, the new keyword is appended to the output.
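A minimal sketch of the keyword transformation and the String() fallback, under assumptions: the `condition` struct, `normalizeKeyword`, and the ` (keyword: ...)` output format are all hypothetical stand-ins chosen for illustration, and do not reflect snclient's actual Condition type or its real output.

```go
package main

import (
	"fmt"
	"strings"
)

// condition is a simplified stand-in, not snclient's Condition type.
type condition struct {
	original string // the threshold text as the user wrote it
	keyword  string // the operand, possibly rewritten after parsing
}

// normalizeKeyword rewrites "<drive> used_pct" to "<drive> used %" so the
// keyword matches the metric name. A bare "used_pct" is left unchanged.
func normalizeKeyword(keyword string) string {
	const alias = " used_pct"
	if strings.HasSuffix(keyword, alias) && len(keyword) > len(alias) {
		return strings.TrimSuffix(keyword, alias) + " used %"
	}
	return keyword
}

// String returns the original text. If the keyword no longer appears in
// it, the keyword was rewritten, so it is appended (format is assumed).
func (c condition) String() string {
	if strings.Contains(c.original, c.keyword) {
		return c.original
	}
	return fmt.Sprintf("%s (keyword: %s)", c.original, c.keyword)
}

func main() {
	c := condition{
		original: "'/tmp used_pct' gt 66",
		keyword:  normalizeKeyword("/tmp used_pct"),
	}
	fmt.Println(c.String())
}
```

Appending only when the keyword is absent keeps the output unchanged for conditions that were never rewritten.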
Very nice. Could you add a sentence to the docs and mention this possibility, maybe as an example?
Done.