Gracefully handle negative counter values for prometheus-emitter & update metric types for sys/disk/* metrics #18719
Thanks for the investigation and fix, @aho135!
I agree that these metrics should be configured as gauges instead. Should we also fix the metric type in the
For any other metric misconfigurations, the exception handling should give us more insight into that, I suppose.
Thanks for the review, @abhishekrb19! Yeah, good point. Let me update the metric type and take a look at the other emitters.
.../prometheus-emitter/src/main/java/org/apache/druid/emitter/prometheus/PrometheusEmitter.java
…/druid/emitter/prometheus/PrometheusEmitter.java Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
sys/disk/* metrics
The OshiSysMonitor DiskStats uses KeyedDiff, which can occasionally produce negative values for metrics such as sys/disk/write/size when the underlying long in HWDiskStore overflows. When such a metric is configured as a counter, the negative value leads to an uncaught exception in the emitter.
Metrics of this kind should arguably be configured as gauges. With this change, if a metric is still misconfigured as a counter, the emitter logs a warning instead of throwing an uncaught exception.
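To illustrate why a diff-based metric can go negative, here is a minimal sketch. It is not Druid's KeyedDiff; the `delta` helper and the sample values are hypothetical. It only assumes that the metric is computed as the current cumulative reading minus the previous one, so any reading that comes back smaller than the last one (after a wrap or reset, for example) yields a negative delta.

```java
public class DeltaSketch {
    // Hypothetical stand-in for a per-key diff: current sample minus previous sample.
    static long delta(long previous, long current) {
        return current - previous;
    }

    public static void main(String[] args) {
        // Normal case: the cumulative reading grew between two samples.
        System.out.println(delta(100L, 250L));

        // Assumed failure case: the cumulative reading went backwards
        // (e.g. it wrapped or was reset), so the delta is negative.
        System.out.println(delta(4_294_967_000L, 200L));
    }
}
```

A gauge can represent such a value, while a counter is expected to only ever increase, which is why the metric type matters here.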
Description
More graceful handling when a counter is unintentionally updated with a negative value
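One way such handling could look is sketched below. This is not the code from the PR: `MonotonicCounter` is a hypothetical stand-in that rejects negative increments with an `IllegalArgumentException` (which is how the Prometheus Java simpleclient counter behaves, to the best of my knowledge), and `safeInc` is an assumed helper name. Whether the actual change checks the value up front or catches the exception is not stated in this description.

```java
import java.util.logging.Logger;

public class SafeCounterSketch {
    private static final Logger LOG = Logger.getLogger(SafeCounterSketch.class.getName());

    /** Hypothetical stand-in for a counter that only accepts non-negative increments. */
    static final class MonotonicCounter {
        private double value;

        void inc(double amount) {
            if (amount < 0) {
                throw new IllegalArgumentException("Amount to increment must be non-negative.");
            }
            value += amount;
        }

        double get() {
            return value;
        }
    }

    /**
     * Applies the increment if it is non-negative; otherwise logs a warning and skips it.
     * Returns true when the counter was updated, false when the value was dropped.
     */
    static boolean safeInc(MonotonicCounter counter, String metricName, double amount) {
        if (amount < 0) {
            LOG.warning("Skipping negative value [" + amount + "] for counter metric ["
                        + metricName + "]; it may need to be configured as a gauge.");
            return false;
        }
        counter.inc(amount);
        return true;
    }

    public static void main(String[] args) {
        MonotonicCounter counter = new MonotonicCounter();
        safeInc(counter, "sys/disk/write/size", 5.0);   // applied
        safeInc(counter, "sys/disk/write/size", -3.0);  // skipped with a warning
        System.out.println(counter.get());
    }
}
```

The point of the guard is that one bad sample costs a log line rather than an uncaught exception in the emit path.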
Release note
More graceful handling when a prometheus counter is unintentionally updated with a negative value.
Update the OshiSysMonitor DiskStats sys/disk/* delta metrics from counter to gauge type for prometheus-emitter, dropwizard-emitter and statsd-emitter.
Key changed/added classes in this PR
PrometheusEmitter
PrometheusEmitterTest
This PR has: