
Add label and mapping support in dropwizard metric exporter #98

Closed

Conversation


@dopuskh3 commented Jan 4, 2016

Introduce a metric mapper (highly inspired by the jmx_exporter code) that allows mapping dropwizard metrics to a target metric name and labels.

This code will be reused in jmx_exporter in a follow-up review.

Francois Visconte added 2 commits January 4, 2016 13:44
This component maps a metric name to a new metric name, label names,
and label values by using a YAML configuration.
Allow using a mapping configuration in the DropWizard collector.
@@ -39,6 +39,11 @@
<artifactId>simpleclient</artifactId>
<version>0.0.12-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.yaml</groupId>
<artifactId>snakeyaml</artifactId>
Contributor

We can't add dependencies here; this module is used by the pushgateway and servlet, and the new dependency may clash with other versions the user is using.

Author

I can move the MetricMapper code into the dropwizard package then.

@brian-brazil
Contributor

The Prometheus standard for dealing with dotted strings is glob based rather than regex based; see https://github.com/prometheus/graphite_exporter#metric-mapping-and-configuration. Accordingly, I'd like the dropwizard collector to use the same approach for consistency.

We only use regexes as a last resort when metrics are too unstructured to deal with in any other way.
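
Roughly, the glob approach means each * matches exactly one dot-separated component, and the matched components become label values. A minimal sketch of the idea (illustrative only, not the graphite_exporter implementation; the pattern and names are made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GlobSketch {
    // Turn e.g. "org.company.controller.*.status.*" into a regex where each
    // "*" matches exactly one dot-separated component.
    static Pattern globToPattern(String glob) {
        StringBuilder regex = new StringBuilder("^");
        for (String part : glob.split("\\.")) {
            if (regex.length() > 1) {
                regex.append("\\.");
            }
            regex.append(part.equals("*") ? "([^.]+)" : Pattern.quote(part));
        }
        return Pattern.compile(regex.append("$").toString());
    }

    public static void main(String[] args) {
        Matcher m = globToPattern("org.company.controller.*.status.*")
                .matcher("org.company.controller.login.status.500");
        if (m.matches()) {
            // The captured components become label values,
            // e.g. controller="login", status="500".
            System.out.println("controller=" + m.group(1) + " status=" + m.group(2));
        }
    }
}
```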

final String snakeCaseMetricName = snakeCasePattern.matcher(metricName).replaceAll("$1_$2").toLowerCase();

for (Rule rule : rules) {
Matcher matcher = null;

Maybe move the inside of the loop into a sub-method? (This will allow you to write a unit test covering only one iteration.)
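
Something along these lines, for example (just a sketch; Rule, targetName, and applyRule are placeholders for whatever this PR actually defines):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RuleSketch {
    // Hypothetical stand-in for the PR's rule type.
    static class Rule {
        final Pattern pattern;
        final String targetName; // may contain $1, $2, ... back-references

        Rule(Pattern pattern, String targetName) {
            this.pattern = pattern;
            this.targetName = targetName;
        }
    }

    // The extracted loop body: returns the mapped name for one rule,
    // or null if the rule does not match. Easy to unit-test in isolation.
    static String applyRule(Rule rule, String snakeCaseMetricName) {
        Matcher matcher = rule.pattern.matcher(snakeCaseMetricName);
        return matcher.matches() ? matcher.replaceAll(rule.targetName) : null;
    }
}
```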

@dopuskh3
Author

dopuskh3 commented Jan 4, 2016

Dropwizard metrics are not always dot-separated names, so regex mapping allows a wider range of mappings. Also note that metrics exported over JMX are often exposed with the dropwizard library (Kafka, for example); that's why I chose to use the jmx_exporter code for inspiration.
Finally, note that with caching, the regex code will be called only once per source metric and will have negligible overhead.
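
As a sketch of the caching idea (hypothetical names, Java 8 syntax for brevity, not the actual PR code): the regex work happens at most once per distinct source metric name, and later lookups are plain map reads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class MappingCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();
    private final Function<String, String> expensiveRegexMapping;

    // The mapping function is expected to always return a name (e.g. the
    // snake-cased original when no rule matches) so every result is cacheable.
    MappingCacheSketch(Function<String, String> expensiveRegexMapping) {
        this.expensiveRegexMapping = expensiveRegexMapping;
    }

    String map(String dropwizardName) {
        // Runs the regex mapping only the first time a name is seen.
        return cache.computeIfAbsent(dropwizardName, expensiveRegexMapping);
    }
}
```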

@brian-brazil
Contributor

Dropwizard metrics are not always dot-separated names.

Can you give an example of this where the labels aren't dot-separated?

@dopuskh3
Author

dopuskh3 commented Jan 4, 2016

Using the MetricRegistry.name() helper is not mandatory. This helper creates graphite-style metric names using a dot as a separator, but a lot of users are not using it. Many users also use MixedCaseMetricNames as the last metric component when using this helper. In this case, using a regex can be particularly useful.
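
For reference, the helper just joins the components with dots, and nothing forces users to go through it (the class and metric names below are only examples):

```java
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public class NamingExample {
    public static void main(String[] args) {
        MetricRegistry registry = new MetricRegistry();

        // Graphite-style, dot-separated name built with the helper:
        // "com.example.UserResource.getUsers.requests"
        Timer dotted = registry.timer(
                MetricRegistry.name("com.example.UserResource", "getUsers", "requests"));

        // But arbitrary strings, including mixed case, are also valid names.
        Timer mixedCase = registry.timer("UserResourceGetUsersRequests");
    }
}
```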

@brian-brazil
Contributor

Mixed-case metric names shouldn't present a problem; worst case you just leave them as-is. Dealing with other instrumentation and monitoring systems is always a question of compromises.

I'd prefer to keep things simple where possible. Do you have an example of something that'd be impossible if we used the approach the graphite exporter uses? If so, then we likely also need to change both the graphite and statsd exporters.

@dopuskh3
Author

dopuskh3 commented Jan 4, 2016

From the Kafka code (using Yammer Metrics, the old name for dropwizard metrics):

val failedProduceRequestRate = newMeter("FailedProduceRequestsPerSec", "requests", TimeUnit.SECONDS, tags)

val totalProduceRequestRate = newMeter("TotalProduceRequestsPerSec", "requests", TimeUnit.SECONDS, tags)

With the approach used in the graphite exporter, merging both metrics into a single Prometheus metric is not possible.

With the proposed approach, snake-casing the metric name plus a regex allows mapping (.+)_produce_requests_per_sec to the same metric with a status label (status: $1).
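
A rough sketch of that mapping (the snake-casing pattern and the output format are illustrative rather than the exact PR code):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KafkaMappingSketch {
    private static final Pattern CAMEL = Pattern.compile("([a-z0-9])([A-Z])");
    private static final Pattern RULE =
            Pattern.compile("(.+)_produce_requests_per_sec");

    public static void main(String[] args) {
        for (String source : new String[] {
                "FailedProduceRequestsPerSec", "TotalProduceRequestsPerSec"}) {
            // Snake-case the Dropwizard/Yammer name first ...
            String snake = CAMEL.matcher(source).replaceAll("$1_$2").toLowerCase();
            // ... then a capture group turns the prefix into a label value, so
            // both meters collapse into one metric differing only by "status".
            Matcher m = RULE.matcher(snake);
            if (m.matches()) {
                // Prints e.g. produce_requests_per_sec{status="failed"}
                System.out.println(
                        "produce_requests_per_sec{status=\"" + m.group(1) + "\"}");
            }
        }
    }
}
```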

@brian-brazil
Contributor

I don't see any fundamental problem there; you can still get at the data, with the right labels where there are labels (in this case there aren't any). Many times when working with 3rd-party monitoring, things that might make sense as labels aren't practical to express as labels. It's not a show-stopper, it's just a bit annoying.

In addition, what you're trying to do is bad practice. Separate failed and total counters are recommended as they're easier to work with.

@unbrice

unbrice commented Jan 6, 2016

Not a show stopper, but it'd be cool to have internal consistency between metric names of a given deployment. This makes discoverability better and improves the user experience.

OTOH I agree care must be given to making Prometheus consistent.

I suspect that one of the reasons behind the design of the rules applied to graphite metrics was to make them reversible (so as to be able to read historical data if/when we can read back from graphite).
By contrast, the JMX exporter used a different syntax since it did not need to be reversible, and using a different syntax improved the internal consistency of a given deployment (and therefore discoverability and user experience).

I want to make the argument that given this understanding above, the dropwizard exporter is more similar to the JMX exporter than to the graphite potentially-reversible exporter. With this in mind, it seems that the dropwizard exporter should mimic the JMX integration, not the graphite one. This also means using YAML config files, which is similar to what happens elsewhere in the project.

Would you be willing to allow using the same rules for the JMX exporter and the dropwizard exporter?

@brian-brazil
Contributor

Not a show stopper, but it'd be cool to have internal consistency between metric names of a given deployment.

You can always instrument directly with Prometheus :) You can pull out the metrics from the java client into other monitoring systems too.

I suspect that one of the reasons behind the design of the rules applied to graphite metrics was to make them reversible (so as to be able to read historical data if/when we can read back from graphite).

I'm not aware of that being a reason. That data path wouldn't involve the graphite exporter, as it's about collection rather than long-term storage.

By contrast, the JMX exporter used a different syntax since it did not need to be reversible

The JMX exporter uses regexes as there are four different types of information that can be part of a metric, with effectively no standardisation. Attempting to produce a DSL with the flexibility required would not have been sane. Thus we use our last resort: throw it all into a string and use regexes.

With dropwizard metrics following the dotted string format, this isn't a problem as there's only one type of information and a standard format. We can offer a simple DSL.

I want to make the argument that given this understanding above, the dropwizard exporter is more similar to the JMX exporter than to the graphite potentially-reversible exporter.

The graphite exporter and dropwizard collector are both dealing with dotted strings, there's no real similarity with the JMX exporter which is dealing with mBeans.

If you want to propose that the JMX exporter style configuration is the right way to handle dropwizard metrics, then you must also argue that it's the right way to handle graphite and statsd as they're dealing with the exact same type of data.

@unbrice

unbrice commented Jan 11, 2016

Thanks Brian. After reading your arguments, I agree that you are right to say that a DSL like the graphite exporter's is appropriate for dropwizard metrics.

@felixbarny
Contributor

If you want to use dropwizard metrics but with rich (name, labels) metric names, you could also take a look at my open source project stagemonitor.

Stagemonitor has its own MetricRegistry but uses the Metric objects from dropwizard:

metricRegistry.counter(name("foo_bar").tag("baz", "qux").build()).inc();

I just added support for Prometheus: stagemonitor/stagemonitor@d4fa87a

@brian-brazil
Contributor

More integrations are always good. Do you want to send a PR to add that one to the docs?

I'd generally recommend Prometheus's Java metrics over Dropwizard. Our API and semantics are a bit better, even ignoring labels.

@felixbarny
Contributor

Sure. What would be the right place to document that?

Could you elaborate why your semantics are better?

@brian-brazil
Contributor

@felixbarny
Contributor

From looking at the source of Summary and Histogram, they don't seem to support recency of response times. What I mean by that is they seem to take all response times since the application started into account, and not only recent response times like dropwizard's Timer with its forward-decaying priority reservoir does. Or did I overlook something?

@brian-brazil
Contributor

brian-brazil commented May 20, 2016

It works the same way as the Prometheus Counter. They export cumulative stats, and from that Prometheus can calculate the latency over any time window without bias towards some samples.
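
For illustration, something along these lines with the Java simpleclient (the metric name is just an example): only the cumulative count and sum are exported, and the server derives averages or rates over whatever window it wants.

```java
import io.prometheus.client.Summary;

public class SummaryExample {
    // Exposes cumulative requests_latency_seconds_count and
    // requests_latency_seconds_sum; e.g. the average latency over a window
    // can be computed server-side from the rates of these two series.
    static final Summary requestLatency = Summary.build()
            .name("requests_latency_seconds")
            .help("Request latency in seconds.")
            .register();

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... handle a request ...
        requestLatency.observe((System.nanoTime() - start) / 1e9);
    }
}
```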

@felixbarny
Contributor

I guess I see. But the Summary, which could be compared to dropwizard's Timer, does not support quantiles, or does it?

@brian-brazil
Contributor

It doesn't currently, but it is being worked on.

The problem with client-side quantiles is that they're expensive (locking) and can't be aggregated. Histograms don't have those problems.
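
For example, something like this with the simpleclient Histogram (the bucket boundaries are just an example): buckets are chosen up front, and as long as every instance uses the same boundaries the per-bucket counters can simply be summed when aggregating.

```java
import io.prometheus.client.Histogram;

public class HistogramExample {
    // Bucket boundaries are fixed at registration time; identical boundaries
    // across instances let the per-bucket counters be summed for aggregation.
    static final Histogram requestDuration = Histogram.build()
            .name("request_duration_seconds")
            .help("Request duration in seconds.")
            .buckets(0.01, 0.05, 0.1, 0.5, 1, 5)
            .register();

    public static void main(String[] args) {
        // An observation of 42ms increments every bucket with le >= 0.05.
        requestDuration.observe(0.042);
    }
}
```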

@felixbarny
Contributor

The problem with the histogram is that you have to know the distribution and range beforehand. For a general-purpose monitoring solution like stagemonitor, which strongly believes in simplicity for users (i.e. I don't want to require users to configure that), histograms don't seem feasible. (Have you thought about dynamically adjusting the buckets at runtime?)

What will the implementation of snapshots look like? Similar to dropwizard's decaying reservoir?

@brian-brazil
Contributor

(Have you thought about dynamically adjusting the buckets at runtime?)

Doesn't work, as all clients need to have identical buckets for aggregation to work.

What will the implementation of snapshots look like? Similar to dropwizard's decaying reservoir?

It'll store samples for the last 5m or so, and calculate from there.

@brian-brazil
Contributor

Please reopen if you implement the glob-based approach.
