New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce MetricName type to encapsulate more information about metrics. #561
Conversation
Hi, I intended to start this discussion by creating a TimeSeries Format Specification. You can find it here : https://github.com/StevenLeRoux/TimeSeries_Format_Specification I welcome every comment and I encourage the patch from @udoprog since it's a more powerful way to manipulate series since you can discover them from tags instead of having to know the structure of a MetricName if you don't use tags. |
This is a large refactoring meant to change the way metric names are represented to better suit a wide variety of backends. Quite a few metric backend systems are starting to use a general concept of 'tags' which is a step back from having too encode information in the (string) names of the metrics*. * KairosDB (http://code.google.com/p/kairosdb/), OpenTSDB (http://opentsdb.net/overview.html) Tags are meant to allow for natural grouping of similar timeseries allowing for the ability to easily correlate freely between the timeseries that incorporate them. For example, take the following two timeseries. * example.com.interface.traffic.eth0.in * example.com.interface.traffic.eth0.out Using tags, these would be represented as; * interface.traffic {host=example.com, interface=eth0, direction=in} * interface.traffic {host=example.com, interface=eth0, direction=out}
This is very interesting - did you know that the previous major version (2.x) had a MetricName? We had used one of its (many) fields to implement this concept of 'tags.' I can tell you that upgrading to the 3.x releases was rather awkward (we ended up encoding the 'tags' directly in the metric name string). |
@earthling: had no idea, that is interesting. Looking back, it was even named the same. The major distinction I can see is like you stated the grand number of fields vs Map. When considering the case of thousands of hosts with hundreds of roles with hundreds of timeseries, treating them as a hierarchy becomes problematic. |
Yes, I agree, but the popular 'Graphite' system is highly hierarchical. For this case, we encode the tags as part of the graphite metric name. Your example: Which is perhaps not ideal, but I think it is manageable. One could imagine a 'metric name formatter' or something of that sort. Do you think it would be valuable to have a means to control the position of the 'tags' when they are rendered into a singular string for system like Graphite? |
I think it's valuable for large deployment when you have indexed tags so that you can discover and fetch your metrics from this index instead of proceed to a full scan of your entries or your tree. |
Here is another initiative to start thinking in terms of tags; |
Thank you for your work on this. Some version of this will be included in v4, without a doubt. Work won't begin in earnest on v4 for some time, but I'll certainly continue this discussion at that time. |
In this particular PR the name has been replaced with a class MetricName. Instead of this wouldn't it be simpler to have tags supplied when one creates a counter, gauge or timer etc via the MetricsRegistry.counter, .gauge() etc. |
I'm confident that this MetricName approach will be the most flexible and user friendly way to implement tags. |
Could you please let me know why a MetricName would be friendly way to implement tags? Suppose if I create a counter like this: MetricRegistry.counter("name", Map<String,String>); Are there any underlying functionality that would prevent an approach I mentioned above? |
There's nothing that would prevent us from doing that. This is simply a better API, in my opinion: MetricName counterName = new MetricName("counter").tagged("tag1", "value1");
MetricRegistry.counter(counterName) // counter { tag1=value1 }
MetricRegistry.counter(counterName.tagged("tag2", "value2")); // counter { tag1=value1, tag2=value2 } |
But doing this results in 50 file changes in your PR. Whereas if you go with what I suggest you will only have to change the MetricRegistory methods to accept tags. I was wondering if there is a reason related to functionality that requires the use of MetricName? |
Committed to a branch. Thanks @udoprog! |
@ryantenney Wow, must admit I was not expecting this. kudos for looking at it again! At my company we've implemented something similar (but we went with the name MetricId), what we ended up doing was replace MetricRegistry and start from scratch. We'll be releasing the library pretty soon. These are a few things we've learned along the way;
@salilsurendran Refactoring probably does not have to happen again. E.g. the behavior of a MetricName and its underlying implementation can change freely without having to modify signatures and method invocations all over the place. Concretely about what you suggest, if you encapsulate the tags of a metric in the API, every method that interacts with metrics would have to support the new way of doing things, and in order to make use of it, all these 50 files would have to be modified at one point or another. String's are in my opinion poor metric identifiers, they require encoding to encapsulate non-trivial information (like tags). And they have very strong pre-defined behaviour that can not (should not) be extended or modified (final). Two things i'd want for a metric identifier in a framework. I hope any of this will be useful, I'd be thrilled to help out on this if possible. |
Really looking forward to this - what is the timeframe for a v4 release? |
The point of this pull request is not to have it merged. I appreciate the fact that it is a big undertaking to incorporate something like this (third party libraries especially will be a blast).
I am hoping to start the discussion, and I see no better way of doing it that through code.
Commit message
This is a large refactoring meant to change the way metric names are
represented to better suit a wide variety of backends.
Quite a few metric backend systems are starting to use a general concept of
'tags' which is a step back from having too encode information in the (string)
names of the metrics*.
* KairosDB (http://code.google.com/p/kairosdb/), OpenTSDB
(http://opentsdb.net/overview.html)
Tags are meant to allow for natural grouping of similar timeseries allowing for
the ability to easily correlate freely between the timeseries that incorporate
them.
For example, take the following two timeseries.
Using tags, these would be represented as;