New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce MetricName type to encapsulate more information about metrics. #561

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
7 participants
@udoprog
Contributor

udoprog commented Apr 8, 2014

The point of this pull request is not to have it merged. I appreciate the fact that it is a big undertaking to incorporate something like this (third party libraries especially will be a blast).

I am hoping to start the discussion, and I see no better way of doing it that through code.

Commit message

This is a large refactoring meant to change the way metric names are
represented to better suit a wide variety of backends.

Quite a few metric backend systems are starting to use a general concept of
'tags' which is a step back from having too encode information in the (string)
names of the metrics*.

* KairosDB (http://code.google.com/p/kairosdb/), OpenTSDB
(http://opentsdb.net/overview.html)

Tags are meant to allow for natural grouping of similar timeseries allowing for
the ability to easily correlate freely between the timeseries that incorporate
them.

For example, take the following two timeseries.

  • example.com.interface.traffic.eth0.in
  • example.com.interface.traffic.eth0.out

Using tags, these would be represented as;

  • interface.traffic {host=example.com, interface=eth0, direction=in}
  • interface.traffic {host=example.com, interface=eth0, direction=out}
@StevenLeRoux

This comment has been minimized.

Show comment
Hide comment
@StevenLeRoux

StevenLeRoux Apr 8, 2014

Hi,

I intended to start this discussion by creating a TimeSeries Format Specification.

You can find it here : https://github.com/StevenLeRoux/TimeSeries_Format_Specification

I welcome every comment and I encourage the patch from @udoprog since it's a more powerful way to manipulate series since you can discover them from tags instead of having to know the structure of a MetricName if you don't use tags.

StevenLeRoux commented Apr 8, 2014

Hi,

I intended to start this discussion by creating a TimeSeries Format Specification.

You can find it here : https://github.com/StevenLeRoux/TimeSeries_Format_Specification

I welcome every comment and I encourage the patch from @udoprog since it's a more powerful way to manipulate series since you can discover them from tags instead of having to know the structure of a MetricName if you don't use tags.

Introduce MetricName type to encapsulate more information about metrics.
This is a large refactoring meant to change the way metric names are
represented to better suit a wide variety of backends.

Quite a few metric backend systems are starting to use a general concept of
'tags' which is a step back from having too encode information in the (string)
names of the metrics*.

* KairosDB (http://code.google.com/p/kairosdb/), OpenTSDB
(http://opentsdb.net/overview.html)

Tags are meant to allow for natural grouping of similar timeseries allowing for
the ability to easily correlate freely between the timeseries that incorporate
them.

For example, take the following two timeseries.

* example.com.interface.traffic.eth0.in
* example.com.interface.traffic.eth0.out

Using tags, these would be represented as;

* interface.traffic {host=example.com, interface=eth0, direction=in}
* interface.traffic {host=example.com, interface=eth0, direction=out}
@earthling

This comment has been minimized.

Show comment
Hide comment
@earthling

earthling Apr 9, 2014

This is very interesting - did you know that the previous major version (2.x) had a MetricName? We had used one of its (many) fields to implement this concept of 'tags.' I can tell you that upgrading to the 3.x releases was rather awkward (we ended up encoding the 'tags' directly in the metric name string).

earthling commented Apr 9, 2014

This is very interesting - did you know that the previous major version (2.x) had a MetricName? We had used one of its (many) fields to implement this concept of 'tags.' I can tell you that upgrading to the 3.x releases was rather awkward (we ended up encoding the 'tags' directly in the metric name string).

@udoprog

This comment has been minimized.

Show comment
Hide comment
@udoprog

udoprog Apr 10, 2014

Contributor

@earthling: had no idea, that is interesting. Looking back, it was even named the same.

The major distinction I can see is like you stated the grand number of fields vs Map.
For my users it will be very important to have the ability to specify arbitrary tags that can correctly be propagated to the backend so that their meaning can carry through the entire application stack without additional translation steps.

When considering the case of thousands of hosts with hundreds of roles with hundreds of timeseries, treating them as a hierarchy becomes problematic.

Contributor

udoprog commented Apr 10, 2014

@earthling: had no idea, that is interesting. Looking back, it was even named the same.

The major distinction I can see is like you stated the grand number of fields vs Map.
For my users it will be very important to have the ability to specify arbitrary tags that can correctly be propagated to the backend so that their meaning can carry through the entire application stack without additional translation steps.

When considering the case of thousands of hosts with hundreds of roles with hundreds of timeseries, treating them as a hierarchy becomes problematic.

@earthling

This comment has been minimized.

Show comment
Hide comment
@earthling

earthling Apr 10, 2014

Yes, I agree, but the popular 'Graphite' system is highly hierarchical. For this case, we encode the tags as part of the graphite metric name. Your example:
interface.traffic {host=example.com, interface=eth0, direction=in}
Would become something like:
interface.traffic.hosts.example_com.interfaces.eth0.directions.in

Which is perhaps not ideal, but I think it is manageable. One could imagine a 'metric name formatter' or something of that sort. Do you think it would be valuable to have a means to control the position of the 'tags' when they are rendered into a singular string for system like Graphite?

earthling commented Apr 10, 2014

Yes, I agree, but the popular 'Graphite' system is highly hierarchical. For this case, we encode the tags as part of the graphite metric name. Your example:
interface.traffic {host=example.com, interface=eth0, direction=in}
Would become something like:
interface.traffic.hosts.example_com.interfaces.eth0.directions.in

Which is perhaps not ideal, but I think it is manageable. One could imagine a 'metric name formatter' or something of that sort. Do you think it would be valuable to have a means to control the position of the 'tags' when they are rendered into a singular string for system like Graphite?

@StevenLeRoux

This comment has been minimized.

Show comment
Hide comment
@StevenLeRoux

StevenLeRoux Apr 10, 2014

I think it's valuable for large deployment when you have indexed tags so that you can discover and fetch your metrics from this index instead of proceed to a full scan of your entries or your tree.

StevenLeRoux commented Apr 10, 2014

I think it's valuable for large deployment when you have indexed tags so that you can discover and fetch your metrics from this index instead of proceed to a full scan of your entries or your tree.

@udoprog

This comment has been minimized.

Show comment
Hide comment
@udoprog

udoprog Apr 24, 2014

Contributor

Here is another initiative to start thinking in terms of tags;
http://metrics20.org/

Contributor

udoprog commented Apr 24, 2014

Here is another initiative to start thinking in terms of tags;
http://metrics20.org/

@ryantenney ryantenney added this to the 4.0.0 milestone Aug 18, 2014

@ryantenney

This comment has been minimized.

Show comment
Hide comment
@ryantenney

ryantenney Aug 18, 2014

Member

Thank you for your work on this. Some version of this will be included in v4, without a doubt. Work won't begin in earnest on v4 for some time, but I'll certainly continue this discussion at that time.

Member

ryantenney commented Aug 18, 2014

Thank you for your work on this. Some version of this will be included in v4, without a doubt. Work won't begin in earnest on v4 for some time, but I'll certainly continue this discussion at that time.

@salilsurendran

This comment has been minimized.

Show comment
Hide comment
@salilsurendran

salilsurendran Jun 10, 2015

In this particular PR the name has been replaced with a class MetricName. Instead of this wouldn't it be simpler to have tags supplied when one creates a counter, gauge or timer etc via the MetricsRegistry.counter, .gauge() etc.

salilsurendran commented Jun 10, 2015

In this particular PR the name has been replaced with a class MetricName. Instead of this wouldn't it be simpler to have tags supplied when one creates a counter, gauge or timer etc via the MetricsRegistry.counter, .gauge() etc.

@ryantenney

This comment has been minimized.

Show comment
Hide comment
@ryantenney

ryantenney Jun 10, 2015

Member

I'm confident that this MetricName approach will be the most flexible and user friendly way to implement tags.

Member

ryantenney commented Jun 10, 2015

I'm confident that this MetricName approach will be the most flexible and user friendly way to implement tags.

@salilsurendran

This comment has been minimized.

Show comment
Hide comment
@salilsurendran

salilsurendran Jun 10, 2015

Could you please let me know why a MetricName would be friendly way to implement tags? Suppose if I create a counter like this:

MetricRegistry.counter("name", Map<String,String>);
counter.increment();

Are there any underlying functionality that would prevent an approach I mentioned above?

salilsurendran commented Jun 10, 2015

Could you please let me know why a MetricName would be friendly way to implement tags? Suppose if I create a counter like this:

MetricRegistry.counter("name", Map<String,String>);
counter.increment();

Are there any underlying functionality that would prevent an approach I mentioned above?

@ryantenney

This comment has been minimized.

Show comment
Hide comment
@ryantenney

ryantenney Jun 10, 2015

Member

There's nothing that would prevent us from doing that. This is simply a better API, in my opinion:

MetricName counterName = new MetricName("counter").tagged("tag1", "value1");
MetricRegistry.counter(counterName)  // counter { tag1=value1 }
MetricRegistry.counter(counterName.tagged("tag2", "value2"));  // counter { tag1=value1, tag2=value2 }
Member

ryantenney commented Jun 10, 2015

There's nothing that would prevent us from doing that. This is simply a better API, in my opinion:

MetricName counterName = new MetricName("counter").tagged("tag1", "value1");
MetricRegistry.counter(counterName)  // counter { tag1=value1 }
MetricRegistry.counter(counterName.tagged("tag2", "value2"));  // counter { tag1=value1, tag2=value2 }
@salilsurendran

This comment has been minimized.

Show comment
Hide comment
@salilsurendran

salilsurendran Jun 10, 2015

But doing this results in 50 file changes in your PR. Whereas if you go with what I suggest you will only have to change the MetricRegistory methods to accept tags. I was wondering if there is a reason related to functionality that requires the use of MetricName?

salilsurendran commented Jun 10, 2015

But doing this results in 50 file changes in your PR. Whereas if you go with what I suggest you will only have to change the MetricRegistory methods to accept tags. I was wondering if there is a reason related to functionality that requires the use of MetricName?

@ryantenney

This comment has been minimized.

Show comment
Hide comment
@ryantenney

ryantenney Jun 12, 2015

Member

Committed to a branch. Thanks @udoprog!

Member

ryantenney commented Jun 12, 2015

Committed to a branch. Thanks @udoprog!

@ryantenney ryantenney closed this Jun 12, 2015

@udoprog

This comment has been minimized.

Show comment
Hide comment
@udoprog

udoprog Jun 12, 2015

Contributor

@ryantenney Wow, must admit I was not expecting this. kudos for looking at it again!

At my company we've implemented something similar (but we went with the name MetricId), what we ended up doing was replace MetricRegistry and start from scratch. We'll be releasing the library pretty soon.

These are a few things we've learned along the way;

  • Most of the existing MetricSet's did not structure the things that we wanted them to, so we started from scratch and re-implemented the ones we wanted with tags in mind. Strong conventions are a must (e.g. every metric must have a distinct key and a unit tag that makes sense).
  • Immutability of MetricId is really neat, but expensive, copying the underlying Map degrades performance in the scenarios where you want to have a dynamic field and lots of measurements. We had to implement a MetricIdCache<K, MetricId> to cope with this.
  • It is very beneficial for MetricName and MetricRegistry to be interfaces that are distributred separately from the core, it makes it easier for frameworks to integrate with metrics. It also makes it easier to version everything (new packages when breaking backwards compatibility) and providing shims for frameworks which are slow adopters.

@salilsurendran
Replacing the identifier of a time series with a concrete type has a few benefits.

Refactoring probably does not have to happen again. E.g. the behavior of a MetricName and its underlying implementation can change freely without having to modify signatures and method invocations all over the place.

Concretely about what you suggest, if you encapsulate the tags of a metric in the API, every method that interacts with metrics would have to support the new way of doing things, and in order to make use of it, all these 50 files would have to be modified at one point or another.

String's are in my opinion poor metric identifiers, they require encoding to encapsulate non-trivial information (like tags). And they have very strong pre-defined behaviour that can not (should not) be extended or modified (final). Two things i'd want for a metric identifier in a framework.


I hope any of this will be useful, I'd be thrilled to help out on this if possible.

Contributor

udoprog commented Jun 12, 2015

@ryantenney Wow, must admit I was not expecting this. kudos for looking at it again!

At my company we've implemented something similar (but we went with the name MetricId), what we ended up doing was replace MetricRegistry and start from scratch. We'll be releasing the library pretty soon.

These are a few things we've learned along the way;

  • Most of the existing MetricSet's did not structure the things that we wanted them to, so we started from scratch and re-implemented the ones we wanted with tags in mind. Strong conventions are a must (e.g. every metric must have a distinct key and a unit tag that makes sense).
  • Immutability of MetricId is really neat, but expensive, copying the underlying Map degrades performance in the scenarios where you want to have a dynamic field and lots of measurements. We had to implement a MetricIdCache<K, MetricId> to cope with this.
  • It is very beneficial for MetricName and MetricRegistry to be interfaces that are distributred separately from the core, it makes it easier for frameworks to integrate with metrics. It also makes it easier to version everything (new packages when breaking backwards compatibility) and providing shims for frameworks which are slow adopters.

@salilsurendran
Replacing the identifier of a time series with a concrete type has a few benefits.

Refactoring probably does not have to happen again. E.g. the behavior of a MetricName and its underlying implementation can change freely without having to modify signatures and method invocations all over the place.

Concretely about what you suggest, if you encapsulate the tags of a metric in the API, every method that interacts with metrics would have to support the new way of doing things, and in order to make use of it, all these 50 files would have to be modified at one point or another.

String's are in my opinion poor metric identifiers, they require encoding to encapsulate non-trivial information (like tags). And they have very strong pre-defined behaviour that can not (should not) be extended or modified (final). Two things i'd want for a metric identifier in a framework.


I hope any of this will be useful, I'd be thrilled to help out on this if possible.

@davidkarlsen

This comment has been minimized.

Show comment
Hide comment
@davidkarlsen

davidkarlsen Oct 2, 2016

Really looking forward to this - what is the timeframe for a v4 release?

davidkarlsen commented Oct 2, 2016

Really looking forward to this - what is the timeframe for a v4 release?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment