Survey existing metrics definitions across existing libraries #3

tsloughter · 2019-05-30T14:30:06Z

From the meeting notes where this action item was created:

Lower level Telemetry.Metrics interface in Erlang
- Currently using Structs and Protocols, so hard to convert to Erlang
- Docs might not be as good
- Intention with Phoenix 1.5 is to include this by default, so might not be as seamless
- The API needs to be really good because end-user developers are going to interact with it; not just library authors
- Main issue is with reporters, because if the internal data structures are different, they’d need to support both - need some kind of abstraction that both can handle (like maps)
- How will this interact with OpenTelemetry’s metrics feature set? Probably a lot of overlap, so we need to make sure that it’s not too confusing for people.
- Action: Arkadiusz to Survey existing metrics definitions across existing libraries (Prometheus, OpenCensus, Statix, Telemetry.Metrics) before next meeting

hauleth · 2019-05-30T14:51:22Z

> Currently using Structs and Protocols, so hard to convert to Erlang

I haven't found any usage of protocols in telemetry_metrics. Have I missed something? As for structs, Elixir provides quite easy support for records (without support for protocols, though), so I think it shouldn't be much of a problem.

> Intention with Phoenix 1.5 is to include this by default, so might not be as seamless

An Erlang implementation can still provide an Elixir-like API. BTW the same should be done for telemetry itself to provide more seamless migration for consumers.

> How will this interact with OpenTelemetry’s metrics feature set?

I would suggest that we ignore the direct API in OT and instead "force" users to always use telemetry for sending data to OT, which should be the only consumer. That way we would sacrifice some part of the OT spec for a better user experience.

As for existing metric types, the most common ones I am aware of are:

- counter/sum - these two are equivalent
- histogram, plus sometimes more specialized versions of it, like timing
- gauge/value - a single value at the measurement time

Some other tools also provide metrics like meter, which works like taking the derivative of a gauge, but I think that is out of scope for telemetry_metrics.

arkgil · 2019-06-06T10:35:00Z

> BTW the same should be done for telemetry itself to provide more seamless migration for consumers.

Do you mean creating an Elixir module delegating to the Erlang one?

hauleth · 2019-06-06T10:37:01Z

@arkgil yes. It could even be written in Erlang, but in general it should be made easy for consumers to "migrate" to newer versions.

arkgil · 2019-06-06T10:52:05Z

@hauleth I'm not sure what you mean, or maybe I don't see the problem we're trying to solve here 😄

Regarding use of records, I would vote against it, because IMO they are problematic when they show up in stacktraces. I would say that if we aim to have a common structure for both Erlang and Elixir, then maps are the way to go (they might be structs on the Elixir side, although that too might confuse folks when debugging from Erlang).

arkgil · 2019-06-06T14:05:48Z

As Łukasz wrote in a comment above, the metric types supported by the existing libraries fall into the following buckets:

- a metric counting the number of measurements. AFAIK this kind of counter is supported only by OpenCensus and Telemetry.Metrics, i.e. other libraries allow incrementing/decrementing the counter by an arbitrary value
- a metric summing up recorded measurements
- a metric keeping track of the last recorded measurement
- a metric building a histogram of recorded values
- a metric exposing a set of basic statistics about recorded values, like minimum, maximum, mean, chosen percentiles etc. The set of statistics varies depending on the library/system
- other, more sophisticated time-series analyses, like moving weighted averages or derivatives
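
To make the first five buckets concrete, here is a minimal, language-neutral sketch in Python that computes each aggregation over the same list of measurements. The sample values, the bucket bounds and the inclusive-upper-bound rule are assumptions chosen for illustration; none of this is the API of any of the surveyed libraries.

```python
from bisect import bisect_left
from statistics import mean

# Hypothetical measurements, e.g. request durations in milliseconds.
measurements = [12, 7, 30, 7, 45]

# 1. Counter: how many measurements were recorded, ignoring their values.
count = len(measurements)

# 2. Sum: the total of all recorded values.
total = sum(measurements)

# 3. Last value: only the most recent measurement is kept.
last_value = measurements[-1]

# 4. Histogram: number of measurements per bucket.
#    Assumed bucket upper bounds (inclusive); the final slot is the overflow bucket.
bounds = [10, 25, 50]
histogram = [0] * (len(bounds) + 1)
for m in measurements:
    histogram[bisect_left(bounds, m)] += 1

# 5. Summary: a handful of basic statistics over the recorded values.
summary = {
    "min": min(measurements),
    "max": max(measurements),
    "mean": mean(measurements),
}
```

Note that the counter and the sum differ only in whether the measurement value is used, which is why libraries that let you increment by an arbitrary amount effectively offer a sum rather than a pure counter.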

When it comes to defining metrics, most of the libraries use a "registry" approach: you call a function, the metric is registered somewhere globally, and the registry is queried whenever the metric is updated or needs to be exported. I haven't found a library other than Telemetry.Metrics which uses plain data structures for defining metrics and passing them around.
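
The two definition styles can be contrasted with a small Python sketch. Everything here is hypothetical (`REGISTRY`, `register_counter`, `Counter`, `reporter_state`, the metric and event names); it only illustrates the shape of each approach and does not mirror any real library's API.

```python
from dataclasses import dataclass

# --- Registry style: defining a metric mutates shared global state. ---
REGISTRY = {}  # hypothetical global registry: metric name -> current value

def register_counter(name):
    REGISTRY[name] = 0

def increment(name, by=1):
    # Updates look the metric up in the registry by name.
    REGISTRY[name] += by

def export_all():
    # An exporter queries the registry for everything that was registered.
    return dict(REGISTRY)

register_counter("http.requests")
increment("http.requests")
increment("http.requests")
exported = export_all()

# --- Plain-data style: a metric definition is just an immutable value. ---
@dataclass(frozen=True)
class Counter:
    name: str   # name under which the metric is reported
    event: str  # event whose occurrences are counted

def reporter_state(metrics):
    # A reporter receives the definitions explicitly and owns the state itself.
    return {m.name: 0 for m in metrics}

metric = Counter(name="http.requests.count", event="http.request.done")
state = reporter_state([metric])
```

In the second style nothing global changes when `Counter(...)` is constructed; the definition only takes effect once it is handed to a reporter.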

bryannaegele · 2019-07-03T17:01:55Z

> I haven't found a library other than Telemetry.Metrics which uses plain data structures for defining metrics and passing them around.

How many of those are attempting to interact with multiple implementations without the use of an agent though? I see one of the benefits of using data structures to define metrics is the flexibility they provide for simple migrations via reporters. OpenCensus is the only one I'm aware of that attempts abstracting the destination but moves that abstraction to the agent.

arkgil · 2019-07-05T10:45:27Z

exometer, folsom and metrics (which uses the first two as backends) are all quite popular (judging by the number of downloads on Hex) and allow exporting metrics to multiple external systems.
The idea is that reporters subscribe to metric updates and are notified every x seconds that they should export the metric.

arkgil · 2019-07-05T10:53:07Z

To me, the difference between using a registry and data structures boils down to these two things:

- With data structures, we need to tell the reporter which metrics it should export. With a registry, we can register metrics earlier and tell the reporter either which ones it should use or which ones it should ignore.
- With data structures, it's not possible for libraries to register metrics; they can only emit events using Telemetry, which gives more control to the user. With a registry, libraries could register metrics directly so that the user can export them.
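
A rough Python sketch of the data-structure side of this trade-off: the library only emits events, and the user decides which metrics exist by passing definitions to a reporter. The dispatcher (`attach`, `emit`), the `Reporter` class, the `(kind, event)` tuples and the event names are all assumptions made for illustration and are not the Telemetry API.

```python
from collections import defaultdict

# Hypothetical event dispatcher standing in for an event library.
handlers = defaultdict(list)  # event name -> list of callbacks

def attach(event, handler):
    handlers[event].append(handler)

def emit(event, measurement):
    for handler in handlers[event]:
        handler(measurement)

# Library code: it knows nothing about metrics, it only emits events.
def handle_request(duration_ms):
    emit("http.request.done", duration_ms)

def run_query(duration_ms):
    emit("db.query.done", duration_ms)

class Reporter:
    """Aggregates only the metrics the user explicitly passed in."""

    def __init__(self, metrics):
        # metrics: list of (kind, event) tuples chosen by the user.
        self.values = {}
        for kind, event in metrics:
            self.values[(kind, event)] = 0
            attach(event, self._make_handler(kind, event))

    def _make_handler(self, kind, event):
        def handler(measurement):
            if kind == "counter":
                self.values[(kind, event)] += 1
            elif kind == "sum":
                self.values[(kind, event)] += measurement
        return handler

# User code: select which metrics to derive from the library's events.
reporter = Reporter([
    ("counter", "http.request.done"),
    ("sum", "http.request.done"),
])

handle_request(120)
handle_request(80)
run_query(15)  # emitted by the library, but no metric was defined for it
```

Here the `db.query.done` event is simply dropped because the user never asked for a metric on it, whereas with a registry the library could have registered such a metric on its own.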

tsloughter assigned arkgil May 30, 2019
