Prometheus metrics? #2

emmanuel · 2017-10-24T16:15:07Z

On the blog articles that correspond to this repository the table of contents indicates that Prometheus metrics exposition was intended as the fourth (next) article in the series.

I'm very interested in Prometheus exposition from Envoy, and I'm wondering if you could give me a hint about how you were planning to go about it. I have a couple of thoughts so far. First, I see that Envoy's native /stats endpoint on the admin port is a bit like statsd output.

I was thinking a bit about using the Prometheus statsd-exporter in order to provide a sink for Envoy to flush to, but I'm not yet familiar enough with the Envoy stats format to have a sense of how many rules would be called for to extract the varying portions of the stat names.
Envoy itself could be extended to natively provide metrics in Prometheus's exposition format. I'm not yet familiar enough with the codebase to have a sense of how much effort would be called for to take this approach, though the docs signal that it shouldn't be too hard.

What do you think? Were you imagining going for one of the above approaches, or do you have something else in mind? I'm interested to implement this (depending on effort required), so if you have a thought of a good route, I may come back shortly with a pull request.

The text was updated successfully, but these errors were encountered:

emmanuel · 2017-10-24T16:18:45Z

With regard to option 2 above, my comment about expected effort is based on the comment here: https://www.envoyproxy.io/envoy/intro/arch_overview/statistics.html

Envoy uses statsd as the statistics output format, though plugging in a different statistics sink would not be difficult.

That said, the comment is with regard to a push-based stats approach, rather than a pull-based one like Prometheus.

christian-posta · 2017-10-26T01:14:07Z

On Tue, Oct 24, 2017 at 9:15 AM, Emmanuel Gomez ***@***.***> wrote: On the blog articles that correspond to this repository the table of contents indicates that Prometheus metrics exposition was intended as the fourth (next) article in the series.

Yep! I kinda got side tracked with another topic, but will return to this series about Envoy.

I'm very interested in Prometheus exposition from Envoy, and I'm wondering if you could give me a hint about how you were planning to go about it. I have a couple of thoughts so far. First, I see that Envoy's native /stats endpoint on the admin port is a bit like statsd output. 1. I was thinking a bit about using the Prometheus statsd-exporter in order to provide a sink for Envoy to flush to, but I'm not yet familiar enough with the Envoy stats format to have a sense of how many rules would be called for to extract the varying portions of the stat names. 2. Envoy itself could be extended to natively provide metrics in Prometheus's exposition format. I'm not yet familiar enough with the codebase to have a sense of how much effort would be called for to take this approach, though the docs signal that it shouldn't be too hard.

I was hoping to explore #2 honestly and then fallback onto #1 if it proved too much to do in my limited time.

What do you think? Were you imagining going for one of the above approaches, or do you have something else in mind? I'm interested to implement this (depending on effort required), so if you have a thought of a good route, I may come back shortly with a pull request.

That'd be great! Want to explore #1 and see how viable it is?

…

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#2>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AADP0SZVbTxNheYjX7W5aX710xoZjcgpks5svg0LgaJpZM4QEss8> .

-- *Christian Posta* twitter: @christianposta http://www.christianposta.com/blog http://fabric8.io

christian-posta · 2017-10-26T01:14:37Z

Go for it! That'd be a great way to help out the Envoy community as well.

…

On Tue, Oct 24, 2017 at 9:18 AM, Emmanuel Gomez ***@***.***> wrote: With regard to option 2 above, my comment about expected effort is based on the comment here: https://www.envoyproxy.io/envoy/intro/arch_overview/ statistics.html Envoy uses statsd as the statistics output format, though plugging in a different statistics sink would not be difficult. That said, the comment is with regard to a push-based stats approach, rather than a pull-based one like Prometheus. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#2 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AADP0R47GIMFis5t_hEFUvlYRyR3JpIVks5svg3lgaJpZM4QEss8> .

-- *Christian Posta* twitter: @christianposta http://www.christianposta.com/blog http://fabric8.io

emmanuel · 2017-10-26T03:15:32Z

I’m going to try #1 for now.

There’s been at least some discussion in the Envoy project about #2: ~~naively~~ natively supporting a tag-based (dimensional/metrics 2.0) approach, here: envoyproxy/envoy#1536. That said, it doesn’t look like anyone is working on it at the moment.

I would like to contribute native support (#2) but I will start by parsing the current output (#1) and then circle back around if it warrants the effort.

mrice32 · 2017-10-26T15:35:35Z

It would be great to see new stats sinks/exporters added to Envoy. There seems to be a lot of interest around this.

For 2, we just closed 1536 with 1852 just now. This means that tags/dimensions have first class support in Envoy. Please see here and here for documentation on how to configure these tags as a part of the bootstrap config. Please see here for Stats::Metric, which is the base class that encapsulates all of this identifying information for all types of metrics that are flushed to sinks.

However, to take advantage of these tags, there will need to be stats sinks that use them (this currently does not exist in Envoy as the only native stats sink is statsd), which goes back to 1.

Feel free to open an issue in Envoy to discuss creating a sink for Prometheus or if there are any issues you see with the existing tags/dimensions built into Envoy now.

mrice32 · 2017-10-26T15:55:30Z

I'd also like to note that stats sinks have recently become statically pluggable (like filters). This means that you can link in a factory implementation, provide the corresponding string in the config, and Envoy will use your provided sink without any modification to the Envoy codebase (although it is definitely encouraged to add these sinks to the upstream Envoy repo so other users can look it up by default). See the statsd factory and the tcp statsd sink it creates for an example of how to do this.

emmanuel · 2017-10-26T17:34:44Z

@mrice32 thanks for adding your comments. I wrote my comments above before realizing that work was actually underway to implement tagging/labels/dimensions for stats in Envoy. I'm curious why tagging wasn't treated as the canonical representation and converted to strings as needed (based on my presumption that labels provide a complete superset of string-based naming), but I'm nowhere near deep enough in the codebase to make a claim about that being a better approach overall.

As far as taking the tagging support in its current state all the way through to Prometheus support, the first thing I would bring up is that Prometheus uses a pull-based collection model, and as such is not a 'sink' in the same way that statsd, etc are. I suspect that it would therefore make more sense to provide Prometheus formatted metrics as another format on the /stats path of the admin endpoint (in addition to statsd-like text and JSON, as currently available). A Prometheus server would then discover the Envoy instances as 'scrape targets' using one of the available service-discovery mechanisms that Prometheus supports, and would periodically issue requests to the appropriate endpoint on each Envoy instance to retrieve stats from it.

In fact, 'Prometheus format' is actually one or both of two encodings: a Protobuf representation and a text representation. Given that there is already Protobuf machinery in play in Envoy's build system, I'm hopeful that leveraging the existing tooling to support Prometheus's Protobuf format is not unreasonable. In any case, the text format is relatively compact (moreso than JSON, certainly), and I would hope that it be supported regardless of Protobuf support, for the purpose of adhoc debugging and inspection.

Does that make sense? Do you see any major roadblocks to adding another format to the /stats endpoint?

mrice32 · 2017-10-26T18:21:24Z

To respond to your first question, there are two reasons:

For the tags to be the canonical representation, Envoy's codebase would have to be opinionated about what is a tag vs a static portion of the stat name. It was seen as beneficial to allow this to be configurable (with defaults) using regexes rather than static and requiring modification of Envoy internals for a user to change.
Historical reasons. Envoy used to only support statsd, which (as far as I'm aware) accepts static strings from which tags can be extracted downstream. This meant that the existing canonical representation was originally static strings. There was a decent amount of infrastructure in Envoy built around this assumption (for instance, this is embedded in hot restart). Much of this would need to be redesigned to handle a different canonical representation.

As for adding new formats for scraping stats using a pull model, I think that makes sense. I agree that protobuf is probably the right way to go. As usual, it will probably need to be configurable in the Envoy config, but I don't see any reason this would not work or people would be opposed to having new formats available via /stats. Here is where those stats are dispatched in response to /stats. However, as it is noted in the code, timers (we actually changed that to histograms recently, so that should be updated) and histograms are sent to statsd when recorded and not stored. That means that a pull-based approach will not be able to capture those stats unless we implement some storage system for them (non-trivial since this storage would be unbounded without some way to aggregate or delete old samples). Another thing to note is that currently the tags are not included in the /stats response. The counters and gauges are readily available in that method, though, so that shouldn't be difficult to add.

mrice32 · 2017-10-26T18:31:49Z

cc @mattklein123 @htuch

mattklein123 · 2017-10-26T18:46:34Z

There is a lot of general interest in native Prometheus support. I opened an issue to track. Can we please discuss there: envoyproxy/envoy#1947

emmanuel · 2017-10-26T18:47:16Z

Yeah, this conversation definitely belongs there. Thanks @mattklein123.

emmanuel mentioned this issue Oct 26, 2017

stats: native prometheus export support envoyproxy/envoy#1947

Closed

2 tasks

christian-posta closed this as completed Jan 10, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prometheus metrics? #2

Prometheus metrics? #2

emmanuel commented Oct 24, 2017

emmanuel commented Oct 24, 2017 •

edited

Loading

christian-posta commented Oct 26, 2017 via email

christian-posta commented Oct 26, 2017 via email

emmanuel commented Oct 26, 2017 •

edited

Loading

mrice32 commented Oct 26, 2017 •

edited

Loading

mrice32 commented Oct 26, 2017 •

edited

Loading

emmanuel commented Oct 26, 2017

mrice32 commented Oct 26, 2017 •

edited

Loading

mrice32 commented Oct 26, 2017

mattklein123 commented Oct 26, 2017

emmanuel commented Oct 26, 2017

Prometheus metrics? #2

Prometheus metrics? #2

Comments

emmanuel commented Oct 24, 2017

emmanuel commented Oct 24, 2017 • edited Loading

christian-posta commented Oct 26, 2017 via email

christian-posta commented Oct 26, 2017 via email

emmanuel commented Oct 26, 2017 • edited Loading

mrice32 commented Oct 26, 2017 • edited Loading

mrice32 commented Oct 26, 2017 • edited Loading

emmanuel commented Oct 26, 2017

mrice32 commented Oct 26, 2017 • edited Loading

mrice32 commented Oct 26, 2017

mattklein123 commented Oct 26, 2017

emmanuel commented Oct 26, 2017

emmanuel commented Oct 24, 2017 •

edited

Loading

emmanuel commented Oct 26, 2017 •

edited

Loading

mrice32 commented Oct 26, 2017 •

edited

Loading

mrice32 commented Oct 26, 2017 •

edited

Loading

mrice32 commented Oct 26, 2017 •

edited

Loading