Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connect: Option to Expose Envoy's /stats/ for agents like DataDog that can read from there. #7070

Closed
banks opened this issue Jan 16, 2020 · 6 comments · Fixed by #7173
Closed
Labels
good first issue A well-defined bug or improvement with sufficient context which should be approachable for new contr theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies type/enhancement Proposed improvement or new feature

Comments

@banks
Copy link
Member

banks commented Jan 16, 2020

Currently we allow configuring Prometheus as a metrics sink for Envoy. To do this we configure a pass-through listener on each proxy that allows restricted access to Envoy's internal admin API but currently limits it to /stats/prometheus.

DataDog is supported natively in Envoy via it's statsd compatible interface, however it's latest service mesh integration relies on the agent polling Envoy's /stats endpoint.

It's currently possible to work around this using Expose Paths in the service registration but having the intenal endpont be Envoy's own one on localhost rather than something on the app instance.

But it would be better to have a first class experience equivalent to configuring prometheus on the whole mesh that automatically exposes the full /stats path not just the prometheus part.

The good news is this is very little code to write - the prometheus functionality and tests are almost exactly what we need, we just need a different Envoy config field and when that's used we change the path below to /stats/ rather than /stats/prometheus:

"prefix_rewrite": "/stats/prometheus"

There are other subtle changes needed possible - the URL we expose on the listener should probably be /stats to mirror native Envoy and we should probably re-use that huge method between the two options and just allow passing in those parts.

@banks banks added type/enhancement Proposed improvement or new feature good first issue A well-defined bug or improvement with sufficient context which should be approachable for new contr theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies labels Jan 16, 2020
@tpaschalis
Copy link
Contributor

tpaschalis commented Jan 23, 2020

@banks I'd like to take a stab! I have little experience with consul, so it might take a few days, hope it's okay!

@tpaschalis
Copy link
Contributor

tpaschalis commented Jan 24, 2020

@banks I've got the general idea on what we're trying to accomplish, but I want to ask a question before getting my hands dirty. I'm not 100% clear on whether

  • we want a new config flag that will either expose /stats/ or /stats/prometheus/ or
  • we will duplicate the functionality of envoy_prometheus_bind_addr to another field, eg. envoy_stats_bind_addr with another listener/proxy just for /stats/?

Should these two endpoints be able to be exposed at the same time? Thanks in advance!

@banks
Copy link
Member Author

banks commented Jan 27, 2020

Hey @tpaschalis would be great to have a contribution here!

I think the second of your bullet points is more what I had in mind. The main reason is to avoid a backward incompatible change for folks using envoy_prometheus_bind_addr already - they could do the equivalent with the new setting but would have to update the prometheus configs to hit a different path (/stats/prometheus instead of /metrics) as we expose it now.

It would be possible to achieve either or both with only a single listener with multiple routes setup - currently the prometheus option configures a route for /metrics pointing to /stats/prometheus. So both could exist on the same listener, but I think we'd need a new config to keep it cleaner.

Should these two endpoints be able to be exposed at the same time?

I don't think that is super important honestly, the stats one will be a superset so folks can use prometheus still with only that set, just at a different path. I think whatever is simplest to code and document is the best plan.

Let me know if you have any other questions.

Thanks for jumping in!

@tpaschalis
Copy link
Contributor

tpaschalis commented Jan 27, 2020

Thanks for your comment! I started to implement something like this over the weekend. In addition to the PrometheusBindAddr config field, I have added a StatsBindAddr field.

EDIT :
I've opened a Draft PR, so feel free to take a look.
I've configured two separate listeners, as it simplified the code by not having to handle the cases were PrometheusBindAddr and StatsBindAddr were equal or not (and then either configure two listeners with one route or one listener with two routes. I can also include the other alternative if you wish.

Again, thanks for your time.

@banks
Copy link
Member Author

banks commented Jan 28, 2020 via email

@tpaschalis
Copy link
Contributor

tpaschalis commented Jan 30, 2020

Hey, thanks for your response. I did run the existing tests and fiddled with adding a new one. I force-pushed my local branch a couple of times until I understood how things work, sorry about that.

In any case, I've opened a new PR, that contains the code and the new case-stats-proxy test, based on case-prometheus.
I wasn't sure what I should test from the statsd interface, so initially it looks for the following things in the response

  • http.envoy_metrics.downstream_rq_active for http metrics
  • cluster.local_agent.upstream_rq_active to detect the cluster
  • http.public_listener_http to detect the listener

Looking forward to your comments and observations! Thanks in advance ^^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue A well-defined bug or improvement with sufficient context which should be approachable for new contr theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies type/enhancement Proposed improvement or new feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants