Feature Request: metrics or instrumentation #1526

sevagh · 2020-05-16T18:39:02Z

It would be nice to have a /metrics endpoint that exposes Prometheus-style metrics.

Note that I may be biased from working mostly within the Prometheus ecosystem; perhaps a more general metrics library, with different exposition formats to choose from, could work? I also don't know which Haskell metrics or Prometheus libraries exist or are any good (though Hackage shows there might be some).

I'm not sure which metrics are best to expose; I bet the PostgREST developers know more about the essential PostgREST KPIs. But I have in mind things like:

postgrest_query_execution_time (histogram)
postgrest_http_requests
postgrest_schema_reloads
...
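For illustration, this is roughly what the metrics suggested above could look like in the Prometheus text exposition format. It is a minimal Python sketch, not PostgREST code: the helper names (`render_counter`, `render_histogram`, `render_example`), the help strings, the bucket boundaries, and every sample value are made up for the example.

```python
def render_counter(name, help_text, value):
    """Render one counter in the Prometheus text exposition format."""
    return [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} counter",
        f"{name} {value}",
    ]


def render_histogram(name, help_text, bounds, observations):
    """Render a histogram; bucket counts are cumulative, as Prometheus expects."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} histogram"]
    for bound in bounds:
        count = sum(1 for o in observations if o <= bound)
        lines.append(f'{name}_bucket{{le="{bound}"}} {count}')
    # The +Inf bucket always equals the total number of observations.
    lines.append(f'{name}_bucket{{le="+Inf"}} {len(observations)}')
    lines.append(f"{name}_sum {sum(observations)}")
    lines.append(f"{name}_count {len(observations)}")
    return lines


def render_example():
    # All values below are invented sample data.
    lines = []
    lines += render_counter("postgrest_http_requests", "HTTP requests served.", 42)
    lines += render_counter("postgrest_schema_reloads", "Schema cache reloads.", 3)
    lines += render_histogram(
        "postgrest_query_execution_time",
        "Query execution time in seconds.",
        bounds=[0.01, 0.1, 1.0],
        observations=[0.004, 0.03, 0.25, 2.5],
    )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_example(), end="")
```

The metric names are kept exactly as suggested above; Prometheus naming conventions would usually add suffixes such as `_total` for counters and `_seconds` for durations.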

steve-chavez · 2020-07-10T14:57:07Z

Needs more discussion, but it could be a good idea!

One thing I'd like to note for now is that we can't use a top-level /metrics endpoint because it would conflict with users' routes (one example of using a metrics table here). So maybe we can come up with a prefix, like /pgrst/metrics or /internal/metrics.

sevagh · 2020-07-10T15:29:50Z

That sounds good. Given PostgREST + NGINX is the recommended/common deployment, one could easily expose the internal metrics prefix at their desired location.

steve-chavez · 2021-09-03T18:21:37Z

On #1933, we were thinking of using a special header for this, to avoid creating an extra route.

Seems Prometheus doesn't support adding headers for scraping though 😞 prometheus/prometheus#1724

> PostgREST + NGINX is the recommended/common deployment

But since Nginx would be present, I guess it's not a problem, because it can map the header to a URL. Another option in the future might be #1909 as well. So a special header would do.

Edit: ref https://github.com/qnikst/prometheus-haskell

darora · 2021-09-22T01:47:01Z

Hmm, at least for liveness/health checks, we'd want to hit the PostgREST instances directly, rather than going through NGINX or a similar rewriting layer.

Even with an NGINX setup, I'd imagine we generally have it load balancing between multiple PostgREST instances transparently, so even for metrics you'd likely want to hit each instance directly, rather than going through NGINX.

One alternative would be to use a secondary port to host endpoints for liveness/metrics etc. This would avoid creating a breaking change where we reserve e.g. /internal or a similar base path, and trivially allow users to not expose these endpoints externally.
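The secondary-port idea can be sketched as two HTTP servers in one process, each with its own routes. This is only a Python illustration of the design, not how PostgREST implements it: `make_handler`, `start_server`, and `fetch` are hypothetical helpers, and the routes and response bodies are invented. Because the two servers do not share a route table, `/metrics` on the public port can stay a user route while `/metrics` on the admin port serves instrumentation.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(routes):
    """Build a handler class serving the fixed path -> body mapping `routes`."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = routes.get(self.path)
            status = 200 if body is not None else 404
            payload = (body if body is not None else "not found\n").encode()
            self.send_response(status)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    return Handler


def start_server(host, port, routes):
    """Start an HTTP server on a background thread and return it."""
    server = ThreadingHTTPServer((host, port), make_handler(routes))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server, path):
    """GET `path` from `server` directly, with no proxy in between."""
    host, port = server.server_address[:2]
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read().decode()
    finally:
        conn.close()


if __name__ == "__main__":
    # Port 0 asks the OS for a free port; a real deployment would pin both.
    api = start_server("127.0.0.1", 0, {"/metrics": "rows of a user table named metrics\n"})
    admin = start_server("127.0.0.1", 0, {"/live": "ok\n", "/metrics": "postgrest_http_requests 42\n"})
    print(fetch(api, "/metrics"), end="")
    print(fetch(admin, "/metrics"), end="")
    api.shutdown()
    admin.shutdown()
```

Binding the admin server to a loopback or cluster-internal address is what keeps these endpoints unexposed, and a scraper can call `fetch` against each instance's admin port instead of going through the load balancer.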

rupurt · 2022-05-05T07:04:51Z

Can Haskell run multiple web servers on different ports? That's how it's typically done in other languages, so that you don't have route path conflicts, and you typically don't expose the Prometheus web server port to the outside world.

steve-chavez · 2022-05-05T15:54:59Z

@rupurt Yeah, we already have that on latest https://postgrest.org/en/latest/configuration.html#admin-server-port

rupurt · 2022-05-05T16:19:57Z

@steve-chavez awesome. Would be great to have Prometheus metrics in there 😄

uhbif19 · 2022-08-09T19:34:28Z

Is it okay to use https://hackage.haskell.org/package/prometheus for this task?

steve-chavez · 2022-08-10T21:36:10Z

@uhbif19 Yes, that one should do.

For posterity, #2129 was closed but the pool metrics discussed there would still be useful.

steve-chavez · 2022-09-16T18:25:47Z

From #2477

It would be great if we can get
1. GC
2. Query response times
3. Requests queued, DB connection pool usage count etc
metrics on the Admin Server port at maybe /met

For 2, I was actually referring to the time taken for a request's round trip from PostgREST to the DB.
It's helpful in scenarios where we observe high latencies from PostgREST but the DB takes only a few milliseconds. Such issues could be due to high load on PostgREST pods, CPU throttling, connection pool crunch, etc. (in the k8s world).
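A round-trip measurement like the one described in point 2 could be taken by timing the whole database call on the PostgREST side and recording it in a histogram. The sketch below is Python and purely illustrative: `LatencyHistogram`, `timer`, and the bucket bounds are hypothetical names and values, not part of PostgREST. Comparing this client-side figure with the execution time the database itself reports is what would reveal time lost to pool contention or CPU throttling.

```python
import time
from contextlib import contextmanager


class LatencyHistogram:
    """Counts observations into cumulative buckets (upper bounds in seconds)."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.bucket_counts = [0] * len(self.bounds)
        self.count = 0
        self.total = 0.0

    def observe(self, seconds):
        self.count += 1
        self.total += seconds
        # Cumulative: an observation counts in every bucket it fits under.
        for i, bound in enumerate(self.bounds):
            if seconds <= bound:
                self.bucket_counts[i] += 1

    @contextmanager
    def timer(self, clock=time.monotonic):
        """Time the enclosed block and record it, even if the block raises."""
        start = clock()
        try:
            yield
        finally:
            self.observe(clock() - start)


# Hypothetical usage: wrap whatever performs the database round trip.
db_round_trip = LatencyHistogram([0.005, 0.05, 0.5])


def run_query(execute):
    """Run `execute` (any callable) and record how long the round trip took."""
    with db_round_trip.timer():
        return execute()
```

A monotonic clock is used so that wall-clock adjustments cannot produce negative or inflated durations.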

bhupixb · 2023-03-27T09:41:22Z

Has this feature been released in any 10.x version?

steve-chavez · 2023-03-27T10:27:00Z

Nope, not yet implemented.

(Issue would be closed if so)

steve-chavez · 2023-06-08T19:16:05Z

These days it seems we should be using OpenTelemetry instead of Prometheus. Maybe with:

Article about the differences: https://www.timescale.com/blog/prometheus-vs-opentelemetry-metrics-a-complete-guide/

steve-chavez · 2023-12-13T20:39:00Z

A metric for the connection pool's max acquisition time would be helpful to prevent acquisition timeouts: as it climbs, it approaches the timeout.
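One way to picture such a metric: keep the longest observed wait for a pool connection and compare it with the configured timeout. This is a small Python sketch under assumed names (`AcquisitionTracker`, `headroom`, `near_timeout`) and an arbitrary 80% warning threshold; none of it is taken from PostgREST.

```python
class AcquisitionTracker:
    """Tracks the longest wait for a pool connection against the timeout."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.max_wait_seconds = 0.0

    def record(self, wait_seconds):
        """Record one acquisition wait; only the maximum is kept."""
        self.max_wait_seconds = max(self.max_wait_seconds, wait_seconds)

    def headroom(self):
        """Fraction of the timeout not yet used by the worst wait (0.0 = at timeout)."""
        used = min(self.max_wait_seconds / self.timeout_seconds, 1.0)
        return 1.0 - used

    def near_timeout(self, threshold=0.8):
        """True once the worst wait has used at least `threshold` of the timeout."""
        return self.max_wait_seconds >= threshold * self.timeout_seconds
```

Exported as a gauge, `max_wait_seconds` would let an alert fire while there is still headroom, before requests actually start failing with acquisition timeouts.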

Also OpenTelemetry Traces seem to correspond to Server Timing?

develop7 · 2023-12-15T15:24:13Z

@steve-chavez yep, they seem to be a perfect match.
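To make the correspondence concrete: the per-phase durations that a Server-Timing header carries can also be laid out as trace-style spans. The Python sketch below assumes the phases run one after another and uses invented phase names and durations; `server_timing_header` and `phases_to_spans` are hypothetical helpers, and the span dictionaries are a simplification rather than the OpenTelemetry data model.

```python
def server_timing_header(phases):
    """Format (name, duration_seconds) pairs as a Server-Timing header value.

    Each entry becomes `name;dur=<milliseconds>`; entries are comma-separated.
    """
    return ", ".join(f"{name};dur={seconds * 1000:.1f}" for name, seconds in phases)


def phases_to_spans(phases):
    """Lay the same phases end to end as trace-like spans (offsets in seconds).

    Assumes the phases are strictly sequential, which is an assumption of this
    sketch rather than something stated in the thread.
    """
    spans, offset = [], 0.0
    for name, seconds in phases:
        spans.append({"name": name, "start": offset, "end": offset + seconds})
        offset += seconds
    return spans


if __name__ == "__main__":
    # Invented sample phases and durations.
    phases = [("jwt", 0.5), ("plan", 0.25)]
    print("Server-Timing:", server_timing_header(phases))
    for span in phases_to_spans(phases):
        print(span)
```

The header only carries durations, while spans also carry start times and nesting, so a trace is a superset of what Server-Timing can express.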

steve-chavez mentioned this issue Oct 23, 2020

SUGGESTION - api "signal handler" to reset cache - SIGUSR1 #1622

Closed

steve-chavez mentioned this issue Sep 3, 2021

Add a proper / minimal health check endpoint #1933

Closed

steve-chavez mentioned this issue Dec 28, 2021

feat: minimal health check #2092

Merged

4 tasks

wolfgangwalther mentioned this issue Jan 20, 2022

Add pool stats to /metrics admin endpoint #2129

Closed

steve-chavez added the "enhancement" label (a feature, ready for implementation) Sep 16, 2022

steve-chavez mentioned this issue Sep 16, 2022

[Feature Request] Response times & GC metrics #2477

Closed

develop7 mentioned this issue Oct 11, 2023

Add more data to Server-Timing header #2983

Merged

3 tasks

develop7 mentioned this issue Dec 15, 2023

OpenTelemetry integration tracking issue #3118

Open

8 tasks

steve-chavez mentioned this issue Feb 7, 2024

Empty error message, db-pool size surpassed #3214

Open

steve-chavez mentioned this issue Apr 18, 2024

feat: connection pool metrics in admin server #3420

Merged

steve-chavez closed this as completed in #3420 Apr 24, 2024
