Prometheus Metrics - Noise From Prometheus Probes #943

Closed
omad opened this issue Jun 1, 2023 · 1 comment

Comments

Member

omad commented Jun 1, 2023

Background

I'm trying to improve the dashboards used by Digital Earth Australia for monitoring our Datacube OWS deployment, as used for DEA Maps.

We have Datacube OWS deployed into Kubernetes using the ODC Helm Chart, with metrics being recorded by Prometheus, and a dashboard created within Grafana.

The problem I'm having is dealing with noise in the metrics from the automated HTTP requests that K8s makes to monitor the health of all the OWS instances. Kubernetes provides three probe types: startup, readiness and liveness. We have the startup and readiness probes set up to make a WMS GetMap request for a rarely used layer, and the liveness probe set up to hit the /ping endpoint.

Noise Problem

However, this creates a continuous background of WMS requests that makes the recorded metrics hard to use. The noise is most noticeable when automatically scaling up or down to deal with load, but it is often present all the time. E.g., see the following log captures for the two types of requests.

image

image

Questions/(Partial?) Solution

I've heard that making requests to /ping for the probes should be sufficient to check that OWS is operating, including checking connectivity to the DB.

I need this to respond with a status code greater than or equal to 200 and less than 400 for success; any other code indicates failure.

Is this the behaviour of /ping?

Member Author

omad commented Jun 1, 2023

Okay, I've checked the implementation, and it looks perfect. I can reconfigure our K8s probes.

    if db_ok:
        return (render_template("ping.html", status="Up"), 200, resp_headers({"Content-Type": "text/html"}))
    else:
        return (render_template("ping.html", status="Down"), 500, resp_headers({"Content-Type": "text/html"}))
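So /ping returns 200 when the DB check passes and 500 when it fails, which fits the success rule I described above (200 <= code < 400). As a rough sketch of that rule for testing by hand, something like the following should do. Note that `probe_succeeds` and `check_ping` are names I've made up for illustration; they are not part of OWS or Kubernetes, and this only approximates what a probe does.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def probe_succeeds(status_code: int) -> bool:
    """Success rule from above: any code >= 200 and < 400 counts as healthy."""
    return 200 <= status_code < 400


def check_ping(url: str, timeout: float = 5.0) -> bool:
    """Request `url` once and apply the success rule to the response status.

    Hypothetical helper for manual checks. urlopen follows redirects and
    raises HTTPError for 4xx/5xx responses, so both paths are handled.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return probe_succeeds(resp.status)
    except HTTPError as err:
        # e.g. the 500 returned by /ping when the DB check fails
        return probe_succeeds(err.code)
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout: treat as a failed probe
        return False
```

For example, `check_ping("http://localhost:8000/ping")` (host and port are placeholders for wherever an OWS instance is reachable) should return True while the DB is reachable and False otherwise.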

omad closed this as completed Jun 1, 2023