Add health check endpoint for the collector #145

Open
kubibektas opened this issue Dec 3, 2020 · 5 comments

Comments

@kubibektas

Hi,

We are using the metrics collector as a centralized Kubernetes pod that receives metrics from all the application pods. As the number of metrics grows, the collector pod (metrics server) stops functioning properly and we get ruby_collector_working 0. We noticed the pod was being CPU throttled and increased its resources, but would it be possible to add a health check endpoint so that Kubernetes could detect this automatically and restart the pod through a liveness probe?

I saw there was a closed issue for the same feature (#69), but I wanted to raise it again as it seems to be useful functionality.

Thank you!

@SamSaffron
Member

SamSaffron commented Dec 3, 2020 via email

@kubibektas
Author

Hi Sam, thanks for the response.

We are just running the server as bin/prometheus_exporter and have Sidekiq instrumentation on the client pods. But our main use case is reporting custom metrics related to our application (such as the number of orders). Our problem is that we are reporting too many metrics and running the server as a single pod. At some point the server gets throttled due to the high number of metrics. In such cases we just want to restart the server and continue reporting metrics. It's not possible to do this automatically right now, since we don't have a liveness probe that Kubernetes can use.
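
For illustration, the check a liveness probe would perform can be sketched in a few lines of Ruby. This is only a sketch under assumptions: `healthy?` is a hypothetical helper name, and the URL in the usage note (port 9394, path `/status`) is assumed rather than confirmed by this thread. The helper treats any 2xx answer within the timeout as healthy and any error or timeout as unhealthy.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: returns true when the URL answers with a 2xx status
# within `timeout` seconds. Connection errors and timeouts count as unhealthy.
def healthy?(url, timeout: 2)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             open_timeout: timeout,
                             read_timeout: timeout) do |http|
    http.get(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess)
rescue StandardError
  # Refused connection, timeout, malformed response, etc.
  false
end
```

A script run by a Kubernetes exec probe could end with something like `exit(healthy?("http://localhost:9394/status") ? 0 : 1)`, since exec probes treat exit code 0 as healthy and any other code as a failure. An `httpGet` probe pointed at the same URL would avoid the script entirely.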

@h0jeZvgoxFepBQ2C

We would also like to have this feature 👍

@SamSaffron
Member

I am open to a PR that adds a trivial health check at /status, so it can return an OK status 200 page.
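
To make the idea concrete, a trivial health endpoint of this kind might look roughly like the standalone sketch below. It is not the code from this project or from any PR: `health_response` and `serve_health` are hypothetical names, and it uses only Ruby's `socket` standard library. It answers 200 with the body `OK` on `/status` and 404 on anything else.

```ruby
require "socket"

# Builds the raw HTTP response for a request path.
# "/status" is the path proposed above; a real implementation may use another.
def health_response(path)
  if path == "/status"
    status = "200 OK"
    body = "OK"
  else
    status = "404 Not Found"
    body = "Not Found"
  end
  "HTTP/1.1 #{status}\r\n" \
    "Content-Type: text/plain\r\n" \
    "Content-Length: #{body.bytesize}\r\n" \
    "Connection: close\r\n\r\n#{body}"
end

# Serves requests one at a time on the given TCPServer until it is closed.
def serve_health(server)
  loop do
    client = server.accept
    request_line = client.gets.to_s # e.g. "GET /status HTTP/1.1"
    # Read and discard the remaining request headers.
    while (line = client.gets) && line != "\r\n"
    end
    path = request_line.split(" ")[1].to_s
    client.write(health_response(path))
    client.close
  end
rescue IOError, Errno::EBADF
  # The server socket was closed; stop serving.
end
```

It could be started with `serve_health(TCPServer.new("0.0.0.0", 9395))`, where the port is an arbitrary choice for the example. One design point to keep in mind: a check like this only reflects the state of the collector if it is served by the same process that handles metrics, so that a stalled collector also fails to answer the probe.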

@n-rodriguez
Contributor

n-rodriguez commented Oct 18, 2022

I am open to a PR that adds a trivial health check at /status, so it can return an OK status 200 page.

Fixed in 27a7689 PR: #226
