Metrics are missing when coturn is used #914
Comments
Do you have any logs from the agent trying to retrieve the information from the Prometheus endpoint? What is the error code you get? Maybe you can tune some timeouts or retries? AFAIK, other people are using the Prometheus endpoint without problems. |
In Prometheus: I will add a timeout and keep watching it |
Thx for your fast answer. Do you get any error in coturn logs when that failure happens? I haven't used prometheus endpoint at scale but maybe other people can comment if they can reproduce this issue. |
@FritschAuctores Or the other way around: coturn (or rather the underlying library that implements the Prometheus protocol and web server) drops the connection after a timeout. You can try reviewing those settings on both sides. Another point: until the recent commit d62f483 (10 days ago as of today), a bunch of Prometheus metrics were tagged with the username, which created massive dimensionality and, as a result, a massive number of metrics. That is a good reason to have timeouts on scraping (not to mention resource/memory leaks). |
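To check which side gives up first, it can help to time the endpoint by hand, outside Prometheus. The following is a minimal sketch, not something taken from coturn or Prometheus: `probe_metrics` and `watch` are hypothetical helper names, and the URL `http://127.0.0.1:9641/metrics` is an assumption that should be replaced with the address of your own coturn Prometheus endpoint.

```python
import http.client
import time
import urllib.error
import urllib.request

# Build an opener that ignores proxy settings, so the probe measures the
# endpoint itself rather than any proxy in between.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe_metrics(url, timeout_s=5.0):
    """Fetch `url` once and return (ok, elapsed_seconds, detail)."""
    start = time.monotonic()
    try:
        with _OPENER.open(url, timeout=timeout_s) as resp:
            body = resp.read()
            detail = f"HTTP {resp.status}, {len(body)} bytes"
            return True, time.monotonic() - start, detail
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # Covers refused connections, timeouts, dropped connections,
        # and HTTP error statuses.
        return False, time.monotonic() - start, repr(exc)


def watch(url, rounds=10, interval_s=15.0, timeout_s=5.0):
    """Probe `url` repeatedly and print one line per attempt."""
    for _ in range(rounds):
        ok, elapsed, detail = probe_metrics(url, timeout_s=timeout_s)
        print(f"ok={ok} elapsed={elapsed:.3f}s {detail}")
        time.sleep(interval_s)
```

Calling `watch("http://127.0.0.1:9641/metrics")` (assumed address) while the gaps occur shows whether requests fail outright or merely take longer than the scrape timeout.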
@FritschAuctores I can confirm that I am seeing similar behavior with latest master f74f50c - there are gaps in metrics for a few minutes that cause my alerts to fire off. And this is without |
I've been looking at the
I'm wondering if the single-thread option is what ends up creating this problem. Unfortunately, it is not possible to enable the thread pool option in microhttpd without forking. @FritschAuctores, what coturn version are you using? Did you get any further insight in the last weeks? Thank you. |
epoll use in microhttpd was introduced in #997, and the issue is resolved for me. |
@FritschAuctores please try commit 57f5a25 (which is #997, mentioned above). |
I still experience the issue as of 716417e
|
@ggarber I can confirm that #999 is the root cause |
For future reference: in some cases MHD_USE_EPOLL_INTERNAL_THREAD is not set (whereas it should be). This could be a bug in the microhttpd library. |
The Prometheus Target is down when coturn is used
Here is the turn_total_traffic_peer_rcvb metric:
What's the problem here? The CPU load of the server was between 20 and 40%.
There is no network problem; other exporters (cadvisor) work without any problems during this time.
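To put numbers on outages like this one, the gaps can be located from the sample timestamps of the affected series. This is only a sketch under stated assumptions: `find_gaps` is a hypothetical helper, the timestamps are assumed to be Unix seconds exported from Prometheus by some other means, and the 15-second scrape interval is an assumed value, not one reported in this issue.

```python
def find_gaps(timestamps, scrape_interval_s=15.0, tolerance=1.5):
    """Return (last_seen, next_seen) pairs where samples are too far apart.

    Two consecutive samples count as a gap when they are more than
    `tolerance * scrape_interval_s` seconds apart.
    """
    ordered = sorted(timestamps)
    limit = scrape_interval_s * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]
```

Comparing the resulting intervals with the coturn log timestamps may show whether the missing scrapes line up with anything on the server side.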