Get rid of active idle loops #11422
Comments
The healthcheck thread has to wake up every second to update the query load and drop rate of downstream servers. It goes back to sleep right away if there is nothing else to do, but that update has to be done every second. The TCP client loop handles timeouts, cleans up the downstream TCP connections cache, and checks whether a dump of the current state has been requested.
That sounds good. I'm afraid it's not clear to me how you suggest doing that, though. I would love to see some actual measurements, perhaps with |
Thanks for the quick reply!
I only have some basic figures, reported by htop:
So in fact, it is mainly the health check that consumes resources, but I have to check with perf (or something similar) whether this is linked to the loop itself or to some rarely occurring operation, like a real health check. |
That seems to be a lot indeed, depending on how many downstreams you have and whether they are Do53, DoT or DoH ones. On a fairly idle setup that has been running for 8 days I have 6 minutes 42 seconds of CPU time taken by the healthCheck thread for a single Do53 downstream, with the default 1s |
No, I don't have time to switch to master, but I'll be patient and won't forget this. |
We just released 1.7.2 which should improve things significantly for your use-case! We will welcome any feedback! |
Indeed, the healthCheck thread has a much much lower impact now. Thanks for the fix! |
I'm closing this for now since most, if not all, of the active loops have been removed. Please let us know if you see something weird! |
Short description
In some threads, there are active idle loops, for example in health checks:
pdns/pdns/dnsdist.cc
Line 1824 in fd93b11
In the TCP client, there is an active loop of 500 ms.
Usecase
I have deployed dnsdist successfully on low-end computers, but a lot of CPU is used (wasted?) doing nothing every second, in the case of health checks.
I didn't take the time to inspect whether anything is done within the TCP client loop.
Description
#7142 was implemented without changing the sleep interval. It should instead have used a sleep adapted to the minimum interval actually needed.
In the same way, the run method of the multiplexer has a fixed timeout of 500 ms, which should be configurable.
More generally, the request is about doing things when they are required, instead of regularly, just in case. The active idle loop is somewhat simpler, but it neither scales nor adapts to low-end computers.
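To make the request more concrete, here is a minimal sketch of a sleep adapted to the next required wake-up, with a configurable upper bound instead of a fixed timeout. It is not dnsdist code: `PeriodicTask`, `computeSleep`, `runDueTasks` and the `maxSleep` parameter are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

using Clock = std::chrono::steady_clock;
using Millis = std::chrono::milliseconds;

// Hypothetical periodic job, not an actual dnsdist type.
struct PeriodicTask {
  Clock::time_point nextDue;   // when the task must run next
  Millis interval;             // how often it has to run
  std::function<void()> work;  // what to do when it is due
};

// How long the thread may sleep before anything has to be done.
// maxSleep is the configurable upper bound (assumption: the caller
// still wants to wake up occasionally, e.g. to notice a shutdown).
Millis computeSleep(const std::vector<PeriodicTask>& tasks,
                    Clock::time_point now, Millis maxSleep)
{
  if (tasks.empty()) {
    return maxSleep;
  }
  auto earliest = std::min_element(
      tasks.begin(), tasks.end(),
      [](const PeriodicTask& a, const PeriodicTask& b) {
        return a.nextDue < b.nextDue;
      });
  if (earliest->nextDue <= now) {
    return Millis(0);  // something is already due, do not sleep
  }
  const auto remaining = earliest->nextDue - now;
  auto ms = std::chrono::duration_cast<Millis>(remaining);
  if (ms < remaining) {
    ms += Millis(1);  // round up so we never wake just before the deadline
  }
  return std::min(ms, maxSleep);
}

// Runs every task that is due at 'now' and reschedules it.
// Returns the number of tasks that were run.
std::size_t runDueTasks(std::vector<PeriodicTask>& tasks, Clock::time_point now)
{
  std::size_t ran = 0;
  for (auto& task : tasks) {
    if (task.nextDue <= now) {
      task.work();
      task.nextDue = now + task.interval;
      ++ran;
    }
  }
  return ran;
}
```

A thread loop would then call `runDueTasks()`, pass the result of `computeSleep()` to something like `std::condition_variable::wait_for()` (or use it as the multiplexer timeout), and repeat. With nothing due, the thread sleeps for up to `maxSleep` instead of waking up at a fixed interval.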