dnsdist TCP stack needs improving #4814

RobinGeuze opened this issue Dec 23, 2016 · 3 comments


@RobinGeuze RobinGeuze commented Dec 23, 2016

  • Program: dnsdist
  • Issue type: Bug report


The current dnsdist TCP stack has some weaknesses, particularly when it is used in front of recursors. Because of the way dnsdist distributes queries over its TCP threads, (very) slow queries can "jam" a particular thread. When a "normal" query is then assigned to that thread, it may time out. In another case, the queue for one thread can fill up the entire global TCP queue while the other threads still have plenty of capacity to process queries, so queries get dropped because the queue is "full".

These effects can be mitigated by spawning a lot of TCP threads or by setting the TCP recv timeouts on the server very low, but neither solution is very desirable.
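
To make the "jam" concrete, here is a minimal sketch in Python. It is not dnsdist code: it is a toy model that assumes every query arrives at time 0, that queries are pinned to workers round-robin, and that a worker handles its queries strictly in order. The function name `simulate`, the `shared` flag, and the durations are all made up for illustration. For comparison, the model can also hand each query to whichever worker becomes free first, which is one possible alternative design and not a claim about how dnsdist is implemented.

```python
import heapq


def simulate(durations, workers, shared):
    """Return the completion time of each query (toy model, not dnsdist).

    durations: service time of each query; all queries arrive at t=0, in order.
    shared=False: query i is pinned to worker i % workers (round-robin).
    shared=True:  each query goes to whichever worker becomes free first.
    """
    finish = []
    if shared:
        free_at = [0] * workers          # min-heap of times at which workers are free
        heapq.heapify(free_at)
        for d in durations:
            start = heapq.heappop(free_at)   # earliest available worker
            heapq.heappush(free_at, start + d)
            finish.append(start + d)
    else:
        free_at = [0] * workers          # per-worker backlog
        for i, d in enumerate(durations):
            w = i % workers              # pinned regardless of that worker's backlog
            free_at[w] += d
            finish.append(free_at[w])
    return finish


# Hypothetical workload: two slow queries (50 time units) among fast ones (1 unit).
queries = [50, 1, 1, 1, 50, 1, 1, 1, 1]
pinned = simulate(queries, workers=4, shared=False)
pooled = simulate(queries, workers=4, shared=True)
print("round-robin:", pinned)
print("shared:     ", pooled)
```

In this model both slow queries land on worker 0 under round-robin, so the last fast query waits behind them while the other three workers sit idle. With the shared variant the same fast query is picked up by an idle worker almost immediately.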


  • Operating system: Any
  • Software version: 1.0.0, 1.1.0-beta2, git (probably)

Steps to reproduce

The easiest way to reproduce is to fire a bunch of known slow queries at dnsdist. dnsdist will then stop responding properly to normal queries as well, or at least start failing at random (although in my case I managed to temporarily break it completely with about 1 query every 2 seconds).
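
A rough sketch of such a test client, using only the Python standard library, might look like the following. It is an illustration and not the tool used for this report: the helper names (`build_tcp_query`, `query_once`, `fire`), the timeout, and the parallelism are assumptions, and you have to supply names that you know resolve slowly in your own setup. The only protocol detail relied on is that DNS over TCP prefixes each message with a 2-byte length.

```python
import socket
import struct
from concurrent.futures import ThreadPoolExecutor


def build_tcp_query(name, qtype=1, query_id=0x1234):
    """Build a DNS query for `name` with the 2-byte length prefix used over TCP."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".") if label
    ) + b"\x00"
    message = header + qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    return struct.pack("!H", len(message)) + message


def query_once(host, port, name, timeout):
    """Send one query on its own TCP connection; True if a reply started arriving."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(build_tcp_query(name))
            # Rough check only: did we receive the reply's length prefix in time?
            return len(sock.recv(2)) == 2
    except OSError:  # covers refused connections and timeouts
        return False


def fire(host, port, names, timeout=2.0, parallel=20):
    """Send all queries concurrently and return how many failed or timed out."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        results = list(pool.map(lambda n: query_once(host, port, n, timeout), names))
    return results.count(False)


# Example (hypothetical address and names, adjust to your setup):
# failures = fire("127.0.0.1", 5300, ["slow-1.example.", "normal.example."] * 50)
```

Mixing a few known-slow names with ordinary ones and watching the failure count for the ordinary ones is one way to observe the behaviour described above.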

@rgacogne rgacogne commented Dec 23, 2016

#4817 might help.

@Habbie Habbie commented Nov 9, 2017

Did it help? :)

Contributor Author

@RobinGeuze RobinGeuze commented Mar 18, 2018

We ran into some trouble again today, so I activated the possible fix to check its effectiveness.

These are the TCP stats on the machine without the singlepipe:

Clients    MaxClients Queued     MaxQueued
134        500        0          1000

And these are the TCP stats on the machine with the singlepipe:

Clients    MaxClients Queued     MaxQueued
19         500        0          1000

On the machine without the singlepipe, the number of clients keeps climbing until it reaches 500, and then queries start being queued. I restarted both dnsdist instances at roughly the same time. So far the singlepipe version has stayed stable at 19 clients, so it seems this works like a charm!
