Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsdist: Implement TCP out-of-order #9582

Merged
merged 24 commits into from Nov 23, 2020

Conversation

rgacogne
Copy link
Member

@rgacogne rgacogne commented Oct 2, 2020

Short description

This PR implements TCP out-of-order support for dnsdist:

  • for incoming queries (between the client and dnsdist) ;
  • as well as for outgoing queries (between dnsdist and the backend).

We try to reuse the existing TCP connections as much as possible, but we might need to open many connections to the same backend, especially if that backend does not support out-of-order. We also don't mix queries from different incoming TCP connections on the same TCP connection to the backend, since that would require taking care of query ID collisions, which we currently detect. An existing TCP connection to the backend can be re-used for a different incoming connection later, though.

This PR break the TCP handling code in two parts, one handling incoming TCP connections and a second one handling outgoing ones.

It still needs a serious review and more testing.

DNS Shotgun seems to be quite happier with it, by the way:
master
oor

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)

@paddg
Copy link
Contributor

paddg commented Oct 5, 2020

How many TCP connections did you simulate during your 6kqps test?

@rgacogne
Copy link
Member Author

rgacogne commented Oct 5, 2020

Roughly 10k:

connection

@paddg
Copy link
Contributor

paddg commented Oct 5, 2020

Very good! Did you test with real DNS queries or with synthetic? If I test with real DNS traffic, the response rate is below 60%

@rgacogne
Copy link
Member Author

rgacogne commented Oct 5, 2020

Did you test with real DNS queries or with synthetic? If I test with real DNS traffic, the response rate is below 60%

Yes, that test is done with the data set mentioned in #9572, which is real-word traffic. Note that the test is done over DNS over TLS, and I tuned a few things:

  • setTCPRecvTimeout(30): read time out set to 30s ;
  • setMaxTCPClientThreads(8): 8 TCP worker threads ;
  • 2 * addTLSLocal("[::1]:8443", 'server.pem', 'server.key', { reusePort=true}, numberOfStoredSessions=0, maxInFlight=5 }): 2 threads doing accept() ;
  • 8 * newServer({ address = "[::1]", maxInFlight=10}): 8 backend threads ;

On the recursor side, both #9495 and #9572 are applied, and I have:

  • client-tcp-timeout=30 ;
  • dnssec=validate ;
  • max-mthreads=8192 ;
  • max-tcp-clients=500000 ;
  • pdns-distributes-queries=no ;
  • qname-minimization=yes ;
  • reuseport=yes ;
  • threads=8.

@paddg
Copy link
Contributor

paddg commented Oct 6, 2020

* numberOfStoredSessions=0

Is this meant as unlimited? Because "Setting this value to 0 disables stored session entirely"

@rgacogne
Copy link
Member Author

rgacogne commented Oct 6, 2020

Is this meant as unlimited? Because "Setting this value to 0 disables stored session entirely"

No, it is intended to disable sessions stored server-side, since client-side tickets are much nicer to the server and more realistic in anycast setups anyway, since the session-ticket key can be shared between servers much more easily than the server-side sessions.

@rgacogne
Copy link
Member Author

rgacogne commented Nov 2, 2020

Squashed to clean up the history a bit, and to make sure all commits build and pass the unit tests.

@rgacogne rgacogne marked this pull request as ready for review November 2, 2020 11:22
Copy link
Member

@omoerbeek omoerbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's quite a big PR, so I won't state I looked at every line of code. But what I checked looks good and basic testing show it works as advertised once I git the settings right.

related to that: please document that for it to work you should set both incoming and outgoing maxInFlight.

@rgacogne
Copy link
Member Author

It's quite a big PR, so I won't state I looked at every line of code. But what I checked looks good and basic testing show it works as advertised once I git the settings right.

I did not expect you to look at every single detail, don't worry! :-) Thanks a lot for the review and tests!

related to that: please document that for it to work you should set both incoming and outgoing maxInFlight.

I just added more words on how that feature works, that was clearly lacking!

@rgacogne rgacogne merged commit 54edfc0 into PowerDNS:master Nov 23, 2020
@rgacogne rgacogne deleted the ddist-tcp-refactor-split branch November 23, 2020 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants