Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsdist: Refactoring of the TCP stack #7559

Merged
merged 34 commits into from
Apr 5, 2019

Conversation

rgacogne
Copy link
Member

Short description

This PR completely changes the way dnsdist handles TCP connections, moving to an event-based mode that allows a single thread to handle a large number of connections, instead of only one. It makes DNS over TCP and DNS over TLS much more scalable.
This needs a very serious review, lots of testing, including weird corner-cases like TCP Fast Open, etc..
The documentation needs to be updated, and probably the way we handle the number of TCP threads to start.

Based on #7526. Hopefully fixes #4814.

IMHO merging this PR calls for the move to the 1.4.x branch at least, perhaps 2.0.x.

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)

@rgacogne rgacogne added this to the dnsdist-1.4.0 milestone Mar 11, 2019
@rgacogne rgacogne marked this pull request as ready for review March 14, 2019 15:28
@chbruyand chbruyand self-requested a review March 19, 2019 09:52
@rgacogne
Copy link
Member Author

Rebased to fix a conflict.

@rgacogne
Copy link
Member Author

I have done a lot of testing with the TCP/DoT fork of dnsperf and our own dnstcpbench, and so far it looks good. Also tested with massif and ASAN, UBSAN, as well as TCP Fast Open.
I added some metrics and updated the documentation as well.

pdns/dnsdist-tcp.cc Outdated Show resolved Hide resolved
pdns/dnsdist-tcp.cc Outdated Show resolved Hide resolved
pdns/dnsdist-tcp.cc Show resolved Hide resolved
pdns/dnsdist-tcp.cc Outdated Show resolved Hide resolved
}
//cerr<<__func__<<": add write backend FD "<<state->d_downstreamSocket->getHandle()<<endl;
handleNewIOState(state, IOState::NeedWrite, state->d_downstreamSocket->getHandle(), handleDownstreamIOCallback, state->getBackendWriteTTD());
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That return statement looks weird in a while loop

pdns/dnsdist-tcp.cc Outdated Show resolved Hide resolved
pdns/dnsdist-tcp.cc Outdated Show resolved Hide resolved
pdns/dnsdist-tcp.cc Outdated Show resolved Hide resolved
pdns/dnsdist-tcp.cc Show resolved Hide resolved
pdns/tcpiohandler.hh Show resolved Hide resolved
@rgacogne
Copy link
Member Author

rgacogne commented Apr 3, 2019

Just pushed a few commits, the only important one being 705cf3c that fixes a nice bug with DoT, uncovered while testing with https://github.com/DNS-OARC/flamethrower

We need to because the TLS layer might already have data waiting
for us, while there might not be anything left on the OS-level
buffer associated to the socket.
If we don't ask the TLS layer, we might wait indefinitely for
something to arrive while the client has already sent everything,
and it's just waiting for us because the TLS record has been read.
Instead of waiting for the socket to be readable, as it might
already be, so we save a multiplexer trip, and prevent an issue
if we ever add a TLS layer between dnsdist and the backends.
@rgacogne
Copy link
Member Author

rgacogne commented Apr 4, 2019

Rebased to fix conflicts.

Keep, for each frontend and backend:
- the number of concurrent TCP connections
- the average number of queries per connection
- the average duration of a connection
@rgacogne rgacogne merged commit eb3764e into PowerDNS:master Apr 5, 2019
@rgacogne rgacogne deleted the dnsdist-tcp-refactor-clean branch April 10, 2019 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dnsdist TCP stack needs improving
2 participants