dnsdist: Refactoring of the TCP stack #7559

Merged
merged 34 commits into from Apr 5, 2019

@rgacogne rgacogne commented Mar 11, 2019

Short description

This PR completely changes the way dnsdist handles TCP connections, moving to an event-based mode that allows a single thread to handle a large number of connections instead of only one. It makes DNS over TCP and DNS over TLS much more scalable.
This needs a very serious review and lots of testing, including weird corner cases like TCP Fast Open.
The documentation needs to be updated, and probably also the way we choose the number of TCP threads to start.
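
To illustrate the event-based model in general terms, here is a minimal sketch of one thread servicing several connections through `poll()`. It is not the code from this PR: `ConnectionState` and `runOnce` are hypothetical names, and dnsdist uses its own multiplexer abstraction and far richer per-connection state.

```cpp
#include <poll.h>
#include <sys/socket.h>

#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-connection state; a real proxy tracks much more than this.
struct ConnectionState
{
  std::string d_buffer; // bytes received so far on this connection
};

// One pass of a single-threaded event loop: wait until at least one
// connection is readable (or the timeout expires), then service only the
// connections that are ready. Returns how many connections delivered data.
size_t runOnce(std::map<int, ConnectionState>& conns, int timeoutMs)
{
  std::vector<pollfd> pfds;
  pfds.reserve(conns.size());
  for (const auto& entry : conns) {
    pollfd pfd{};
    pfd.fd = entry.first;
    pfd.events = POLLIN;
    pfds.push_back(pfd);
  }

  int ready = poll(pfds.data(), pfds.size(), timeoutMs);
  if (ready <= 0) {
    return 0; // timeout or error: nothing to service in this sketch
  }
  assert(static_cast<size_t>(ready) <= pfds.size());

  size_t serviced = 0;
  for (const auto& pfd : pfds) {
    if ((pfd.revents & POLLIN) == 0) {
      continue; // this connection has nothing for us right now
    }
    char buf[512];
    ssize_t got = recv(pfd.fd, buf, sizeof(buf), 0);
    if (got > 0) {
      conns[pfd.fd].d_buffer.append(buf, static_cast<size_t>(got));
      ++serviced;
    }
  }
  return serviced;
}
```

Calling `runOnce` in a loop lets idle connections cost nothing but an entry in the map, which is the property that makes one thread per many connections workable.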

Based on #7526. Hopefully fixes #4814.

IMHO merging this PR calls for the move to the 1.4.x branch at least, perhaps 2.0.x.

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)
@rgacogne rgacogne added this to the dnsdist-1.4.0 milestone Mar 11, 2019
@rgacogne rgacogne marked this pull request as ready for review Mar 14, 2019
@chbruyand chbruyand self-requested a review Mar 19, 2019
@rgacogne rgacogne force-pushed the rgacogne:dnsdist-tcp-refactor-clean branch from 026b798 to bbfd480 Mar 19, 2019

@rgacogne rgacogne commented Mar 19, 2019

Rebased to fix a conflict.

@rgacogne rgacogne commented Mar 26, 2019

I have done a lot of testing with the TCP/DoT fork of dnsperf and our own dnstcpbench, and so far it looks good. I also tested with massif, ASAN and UBSAN, as well as with TCP Fast Open.
I added some metrics and updated the documentation as well.

pdns/dnsdist-tcp.cc
}
//cerr<<__func__<<": add write backend FD "<<state->d_downstreamSocket->getHandle()<<endl;
handleNewIOState(state, IOState::NeedWrite, state->d_downstreamSocket->getHandle(), handleDownstreamIOCallback, state->getBackendWriteTTD());
return;

chbruyand Apr 3, 2019
Member

That return statement looks weird in a while loop

@rgacogne rgacogne commented Apr 3, 2019

Just pushed a few commits, the only important one being 705cf3c that fixes a nice bug with DoT, uncovered while testing with https://github.com/DNS-OARC/flamethrower

rgacogne added 15 commits Mar 22, 2019
We need to because the TLS layer might already have data waiting
for us, while there might not be anything left on the OS-level
buffer associated to the socket.
If we don't ask the TLS layer, we might wait indefinitely for
something to arrive while the client has already sent everything,
and it's just waiting for us because the TLS record has been read.

Instead of waiting for the socket to be readable, as it might
already be, so we save a multiplexer trip, and prevent an issue
if we ever add a TLS layer between dnsdist and the backends.
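
The two commit messages above describe the same pattern: attempt the I/O first, and only wait on the socket when the layer reports that it would block. The sketch below reproduces the stall they warn about, under stated assumptions. `BufferedLayer` is a hypothetical stand-in for a TLS layer that pulls a whole record into its own buffer, and `readExactly` and `socketReadable` are illustrative helpers, not dnsdist functions.

```cpp
#include <poll.h>
#include <sys/socket.h>

#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>

enum class ReadResult { Data, WouldBlock, Closed };

// Hypothetical layer that, like a TLS library, may drain the socket into an
// internal buffer and hand the bytes out in smaller pieces later.
class BufferedLayer
{
public:
  explicit BufferedLayer(int fd) : d_fd(fd) {}

  // Appends up to `wanted` bytes to `out` without ever blocking.
  ReadResult tryRead(size_t wanted, std::string& out)
  {
    assert(wanted > 0);
    if (d_pending.empty()) {
      char buf[4096];
      ssize_t got = recv(d_fd, buf, sizeof(buf), MSG_DONTWAIT);
      if (got == 0) {
        return ReadResult::Closed;
      }
      if (got < 0) {
        bool again = (errno == EAGAIN || errno == EWOULDBLOCK);
        return again ? ReadResult::WouldBlock : ReadResult::Closed;
      }
      d_pending.assign(buf, static_cast<size_t>(got));
    }
    size_t n = std::min(wanted, d_pending.size());
    out.append(d_pending, 0, n);
    d_pending.erase(0, n);
    return ReadResult::Data;
  }

private:
  int d_fd;
  std::string d_pending; // bytes already taken from the OS-level buffer
};

// True if the OS-level socket buffer has something to read.
bool socketReadable(int fd, int timeoutMs)
{
  pollfd pfd{};
  pfd.fd = fd;
  pfd.events = POLLIN;
  return poll(&pfd, 1, timeoutMs) > 0 && (pfd.revents & POLLIN) != 0;
}

// Reads exactly `wanted` bytes. The layer is always asked first; the socket
// is only waited on when the layer reports that it would block.
bool readExactly(BufferedLayer& layer, int fd, size_t wanted, std::string& out, int timeoutMs)
{
  out.clear();
  while (out.size() < wanted) {
    ReadResult res = layer.tryRead(wanted - out.size(), out);
    if (res == ReadResult::Data) {
      continue;
    }
    if (res == ReadResult::Closed || !socketReadable(fd, timeoutMs)) {
      return false;
    }
  }
  return true;
}
```

If a client sends the two-byte length prefix and the DNS payload in one record, reading the prefix leaves the socket empty while the payload sits in the layer. Code that waited for the socket to become readable before reading the payload would stall, whereas `readExactly` returns it immediately.
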
@rgacogne rgacogne force-pushed the rgacogne:dnsdist-tcp-refactor-clean branch from d348885 to acadc54 Apr 4, 2019

@rgacogne rgacogne commented Apr 4, 2019

Rebased to fix conflicts.

Keep, for each frontend and backend:
- the number of concurrent TCP connections
- the average number of queries per connection
- the average duration of a connection
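
As a rough sketch of what such per-frontend or per-backend counters could look like: `TCPStats` and its members are hypothetical names, and the averages here are plain cumulative means, which is an assumption and not necessarily how dnsdist computes them. Counters shared between threads would also need atomics or a lock.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-frontend or per-backend TCP counters.
struct TCPStats
{
  uint64_t d_current{0};         // connections open right now
  uint64_t d_closed{0};          // connections that have ended
  uint64_t d_totalQueries{0};    // queries seen on ended connections
  double d_totalDurationMs{0.0}; // summed lifetime of ended connections

  void connectionOpened()
  {
    ++d_current;
  }

  // Called once when a connection ends, with what it did during its life.
  void connectionClosed(uint64_t queries, double durationMs)
  {
    assert(d_current > 0);
    --d_current;
    ++d_closed;
    d_totalQueries += queries;
    d_totalDurationMs += durationMs;
  }

  double avgQueriesPerConnection() const
  {
    return d_closed == 0 ? 0.0 : static_cast<double>(d_totalQueries) / static_cast<double>(d_closed);
  }

  double avgConnectionDurationMs() const
  {
    return d_closed == 0 ? 0.0 : d_totalDurationMs / static_cast<double>(d_closed);
  }
};
```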
@rgacogne rgacogne merged commit eb3764e into PowerDNS:master Apr 5, 2019
2 checks passed
ci/circleci: build Your tests passed on CircleCI!
continuous-integration/travis-ci/pr The Travis CI build passed
@rgacogne rgacogne deleted the rgacogne:dnsdist-tcp-refactor-clean branch Apr 10, 2019