Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsdist: Refactoring of the TCP stack #7559

Merged
merged 34 commits into from Apr 5, 2019

Conversation

Projects
None yet
2 participants
@rgacogne
Copy link
Member

commented Mar 11, 2019

Short description

This PR completely changes the way dnsdist handles TCP connections, moving to an event-based mode that allows a single thread to handle a large number of connections, instead of only one. It makes DNS over TCP and DNS over TLS much more scalable.
This needs a very serious review, lots of testing, including weird corner-cases like TCP Fast Open, etc..
The documentation needs to be updated, and probably the way we handle the number of TCP threads to start.

Based on #7526. Hopefully fixes #4814.

IMHO merging this PR calls for the move to the 1.4.x branch at least, perhaps 2.0.x.

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)

@rgacogne rgacogne added this to the dnsdist-1.4.0 milestone Mar 11, 2019

@rgacogne rgacogne marked this pull request as ready for review Mar 14, 2019

@chbruyand chbruyand self-requested a review Mar 19, 2019

@rgacogne rgacogne force-pushed the rgacogne:dnsdist-tcp-refactor-clean branch from 026b798 to bbfd480 Mar 19, 2019

@rgacogne

This comment has been minimized.

Copy link
Member Author

commented Mar 19, 2019

Rebased to fix a conflict.

@rgacogne

This comment has been minimized.

Copy link
Member Author

commented Mar 26, 2019

I have done a lot of testing with the TCP/DoT fork of dnsperf and our own dnstcpbench, and so far it looks good. Also tested with massif and ASAN, UBSAN, as well as TCP Fast Open.
I added some metrics and updated the documentation as well.

Show resolved Hide resolved pdns/dnsdist-tcp.cc Outdated
Show resolved Hide resolved pdns/dnsdist-tcp.cc Outdated
Show resolved Hide resolved pdns/dnsdist-tcp.cc
Show resolved Hide resolved pdns/dnsdist-tcp.cc Outdated
}
//cerr<<__func__<<": add write backend FD "<<state->d_downstreamSocket->getHandle()<<endl;
handleNewIOState(state, IOState::NeedWrite, state->d_downstreamSocket->getHandle(), handleDownstreamIOCallback, state->getBackendWriteTTD());
return;

This comment has been minimized.

Copy link
@chbruyand

chbruyand Apr 3, 2019

Member

That return statement looks weird in a while loop

Show resolved Hide resolved pdns/dnsdist-tcp.cc Outdated
Show resolved Hide resolved pdns/dnsdist-tcp.cc Outdated
Show resolved Hide resolved pdns/dnsdist-tcp.cc Outdated
Show resolved Hide resolved pdns/dnsdist-tcp.cc
Show resolved Hide resolved pdns/tcpiohandler.hh
@rgacogne

This comment has been minimized.

Copy link
Member Author

commented Apr 3, 2019

Just pushed a few commits, the only important one being 705cf3c that fixes a nice bug with DoT, uncovered while testing with https://github.com/DNS-OARC/flamethrower

@rgacogne rgacogne referenced this pull request Apr 4, 2019

Closed

dnsdist: Refactoring to merge the UDP and TCP paths #7526

6 of 7 tasks complete

rgacogne added some commits Feb 25, 2019

dnsdist: Define empty DNSCrypt-related objects when not enabled
This way the rest of the code can mostly ignore whether DNSCrypt
support is enabled.
dnsdist: Disable regression tests for invalid AXFR
The new implementation does not try to be too smart about that
anymore.

rgacogne added some commits Mar 22, 2019

dnsdist: Actually try to read before checking if the socket is readable
We need to because the TLS layer might already have data waiting
for us, while there might not be anything left on the OS-level
buffer associated to the socket.
If we don't ask the TLS layer, we might wait indefinitely for
something to arrive while the client has already sent everything,
and it's just waiting for us because the TLS record has been read.
dnsdist: Try reading from the TCP backend right away
Instead of waiting for the socket to be readable, as it might
already be, so we save a multiplexer trip, and prevent an issue
if we ever add a TLS layer between dnsdist and the backends.

@rgacogne rgacogne force-pushed the rgacogne:dnsdist-tcp-refactor-clean branch from d348885 to acadc54 Apr 4, 2019

@rgacogne

This comment has been minimized.

Copy link
Member Author

commented Apr 4, 2019

Rebased to fix conflicts.

dnsdist: Add more TCP metrics
Keep, for each frontend and backend:
- the number of concurrent TCP connections
- the average number of queries per connection
- the average duration of a connection

@rgacogne rgacogne merged commit eb3764e into PowerDNS:master Apr 5, 2019

2 checks passed

ci/circleci: build Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@rgacogne rgacogne deleted the rgacogne:dnsdist-tcp-refactor-clean branch Apr 10, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.