Add a DNSpython resolver #1088

Merged
jamadden merged 4 commits into master from dnspython on Feb 1, 2018

3 participants
@jamadden
Member

jamadden commented Feb 1, 2018

This adds a resolver implementation using dnspython.org.

The hope was that it would be simpler than c-ares (it is, especially once rthalley/dnspython#300 lands, if it does) and also faster than the threaded resolver.

Here's a trivial benchmark, lifted from the test case:

import socket

def resolve(res):
    # Look up 100 distinct hostnames; failed lookups are ignored.
    for index in range(100):
        try:
            res.gethostbyname('www.x%s.com' % index)
        except socket.error:
            pass

All times in seconds, smallest is best, hitting Google's public DNS server (8.8.8.8):

| Resolver | PyPy 2 5.10 | CPython 2.7.14 |
| -------- | ----------: | -------------: |
| Blocking | 74.76 | 61.04 |
| c-ares | 50.17 | 53.26 |
| thread | 158.41 | 122.06 |
| dnspython | 161.80 | 155.97 |

So, dnspython is not nearly as fast as the system (blocking) resolver or the c-ares resolver (even on PyPy), but it has overhead comparable to the threaded resolver.
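For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a timing harness around the loop above. It is not the script used for these numbers. `FakeResolver` and `time_resolver` are hypothetical names, and the fake resolver raises `socket.error` for every other name only to exercise the `except` branch without touching the network.

```python
import socket
import time


class FakeResolver(object):
    """Hypothetical stand-in for a resolver; performs no network I/O."""

    def __init__(self):
        self.calls = 0

    def gethostbyname(self, hostname):
        self.calls += 1
        # Pretend that every other lookup fails, like a nonexistent domain.
        if self.calls % 2 == 0:
            raise socket.error("name not found: %s" % hostname)
        return "192.0.2.1"  # address from the TEST-NET-1 documentation range


def resolve(res, count=100):
    # Same loop as the benchmark above, with the count made configurable.
    for index in range(count):
        try:
            res.gethostbyname('www.x%s.com' % index)
        except socket.error:
            pass


def time_resolver(res, runs=5, count=100):
    """Return (best, worst) wall-clock seconds over several runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        resolve(res, count)
        timings.append(time.perf_counter() - start)
    return min(timings), max(timings)


if __name__ == "__main__":
    best, worst = time_resolver(FakeResolver())
    print("best: %.6f s; worst: %.6f s" % (best, worst))
```

Any object with a `gethostbyname` method could be passed in place of `FakeResolver`. Reporting both the best and the worst run makes run-to-run variability visible.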

dnspython uses a cache of results. If I change the benchmark to just use a range of 20 and call it twice in a row, I get these results:

| Resolver | PyPy 2 5.10 | CPython 2.7.14 |
| -------- | ----------: | -------------: |
| Blocking | 0.0565 | 0.0586 |
| c-ares | 1.7725 | 6.4246 |
| thread | 0.1165 | 0.1707 |
| dnspython | 1.1844 | 9.2374 |

On PyPy, dnspython is now faster than c-ares. Note that these results are wildly variable and extremely suspect. For example, on one run I had ares at 8.4s and dnspython at 1.4s, but the next run had ares at 2.2s and dnspython at 23.6s. See below for more reliable numbers.

dnspython is just a DNS library; it does not use /etc/hosts at all. I could see that being a desirable trait for reproducibility. It currently reads /etc/resolv.conf, but we may be able to add environment variables to configure it. Another thing to note is that it only works in monkey-patched processes.

This needs some further documentation updates before merging, and I'm not 100% convinced that it's really worth merging (the refactoring, yes; the dnspython resolver, maybe not, based on those numbers, but we need better numbers). Feedback is, as always, extremely welcome.

Fixes #580. Ref #910 for some more earlier discussions.

jamadden added some commits Jan 31, 2018

@arcivanov

Contributor

arcivanov commented Feb 1, 2018

Is a loop of 100 iterations enough to make PyPy finish jitting and stabilize?

@jamadden

Member

jamadden commented Feb 1, 2018

> Is a loop of 100 iterations enough to make PyPy finish jitting and stabilize?

Excellent question. I think the answer is "no", based on this vmprof trace (where I'm actually running 200 iterations): http://vmprof.com/#/556030b6-df67-4e71-bc38-120621e3cfbe

But there was so much noise from the DNS servers that it was basically pointless to measure anyway. I've set up a caching DNS server locally and am working to get some more reliable numbers.

@jamadden

Member

jamadden commented Feb 1, 2018

New numbers! This script is hitting a single DNS server, a local dnsmasq instance configured to have everything cached. The result is numbers that are much more stable. The default backends were used (corecext and corecffi).

(PyPy (vmprof trace))

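| Resolver | PyPy 2 5.10 | CPython 2.7.14 | CPython 3.7b1 | CPython 3.6.4 |
| -------- | ----------: | -------------: | ------------: | ------------: |
| Blocking | 0.31 | 0.31 | 0.31 | 0.31 |
| c-ares | 3.58 | 3.60 | 3.42 | 3.19 |
| thread | 0.35 | 0.35 | 0.35 | 0.35 |
| dnspython | 0.81 | 1.80 | 1.08 | 1.17 |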

The number is the total time, in seconds, taken by 300 sequential gethostbyname calls. All of the platforms tell the same story: the native blocking resolver is the fastest (on macOS, there's a layer of caching in the name resolution library itself), followed closely by the threaded resolver (the extra overhead comes from synchronization), then by dnspython, lagging by 2-3x, with c-ares far behind.

So if the threaded resolver doesn't work for you, and you can live without access to /etc/hosts (or you can run a dnsmasq instance to provide that), the dnspython option might be a good one for you.

Now, the above benchmarks are completely sequential, not demonstrating any parallelism at all. What happens if we batch up 100 greenlets and 100 DNS queries at the same time?

(Note that under PyPy if I let an exception bubble up, the ares process could crash.)

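| Resolver | PyPy 2 5.10 | CPython 2.7.14 | CPython 3.7b1 | CPython 3.6.4 |
| -------- | ----------: | -------------: | ------------: | ------------: |
| Blocking | 0.31 | 0.32 | 0.32 | 0.32 |
| c-ares | 2.33 | 2.49 | 2.45 | 2.60 |
| thread | 0.13 | 0.11 | 0.11 | 0.12 |
| dnspython | 0.18 | 0.29 | 0.21 | 0.21 |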
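To illustrate why issuing the queries concurrently can help, here is a sketch that uses only the standard library. It is an analogy, not the benchmark itself: a thread pool stands in for the 100 greenlets, and `fake_lookup` (a hypothetical function that just sleeps for an assumed latency) stands in for a DNS round trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.01  # assumed per-query round-trip time in seconds
QUERIES = 20


def fake_lookup(hostname):
    # Hypothetical stand-in for a DNS query: wait, then return a fixed address.
    time.sleep(LATENCY)
    return "192.0.2.1"


def run_sequential(names):
    start = time.perf_counter()
    results = [fake_lookup(name) for name in names]
    return time.perf_counter() - start, results


def run_parallel(names):
    start = time.perf_counter()
    # One worker per query, so every lookup waits at the same time.
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        results = list(pool.map(fake_lookup, names))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    names = ["www.x%s.com" % i for i in range(QUERIES)]
    seq_time, _ = run_sequential(names)
    par_time, _ = run_parallel(names)
    print("sequential: %.3f s, parallel: %.3f s" % (seq_time, par_time))
```

A resolver that blocks the whole process while waiting gains nothing from this arrangement, which matches the blocking row staying flat in the table.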

To put that in relative terms, how much faster was the parallel operation? (What percent of the sequential time did the parallel run take?)

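| Resolver | PyPy | CPython 2.7 | CPython 3.7 | CPython 3.6 |
| -------- | ---: | ----------: | ----------: | ----------: |
| blocking | 100% | 100% | 100% | 100% |
| ares | 65% | 69% | 71% | 82% |
| thread | 37% | 31% | 31% | 34% |
| dnspython | 22% | 16% | 19% | 18% |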

So the dnspython resolver appears to scale better than the ares resolver or the threaded resolver.
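The percentages can be re-derived from the two timing tables. The sketch below does that arithmetic; the dictionary and function names are only illustrative, and the inputs are the rounded timings as published above.

```python
# Timings in seconds, copied from the sequential and parallel tables above.
# Column order: PyPy, CPython 2.7, CPython 3.7, CPython 3.6.
SEQUENTIAL = {
    "blocking": [0.31, 0.31, 0.31, 0.31],
    "ares": [3.58, 3.60, 3.42, 3.19],
    "thread": [0.35, 0.35, 0.35, 0.35],
    "dnspython": [0.81, 1.80, 1.08, 1.17],
}
PARALLEL = {
    "blocking": [0.31, 0.32, 0.32, 0.32],
    "ares": [2.33, 2.49, 2.45, 2.60],
    "thread": [0.13, 0.11, 0.11, 0.12],
    "dnspython": [0.18, 0.29, 0.21, 0.21],
}


def percent_of_sequential(resolver):
    """Parallel time as a percentage of sequential time, per interpreter."""
    return [
        100.0 * par / seq
        for seq, par in zip(SEQUENTIAL[resolver], PARALLEL[resolver])
    ]


if __name__ == "__main__":
    for name in SEQUENTIAL:
        cells = ["%.0f%%" % p for p in percent_of_sequential(name)]
        print("%-10s %s" % (name, "  ".join(cells)))
```

Because the inputs are already rounded, a few cells can differ from the table by a point or so (the blocking row, for instance, comes out slightly above 100% on three interpreters); the table was presumably computed from unrounded timings.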

[EDIT: Simplified the tables.]
[EDIT2: Added data for 3.6. Note that it is not comparable to 3.7; 3.7 is the python.org build with conservative build settings, 3.6 is a custom local build with a newer compiler and SDK (as is 2.7; pypy is also stock from pypy.org)]

@arcivanov

Contributor

arcivanov commented Feb 1, 2018

DNSPython provides another valuable feature: a TTL-obeying cache, and one you can reset. System caches are often not quite compliant and impossible to expire. This is a great addition IMO.

Could you also add the latest production CPython (3.6) to the benchmark? 3.7 is expected to have an additional 20% performance gain due to the new method-resolution instruction set, which would not be the case in 3.6.

@jamadden

Member

jamadden commented Feb 1, 2018

> DNSPython provides another valuable feature: a TTL-obeying cache, and one you can reset. System caches are often not quite compliant and impossible to expire.

A salient point.

> Could you also add the latest production CPython (3.6) to the benchmark?

OK.

> 3.7 is expected to have an additional 20% performance gain due to the new method-resolution instruction set, which would not be the case in 3.6.

I know 😉

@jamadden

Member

jamadden commented Feb 1, 2018

> (Note that under PyPy if I let an exception bubble up, the ares process could crash.)

Reported at https://bitbucket.org/pypy/pypy/issues/2745/cpyext-exceptions-in-greenlets-running

@jamadden

Member

jamadden commented Feb 1, 2018

If I enable the dnspython cache, PyPy's time goes from 0.18 to 0.07 for parallel, while 2.7 goes from 0.29 to 0.07 and 3.6 goes from 0.21 to 0.05.

Sequential time effectively becomes the same as the parallel time, after the first iteration.

The dnspython cache appears to respect the TTL of the DNS answers, although it does incur a syscall (time.time) for each access.
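As a rough illustration of what a TTL-respecting cache does on each access, here is a standalone sketch. It is not dnspython's implementation or API; `TTLCache` and its methods are hypothetical names, and the clock is injectable only so the behaviour can be checked without waiting.

```python
import time


class TTLCache(object):
    """Illustrative TTL-respecting cache; not dnspython's implementation."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the example can be tested
        self._entries = {}   # hostname -> (expiration timestamp, answer)

    def put(self, hostname, answer, ttl):
        self._entries[hostname] = (self._clock() + ttl, answer)

    def get(self, hostname):
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        expiration, answer = entry
        # The clock is consulted on every access, which is the per-lookup
        # cost mentioned above.
        if self._clock() >= expiration:
            del self._entries[hostname]
            return None
        return answer

    def flush(self):
        # Resetting on demand, something system caches often do not allow.
        self._entries.clear()


if __name__ == "__main__":
    cache = TTLCache()
    cache.put("example.com", "192.0.2.1", ttl=30)
    print(cache.get("example.com"))
```

Checking the expiration lazily in `get` keeps the cache simple, at the price of one clock read per lookup.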

@jamadden jamadden changed the title from [WIP] Add a DNSpython resolver to Add a DNSpython resolver Feb 1, 2018

@jamadden

Member

jamadden commented Feb 1, 2018

eventlet has code to read hosts files. An ambitious next step would be for someone to port that to gevent.

@jamadden jamadden merged commit 41fc0a8 into master Feb 1, 2018

4 of 5 checks passed

continuous-integration/appveyor/branch AppVeyor build failed
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed
coverage/coveralls Coverage increased (+0.2%) to 88.925%

@jamadden jamadden deleted the dnspython branch Feb 1, 2018

@carsonip

carsonip commented Oct 10, 2018

@jamadden I decided to try to reproduce the results you got here. I had issues with the thread DNS resolver, as mentioned here, so I moved to ares. In my experience, ares is fast and stable. I wanted to check whether dnspython can do better than ares.

However, dnspython performs horribly on my machine. Here's the result using your benchmarking script:

Testing dnspython
    18.34    5.46    7.78    6.78    17.17
dnspython: best of 5 runs: 5.46; worst: 18.34
Testing blocking
    1.10    0.34    0.40    0.63    0.41
blocking: best of 5 runs: 0.34; worst: 1.10
Testing ares
    0.33    0.55    0.31    0.33    0.42
ares: best of 5 runs: 0.31; worst: 0.55
Testing thread
    0.43    0.40    0.42    0.45    0.47
thread: best of 5 runs: 0.40; worst: 0.47
| Resolver | One Iteration | Three Iterations | Delta 3 - two |
| -------- | ------------: | ---------------: | --------------: |
| dnspython |      2.36 |      5.46 |      1.61 |
|  blocking |      0.01 |      0.34 |      0.32 |
|      ares |      0.01 |      0.31 |      0.29 |
|    thread |      0.03 |      0.40 |      0.34 |

dnspython is at least 10x slower than all the other resolvers. I tried with dnspython 1.15.0 and also with the GitHub master branch.

Setup:
Linux Mint 19 (based on Ubuntu 18.04) with default system dns cache at 127.0.0.53
Python 2.7.15rc1
gevent==1.3.6

Any clue? Thanks.

@jamadden

Member

jamadden commented Oct 10, 2018

The earlier script evolved into a repeatable benchmark shipped with gevent sources. I run it against a local dnsmasq server configured to cache everything (I found there was a tremendous degree of variability based on caches):

$ sudo dnsmasq -d --cache-size=100000 --local-ttl=1000000 --neg-ttl=10000000 --max-ttl=100000000 --min-cache-ttl=10000000000  --no-poll --auth-ttl=100000000000

I then run the benchmark script like so (you may want to tweak the values of the two constants to determine how long the script runs; don't forget to edit /etc/resolv.conf for the blocking and threaded resolvers):

$ GEVENT_RESOLVER_NAMESERVERS=127.0.0.1 python benchmarks/bench_dns_resolver.py --inherit-environ GEVENT_RESOLVER_NAMESERVERS

Running in Python 3.7.0 on my macOS machine this morning with N=150 and RUN_COUNT=2, I get (results grouped and ordered):

dnspython sequential: Mean +- std dev: 147 us +- 4 us
blocking sequential : Mean +- std dev: 1.11 ms +- 0.03 ms
thread sequential   : Mean +- std dev: 1.21 ms +- 0.02 ms
ares sequential     : Mean +- std dev: 6.56 ms +- 0.21 ms

dnspython parallel  : Mean +- std dev: 152 us +- 7 us
thread parallel     : Mean +- std dev: 357 us +- 16 us
blocking parallel   : Mean +- std dev: 1.11 ms +- 0.01 ms
ares parallel       : Mean +- std dev: 6.76 ms +- 0.21 ms
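Since the results mix microseconds and milliseconds, a small script can normalize them before comparing. This is only a sketch over the sequential lines quoted above; the regular expression and the helper names are assumptions made for this example, not part of the benchmark script.

```python
import re

# Sequential results copied from the output above.
RESULTS = """\
dnspython sequential: Mean +- std dev: 147 us +- 4 us
blocking sequential : Mean +- std dev: 1.11 ms +- 0.03 ms
thread sequential   : Mean +- std dev: 1.21 ms +- 0.02 ms
ares sequential     : Mean +- std dev: 6.56 ms +- 0.21 ms
"""

UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "sec": 1.0}
LINE = re.compile(r"^(\w+) \w+\s*: Mean \+- std dev: ([\d.]+) (\w+) \+-")


def parse_means(text):
    """Map resolver name to mean time in seconds."""
    means = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            name, value, unit = match.groups()
            means[name] = float(value) * UNIT_SECONDS[unit]
    return means


def slowdown_vs(means, baseline):
    """How many times slower each resolver is than the baseline."""
    return {name: mean / means[baseline] for name, mean in means.items()}


if __name__ == "__main__":
    ratios = slowdown_vs(parse_means(RESULTS), "dnspython")
    for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
        print("%-10s %.1fx" % (name, ratio))
```

On these numbers, the blocking resolver comes out roughly 7.5 times slower than dnspython in the sequential case, and ares roughly 45 times slower.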