New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rec: Sort NS addresses by speed, remove old ones #5877

Merged
merged 2 commits into from Nov 1, 2017

Conversation

Projects
None yet
3 participants
@rgacogne
Member

rgacogne commented Oct 30, 2017

Short description

We used to not sort the different addresses we had for a given NS by speed, only taking care of placing the first one in front. However we also didn't remove existing entries for a given NS, meaning that if a given IP stopped being advertised it would stay in our NS speeds map and keep being used to determine the fastest NS, even if we would only send queries to the new IPs after the selection. Since we didn't send any query to the old IP anymore, its latency would only keep decaying meaning the computed latency of the corresponding NS would only keep decreasing, completely uncorrelated from its real latency.

This commit removes old entries from the NS speeds map if they are no longer present when we refresh the addresses of a given NS. In addition, it orders all NS IPs by decaying latency, meaning new ones will have a fair chance of being picked up.

It might fix #1066, although it did not test that exact case.

Don't merge without a careful review!

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled and tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)
rec: Sort NS addresses by speed, remove old ones
We used to not sort the different addresses we had for a given NS
by speed, only taking care of placing the first one in front.
However we also didn't remove existing entries for a given NS,
meaning that if a given IP stopped being advertised it would stay
in our NS speeds map and keep being used to determine the fastest
NS, even if we would only send queries to the new IPs after the
selection. Since we didn't send any query to the old IP anymore,
its latency would only keep decaying meaning the computed latency
of the corresponding NS would only keep decreasing, completely
uncorrelated from its real latency.

This commit removes old entries from the NS speeds map if they are
no longer present when we refresh the addresses of a given NS.
In addition, it orders all NS IPs by decaying latency, meaning new
ones will have a fair chance of being picked up.
fix issue where we would submit nameserver performance stats for an e…
…mpty DNSName for authoritative zones, which would trip up dump-nsstats. Fixed it in depth.

Also added some error messages in case dump-nsspeeds ever throws an exception again.
@ahupowerdns

I have beaten on this a lot, and after the small changes in dump-nsspeeds, I think we should do this, but continue looking into why the speeds reported for nameservers are sometimes different from what I observe from outside of PowerDNS. This is not specifically related to this PR and likely was around already. But in any case, this reads well and my testing shows no issues.

@ahupowerdns ahupowerdns merged commit a88e082 into PowerDNS:master Nov 1, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@rgacogne rgacogne deleted the rgacogne:rec-nsip-speed-tracking branch Nov 1, 2017

@giganteous

This comment has been minimized.

Show comment
Hide comment
@giganteous

giganteous Nov 2, 2017

Contributor

Observing build a88e082 's behaviour in a simulated environment still gives preference to the new IP for a root nameserver.

rec_control dump-nsspeeds /tmp/out && grep '^[\;dh]' /tmp/out
dumped 39 records
; nsspeed dump from thread follows
;
d.root-servers.net -> 10.0.1.4/398.325378 
h.root-servers.net -> 10.0.1.8/701.068420 
; nsspeed dump from thread follows
;
d.root-servers.net -> 10.0.1.4/61.136986 10.0.2.4/201549.312500 
h.root-servers.net -> 10.0.1.8/61.137657 
; nsspeed dump from thread follows
;
d.root-servers.net -> 10.0.1.4/60.774120 10.0.2.4/200368.500000 
h.root-servers.net -> 10.0.1.8/61.066288 

It really is storing the wrong value for 10.0.1.4 somehow, as I'm measuring >200 msec responses with dig in this environment.

Contributor

giganteous commented Nov 2, 2017

Observing build a88e082 's behaviour in a simulated environment still gives preference to the new IP for a root nameserver.

rec_control dump-nsspeeds /tmp/out && grep '^[\;dh]' /tmp/out
dumped 39 records
; nsspeed dump from thread follows
;
d.root-servers.net -> 10.0.1.4/398.325378 
h.root-servers.net -> 10.0.1.8/701.068420 
; nsspeed dump from thread follows
;
d.root-servers.net -> 10.0.1.4/61.136986 10.0.2.4/201549.312500 
h.root-servers.net -> 10.0.1.8/61.137657 
; nsspeed dump from thread follows
;
d.root-servers.net -> 10.0.1.4/60.774120 10.0.2.4/200368.500000 
h.root-servers.net -> 10.0.1.8/61.066288 

It really is storing the wrong value for 10.0.1.4 somehow, as I'm measuring >200 msec responses with dig in this environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment