Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
rec: Sort NS addresses by speed, remove old ones #5877
We used to not sort the different addresses we had for a given NS by speed, only taking care of placing the first one in front. However we also didn't remove existing entries for a given NS, meaning that if a given IP stopped being advertised it would stay in our NS speeds map and keep being used to determine the fastest NS, even if we would only send queries to the new IPs after the selection. Since we didn't send any query to the old IP anymore, its latency would only keep decaying meaning the computed latency of the corresponding NS would only keep decreasing, completely uncorrelated from its real latency.
This commit removes old entries from the NS speeds map if they are no longer present when we refresh the addresses of a given NS. In addition, it orders all NS IPs by decaying latency, meaning new ones will have a fair chance of being picked up.
It might fix #1066, although it did not test that exact case.
Don't merge without a careful review!
referenced this pull request
Nov 1, 2017
I have beaten on this a lot, and after the small changes in dump-nsspeeds, I think we should do this, but continue looking into why the speeds reported for nameservers are sometimes different from what I observe from outside of PowerDNS. This is not specifically related to this PR and likely was around already. But in any case, this reads well and my testing shows no issues.
Nov 1, 2017
1 check passed
Observing build a88e082 's behaviour in a simulated environment still gives preference to the new IP for a root nameserver.
rec_control dump-nsspeeds /tmp/out && grep '^[\;dh]' /tmp/out dumped 39 records ; nsspeed dump from thread follows ; d.root-servers.net -> 10.0.1.4/398.325378 h.root-servers.net -> 10.0.1.8/701.068420 ; nsspeed dump from thread follows ; d.root-servers.net -> 10.0.1.4/61.136986 10.0.2.4/201549.312500 h.root-servers.net -> 10.0.1.8/61.137657 ; nsspeed dump from thread follows ; d.root-servers.net -> 10.0.1.4/60.774120 10.0.2.4/200368.500000 h.root-servers.net -> 10.0.1.8/61.066288
It really is storing the wrong value for 10.0.1.4 somehow, as I'm measuring >200 msec responses with dig in this environment.