pdns_recursor.cc performs this calculation for each user query:
g_stats.avgLatencyUsec=(uint64_t)((1-0.0001)*g_stats.avgLatencyUsec + 0.0001*newLat);
This does not work if newLat<10000 (10ms): the uint64_t cast cuts off the decimal places, so the sample's share of 0.0001*newLat is less than 1 and gets lost, and avgLatencyUsec just keeps its value minus 1/10000 of it. It would work if avgLatencyUsec were a double.
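A minimal sketch of the truncation effect, assuming nothing beyond the update line quoted above. The function names and the `feedConstant` helper are hypothetical and only serve to compare the integer average with a double-based one:

```cpp
#include <cstdint>

// Update rule as quoted above: the result is cast back to uint64_t.
uint64_t updateTruncated(uint64_t avg, uint64_t newLat)
{
  return (uint64_t)((1 - 0.0001) * avg + 0.0001 * newLat);
}

// Same rule with the average kept as a double (the suggested fix).
double updateDouble(double avg, uint64_t newLat)
{
  return (1 - 0.0001) * avg + 0.0001 * newLat;
}

// Hypothetical helper: feed `count` identical samples into both variants.
struct Averages
{
  uint64_t truncated;
  double precise;
};

Averages feedConstant(uint64_t start, uint64_t newLat, int count)
{
  Averages a = {start, (double)start};
  for (int i = 0; i < count; ++i) {
    a.truncated = updateTruncated(a.truncated, newLat);
    a.precise = updateDouble(a.precise, newLat);
  }
  return a;
}
```

Starting from an average of 5000 usec and feeding 100000 samples of 8000 usec, the truncated variant should still report 5000, while the double variant ends up close to 8000.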
Another point is this condition:
if(newLat < 1000000) // outliers of several minutes exist..
This means that timeouts are never counted, because the default network-timeout is 1500ms, which is above this limit of 1 second. That's bad, since the slowest queries are left out of the average. We suggest this:
newLat = min(newLat,(uint64_t)(g_networkTimeoutMsec*1000));
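To illustrate the difference from the old filter, here is a small sketch of the clamping. The global below is a hypothetical stand-in for the recursor's `g_networkTimeoutMsec`, set to the default of 1500 ms mentioned above, and `clampLatency` is an illustrative name:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the recursor global (default network-timeout, in ms).
static const unsigned int g_networkTimeoutMsec = 1500;

// Clamp a latency sample (in usec) to the network timeout instead of
// dropping every sample of one second or more.
uint64_t clampLatency(uint64_t newLat)
{
  return std::min(newLat, (uint64_t)(g_networkTimeoutMsec * 1000));
}
```

A timed-out query measured at 2000000 usec is then counted as 1500000 usec, and a fast answer of 300 usec passes through unchanged.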
And in this context we have a suggestion for a new Recursor setting:
::arg().set("latency-statistic-size","Number of latency values to calculate the qa-latency average")="10000";
With this option we can change the smoothing factor of the qa-latency value; the default of 10000 corresponds to the current hard-coded factor of 0.0001.
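A sketch of how such a setting could feed into the update, assuming the smoothing factor is simply 1 divided by the configured size. The function and parameter names are hypothetical and not taken from the patch:

```cpp
#include <cstdint>

// statSize stands for the value of the proposed latency-statistic-size
// setting; it must be greater than zero.
double updateAverage(double avg, uint64_t newLat, unsigned int statSize)
{
  const double alpha = 1.0 / statSize; // smoothing factor
  return (1.0 - alpha) * avg + alpha * newLat;
}
```

A larger size gives a smoother, slower-moving average. A size of 1 makes the average follow the latest sample exactly.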
Our suggestions are attached as a patch. The code is probably not perfect, but it works for us.
Attachment '' (avgLatency.patch) https://gist.github.com/5466728
implement patch from #424 improving ('fixing') our average latency calculation. Closes #424.