PowerDNS Recursor qa-latency statistic failures #424

Closed
Habbie opened this Issue Apr 26, 2013 · 1 comment

@Habbie
Member
Habbie commented Apr 26, 2013

pdns_recursor.cc performs this calculation for each user query:

g_stats.avgLatencyUsec=(uint64_t)((1-0.0001)*g_stats.avgLatencyUsec + 0.0001*newLat);

This does not work if newLat < 10000 (10 ms): the uint64_t cast truncates the fractional part, so the contribution of 0.0001*newLat (less than 1 µs) is lost and avgLatencyUsec keeps its value minus 1/10000 of it. It would work if avgLatencyUsec were a double.
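To illustrate the truncation, here is a minimal sketch that compares the quoted update with the same formula applied to a double. The helper names (`updateTruncated`, `updateDouble`, `runTruncated`, `runDouble`) are made up for this example and are not part of the Recursor source.

```cpp
#include <cstdint>

// Update as quoted from pdns_recursor.cc: the result is truncated to an
// integer after every query.
uint64_t updateTruncated(uint64_t avg, uint64_t newLat)
{
  return (uint64_t)((1 - 0.0001) * avg + 0.0001 * newLat);
}

// Same formula, but the running average is kept as a double.
double updateDouble(double avg, uint64_t newLat)
{
  return (1 - 0.0001) * avg + 0.0001 * newLat;
}

// Hypothetical helper: feed n identical samples into the truncating average.
uint64_t runTruncated(uint64_t start, uint64_t newLat, unsigned n)
{
  uint64_t avg = start;
  for (unsigned i = 0; i < n; ++i)
    avg = updateTruncated(avg, newLat);
  return avg;
}

// Hypothetical helper: feed n identical samples into the double average.
double runDouble(double start, uint64_t newLat, unsigned n)
{
  double avg = start;
  for (unsigned i = 0; i < n; ++i)
    avg = updateDouble(avg, newLat);
  return avg;
}
```

With a constant 5000 µs latency, `runTruncated(100, 5000, 1000)` never moves away from 100, because every step adds less than 1 µs and the cast discards it. `runDouble(0.0, 5000, 100000)` converges towards 5000.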

Another point is this condition:

uint64_t newLat=(uint64_t)(spent*1000000);
if(newLat < 1000000) // outliers of several minutes exist..
  ...

This means that timeouts are never counted, because the default network-timeout is 1500 ms, which is above the 1000000 µs (1 s) cutoff. That's bad. We suggest this:

uint64_t newLat=(uint64_t)(spent*1000000);
newLat = min(newLat,(uint64_t)(g_networkTimeoutMsec*1000));
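As a self-contained sketch of the clamp suggested above: the function name `clampedLatencyUsec` is invented for this example, and the global is a local stand-in initialised to the 1500 ms default mentioned in this report, not the Recursor's own variable.

```cpp
#include <algorithm>
#include <cstdint>

// Stand-in for the Recursor's network timeout (assumed default: 1500 ms).
static unsigned int g_networkTimeoutMsec = 1500;

// Convert the elapsed time (seconds) to microseconds and cap it at the
// network timeout, so timeouts are counted instead of being discarded.
uint64_t clampedLatencyUsec(double spent)
{
  uint64_t newLat = (uint64_t)(spent * 1000000);
  return std::min(newLat, (uint64_t)g_networkTimeoutMsec * 1000);
}
```

With these values a 0.25 s answer stays at 250000 µs, while a 120 s outlier is recorded as 1500000 µs, the same as an ordinary timeout.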

And in this context we have a suggestion for a new Recursor setting:

::arg().set("latency-statistic-size","Number of latency values to calculate the qa-latency average")="10000";

With this option we can change the smoothing factor of the qa-latency value.
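One plausible way such a setting could drive the smoothing factor is to use 1/size in place of the hard-coded 0.0001. The sketch below assumes that reading; the `LatencyAverage` struct is hypothetical and may differ from what the attached patch does.

```cpp
#include <cstdint>

// Hypothetical holder for a running latency average whose smoothing factor
// is derived from a "latency-statistic-size" style value.
struct LatencyAverage {
  double avgUsec = 0.0; // kept as a double to avoid truncation
  double alpha;         // weight given to each new sample

  // A size of 0 is treated as 1 to avoid dividing by zero.
  explicit LatencyAverage(uint64_t size) : alpha(size ? 1.0 / size : 1.0) {}

  void add(uint64_t newLat)
  {
    avgUsec = (1 - alpha) * avgUsec + alpha * newLat;
  }
};
```

A size of 10000 reproduces the current factor of 0.0001. A smaller size makes the average react faster to recent queries, and a larger one smooths it more.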

Our suggestions are attached as a patch. The code is probably not perfect, but it works for us.

@Habbie Habbie was assigned Apr 26, 2013
@Habbie
Member
Habbie commented Apr 26, 2013

Attachment '' (avgLatency.patch) https://gist.github.com/5466728
