# Loss of floating point precision in PhiAccrualFailureDetector #1821

Closed
opened this issue Nov 6, 2013 · 3 comments

### robdavid commented Nov 6, 2013

I have been testing Akka on a resource-constrained cluster of multiple JVMs on a single physical machine, with the result that heartbeat arrival times vary widely, presumably because GC is running much of the time. I tried setting a high phi threshold to cope with this, but found it was ineffective. So I grabbed a copy of the PhiAccrualFailureDetector and added some logging to figure out why. I found that the values of phi calculated by PhiAccrualFailureDetector would suddenly jump from approx. 15 to infinity.

In the PhiAccrualFailureDetector, the calculation of phi makes use of a function that calculates the cumulative distribution function for a given point on the bell curve:

```scala
/**
 * Cumulative distribution function for N(mean, stdDeviation) normal distribution.
 * This is an approximation defined in β Mathematics Handbook (Logistic approximation).
 * Error is 0.00014 at +- 3.16
 */
private[apakgroup] def cumulativeDistributionFunction(x: Double, mean: Double, stdDeviation: Double): Double = {
  val y = (x - mean) / stdDeviation
  // Cumulative distribution function for N(0, 1)
  1.0 / (1.0 + math.exp(-y * (1.5976 + 0.070566 * y * y)))
}
```

This is then used to calculate a value for phi as follows:

```scala
val phi = -math.log10(1.0 - cumulativeDistributionFunction(timeDiff, mean, stdDeviation))
```

For the sake of argument, let E be the value of `math.exp(-y * (1.5976 + 0.070566 * y * y))`. Then the calculation of phi boils down to

phi = -log10(1 - 1/(1+E))

For small values of E (below roughly 1E-16), the limited precision of a double means that E is effectively discarded: it is so much smaller than 1.0 that it "falls off" the mantissa when the two are added together. This means that 1 - 1/(1+E) becomes 1 - 1/1 = 0, and the value of phi becomes infinite. Effectively this limits the maximum meaningful value of phi that can be returned to approx. 15.
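The absorption of E can be reproduced in isolation. The following is a minimal sketch, not Akka code: the object name `AbsorptionDemo`, the helper names `tail` and `phi`, and the two sample values of E are illustrative assumptions.

```scala
object AbsorptionDemo {
  // The expression inside the log as originally written: 1 - 1/(1+E).
  def tail(e: Double): Double = 1.0 - 1.0 / (1.0 + e)

  // phi as computed by the original formulation, taking E directly.
  def phi(e: Double): Double = -math.log10(tail(e))

  def main(args: Array[String]): Unit = {
    // E = 1e-10 is large enough to survive the addition to 1.0,
    // so phi comes out close to 10.
    println(phi(1e-10))
    // E = 1e-17 is below half the spacing between 1.0 and the next double,
    // so 1.0 + E rounds back to exactly 1.0.
    println(1.0 + 1e-17 == 1.0)
    // The tail therefore collapses to 0.0 and phi is Infinity.
    println(phi(1e-17))
  }
}
```

The second and third lines printed are `true` and `Infinity`, which matches the sudden jump seen in the logged phi values.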
However, the expression within the log can be rearranged algebraically:

1 - 1/(1+E) = (1+E)/(1+E) - 1/(1+E) = E/(1+E)

And the phi calculation becomes

phi = -log10(E/(1+E))

Now for small values of E, the loss of precision on 1+E means that E/(1+E) = E/1 = E and phi = -log10(E). The maximum value of phi is then constrained only by the largest negative number the log10() function can produce, i.e. the largest negative exponent of a double (approx. -323). The approximation E/(1+E) = E for small values of E seems like the more correct one for floating point calculations with limited precision.

In testing this alternate formulation, however, I have found that values of E = Infinity are possible, presumably due to large enough negative values of y in `math.exp(-y * (1.5976 + 0.070566 * y * y))`. The previous formulation handles this correctly and gives 1 for 1 - 1/(1+Infinity), whereas Infinity/(1+Infinity) yields NaN. Therefore my solution is to use the new calculation for values that are greater than the mean, and the old calculation otherwise. My final version of the phi function is:

```scala
private[apakgroup] def phi(timeDiff: Long, mean: Double, stdDeviation: Double): Double = {
  val y = (timeDiff - mean) / stdDeviation
  val e = math.exp(-y * (1.5976 + 0.070566 * y * y))
  if (timeDiff > mean)
    // Above the mean e is small: e / (1 + e) keeps its precision.
    -math.log10(e / (1.0 + e))
  else
    // At or below the mean e may overflow to Infinity: the original form stays finite.
    -math.log10(1.0 - 1.0 / (1.0 + e))
}
```

This seems robust in the testing I've done so far, yielding the same value of phi as previously for phi < 15 approx., but allowing me to usefully set a very high threshold (e.g. 250).
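The behaviour of the two formulations can be compared side by side. This is a hedged sketch rather than the detector's actual code: the object name `PhiComparison`, the function names, and the choice to parameterise directly on the normalised distance y (instead of `timeDiff`, `mean` and `stdDeviation`) are assumptions made for illustration.

```scala
object PhiComparison {
  // E as defined in the text, as a function of y = (timeDiff - mean) / stdDeviation.
  def e(y: Double): Double = math.exp(-y * (1.5976 + 0.070566 * y * y))

  // Original formulation: -log10(1 - 1/(1+E)).
  def phiOriginal(y: Double): Double =
    -math.log10(1.0 - 1.0 / (1.0 + e(y)))

  // Rearranged formulation: -log10(E/(1+E)).
  def phiRearranged(y: Double): Double = {
    val ev = e(y)
    -math.log10(ev / (1.0 + ev))
  }

  // Combined version: rearranged form above the mean (y > 0), original form otherwise.
  def phiCombined(y: Double): Double =
    if (y > 0) phiRearranged(y) else phiOriginal(y)

  def main(args: Array[String]): Unit = {
    for (y <- Seq(-30.0, 0.0, 3.0, 8.0, 12.0)) {
      println(s"y=$y original=${phiOriginal(y)} rearranged=${phiRearranged(y)} combined=${phiCombined(y)}")
    }
  }
}
```

With these sample inputs, the two forms agree at y = 0 and y = 3. At y = 8 the original form returns Infinity while the rearranged form returns roughly 21.2. At y = -30 the exponential overflows, so the rearranged form returns NaN while the original form stays at zero, which is why the combined version switches on the sign of y.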

### bantonsson commented Nov 6, 2013

Hi @robdavid,

That is a very nice bug report and analysis/solution. Akka doesn't use GitHub for issue tracking. Could you please open a ticket in Assembla here: http://www.assembla.com/spaces/akka/tickets, following the instructions here: http://doc.akka.io/docs/akka/current/project/issue-tracking.html

Since you seem to have solved the problem, could you open a pull request with your contribution as well? https://github.com/akka/akka/blob/master/CONTRIBUTING.md

### robdavid commented Nov 6, 2013

 OK, ticket #3706 created.

### bantonsson commented Nov 6, 2013

 Thank you very much.
