# A Note about Underflow and Overflow

In [1]:
import numpy as np

A computer represents a double-precision (IEEE 754 binary64) number using 64 bits.  Of these, 1 is for the sign of the number, 11 are for the exponent, and 52 are for the fraction.  The fraction encodes the significand, a number at least 1 and less than 2 whose leading 1 is implicit, so there are 53 bits of precision.  The exponent is stored with a bias rather than with its own sign bit, and for normal numbers it runs from $-1022$ to $1023$.  Both the significand and the exponent are written in base 2.  So the largest number that may be represented is

$(2 - 2^{-52}) \times 2^{2^{10} - 1} = 2^{1024} - 2^{971} \approx 1.8 \times 10^{308}$.

The smallest positive number that may be represented at full precision is

$2^{-1022} \approx 2.2 \times 10^{-308}$,

but below this the format allows subnormal numbers, where the fraction can be all zeros with a one in the last place.  So the smallest positive number is actually

$2^{-1022 - 52} = 2^{-1074} \approx 5 \times 10^{-324}$.

Let's see.

In [2]:
print(2.**1023)
print(2.**1024)  # Overflow!

8.98846567431158e+307


OverflowError: (34, 'Numerical result out of range')

In [3]:
print(2.**(-1074))
print(2.**(-1075))  # Underflow!

5e-324
0.0
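
Rather than finding these limits by trial and error, you can ask NumPy and Python for them.  The following is a minimal sketch; the variable names are my own, and it assumes the platform uses standard IEEE 754 doubles (true on essentially all current hardware).

```python
import sys
import numpy as np

info = np.finfo(np.float64)

largest = float(info.max)              # largest finite double
smallest_normal = float(info.tiny)     # smallest positive normal double
# The next double after 0.0 in the direction of 1.0 is the smallest subnormal.
smallest_subnormal = float(np.nextafter(0.0, 1.0))

print(largest)
print(smallest_normal)
print(smallest_subnormal)

# Python's own float type reports the same limits.
print(sys.float_info.max, sys.float_info.min)
```

Halving `smallest_subnormal` gives exactly `0.0`, which is the underflow seen above.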


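This can cause trouble with likelihoods when you multiply them together.  If you have a bunch of perfectly reasonable likelihoods, say 0.5 each, and you multiply 10,000 of them together, the product is $0.5^{10000} = 2^{-10000}$.  That is far below the smallest representable number, so it underflows to zero.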

In [4]:
likelihoods = np.ones(10000)*0.5
print(np.prod(likelihoods))

0.0


Working with logs helps in cases like this.  The product of the likelihoods underflows to zero (and a function that is identically zero is not amenable to optimization), but I can take the sum of the log likelihoods instead to find the log of the total likelihood.  This is much more robust to underflow or overflow.

In [5]:
print(np.sum(np.log(likelihoods)))

-6931.471805599453
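
Taking `np.log` of each likelihood only works if the individual likelihoods have not already underflowed.  When they might, it is usually safer to write the log likelihood analytically and never exponentiate.  Here is a sketch for a Gaussian likelihood; the function name `gaussian_log_likelihood` and the toy data are hypothetical, chosen only to illustrate the point.

```python
import numpy as np

def gaussian_log_likelihood(data, mu, sigma):
    """Sum of log N(data | mu, sigma^2), computed without calling exp."""
    z = (data - mu) / sigma
    return np.sum(-0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi))

# Toy data sitting 50 sigma away from the model: each individual
# likelihood is proportional to exp(-1250), which underflows to zero.
data = np.full(10000, 50.0)
single_likelihood = np.exp(-0.5 * 50.0**2) / np.sqrt(2.0 * np.pi)
print(single_likelihood)

# The log likelihood is large and negative, but perfectly finite.
logL = gaussian_log_likelihood(data, mu=0.0, sigma=1.0)
print(logL)
```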


In MCMC, for example, you only ever need likelihood ratios.  You can compute the ratio between two likelihoods as

$\frac{{\cal L}_1}{{\cal L}_2} = \exp \left[ \log {\cal L}_1 - \log {\cal L}_2 \right]$

Notice that you never need to represent the likelihood itself using floating point.  You can take the difference of two logs first and _then_ exponentiate.  This is usually what you want to do.
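
As an illustration, here is a sketch of a likelihood ratio and a Metropolis-style acceptance test done in log space.  The function `accept` and the example log likelihoods are hypothetical, and the acceptance rule assumes a symmetric proposal distribution.

```python
import numpy as np

# Two log likelihoods whose likelihoods would both underflow to zero.
log_L1 = -6931.5
log_L2 = -6932.5

# Subtract the logs first, then exponentiate.
ratio = np.exp(log_L1 - log_L2)
print(ratio)

def accept(log_L_current, log_L_proposed, rng):
    """Accept with probability min(1, L_proposed / L_current)."""
    log_ratio = log_L_proposed - log_L_current
    # Comparing log(u) with the log ratio avoids calling exp on a large
    # positive number, which could overflow.
    u = rng.uniform()
    # Guard the (very unlikely) draw u == 0 so np.log emits no warning.
    log_u = np.log(u) if u > 0.0 else -np.inf
    return bool(log_u < log_ratio)

rng = np.random.default_rng(0)
print(accept(log_L2, log_L1, rng))  # proposal is more likely: always accepted
```

Exponentiating each log likelihood separately would give 0/0 here, while the difference of the logs is just 1.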