# Accuracy 

## Overflow and underflow

In Python 3, integers can hold arbitrarily large values (limited only by available system memory).

In [1]:
bigNum = 2**1024

bigNum

179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216

However, `float`s can only represent values between $\sim\!10^{-308}$ (there are subtle caveats to this statement) and $\sim\!10^{308}$.

In [2]:
1e308, 1e308*10

(1e+308, inf)

The behavior above is referred to as "overflow."  `inf` is short for infinity, and behaves roughly as one would expect.

In [3]:
inf = float("inf")

print("type(inf): {}".format(type(inf)),
      "inf/1e308: {}".format(inf/1e308),
      "inf/inf: {}".format(inf/inf),
      sep="\n")

type(inf): <class 'float'>
inf/1e308: inf
inf/inf: nan
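Not every overflow ends in a silent `inf`. As a minimal sketch (the helper name `describe_overflow` is hypothetical and introduced only for illustration), the following shows that multiplication quietly saturates to `inf`, whereas the `**` operator, `math.exp`, and int-to-float conversion raise `OverflowError` in CPython:

```python
import math

def describe_overflow(func):
    """Call func() and report how an overflow (if any) shows up.

    Hypothetical helper for illustration only.
    """
    try:
        result = func()
    except OverflowError:
        return "OverflowError"
    return "inf" if math.isinf(result) else "finite"

print(describe_overflow(lambda: 1e308 * 10))      # multiplication saturates to inf
print(describe_overflow(lambda: 10.0 ** 400))     # float power raises
print(describe_overflow(lambda: math.exp(1000)))  # math functions raise
print(describe_overflow(lambda: float(2**1024)))  # int too large for a float
```

In other words, checking for `inf` after the fact catches only some overflows; the others need a `try`/`except` block.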


Python uses `nan` (not a number) to represent undefined or unrepresentable values. It has the unusual property that it is not equal to itself.

In [4]:
x = inf/inf

x == x

False

The opposite scenario, in which a calculated value is too small, is referred to as "underflow."

In [5]:
1e-323, 1e-323/10

(1e-323, 0.0)

The case of underflow is more subtle than overflow.  There is a gradual loss of precision for values below $\sim\!10^{-308}$.

In [6]:
smallNum = 1.123456789123456789e-307  

while smallNum>0:
    print(smallNum)
    smallNum /= 10

1.1234567891234568e-307
1.123456789123457e-308
1.12345678912346e-309
1.12345678912344e-310
1.1234567891235e-311
1.123456789124e-312
1.1234567891e-313
1.123456789e-314
1.12345679e-315
1.1234568e-316
1.123457e-317
1.123456e-318
1.12346e-319
1.1235e-320
1.12e-321
1.14e-322
1e-323
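The gradual loss of digits in the output above comes from subnormal (denormal) numbers, which fill the gap between zero and the smallest normalized `float`. A small sketch, assuming IEEE 754 double precision (which CPython uses on essentially all common platforms):

```python
import sys

smallest_normal = sys.float_info.min      # smallest normalized float, 2**-1022

# One unit in the last place of the smallest normal number is the
# smallest positive subnormal value, 2**-1074.
smallest_subnormal = smallest_normal * sys.float_info.epsilon

print(smallest_subnormal)       # 5e-324
print(smallest_subnormal / 2)   # 0.0: nothing smaller is representable
```

Each step below `sys.float_info.min` costs bits of the significand, which is why the printed values above keep fewer and fewer digits.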


Other numerical types have limitations as well.

### Warning

Overflow and underflow can silently cause logical errors. For example, two different values that both overflow compare as equal.


In [7]:
a, b = 1.0e500, 7.0e500

a, b, a == b

(inf, inf, True)

The math module includes functions to test if a variable is `inf` or `nan`.

In [8]:
import math

math.isinf(a), math.isnan(x)

(True, True)
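Because `nan` compares unequal to everything and every overflowed positive value collapses to the same `inf`, it is often safer to screen values explicitly before comparing them. A minimal sketch using `math.isfinite` (the function name `finite_only` is a hypothetical helper, not part of the standard library):

```python
import math

def finite_only(values):
    """Return only the values that are neither inf nor nan."""
    return [v for v in values if math.isfinite(v)]

data = [1.0, float("inf"), float("nan"), -2.5, float("-inf")]
print(finite_only(data))   # [1.0, -2.5]
```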

## `sys` module

Python has a `sys` module that provides information about the interpreter and the platform it is running on: https://docs.python.org/3.6/library/sys.html

In [9]:
import sys
sys.platform

'darwin'

The attribute `float_info` gives information on the maximum and minimum `float` values, as well as other info.

In [10]:
sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
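A few of these fields can be checked directly. The sketch below assumes IEEE 754 double precision, as reported in the output above (`radix=2`, `mant_dig=53`); on other platforms the numbers would differ.

```python
import sys

info = sys.float_info

# epsilon is the gap between 1.0 and the next larger float: 2**-(mant_dig - 1).
print(info.epsilon == 2.0 ** -(info.mant_dig - 1))   # True

# Adding half of epsilon to 1.0 rounds back to 1.0.
print(1.0 + info.epsilon / 2 == 1.0)                 # True
print(1.0 + info.epsilon == 1.0)                     # False

# min is the smallest *normalized* float; smaller subnormal values still exist.
print(info.min / 2 > 0.0)                            # True
```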

## Float precision and numerical error

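Since we have a finite number of bits to represent each value, irrational numbers must be rounded.  However, many rational numbers must be rounded as well: any value that needs more than about 16 significant decimal digits, or that has no finite binary expansion, is stored only approximately, leading to rounding errors.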

In [11]:
0.1 + 0.1 + 0.1

0.30000000000000004

Why didn't Python calculate 0.3?  To understand some issues related to rounding, we must consider the binary representation of variables.  In memory, numerical values are represented in base 2 (binary).  While 0.1 has a finite representation in base 10, in base 2 it is the infinitely repeating fraction 0.0<span style="color:red">0011</span><span style="color:orange">0011</span><span style="color:yellow">0011</span><span style="color:green">0011</span><span style="color:blue">0011</span><span style="color:purple">0011</span>...  Therefore, this value can't be represented exactly with a finite number of bits, resulting in rounding error.
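The value actually stored for `0.1` can be inspected exactly with the standard `fractions` and `decimal` modules. This is only an illustration of the rounding described above:

```python
from decimal import Decimal
from fractions import Fraction

stored = Fraction(0.1)            # exact rational value of the float 0.1
print(stored)                     # 3602879701896397/36028797018963968
print(stored == Fraction(1, 10))  # False: the float is not exactly 1/10
print(Decimal(0.1))               # exact decimal expansion of the stored value
```

The denominator is a power of two ($2^{55}$), which is why the stored value can only approximate $1/10$.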

------------------

### Equality testing

Due to issues related to numerical precision, you should never test for equality of two `float`s.

In [12]:
x=49.0

x*(1/49.0) == 1

False

Instead, you should test if two `float`s have a difference less than some small number.

In [13]:
x = 1.1+2.2
epsilon = 1e-12

print(x == 3.3, abs(x-3.3)<epsilon)

False True


Alternatively, you can use the `isclose` function (from the math module).

In [14]:
import math

math.isclose(x, 3.3)

True
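By default, `math.isclose` uses only a relative tolerance (`rel_tol=1e-09`, `abs_tol=0.0`), so a comparison against zero needs an explicit absolute tolerance. A short sketch, with tolerance values chosen purely for illustration:

```python
import math

tiny = 1e-12

print(math.isclose(tiny, 0.0))                 # False: relative tolerance only
print(math.isclose(tiny, 0.0, abs_tol=1e-9))   # True
print(math.isclose(1.1 + 2.2, 3.3))            # True
```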

Also note that floating point addition is not associative.

In [16]:
a, b, c = 1e14, 25.44, 0.74

(a+b)+c == a+(b+c)

False

Nor is it distributive.

In [17]:
a, b, c = 100, 0.1, 0.2

a*(b+c) == a*b + a*c

False
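When exact results matter more than speed, rational arithmetic avoids these effects. A minimal sketch using the standard `fractions` module, with the same values as above written as exact fractions:

```python
from fractions import Fraction

a, b, c = Fraction(100), Fraction(1, 10), Fraction(2, 10)

print(a * (b + c) == a * b + a * c)   # True: exact arithmetic is distributive
print((a + b) + c == a + (b + c))     # True: and associative

# The float versions differ by a tiny rounding error.
print(100 * (0.1 + 0.2), 100 * 0.1 + 100 * 0.2)
```

`Fraction` is much slower than `float`, so this is usually a tool for checking results rather than a general replacement.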

## Loss of significance

Suppose we want to calculate the difference between two numbers, 1.2345432 and 1.23451.  With full precision, the difference is 0.0000332.

However, if we calculated the difference on a machine with only 6 digits of precision, (after rounding) we would calculate the difference as $1.23454-1.23451 = 0.00003$.  Initially, we had two numbers, with 8 and 6 significant figures.  However, the result of our calculation has only a single significant figure.  This is a common phenomenon when taking the difference of two similar numbers.

A similar phenomenon occurs when a small number is added/subtracted to a large number.  Suppose we want to add the numbers 12345.6 and 0.123456.  Again, on a machine with only 6 digits of precision, we would calculate the sum as $12345.6 + 0.123456 = 12345.7$, resulting in an error of 0.023456.
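The six-digit machine described above can be imitated with the standard `decimal` module by limiting the context precision to 6 significant digits. This is a sketch for illustration; the variable names are arbitrary.

```python
from decimal import Context

six_digits = Context(prec=6)   # results are rounded to 6 significant digits

# Subtracting two nearly equal numbers: most digits cancel.
x = six_digits.create_decimal("1.2345432")   # stored as 1.23454
y = six_digits.create_decimal("1.23451")
print(six_digits.subtract(x, y))             # 0.00003

# Adding a small number to a large one: the small digits are rounded away.
big = six_digits.create_decimal("12345.6")
small = six_digits.create_decimal("0.123456")
print(six_digits.add(big, small))            # 12345.7
```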

In reality, Python `float`s have about 15 to 16 significant decimal digits of precision, so this is a smaller effect.  But it can still cause problems.

Let's assume we want to calculate the difference between two numbers, $x=1+\sqrt{2}\times10^{-14}$ and $y=1$.  With infinite precision, the difference is $\sqrt{2}\times10^{-14}$ or approximately $1.41421356237309504\times10^{-14}$.  However, Python gives a different result.

In [18]:
x = 1 + math.sqrt(2)*1e-14
y = 1

y-x

-1.4210854715202004e-14

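Apart from the sign (the cell computes `y-x` rather than `x-y`), this agrees with the true value only to two significant figures, i.e. one decimal place.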

To avoid such issues, the `fsum()` function (within the math module) can be used: https://docs.python.org/3/library/math.html#math.fsum

In [19]:
l = [.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]

sum(l), math.fsum(l)

(0.9999999999999999, 1.0)

And returning to the difference of $1+\sqrt{2}\times10^{-14}$ and $1$:

In [20]:
math.fsum([1,math.sqrt(2)*1e-14,-1])

1.4142135623730951e-14

This agrees with the true value to 15 decimal places.
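Note that `fsum()` helps here only because the three terms are passed separately. Once `1 + math.sqrt(2)*1e-14` has been stored in a single `float`, the low-order digits are already gone, as the following sketch illustrates:

```python
import math

x = 1 + math.sqrt(2) * 1e-14     # rounding happens here

# fsum cannot recover digits that were lost when x was created.
print(math.fsum([x, -1]))                         # 1.4210854715202004e-14
# Passing the terms separately lets fsum track the exact sum.
print(math.fsum([1, math.sqrt(2) * 1e-14, -1]))   # 1.4142135623730951e-14
```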