# Numerical Methods - Biswa Datta

## 3.2.4 Catastrophic Cancellation

When a computer performs arithmetic operations such as addition or subtraction, the result is not always what you would expect, or rather, what it should be.

Computers perform fixed-precision arithmetic: a standard 64-bit floating-point number carries only 53 bits of precision. This has real [consequences](http://www-users.math.umn.edu/~arnold/disasters/).

One of the problems seen when working with limited precision is known as Catastrophic Cancellation: the loss of significant digits when a computer subtracts two nearly equal numbers. The information is actually lost earlier, in the **rounding off** of the operands, but the consequences only become apparent when we subtract. To showcase this concept I'll use *decimal*, a Python standard-library module that offers decimal arithmetic with user-settable precision.

Say I want to compute $f(0.01)$, working with only $5$ significant digits so that the effect is easy to see, where $f$ is,

$$f(x) = e^x - 1 - x$$

In 5-digit arithmetic we can't hold the whole of $e$, so we round $2.71828182845904$ **up** to $2.7183$, since the first dropped digit is $8 \geq 5$.


$$ f_5(0.01) = (2.7183)^{0.01}-1-0.01$$

Here $(2.7183)^{0.01} = 1.0100502346051552$, but again remember that only 5 significant digits are allowed! The dropped part ($502\ldots$) is more than half a unit in the last kept place, so we round up to $1.0101$.
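The two rounding steps above can be reproduced with a short sketch. It assumes round-to-nearest with 5 significant digits and uses `Context.create_decimal` from the standard `decimal` module; the names `ctx5`, `e_rounded` and `pow_rounded` are illustrative only.

```python
from decimal import Context, ROUND_HALF_EVEN

# A context that keeps 5 significant digits and rounds to nearest
# (the rounding mode is an assumption made for this sketch).
ctx5 = Context(prec=5, rounding=ROUND_HALF_EVEN)

# create_decimal applies the context precision while converting
e_rounded = ctx5.create_decimal("2.71828182845904")
pow_rounded = ctx5.create_decimal("1.0100502346051552")

print(e_rounded)    # e kept to 5 significant digits
print(pow_rounded)  # (2.7183)^0.01 kept to 5 significant digits
```

Passing the values as strings avoids picking up the binary representation error of a Python float before the rounding is applied.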

At this point, note that the rounding we did didn't introduce any significant error: just $\approx 5 \times 10^{-5}$, which is expected considering that we are using only 5 significant digits.

$$\epsilon = |1.0100502346051552 - 1.0101 | = 4.976539484480291 \times 10^{-5}$$
$$\eta = \frac{4.976539484480291 \times 10^{-5}}{1.0100502346051552} = 4.927021759888705 \times 10^{-5}$$

Now comes the "cancellation",

$$ f_5(0.01) = 1.0101-1-0.01 = 0.0001$$

In [2]:
import math
from decimal import Decimal, getcontext
from math import e

# Work with 5 significant digits
precision = 5
getcontext().prec = precision

val = 0.0100
decX = Decimal(val)
# Adding Decimal(0) forces the value to be rounded to the context precision
decE = Decimal(e) + Decimal(0)
print(decE)

def fun(x):
    print("i.e. e raised to x equals")
    print(decE**x)
    # Each operation is rounded to 5 significant digits
    return decE**x - Decimal(1) - x

print("\nRemoving 1.0 and 0.01 we have \nf(0.01) = " + '{0:.4f}'.format(fun(decX)))

2.7183
i.e. e raised to x equals
1.0101

Removing 1.0 and 0.01 we have 
f(0.01) = 0.0001


So this looks OK, right? It is not the exact value, but since we carried 5 significant digits it should at least be close to the true answer.

But no, the actual answer is $0.00005$ (to 5 decimal places).

In [3]:
# Moving on to use all 53 bits of precision offered by a Python float
real = e**val - 1 - val
print("\nThe real answer is")
print('{0:.10f}'.format(real))

print("\nwhich we get by subtracting x and 1 from e^x,")
print(e**val)                   # e^x
print(e**val - val)             # e^x - x
print('{0:.10f}'.format(real))  # e^x - x - 1


The real answer is
0.0000501671

which we get by subtracting x and 1 from e^x,
1.010050167084168
1.000050167084168
0.0000501671


Notice how the relative error is now around $1$, roughly 20,000 times worse than the $\approx 4.9 \times 10^{-5}$ we had before the subtraction! That is a percentage error of $99.33\%$.

$$\epsilon = |0.0000501671 - 0.0001| = 4.98329 \times 10^{-5}$$
$$\eta = \frac{4.98329 \times 10^{-5}}{0.0000501671} = 0.9933382635233051$$


This error was introduced when we "rounded to nearest" and started working with $1.0101$ instead of $1.01005\ldots$. Note that even if we had rounded the other way (to $1.0100$), we would still have an error of about the same size.

But the reason the [relative error](https://en.wikipedia.org/wiki/Approximation_error#Formal_Definition) shot up is that the "actual" value became close to zero when we subtracted the close numbers. And **the closer you are to zero, the more important the significant digits that we lost become.**

Now the Wikipedia article on [Catastrophic Cancellation](https://en.wikipedia.org/wiki/Loss_of_significance) should make sense. Note that the absolute error is still around $5 \times 10^{-5}$; not much change there. It is the relative error that increased substantially, and the key point is that a fixed absolute error turns into a large relative error when the true value is near zero.
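To make the comparison concrete, here is a small sketch that recomputes both error measures before and after the subtraction, using the numbers quoted in the text. The helper `abs_and_rel_error` is a hypothetical name introduced only for this illustration.

```python
def abs_and_rel_error(exact, approx):
    """Return (absolute error, relative error) of approx with respect to exact."""
    abs_err = abs(exact - approx)
    return abs_err, abs_err / abs(exact)

# Before the subtraction: e^x versus its 5-digit rounding
eps_before, eta_before = abs_and_rel_error(1.0100502346051552, 1.0101)

# After the subtraction: f(0.01) versus the 5-digit result
eps_after, eta_after = abs_and_rel_error(0.0000501671, 0.0001)

print("absolute error before/after:", eps_before, eps_after)
print("relative error before/after:", eta_before, eta_after)
print("relative error grew by a factor of", eta_after / eta_before)
```

The absolute errors come out nearly identical, while the relative error grows by four orders of magnitude, which is exactly the pattern described above.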

A solution to this problem is to compute the same quantity in an alternative way, ideally one that doesn't involve catastrophic cancellation.

In our case we know,

$$e^x = 1+x+\frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$

So we can find $f(x)$ by adding positive numbers, avoiding subtraction altogether (all the terms are positive only when $x$ is positive):

$$f(x) = e^x - 1 - x = \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$


In [4]:
# Alternatively, use the power series
temp = Decimal(0)

k = 3
for i in range(2, k + 2):
    # math.factorial needs an int, so pass i directly rather than a Decimal
    temp = temp + decX**i / math.factorial(i)
    # Even a single term of the series already gives the right result
    print("\n\nSum of the first " + str(i - 1) + " term(s):")
    print(temp)



Sum of the first 1 term(s):
0.00005000


Sum of the first 2 term(s):
0.000050167


Sum of the first 3 term(s):
0.000050167
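The same effect is not limited to 5-digit arithmetic. The sketch below repeats the comparison with ordinary Python floats for a much smaller argument. The choice $x = 10^{-9}$ and the function names `f_naive` and `f_series` are assumptions made for this illustration, not part of the example above.

```python
import math

def f_naive(x):
    # Direct evaluation: subtracts nearly equal numbers when x is small
    return math.exp(x) - 1.0 - x

def f_series(x, terms=4):
    # x^2/2! + x^3/3! + ... : only additions of same-sign terms for x > 0
    total = 0.0
    for n in range(2, 2 + terms):
        total += x**n / math.factorial(n)
    return total

x = 1e-9
naive = f_naive(x)
series = f_series(x)

print("naive :", naive)
print("series:", series)
print("relative difference:", abs(naive - series) / series)
```

For this $x$ the true value is close to $x^2/2 = 5 \times 10^{-19}$, far below the spacing of floats near $1$, so the naive result is dominated by rounding noise while the series stays accurate.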


### A different example

Finally, I'd like to mention a popular example: finding the roots of a quadratic equation.

$$ax^2+bx+c = 0$$

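When $\sqrt{b^2-4ac} \approx |b|$, one root is likely to be inaccurate if we compute it directly from the formula. For $b > 0$ this is the root

$$x_1 = \frac{-b+\sqrt{b^2-4ac}}{2a}$$

And this is not an unlikely situation: a small $c$ or $a$ makes $4ac$ negligible compared to $b^2$.

In that case the smart way is to first find the other root, where the two terms in the numerator have the same sign and so nothing cancels,

$$x_2 = \frac{-b-\sqrt{b^2-4ac}}{2a}$$

The first root then follows from the product of the roots, $x_1 x_2 = \frac{c}{a}$:

$$x_1 = \frac{\frac{c}{a}}{\frac{-b-\sqrt{b^2-4ac}}{2a}}$$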

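In the example below, still in 5-digit arithmetic, the formula gives $-0.0005$ for root one where the cancellation-free computation gives $-0.00050013$. That is a relative error of about $0.026\%$, roughly five times the worst-case $0.005\%$ of a single rounding to 5 digits.

In [5]:
a = Decimal(1)
b = Decimal(2)
c = Decimal(10**-3)


# Formula with cancellation: -b and sqrt(b^2-4ac) are nearly equal in size
estiRoot1 = (-b+(b**2-4*a*c).sqrt())/(2*a)
print("Root one is "+str(estiRoot1))

# No cancellation here: both terms in the numerator are negative
estiRoot2 = (-b-(b**2-4*a*c).sqrt())/(2*a)
print("Root two is "+str(estiRoot2))

# Recover root one from the product of the roots, c/a
actualRoot1 = (c/a)/estiRoot2
print("But the actual root one is "+'{0:.8f}'.format(actualRoot1))

print("The error is because of how sqrt(b^2-4ac) approx b, i.e. = "+str((b**2-4*a*c).sqrt()))

# Relative error of the first estimate of root one against the stable value
error = abs(actualRoot1 - estiRoot1)/abs(actualRoot1)
print("The error in percentage is "+str(100*error))

Root one is -0.0005
Root two is -1.9995
But the actual root one is -0.00050013
The error is because of how sqrt(b^2-4ac) approx b, i.e. = 1.9990
The error in percentage is 0.025993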
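The effect is far more dramatic when $4ac$ is tiny compared to $b^2$. The sketch below uses ordinary Python floats and the illustrative coefficients $a = 1$, $b = 10^8$, $c = 1$, which are an assumption chosen for this demonstration; the true roots are very close to $-10^8$ and $-10^{-8}$. The function names are hypothetical, and the code assumes real roots and $a \neq 0$.

```python
import math

def naive_roots(a, b, c):
    # Textbook formula; one of the two roots suffers from cancellation
    s = math.sqrt(b * b - 4 * a * c)
    return (-b + s) / (2 * a), (-b - s) / (2 * a)

def stable_roots(a, b, c):
    # Add b and the square root with matching signs so nothing cancels,
    # then recover the other root from the product of the roots, c/a.
    s = math.sqrt(b * b - 4 * a * c)
    q = -0.5 * (b + math.copysign(s, b))
    if q == 0.0:
        return 0.0, 0.0  # only happens when b = 0 and c = 0
    return c / q, q / a  # (small-magnitude root, large-magnitude root)

a, b, c = 1.0, 1e8, 1.0
naive_small, naive_large = naive_roots(a, b, c)
stable_small, stable_large = stable_roots(a, b, c)

print("naive small root :", naive_small)
print("stable small root:", stable_small)
print("relative error of naive small root:",
      abs(naive_small - stable_small) / abs(stable_small))
```

Using `math.copysign` makes the same code work for negative $b$, where it is the other sign choice in the formula that cancels.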
