## 2.1 Computer Arithmetic

Some knowledge of how computers perform numerical computations and how programming
languages work is useful in applied numerical work, especially if one is to
write efficient programs and avoid errors. It often comes as an unpleasant surprise to
many people to learn that exact arithmetic and computer arithmetic do not always
give the same answers, even in programs without programming errors.

Typically, computer languages such as Fortran and C allow several ways of representing
a number.

The exact details of the representation depend on the hardware, but it will
suffice for our purposes to suppose that floating point numbers are stored in the form
$m2^n$, where $m$ and $n$ are integers with $-2^b \leq m < 2^b$ and $-2^d \leq n < 2^d$.
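The $m2^n$ form can be inspected directly. The following is a minimal sketch, assuming the interpreter stores floats as IEEE 754 double precision numbers (so the mantissa fits in 53 bits); the helper name `decompose` is hypothetical and introduced only for illustration.

```python
import math

def decompose(x):
    """Return integers (m, n) with x == m * 2**n (hypothetical helper, finite x only)."""
    frac, exp = math.frexp(x)   # x == frac * 2**exp with 0.5 <= |frac| < 1
    m = int(frac * 2**53)       # scale the fraction up to an integer mantissa
    n = exp - 53                # compensate for the scaling in the exponent
    return m, n

m, n = decompose(0.1)
print('0.1 is stored as {} * 2**{}'.format(m, n))

# Rebuilding the number from its integer parts recovers it exactly
print(math.ldexp(m, n) == 0.1)
```

The decimal value 0.1 has no finite binary expansion, so the stored integers describe the nearest representable number rather than 0.1 itself.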

The obvious way of computing this term will result in loss of precision.

These arise not only from overflow but from division by 0.

In addition, floating point numbers may get set to $nan$, which stands for not-a-number.
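A short sketch of how these special values behave in Python follows. The variable names are illustrative only. Note that plain Python floats raise an exception on division by zero rather than returning an infinite value, which differs from the behavior of some other languages and libraries.

```python
import math

inf = float('inf')
nan = float('nan')

overflowed = 1e308 * 10   # exceeds the largest double, so the result is inf
undefined = inf - inf     # has no meaningful value, so the result is nan

print(overflowed, undefined)
print(nan == nan)         # nan compares unequal to everything, itself included
print(math.isinf(overflowed), math.isnan(undefined))

# Plain Python floats raise an error on division by zero instead of returning inf
try:
    1.0 / 0.0
except ZeroDivisionError as err:
    print('ZeroDivisionError:', err)
```

Because `nan == nan` is false, tests for not-a-number values should use `math.isnan` rather than an equality comparison.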

Roundoff error is only one of the pitfalls in evaluating mathematical expressions.
In numerical computations, error is also introduced by the computer's inherent inability
to evaluate certain mathematical expressions exactly. For all its power, a computer
can only perform a limited set of operations in evaluating expressions. Essentially this
list includes the four arithmetic operations of addition, subtraction, multiplication
and division, as well as logical operations of comparison.

Other common functions, such as exponential, logarithmic, and trigonometric functions, cannot be evaluated directly using computer arithmetic. They can only be evaluated approximately using algorithms based on the four basic arithmetic operations.

For the common functions very efficient algorithms typically exist and these are
sometimes "hardwired" into the computer's processor or coprocessor. An important
area of numerical analysis involves determining efficient approximations that can be
computed using basic arithmetic operations.



$$\exp(x) = \sum_{n=0}^{\infty} \frac{x^{n}}{n!}$$

Obviously one cannot compute the infinite sum, but one could compute a finite number
of these terms, with the hope that one will obtain sufficient accuracy for the
purpose at hand. The result, however, will always be inexact.
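The truncation idea can be sketched as follows. This is an illustration only, not the algorithm that `math.exp` or the processor actually uses; the function name `exp_taylor` and the choice of term counts are assumptions made for the example.

```python
import math

def exp_taylor(x, terms):
    """Approximate exp(x) by the first `terms` terms of its Taylor series."""
    total = 0.0
    term = 1.0                  # the n = 0 term, x**0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)     # turn x**n / n! into x**(n+1) / (n+1)!
    return total

# The error shrinks as terms are added but is limited by roundoff in the end
for terms in (2, 5, 10, 20):
    approx = exp_taylor(1.0, terms)
    print('{:2d} terms: {:0.16f}  error {:0.2e}'.format(
        terms, approx, abs(approx - math.exp(1.0))))
```

Each term is built from the previous one with a single multiplication and division, so only the basic arithmetic operations are used and no factorial is ever formed explicitly.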

In [30]:
#Let's check Finite Precision Arithmetic
x=(1e-5 + 1) - 1
print('Case a) x is equal to {:0.10f} '.format(x))
x=1e-5 + (1 - 1)
print('Case b) x is equal to {:0.10f} '.format(x))
print('_____________________________')
x=(1e-5 + 1) - 1
print('Case a) x is equal to {:0.20f} '.format(x))
x=1e-5 + (1 - 1)
print('Case b) x is equal to {:0.20f} '.format(x))
print('_____________________________')
x=(1e-14 + 1) - 1
print('Case a) x is equal to {:0.20f} '.format(x))
x=1e-14 + (1 - 1)
print('Case b) x is equal to {:0.20f} '.format(x))


Case a) x is equal to 0.0000100000 
Case b) x is equal to 0.0000100000 
_____________________________
Case a) x is equal to 0.00001000000000006551 
Case b) x is equal to 0.00001000000000000000 
_____________________________
Case a) x is equal to 0.00000000000000999201 
Case b) x is equal to 0.00000000000001000000 


In [31]:
#Let's subtract two large, nearly equal numbers
x=1000000.2-1000000.1
print('x is equal to {:0.20f} '.format(x))

x is equal to 0.09999999997671693563 


In [32]:
#This is still a number
1e308

1e+308

In [33]:
#This causes overflow
1e309

inf

In [34]:
#inf/inf has no meaningful value, so the result is nan
1e309/1e309

nan

In [35]:
#Get the largest number
#https://stackoverflow.com/questions/1835787/what-is-the-range-of-values-a-float-can-have-in-python
import sys
sys.float_info.max

1.7976931348623157e+308

In [36]:
#Get the smallest positive -normalized- number
sys.float_info.min

2.2250738585072014e-308

In [37]:
#Get denormalized minimum
sys.float_info.min*sys.float_info.epsilon

5e-324

In [38]:
#Get the machine precision
sys.float_info.epsilon

2.220446049250313e-16
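One way to read the machine precision reported above: it is the gap between 1.0 and the next larger representable number. The sketch below assumes IEEE 754 double precision with the default rounding mode; the name `eps` is just a local shorthand.

```python
import sys

eps = sys.float_info.epsilon

print(1.0 + eps > 1.0)        # eps is large enough to change 1.0
print(1.0 + eps / 2 == 1.0)   # half of eps is lost when the sum is rounded
print(eps == 2.0**-52)        # for doubles, eps is exactly 2**-52
```

This is why adding 1e-14 to 1 and subtracting 1 again, as in the earlier cell, returns a value that is only approximately 1e-14.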