# Numerical Differentiation and Machine Error 

Throughout physics, we need to be able to take derivatives of functions and data. However, this process can introduce large errors, so we need to proceed carefully.

## Learning objectives

After this lesson, you should be able to:
* Compute a numerical derivative using forward/backward differentiation
* Use different stencils
* Compare two floating point numbers
* Evaluate and communicate the numerical error resulting from performing differentiation discretely

## Derivatives

Estimating the derivative of a function is a very common task in scientific computing. The need arises, for example, when 
we have data that represent some dependent variable $f$ as a function of an independent variable $x$, and we would like 
to know the rate at which $f$ changes. If the data are generated from a numerical code, or from an experiment, then $f$ is 
only known at discrete 
values of $x$ and we cannot differentiate $f(x)$ analytically. We must resort to numerical techniques. 
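
For sampled data, the simplest estimate replaces the limit with the spacing between neighboring samples. The sketch below assumes two hypothetical arrays, `x_samples` and `f_samples`, generated here from $f(x) = x^2$ purely for illustration; with real data you would load these from a file instead.

```python
import numpy as np

# Hypothetical sampled data: in practice these arrays come from an
# experiment or a simulation. Here f = x**2 stands in for the data.
x_samples = np.linspace(0.0, 1.0, 11)
f_samples = x_samples**2

# Slope between neighboring samples: (f[i+1] - f[i]) / (x[i+1] - x[i]).
# The result has one fewer element than the input arrays.
slopes = np.diff(f_samples) / np.diff(x_samples)

print(slopes)
```

Each slope is an estimate of the derivative somewhere between two samples, which is why the rest of this lesson looks closely at where the estimate applies and how large its error is.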

Numerical differentiation can be difficult to do well. We cannot apply the definition 

$$
\label{derivativedef}
f'(x)  = \lim_{h\to 0}\frac{f(x+h)-f(x)}{h} 
\tag{1}
$$

directly if the data are discrete, because we cannot take the limit $h\to 0$. Even in cases where we can evaluate the 
function everywhere, this expression for the derivative is prone to an error known as _subtractive cancellation_.  Subtractive cancellation occurs if you make $h$ very small, as the definition requires. 

Consider for example the function $f(x) = \cos(x)\tanh(x)$. You can verify that its derivative at $x=2$ is

$$
	f'(2) = \cos(2){\rm sech}^2(2) - \sin(2)\tanh(2) \approx -0.905989
$$

However, if we use the definition (1) with $h = 10^{-16}$, the result in double precision is 

$$
	f'(2) \approx \frac{f(2+h) - f(2)}{h} = 0.0
$$

The answer is wrong because with double precision both $f(2+h)$ and $f(2)$ evaluate to $-0.40117702779274822$.  
With single precision, subtractive cancellation occurs for much larger values of $h$. 
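
The cancellation described above can be reproduced in a few lines. This is a minimal sketch using only the standard `math` module; the step `h_moderate = 1e-6` is an arbitrary choice for comparison, not a recommended value.

```python
import math

def f(x):
    return math.cos(x) * math.tanh(x)

x = 2.0
h = 1e-16

# In double precision, x + h rounds back to x, so the numerator is exactly 0.
print(x + h == x)
print((f(x + h) - f(x)) / h)

# A moderate step gives a usable estimate of f'(2), close to -0.905989.
h_moderate = 1e-6
estimate = (f(x + h_moderate) - f(x)) / h_moderate
print(estimate)
```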

## Why is this? How variables are stored.

Computers represent all numbers as strings of bits. For integers, this is straightforward:

    1 = 0001
    2 = 0010
    3 = 0011

etc. For floating point numbers (decimals), life is a little bit harder.  To get an idea of how this works, try out this interactive tool:

http://evanw.github.io/float-toy/
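
If you prefer to inspect the bits from Python, the sketch below prints the sign, exponent and fraction fields of a float. It assumes that Python floats are 64-bit IEEE 754 doubles, which holds on essentially all current platforms; `float_bits` is a hypothetical helper written for this illustration.

```python
import struct
import sys

def float_bits(x):
    """Return the 64-bit IEEE 754 pattern of a Python float as a string."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    bits = format(as_int, "064b")
    # 1 sign bit, 11 exponent bits, 52 fraction bits
    return bits[0] + " " + bits[1:12] + " " + bits[12:]

print(float_bits(1.0))
print(float_bits(0.1))

# Spacing between 1.0 and the next representable double
print(sys.float_info.epsilon)
```

Because only 52 fraction bits are stored, a number such as 0.1 cannot be represented exactly, and adding a sufficiently small $h$ to $x$ leaves $x$ unchanged.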


## Forward and Backward Difference Formulas


The definition (1) for the derivative $f'(x)$ requires us to evaluate the function 
at two points, namely, $x+h$ and $x$. We need to develop techniques to approximate the derivative $f'(x)$ 
that do not require us to take the limit $h\to 0$. These approximations to $f'(x)$ are constructed from 
combinations of the function $f$ evaluated at various points surrounding $x$. We refer to these as *finite 
difference* formulas. 

Let's say we want to approximate $f'(x)$ using the 
values of $f$ at the points $x$ and $x+h$. That is, we want a formula that says 

$$
\label{fprimeab}
f'(x) \approx a f(x+h) + b f(x) 
\tag{2}
$$

for some constants $a$ and $b$. We can determine $a$ and $b$ by using the Taylor series

$$
	f(x+h) = f(x) + f'(x) h + \frac{1}{2} f''(x) h^2 + \cdots
$$

to expand the right-hand side of Eq.(2). This yields 

$$
\label{afbfexpanded}
	a f(x+h) + b f(x)  = (a+b)f(x) + a f'(x) h + \frac{a}{2} f''(x) h^2 + \cdots
\tag{3}
$$

The right-hand side of this relation will equal $f'(x)$, approximately, if $a+b=0$ and $a=1/h$. (That is, $a=1/h$ and $b=-1/h$.) 
With these values for the constants, Eq.(2) yields 

$$
\label{forwarddiffformula}
	f'(x) \approx \frac{f(x+h) - f(x)}{h}
\tag{4}
$$

This approximation to the first derivative is called the _forward difference_. It is simply the 
definition (1) without the limit $h \to 0$.

Let's look at this result a bit more closely. With $a=1/h$ and $b=-1/h$, Eq.(3)
becomes 

$$
	\frac{1}{h} f(x+h) - \frac{1}{h} f(x) = f'(x) + \frac{1}{2} f''(x) h + \cdots
$$

Equivalently, we can write this as

$$
\label{fdwitherr}
	f'(x) = \frac{f(x+h) - f(x)}{h} - \frac{1}{2} f''(x) h + \cdots
\tag{5}
$$

This shows that the error in the forward difference approximation (4) is

$$
\label{EFexpression}
	{\cal E}_F = \frac{1}{2} f''(x) h + \cdots
\tag{6}
$$

In particular, the leading term in ${\cal E}_F$ is proportional to $h$.
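
As a quick numerical check of the leading error term, the sketch below compares the actual forward difference error with $\frac{1}{2} f''(x) h$ for $f(x) = \cos(x)\tanh(x)$ at $x = 2$. The function `f_double_prime` was differentiated by hand for this illustration, so treat it as an assumption to verify; the step $h = 10^{-3}$ is an arbitrary choice.

```python
import math

def f(x):
    return math.cos(x) * math.tanh(x)

def f_prime(x):
    sech2 = 1.0 / math.cosh(x)**2
    return math.cos(x) * sech2 - math.sin(x) * math.tanh(x)

def f_double_prime(x):
    # Differentiated by hand from f_prime; an assumption worth re-deriving.
    sech2 = 1.0 / math.cosh(x)**2
    t = math.tanh(x)
    return -2.0*math.sin(x)*sech2 - 2.0*math.cos(x)*sech2*t - math.cos(x)*t

x, h = 2.0, 1e-3
forward = (f(x + h) - f(x)) / h
actual_error = forward - f_prime(x)          # approximation minus exact value
predicted_error = 0.5 * f_double_prime(x) * h

print(actual_error, predicted_error)
```

The two numbers should agree to within the size of the neglected higher-order terms.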

We can carry out a similar analysis to obtain a finite difference approximation to 
$f'(x)$ using the points $x-h$ and $x$. 
The result is the _backward difference_ approximation

$$
\label{backwarddiffformula}
	f'(x) \approx \frac{f(x) - f(x-h)}{h}
$$

with error 

$$
\label{EBexpression}
	{\cal E}_B = -\frac{1}{2} f''(x) h + \cdots 
\tag{7}
$$

Again, the leading term in the error is proportional to $h$. 
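
For reference, the "similar analysis" can be sketched as follows, reusing the symbols $a$ and $b$ for the unknown coefficients. We seek $f'(x) \approx a f(x) + b f(x-h)$ and expand with the Taylor series for $f(x-h)$:

$$
\begin{aligned}
	f(x-h) & = f(x) - f'(x) h + \frac{1}{2} f''(x) h^2 - \cdots \\
	a f(x) + b f(x-h) & = (a+b) f(x) - b f'(x) h + \frac{b}{2} f''(x) h^2 - \cdots
\end{aligned}
$$

The right-hand side equals $f'(x)$, approximately, if $a+b=0$ and $-bh = 1$, that is, $a = 1/h$ and $b = -1/h$. Substituting these values gives

$$
	\frac{f(x) - f(x-h)}{h} = f'(x) - \frac{1}{2} f''(x) h + \cdots
$$

so the backward difference differs from $f'(x)$ by $-\frac{1}{2} f''(x) h$ at leading order, which is the error ${\cal E}_B$ quoted above.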

## Exercise 1: 

Write a code to compute the forward and backward difference approximations to $f'(2.0)$, where 
$f(x) =  \cos(x)\tanh(x)$. Use $h = 10^{-1}, 10^{-2},\ldots,10^{-7}$. For both approximation methods, show that the errors ${\cal E}$ are 
proportional to $h$ in one of two ways:

1. Show that a log-log plot of ${\cal E}$ versus $h$ gives a straight 
line with slope $1$. 

2. If the error is proportional to $h$, then ${\cal E} = C h$ for some constant $C$. 
Let $h_1$ and $h_2$ denote your two smallest $h$ values, and ${\cal E}_1$ and ${\cal E}_2$ denote the corresponding 
errors. The relations ${\cal E}_1 = C h_1$ and ${\cal E}_2 = C h_2$ imply ${\cal E}_2/{\cal E}_1 = h_2/h_1$. If the 
ratio ${\cal E}_2/{\cal E}_1$ agrees (approximately) with the ratio $h_2/h_1$, then the errors are proportional to $h$.

**New command for today: numpy.zeros_like(). It produces an array of zeros with the same shape and data type as the array you give it.  So, for a 1D floating-point array A, it's equivalent to numpy.zeros(len(A)).**
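
A short illustration of `numpy.zeros_like()` is sketched below; the array names are made up for this example. Note that the data type is copied too, which matters if the reference array holds integers.

```python
import numpy as np

h = 10.0**np.arange(-3, 0)      # float array
err = np.zeros_like(h)          # same shape and dtype as h
print(err.shape, err.dtype)

counts = np.array([1, 2, 3])    # integer array
z = np.zeros_like(counts)       # integer zeros: assigning 0.5 would truncate
print(z.dtype)

z_float = np.zeros_like(counts, dtype=float)  # request floats explicitly
print(z_float.dtype)
```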

In [None]:
import numpy as np
import mpmath as mp
import pylab as py

# Define the function to be finite-differentiated
def funCT(x):
    return np.cos(x)*np.tanh(x)

def sech(x):
    return 2.0/(np.exp(x)+ np.exp(-x))

# Define the analytical answer
def derivCT(x):
    return np.cos(x)*sech(x)**2-np.sin(x)*np.tanh(x)

In [None]:
# 1) show that the log-log plot of error-h gives a straight line with slope 1
h = 10.**np.arange(-7,0,1) # grid resolution
print(h)
N = len(h)

errFD = np.zeros_like(h)  # create array for errors of forward diff
errBD = np.zeros_like(h) # create array for errors of backward diff

x = 2. # point where we evaluate derivative

# Calculate errors for FFD and BFD. Use absolute number for log plot
for i in range(N):
    errFD[i] = np.absolute(derivCT(x) - (funCT(x + h[i]) - funCT(x))/h[i]) # FFD
    errBD[i] = np.absolute(derivCT(x) - (funCT(x) - funCT(x - h[i]))/h[i]) # BFD

# Plot the result
py.plot(np.log10(h), np.log10(errFD),'>-', label='forward diff')
py.plot(np.log10(h), np.log10(errBD),'<-', label='backward diff')
# Raw strings (r'...') keep Python from treating the LaTeX backslashes as escape sequences
py.plot(np.log10(h), np.log10(h), '--k', label=r'$\propto h$')
py.xlabel(r'$\log_{10}(h)$')
py.ylabel(r'$\log_{10}(\cal E)$')
py.legend()
py.show()

# 2) if the error is proportional to h, the ratio of successive errors
# should agree with the ratio of successive h values

hRatio = np.zeros(N-1) # ratio of successive h values
errRatioF= np.zeros_like(hRatio) # error ratio for FFD
errRatioB= np.zeros_like(hRatio) # error ratio for BFD

for i in range(1,N):
    hRatio[i-1] = h[i]/h[i-1]
    errRatioF[i-1]= errFD[i]/errFD[i-1] 
    errRatioB[i-1]= errBD[i]/errBD[i-1] 

# Plot the ratio
py.plot(errRatioF,'>', label='forward')
py.plot(errRatioB,'<', label='backward')
py.plot(hRatio, '--k', label='h ratio')
py.xlabel('i')
py.legend()
#py.ylim([0,0.2])
py.show()

## Machine Error

In [None]:
#Computers represent numbers in a binary floating point format with a fixed number of bits.  
#This means that you have some limitation in how many digits your computer can represent.
#For example, try the following:

print(0.1+0.2)
print(0.3)

In [None]:
# This means that statements you *think* are true sometimes are not:
a = 0.1 + 0.2
b = 0.3

if a==b:
    print("a and b are the same!")
else:
    print("a and b are different")

In [None]:
# Aside from the numerical error you saw above in trying to use numbers h that are too small,
# the biggest place you'll have to pay attention is in comparisons such as the one above.  The correct
# way to do this is by comparing the absolute value of the difference to a small number: |a - b| < tolerance

a = 0.1 + 0.2
b = 0.3

if abs(a-b) < 1e-10:
    print("a and b are the same!")
else:
    print("a and b are different")

You can find some more reading on floating point error here:
* http://www.lahey.com/float.htm
* https://docs.python.org/3/tutorial/floatingpoint.html
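
The tolerance comparison above can also be written with library helpers. The sketch below uses `math.isclose` from the standard library and `numpy.isclose` for arrays; both compare with a relative tolerance by default, which tends to behave better than a fixed absolute tolerance when the numbers are very large or very small.

```python
import math
import numpy as np

a = 0.1 + 0.2
b = 0.3

print(a == b)
print(math.isclose(a, b))                  # relative tolerance, default rel_tol=1e-09
print(math.isclose(a, b, abs_tol=1e-10))   # an absolute tolerance can be added

# Element-wise version for arrays
arr = np.array([0.1 + 0.2, 1.0])
close = np.isclose(arr, np.array([0.3, 1.0 + 1e-12]))
print(close)
```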

## Central Difference Formula

Let's derive a finite difference approximation for $f'(x)$ using the _three_ points $x-h$, $x$ and $x+h$. 
That is, we seek a relation 

$$
	f'(x) \approx a f(x-h) + b f(x) + c f(x+h)
    \tag{8}
$$

for some constants $a$, $b$ and $c$. The constants are determined by expanding $f(x-h)$ and $f(x+h)$ in Taylor series

$$
\begin{aligned}
	f(x-h) & = f(x) - f'(x) h + \frac{1}{2} f''(x) h^2 - \frac{1}{6} f'''(x) h^3 + \cdots  \\
	f(x+h) & = f(x) + f'(x) h + \frac{1}{2} f''(x) h^2 + \frac{1}{6} f'''(x) h^3 + \cdots 
\end{aligned}
$$

and inserting these into the right-hand side of Eq. (8):

$$
\begin{aligned}
	a f(x-h) + b f(x) + c f(x+h) & =  (a + b + c)f(x) + (c-a)f'(x) h + \frac{1}{2}(c+a)f''(x) h^2 \\
	&  \quad + \frac{1}{6}(c-a)f'''(x) h^3 + \cdots
\end{aligned}
$$

This expression will equal $f'(x)$, approximately, if 

$$
\begin{aligned}
a+b+c & = 0, \\
c-a & = 1/h, \\
c+a & = 0. 
\end{aligned}
$$

That is, $a = -1/(2h)$, $b=0$, and $c=1/(2h)$. With these values for the constants, we have 

$$
	-\frac{1}{2h} f(x-h) + \frac{1}{2h} f(x+h) = f'(x) + \frac{1}{6} f'''(x) h^2 + \cdots
$$

This gives us the __central difference__ formula for the first derivative:

$$
	f'(x) \approx \frac{f(x+h) - f(x-h)}{2h} 
$$

The error for this method is 

$$
	{\cal E}_C = \frac{1}{6} f'''(x) h^2 + \cdots
$$

It is proportional to $h^2$. 

The central difference formula is simply the average of the forward and 
backward difference formulas. In taking the average, the order $h$ terms in the errors ${\cal E}_F$ and ${\cal E}_B$ 
cancel. The order $h^2$ terms, included in the $\cdots$ of Eqs.(6) and (7), do not 
cancel; rather, they combine to give the central difference error ${\cal E}_C$. 
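
Both claims, that the central difference is the average of the forward and backward differences and that its error scales as $h^2$, can be checked numerically. The sketch below uses hypothetical helper functions `forward`, `backward` and `central`, and an arbitrary step $h = 10^{-2}$.

```python
import math

def f(x):
    return math.cos(x) * math.tanh(x)

def f_prime(x):
    return math.cos(x) / math.cosh(x)**2 - math.sin(x) * math.tanh(x)

def forward(x, h):
    return (f(x + h) - f(x)) / h

def backward(x, h):
    return (f(x) - f(x - h)) / h

def central(x, h):
    return (f(x + h) - f(x - h)) / (2.0 * h)

x = 2.0
h = 1e-2
average = 0.5 * (forward(x, h) + backward(x, h))
print(central(x, h), average)

# Halving h should cut the central difference error by about 4.
err_h = abs(central(x, h) - f_prime(x))
err_half = abs(central(x, h / 2) - f_prime(x))
print(err_h / err_half)
```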

## Exercise

Numerically compute the derivative of $f(x) = \cos(x)\tanh(x)$ at $x=2$ using the central difference method. Using a graph, show that the error ${\cal E}$ is proportional to $h^2$, by comparing to the value you know from calculus. 

In [None]:
h = 10.**(-np.arange(1,8,0.5)) # grid resolution
#print(h)
N = len(h)

errCD = np.zeros_like(h)  # create array for errors of central diff
errFD = np.zeros_like(h) # create array for errors of forward diff

x = 2. # point where we evaluate derivative

# Calculate derivatives
for i in range(N):
    errCD[i]= np.abs(derivCT(x) - (funCT(x + h[i]) - funCT(x - h[i]))/(2*h[i]))/np.abs(derivCT(x)) # Central
    errFD[i]= np.abs(derivCT(x) - (funCT(x + h[i]) -funCT(x))/h[i])/np.abs(derivCT(x)) # Forward
    
# Plot the result: the slope is 2 in the log-log plot, corresponding to h^2 error.
#py.plot(np.log10(h), np.log10(errFD),'>-', label='forward diff')
py.plot(np.log10(h), np.log10(errCD),'^-g', label='central diff - actual')
py.plot(np.log10(h), np.log10(h**2), '--k', label=r'$\propto h^2$')
py.xlabel(r'$\log_{10}(h)$')
py.ylabel(r'$\log_{10}(\cal E)$')
py.legend()

py.figure(2)
py.plot(h**2,errCD)
py.show()



In [None]:
len(h)

In [None]:
# The other way to show it
hRatio = np.zeros(N-1) # for plot
errRatioC= np.zeros(N-1) # error ratio for central diff

for i in range(1,N):
    hRatio[i-1] = h[i]/h[i-1]
    errRatioC[i-1]= errCD[i]/errCD[i-1] 

# Plot the ratio
py.plot(errRatioC,'o', label='error ratio')
py.plot(hRatio**2, '--k', label='(h ratio)^2')
py.xlabel('index')
py.ylabel('ratio')
py.legend()
py.show()

## Other stencils
The pattern of evaluation points and coefficients is sometimes referred to as the "stencil". For example, the forward difference 
formula might be called a one-sided, two-point stencil. The central difference formula 
is a centered, three-point stencil (although the coefficient of one of those points is zero). 

The method of the preceding sections can be used to obtain other stencils for $f'(x)$. For example, we might 
want to calculate the derivative without any function evaluations at points less than $x$. For this we can choose 
a three-point stencil consisting of the points $x$, $x+h$ and $x+2h$. Using the Taylor series expressions 

$$
\begin{aligned}
	f(x+h) & = f(x) + f'(x) h + \frac{1}{2} f''(x) h^2 + \frac{1}{6} f'''(x) h^3 + \cdots \\
	f(x+2h) & = f(x) + 2 f'(x) h + \frac{4}{2} f''(x) h^2 + \frac{8}{6} f'''(x) h^3 + \cdots
\end{aligned}
$$

we have 

$$
\begin{aligned}
	a f(x) + b f(x+h) + c f(x+2h) & =   (a + b + c) f(x) + (b + 2c) f'(x) h \\
	 & + \frac{1}{2} (b + 4c) f''(x) h^2  + \frac{1}{6} (b + 8c) f'''(x) h^3 + \cdots
\end{aligned}
\tag{9}   
$$

This will approximate $f'(x)$ if the coefficients satisfy

$$
\begin{aligned}
	(a + b + c) & = 0 \\
	(b + 2c)h & = 1 \\
	(b + 4c) & = 0 
\end{aligned}
$$

The solution is $a = -3/(2h)$, $b = 2/h$, $c = -1/(2h)$. This yields the finite difference formula

$$
f'(x) \approx \frac{-3f(x) + 4f(x+h) - f(x+2h)}{2h}
\tag{10}
$$

The terms $(b+8c)f'''(x) h^3/6 + \cdots$ from Eq.(9), which are order $h^2$, do not vanish. Thus, the error 
for this one-sided, three-point stencil is proportional to $h^2$. 
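
The conditions on the coefficients form a small linear system, so they can also be solved numerically. The sketch below is one possible approach, not the only one: `stencil_weights` is a hypothetical helper that takes the stencil points as multiples of $h$ and returns the weights in units of $1/h^n$ for the $n$-th derivative.

```python
import math
import numpy as np

def stencil_weights(offsets, order=1):
    """Weights w_j (in units of 1/h**order) such that
    sum_j w_j * f(x + offsets[j]*h) / h**order approximates the
    derivative of the given order. Hypothetical helper for illustration."""
    offsets = np.asarray(offsets, dtype=float)
    n = len(offsets)
    # Row k of A holds offsets**k, matching the h**k term of the Taylor series.
    A = np.vander(offsets, N=n, increasing=True).T
    rhs = np.zeros(n)
    rhs[order] = math.factorial(order)
    return np.linalg.solve(A, rhs)

print(stencil_weights([0, 1, 2]))      # one-sided three-point stencil
print(stencil_weights([-1, 0, 1]))     # centered three-point stencil
```

For the points $x$, $x+h$, $x+2h$ this reproduces the coefficients $-3/2$, $2$, $-1/2$ (times $1/h$) found above.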



In general, derivative formulas that use large stencils have higher order error. (That is, the error is a higher 
power of $h$.) However, derivative formulas with large stencils are more susceptible to subtractive cancellation errors. 
Thus, a stencil with a very high order error is not always accurate. 


## Exercise

First, make sure that you understand the three-point stencil derivation, above. If you get stuck on this problem, you can use the three-point stencil (first) and then come back later to update it to being a five-point stencil.

a) Determine the five-point centered stencil for $f'(x)$: this stencil is like the three-point stencil in Eq. 10, but spans the points $x-2h$, $x-h$, $x$, $x+h$, $x+2h$. You may use Mathematica or Maple to solve for the constants, or ask for help to get them. Write your equations in a similar form to the three-point stencil above. 

b) Numerically compute the derivative of $f(x) = \cos(x)\tanh(x)$ at $x=2$ using this five-point stencil. Using a graph, show that the error ${\cal E}$ is proportional to $h^4$, by comparing to the value you know from calculus. 


In [None]:
h = 10**(-np.arange(1,8,.5)) # grid resolution
print(h)

err5CD = np.zeros_like(h) # create array for errors of 5-point CD
err3CD = np.zeros_like(h) # create array for errors of 3-point CD

x = 2. # point where we evaluate derivative

for i in range(len(h)):
    # 5-point centered stencil: relative error, in percent
    err5CD[i]= np.absolute((derivCT(x) - 
                           (funCT(x - 2*h[i]) - 8*funCT(x - h[i]) 
                            + 8*funCT(x + h[i]) - funCT(x + 2*h[i]))/(12*h[i]))/derivCT(x))*100
    # 3-point centered stencil: absolute error
    err3CD[i]= np.absolute(derivCT(x) - 
                           (funCT(x + h[i]) - funCT(x - h[i]))/(2*h[i])) 

# Plot the result
#py.plot(np.log(h), np.log(err5CD),'o-m', label='5-point CD')
#py.plot(np.log(h), np.log(h**4), 'm--', label='$h^4$')
py.plot(np.log10(h), np.log10(err3CD),'^-g', label='3-point CD')
py.plot(np.log10(h[2:]), np.log10(h[2:]**2), 'g--', label='$h^2$')
py.legend()
py.xlabel(r'$\log_{10}(h)$')
py.ylabel(r'$\log_{10}(\cal E)$')

py.figure(2)
py.plot(h**4,err5CD,'-')
py.show()

In [None]:
print(np.log10(h))
print(np.log10(err5CD))

## Second derivatives

We can apply the same technique to derive finite difference stencils for second derivatives, $f''(x)$, as well as higher order 
derivatives. For example, consider the three-point centered stencil for $f''(x)$. We can derive this stencil by
examining the Taylor expansion:


$$
\begin{aligned}
	a f(x-h) + b f(x) + c f(x+h) & =  (a + b + c)f(x) + (c-a)f'(x) h + \frac{1}{2}(c+a)f''(x) h^2  \\
	&  \quad + \frac{1}{6}(c-a)f'''(x) h^3 + \cdots
\end{aligned}
    \tag{11}
$$

The right-hand side will approximate $f''(x)$ if $(a+b+c)=0$, $(c-a)=0$ and $(c+a) h^2/2 = 1$. This gives 
$a = 1/h^2$, $b = -2/h^2$ and $c = 1/h^2$, so that

$$
	\frac{1}{h^2} f(x-h) - \frac{2}{h^2} f(x) + \frac{1}{h^2} f(x+h) = f''(x) + \cdots
$$

Thus, the centered three-point stencil for the second derivative is 

$$
	f''(x) \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}
    \tag{12}
$$

Note that the term $(c-a)f'''(x) h^3/6$ in Eq.(11) vanishes for the chosen values of 
$a$, $b$ and $c$. The next order term in Eq.(11) is proportional to $(c+a) f''''(x) h^4$. This term does not vanish and is proportional to $h^2$. Thus, the error for the formula (12) is of order $h^2$. 
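
A minimal check of the stencil in Eq. (12) is sketched below, using $\sin(x)$ as a test function because its second derivative, $-\sin(x)$, is known exactly. The helper name `second_difference` and the test point $x = 1$ are choices made for this illustration.

```python
import math

def second_difference(f, x, h):
    """Centered three-point estimate of f''(x), Eq. (12)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

x = 1.0
exact = -math.sin(x)            # (sin x)'' = -sin x

for h in (1e-1, 1e-2, 1e-3):
    approx = second_difference(math.sin, x, h)
    print(h, approx, abs(approx - exact))
```

Reducing $h$ by a factor of 10 should reduce the error by roughly a factor of 100, consistent with an error of order $h^2$.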



## Exercise

a) Numerically compute the second derivative of $f(x) = \ln(x)/\cosh(x)$ for $2.0 \le x \le 5.0$, using the three-point centered stencil (Eq. 12). Plot a graph of $f''(x)$ over this range. 

b) Consider this graph: what features (limits? max? min? other comparisons?) can you explain using mathematics, that give you confidence in your result?


In [None]:
def funLC(x):
    return np.log(x) / np.cosh(x)

# Analytical second derivative, written with NumPy only so it returns ordinary floats
def fppLC(x):
    sechx = 1.0/np.cosh(x)
    tanhx = np.tanh(x)
    return -sechx*(-x*x*np.log(x)*tanhx**2 + x*x*np.log(x)*sechx**2 + 2*x*tanhx + 1)/x**2

# Variables
h = 0.001  #pick a reasonable grid spacing (not too small, not too large)
xarray= np.linspace(2, 5, 300) # x-axis: 300 points from 2 to 5
f = np.zeros_like(xarray)
fpp3CD = np.zeros_like(xarray) # the second derivative for funLC using the three-point CFD
fpp = np.zeros_like(xarray) # for analytical answer

# CFD
for ix, x in enumerate(xarray):
    f[ix] = funLC(x)
    fpp3CD[ix] = (funLC(x+h) + funLC(x-h) - 2*funLC(x))/h**2
    fpp[ix] = fppLC(x)

# the function
py.plot(xarray, f,'-b')
py.xlabel('$x$')
py.ylabel('$f(x)$')
py.show()
    
# the second derivative
py.plot(xarray, fpp3CD,'-r', label='3-point centered stencil')
py.plot(xarray, fpp ,'--k', label='analytic')
py.xlabel('$x$')
py.ylabel(r'$f^{\prime \prime}(x)$')
py.legend()
py.show()