# Introduction to Computer Programming and Numerical Methods

> **Mohamad M. Hallal, PhD** <br> Teaching Professor, UC Berkeley

[![License](https://img.shields.io/badge/license-CC%20BY--NC--ND%204.0-blue)](https://creativecommons.org/licenses/by-nc-nd/4.0/)
***

# Differentiation

1. [**The First Derivative**](#s1)
2. [**Numerical Differentiation**](#s2)
3. [**Error in Numerical Differentiation**](#s3)

***

# 0. Motivation

Understanding how physical quantities change over time, space, and other dimensions is crucial in various engineering and science disciplines. For instance, in civil engineering, predicting the flow of water through a dam requires understanding the rate of change of water pressure across different points in the structure. Similarly, in aerospace engineering, analyzing the trajectory of a rocket involves understanding how its velocity changes with respect to time and altitude.

Mathematically, the rate at which quantities change is modeled using derivatives. There is an extensive [set of rules](https://www.mathsisfun.com/calculus/derivatives-rules.html) for computing derivatives analytically: power rule, product rule, quotient rule, chain rule, differentiation rules for exponentials, inverses, and trigonometric functions, and the list goes on. With these rules (and enough time and patience), we can differentiate most functions built from elementary operations. However, for complex functions, these analytical methods can be cumbersome and time-consuming.

Moreover, in real-world scenarios, the exact mathematical form of the function may be unknown, and only discrete data points are typically available. In such cases, analytical differentiation is impossible. This limitation necessitates the use of numerical differentiation methods, which provide a practical approach to approximating derivatives when analytical solutions are impractical or unavailable. These methods allow us to estimate the rate of change of a function based on available data points, making them invaluable tools.

**Learning objectives:**

- Recognize the challenges associated with analytical differentiation in real-world problems
- Understand the practical significance of numerical differentiation
- Describe the general approach of numerical differentiation methods
- Implement different numerical differentiation techniques, including backward, forward, and central differentiation, in Python
- Estimate the order of the error of these numerical approximations and choose the most accurate method
- Discuss the error and spacing trade-off in numerical differentiation and how very small spacing can result in increased error

# 1. The First Derivative <a id="s1"></a>

The derivative $f'(x)$ of a function $f(x)$ at a point $x$ is interpreted as the slope of the tangent line to the function at that specific point. Mathematically, the derivative is defined as:

$$f^\prime(x) = \lim_{h\rightarrow 0}\frac{f(x+h) - f(x)}{h}$$

If we drop the limit and instead use some small number for $h$, then we can estimate the first derivative as the slope of a secant line through two points $\big(x, f(x)\big)$ and $\big(x + h, f(x + h)\big)$. Conceptually, as the spacing between the points $h$ decreases, one endpoint of the interval slides toward the point of interest, and the slope of this secant line converges to the derivative of the function at $x$.

<br>

<center><figure>
  <img src="https://upload.wikimedia.org/wikipedia/commons/a/aa/Derivative_GIF.gif" style="width:45%">
    <figcaption style="text-align:center"><strong>Derivative of a function:</strong> <a href="https://en.m.wikipedia.org/wiki/File:Derivative_GIF.gif">https://en.m.wikipedia.org/</a></figcaption>   
</figure></center>
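
The convergence of the secant slope can also be checked numerically. The sketch below is only an illustration: the helper name `secant_slope`, the test function $\sin(x)$, and the point $x = 1$ are assumptions chosen for this example.

```python
import numpy as np

def secant_slope(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

x = 1.0
exact = np.cos(x)  # analytical derivative of sin(x) at x

# Shrink the spacing and compare each secant slope with the tangent slope
spacings = [1.0, 0.1, 0.01, 0.001]
slopes = [secant_slope(np.sin, x, h) for h in spacings]
for h, slope in zip(spacings, slopes):
    print(f"h = {h:<6} slope = {slope:.6f}   |slope - exact| = {abs(slope - exact):.2e}")
```

For this smooth function, each reduction in $h$ brings the secant slope closer to the exact derivative.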

# 2. Numerical Differentiation <a id="s2"></a>

In many scenarios, obtaining analytical solutions for derivatives can be either too complex or impractical due to the lack of a known mathematical function. Although physical data are inherently continuous, their values may only be available at discrete points. For instance, a GPS sensor might record position versus time pairs at regular intervals. Although position is a smooth and continuous function with respect to time, the GPS provides values only at discrete time intervals, rendering the underlying function unknown. In the absence of a mathematical function, analytical differentiation methods cannot be used to compute the rate of change.

In such cases, **finite difference** approximations of the derivative can be employed. These numerical approximations involve calculating the slope between two neighboring points from the available set of data points. There are three fundamental types of finite difference approximations:

1. Forward difference
2. Backward difference
3. Central difference

The following figure illustrates these three numerical differentiation methods used to estimate the slope or derivative:

<br>

<center><figure>
  <img src="https://docs.google.com/drawings/d/e/2PACX-1vSZ5kN1WZPnLn0MZ9uQ16mE-Gets1ot2JuLEYSvlW20QpAkBZXatQvgdYBLZP_dxUlVpJaYBi7Yn2fU/pub?w=1152&h=432" style="width:100%">
    <figcaption style="text-align:center"><strong>Finite difference methods</strong></figcaption>   
</figure></center>

<br>

All these methods share the same primary idea: approximate the derivative using the slope of the secant line of two points. They differ in which two points are being used.

## 2.1. Forward Difference

The **forward difference** estimates the derivative of the function at $x_i$ as the slope of the line that connects $\big(x_i, f(x_i)\big)$ and a forward point $\big(x_{i+1}, f(x_{i+1})\big)$:

$$
f'(x_i) \approx \frac{f(x_{i+1}) - f(x_i)}{x_{i+1} - x_i}
$$

When working with discrete data, we are restricted by the spacing between successive measurements. However, if we have knowledge of the underlying mathematical function, $f(x)$, we can control the spacing between the two points, denoted as $h$, used in approximating the derivative. The forward finite-difference approximation in this case uses the line that connects $\big(x_i, f(x_i)\big)$ and $\big(x_{i}+h, f(x_{i}+h)\big)$:

$$
f'(x_i) \approx \frac{f(x_{i}+h) - f(x_i)}{h}
$$

<div class="alert alert-block alert-warning"> <b>NOTE!</b> When dealing with only discrete points, it is not possible to compute the derivative at the last point in the dataset $f'(x_n)$ using forward difference, as there is no forward point to use for the calculation.</div>

<div class="alert alert-block alert-success"> <b>TIP!</b> When dealing with discrete points, the <a href="https://numpy.org/doc/stable/reference/generated/numpy.diff.html"><code>np.diff(a, n=1)</code></a> function can be helpful. It returns an <code>ndarray</code> with values <code>out[i] = a[i+1] - a[i]</code>.</div>
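
As a small illustration of the tip above, the sketch below applies `np.diff` to discrete data. The arrays `t` and `pos` are made-up measurements (position equal to $t^2$), not data from this notebook.

```python
import numpy as np

# Hypothetical measurements: time (s) and position (m), with position = t^2
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
pos = t**2

# Forward difference: slope between each point and the next one
velocity_forward = np.diff(pos) / np.diff(t)

print(velocity_forward)       # [1. 3. 5. 7.]
print(len(velocity_forward))  # one fewer estimate than data points
```

The result has one fewer entry than the data because the last point has no forward neighbor.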

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Write a function <code>forward_diff(f, x, h)</code> which takes as input a function object <code>f</code> and two scalar values <code>x, h</code>. The function should return an estimate of the derivative of <code>f</code> at <code>x</code> using the forward difference method and spacing <code>h</code>. Set the default value of the spacing equal to $10^{-3}$.</div>

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Try your function <code>forward_diff(f, x, h)</code> for $f(x)=\cos(x)$ at <code>x=0.15</code>. Then, compute the analytical value of the derivative and the error between the numerical approximation and the analytical value.</div>
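
For comparison with your own attempt, here is one possible sketch of `forward_diff`. It is not the only valid implementation. The check uses $f(x) = x^2$ at $x = 1$ (an assumption for this example), where the forward difference works out to exactly $2 + h$.

```python
def forward_diff(f, x, h=1e-3):
    """Estimate f'(x) using the forward difference with spacing h."""
    return (f(x + h) - f(x)) / h

def square(x):
    return x**2

# For f(x) = x^2, the forward difference equals 2x + h (here about 2.001)
estimate = forward_diff(square, 1.0)
print(f"f'(1.0) ~ {estimate}")
```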

In [None]:
import numpy as np

# Point of interest
x = 0.15

# Estimate derivative and display results
estimate = ...
print(f"f'({x}) ~ {estimate}")

# Analytical solution for derivative
exact = ...
print(f"f'({x}) = {exact}")

# Calculate Error
print(f"Error    = {np.abs(exact - estimate)}")

## 2.2. Backward Difference

The **backward difference** estimates the derivative of the function at $x_i$ as the slope of the line that connects $\big(x_i, f(x_i)\big)$ and a backward point $\big(x_{i-1}, f(x_{i-1})\big)$:

$$
f'(x_i) \approx \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}}
$$

When the mathematical function is known, the backward finite-difference approximation equation becomes:

$$
f'(x_i) \approx \frac{f(x_i) - f(x_{i}-h)}{h}
$$

<div class="alert alert-block alert-warning"> <b>NOTE!</b> When dealing with only discrete points, it is not possible to compute the derivative at the first point in the dataset $f'(x_0)$ using backward difference, as there is no backward point to use for the calculation.</div>

<div class="alert alert-block alert-warning"> <b>NOTE!</b> When dealing with discrete points, the forward and backward difference methods give the same values, shifted by one index. For example, the forward difference result for $f'(x_0)$ has the same value as the backward difference result for $f'(x_1)$, because both use the same equation $\dfrac{f(x_1) - f(x_{0})}{x_1 - x_{0}}$.</div>
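
The index shift can be verified directly. In the sketch below, the sample points and the function $\sin(x)$ are assumptions chosen for illustration. Each method is written out from its definition so the two can be compared.

```python
import numpy as np

# Hypothetical discrete data sampled from sin(x)
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.sin(x)
n = len(x)

# Forward difference: defined at every point except the last
forward = np.array([(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(n - 1)])

# Backward difference: defined at every point except the first
backward = np.array([(y[i] - y[i - 1]) / (x[i] - x[i - 1]) for i in range(1, n)])

# forward[k] is the estimate at x[k]; backward[k] is the estimate at x[k + 1]
print(np.array_equal(forward, backward))
```

Both arrays hold the same interval slopes. Only the point each slope is assigned to differs.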

## 2.3. Central Difference

The **central difference** estimates the derivative of the function at $x_i$ as the slope of the line that connects a backward $\big(x_{i-1}, f(x_{i-1})\big)$ and a forward $\big(x_{i+1}, f(x_{i+1})\big)$ point, centered about $x_i$:

$$
f'(x_i) \approx \frac{f(x_{i+1}) - f(x_{i-1})}{x_{i+1} - x_{i-1}}
$$

When the mathematical function is known, the central finite-difference approximation equation becomes:

$$
f'(x_i) \approx \frac{f(x_{i}+h) - f(x_{i}-h)}{2h}
$$

<div class="alert alert-block alert-warning"> <b>NOTE!</b> When dealing with only discrete points, it is not possible to compute the derivative at either the first or the last point in the dataset using central difference, as there is no backward point for the first and no forward point for the last.</div>
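
For discrete data, the central difference can be sketched with array slicing. The data below ($y = x^2$ on evenly spaced points) are an assumption for illustration. For a quadratic on a uniform grid, the central difference happens to reproduce the exact derivative $2x$.

```python
import numpy as np

# Hypothetical evenly spaced data sampled from y = x^2
x = np.linspace(0.0, 1.0, 6)
y = x**2

# Central difference at the interior points x[1:-1]:
# pair each forward neighbor (index i+1) with its backward neighbor (index i-1)
central = (y[2:] - y[:-2]) / (x[2:] - x[:-2])

print(central)        # estimates at the interior points only
print(2 * x[1:-1])    # exact derivative at the same points
```

The result has two fewer entries than the data, since the first and last points are skipped.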

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Write a function <code>central_diff(f, x, h)</code> which takes as input a function object <code>f</code> and two scalar values <code>x, h</code>. The function should return an estimate of the derivative of <code>f</code> at <code>x</code> using the central difference method and spacing <code>h</code>. Set the default value of the spacing equal to $10^{-3}$.</div>

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Try your function <code>central_diff(f, x, h)</code> for $f(x)=\cos(x)$ at <code>x=0.15</code>. Then, compute the analytical value of the derivative and the error between the numerical approximation and the analytical value.</div>
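
For comparison with your own attempt, here is one possible sketch of `central_diff`. The check uses $f(x) = x^3$ at $x = 1$ (an assumption for this example), where the central difference works out to exactly $3 + h^2$.

```python
def central_diff(f, x, h=1e-3):
    """Estimate f'(x) using the central difference with spacing h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def cube(x):
    return x**3

# For f(x) = x^3, the central difference equals 3x^2 + h^2 (here about 3.000001)
estimate_cube = central_diff(cube, 1.0)
print(f"f'(1.0) ~ {estimate_cube}")
```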

In [None]:
# Point of interest
x = 0.15

# Estimate derivative and display results
estimate = central_diff(np.cos, x)
print(f"f'({x}) ~ {estimate}")

# Analytical solution for derivative
exact = -np.sin(x)
print(f"f'({x}) = {exact}")

# Calculate Error
print(f"Error    = {np.abs(exact - estimate)}")

## 2.4. Higher-Order Derivatives

It is also possible to approximate higher-order derivatives (e.g., $f''(x_i), f'''(x_i)$, etc.). For example, the second derivative can be approximated using the central difference method as:

$$
f''(x_i) \approx \frac{f(x_{i}+h)-2f(x_i)+f(x_{i}-h)}{h^2}
$$
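
The formula above translates directly into code. In this sketch, the function name `second_central_diff` and the default spacing are assumptions, and the test case reuses $f(x)=\cos(x)$ at $x = 0.15$, whose second derivative is $-\cos(x)$.

```python
import numpy as np

def second_central_diff(f, x, h=1e-3):
    """Estimate f''(x) using the central difference with spacing h."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

x = 0.15
estimate_2nd = second_central_diff(np.cos, x)
exact_2nd = -np.cos(x)  # analytical second derivative of cos(x)

print(f"f''({x}) ~ {estimate_2nd}")
print(f"f''({x}) = {exact_2nd}")
print(f"Error     = {np.abs(exact_2nd - estimate_2nd)}")
```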


# 3. Error in Numerical Differentiation <a id="s3"></a>

Numerical differentiation methods are approximations and therefore introduce some error. Understanding the magnitude of this error is crucial for interpreting results and choosing a method. Graphically, the error is the difference between the true slope (the tangent line) and the slope approximated by the numerical differentiation method, as illustrated in the figure below. Based on the figure, which method appears to be the most accurate? Which method(s) appear(s) to be the least accurate?

<center><figure>
  <img src="https://docs.google.com/drawings/d/e/2PACX-1vSS8tf-dA6R_AUeLyLkebqBIqy8tXF160HveLYRBcxERhoJDMPolMTzazPI0xtEgMV8a3_UwyaWImwb/pub?w=1152&h=720" style="width:75%">
    <figcaption style="text-align:center"><strong>Numerical differentiation error as a function of the step size $h$</strong></figcaption>   
</figure></center>

<br>


## 3.1. Order of Error

Calculating the error exactly, as we did in the examples above, is generally not possible, because the actual derivative is the unknown we are trying to estimate. Instead, we focus on estimating the order or magnitude of the error. One common way of describing the order of the error is to express it in Big-O notation as a function of the step size $h$. We have previously used this notation to describe the magnitude of the error of a truncated Taylor series.

The table below summarizes the order of the error associated with various numerical differentiation methods:

| Method              | Error                         |
|:--------------------|:------------------------------|
| Forward Difference  | $\mathcal{O}\left(h\right)$   |
| Backward Difference | $\mathcal{O}\left(h\right)$   |
| Central Difference  | $\mathcal{O}\left(h^2\right)$ |

As the step size $h$ decreases, the numerical result tends to converge to the true value of the derivative, as evident in the figure above. However, even with the same step size $h$, different numerical differentiation methods exhibit different levels of accuracy. Below, we derive the order of error for the forward difference method.
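
The orders in the table can be checked empirically: halving $h$ should roughly halve the error of a first-order method and quarter the error of a second-order method. The sketch below defines its own minimal `forward_diff` and `central_diff` so that it is self-contained. The spacing `h = 1e-2` is an assumption for this example.

```python
import numpy as np

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.15
exact = -np.sin(x)  # analytical derivative of cos(x)
h = 1e-2

# Ratio of the error at spacing h to the error at spacing h/2
forward_ratio = abs(forward_diff(np.cos, x, h) - exact) / abs(forward_diff(np.cos, x, h / 2) - exact)
central_ratio = abs(central_diff(np.cos, x, h) - exact) / abs(central_diff(np.cos, x, h / 2) - exact)

print(f"Forward difference error ratio: {forward_ratio:.3f}  (about 2 for first order)")
print(f"Central difference error ratio: {central_ratio:.3f}  (about 4 for second order)")
```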

### 3.1.1. Forward Difference

We already derived the forward difference equation based on the slope of the line that connects $\big(x_i, f(x_i)\big)$ and a forward point $\big(x_{i}+h, f(x_{i}+h)\big)$:

$$
f'(x_i) \approx \frac{f(x_{i}+h) - f(x_i)}{h}
$$

While this is a simple way to derive the equation of forward difference, we can obtain the same equation using Taylor series. The Taylor series of a function $f(x)$ is:

$$ f(x) = f(a)+\frac {f'(a)}{1!} (x-a)+ \frac{f''(a)}{2!} (x-a)^2+\frac{f'''(a)}{3!}(x-a)^3+ \cdots = \sum_{n = 0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n$$

If we substitute $x$ with $x+h$ and $a$ with $x$ in the above equation, we get:

$$ f(x+h) = f(x)+f'(x) (x+h-x)+ \frac{f''(x)}{2!} (x+h-x)^2+\frac{f'''(x)}{3!}(x+h-x)^3+ \cdots = f(x)+f'(x) h+ \frac{f''(x)}{2!} h^2+\frac{f'''(x)}{3!}h^3+ \cdots$$

By rearranging this equation to solve for $f'(x)$, we can obtain the forward difference formula and identify the error term:

$$ f'(x) = \dfrac{f(x+h)-f(x)- \dfrac{f''(x)}{2!} h^2-\dfrac{f'''(x)}{3!}h^3- \cdots}{h} = \underbrace{\dfrac{f(x+h)-f(x)}{h}}_{\text{Forward Difference}} \underbrace{- \dfrac{f''(x)}{2!} h-\dfrac{f'''(x)}{3!}h^2- \cdots}_{\text{Remainder}}$$

The first part of the above equation is the forward difference approximation of the derivative. Ignoring the remainder introduces error. If $h$ is small, the largest term in the remainder is the one proportional to $h$ (since for small values of $h$ like 0.01, $h^2$ is smaller still, $h^3$ smaller than that, etc.). This means that if we use a forward difference approximation for $f'(x)$, we expect the error to be roughly proportional to $h$. In this case, it is common to rewrite the forward difference as:

$$f'(x) = \dfrac{f(x+h)-f(x)}{h} + \mathcal{O}\left(h\right)$$

where $\mathcal{O}\left(h\right)$, read "Big-O of $h$", indicates that the expected error for forward difference is proportional to the step size $h$. Equivalently, we say that the forward difference method is first order.

In general, Big-O notation describes the asymptotic behavior of functions, indicating how fast a function grows or declines. Here, we use it to describe how the error in numerical differentiation shrinks as the spacing $h$ decreases. We say that the error in a differentiation scheme is $\mathcal{O}\left(h^k\right)$ if, for sufficiently small $h$, the magnitude of the error is at most a constant times $h^k$. This is equivalent to saying that the differentiation method is $k^{th}$ order. The analysis holds only for small values of $h$; it does not apply when $h$ is large. Because we focus on small $h$, higher-order methods (larger $k$) generally have smaller error and are more accurate: for $h = 0.01$, $h^2 = 10^{-4}$ and $h^3 = 10^{-6}$.

### 3.1.2. Backward Difference

Similar error analysis can be conducted for backward difference to obtain the order of accuracy. The backward difference method is also first order, with a similar order of error as the forward difference method:

$$f'(x) = \dfrac{f(x)-f(x-h)}{h} + \mathcal{O}\left(h\right)$$


### 3.1.3. Central Difference

Similar error analysis can also be conducted for central difference to obtain the order of accuracy. However, unlike the forward and backward difference methods, the central difference method is second order, meaning the error is proportional to $h^2$, and thus is more accurate:

$$f'(x) = \dfrac{f(x+h)-f(x-h)}{2h} + \mathcal{O}\left(h^2\right)$$
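
As a sketch of where the $h^2$ comes from, we can write the Taylor series for both $f(x+h)$ and $f(x-h)$ and subtract them. The second expansion follows from the first by replacing $h$ with $-h$:

$$
\begin{aligned}
f(x+h) &= f(x)+f'(x) h+ \frac{f''(x)}{2!} h^2+\frac{f'''(x)}{3!}h^3+ \cdots \\
f(x-h) &= f(x)-f'(x) h+ \frac{f''(x)}{2!} h^2-\frac{f'''(x)}{3!}h^3+ \cdots \\
f(x+h)-f(x-h) &= 2f'(x) h+ 2\,\frac{f'''(x)}{3!}h^3+ \cdots
\end{aligned}
$$

Solving for $f'(x)$ gives:

$$ f'(x) = \dfrac{f(x+h)-f(x-h)}{2h} - \dfrac{f'''(x)}{3!}h^2 - \cdots $$

The $f(x)$ and $f''(x)$ terms cancel in the subtraction, so the leading term of the remainder is proportional to $h^2$ rather than $h$.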


## 3.2. Error and Spacing Trade-off 

Theoretically, reducing the spacing, $h$, between the points used in numerical differentiation improves accuracy. However, with very small $h$, round-off error becomes a significant concern: $f(x+h)$ and $f(x)$ become nearly equal, their difference loses significant digits, and dividing by the small $h$ amplifies that loss. Therefore, striking a balance between the acceptable error and the spacing $h$ is crucial.
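
The trade-off can be seen by sweeping $h$ over many orders of magnitude. The sketch below defines its own minimal `forward_diff` so that it is self-contained. The range of spacings ($10^{-1}$ down to $10^{-15}$) is an assumption chosen for illustration, and the exact values printed depend on the floating-point arithmetic of the machine.

```python
import numpy as np

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

x = 0.15
exact = -np.sin(x)  # analytical derivative of cos(x)

# Spacings from 1e-1 down to 1e-15
spacings = [10.0 ** (-k) for k in range(1, 16)]
errors = [abs(forward_diff(np.cos, x, h) - exact) for h in spacings]

for h, err in zip(spacings, errors):
    print(f"h = {h:.0e}   error = {err:.3e}")

# Spacing with the smallest error in this sweep
best_h = spacings[int(np.argmin(errors))]
print(f"Smallest error at h = {best_h:.0e}")
```

The error first shrinks as $h$ decreases, then grows again once round-off dominates, so the smallest spacing does not give the smallest error.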

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Calculate the error for the forward and central difference methods relative to the analytical solution for $f(x)=\cos(x)$ at <code>x=0.15</code>. Calculate the error for different values of <code>h</code>.</div>

In [None]:
# Define inputs
x = 0.15
h = 1e-2

# Calculate error and display results
dfx = forward_diff(np.cos, x, h)
print(f"Forward difference error: O(h)   = {np.abs(-np.sin(x) - dfx)}")

# Calculate error and display results
dfx = central_diff(np.cos, x, h)
print(f"Central difference error: O(h^2) = {np.abs(-np.sin(x) - dfx)}")