# Introduction to Finite Differences and Numerical Solution of Differential Equations

We started with difference equations:
 - logistic map
 - Mandelbrot set
  
Shift to numerical solution of differential equations. 

- Use finite differences based on  __Taylor series__
- Convert ODEs to difference equations

## Numerical Differentiation

Differentiation (and integration): basic operations of calculus

Definition of the derivative:<br>

$$\frac{d f(t)}{dt} = \lim_{\Delta t\to 0} \frac{f(t+\Delta t)-f(t)}{\Delta t}$$ (1)

<br>Digital computation $\implies$ Floating-point numbers

Floating-point numbers $\implies$ finite set of finite numbers

No infinitesimals $\implies$ nothing actually $\to 0$

Feasible alternative: compute approx. of limit $\implies$ Taylor series


$$f(t+\Delta t) = f(t)+\Delta t\frac{df(t)}{dt}+\frac{\Delta t^2}{2!} \frac{d^2f(t)}{dt^2}+\frac{\Delta t^3}{3!} \frac{d^3f(c_1)}{dt^3}$$ (2)

<br>

Solving for $\frac{df(t)}{dt}$ gives __finite difference__ formula:
<br>

$$\frac{df(t)}{dt} = \frac{f(t + \Delta t) - f(t)}{\Delta t} + O(\Delta t)$$ (3)

<br>

_General terminology_: __Finite difference:__ Approx. of derivative in terms of function evaluations at finitely separated points.

_Particular case:_ __$1^{st}$-order forward diff. approx. of $1^{st}$ derivative__: 
- __First-order:__ Error term proportional to $(\Delta t)^1$ 
- __Forward difference:__ Looks forward (evaluates at $t$ and $t + \Delta t$).

Or look backward ($\Delta t \to -\Delta t$) $\implies$ alternative Taylor expansion:
<br>
$$f(t-\Delta t) = f(t)-\Delta t\frac{df(t)}{dt}+\frac{\Delta t^2}{2!} \frac{d^2f(t)}{dt^2}-\frac{\Delta t^3}{3!} \frac{d^3f(c_1)}{dt^3}$$ (4)
<br>
Compare to:

$$f(t+\Delta t) = f(t)+\Delta t\frac{df(t)}{dt}+\frac{\Delta t^2}{2!} \frac{d^2f(t)}{dt^2}+\frac{\Delta t^3}{3!} \frac{d^3f(c_1)}{dt^3}$$ (2)

> Quick aside on symmetry: 
<br>Eq.2 $\pm$ Eq.4 $\implies$ odd/even terms cancel
<br>But first, solve Eq.4 for $\frac{df(t)}{dt}\ldots$

$\implies$ First-order backward difference formula: 
<br>

$$\frac{df(t)}{dt} = \frac{f(t) - f(t - \Delta t)}{\Delta t} + O(\Delta t)$$ (5)
<br>
For higher-order estimate, subtract Eq.4 from Eq.2:  
<br>

<br>

$$f(t+\Delta t) - f(t-\Delta t) = 2 \Delta t\frac{df(t)}{dt}+2 \frac{\Delta t^3}{3!} \frac{d^3f(c_1)}{dt^3}$$ (6)

<br>
Even-order terms cancel, including the degree-2 term responsible for the leading-order error in the forward and backward formulas.  
<br> <br>


Solve for $\frac{df(t)}{dt}$ $\implies$ __2nd order central difference formula__: 

$$\frac{df(t)}{dt} = \frac{f(t + \Delta t) - f(t - \Delta t)}{2 \Delta t} + O\big((\Delta t)^2\big)$$ (7)
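
A minimal sketch comparing the forward, backward, and central difference formulas. The test function $f(t)=\sin t$, the evaluation point $t=1$, and the function names are assumptions chosen for illustration, not part of the notes.

```python
import math

def forward_diff(f, t, h):
    """First-order forward difference estimate of f'(t)."""
    return (f(t + h) - f(t)) / h

def backward_diff(f, t, h):
    """First-order backward difference estimate of f'(t)."""
    return (f(t) - f(t - h)) / h

def central_diff(f, t, h):
    """Second-order central difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

t = 1.0
exact = math.cos(t)  # exact derivative of sin(t)
for h in (1e-1, 1e-2, 1e-3):
    errs = [abs(d(math.sin, t, h) - exact)
            for d in (forward_diff, backward_diff, central_diff)]
    print(f"h={h:.0e}  forward={errs[0]:.2e}  "
          f"backward={errs[1]:.2e}  central={errs[2]:.2e}")
```

For this test function, reducing $h$ by a factor of 10 reduces the forward and backward errors by roughly 10 and the central error by roughly 100.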

<br>

More evaluation points (+Taylor series) $\implies$ families of finite difference approximations of various derivatives with various orders of accuracy. (See tables...)

An important formula: <br>
__Second-order central difference estimate of the $2^{nd}$ derivative__

1. Sum the Taylor series (Eqs. 2&4) so the first derivatives cancel 

2. Solve for the second derivative:

$$\frac{d^2f(t)}{dt^2} = \frac{f(t + \Delta t) -2 f(t) + f(t - \Delta t)}{(\Delta t)^2} + O\big((\Delta t)^2\big)$$ (7)
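
A short sketch of the second-derivative formula. The test function $\sin t$ and the name `second_central_diff` are assumptions for illustration.

```python
import math

def second_central_diff(f, t, h):
    """Second-order central difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

t = 1.0
exact = -math.sin(t)  # exact second derivative of sin(t)
for h in (1e-1, 1e-2, 1e-3):
    err = abs(second_central_diff(math.sin, t, h) - exact)
    print(f"h={h:.0e}  error={err:.2e}")
```

For this test function, the error drops by roughly 100 when $h$ drops by 10, as expected for a second-order formula.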

<br>

- Means to compute floating-point approx. derivatives
  
- Basis in Taylor series $\implies$ order of the __truncation error__
- Estimate  of how error depends on the spacing $\Delta t = h$
- Useful formulas involve truncation errors proportional to some positive power $n$; i.e. $\propto(h)^n$. 
- Expect decreasing spacing $\implies$ decreasing error.

Optimal accuracy is obtained by choosing $\Delta t$ as small as possible?

__NO!__ Why? 

2 major sources of error in digital computing:
- Truncation
  - Neglecting higher order terms in series expansion
- __Roundoff__
  - error incurred in approximating an infinite-precision real number by a finite-precision floating point number. 
  - As $h$ becomes very small, the evaluation points get so close that the floating-point system cannot accurately resolve the difference between the function values.

Typical case:
- For large spacing:
  - truncation error dominates
  - error can be reduced by decreasing the spacing. 
- For small spacing:
  - roundoff error dominates
  - smaller spacing makes the error worse, not better.

__Bottom line on spacing and error:__ 
- Truncation error dominates for large spacing and gets better as spacing decreases. 
- Roundoff error dominates for small spacing and gets better as spacing increases. 
- Some "middle ground" spacing minimizes total error.
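- Optimal spacing is typically some small integer root of _machine epsilon_ (e.g. square root for the first-order forward difference, cube root for the second-order central difference of the $1^{st}$ derivative)
  - Machine epsilon: gap between 1 and the next larger floating-point number (about $2.2\times 10^{-16}$ in double precision); adding anything much smaller to 1 produces 1 as the floating-point result
  - Measure of bound on relative precision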
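
The trade-off between truncation and roundoff can be observed directly by sweeping the spacing. A minimal sketch, assuming the forward difference applied to $\sin t$ at $t=1$ in double precision; the sweep over powers of 10 is an arbitrary choice for illustration.

```python
import math
import sys

eps = sys.float_info.epsilon  # gap between 1.0 and the next double

def forward_diff(f, t, h):
    """First-order forward difference estimate of f'(t)."""
    return (f(t + h) - f(t)) / h

t = 1.0
exact = math.cos(t)
results = []
for k in range(1, 16):
    h = 10.0 ** (-k)
    err = abs(forward_diff(math.sin, t, h) - exact)
    results.append((h, err))
    print(f"h=1e-{k:02d}  error={err:.2e}")

# Spacing in the sweep with the smallest total error
best_h, best_err = min(results, key=lambda r: r[1])
print(f"smallest error at h={best_h:.0e}; sqrt(eps)={math.sqrt(eps):.1e}")
```

The error first decreases with $h$ (truncation dominates) and then grows again (roundoff dominates); the minimum typically lands within a power of 10 or so of $\sqrt{\epsilon}$.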

Ready to move on to differential equations.



## Ordinary Differential Equations (ODEs) - Initial Value Problems (IVPs)

Intro to nomenclature: 

- __Differential equation__: equation with one or more derivatives. 
- __Independent variables__ appear on the bottom of a derivative.
- __Dependent variables__ appear on the top of a derivative.
- __Ordinary differential equations (ODEs)__, whether a single equation or a system, have a single independent variable. 
- __Partial differential equations (PDEs)__ have multiple independent variables. 
- The __order__ of the DEs is the order of the highest derivative.
- __Linear__ refers to dependent variable:

  - $m y'' + c y' + k y = \sin(\omega t)$ is __linear__ in y

  - $m y'' + c y' + k \sin(y) = A \sin(\omega t)$ is __nonlinear__ in y

Start with system of 1 or more $1^{st}$-order ODEs 

> Note that there is a standard "trick" to convert an $n^{th}$-order ODE to a system of first-order ODEs: Simply introduce new __dependent__ variables for derivatives of order $0$ through $n-1$. This immediately creates a system of $n$ first-order ODEs; the first $n-1$ equations define the variables, and the original ODE rewritten in terms of the new variables becomes the $n^{th}$ equation. <br>With $u_0 = y$ and $u_1 = y'$, the equation $m y'' + c y' + k \sin(y) = A \sin(\omega t)$ becomes 
> $$ 
\begin{aligned}
u_0' &= u_1 \\
m u_1' &= -c u_1 -k \sin(u_0) + A \sin(\omega t)
\end{aligned}$$
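
In code, the first-order system becomes a function that returns the vector of rates. A minimal sketch; the function name `forced_pendulum_rhs` and the default parameter values are hypothetical, chosen only for illustration.

```python
import math

def forced_pendulum_rhs(t, u, m=1.0, c=0.1, k=1.0, A=0.5, omega=2.0):
    """Rates [u0', u1'] for m y'' + c y' + k sin(y) = A sin(omega t).

    u[0] = y and u[1] = y'; parameter defaults are illustrative only.
    """
    u0, u1 = u
    du0 = u1  # definition of the new variable
    du1 = (-c * u1 - k * math.sin(u0) + A * math.sin(omega * t)) / m
    return [du0, du1]

print(forced_pendulum_rhs(0.0, [0.0, 1.0]))
```

Any stepping method written for $u' = f(t, u)$ can then be applied to the list of rates component by component.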

### Euler's method

Most fundamental problem: compute approx. solution of a single $1^{st}$-order ODE with a specified initial value:

$$\frac{dy}{dt} = f(t,y)\;, \qquad y(0) = y_0$$ (8)

You have likely seen the simplest approach: __Euler's method__:

- Classic "stepping" or "marching" method
- Replace continuous independent variable $t$ ("time") with discrete version:
$$t_n = t_0 + n \Delta t = t_0 + n h$$ (9)

- Compute sequence of values at the discrete "future" times based on the existing known "history": 

$$y_n \approx y(t_n) = y(t_0 + n h)$$ (10)

- End up again with a difference equation (or map)
- Derive difference equation by replacing the derivative with a finite difference estimate.
- Euler's method uses simple forward difference estimator

$$\frac{dy}{dt} = f(t,y)\;, \qquad y(0) = y_0 \rightarrow \frac{y_{n+1} - y_n}{h}\approx f(t_n, y_n)$$ (11)

- Solve for $y_{n+1}$ to obtain formula for  value at the next time step

$$y_{n+1} = y_n + h f(t_n, y_n)$$ (12)

- More precisely called the __forward Euler method__ (uses forward difference estimate for the $1^{st}$ derivative) 
- Also called __explicit Euler method__: Eq.12 gives explicit formula for $y_{n+1}$ given $y_n, t_n,f$.

- Computes rate of change at the beginning of a time step, $f(t_n, y_n)$, and applies it over the full step (from $t_n$ to $t_{n+1}$)
- Ignores the change of rate over the time step, which is not exact $\implies$ truncation error. 
  - Truncation error for each Euler step (__local truncation error__) $\sim O(\Delta t^2)$: the leading neglected term in the Taylor series 
  - Number of time steps to cover an interval $\sim O(\Delta t^{-1}) \implies$ __global truncation error__ $\sim O(\Delta t)$. 
  - Euler's method is simple & sometimes gives useful results, but halving $h$ only halves the error, so high accuracy requires very many steps.

- Efficiently "refinable" results require higher-order truncation error.
  - How to create higher-order methods?
  - More terms in Taylor series + more evaluation points.
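
A minimal sketch of the Euler update $y_{n+1} = y_n + h f(t_n, y_n)$, with a check of how the error at a fixed final time depends on $h$. The test problem $y' = -y$, $y(0)=1$ (exact solution $e^{-t}$) and the function names are assumptions for illustration.

```python
import math

def euler(f, t0, y0, h, n_steps):
    """Forward Euler: take n_steps steps of size h starting from (t0, y0)."""
    t, y = t0, y0
    for _ in range(n_steps):
        y = y + h * f(t, y)  # rate at the start of the step, applied over the full step
        t = t + h
    return y

def decay(t, y):
    return -y  # test problem y' = -y

exact = math.exp(-1.0)
errors = {}
for n in (10, 20, 40, 80):
    h = 1.0 / n
    errors[n] = abs(euler(decay, 0.0, 1.0, h, n) - exact)
    print(f"h={h:.4f}  error at t=1: {errors[n]:.3e}")
```

For this test problem, each halving of $h$ roughly halves the error at $t=1$.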

__Modified Euler-Cauchy Method__

The next method uses an Euler half step to estimate the solution at the midpoint of the interval; the rate of change (i.e. the function $f$) computed there gives a better estimate of what happens over the whole interval. 

Euler-Cauchy steps:

1) Compute derivative (RHS) at initial time (left side of interval)

$$rate_{left} = f(t, y(t))$$ (13)

2) Use that derivative value to compute "Euler" estimate of midpoint value.

$$y_{mid} = y(t) + \frac{h}{2} rate_{left}$$ (14)

3) Use midpoint value to compute midpoint rate estimate.

$$rate_{mid} = f(t+\frac{h}{2}, y_{mid})$$ (15)

4) Use the midpoint rate estimate to compute the next value:
$$y_{RK2}(t+h) = y(t) + h (rate_{mid})$$ (16)

Put the pieces together:
$$y_{RK2}(t+h) = y(t)+h f\big(t+\frac{h}{2}, y(t)+\frac{h}{2} \; f(t,y(t)) \big)$$ (17)

- E-C truncation error:
  - $3^{rd}$-order local error ($\sim O(h^3)$ per step)
  - $2^{nd}$-order global error ($\sim O(h^2)$ over a fixed interval)
  - Smaller steps reduce global error (for "friendly" ODEs).
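
The four steps (Eqs. 13-16) translate almost line for line into code. A minimal sketch, again assuming the test problem $y' = -y$, $y(0)=1$; the function names are chosen for illustration.

```python
import math

def rk2_step(f, t, y, h):
    """One modified Euler-Cauchy (midpoint) step from t to t + h."""
    rate_left = f(t, y)                # Eq. 13
    y_mid = y + 0.5 * h * rate_left    # Eq. 14
    rate_mid = f(t + 0.5 * h, y_mid)   # Eq. 15
    return y + h * rate_mid            # Eq. 16

def integrate(step, f, t0, y0, h, n_steps):
    """Apply a one-step method n_steps times."""
    t, y = t0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t = t + h
    return y

def decay(t, y):
    return -y  # test problem y' = -y

exact = math.exp(-1.0)
errors = {}
for n in (10, 20, 40):
    h = 1.0 / n
    errors[n] = abs(integrate(rk2_step, decay, 0.0, 1.0, h, n) - exact)
    print(f"h={h:.4f}  error at t=1: {errors[n]:.3e}")
```

For this test problem, each halving of $h$ reduces the error at $t=1$ by a factor close to 4.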

Why does modified Euler-Cauchy formula have subscript "RK2"?

Euler's method (RK1) and modified E-C (RK2) are the first 2 methods in a family called __Runge-Kutta methods__.

To achieve higher order methods in RK family:
- Compute more rate estimates
- Combine results to knock out higher powers of $h$ in T.S.
- Increase degree of leading neglected term

## $4^{th}$ Order Runge-Kutta Method

Most common version: __Fourth-Order Runge-Kutta (RK4)__
- Computes weighted average of 4 rate estimates:
  - 1 at the start of the interval
  - 2 in the middle
  - 1 at the end

Details of RK4 algorithm:

1) Compute initial rate
$$f_1 = f(t_n,y_n)$$ (18)

2) Use initial rate to estimate midpoint rate
$$f_2 = f(t_n+\frac{h}{2}, y_n+\frac{h}{2}f_1)$$ (19)

3) Use the midpoint estimate to compute improved midpoint estimate
$$f_3 = f(t_n+\frac{h}{2}, y_n+\frac{h}{2}f_2)$$ (20)

4) Use improved midpoint estimate to compute right-side estimate
$$f_4 = f(t_n+h,y_n+h f_3)$$ (21)

5) Compute weighted sum  of contributions to cancel as many terms in the Taylor series as possible.
6) "RK4" formula: local error $\sim O(h^5)$, global error $\sim O(h^4)$:
$$y_{n+1} = y_n + \frac{h}{6} [f_1 + 2 f_2 +2 f_3 + f_4]$$ (22)
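
A minimal sketch of one RK4 step following Eqs. 18-22, with a convergence check on the assumed test problem $y' = -y$, $y(0)=1$; the function names are chosen for illustration.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step from t to t + h."""
    f1 = f(t, y)                             # initial rate
    f2 = f(t + 0.5 * h, y + 0.5 * h * f1)    # first midpoint rate
    f3 = f(t + 0.5 * h, y + 0.5 * h * f2)    # improved midpoint rate
    f4 = f(t + h, y + h * f3)                # right-side rate
    return y + (h / 6.0) * (f1 + 2.0 * f2 + 2.0 * f3 + f4)

def decay(t, y):
    return -y  # test problem y' = -y

exact = math.exp(-1.0)
errors = {}
for n in (10, 20, 40):
    h = 1.0 / n
    t, y = 0.0, 1.0
    for _ in range(n):
        y = rk4_step(decay, t, y, h)
        t += h
    errors[n] = abs(y - exact)
    print(f"h={h:.4f}  error at t=1: {errors[n]:.3e}")
```

For this test problem, each halving of $h$ reduces the error at $t=1$ by a factor close to 16, consistent with global error $\sim O(h^4)$.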

## Application to Stability Analysis

When studying the behavior of dynamical systems, there are a few common phases of the analysis. 

- Modeling and application of physical principles. If you are studying a pendulum, you might make modeling approximations such as:

1) The pendulum is a rigid body.

2) The pendulum remains in a vertical plane.

3) Friction may or may not be considered according to some manageable model.

4) Some exterior forcing may be applied.

- Based on modeling assumptions, apply Newton's laws or Lagrange's equations to obtain equations of motion: (typically) $2^{nd}$-order ODE governing the motion of the system. 

Simplest model (ignoring damping and forcing) would be:

$$\theta'' = -\sin(\theta); \qquad \theta(0)=\theta_0, \quad \theta'(0) = \theta'_0$$ (23)

Convert to $1^{st}$-order system:

$$ 
\begin{aligned}
u_0' &= u_1 \\
u_1' &= -\sin(u_0)
\end{aligned}$$ (24)

with initial conditions (ICs:) $u_0(0)=\theta_0$, $u_1(0)=\theta'_0$.

Coding up the algorithms above, you now have tools to compute an approximate numerical solution for a particular choice of ICs.
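
A minimal sketch of such a computation: RK4 applied component-wise to the pendulum system (Eq. 24). The ICs $\theta_0 = 1$, $\theta'_0 = 0$, the step size, and the energy check $E = \tfrac{1}{2}u_1^2 - \cos u_0$ (conserved by the undamped model) are assumptions added here as a sanity test, not part of the notes.

```python
import math

def pendulum(t, u):
    """Right-hand side of the pendulum system: u = [theta, theta']."""
    return [u[1], -math.sin(u[0])]

def rk4_step(f, t, u, h):
    """One RK4 step for a system; u and the rates are lists of equal length."""
    def shift(k, s):
        # State advanced by s times the rate vector k
        return [ui + s * ki for ui, ki in zip(u, k)]
    f1 = f(t, u)
    f2 = f(t + h / 2, shift(f1, h / 2))
    f3 = f(t + h / 2, shift(f2, h / 2))
    f4 = f(t + h, shift(f3, h))
    return [ui + h / 6 * (a + 2 * b + 2 * c + d)
            for ui, a, b, c, d in zip(u, f1, f2, f3, f4)]

def energy(u):
    """Energy per unit mass (nondimensional) of the undamped pendulum."""
    return 0.5 * u[1] ** 2 - math.cos(u[0])

u = [1.0, 0.0]           # assumed ICs: theta_0 = 1 rad, released from rest
h, n_steps = 0.01, 1000  # integrate to t = 10
e0 = energy(u)
t = 0.0
for _ in range(n_steps):
    u = rk4_step(pendulum, t, u, h)
    t += h
drift = energy(u) - e0
print(f"t={t:.2f}  theta={u[0]:.6f}  theta'={u[1]:.6f}  energy drift={drift:.2e}")
```

A small energy drift suggests the step size is adequate for this trajectory; a large one would be a signal to reduce $h$.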

Next stage of the dynamic analysis typically involves identification of equilibrium/steady-state solutions. 
- Set time derivatives to 0 (so all rates vanish)
- Solve rate equations for equilibrium values. 
- In this case, the equilibrium equations are:

$$ 
\begin{aligned}
0 &= u_1 \\
0 &= -\sin(u_0)
\end{aligned}$$ (25)

$\implies$ 2 physically distinct equilibria (other solutions $u_0 = n\pi$ repeat these two positions):
- $u_1=0$ (so the pendulum is at rest) 
- Equilibrium positions:
  - $u_0 = 0$ (straight down beneath the pivot)
  - $u_0 = \pi$ (straight up above the pivot)
- Both valid equilibrium solutions
  - One is routinely observed
  - The other does not occur in practice
- The distinction is __stability__.

Many definitions of stability exist, but a simple one works here:

 __Stable equilibrium__ has a surrounding neighborhood where all ICs lead to trajectories that remain near the equilibrium.

If __any__ ICs in the neighborhood produce a trajectory that leaves the neighborhood, the equilibrium is __unstable__.

 ___Stability is not about a particular set of ICs; it is about the collective behavior of ALL the initial conditions in the vicinity of the equilibrium.___

$\implies$ Stability is an ideal application for parallelization. 

- The trajectory/history for a particular set of ICs must be computed serially
  - Cannot reasonably compute $y((k+1) h)$ without knowing $y(k h)$
  - But the trajectory for one set of ICs can be computed completely independently of the trajectory for any other set of ICs. 
- Stability studies lead naturally to the idea of launching a computational grid with a kernel that solves the ODE for particular ICs. 
- GPU-based parallel computation can solve concurrently for MANY (realistically, millions of) ICs around the equilibrium.
- A 3D plot of something like the ratio of the final and initial distance from the equilibrium will serve as a stability indicator (with potential "false positives").
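
A serial sketch of the idea, as a stand-in for the GPU version: `distance_ratio` plays the role of the kernel (one independent computation per set of ICs), and the nested loops in `scan` stand in for the computational grid. The function names, the 5×5 grid, the neighborhood radius 0.1, and the final time $t=10$ are all assumptions for illustration.

```python
import math

def rk4_step(u0, u1, h):
    """One RK4 step for the pendulum system u0' = u1, u1' = -sin(u0)."""
    def f(a, b):
        return b, -math.sin(a)
    k1 = f(u0, u1)
    k2 = f(u0 + h / 2 * k1[0], u1 + h / 2 * k1[1])
    k3 = f(u0 + h / 2 * k2[0], u1 + h / 2 * k2[1])
    k4 = f(u0 + h * k3[0], u1 + h * k3[1])
    u0 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    u1 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return u0, u1

def distance_ratio(u0, u1, eq, h=0.01, n_steps=1000):
    """'Kernel': integrate one IC and return final/initial distance from (eq, 0)."""
    d0 = math.hypot(u0 - eq, u1)
    for _ in range(n_steps):  # time stepping is inherently serial
        u0, u1 = rk4_step(u0, u1, h)
    return math.hypot(u0 - eq, u1) / d0

def scan(eq, radius=0.1, n=5):
    """'Grid': evaluate the indicator on an n-by-n grid of ICs around (eq, 0)."""
    offsets = [radius * (2 * i / (n - 1) - 1) for i in range(n)]
    ratios = []
    for du0 in offsets:        # each (du0, du1) pair is independent of the others,
        for du1 in offsets:    # so each could be assigned to its own GPU thread
            if du0 == 0.0 and du1 == 0.0:
                continue       # skip the equilibrium itself
            ratios.append(distance_ratio(eq + du0, du1, eq))
    return max(ratios)

print("max ratio near u0 = 0 :", scan(0.0))
print("max ratio near u0 = pi:", scan(math.pi))
```

Near $u_0 = 0$ the largest ratio stays close to 1, while near $u_0 = \pi$ at least some ICs end up far from the equilibrium. Sampling the distance at a single final time is what allows "false positives": a trajectory that wanders away and happens to pass back near the equilibrium at the sample time would look stable.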

Pursue this concept in more detail in an upcoming homework...


