# Wave equation

The Einstein field equations can, in the right gauge, "look like" wave equations. This is typically shown in linearized gravity, but can be seen (locally!) in the full theory as well. This is the foundation of the Generalized Harmonic formulation of the Einstein field equations. Start from the Ricci tensor, the key part of the Einstein tensor, and write it as
$$
R_{ab} = -\tfrac{1}{2} g^{cd} \partial_c \partial_d g_{ab} + \nabla_{(a} \Gamma_{b)} + \text{lower order terms}.
$$
Then recognize that the coordinate gauge freedom allows us to fix the contracted Christoffel symbols $\Gamma_b$ to be known, specified functions $H_b$. The vacuum Einstein field equations then become
$$
g^{cd} \partial_c \partial_d g_{ab} = -2 \nabla_{(a} H_{b)} + \text{lower order terms}.
$$
The left-hand side is the principal part, and is precisely the wave operator acting on each metric component.

With that intuition, we start our numerical work with the wave equation in $1+1$ dimensions, which we write as
$$
\partial_t^2 \phi = \partial_x^2 \phi
$$
where we set $c=1$.

Dealing with second *time* derivatives is annoying for numerics, so we will introduce auxiliary variables. We either use
$$
\partial_t \begin{pmatrix} \phi \\ \Psi \end{pmatrix} = \partial_x^2 \begin{pmatrix} 0 \\ \phi \end{pmatrix} + \begin{pmatrix} \Psi \\ 0 \end{pmatrix},
$$
which introduces the auxiliary variable $\Psi = \partial_t \phi$, or we use
$$
\partial_t \begin{pmatrix} \phi \\ \Psi \\ \Pi \end{pmatrix} = \partial_x \begin{pmatrix} 0 \\ \Pi \\ \Psi \end{pmatrix} + \begin{pmatrix} \Psi \\ 0 \\ 0 \end{pmatrix},
$$
which introduces the auxiliary variables $\Psi = \partial_t \phi$ and $\Pi = \partial_x \phi$.
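
As a consistency check (a short derivation, not part of the original exercise), differentiating the first row of the fully first order system in time and using the other two rows recovers the wave equation:
$$
\partial_t^2 \phi = \partial_t \Psi = \partial_x \Pi = \partial_x^2 \phi.
$$
The last step uses $\Pi = \partial_x \phi$. That relation is preserved by the evolution, since
$$
\partial_t \left( \Pi - \partial_x \phi \right) = \partial_x \Psi - \partial_x \Psi = 0,
$$
so it only needs to be imposed on the initial data.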

## Finite differences

Imagine that the value of a function $f(x)$ is known at three points, $x_i$, and $x_i \pm \Delta x$. We can approximate the function $f$ using low order polynomials in three different ways:

1. Forward linear: a straight line interpolating $f(x_i)$ and $f(x_i + \Delta x)$;
2. Backward linear: a straight line interpolating $f(x_i)$ and $f(x_i - \Delta x)$;
3. Central quadratic: a quadratic interpolating $f(x_i)$ and $f(x_i \pm \Delta x)$.

We can then approximate the derivative of $f$ at $x_i$ using the derivative of the interpolating polynomial. These *finite difference approximations* are given by
$$
\begin{aligned}
\left. f'_{\mathrm{FD}} \right|_{x_i} &= \frac{f(x_i + \Delta x) - f(x_i)}{\Delta x}, \\
\left. f'_{\mathrm{BD}} \right|_{x_i} &= \frac{f(x_i) - f(x_i - \Delta x)}{\Delta x}, \\
\left. f'_{\mathrm{CD}} \right|_{x_i} &= \frac{f(x_i + \Delta x) - f(x_i - \Delta x)}{2 \Delta x}.
\end{aligned}
$$
These are the *forward*, *backward*, and *central* finite difference approximations, respectively.

We can use Taylor series expansion to check the accuracy of each approximation. As
$$
f(x_i \pm \Delta x) = f(x_i) \pm \Delta x \, \left. f' \right|_{x_i} + \frac{\Delta x^2}{2} \, \left. f'' \right|_{x_i} + \mathcal{O}(\Delta x^3),
$$
we can check that the forward and backward approximations have an error proportional to $\Delta x$, while the central approximation has an error proportional to $(\Delta x)^2$. We call these approximations *first order* and *second order*, respectively. For sufficiently small $\Delta x$ we expect a higher order approximation to be more accurate.
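
For reference, keeping one more term of the Taylor series (the $\pm \tfrac{1}{6} \Delta x^3 f'''$ term) and substituting into the three formulas gives the leading error terms explicitly. All derivatives are evaluated at $x_i$:
$$
\begin{aligned}
f'_{\mathrm{FD}} &= f' + \frac{\Delta x}{2} f'' + \mathcal{O}(\Delta x^2), \\
f'_{\mathrm{BD}} &= f' - \frac{\Delta x}{2} f'' + \mathcal{O}(\Delta x^2), \\
f'_{\mathrm{CD}} &= f' + \frac{\Delta x^2}{6} f''' + \mathcal{O}(\Delta x^4).
\end{aligned}
$$
The first order terms cancel in the central difference because it is symmetric about $x_i$.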

We can also use the central quadratic approximation to approximate the second derivative of $f$ at $x_i$, giving
$$
\left. f''_{\mathrm{CD}} \right|_{x_i} = \frac{f(x_i + \Delta x) - 2 f(x_i) + f(x_i - \Delta x)}{\Delta x^2}.
$$
Again, Taylor series expansion shows this to be second order accurate.
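
As a quick numerical check of the second order claim, the sketch below applies the second derivative formula to $f = \sin$ at $x = 1$ and halves $\Delta x$ once. The test function, the evaluation point and the helper name `second_derivative_cd` are illustrative assumptions, not part of the exercise.

```python
import numpy as np

def second_derivative_cd(f, x, dx):
    """Central difference approximation to f''(x) with step dx."""
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / dx**2

# The exact second derivative of sin at x = 1 is -sin(1).
exact = -np.sin(1.0)
err_coarse = abs(second_derivative_cd(np.sin, 1.0, 0.1) - exact)
err_fine = abs(second_derivative_cd(np.sin, 1.0, 0.05) - exact)

# For a second order scheme, halving dx should cut the error by about 4.
ratio = err_coarse / err_fine
print(f"error ratio when dx is halved: {ratio:.3f}")
```

A ratio close to 4 is what second order accuracy predicts.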

### Implement and check

Using Python, implement each approximation and apply it to $f(x) = \exp(x)$ at $x = 0$. Check the error of each approximation as a function of $\Delta x$. Plot, on logarithmic scales, the error as a function of $\Delta x$. Add to the plot the expected error for each approximation.

We will demonstrate for forward differencing.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import ipympl

In [None]:
def fd(f, x, dx):
    """
    Forward differencing approximation of the derivative of f at x with step size dx.

    Parameters
    ----------
    f : callable
        Function for which the derivative is to be approximated.
    x : float
        Point at which the derivative is to be approximated.
    dx : float
        Step size for the forward differencing.

    Returns
    -------
    float
        Approximation of the derivative of f at x.
    """
    # Add your code for forward differencing here,
    # replacing the "raise NotImplementedError" line.
    raise NotImplementedError("Forward differencing not implemented yet")

# The remaining lines will compute the errors.
# First choose a range of dx values.
# These are logarithmically spaced values from 10^-8 to 1.
dxs = np.logspace(-8, 0)
# Create an array to hold the errors.
errors_fd = np.zeros_like(dxs)
# Loop over the dx values and compute the errors.
# Note: The function f is exp, and we know that the derivative of exp at 0 is 1.
for i, dx in enumerate(dxs):
    errors_fd[i] = np.abs(fd(np.exp, 0, dx) - 1)

In [None]:
# This cell will plot the errors.
# We expect power-law behaviour, where the error is proportional to dx to some power.
# Using a log-log plot will show this - it should be a straight line with the slope being 
# the exponent in the power-law.
plt.loglog(dxs, errors_fd, 'kx', label='Forward differencing error')
plt.loglog(dxs, errors_fd[-1] * (dxs / dxs[-1]), 'b--', label=r'$\propto \Delta x$')
plt.legend()
plt.xlabel(r'$\Delta x$')
plt.ylabel('Error')
plt.show()

We see slight deviations from the expected result for small $\Delta x$ (where floating point errors cause problems) and large $\Delta x$ (where the higher order terms in the Taylor series expansion cannot be neglected). Otherwise, convergence works.
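
The floating point effect can be estimated by balancing the truncation error, roughly $\tfrac{1}{2} \Delta x \, |f''|$, against the round-off error, roughly $\epsilon \, |f| / \Delta x$ with $\epsilon$ the machine precision. The two are comparable near $\Delta x \sim \sqrt{\epsilon} \approx 10^{-8}$. The sketch below illustrates this with a stand-alone forward difference; the helper name `forward_difference` is an assumption, chosen so that it does not clash with `fd` above.

```python
import numpy as np

def forward_difference(f, x, dx):
    """Forward difference approximation to f'(x)."""
    return (f(x + dx) - f(x)) / dx

eps = np.finfo(float).eps  # about 2.2e-16 in double precision

# Error for exp at x = 0 (exact derivative 1) at three step sizes.
err_moderate = abs(forward_difference(np.exp, 0.0, 1e-6) - 1.0)
err_near_optimal = abs(forward_difference(np.exp, 0.0, np.sqrt(eps)) - 1.0)
err_tiny = abs(forward_difference(np.exp, 0.0, 1e-12) - 1.0)

print(f"dx = 1e-6:      error = {err_moderate:.2e}")
print(f"dx = sqrt(eps): error = {err_near_optimal:.2e}")
print(f"dx = 1e-12:     error = {err_tiny:.2e}")
```

Shrinking $\Delta x$ well below $\sqrt{\epsilon}$ makes the error larger, not smaller, because round-off dominates.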

Now add your code for the other approximations below.

In [None]:
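# Define functions for backward and central differencing

def bd(f, x, dx):
    """Backward differencing approximation of the derivative of f at x with step size dx."""
    raise NotImplementedError("Backward differencing not implemented yet")

def cd(f, x, dx):
    """Central differencing approximation of the derivative of f at x with step size dx."""
    raise NotImplementedError("Central differencing not implemented yet")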

In [None]:
# Compute their errors

errors_bd = np.zeros_like(dxs)
errors_cd = np.zeros_like(dxs)
for i, dx in enumerate(dxs):
    errors_bd[i] = np.abs(bd(np.exp, 0, dx) - 1)
    errors_cd[i] = np.abs(cd(np.exp, 0, dx) - 1)

In [None]:
# Plot all errors and expectations
plt.loglog(dxs, errors_fd, 'kx', label='Forward differencing error')
plt.loglog(dxs, errors_bd, 'g+', label='Backward differencing error')
plt.loglog(dxs, errors_cd, 'ro', label='Central differencing error')
plt.loglog(dxs, errors_fd[-1] * (dxs / dxs[-1]), 'b--', label=r'$\propto \Delta x$')
plt.loglog(dxs, errors_cd[-1] * (dxs / dxs[-1])**2, 'r--', label=r'$\propto \Delta x^2$')
plt.legend()
plt.xlabel(r'$\Delta x$')
plt.ylabel('Error')
plt.ylim(1e-12, 1e1)
plt.show()

## Evolution

Finite differencing can be used in two ways. Above, we have used the *known* values of $f$ at multiple points to construct its derivative. Alternatively, we can use the *known* (or *computable*) values of the *derivative* of $f$, combined with its values at some points, to construct its values at other points. This allows us to evolve the function in time.

Consider the ordinary differential equation
$$
y'(t) = f(t, y(t)),
$$
with the simple special case $f = -y / \tau$ (which has solution $y(t) = y(0) \exp(-t / \tau)$). Because $f$ is a known function, if we know the value of $y$ at some time $t=T$, then we can compute $f$ at that time, and hence know the derivative. We can also *approximate* that derivative using (for example) forward differencing. Therefore we have
$$
\frac{y(T + \Delta t) - y(T)}{\Delta t} \approx f(T, y(T)),
$$
which we re-arrange to get *Euler's method*
$$
y(T + \Delta t) = y(T) + \Delta t \, f(T, y(T)).
$$
We can again use Taylor expansion to check that this is first order accurate.
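
A sketch of that argument: expanding the exact solution about $t = T$ and using $y' = f$ gives
$$
y(T + \Delta t) = y(T) + \Delta t \, f(T, y(T)) + \frac{\Delta t^2}{2} \, y''(T) + \mathcal{O}(\Delta t^3),
$$
so a single Euler step makes an error proportional to $\Delta t^2$. Reaching a fixed final time $t$ takes $t / \Delta t$ steps, so, assuming the single-step errors simply accumulate, the total error is
$$
\frac{t}{\Delta t} \times \mathcal{O}(\Delta t^2) = \mathcal{O}(\Delta t).
$$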

### Implement and check

Using $f = -y / \tau$ with timescale $\tau = 1$ and initial condition $y(0) = 1$, implement Euler's method and evolve the function to $t = 2$. Check the error of the method as a function of $\Delta t$. Plot, on logarithmic scales, the error as a function of $\Delta t$. Add to the plot the expected error.

In [None]:
def f(t, y):
    return -y

def euler_step(f, t, y, dt):
    """ 
    Performs a single Euler step for the ODE dy/dt = f(t, y).

    Parameters
    ----------
    f : callable
        Function representing the right-hand side of the ODE.
    t : float
        Current time.
    y : float
        Current value of the dependent variable.
    dt : float
        Time step for the Euler method.

    Returns
    -------
    float
        The value of y after one Euler step.
    """
    raise NotImplementedError("Euler method not implemented yet")

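def euler(f, y0, t_end, dt):
    """
    Integrates the ODE using the Euler method from t = 0 to t = t_end.
    """
    # Use an integer step count so floating point drift in t cannot add or drop a step.
    n_steps = int(round(t_end / dt))
    y = y0
    for n in range(n_steps):
        y = euler_step(f, n * dt, y, dt)
    return y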

In [None]:
# This cell will compute the errors for the Euler method.
t_end = 2.0
# Timesteps 2^-4, 2^-5, ..., 2^-11, each of which divides t_end exactly.
dts = 2.0**(-np.arange(4, 12))
errors_euler = np.zeros_like(dts)
for i, dt in enumerate(dts):
    errors_euler[i] = np.abs(euler(f, 1, t_end, dt) - np.exp(-t_end))

In [None]:
# This cell will plot the errors for the Euler method.
plt.loglog(dts, errors_euler, 'kx', label='Euler error')
plt.loglog(dts, (errors_euler[-1] / dts[-1]) * dts, 'b--', label=r'$\propto \Delta t$')
plt.legend()
plt.xlabel(r'$\Delta t$')
plt.ylabel('Error')
plt.show()

Try fixing the timestep $\Delta t$ to $0.1$ and look at the behaviour of the solution as $\tau$ is made smaller. You should see that the solution becomes unstable when $\tau$ is comparable to $\Delta t$. If you have time, prove this (use induction on the discrete solution).
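
A minimal sketch of this experiment is below. It is a stand-alone illustration, with the hypothetical helper `euler_decay` hard-coding $f = -y / \tau$ rather than using `euler_step` above. Each Euler step multiplies $y$ by the factor $1 - \Delta t / \tau$, which is printed alongside the largest value of $|y|$ seen during the evolution.

```python
def euler_decay(tau, dt, t_end, y0=1.0):
    """Evolve y' = -y / tau with Euler's method and return every iterate."""
    n_steps = int(round(t_end / dt))
    ys = [y0]
    for _ in range(n_steps):
        ys.append(ys[-1] + dt * (-ys[-1] / tau))
    return ys

dt_fixed = 0.1
for tau in (1.0, 0.1, 0.06, 0.04):
    ys = euler_decay(tau, dt_fixed, t_end=2.0)
    factor = 1.0 - dt_fixed / tau
    max_abs = max(abs(y) for y in ys)
    print(f"tau = {tau}: factor per step = {factor:+.3f}, max |y| = {max_abs:.3e}")
```

The exact solution decays for every $\tau > 0$, so any growth in $|y|$ is purely a numerical artefact.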

# Method of Lines

We will take a PDE, such as the wave equation, and discretize it in space. That is, we introduce a grid of points in space, assume that key functions are known at those points, and then use (for example) finite differencing to compute the spatial derivatives at those points. That takes a PDE in the abstract form
$$
\partial_t q = \mathcal{L} q
$$
where $\mathcal{L}$ is a differential operator, and turns it into a system of *ODEs*
$$
\partial_t q_i = \mathbf{L} q_i
$$
where $\mathbf{L}$ is a discrete operator on the values of the function(s) in the state vector $q$, at the grid points $x_i$.

We then solve the time evolution of this system of ODEs using a method like Euler's method.

This is called the *method of lines*.

As an example, the wave equation in first order form, when approximated using central differencing, becomes
$$
\partial_t \begin{pmatrix} \phi_i \\ \Psi_i \\ \Pi_i \end{pmatrix} = \begin{pmatrix} \Psi_i \\ \frac{\Pi_{i+1} - \Pi_{i-1}}{2 \Delta x} \\ \frac{\Psi_{i+1} - \Psi_{i-1}}{2 \Delta x} \end{pmatrix}.
$$

We now need three things to solve this.

1. A method of evolving in time. Euler's method has poor accuracy, so we will use a second order Runge-Kutta method.
2. Initial data. We have to fix $q_i$ at $t=0$.
3. Boundary conditions. We have to fix $q$ at the boundaries of the domain, *and* say how this will be implemented in the discretization.

For simplicity we will use periodic boundary conditions. Assume that the domain is $x \in [-1, 1]$. We will use $N$ grid points covering the physical domain, and two additional *ghost* points at each end, for a total of $N+4$ points $x_0, x_1, \ldots, x_{N+3}$. We choose the points to be staggered symmetrically around the boundary. This means we identify $x_0$ with $x_{N}$ and $x_1$ with $x_{N+1}$, fixing the left boundary, and $x_{N+2}$ with $x_2$ and $x_{N+3}$ with $x_3$, fixing the right boundary. We can either set the values of the functions in the ghost points after updating the interior, or set the values of the time derivatives in the ghost points after computing the time derivatives in the interior.
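
The identification of ghost points with interior points can be checked directly on the coordinates. The sketch below builds a small grid with this layout (the helper name `staggered_grid` and the choice $N = 8$ are assumptions for illustration) and confirms that each ghost point sits exactly one period, here $2$, away from its partner.

```python
import numpy as np

def staggered_grid(n_interior, xl=-1.0, xr=1.0):
    """Cell-centred grid on [xl, xr] with two ghost points at each end."""
    dx = (xr - xl) / n_interior
    x = np.linspace(xl - 1.5 * dx, xr + 1.5 * dx, n_interior + 4)
    return dx, x

N_demo = 8
dx_demo, x_demo = staggered_grid(N_demo)

# (ghost index, interior index it is identified with)
pairs = [(0, N_demo), (1, N_demo + 1), (N_demo + 2, 2), (N_demo + 3, 3)]
for ghost, interior in pairs:
    shift = x_demo[ghost] - x_demo[interior]
    print(f"x[{ghost}] - x[{interior}] = {shift:+.3f}")
```

The left ghost points are shifted by $-2$ and the right ghost points by $+2$, so copying function values between each pair imposes periodicity.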

We will assume that the initial data is a right propagating sine wave, $\phi(x, 0) = \sin(\pi x)$, $\Psi(x, 0) = -\pi \cos(\pi x)$, and $\Pi(x, 0) = \pi \cos(\pi x)$.
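
To see how the pieces fit together, here is a compact method of lines sketch. It deliberately swaps the ghost points for `np.roll`, which wraps the array periodically, so it is an alternative illustration rather than the implementation asked for below. The names `wave_rhs_periodic` and `heun_step` are hypothetical.

```python
import numpy as np

def wave_rhs_periodic(q, dx):
    """Central-difference RHS for (phi, Psi, Pi); np.roll wraps periodically."""
    phi, Psi, Pi = q

    def ddx(u):
        # np.roll(u, -1)[i] is u[i+1] and np.roll(u, 1)[i] is u[i-1].
        return (np.roll(u, -1) - np.roll(u, 1)) / (2.0 * dx)

    return np.array([Psi, ddx(Pi), ddx(Psi)])

def heun_step(q, rhs, dt, dx):
    """One step of a second order Runge-Kutta (Heun) method."""
    k1 = rhs(q, dx)
    k2 = rhs(q + dt * k1, dx)
    return q + 0.5 * dt * (k1 + k2)

N = 100
dx = 2.0 / N
x = -1.0 + (np.arange(N) + 0.5) * dx  # cell centres, no ghost points
dt = dx / 4
q = np.array([np.sin(np.pi * x),
              -np.pi * np.cos(np.pi * x),
              np.pi * np.cos(np.pi * x)])

n_steps = int(round(1.0 / dt))
for _ in range(n_steps):
    q = heun_step(q, wave_rhs_periodic, dt, dx)

# Compare with the exact right-propagating solution at the final time.
t = n_steps * dt
max_error = np.max(np.abs(q[0] - np.sin(np.pi * (x - t))))
print(f"max error in phi at t = {t:.2f}: {max_error:.2e}")
```

Using `np.roll` keeps the sketch short, but the ghost point approach generalizes more easily to other boundary conditions.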

### Implement and check

Four functions are given:

1. A function to construct the grid;
2. A function to apply periodic boundary conditions on the grid;
3. A second order Runge-Kutta method that updates a single step;
4. An initial data function, giving a propagating sine wave (so the solution should be $\phi = \sin(\pi(x - t))$).

Implement the function that computes the spatial differencing, and use it to evolve the wave equation to $t = 1$. Plot the solution at $t = 1$. Link the timestep to the grid spacing using $\Delta t = \Delta x / 4$.

In [None]:
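def grid(Npoints, xl=-1.0, xr=1.0):
    """
    Build a cell-centred grid on [xl, xr] with two ghost points at each end.

    Npoints is the number of interior points. Returns the grid spacing dx
    and the array of Npoints + 4 coordinates.
    """
    
    dx = (xr - xl) / Npoints
    return dx, np.linspace(xl-3*dx/2.0, xr+3*dx/2.0, Npoints+4)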

def apply_boundaries(q):
    """
    Periodic boundaries
    """
    N = q.shape[1] - 4

    q[:, 0] = q[:, N]
    q[:, 1] = q[:, N+1]
    q[:, N+2] = q[:, 2]
    q[:, N+3] = q[:, 3]
    
    return q

def RK2_step(q, RHS, dt, dx):
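    """
    One step of a second order Runge-Kutta method (Heun's method):
    an Euler predictor followed by an averaging corrector.
    """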

    rhs = RHS(q, dx)
    qp = q + dt * rhs
    rhs_p = RHS(qp, dx)
    qnew = 0.5 * (q + qp + dt * rhs_p)

    return qnew

def initial_data(x):
    """
    Set the initial data. x are the coordinates. q (phi, phi_t, phi_x) are the variables.
    """
    q = np.zeros((3, len(x)))
    q[0, :] = np.sin(np.pi * x)
    q[1, :] = -np.pi * np.cos(np.pi * x)
    q[2, :] = np.pi * np.cos(np.pi * x)
    
    return q

In [None]:
def RHS(q, dx):
    """
    RHS term.
    
    Parameters
    ----------
    
    q : array
        contains [phi, phi_t, phi_x] at each point
    dx : double
        grid spacing
        
    Returns
    -------
    
    dqdt : array
        contains the required time derivatives
    """
    
    raise NotImplementedError("RHS for wave equation not implemented yet")

In [None]:
# This cell should solve the wave equation using finite differences.
Npoints = 50
dx, x = grid(Npoints)
dt = dx / 4
q0 = apply_boundaries(initial_data(x))
q = apply_boundaries(initial_data(x))
Nsteps = int(round(1.0 / dt))  # round, so floating point error in 1/dt cannot drop a step
for n in range(Nsteps):
    q = RK2_step(q, RHS, dt, dx)
    q = apply_boundaries(q)

In [None]:
# This cell should plot the results.
plt.figure()
plt.plot(x, q0[0, :], 'b--', label="Initial data")
plt.plot(x, q[0, :], 'k-', label=r"$t=1$")
plt.xlabel(r"$x$")
plt.ylabel(r"$\phi$")
plt.xlim(-1, 1)
plt.legend()
plt.show()

### Convergence

For the linear wave equation the solution at $t=2$ should match the initial data, thanks to the domain size and the periodic boundary conditions. 

Check how the error at $t=2$ depends on the grid spacing. A function that computes various error norms is given.

In [None]:
def error_norms(U, U_initial):
    """
    Error norms (1, 2, infinity)
    """
    
    N = len(U)
    error_1 = np.sum(np.abs(U - U_initial))/N
    error_2 = np.sqrt(np.sum((U - U_initial)**2)/N)
    error_inf = np.max(np.abs(U - U_initial))
    
    return error_1, error_2, error_inf

In [None]:
# This cell computes the errors for various resolutions.
# First set up the grid sizes (50, 100, 200, ..., 6400).
Npoints_all = 50*2**np.arange(0, 7)
# Then create arrays to hold the errors and dx values.
errors = np.zeros(len(Npoints_all))
dxs = np.zeros(len(Npoints_all))
# Then loop over the grid sizes, compute the errors.
for i, Npoints in enumerate(Npoints_all):
    dx, x = grid(Npoints)
    dt = dx / 4
    q0 = apply_boundaries(initial_data(x))
    q = apply_boundaries(initial_data(x))
    Nsteps = int(round(2.0 / dt))  # round, so floating point error in 2/dt cannot drop a step
    t = 0
    for n in range(Nsteps):
        t += dt
        q = RK2_step(q, RHS, dt, dx)
        q = apply_boundaries(q)
    errors[i] = np.linalg.norm(q[0, :] - q0[0, :], 2) * np.sqrt(dx)
    dxs[i] = dx

In [None]:
# This cell will plot the errors against dx.
# The expected result is second order convergence, so the error should scale as dx^2.
plt.figure()
plt.loglog(dxs, errors, 'kx', label='Error')
plt.loglog(dxs, errors[-1] * (dxs / dxs[-1])**2, 'b--', label=r'$\propto \Delta x^2$')
plt.legend()
plt.xlabel(r'$\Delta x$')
plt.ylabel('Error')
plt.show()
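
Besides comparing against a reference line by eye, the convergence order can be measured from the errors themselves: if the grid spacing halves between successive runs, the observed order is $\log_2$ of the ratio of successive errors. The sketch below is a stand-alone illustration; the helper `observed_orders` is hypothetical and is demonstrated on central difference errors for $\sin$, not on the wave equation data.

```python
import numpy as np

def observed_orders(errors):
    """Convergence order between successive errors, assuming dx halves each time."""
    errors = np.asarray(errors, dtype=float)
    return np.log2(errors[:-1] / errors[1:])

# Illustrative input: central difference errors for d/dx sin(x) at x = 1.
dxs_demo = 0.1 / 2**np.arange(5)
approx = (np.sin(1.0 + dxs_demo) - np.sin(1.0 - dxs_demo)) / (2.0 * dxs_demo)
errors_demo = np.abs(approx - np.cos(1.0))

orders = observed_orders(errors_demo)
print("observed orders:", np.round(orders, 3))
```

The same function could be applied to the `errors` array computed above, since `Npoints_all` doubles at each step.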

# CFL limits

We have run the simulations with $\Delta t = \Delta x / 4$. The computational cost is given by the number of grid points multiplied by the number of updates. For given final time, increasing $\Delta t$ clearly reduces the cost, whilst reducing the number of grid points reduces the *accuracy* as well as the cost.

So, how big can we make the timestep?

Try running the simulation with $\Delta t = \tfrac{3}{2} \Delta x$. See what happens as you increase resolution.

In [None]:
# Use a separate name for the list of resolutions so the loop variable does not shadow it.
Npoints_all = 20*2**np.arange(0, 4)
plt.figure()
for i, Npoints in enumerate(Npoints_all):
    dx, x = grid(Npoints)
    dt = 1.5*dx
    q0 = initial_data(x)
    q = initial_data(x)
    Nsteps = int(2.0 / dt)
    t = 0
    for n in range(Nsteps):
        t += dt
        q = RK2_step(q, RHS, dt, dx)
        q = apply_boundaries(q)
    plt.plot(x, q[0, :], label=rf"${Npoints}$ points")
plt.xlabel(r"$x$")
plt.ylabel(r"$\phi$")
plt.xlim(-1, 1)
plt.ylim(-1.5, 1.5)
plt.legend()
plt.show()

You should expect to see oscillations kicking in and destroying the simulation. The oscillations grow faster as the resolution increases, so above about 100 points the results are obviously wrong.

The formal description of this is the *Courant-Friedrichs-Lewy* condition, which states that the timestep is limited by the speed of information propagation. In practice the precise numerical value of the limit depends on the numerical method and the dimensionality of the problem, but the qualitative behaviour is the same. We write the CFL condition as
$$
\Delta t \leq \sigma \, \frac{\Delta x}{c}
$$
where $c$ is the maximum wavespeed (e.g., the speed of light), and $\sigma$ is typically between 0.1 and 0.5.
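
A rough way to see why the two timesteps behave so differently is a Fourier (von Neumann) estimate. The sketch assumes the scheme used above: central differencing in space, whose Fourier modes have eigenvalues $\pm i \sin(k \Delta x) / \Delta x$, combined with the second order Runge-Kutta step, which multiplies each mode by $G(z) = 1 + z + z^2 / 2$ with $z = \pm i \, (\Delta t / \Delta x) \sin(k \Delta x)$. The function name `rk2_amplification` is hypothetical.

```python
import numpy as np

def rk2_amplification(courant):
    """Largest |G| over all Fourier modes for RK2 plus central differencing.

    Assumes semi-discrete eigenvalues +/- i sin(k dx) / dx, so that
    z = i * courant * sin(k dx) and G(z) = 1 + z + z**2 / 2.
    """
    k_dx = np.linspace(0.0, np.pi, 201)
    z = 1j * courant * np.sin(k_dx)
    return np.max(np.abs(1.0 + z + 0.5 * z**2))

for courant in (0.25, 1.5):
    growth = rk2_amplification(courant)
    print(f"dt/dx = {courant}: max |G| per step = {growth:.4f}")
```

With $\Delta t = \Delta x / 4$ the worst mode grows by well under a tenth of a percent per step, whereas with $\Delta t = \tfrac{3}{2} \Delta x$ it grows by about half each step.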

It follows that the total cost of a $3+1$ dimensional simulation is proportional to $N^4$, where $N$ is the number of grid points in each dimension: $N^3$ points in space, multiplied by a number of timesteps that also scales with $N$ through the CFL condition. As the accuracy also changes with $N$, we see that fourth order methods are "special": if we double the amount of computational time, fourth order methods halve the error. For lower order methods the improvement is much smaller.
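
To make the scaling explicit, write the cost as $C \propto N^4$ and the error of an order $p$ method as $E \propto N^{-p}$. Eliminating $N$ gives
$$
E \propto C^{-p/4},
$$
so doubling the cost multiplies the error by $2^{-p/4}$: roughly $0.84$ for $p = 1$, $0.71$ for $p = 2$, and exactly $0.5$ for $p = 4$.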

# Extensions

## Gauge

One defining feature of 3+1 relativity, and hence NR, is the gauge dependence. A (very!) partial toy model is the *shifted* wave equation, which can be written in mixed order form as
$$
\partial_t \begin{pmatrix} \phi \\ \Psi \end{pmatrix} = \beta \partial_x \begin{pmatrix} \phi \\ \Psi \end{pmatrix} +  \partial_x^2 \begin{pmatrix} 0 \\ \phi \end{pmatrix} + \begin{pmatrix} \Psi \\ 0 \end{pmatrix},
$$
where the first row now gives $\Psi = \partial_t \phi - \beta \partial_x \phi$, which reduces to the earlier definition $\Psi = \partial_t \phi$ when $\beta = 0$. The shift effectively says how fast the spatial coordinates are moving with respect to space (!). In black hole spacetimes it stops coordinates falling onto the singularity (in a sense).
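
It can help to see the second order equation hiding in this system. Reading off the two rows, $(\partial_t - \beta \partial_x) \phi = \Psi$ and $(\partial_t - \beta \partial_x) \Psi = \partial_x^2 \phi$, so
$$
\left( \partial_t - \beta \partial_x \right)^2 \phi = \partial_x^2 \phi.
$$
For constant $\beta$, substituting a travelling wave $\phi = F(x - v t)$ gives $(v + \beta)^2 = 1$, so the two propagation speeds are
$$
v = -\beta \pm 1.
$$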

Try extending your code above to include the shift term $\beta$. 

* Start with a small, constant shift. What impact does it have on the stable timestep?
* Try a spatially varying shift that is fixed in time. What impact does it have on the steepness of the propagating wave?
* Try a shift condition that is linked to the data; for example, link $\beta$ or its time derivative to a multiple of $\Psi$. Is this stable? What impact does it have on the solution?

## Higher order schemes

In most propagating problems the dominant error contribution comes from the spatial differencing. As discussed in the [theory notebook, 04](./04-theory.ipynb), a fourth order centred differencing scheme approximates the first derivative by
$$
\partial_x q \simeq \frac{1}{12 \Delta x} \left( -q_{j+2} + 8 q_{j+1} - 8 q_{j-1} + q_{j-2} \right) \, .
$$
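
Before changing the evolution code, the formula itself can be checked on a known function. The sketch below applies it to $\sin$ at $x = 1$ and halves $\Delta x$ once; the function name `cd4` and the test function are assumptions for illustration.

```python
import numpy as np

def cd4(f, x, dx):
    """Fourth order central difference approximation to f'(x)."""
    return (-f(x + 2 * dx) + 8 * f(x + dx)
            - 8 * f(x - dx) + f(x - 2 * dx)) / (12 * dx)

exact = np.cos(1.0)  # derivative of sin at x = 1
err_coarse = abs(cd4(np.sin, 1.0, 0.1) - exact)
err_fine = abs(cd4(np.sin, 1.0, 0.05) - exact)

# A fourth order scheme should reduce the error by about 2**4 = 16.
ratio4 = err_coarse / err_fine
print(f"error ratio when dx is halved: {ratio4:.2f}")
```

Note that the stencil reaches two points to either side, which is what drives the extra care needed with ghost points in the evolution code.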

Extend your code above (in first order form!) to use fourth order differencing in space. Be careful with the boundary conditions - significant code updates may be needed. Check the convergence rate of the errors. Does it converge consistently at high order (remember, the time differencing is only second order)? Does it make a major difference? What happens with longer evolution times?