# 2nd-order partial differential equations

---

A general, 2nd-order PDE for a function of 2 variables has the form
\begin{equation*}
A\partial_x^2 u + 2B\partial_x \partial_y u + C \partial_y^2 u + D\partial_x u + E\partial_y u + F = 0 \,,
\end{equation*}
with $A,\dots,F$ and $u$ functions of $(x,y) \in V \subset \mathbb{R}^2$. The PDE is elliptic if $B^2-AC < 0$, parabolic if $B^2 - AC = 0$, and hyperbolic if $B^2-AC > 0$.
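
The discriminant test can be sketched in a few lines of Python. This is only an illustration: the function name `classify_2d` and the tolerance `tol` are assumptions made here, not part of the notes, and the coefficients are evaluated at a single point.

```python
def classify_2d(A: float, B: float, C: float, tol: float = 1e-12) -> str:
    """Classify A u_xx + 2B u_xy + C u_yy + (lower order) = 0 at a single point."""
    disc = B * B - A * C  # the discriminant B^2 - AC
    if disc < -tol:
        return "elliptic"
    if disc > tol:
        return "hyperbolic"
    return "parabolic"


# Constant-coefficient examples in the variables (x, y)
print(classify_2d(1.0, 0.0, 1.0))   # u_xx + u_yy = 0  -> elliptic
print(classify_2d(1.0, 0.0, 0.0))   # u_xx - u_y = 0   -> parabolic
print(classify_2d(1.0, 0.0, -1.0))  # u_xx - u_yy = 0  -> hyperbolic

# Euler-Tricomi equation u_xx + x u_yy = 0: the type changes with x
for x in (1.0, 0.0, -1.0):
    print(x, classify_2d(1.0, 0.0, x))
```

The last loop shows that the type is a pointwise property when the coefficients depend on position.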

This can be extended to higher dimensions by analogy. Take $u(x) = u(x_1, \dots, x_n)$ to be a function of $n$ real variables, and consider the set of functions $a_{ij}(x) = a_{ji}(x), b_i(x), c(x), f(x)$. We construct the general 2nd-order PDE
\begin{equation*}
\sum_{i,j=1}^n a_{ij}\partial_{x_i}\partial_{x_j} u + \sum_{i=1}^n b_{i}\partial_{x_i} u + c u + f = 0 \,.
\end{equation*}
The functions $a_{ij}$ form an $n\times n$ symmetric matrix $a$, which can be diagonalised by $a = SDS^T$, where $S$ is an orthogonal matrix, and $D$ is the diagonal matrix of eigenvalues. Let's consider the change of variables 
\begin{equation*}
y = S^T x \,, \quad y_j = \sum_{i=1}^n S_{ij}x_i \,.
\end{equation*}
We have $\partial_{x_i} = \sum_{j=1}^n S_{ij}\partial_{y_j}$, so $\partial_{x_i}\partial_{x_j}u=\sum_{k,l=1}^n S_{ik}\partial_{y_k}(S_{jl}\partial_{y_l})u$. The PDE becomes
\begin{equation*}
\sum_{i,j,k,l=1}^n a_{ij}S_{ik}S_{jl}\partial_{y_k}\partial_{y_l}u + \sum_{i=1}^n b'_{i}\partial_{y_i} u + c u + f = 0 \,,
\end{equation*}
for some adjusted $b'_{i}$, which absorb the terms where a derivative acts on $S$ (recall that $S$ depends on $x$). The coefficient of the first term looks ugly, but in matrix notation it simplifies to 
\begin{equation*}
\sum_{i,j=1}^n a_{ij}S_{ik}S_{jl} = \sum_{i,j=1}^n S^T_{ki}a_{ij}S_{jl} = (S^T a S)_{kl} = D_{kl} = d_k \delta_{kl} \,,
\end{equation*}
which allows us to write the PDE in canonical form,
\begin{equation*}
\sum_{i=1}^n d_i\partial^2_{y_i} u + \sum_{i=1}^n b'_{i}\partial_{y_i} u + c u + f = 0 \,.
\end{equation*}
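
To make the diagonalisation step concrete, here is a minimal sketch for the $2\times2$ case, assuming constant coefficients at a single point. The helper name `diagonalise_2x2` and the choice of a rotation for $S$ are illustrative assumptions; any orthogonal matrix of eigenvectors would do.

```python
import math


def diagonalise_2x2(A: float, B: float, C: float):
    """Return (S, M) with S a rotation matrix and M = S^T a S for a = [[A, B], [B, C]]."""
    a = [[A, B], [B, C]]
    # Rotation angle that removes the off-diagonal entry: tan(2 theta) = 2B / (A - C)
    theta = 0.5 * math.atan2(2.0 * B, A - C)
    c, s = math.cos(theta), math.sin(theta)
    S = [[c, -s], [s, c]]
    # M_kl = sum_{i,j} S_ik a_ij S_jl, the same index pattern as in the text
    M = [
        [sum(S[i][k] * a[i][j] * S[j][l] for i in range(2) for j in range(2))
         for l in range(2)]
        for k in range(2)
    ]
    return S, M


S, M = diagonalise_2x2(2.0, 1.0, 2.0)
d1, d2 = M[0][0], M[1][1]
print("off-diagonal entries:", M[0][1], M[1][0])  # both vanish up to rounding
print("eigenvalues:", d1, d2)
```

The diagonal entries of `M` are the $d_k$ that appear in the canonical form.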
The classification is as follows. The PDE is elliptic if all eigenvalues are non-zero and have the same sign. It is hyperbolic if none of the eigenvalues is zero and exactly 1 has the opposite sign to the others. It is parabolic if exactly 1 of the eigenvalues is zero and the others all have the same sign. Since the eigenvalues are functions of $x$, this classification is local and could change from point to point. PDEs that are elliptic at every point in the domain are uniformly elliptic, and similarly for parabolic and hyperbolic ones.

Connecting to the 2-dimensional case, the matrix $a = \begin{pmatrix} A & B \\ B & C \end{pmatrix}$ has determinant $AC - B^2 = d_1 d_2$ (the determinant is the product of the eigenvalues). Since we assume the matrix is non-zero, it can have at most 1 zero eigenvalue. This occurs when $B^2 - AC = 0$, which is the parabolic case. If $B^2 - AC < 0$, the determinant is positive, so $d_{1,2}$ must have the same sign, which is the elliptic case. Finally, if $B^2 - AC > 0$, the determinant is negative, so $d_{1,2}$ must have opposite signs, which is the hyperbolic case.


# Para/hyperbolic from elliptic

---

From now on, we will employ Einstein's summation notation, where repeated indices are summed over the assumed range; e.g. $a_ib_i = \sum_{j=1}^{n} a_jb_j$. We shall rewrite the canonical PDE (with wilful renamings) from the previous section as 
\begin{equation*}
\hat{L}u = f \,,\quad \hat{L} \coloneqq d_i\partial^2_i + b_i\partial_i + c \,,
\end{equation*}
where we abbreviate $\partial_i = \partial_{x_i}$. To be precise, our coordinates here are $x_i \in V\subset\mathbb{R}^n$. Let's now take $\hat{L}$ to be elliptic with the convention $d_i>0$ for all $i$, and consider an extra coordinate $t\in I\subset\mathbb{R}$. Our total domain is the $(n+1)$-dimensional space $\Omega = V\times I \subset\mathbb{R}^{n+1}$. We define on it the operator
\begin{equation*}
\hat{\mathcal{L}}_p \coloneqq \partial_t - \hat{L}\,,
\end{equation*}
where the coefficients in $\hat{L}$ are functions on $\Omega$. If we connect this to the general form of a 2nd-order PDE, we find the equivalent of the $a$ matrix to be 
\begin{equation*}
a = \begin{pmatrix} 0 & 0 \\ 0 & -D \end{pmatrix} \,.
\end{equation*}
Since $D$ is positive-definite, this has 1 zero eigenvalue and $n$ negative eigenvalues, and thus describes a parabolic system. Similarly, the $a$ matrix for the operator
\begin{equation*}
\hat{\mathcal{L}}_h \coloneqq \partial_t^2 - \hat{L} \,,
\end{equation*}
is 
\begin{equation*}
a = \begin{pmatrix} 1 & 0 \\ 0 & -D \end{pmatrix} \,.
\end{equation*}
This has 1 positive eigenvalue (equal to 1) and $n$ negative eigenvalues, which encodes a hyperbolic system. So by extending 1 dimension, we have constructed parabolic and hyperbolic PDEs from elliptic ones.
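
The sign counting behind these statements can be sketched as follows. The function name `classify_from_eigenvalues` and the sample values in `D` are assumptions for illustration. Because both block matrices are diagonal, their eigenvalues are simply their diagonal entries.

```python
def classify_from_eigenvalues(eigs, tol=1e-12):
    """Classify a 2nd-order PDE at a point from the eigenvalues of its a matrix."""
    n = len(eigs)
    pos = sum(1 for d in eigs if d > tol)
    neg = sum(1 for d in eigs if d < -tol)
    zero = n - pos - neg
    if zero == 0 and (pos == n or neg == n):
        return "elliptic"
    if zero == 0 and (pos == 1 or neg == 1):
        return "hyperbolic"
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return "parabolic"
    return "none of the three types"


D = [2.0, 0.5, 3.0]                      # hypothetical positive d_i of an elliptic operator
a_parabolic = [0.0] + [-d for d in D]    # diagonal of a for d/dt - L
a_hyperbolic = [1.0] + [-d for d in D]   # diagonal of a for d^2/dt^2 - L

print(classify_from_eigenvalues(D))             # elliptic
print(classify_from_eigenvalues(a_parabolic))   # parabolic
print(classify_from_eigenvalues(a_hyperbolic))  # hyperbolic
```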

Common examples of this are the heat and wave equations. We are in $\mathbb{R}^{n+1}$ with coordinates $(t, x_i)$. The $n$-dimensional Laplace equation is
\begin{equation*}
\partial_i^2 u = 0 \,,
\end{equation*}
and is clearly elliptic. The heat equation is 
\begin{equation*}
\partial_t u - \partial_i^2 u = 0 \,,
\end{equation*}
and the wave equation is 
\begin{equation*}
\partial_t^2u - \partial_i^2 u = 0 \,.
\end{equation*}
These examples give a hint towards the behaviour of the solutions of such PDEs. The parabolic and hyperbolic equations generally describe dynamical systems, with $t$ representing time, whilst elliptic ones describe steady states.
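
As a sanity check, one can verify numerically that simple sample functions solve the three equations. The sketch below uses central finite differences; the step `H` and the particular sample solutions are choices made here for illustration, with $n=2$ for Laplace and $n=1$ for the heat and wave equations.

```python
import math

H = 1e-3  # finite-difference step (an arbitrary illustrative choice)


def first_derivative(f, x):
    return (f(x + H) - f(x - H)) / (2.0 * H)


def second_derivative(f, x):
    return (f(x + H) - 2.0 * f(x) + f(x - H)) / (H * H)


def laplace_residual(x, y):
    """u_xx + u_yy for the steady-state sample u = exp(x) sin(y)."""
    u = lambda p, q: math.exp(p) * math.sin(q)
    return second_derivative(lambda p: u(p, y), x) + second_derivative(lambda q: u(x, q), y)


def heat_residual(t, x):
    """u_t - u_xx for the decaying sample u = exp(-t) sin(x)."""
    u = lambda s, p: math.exp(-s) * math.sin(p)
    return first_derivative(lambda s: u(s, x), t) - second_derivative(lambda p: u(t, p), x)


def wave_residual(t, x):
    """u_tt - u_xx for the travelling sample u = sin(x - t)."""
    u = lambda s, p: math.sin(p - s)
    return second_derivative(lambda s: u(s, x), t) - second_derivative(lambda p: u(t, p), x)


print(laplace_residual(0.3, 0.7), heat_residual(0.5, 1.1), wave_residual(0.5, 1.1))
```

All three residuals are zero up to discretisation and rounding error. The heat sample decays in $t$ while the wave sample propagates, in line with the dynamical interpretation above.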

# Brief detour into geometry and general relativity

---

A central object in the study of geometry is the metric $g$. It is a symmetric matrix that encodes distances between points on a manifold. For coordinates $x \in M$, where $M$ is an $n$-dimensional manifold (we shall gloss over the exact definition of this), the infinitesimal distance between 2 points on $M$ is given by 
\begin{equation*}
ds^2 = g_{ij}(x)dx^idx^j \,.
\end{equation*}
A Riemannian manifold is one where $g$ is positive-definite. A Lorentzian manifold is one where $g$ has 1 negative eigenvalue and $n-1$ positive ones. Since we're dealing with distances (or some general notion of them), we generally require $g$ to be non-singular (no zero eigenvalues). These sound similar to the elliptic and hyperbolic classifications earlier, and we can show this connection explicitly. 

We define the Laplacian $\Delta$ as the operator
\begin{equation*}
\Delta u = \frac{1}{\sqrt{|g|}}\partial_i\left({\sqrt{|g|}}g^{ij}\partial_j u\right) \,,
\end{equation*}
where $g^{ij}$ with raised indices is the inverse of $g_{ij}$, and $|g| = |\det g_{ij}|$. In physics, the equation $\Delta u = 0 $ describes the evolution of a free (scalar) field $u$ living on $M$. The purely 2nd-order term is $g^{ij}\partial_i\partial_j u$, so this generalised Laplace equation is elliptic for Riemannian manifolds and hyperbolic for Lorentzian ones.

Let's look at some examples. In general relativity, a spacetime in the absence of any massive objects is flat, Minkowski space. This is Lorentzian and has the metric 
\begin{equation*}
ds^2 = -c^2dt^2 + dx^2 + dy^2 + dz^2 \,,
\end{equation*}
where $c$ is the speed of light. The Laplace equation is the standard wave equation
\begin{equation*}
\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} - \frac{\partial^2 u}{\partial z^2}= 0 \,.
\end{equation*}
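
For completeness, here is the short computation behind this statement, using only the definition of $\Delta$ above. In the coordinates $(t, x, y, z)$,
\begin{equation*}
g_{ij} = \mathrm{diag}(-c^2, 1, 1, 1) \,,\quad g^{ij} = \mathrm{diag}(-1/c^2, 1, 1, 1) \,,\quad \sqrt{|g|} = c \,.
\end{equation*}
All components are constant, so $\sqrt{|g|}$ and $g^{ij}$ pass through the derivatives and
\begin{equation*}
\Delta u = g^{ij}\partial_i\partial_j u = -\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} + \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} \,.
\end{equation*}
Setting $\Delta u = 0$ and multiplying by $-1$ gives the wave equation above.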

# Elliptic PDEs: Newtonian gravity

---

We're living in 3 spatial dimensions with coordinates $x, y, z$. We put a point mass $M$ at the origin. What's the gravitational potential $V$? Gauss' law states that $V$ satisfies the elliptic PDE
\begin{equation*}
\Delta V = 4\pi G\rho \,,
\end{equation*}
where $G$ is Newton's constant and $\rho$ is the mass density. For a point mass at the origin, $\rho = M\delta(x,y,z)$. Let's use spherical coordinates $(r, \theta, \phi)$ defined in the usual way and impose spherical symmetry, so $V = V(r)$. Performing the change of coordinates and integrating over the angular coordinates, Gauss' law becomes
\begin{equation*}
\frac{d}{dr}\left(r^2\frac{dV}{dr}\right) = GM\delta(r)\,.
\end{equation*}
Another way of deriving this is to note that in spherical coordinates, infinitesimal distances are given by $ds^2 = dr^2 + r^2(d\theta^2 + \sin^2\theta d\phi^2)$. The metric is $g = \mathrm{diag}(1, r^2, r^2\sin^2\theta)$ with $\sqrt{|g|} = r^2\sin\theta$ (which is the Jacobian!), and the equation above comes directly from the definition of the Laplacian. 
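
Spelled out, the metric route goes as follows (a sketch, using the definition of $\Delta$ from the previous section). The determinant is $\det g = r^4\sin^2\theta$, so $\sqrt{|g|} = r^2\sin\theta$, and the inverse metric has $g^{rr} = 1$. For a spherically symmetric $V = V(r)$ only the $r$ derivatives survive,
\begin{equation*}
\Delta V = \frac{1}{r^2\sin\theta}\,\partial_r\left(r^2\sin\theta\, \partial_r V\right) = \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dV}{dr}\right) \,.
\end{equation*}
Multiplying by the volume element $r^2\sin\theta\, d\theta\, d\phi$ and integrating over the angles gives $4\pi\, \frac{d}{dr}\left(r^2 \frac{dV}{dr}\right)$. The factor of $4\pi$ cancels the one on the right-hand side of Gauss' law, which reproduces the left-hand side of the radial equation above.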

To solve this, let's pretend that $r$ can be negative. Clearly, we want $V = 0$ there. Integrating Gauss' law over a small interval $(-\epsilon, \epsilon)$, $\epsilon > 0$, we find
\begin{equation*}
\epsilon^2(V'(\epsilon) - V'(-\epsilon)) = GM \,.
\end{equation*}
Taking $\epsilon \to 0^+$ yields the jump condition 
\begin{equation*}
\lim_{\epsilon\to0^+}\epsilon^2V'(\epsilon) = GM \,.
\end{equation*}
The left limit $V'(0^-) = 0$ as $V$ is 0 for all negative $r$. We use this as a boundary condition to evolve $V$ for all $r > 0$. The general solution to the unsourced equation is 
\begin{equation*}
V = a + \frac{b}{r} \,.
\end{equation*}
Since $V=0$ at infinity (otherwise we'll have infinite energies), $a=0$. For $r > 0$, $r^2V' = -b$. The jump condition then fixes $b = -GM$. In full,
\begin{equation*}
V = -\frac{GM}{r} \,,
\end{equation*}
which is the Newtonian gravitational potential of a point mass.
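
A quick numerical check of this result: away from the origin the combination $r^2 V'$ should be constant and equal to $GM$, so that $\frac{d}{dr}(r^2 V') = 0$ for $r>0$ and the jump condition holds. The value of `G_M` and the step `h` below are arbitrary illustrative choices.

```python
G_M = 1.0  # the product G*M in arbitrary units (illustrative value)


def V(r: float) -> float:
    """Potential of a point mass, V = -GM / r, for r > 0."""
    return -G_M / r


def dV(r: float, h: float = 1e-5) -> float:
    """Central finite-difference approximation of V'(r)."""
    return (V(r + h) - V(r - h)) / (2.0 * h)


def flux(r: float) -> float:
    """r^2 V'(r); constant in r for a solution of the unsourced radial equation."""
    return r * r * dV(r)


for r in (0.5, 1.0, 3.0):
    print(r, flux(r))  # each value is close to G_M
```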

# Parabolic PDEs: Heat and finance I - Brownian motion

---

The heat equation in 1 spatial dimension is 
\begin{equation*}
\frac{\partial u}{\partial t} - \alpha\frac{\partial^2 u}{\partial x^2} = 0 \,,
\end{equation*}
where $\alpha > 0$ is a constant. We generally take $t\in \mathbb{R}^+$ and $x \in \mathbb{R}$. This equation (as its name suggests) arises in the modelling of temperature. But we dealt with physics already above, so let's look at a different way this equation comes up.

Brownian motion is a stochastic process $\{B_t\}_{t\in[0,\infty)}$ on a probability space with measure $\mathbb{Q}$ that obeys $B_t \sim N(0,t)$ and $\mathbb{E}_s[B_t] = B_s$ for all $s \leq t$, where $\mathbb{E}_s$ is the conditional expectation given all the information up to and including $s$. We take the convention $B_0=0$. 

Let $\rho(t,\cdot):\mathbb{R}\to\mathbb{R}^+$ be the (unconditional) probability density of $B_t$. It is Gaussian, and is given by
\begin{equation*}
\rho(t,x) = \frac{1}{\sqrt{2\pi t}}\exp\left(-\frac{x^2}{2t}\right) \,.
\end{equation*}
Some algebra shows that
\begin{equation*}
\frac{\partial\rho}{\partial t} = \frac{1}{2}\frac{\partial^2\rho}{\partial x^2} \,,
\end{equation*}
so $\rho$ solves the heat equation with $\alpha=1/2$. $\rho(t,x)$ is often called the heat kernel. It has the key property $\lim_{t\to0^+}\rho(t,x) = \rho(0,x) = \delta(x)$. To show this, we need to do some gymnastics. Firstly, let $\lambda > 0$ be a real constant, take $g(x)$ to be twice differentiable and $f(x)$ to be a continuous test function. We assume $g(x)$ has a unique global maximum located at $x_0$ with $g''(x_0) < 0$. Consider the integral
\begin{equation*}
I(\lambda) = \int_{-\infty}^{\infty} \exp\left(\lambda g(x)\right) f(x) dx \,.
\end{equation*}
The saddle point method tells us that 
\begin{equation*}
I(\lambda) \sim \sqrt{\frac{2\pi}{\lambda |g''(x_0)|}}\exp\left(\lambda g(x_0)\right)f(x_0) \,,\quad \lambda\to\infty \,.
\end{equation*}
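For our case, we identify $\lambda = 1/(2t)$ and $g(x) = -x^2$, so $t\to0^+$ corresponds to $\lambda\to\infty$. The global maximum is at $x_0 = 0$ with $g(x_0) = 0$ and $|g''(x_0)| = 2$, so the prefactor $\sqrt{2\pi/(2\lambda)} = \sqrt{2\pi t}$ cancels the normalisation of $\rho$, and the saddle point method gives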
\begin{equation*}
\lim_{t\to0^+}\int_{-\infty}^{\infty}\rho(t,x)f(x)dx = f(0) \,,
\end{equation*}
which defines $\rho(0,x) = \delta(x)$. A corollary of this is that the solution to 
\begin{equation*}
\frac{\partial u}{\partial t} = \frac{1}{2}\frac{\partial^2 u}{\partial x^2} \,,\quad u(0,x)=f(x) \,,
\end{equation*}
for an initial profile $f(x)$ is given by
\begin{equation*}
u(t,x) = \int_{-\infty}^{\infty}\rho(t,x-y) f(y) dy \,.
\end{equation*}
Shifting the integration variable from $y$ to $x+y$ and using $\rho(t,-y) = \rho(t,y)$, the integral above becomes
\begin{equation*}
u(t,x) = \int_{-\infty}^{\infty}\rho(t,y) f(x + y) dy \,.
\end{equation*} 
This allows us to provide a probabilistic interpretation of $u(t,x)$ as the expectation of $f(x + B_t)$,
\begin{equation*}
u(t,x) = \mathbb{E}[f(x+B_t)] \,.
\end{equation*}
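
The statements in this section can be checked numerically. The sketch below tests by finite differences that $\rho$ solves the heat equation with $\alpha = 1/2$, and evaluates $\mathbb{E}[f(x+B_t)]$ by a midpoint-rule quadrature. The helper names, the step `h`, and the quadrature settings are assumptions made for illustration. For $f(x) = x^2$ the exact answer is $x^2 + t$, and for $f = \cos$ it is $\cos(x)\,e^{-t/2}$.

```python
import math


def rho(t: float, x: float) -> float:
    """Heat kernel: density of B_t ~ N(0, t)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def heat_residual(t: float, x: float, h: float = 1e-3) -> float:
    """Finite-difference estimate of rho_t - (1/2) rho_xx."""
    rho_t = (rho(t + h, x) - rho(t - h, x)) / (2.0 * h)
    rho_xx = (rho(t, x + h) - 2.0 * rho(t, x) + rho(t, x - h)) / (h * h)
    return rho_t - 0.5 * rho_xx


def expectation(f, t: float, x: float, half_width: float = 10.0, n: int = 4000) -> float:
    """Midpoint-rule value of the integral of rho(t, y) f(x + y) over |y| <= half_width * sqrt(t)."""
    L = half_width * math.sqrt(t)
    dy = 2.0 * L / n
    total = 0.0
    for k in range(n):
        y = -L + (k + 0.5) * dy
        total += rho(t, y) * f(x + y) * dy
    return total


print(heat_residual(1.0, 0.5))                    # close to 0
print(expectation(lambda z: z * z, 2.0, 1.5))     # close to 1.5**2 + 2.0
print(expectation(math.cos, 1e-4, 0.3), math.cos(0.3))  # small t: close to f(x)
```

The last line illustrates the delta-function limit: for small $t$ the expectation returns the initial profile.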

# Parabolic PDEs: Heat and finance II - Feynman-Kac and Black-Scholes

---

In finance, the fair value of a derivative contract with payoff $f(X_T)$ at expiry date $T$ is given by
\begin{equation*}
V(t,x) = \mathbb{E}[D(t,T)f(X_T) \rvert X_t = x] \,,
\end{equation*}
with $D(t,T)$ the (forward) stochastic discount factor and $\{X_t\}_{t\in[0,\infty)}$ a stochastic process. This can also be rewritten as the solution to a parabolic PDE with a suitable terminal condition at $t=T$. The exact form is given by the Feynman-Kac formula.

Proving Feynman-Kac is quite involved. Instead, we will demonstrate how the Black-Scholes equation, an instance of the Feynman-Kac formalism, can be obtained using the techniques we developed above. We will assume deterministic rates, so $D(t,T) = \exp(-r(T-t))$, and consider a process $X_t = m(t) + \sigma B_t$, where $m(t)$ is deterministic. 

The condition $X_t = x$ implies $B_t = (x - m(t)) /\sigma = b(t,x)$. Given this, $B_T\sim N(b(t,x), T-t)$. Since the combination $T-t$ will keep on appearing in this section, we will define $\tau = T-t$ and work with $\tau$ instead. The probability distribution function of $B_T | B_t = b(\tau,x)$ is 
\begin{equation*}
\rho(y; \tau, x) = \frac{1}{\sqrt{2\pi \tau}}\exp\left(-\frac{(y-b(\tau,x))^2}{2\tau}\right) \,,
\end{equation*}
with which we can express
\begin{equation*}
V(\tau,x) = \exp(-r\tau)\int_{-\infty}^{\infty}\rho(y;\tau,x)f(m(T) + \sigma y) dy \,,
\end{equation*}
since $X_T = m(T) + \sigma B_T$. Now, a straightforward calculation yields the following relation for the distribution,
\begin{equation*}
\frac{\partial\rho}{\partial\tau} = \frac{1}{2}\sigma^2\frac{\partial^2\rho}{\partial x^2} - \frac{dm}{d\tau}\frac{\partial\rho}{\partial x}\,,
\end{equation*}
where $dm/d\tau$ denotes the derivative of $m(T-\tau)$ at fixed $T$, so $dm/d\tau = -dm/dt$. This implies (as nothing in $f(m(T)+\sigma y)$ depends on $\tau$ or $x$; for $m(T)$ we are varying the $t$ part of $\tau$ not $T$), 
\begin{equation*}
\frac{\partial V}{\partial\tau} = -rV + \frac{1}{2}\sigma^2\frac{\partial^2 V}{\partial x^2} - \frac{dm}{d\tau}\frac{\partial V}{\partial x}\,,
\end{equation*}
where the first term comes from the discount factor. Furthermore, the saddle point approximation shows that $\rho(y;\tau=0,x) = \delta(y-b(\tau=0,x))$, which gives the terminal condition
\begin{equation*}
V(\tau=0,x) = f(m(T) + \sigma b(\tau=0,x)) = f(x) \,,
\end{equation*}
i.e. the payoff $f(X_T)$ evaluated at $X_T = x$.
We are nearly there! 

To get to Black-Scholes, let's consider the textbook lognormal stock process $S_t = S_0\exp(\mu t + \sigma B_t)$, where $\mu = r - \sigma^2/2$. Defining $X_t = \log S_t$, we find $m(t) = \log S_0 + \mu t$ with $dm/dt = \mu$, and hence $dm/d\tau = -\mu$. We treat the stock process as the fundamental one, so it is useful to change coordinates from $x$ to $s = \exp(x)$, for which $\partial_x = s\partial_s$ and $\partial_x^2 = s^2\partial_s^2 + s\partial_s$. Doing so, and transforming back to the $t$ variable with $\partial_\tau = -\partial_t$, we have
\begin{equation*}
\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 s^2\frac{\partial^2 V}{\partial s^2} + rs\frac{\partial V}{\partial s} - rV = 0 \,.
\end{equation*}
This is precisely the Black-Scholes PDE.
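
As a hedged illustration, the sketch below prices a European call with payoff $f(s) = \max(s-K, 0)$ in two ways. It first evaluates the discounted expectation by quadrature over a standard normal variable. It then uses the standard closed-form Black-Scholes call price, which is quoted here and not derived in these notes, and checks by finite differences that it satisfies the PDE above. The function names, the strike `K`, and all numerical parameters are assumptions chosen for the example.

```python
import math


def norm_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def call_closed_form(s, K, r, sigma, tau):
    """Standard Black-Scholes price of a European call, tau = T - t > 0."""
    d1 = (math.log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def call_by_expectation(s, K, r, sigma, tau, n=20000, half_width=10.0):
    """exp(-r tau) E[max(S_T - K, 0) | S_t = s] by the midpoint rule in z ~ N(0, 1)."""
    mu = r - 0.5 * sigma ** 2
    dz = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        z = -half_width + (k + 0.5) * dz
        s_T = s * math.exp(mu * tau + sigma * math.sqrt(tau) * z)
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += density * max(s_T - K, 0.0) * dz
    return math.exp(-r * tau) * total


def pde_residual(s, K, r, sigma, tau, h=1e-2, dt=1e-4):
    """V_t + (1/2) sigma^2 s^2 V_ss + r s V_s - r V for the closed-form price."""
    V = lambda s_, tau_: call_closed_form(s_, K, r, sigma, tau_)
    V_t = -(V(s, tau + dt) - V(s, tau - dt)) / (2.0 * dt)  # d/dt = -d/dtau
    V_s = (V(s + h, tau) - V(s - h, tau)) / (2.0 * h)
    V_ss = (V(s + h, tau) - 2.0 * V(s, tau) + V(s - h, tau)) / (h * h)
    return V_t + 0.5 * sigma ** 2 * s ** 2 * V_ss + r * s * V_s - r * V(s, tau)


args = (100.0, 100.0, 0.05, 0.2, 1.0)  # s, K, r, sigma, tau
print(call_closed_form(*args), call_by_expectation(*args), pde_residual(*args))
```

The two prices agree up to quadrature error and the residual is close to zero, which is the Feynman-Kac correspondence in this special case.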


# Hyperbolic PDEs: Wave equation

---

From Maxwell's classical field equations for electromagnetism to bosonic string theory, the wave equation underpins the dynamics of a vast number of physical processes. Here, we look at 2 dimensions $(t, x)$ with $t$ being time and $x$ being space and study the equation
\begin{equation*}
\frac{\partial^2 u}{\partial t^2} - c^2\frac{\partial^2 u}{\partial x^2} = 0 \,,
\end{equation*}
where $c>0$ is a constant. Although we are in 2 dimensions and we generally agree we live in 4, this exact equation actually shows up in string theory, so it's definitely worth studying!

Let's consider a change of coordinates $(t, x) \to (x^+, x^-)$ defined by
\begin{equation*}
x^+ = x + ct \,,\quad x^- = x - ct \,.
\end{equation*}
These are called light-cone coordinates. In these coordinates, the wave equation reads
\begin{equation*}
\frac{\partial^2 u}{\partial x^+ \partial x^-} = 0 \,.
\end{equation*}
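To see where this comes from, here is the chain-rule computation, writing $\partial_\pm = \partial/\partial x^\pm$ (a shorthand introduced for this sketch):
\begin{align*}
\partial_t &= \frac{\partial x^+}{\partial t}\partial_+ + \frac{\partial x^-}{\partial t}\partial_- = c\left(\partial_+ - \partial_-\right) \,,\\
\partial_x &= \frac{\partial x^+}{\partial x}\partial_+ + \frac{\partial x^-}{\partial x}\partial_- = \partial_+ + \partial_- \,,\\
\partial_t^2 - c^2\partial_x^2 &= c^2\left[\left(\partial_+ - \partial_-\right)^2 - \left(\partial_+ + \partial_-\right)^2\right] = -4c^2\,\partial_+\partial_- \,.
\end{align*}
Since $c > 0$, the wave equation is equivalent to $\partial_+\partial_- u = 0$.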
This effectively separates the equation: $\partial_{-}u$ must be a constant in $x^+$ (and vice-versa). The general solution is then
\begin{equation*}
u(x^+, x^-) = f(x^+) + g(x^-) \,,
\end{equation*}
for arbitrary functions $f$ and $g$. In the original coordinates, this is 
\begin{equation*}
u(t, x) = f(x + ct) + g(x - ct) \,.
\end{equation*}
The wave equation is 2nd-order in time, so an initial value problem requires a specification of $u(0,x)$ and $\partial_t u(0, x)$. Suppose
\begin{equation*}
u(0,x) = a(x) \,,\quad \frac{\partial u}{\partial t}(0,x) = b(x) \,.
\end{equation*}
In terms of $f$ and $g$, these are
\begin{align*}
f(x) + g(x) = a(x) \,,\quad c\left(f'(x) - g'(x)\right) = b(x) \,.
\end{align*}
Assuming $b$ is integrable, we have
\begin{equation*}
f(x) = \frac{1}{2}a(x) + \frac{1}{2c}\int_{-\infty}^x b(v) dv\,,\quad g(x) = a(x) - f(x) \,, 
\end{equation*}
so the solution to the initial value problem is
\begin{equation*}
u(t,x) = \frac{1}{2}\left(a(x + ct) + a(x - ct)\right) + \frac{1}{2c}\int_{x-ct}^{x+ct}b(v) dv\,.
\end{equation*}
This is also known as d'Alembert's formula.
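
A minimal numerical sketch of d'Alembert's formula follows. It combines the average of the initial profile $a$ at the two points $x \pm ct$ with the integral of $b$ between them, evaluated by a midpoint rule. The names `integrate` and `dalembert` and the quadrature resolution are assumptions made for this example. For $a = \sin$, $b = 0$ the exact solution is $\sin(x)\cos(ct)$, and for $a = 0$, $b = \cos$ it is $\cos(x)\sin(ct)/c$.

```python
import math


def integrate(fn, lo: float, hi: float, n: int = 2000) -> float:
    """Composite midpoint rule for the integral of fn over [lo, hi]."""
    width = (hi - lo) / n
    return sum(fn(lo + (k + 0.5) * width) for k in range(n)) * width


def dalembert(a, b, c: float, t: float, x: float) -> float:
    """Solution of u_tt - c^2 u_xx = 0 with u(0, x) = a(x) and u_t(0, x) = b(x)."""
    left, right = x - c * t, x + c * t
    return 0.5 * (a(right) + a(left)) + integrate(b, left, right) / (2.0 * c)


c, t, x = 2.0, 0.7, 0.4
zero = lambda v: 0.0

# Standing wave from a displaced profile released at rest
print(dalembert(math.sin, zero, c, t, x), math.sin(x) * math.cos(c * t))
# Response to an initial velocity profile with zero displacement
print(dalembert(zero, math.cos, c, t, x), math.cos(x) * math.sin(c * t) / c)
```

At $t = 0$ the integral vanishes and the function returns $a(x)$, as required by the initial condition.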

Before we finish, let's discuss the combinations $x^\pm = x\pm ct$. Consider a function $h(x^+) = h(x + ct)$. Both $x$ and $t$ are free variables, so there are multiple ways $x+ct$ can evolve. What is required for $x+ct$ to remain constant? Let's fix $x + ct = x_0$. Solving this for $x$ gives $x = x_0 - ct$. So as $t$ increases, $x$ must decrease with speed $c$ in order for $x + ct$ to stay fixed. This means that $h(x+ct)$ is a left-moving function --- as time goes by, the profile of $h$ moves to the left. Similarly, a function of $x^- =x-ct$ is right-moving. d'Alembert's formula then expresses the general solution to the wave equation as a sum of left- and right-moving waves.