## The Klein-Gordon Equation
To get a relativistic quantum mechanical equation, we can take the relativistic
energy-momentum relation,

$$
E^2 - p^2c^2 = (mc^2)^2
$$

and apply the quantum mechanical prescription

$$
E\to i\hbar \dfrac{\partial}{\partial t}, \quad \mathbf{p}\to -i\hbar\boldsymbol{\nabla},
$$

where we're just working in the free-particle case ($V(\mathbf{r}) = 0$). Letting
both sides act on a wave function $\Psi$ gets us the Klein-Gordon equation

$$
 -\hbar^2\dfrac{\partial^2\Psi}{\partial t^2} + \hbar^2c^2\nabla^2\Psi = (mc^2)^2\Psi.
$$

Rearranging constants yields

$$
\dfrac{1}{c^2}\dfrac{\partial^2\Psi}{\partial t^2} - \nabla^2\Psi =
-\left(\dfrac{mc}{\hbar}\right)^2\Psi.
$$

This is a wave equation with an added mass term; it reduces to the ordinary
wave equation when $m = 0$. Schrödinger actually discovered this equation
before the one that bears his name, but abandoned it after it failed to
predict the energy levels of the hydrogen atom. It turns out that the
Klein-Gordon equation describes spin-0 quantum objects, so it is not suited
to modeling the spin-1/2 electron.
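
As a quick consistency check (a sketch, assuming a plane-wave trial solution with constant amplitude $A$, which is not introduced above), substituting

$$
\Psi = A\,e^{i(\mathbf{p}\cdot\mathbf{r} - Et)/\hbar}
\quad\Longrightarrow\quad
\dfrac{\partial^2\Psi}{\partial t^2} = -\dfrac{E^2}{\hbar^2}\Psi, \qquad
\nabla^2\Psi = -\dfrac{p^2}{\hbar^2}\Psi
$$

into the Klein-Gordon equation gives back the energy-momentum relation we started from:

$$
-\hbar^2\dfrac{\partial^2\Psi}{\partial t^2} + \hbar^2c^2\nabla^2\Psi
= \left(E^2 - p^2c^2\right)\Psi = (mc^2)^2\Psi.
$$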

## The Dirac Equation
For the aforementioned reason and others, Dirac wanted to find an equation that
was *first* order in space and time. The question he sought to answer was how to
take the square root of the relativistic energy-momentum relation. In 
four-vector notation,

$$
p_\mu p^\mu - (mc)^2 = 0,
$$

where $p_\mu$ is the covariant momentum four-vector and $p^\mu$ is the 
contravariant momentum four-vector, related by

$$
p_\mu = \eta_{\mu\nu}p^\nu
$$

with the Minkowski metric 
$\eta_{\mu\nu} = \eta^{\mu\nu} = \operatorname{diag}(1, -1, -1, -1)$
and an implied sum over repeated indices. Dirac tried factoring this equation by
analogy to a difference of squares,

$$
p_\mu p^\mu - (mc)^2 = (\beta^\kappa p_\kappa + mc)
(\gamma^\lambda p_\lambda - mc)
$$

with $\beta^\kappa$ and $\gamma^\lambda$ as coefficients to be determined.
Carrying out the multiplication, we get

$$
p_\mu p^\mu - (mc)^2 =
\beta^\kappa\gamma^\lambda p_\kappa p_\lambda + 
(\gamma^\kappa - \beta^\kappa) p_\kappa mc - (mc)^2.
$$

Since the relativistic energy-momentum relation has no terms linear in $p_\mu$,
the linear term must vanish, so $\beta^\kappa = \gamma^\kappa$. This means that

$$
\gamma^\kappa \gamma^\lambda p_\kappa p_\lambda = p_\mu p^\mu.
$$

Since the right-hand side contains no cross terms $p_\kappa p_\lambda$ with
$\kappa \ne \lambda$, the coefficients of all such terms on the left must vanish,
i.e.

$$
\gamma^\mu \gamma^\nu + \gamma^\nu\gamma^\mu = 0, \qquad \mu \ne\nu
$$

and each squared term must match the corresponding diagonal entry of the metric, $1$ or $-1$. In fact,

$$
(\gamma^0)^2 = 1, \qquad (\gamma^1)^2 = (\gamma^2)^2 = (\gamma^3)^2 = -1.
$$
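To see where these conditions come from, it helps to write the sum out. The following is a worked expansion that uses only the symmetry $p_\kappa p_\lambda = p_\lambda p_\kappa$ (the momentum components are treated as ordinary commuting quantities), with the sums over indices written explicitly:

$$
\gamma^\kappa \gamma^\lambda p_\kappa p_\lambda
= \dfrac{1}{2}\left(\gamma^\kappa\gamma^\lambda + \gamma^\lambda\gamma^\kappa\right) p_\kappa p_\lambda
= \sum_{\mu}(\gamma^\mu)^2 (p_\mu)^2
+ \sum_{\kappa<\lambda}\left(\gamma^\kappa\gamma^\lambda + \gamma^\lambda\gamma^\kappa\right) p_\kappa p_\lambda.
$$

Comparing term by term with $p_\mu p^\mu = (p_0)^2 - (p_1)^2 - (p_2)^2 - (p_3)^2$ gives the conditions above.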

However, no ordinary numbers obey these relations, because numbers commute.
Dirac had the insight to treat these not as numbers, but as matrices, which
need not commute and so can satisfy the above relations. As matrices, the
$\gamma^\mu$ follow the succinct relation

$$
\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu},
$$ (eq:gamma-anticommutation)
where $\{A, B\} = AB + BA$ is the *anticommutator* of matrices $A$ and $B$, and
there is an implied identity matrix of appropriate size after $\eta^{\mu\nu}$.
It turns out that the smallest matrices that obey these anticommutator relations
are $4\times 4$. In the Dirac basis, these matrices are

$$
\gamma^0 =
\begin{pmatrix}
I_2 & 0_2\\ 0_2 & -I_2
\end{pmatrix}, \qquad
\gamma^i =
\begin{pmatrix}
0_2 & \sigma^i\\ -\sigma^i & 0_2
\end{pmatrix}
$$ (eq:gamma-def)

where $\sigma^i\ (i = 1, 2, 3)$ are the Pauli matrices,

$$
\sigma^1 = 
\begin{pmatrix}
0 & 1 \\ 1 & 0
\end{pmatrix}, \qquad
\sigma^2 = 
\begin{pmatrix}
0 & -i \\ i & 0
\end{pmatrix}, \qquad
\sigma^3 = 
\begin{pmatrix}
1 & 0 \\ 0 & -1
\end{pmatrix},
$$
and each entry in the gamma matrices is a $2\times 2$ matrix, with $I_2$
representing the identity matrix and $0_2$ representing a matrix of zeros.
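
The anticommutation relation in Eq. {eq}`eq:gamma-anticommutation` can be checked numerically for the matrices in Eq. {eq}`eq:gamma-def`. The following is a minimal NumPy sketch, not part of the derivation; the helper names `anticommutator` and `satisfies_clifford` are chosen here purely for illustration.

```python
import numpy as np

# Pauli matrices sigma^1, sigma^2, sigma^3
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Gamma matrices in the Dirac basis, assembled from 2x2 blocks
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# Minkowski metric with signature (+, -, -, -)
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def anticommutator(a, b):
    """Return {a, b} = ab + ba."""
    return a @ b + b @ a


def satisfies_clifford(gammas, metric):
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} I for every index pair."""
    identity = np.eye(gammas[0].shape[0])
    return all(
        np.allclose(anticommutator(gammas[m], gammas[n]),
                    2 * metric[m, n] * identity)
        for m in range(4)
        for n in range(4)
    )


print(satisfies_clifford(gamma, eta))
```

The check loops over all sixteen index pairs, so it covers both the vanishing mixed terms and the squared terms.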

The energy-momentum relation now reads

$$
p_\mu p^\mu - (mc)^2 = (\gamma^\rho p_\rho + mc)(\gamma^\sigma p_\sigma - mc) = 0.
$$ (eq:dirac-diff-sq)
By convention, the second factor in the middle expression of 
Eq. {eq}`eq:dirac-diff-sq` is set to zero to give the Dirac equation,

$$
(\gamma^\mu p_\mu - mc) = 0.
$$
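The factorization in Eq. {eq}`eq:dirac-diff-sq` holds as a matrix identity for any momentum, not only on the mass shell. Here is a small numerical sketch of that claim, assuming the Dirac-basis matrices above; the function names `dirac_gammas` and `slash` and the sample values of the momentum and of $mc$ are arbitrary illustrative choices.

```python
import numpy as np


def dirac_gammas():
    """Return [gamma^0, gamma^1, gamma^2, gamma^3] in the Dirac basis."""
    sigma = [
        np.array([[0, 1], [1, 0]], dtype=complex),
        np.array([[0, -1j], [1j, 0]], dtype=complex),
        np.array([[1, 0], [0, -1]], dtype=complex),
    ]
    I2 = np.eye(2, dtype=complex)
    Z2 = np.zeros((2, 2), dtype=complex)
    gammas = [np.block([[I2, Z2], [Z2, -I2]])]
    gammas += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
    return gammas


def slash(p_lower, gammas):
    """Contract gamma^mu with the covariant components p_mu."""
    return sum(g * p for g, p in zip(gammas, p_lower))


eta = np.diag([1.0, -1.0, -1.0, -1.0])
p_upper = np.array([2.0, 0.3, -1.1, 0.7])  # arbitrary contravariant p^mu (off shell)
p_lower = eta @ p_upper                    # p_mu = eta_{mu nu} p^nu
mc = 1.5                                   # arbitrary value of m*c

gammas = dirac_gammas()
I4 = np.eye(4)

lhs = (p_lower @ p_upper - mc**2) * I4
rhs = (slash(p_lower, gammas) + mc * I4) @ (slash(p_lower, gammas) - mc * I4)

print(np.allclose(lhs, rhs))
```

Both sides are multiples of the $4\times 4$ identity, which is the sense in which the scalar relation on the left is read as a matrix equation.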
Now we apply the quantum prescription, which in four-vector notation is

$$
p_\mu \to i\hbar \partial_\mu
$$
where we define the four-gradient

$$
\partial_\mu \equiv \dfrac{\partial}{\partial r^\mu} = (\partial_0, \boldsymbol\nabla)
$$
or, to be explicit about the components with $r^\mu = (ct, x, y, z)$,

$$
\partial_0 = \dfrac{1}{c}\dfrac{\partial}{\partial t},\qquad
\partial_1 = \dfrac{\partial}{\partial x},\qquad
\partial_2 = \dfrac{\partial}{\partial y},\qquad
\partial_3 = \dfrac{\partial}{\partial z}.
$$
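As a check that this four-vector prescription agrees with the one used for the Klein-Gordon equation, we can write out the components. This sketch assumes the standard identification $p^\mu = (E/c, \mathbf{p})$, so that lowering the index with the metric flips the sign of the spatial part:

$$
p_\mu = \eta_{\mu\nu}p^\nu = \left(\dfrac{E}{c}, -\mathbf{p}\right)
\quad\Longrightarrow\quad
\dfrac{E}{c} \to i\hbar\,\partial_0 = \dfrac{i\hbar}{c}\dfrac{\partial}{\partial t},
\qquad
-\mathbf{p} \to i\hbar\boldsymbol{\nabla},
$$

which is the same as $E \to i\hbar\,\partial/\partial t$ and $\mathbf{p} \to -i\hbar\boldsymbol{\nabla}$.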
Applying the resulting differential operator to the wave function $\psi$, we get
the Dirac equation,

$$
i\hbar\gamma^\mu \partial_\mu \psi - mc\psi = 0.
$$
Because the $\gamma^\mu$ are $4\times 4$ matrices, $\psi$ must
be a four-component column matrix,

$$
\psi =
\begin{pmatrix}
\psi_1\\ \psi_2\\ \psi_3\\ \psi_4
\end{pmatrix}.
$$
This object is known as a *Dirac spinor* or a *bispinor*. Despite having four
components, it is *not* a four-vector; the reason will be covered in the next
section.
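
To illustrate that the equation really does act on a four-component column, here is a hedged numerical sketch. It assumes a plane-wave form $\psi = u\,e^{-ip_\mu r^\mu/\hbar}$ (not introduced above), for which $i\hbar\partial_\mu$ simply returns $p_\mu$ and the Dirac equation reduces to the matrix condition $(\gamma^\mu p_\mu - mc)\,u = 0$. The momentum values, the choice $mc = 1$, and the use of an SVD to find a null vector are all illustrative choices.

```python
import numpy as np

# Dirac-basis gamma matrices built from 2x2 blocks
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
gammas = [np.block([[I2, Z2], [Z2, -I2]])]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

mc = 1.0                                  # arbitrary units with m*c = 1
p_vec = np.array([0.3, -0.4, 1.2])        # arbitrary spatial momentum
p0 = np.sqrt(p_vec @ p_vec + mc**2)       # E/c fixed by p_mu p^mu = (mc)^2
p_lower = np.concatenate(([p0], -p_vec))  # covariant components (E/c, -p)

# Matrix form of the Dirac operator for this momentum
D = sum(g * p for g, p in zip(gammas, p_lower)) - mc * np.eye(4)

# Singular values near zero signal nonzero solutions u with D u = 0
_, s, vh = np.linalg.svd(D)
u = vh[-1].conj()                         # right-singular vector of the smallest singular value

print(np.count_nonzero(s < 1e-10))        # number of independent solutions found
print(np.allclose(D @ u, 0))
```

The column `u` has four complex entries, matching the four components $\psi_1, \dots, \psi_4$ above.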