# Chapter 3

## 1. Simulating Hamiltonian dynamics

Another major potential application of quantum computers is the simulation of quantum dynamics. Indeed, this was the idea that first led Feynman to propose the concept of a quantum computer. In this lecture we will see how a universal quantum computer can efficiently simulate several natural families of Hamiltonians. These simulation methods could be used either to simulate actual physical systems or to implement quantum algorithms defined in terms of Hamiltonian dynamics, such as continuous-time quantum walks and adiabatic quantum algorithms.

### 1.1. Hamiltonian dynamics

In quantum mechanics, time evolution of the wave function $|\psi(t)\rangle$ is governed by the Schrödinger equation,

$$
i \hbar \frac{\mathrm{d}}{\mathrm{d} t}|\psi(t)\rangle=H(t)|\psi(t)\rangle
$$

Here $H(t)$ is the *Hamiltonian*, an operator with units of energy, and $\hbar$ is the reduced Planck constant. For convenience it is typical to choose units in which $\hbar=1$. Given an initial wave function $|\psi(0)\rangle$, we can solve this differential equation to determine $|\psi(t)\rangle$ at any later (or earlier) time $t$.

For $H$ independent of time, the solution of the Schrödinger equation is $|\psi(t)\rangle=e^{-i H t}|\psi(0)\rangle$. For simplicity we will only consider this case. There are many situations in which time-dependent Hamiltonians arise, not only in physical systems but also in computational applications such as adiabatic quantum computing. In such cases, the evolution cannot in general be written in such a simple form, but nevertheless similar ideas can be used to simulate the dynamics.
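As a quick numerical sanity check of the time-independent solution, the following sketch builds $e^{-i H t}$ from the eigendecomposition of $H$ and compares a finite-difference time derivative against the right-hand side of the Schrödinger equation (with $\hbar=1$). The helper name `evolve` and the particular $2 \times 2$ Hamiltonian are illustrative assumptions, not part of the notes.

```python
import numpy as np

def evolve(H, t):
    """Return e^{-iHt} for a Hermitian matrix H (hbar = 1) via eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    # Column j of vecs is scaled by e^{-i E_j t}, then we rotate back.
    return (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T

# Illustrative single-qubit Hamiltonian (an assumption, chosen arbitrarily).
H = np.array([[1.0, 0.5], [0.5, -1.0]])
psi0 = np.array([1.0, 0.0], dtype=complex)

t, dt = 0.7, 1e-5
psi_t = evolve(H, t) @ psi0

# Central finite difference for d|psi>/dt, compared with -i H |psi(t)>.
dpsi_dt = (evolve(H, t + dt) @ psi0 - evolve(H, t - dt) @ psi0) / (2 * dt)
residual = np.linalg.norm(1j * dpsi_dt - H @ psi_t)

# Unitarity of e^{-iHt} means the norm of the state is preserved.
norm_t = np.linalg.norm(psi_t)
```

The residual should be limited only by the finite-difference step, and `norm_t` should equal 1 up to rounding.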

### 1.2. Efficient simulation

We will say that a Hamiltonian $H$ acting on $n$ qubits can be *efficiently simulated* if for any $t>0, \epsilon>0$ there is a quantum circuit $U$ consisting of $\operatorname{poly}(n, t, 1 / \epsilon)$ gates such that $\left\|U-e^{-i H t}\right\|<\epsilon$. Clearly, the problem of simulating Hamiltonians in general is BQP-hard, since we can implement any quantum computation by a sequence of Hamiltonian evolutions. In fact, even with natural restrictions on the kind of Hamiltonians we consider, it is easy to specify Hamiltonian simulation problems that are BQP-complete (or more precisely, PromiseBQP-complete).

You might ask why we define the notion of efficient simulation to be polynomial in $t$; if $t$ is given as part of the input, this means that the running time is, strictly speaking, not polynomial in the input size. However, one can show that a running time polynomial in $\log t$ is impossible; running time $\Omega(t)$ is required in general (intuitively, one cannot "fast forward" the evolution according to a generic Hamiltonian) [1](https://arxiv.org/abs/quant-ph/0508139). The dependence on $\epsilon$ is more subtle. In fact, it is possible to achieve running time logarithmic in $1 / \epsilon$, as we discuss further later.

We would like to understand the conditions under which a Hamiltonian can be efficiently simulated. Of course, we cannot hope to efficiently simulate arbitrary Hamiltonians, just as we cannot hope to efficiently implement arbitrary unitaries. Instead, we will simply describe a few classes of Hamiltonians that can be efficiently simulated. Our strategy will be to start from simple Hamiltonians that are easy to simulate and define ways of combining the known simulations to give more complicated ones.

There are a few cases where a Hamiltonian can obviously be simulated efficiently. For example, this is the case if $H$ acts nontrivially on only a constant number of qubits, simply because any unitary evolution on a constant number of qubits can be approximated with error at most $\epsilon$ using $\operatorname{poly}\left(\log \frac{1}{\epsilon}\right)$ one- and two-qubit gates, by the Solovay-Kitaev theorem.

Note that since we require a simulation for an arbitrary time $t$ (with poly $(t)$ gates), we can rescale the evolution by any polynomial factor: if $H$ can be efficiently simulated, then so can $c H$ for any $c=\operatorname{poly}(n)$. This holds even if $c<0$, since any efficient simulation is expressed in terms of quantum gates, and can simply be run in reverse.

In addition, we can rotate the basis in which a Hamiltonian is applied using any unitary transformation with an efficient decomposition into basic gates. In other words, if $H$ can be efficiently simulated and the unitary transformation $U$ can be efficiently implemented, then $U H U^{\dagger}$ can be efficiently simulated. This follows from the simple identity

$$
e^{-i U H U^{\dagger} t}=U e^{-i H t} U^{\dagger}
$$
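The identity can be checked numerically on a small example. The sketch below takes $H=Z$ and $U$ equal to the Hadamard gate, so that $U H U^{\dagger}=X$; these choices, and the helper `evolve`, are assumptions made only for illustration.

```python
import numpy as np

def evolve(H, t):
    """Return e^{-iHt} for a Hermitian matrix H via eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T

Z = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard gate

t = 0.9
lhs = evolve(U @ Z @ U.conj().T, t)   # e^{-i (U H U^dagger) t}
rhs = U @ evolve(Z, t) @ U.conj().T   # U e^{-i H t} U^dagger
gap = np.linalg.norm(lhs - rhs, 2)    # spectral-norm difference
```

Since the Hadamard gate maps $Z$ to $X$, both sides should also agree with `evolve(X, t)`.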

Another simple but useful trick for simulating Hamiltonians is the following. Suppose $H$ is diagonal in the computational basis, and any diagonal element $d(a)=\langle a| H|a\rangle$ can be computed efficiently. Then $H$ can be simulated efficiently using the following sequence of operations, for any input computational basis state $|a\rangle$ :

$$
\begin{aligned}
|a, 0\rangle & \mapsto|a, d(a)\rangle \\
& \mapsto e^{-i t d(a)}|a, d(a)\rangle \\
& \mapsto e^{-i t d(a)}|a, 0\rangle \\
& =e^{-i H t}|a\rangle|0\rangle
\end{aligned}
$$

By linearity, this process simulates $H$ for time $t$ on an arbitrary input.
Note that if we combine this simulation with the previous one, we have a way to simulate any Hamiltonian that can be efficiently diagonalized, and whose eigenvalues can be efficiently computed.
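The compute, phase, uncompute pattern can be mimicked classically on a state vector. In this sketch the diagonal entry $d(a)$ is taken to be the Hamming weight of $a$, a hypothetical choice made only so that the result can be cross-checked against a tensor product of single-qubit phase gates; the function names are likewise assumptions.

```python
import numpy as np

def d(a):
    """Hypothetical efficiently computable diagonal entry <a|H|a>."""
    return bin(a).count("1")  # Hamming weight of a, chosen for illustration

def simulate_diagonal(state, t):
    """Apply e^{-iHt} for H = diag(d(0), d(1), ...) to a state vector.

    Mirrors the steps in the text: compute d(a) into an ancilla, apply the
    phase e^{-i t d(a)}, then uncompute. In this classical emulation the
    ancilla is just the local variable `value`.
    """
    out = np.array(state, dtype=complex)
    for a in range(len(out)):
        value = d(a)                        # |a, 0> -> |a, d(a)>
        out[a] *= np.exp(-1j * t * value)   # multiply by e^{-i t d(a)}
        # Uncomputing resets the ancilla to |0>; nothing to do here.
    return out

n, t = 3, 0.4
state = np.ones(2**n, dtype=complex) / np.sqrt(2**n)  # uniform superposition
result = simulate_diagonal(state, t)

# Cross-check: with d = Hamming weight, e^{-iHt} is a tensor product of
# single-qubit gates diag(1, e^{-it}).
single = np.diag([1.0, np.exp(-1j * t)])
dense = np.kron(np.kron(single, single), single)
agree = np.allclose(result, dense @ state)
```

The loop visits every basis state, which is exponential in $n$; on a quantum computer the same three steps act on all basis states in superposition.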

### 1.3. Product formulas

Many natural Hamiltonians have the form of a sum of terms, each of which can be simulated by the techniques described above. For example, consider the Hamiltonian of a particle in a potential:

$$
H=\frac{p^{2}}{2 m}+V(x)
$$

To simulate this on a digital quantum computer, we can imagine discretizing the $x$ coordinate. The operator $V(x)$ is diagonal, and natural discretizations of $p^{2}=-\mathrm{d}^{2} / \mathrm{d} x^{2}$ are diagonal in the discrete Fourier basis. Thus we can efficiently simulate both $V(x)$ and $p^{2} / 2 m$. Similarly, consider the Hamiltonian of a spin system, say of the form

$$
H=\sum_{i} h_{i} X_{i}+\sum_{i j} J_{i j} Z_{i} Z_{j}
$$

(or more generally, any $k$-local Hamiltonian, a sum of terms that each act on at most $k$ qubits). This consists of a sum of terms, each of which acts on only a constant number of qubits and hence is easy to simulate.

In general, if $H_{1}$ and $H_{2}$ can be efficiently simulated, then $H_{1}+H_{2}$ can also be efficiently simulated. If the two Hamiltonians commute, then this is trivial, since $e^{-i H_{1} t} e^{-i H_{2} t}=e^{-i\left(H_{1}+H_{2}\right) t}$. However, in the general case where the two Hamiltonians do not commute, we can still simulate their sum as a consequence of the Lie product formula

$$
e^{-i\left(H_{1}+H_{2}\right) t}=\lim _{m \rightarrow \infty}\left(e^{-i H_{1} t / m} e^{-i H_{2} t / m}\right)^{m}
$$

A simulation using a finite number of steps can be achieved by truncating this expression to a finite number of terms, which introduces some amount of error that must be kept small. In particular, if we want to have

$$
\left\|\left(e^{-i H_{1} t / m} e^{-i H_{2} t / m}\right)^{m}-e^{-i\left(H_{1}+H_{2}\right) t}\right\| \leq \epsilon
$$

it suffices to take $m=O\left((\nu t)^{2} / \epsilon\right)$, where $\nu:=\max \left\{\left\|H_{1}\right\|,\left\|H_{2}\right\|\right\}$. (The requirement that $H_{1}$ and $H_{2}$ be efficiently simulable means that $\nu$ can be at most $\operatorname{poly}(n)$.)
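The scaling of $m$ comes from the fact that a single step $e^{-i H_{1} t / m} e^{-i H_{2} t / m}$ differs from $e^{-i\left(H_{1}+H_{2}\right) t / m}$ by $O\left((\nu t / m)^{2}\right)$, and these errors add at most linearly over $m$ steps. The following sketch measures the error numerically for a two-qubit spin Hamiltonian of the kind discussed above; the coefficients and helper names are illustrative assumptions.

```python
import numpy as np

def evolve(H, t):
    """Return e^{-iHt} for a Hermitian matrix H via eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Two-qubit example: H1 collects the X terms, H2 the ZZ coupling.
# The coefficients 0.8 and 1.1 are arbitrary illustrative values.
H1 = 0.8 * (np.kron(X, I2) + np.kron(I2, X))
H2 = 1.1 * np.kron(Z, Z)

def trotter_error(t, m):
    """Spectral-norm error of the first-order product formula with m steps."""
    step = evolve(H1, t / m) @ evolve(H2, t / m)
    approx = np.linalg.matrix_power(step, m)
    return np.linalg.norm(approx - evolve(H1 + H2, t), 2)

t = 1.0
errors = {m: trotter_error(t, m) for m in (10, 20, 40, 80)}
for m, err in errors.items():
    print(f"m = {m:3d}   error = {err:.3e}")
```

Doubling $m$ should roughly halve the error, consistent with an error of order $(\nu t)^{2} / m$.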

It is somewhat unappealing that to simulate an evolution for time $t$, we need a number of steps proportional to $t^{2}$. Fortunately, the situation can be improved if we use higher-order approximations. For example, one can show that

$$
\left\|\left(e^{-i H_{1} t / 2 m} e^{-i H_{2} t / m} e^{-i H_{1} t / 2 m}\right)^{m}-e^{-i\left(H_{1}+H_{2}\right) t}\right\| \leq \epsilon
$$

with a smaller value of $m$. In fact, by using even higher-order approximations, it is possible to show that $H_{1}+H_{2}$ can be simulated for time $t$ with only $O\left(t^{1+\delta}\right)$ steps, for any fixed $\delta>0$, no matter how small [1](https://arxiv.org/abs/quant-ph/0508139).
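To see the improvement concretely, the sketch below compares the first-order formula with the symmetric second-order formula at the same number of steps. The two-qubit Hamiltonian, its coefficients, and the function names are assumptions made for illustration.

```python
import numpy as np

def evolve(H, t):
    """Return e^{-iHt} for a Hermitian matrix H via eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H1 = 0.8 * (np.kron(X, I2) + np.kron(I2, X))  # illustrative coefficients
H2 = 1.1 * np.kron(Z, Z)

def first_order_error(t, m):
    step = evolve(H1, t / m) @ evolve(H2, t / m)
    return np.linalg.norm(
        np.linalg.matrix_power(step, m) - evolve(H1 + H2, t), 2)

def second_order_error(t, m):
    # Symmetric splitting: half step of H1, full step of H2, half step of H1.
    half = evolve(H1, t / (2 * m))
    step = half @ evolve(H2, t / m) @ half
    return np.linalg.norm(
        np.linalg.matrix_power(step, m) - evolve(H1 + H2, t), 2)

t = 1.0
first = {m: first_order_error(t, m) for m in (20, 40)}
second = {m: second_order_error(t, m) for m in (20, 40)}
```

For this example the second-order error should fall by about a factor of four when $m$ doubles, compared with a factor of two for the first-order formula.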

A Hamiltonian that is a sum of polynomially many terms can be efficiently simulated by composing the simulation of two terms, or by directly using an approximation to the identity

$$
e^{-i\left(H_{1}+\cdots+H_{k}\right) t}=\lim _{m \rightarrow \infty}\left(e^{-i H_{1} t / m} \cdots e^{-i H_{k} t / m}\right)^{m}
$$

Another way of combining Hamiltonians comes from commutation: if $H_{1}$ and $H_{2}$ can be efficiently simulated, then $i\left[H_{1}, H_{2}\right]$ can be efficiently simulated. This is a consequence of the identity

$$
e^{\left[H_{1}, H_{2}\right] t}=\lim _{m \rightarrow \infty}\left(e^{-i H_{2} \sqrt{t / m}} e^{-i H_{1} \sqrt{t / m}} e^{i H_{2} \sqrt{t / m}} e^{i H_{1} \sqrt{t / m}}\right)^{m}
$$

which can again be approximated with a finite number of terms. However, I don't know of any algorithmic application of such a simulation.
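A small numerical experiment illustrates the commutator simulation. The sketch uses the ordering $e^{-i H_{2} s} e^{-i H_{1} s} e^{i H_{2} s} e^{i H_{1} s}$ with $s=\sqrt{t / m}$, for which the second-order term of each step is $+\left[H_{1}, H_{2}\right] t / m$, and compares against evolution under the Hermitian operator $i\left[H_{1}, H_{2}\right]$. The choice $H_{1}=X$, $H_{2}=Y$ and the helper names are assumptions for illustration.

```python
import numpy as np

def evolve(H, t):
    """Return e^{-iHt} for a Hermitian matrix H via eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
H1, H2 = X, Y

# K = i[H1, H2] is Hermitian, and e^{-iKt} = e^{[H1, H2] t}.
K = 1j * (H1 @ H2 - H2 @ H1)

def commutator_error(t, m):
    s = np.sqrt(t / m)
    # evolve(H, -s) equals e^{+iHs}, i.e. the simulation run in reverse.
    step = evolve(H2, s) @ evolve(H1, s) @ evolve(H2, -s) @ evolve(H1, -s)
    approx = np.linalg.matrix_power(step, m)
    return np.linalg.norm(approx - evolve(K, t), 2)

t = 0.5
errs = {m: commutator_error(t, m) for m in (100, 400, 1600)}
```

Each step carries an error of order $(t / m)^{3 / 2}$, so the total error decays only like $1 / \sqrt{m}$, noticeably slower than for the sum of two Hamiltonians.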

### 1.4. Sparse Hamiltonians

We will say that an $N \times N$ Hermitian matrix is sparse (in a fixed basis) if, in any fixed row, there are only $\operatorname{poly}(\log N)$ nonzero entries [2](https://arxiv.org/abs/quant-ph/0301023). The simulation techniques described above allow us to efficiently simulate sparse Hamiltonians. More precisely, suppose that for any $a$, we can efficiently determine all of the values $b$ for which $\langle a| H|b\rangle$ is nonzero, as well as the values of the corresponding matrix elements; then $H$ can be efficiently simulated. In particular, this gives an efficient implementation of the continuous-time quantum walk on any graph $G=(V, E)$ whose maximum degree is $\operatorname{poly}(\log |V|)$.

The basic idea of the simulation is to edge-color the graph, simulate the edges of each color separately, and combine these pieces using the Lie product formula. The main new technical ingredient in the simulation is a means of coloring the edges of the graph of nonzero matrix elements of $H$. A classic result in graph theory (Vizing's Theorem) says that a graph of maximum degree $d$ has an edge coloring with at most $d+1$ colors (in fact, the edge chromatic number is either $d$ or $d+1$). If we are willing to accept a polynomial overhead in the number of colors used, then we can actually find an edge coloring using only local information about the graph. Here we describe a simple $d^{2}$-coloring for the case of a bipartite graph, which is sufficient for general Hamiltonian simulation using a simple reduction [3](https://arxiv.org/abs/1312.1414).

**Lemma**: Suppose we are given an undirected, bipartite graph $G$ with $N$ vertices and maximum degree $d$, and that we can efficiently compute the neighbors of any given vertex. Then there is an efficiently computable edge coloring of $G$ with at most $d^{2}$ colors.

**Proof**: Number the vertices of $G$ from 1 through $N$. For any vertex $\alpha$, let $\operatorname{idx}(\alpha, \beta)$ denote the index of vertex $\beta$ in the list of neighbors of $\alpha$. Define the color of the edge $\alpha \beta$, where $\alpha$ is from the left part of the bipartition and $\beta$ is from the right, to be the ordered pair $(\operatorname{idx}(\alpha, \beta), \operatorname{idx}(\beta, \alpha))$. Clearly there are at most $d^{2}$ such colors. This is a valid coloring since if $(\alpha, \beta)$ and $(\alpha, \delta)$ have the same color, then $\operatorname{idx}(\alpha, \beta)=\operatorname{idx}(\alpha, \delta)$, so $\beta=\delta$. Similarly, if $(\alpha, \beta)$ and $(\gamma, \beta)$ have the same color, then $\operatorname{idx}(\beta, \alpha)=\operatorname{idx}(\beta, \gamma)$, so $\alpha=\gamma$.
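The coloring in the proof uses only the neighbor lists of the two endpoints, which the following sketch makes explicit. The adjacency-list representation, the function names, and the small example graph are assumptions for illustration; indices are 0-based here rather than 1-based.

```python
def edge_color(neighbors, left):
    """Color each edge (alpha, beta), with alpha in the left part, by the
    pair (idx(alpha, beta), idx(beta, alpha)) from the proof.

    `neighbors[v]` is the ordered list of neighbors of vertex v.
    """
    colors = {}
    for alpha in left:
        for i, beta in enumerate(neighbors[alpha]):
            j = neighbors[beta].index(alpha)  # position of alpha in beta's list
            colors[(alpha, beta)] = (i, j)
    return colors

def is_proper(colors):
    """Check that no two edges sharing a vertex receive the same color."""
    seen = set()
    for (alpha, beta), color in colors.items():
        for vertex in (alpha, beta):
            if (vertex, color) in seen:
                return False
            seen.add((vertex, color))
    return True

# Hypothetical bipartite graph: left part {0, 1, 2}, right part {3, 4, 5},
# maximum degree d = 2.
neighbors = {
    0: [3, 4], 1: [3, 5], 2: [4, 5],
    3: [0, 1], 4: [0, 2], 5: [1, 2],
}
colors = edge_color(neighbors, left=[0, 1, 2])
num_colors = len(set(colors.values()))
```

Computing the color of one edge only requires looking up two neighbor lists, which is what makes the coloring usable when the graph is exponentially large.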

Given this lemma, the simulation proceeds as follows. First, to ensure that the graph of $H$ is bipartite, we actually simulate evolution according to the Hamiltonian $\sigma_{x} \otimes H$, which is bipartite and has the same sparsity as $H$. Since $e^{-i\left(\sigma_{x} \otimes H\right) t}|+\rangle|\psi\rangle=|+\rangle e^{-i H t}|\psi\rangle$, we can recover a simulation of $H$ from a simulation of $\sigma_{x} \otimes H$.
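The reduction relies on $\sigma_{x}|+\rangle=|+\rangle$, and it can be verified numerically for a small example. In the sketch below, the $2 \times 2$ Hermitian matrix `H`, the state `psi`, and the helper `evolve` are arbitrary illustrative assumptions.

```python
import numpy as np

def evolve(H, t):
    """Return e^{-iHt} for a Hermitian matrix H via eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[0.3, 1.0 - 0.5j], [1.0 + 0.5j, -0.7]])  # illustrative choice
psi = np.array([0.6, 0.8j])                            # normalized state
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

t = 1.3
big = np.kron(sigma_x, H)                      # sigma_x (x) H
lhs = evolve(big, t) @ np.kron(plus, psi)
rhs = np.kron(plus, evolve(H, t) @ psi)
gap = np.linalg.norm(lhs - rhs)

# Bipartiteness: sigma_x (x) H has no entries within either half of the basis.
no_internal_edges = np.allclose(big[:2, :2], 0) and np.allclose(big[2:, 2:], 0)
```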

Now write $H$ as a diagonal matrix plus a matrix with zeros on the diagonal. We have already shown how to simulate the diagonal part, so we can assume $H$ has zeros on the diagonal without loss of generality.

It suffices to simulate the term corresponding to the edges of a particular color $c$. We show how to make the simulation work for any particular vertex $x$; then it works in general by linearity. By computing the complete list of neighbors of $x$ and computing each of their colors, we can reversibly compute $v_{c}(x)$, the vertex adjacent to $x$ via an edge with color $c$, along with the associated matrix element:

$$
|x\rangle \mapsto\left|x, v_{c}(x), H_{x, v_{c}(x)}\right\rangle
$$

Then we can simulate the $H$-independent Hamiltonian defined by the map

$$
|x, y, h\rangle \mapsto h\left|y, x, h^{*}\right\rangle
$$

since it is easily diagonalized, as it consists of a direct sum of two-dimensional blocks. Finally, we can uncompute the second and third registers. Before the uncomputation, the simulation produces a linear combination of the states $\left|x, v_{c}(x), H_{x, v_{c}(x)}\right\rangle$ and $\left|v_{c}(x), x, H_{x, v_{c}(x)}^{*}\right\rangle$. Since

$$
\left|v_{c}(x), x, H_{x, v_{c}(x)}^{*}\right\rangle=\left|v_{c}(x), v_{c}\left(v_{c}(x)\right), H_{v_{c}(x), x}\right\rangle
$$

the uncomputation works identically for both components.
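Because the edges of a single color form a matching, the corresponding Hamiltonian is a direct sum of $2 \times 2$ blocks, each of which can be exponentiated in closed form. The sketch below emulates this classically on a state vector; the dictionary representation of the matching, the function name, and the example values are assumptions for illustration.

```python
import numpy as np

def evolve_one_color(state, matching, t):
    """Apply e^{-i H_c t}, where H_c contains only the edges of one color.

    `matching` maps a pair (x, y) of distinct vertices to h = <x|H|y> != 0.
    The pairs are disjoint because edges of one color share no vertex.
    Each pair gives a block [[0, h], [h*, 0]] with eigenvalues +|h|, -|h|, so
    exp(-i t block) = cos(|h| t) I - i sin(|h| t) block / |h|.
    """
    state = np.asarray(state, dtype=complex)
    out = state.copy()
    for (x, y), h in matching.items():
        r = abs(h)
        c, s = np.cos(r * t), np.sin(r * t)
        out[x] = c * state[x] - 1j * s * (h / r) * state[y]
        out[y] = c * state[y] - 1j * s * (np.conj(h) / r) * state[x]
    return out

# Illustrative example on 4 vertices with two disjoint edges of one color.
matching = {(0, 2): 0.5 + 0.2j, (1, 3): -1.0}
state = np.array([0.5, 0.5j, -0.5, 0.5])
t = 0.8
result = evolve_one_color(state, matching, t)
```

Vertices that are not matched by the chosen color are left unchanged, since the Hamiltonian for that color acts as zero on them.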

### 1.5. Measuring an operator

We can view a Hermitian operator not just as the generator of dynamics, but also as a quantity to be measured. In a practical quantum simulation, the desired final measurement might be of this type. For example, we might want to measure the energy of the system, and the Hamiltonian could be a complicated sum of noncommuting terms.

It turns out that any Hermitian operator that can be efficiently simulated (viewing it as the Hamiltonian of a quantum system) can also be efficiently measured using a formulation of the quantum measurement process given by von Neumann. In fact, von Neumann's procedure is essentially the same as quantum phase estimation.

In von Neumann's description of the measurement process, a measurement is performed by coupling the system of interest to an ancillary system, which we call the pointer. Suppose that the pointer is a one-dimensional free particle and that the system-pointer interaction Hamiltonian is $H \otimes p$, where $p$ is the momentum of the particle. Furthermore, suppose that the mass of the particle is sufficiently large that we can neglect the kinetic term. Then the resulting evolution is

$$
e^{-i t H \otimes p}=\sum_{a}\left[\left|E_{a}\right\rangle\left\langle E_{a}\right| \otimes e^{-i t E_{a} p}\right]
$$

where $\left|E_{a}\right\rangle$ are the eigenstates of $H$ with eigenvalues $E_{a}$. Suppose we prepare the pointer in the state $|x=0\rangle$, a narrow wave packet centered at $x=0$. Since the momentum operator generates translations in position, the above evolution performs the transformation

$$
\left|E_{a}\right\rangle \otimes|x=0\rangle \mapsto\left|E_{a}\right\rangle \otimes\left|x=t E_{a}\right\rangle .
$$

If we can measure the position of the pointer with sufficiently high precision that all relevant spacings $x_{a b}=t\left|E_{a}-E_{b}\right|$ can be resolved, then measurement of the position of the pointer (a fixed, easy-to-measure observable, independent of $H$) effects a measurement of $H$.

Von Neumann's measurement protocol makes use of a continuous variable, the position of the pointer. To turn it into an algorithm that can be implemented on a digital quantum computer, we can approximate the evolution using $r$ quantum bits to represent the pointer. The full Hilbert space is thus a tensor product of a $2^{n}$-dimensional space for the system and a $2^{r}$-dimensional space for the pointer. We let the computational basis of the pointer, with basis states $\{|z\rangle\}$, represent the basis of momentum eigenstates. The label $z$ is an integer between 0 and $2^{r}-1$, and the $r$ bits of the binary representation of $z$ specify the states of the $r$ qubits. In this basis, $p$ acts as

$$
p|z\rangle=\frac{z}{2^{r}}|z\rangle .
$$

In other words, the evolution $e^{-i t H \otimes p}$ can be viewed as the evolution $e^{-i t H}$ on the system for a time controlled by the value of the pointer.

Expanded in the momentum eigenbasis, the initial state of the pointer is

$$
|x=0\rangle=\frac{1}{2^{r / 2}} \sum_{z=0}^{2^{r}-1}|z\rangle
$$

The measurement is performed by evolving under $H \otimes p$ for some appropriately chosen time $t$. After this evolution, the position of the simulated pointer can be measured by measuring the qubits that represent it in the $x$ basis, i.e., the Fourier transform of the computational basis.
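The discretized pointer can be emulated classically when the system is in an eigenstate of $H$. The sketch below assumes the position basis is $|x\rangle=2^{-r / 2} \sum_{z} e^{-2 \pi i x z / 2^{r}}|z\rangle$; this sign convention, the function name, and the parameter values are assumptions, not fixed by the text. Under this convention the pointer distribution peaks near $x=t E /(2 \pi)$ modulo $2^{r}$.

```python
import numpy as np

def pointer_distribution(energy, t, r):
    """Position distribution of an r-qubit pointer after evolving under H (x) p,
    with the system in an eigenstate of H with eigenvalue `energy`.
    """
    N = 2**r
    z = np.arange(N)
    # |x=0> is uniform over momentum states |z>; since p|z> = (z/N)|z>,
    # the evolution multiplies |z> by e^{-i t E z / N}.
    pointer = np.exp(-1j * t * energy * z / N) / np.sqrt(N)
    # Amplitude <x|pointer> = N^{-1/2} sum_z e^{+2 pi i x z / N} pointer[z],
    # which equals sqrt(N) times NumPy's inverse FFT.
    amplitudes = np.sqrt(N) * np.fft.ifft(pointer)
    return np.abs(amplitudes) ** 2

r, t = 4, 2 * np.pi
exact = pointer_distribution(3.0, t, r)      # t E / (2 pi) is an integer
between = pointer_distribution(5.5, t, r)    # falls between two grid points
peak_exact = int(np.argmax(exact))
```

When $t E /(2 \pi)$ is an integer the outcome is deterministic; otherwise the probability concentrates on the neighboring grid points, as in phase estimation.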

Note that this discretized von Neumann measurement procedure is equivalent to phase estimation. Recall that in the phase estimation problem, we are given an eigenvector $|\psi\rangle$ of a unitary operator $U$ and asked to determine its eigenvalue $e^{i \phi}$. The algorithm uses two registers, one that initially stores $|\psi\rangle$ and one that will store an approximation of the phase $\phi$. The first and last steps of the algorithm are Fourier transforms on the phase register. The intervening step is to perform the transformation

$$
|\psi\rangle \otimes|z\rangle \mapsto U^{z}|\psi\rangle \otimes|z\rangle
$$

where $|z\rangle$ is a computational basis state. If we take $|z\rangle$ to be a momentum eigenstate with eigenvalue $z$ (i.e., if we choose a different normalization) and let $U=e^{-i H t}$, this is exactly the transformation induced by $e^{-i(H \otimes p) t}$. Thus we see that the phase estimation algorithm for a unitary operator $U$ is exactly von Neumann's prescription for measuring $i \ln U$.
