In [None]:
!pip install pennylane

# Introducing (Dynamical) Lie Algebras for quantum practitioners


## Introduction


Lie algebras, pronounced like the name "Lee", offer a fresh
perspective on some of the established ideas in quantum physics and
are becoming increasingly important in quantum computing. Let us recap some
of the key concepts of quantum mechanics and how they relate to and give
rise to Lie algebras.

Most physicists know quantum physics in terms of wavefunctions
$|\psi\rangle$ that live in a [Hilbert
space](https://en.wikipedia.org/wiki/Hilbert_space) $\mathcal{H},$ as
well as (bounded) linear operators $\hat{O}$ that live in the space of
linear operators on that Hilbert space, $\mathcal{L}(\mathcal{H}).$ For
finite dimensional systems (think, $n$ number of qubits) we have complex
valued state vectors (wavefunctions) in $\mathcal{H} = \mathbb{C}^{2^n}$
with norm 1 and square matrices (linear operators) in
$\mathcal{L}(\mathcal{H}) = \mathbb{C}^{2^n \times 2^n}.$

Two very important sub-classes of linear operators in quantum mechanics
are unitary and Hermitian operators. Hermitian operators $H$ are
self-adjoint, $H^\dagger = H,$ and describe observables that can be
measured. Unitary operators are norm-preserving such that
$\langle \psi | U^\dagger U | \psi \rangle = \langle \psi | \psi \rangle,$
in particular we have $U^{-1} = U^\dagger.$ They describe how quantum
states are transformed and ensure that their norm is preserved.

A unitary operator can always be written as

$$U = e^{-i H},$$

where we say that $H$ is the generator of $U.$ Take for example a single
qubit rotation $U(\phi) = e^{-i \frac{\phi}{2} X}.$ $U(\phi)$ is the
unitary evolution that rotates a quantum state in Hilbert space around
the x-axis on the [Bloch
sphere](https://en.wikipedia.org/wiki/Bloch_sphere), and is generated by
the Pauli $X$ [matrix](https://en.wikipedia.org/wiki/Pauli_matrices).
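
As a quick sanity check of the rotation example, the following NumPy sketch builds $U(\phi) = e^{-i \frac{\phi}{2} X}$ from the eigendecomposition of $X$ and compares it with the closed form $\cos(\phi/2) I - i \sin(\phi/2) X.$ The helper name `exp_minus_i` is made up for this illustration and assumes a Hermitian argument.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)

def exp_minus_i(H):
    """Return exp(-i H) for a Hermitian matrix H via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals)) @ evecs.conj().T

phi = 0.7  # arbitrary rotation angle
U = exp_minus_i(phi / 2 * X)

# closed form of the X rotation
U_closed = np.cos(phi / 2) * np.eye(2) - 1j * np.sin(phi / 2) * X

is_unitary = np.allclose(U.conj().T @ U, np.eye(2))
matches_closed_form = np.allclose(U, U_closed)
print(is_unitary, matches_closed_form)
```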

The space of all such unitary operators forms the so-called special
unitary group $SU(N),$ where for qubit systems we have $N=2^n$ with $N$
the dimension of the Hilbert space and $n$ the number of qubits. In
quantum computing, we are typically dealing with the Hilbert space
$\mathcal{H} = \mathbb{C}^{2^n}$ and for full universality we require
the available gates to span all of $SU(2^n).$ That means when we have
all unitaries of $SU(2^n)$ available to us, we can reach any state in
Hilbert space from any other state.

The Lie group $SU(2^n)$ has an associated Lie algebra to it, called
$\mathfrak{su}(2^n)$ (more on that later). In some cases, it is more
convenient to work with the associated Lie algebra rather than the Lie
group.

So if you are familiar with quantum computing but knew nothing about Lie
algebras and Lie groups before this demo, the good news is that you
actually already know the elements of both. Roughly speaking, the
relevant Lie group in quantum computing is the space of unitaries, and
the relevant Lie algebra is the space of Hermitian matrices. Further,
they are related to each other: The Lie algebra (Hermitian matrices)
generates the Lie group (unitaries) via the exponential map. There are,
however, some subtleties if we want to be mathematically precise, as we
will explore more in depth now.

## Lie algebras


After some motivation and connections to concepts we are already
familiar with, let us formally introduce Lie algebras. An
[algebra](https://en.wikipedia.org/wiki/Algebra_over_a_field) is a
vector space equipped with a bilinear operation. A [Lie
algebra](https://en.wikipedia.org/wiki/Lie_algebra) $\mathfrak{g}$ is a
special case where the bilinear operation behaves like a commutator. In
particular, the bilinear operation
$[\bullet, \bullet]: \mathfrak{g} \times \mathfrak{g} \rightarrow \mathfrak{g}$
needs to satisfy

-   $[x, x] = 0 \ \forall x \in \mathfrak{g}$ (alternativity)
-   $[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 \ \forall x,y,z \in \mathfrak{g}$
    (Jacobi identity)
-   $[x, y] = - [y, x] \ \forall x,y \in \mathfrak{g}$
    (anti-commutativity)

The last one, anti-commutativity, technically is not an axiom but
follows from bilinearity and alternativity, but is so crucial that it is
worth highlighting. These properties generally define the so-called Lie
bracket, where the commutator is just one special case thereof. A
different example would be the
[cross-product](https://en.wikipedia.org/wiki/Cross_product) between
vectors in $\mathbb{R}^3.$ Note also that we are talking about a
**vector** space in the mathematical sense, and the elements
("vectors") in $\mathfrak{g}$ are actually operators (matrices) in our
case of quantum physics.
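
To make the cross-product example concrete, here is a small NumPy sketch that checks the three listed properties for random vectors in $\mathbb{R}^3.$ It is only a numerical spot check on one random sample, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y, z = rng.normal(size=(3, 3))  # three random vectors in R^3

bracket = np.cross  # the cross product plays the role of the Lie bracket

alternating = np.allclose(bracket(x, x), 0)
anti_commutative = np.allclose(bracket(x, y), -bracket(y, x))
jacobi = np.allclose(
    bracket(x, bracket(y, z))
    + bracket(y, bracket(z, x))
    + bracket(z, bracket(x, y)),
    0,
)
print(alternating, anti_commutative, jacobi)
```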

One very relevant Lie algebra for us is the special unitary algebra
$\mathfrak{su}(N),$ the space of $N \times N$ skew-Hermitian matrices
with trace zero. The fact that we look at skew-Hermitian
($H^\dagger = - H$) instead of Hermitian ($H^\dagger = H$) matrices is a
technical detail (see note below). For all practical purposes you can
just think of Hermitian operators with an imaginary factor and note that
linear combinations are strictly over the reals. In fact, you may
sometimes find references to $\mathfrak{su}(N)$ being the Hermitian
matrices in physics literature (see
[Wikipedia](https://en.wikipedia.org/wiki/Special_unitary_group#Fundamental_representation)).


Note:

The result of a commutator between two Hermitian operators $H_1$ and
$H_2$ is always skew-Hermitian due to the commutator's
anti-commutativity, i.e.

$$[H_1, H_2]^\dagger = [H_2^\dagger, H_1^\dagger] = - [H_1, H_2].$$

This means that Hermitian operators are not closed under commutation,
and thus do not form a Lie algebra (because the commutator maps outside
the set of Hermitian matrices). But instead, skew-Hermitian operators
do. Note that the algebra of $N \times N$ skew-Hermitian matrices is
called the unitary algebra $\mathfrak{u}(N),$ whereas the additional
property of the trace being zero makes it the special
unitary algebra $\mathfrak{su}(N).$ They generate the unitary group
$U(N)$ and the special unitary group $SU(N)$ with determinant 1,
respectively.
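
The closure argument in this note can be spot-checked numerically. The sketch below draws two random Hermitian matrices (the helper `random_hermitian` is a hypothetical name used only here) and checks that their commutator is skew-Hermitian rather than Hermitian, while the commutator of the skew-Hermitian matrices $iH_1$ and $iH_2$ stays skew-Hermitian and traceless.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(dim):
    """Random Hermitian matrix, used only as test input."""
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (A + A.conj().T) / 2

def comm(A, B):
    return A @ B - B @ A

H1, H2 = random_hermitian(4), random_hermitian(4)

# commutator of two Hermitian matrices: skew-Hermitian, hence not Hermitian
C = comm(H1, H2)
hermitian_comm_is_skew = np.allclose(C.conj().T, -C)
hermitian_comm_is_hermitian = np.allclose(C.conj().T, C)

# commutator of two skew-Hermitian matrices: again skew-Hermitian and traceless
S = comm(1j * H1, 1j * H2)
skew_comm_is_skew = np.allclose(S.conj().T, -S)
skew_comm_is_traceless = np.isclose(np.trace(S), 0)

print(hermitian_comm_is_skew, hermitian_comm_is_hermitian)
print(skew_comm_is_skew, skew_comm_is_traceless)
```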

The Pauli matrices $\{iX, iY, iZ\}$ span the $\mathfrak{su}(2)$ algebra
that we can associate with single qubit dynamics. For multiple qubits we
have

$$\mathfrak{su}(2^n) = \text{span}_{\mathbb{R}}\left(\{iX_0, .., iY_0, .., iZ_0, .., iX_0 X_1, .. iY_0 Y_1, .., iZ_0 Z_1, ..\}\right),$$

where the span is over the reals $\mathbb{R}.$ In particular, we cannot
do a complex span over Paulis, since this could destroy the
skew-Hermiticity again. Another way of thinking about this is that Lie
algebra elements "live" in the exponent of a unitary operator, and
having that exponent become Hermitian instead of skew-Hermitian destroys
the unitary property.

Let us briefly test some of these properties numerically. First, let us
do a linear combination of $\{iX, iY, iZ\}$ with some real values and
check unitarity after putting them in the exponent.


In [None]:
import numpy as np
import pennylane as qml
from pennylane import X, Y, Z

su2 = [1j * X(0), 1j * Y(0), 1j * Z(0)]

coeffs = [1., 2., 3.]                           # some real coefficients
exponent = qml.dot(coeffs, su2)                 # linear combination of operators
U = qml.math.expm(exponent.matrix())            # compute matrix exponent of lin. comb.
print(np.allclose(U.conj().T @ U, np.eye(2)))   # check that result is unitary: U†U = 1

If we throw complex values in the mix, the resulting matrix is not
unitary anymore.


In [None]:
coeffs = [1., 2.+ 1j, 3.]                       # some complex coefficients
exponent = qml.dot(coeffs, su2)
U = qml.math.expm(exponent.matrix())
print(np.allclose(U.conj().T @ U, np.eye(2)))   # result is not unitary anymore

## Relation to Lie groups


We said earlier that the Lie group $SU(N)$ is generated by the Lie
algebra $\mathfrak{su}(N).$ But what do we actually mean by that?
Essentially, for every unitary matrix $U \in SU(N)$ there is a (real)
linear combination of elements $iP_j \in \mathfrak{su}(N)$ such that

$$U = e^{i \sum_{j=1}^{N^2-1} \lambda_j P_j}$$

for some real coefficients $\lambda_j \in \mathbb{R}.$

In quantum computing, we are interested in unitary gates that, when
composed together, realize a complicated unitary evolution $U.$ That
could, for example, be a unitary that prepares the ground state of a
Hamiltonian from the $|0\rangle^{\otimes n}$ state or perform a
sub-routine like the quantum Fourier transform. In particular, we are
not composing quantum circuits via creating superpositions of Lie
algebra elements as is done in the last equation.

Luckily, beyond the relation above, we also know that any unitary matrix
$U \in SU(2^n)$ can be decomposed in a finite product of elements from a
universal gate set $\mathcal{U},$

$$U = \prod_j U_j$$

for $U_j \in \mathcal{U}.$ A universal gate set is formed exactly when
the generators of its elements form $\mathfrak{su}(2^n).$ Note that in
this equation the product may feature a large number of gates $U_j,$ so
universality does not guarantee an efficient decomposition but rather
just a finite one.

## Dynamical Lie Algebras


A different way of looking at this is taking a set of generators
$\{iG_j\}$ and asking what kind of unitary evolutions they can generate.
This naturally introduces the so-called Dynamical Lie Algebra (DLA),
originally proposed in quantum optimal control theory and recently
re-emerging in the quantum computing literature. The DLA $i\mathfrak{g}$
is given by all possible nested commutators between the generators
$\{iG_j\},$ until no new and linearly independent skew-Hermitian
operator is generated. This is called the Lie-closure and is written
like

$$i \mathfrak{g} = \langle iG_1, iG_2, iG_3,.. \rangle_\text{Lie}.$$

Let us do a quick example and compute the Lie closure of $\{iX, iY\}$
(more examples later).


In [None]:
print(qml.commutator(1j * X(0), 1j * Y(0)))

We know that the commutator between $iX$ and $iY$ yields a new operator
$\propto iZ.$ Note that we do not care for scalar coefficients, just the
operators (technically, we care for linear independence, and $2i Z$ is
of course linearly dependent on $iZ$). So we add $iZ$ to our list of
operators and continue to take commutators between them.


In [None]:
list_ops = [1j * X(0), 1j * Y(0), 1j * Z(0)]
for op1 in list_ops:
    for op2 in list_ops:
        print(qml.commutator(op1, op2))

Since no new operators have been created, we know the Lie closure is
complete and our dynamical Lie algebra is
$\langle\{iX, iY\}\rangle_\text{Lie} = \{iX, iY, iZ\}( = \mathfrak{su}(2)).$
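
The manual procedure above can be written as a short loop. The following is a minimal dense-matrix sketch in plain NumPy, not how PennyLane implements it: the names `lie_closure_dense`, `is_independent` and `as_real_vector` are hypothetical, linear independence is tested over the reals with a rank computation, and commutators are not rescaled, so it is only meant for very small examples.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

def as_real_vector(op):
    """Flatten a complex matrix into a real vector (rank is taken over the reals)."""
    return np.concatenate([op.real.ravel(), op.imag.ravel()])

def is_independent(op, basis):
    """True if op lies outside the real span of the operators in basis."""
    if not basis:
        return not np.allclose(op, 0)
    stacked = np.array([as_real_vector(b) for b in basis + [op]])
    return np.linalg.matrix_rank(stacked) == len(basis) + 1

def lie_closure_dense(generators, max_rounds=10):
    """Add commutators until no new linearly independent operator appears."""
    basis = []
    for g in generators:
        if is_independent(g, basis):
            basis.append(g)
    for _ in range(max_rounds):
        new_ops = []
        for i, a in enumerate(basis):
            for b in basis[i + 1:]:
                c = a @ b - b @ a
                if is_independent(c, basis + new_ops):
                    new_ops.append(c)
        if not new_ops:
            break  # closed under commutation
        basis += new_ops
    return basis

closure = lie_closure_dense([1j * X, 1j * Y])
print(len(closure))
```

For the generators $\{iX, iY\}$ this sketch finds three basis elements, in line with $\text{dim}\,\mathfrak{su}(2) = 3.$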

PennyLane provides some dedicated functionality for Lie algebras. We can
compute the Lie closure of the generators using `qml.lie_closure`.


In [None]:
dla = qml.lie_closure([X(0), Y(0)])
dla

On one hand, the Lie closure ensures that the DLA is closed under
commutation. But you can also think of the Lie closure as filling the
missing operators to describe the possible dynamics in terms of its Lie
algebra. Let us stick to the example above and imagine for a second that
we don't take the Lie closure but just take the two generators
$\{iX, iY\}.$ These two generators suffice for universality (for a
single qubit) in that we can write any evolution in the Dynamical Lie
Group $SU(2)$ as a finite product of these $X$ and $Y$ rotations
$e^{-i \phi X}$ and $e^{-i \phi Y}.$ For example, let us write a Pauli-Z
rotation at non-trivial angle $0.5$ as a product of them.


In [None]:
U_target = qml.matrix(qml.RZ(0.5, 0))
decomp = qml.ops.one_qubit_decomposition(U_target, 0, rotations="XYX")
print(decomp)

We can check that this is indeed a valid decomposition by computing the
trace distance to the target.


In [None]:
U = qml.prod(*decomp).matrix()
print(1 - np.real(np.trace(U_target.conj().T @ U))/2)  # vanishes when U matches U_target

So we see that a finite set of generators $iX$ and $iY$ suffice to
express the target unitary. However, we cannot write
$U = e^{-i(\lambda_1 X + \lambda_2 Y)}$ since we are missing the $iZ$
from the DLA
$i\mathfrak{g} = \langle iX, iY \rangle_\text{Lie} = \{iX, iY, iZ\}.$

## Ising-type Lie algebras


Let us work through another example as an exercise. Let us look at the
generators $\{iX_0 X_1, iZ_0, iZ_1\}.$ You may recognize them as the
terms in the transverse field Ising model (here for the simple case of
$n=2$)

$$H_\text{Ising} = \sum_{\langle i, j \rangle} X_i X_j + \sum_{j=1}^n Z_j$$

where $\langle i, j \rangle$ indicates a sum over nearest neighbors in
the system's topology. Let us compute the first set of commutators for
those generators.


In [None]:
generators = [1j * (X(0) @ X(1)), 1j * Z(0), 1j * Z(1)]

# collection of linearly independent basis vectors, automatically discards linearly dependent ones
dla = qml.pauli.PauliVSpace(generators, dtype=complex)
for i, op1 in enumerate(generators):
    for op2 in generators[i+1:]:
        res = qml.commutator(op1, op2)/2
        res = res.simplify() # ensures all products of scalars are executed
        print(f"[{op1}, {op2}] = {res}")

        if res.scalar != 0. and dla.is_independent(res.pauli_rep):
            # Note that the previous is_independent check is just for pedagogical purposes
            # as dla.add already checks linear independence below
            print(f"Appending {res}")
            dla.add(res)

We obtain two new operators $iY_0 X_1$ and $iX_0 Y_1$ and append them to the
list of operators of the DLA. We then continue with depth-1 nested
commutators ("nested" as $iY_0 X_1 \propto [iX_0 X_1, iZ_0]$).


In [None]:
for i, op1 in enumerate(dla.basis.copy()):
    for op2 in dla.basis.copy()[i+1:]:
        res = qml.commutator(op1, op2)/2
        res = res.simplify()
        print(f"[{op1}, {op2}] = {res}")

        if res.scalar != 0. and dla.is_independent(res.pauli_rep):
            print(f"Appending {res}")
            dla.add(res)

The only new operator here is $iY_0 Y_1,$ which we add to the list of
the DLA. We could continue this process with a second nesting layer but
will find that no new operators are added past this point. We finally
end up with the DLA
$\{iX_0 X_1, iZ_0, iZ_1, iY_0 X_1, iX_0 Y_1, iY_0 Y_1\}.$


In [None]:
for op in dla.basis:      
    print(op)

Curiously, even though both $iZ_0$ and $iZ_1$ are in the DLA, $iZ_0 Z_1$
is not. Hence, products of generators are not necessarily in the DLA.
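
One way to convince yourself of this is a rank check on dense matrices. The sketch below hard-codes the six DLA elements found above (dropping the imaginary unit, which does not affect linear independence) and tests whether adding $Z_0 Z_1$ enlarges the span. The helper names `kron` and `rank` are chosen just for this example.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron(*ops):
    """Tensor product of single-qubit matrices, first factor acting on qubit 0."""
    return reduce(np.kron, ops)

# X0 X1, Z0, Z1, Y0 X1, X0 Y1, Y0 Y1
dla_basis = [kron(X, X), kron(Z, I2), kron(I2, Z), kron(Y, X), kron(X, Y), kron(Y, Y)]
ZZ = kron(Z, Z)

def rank(ops):
    """Dimension of the span of the flattened operators."""
    return np.linalg.matrix_rank(np.array([op.ravel() for op in ops]))

dim_dla = rank(dla_basis)
dim_with_zz = rank(dla_basis + [ZZ])
print(dim_dla, dim_with_zz)
```

The rank grows from 6 to 7 once $Z_0 Z_1$ is appended, so $Z_0 Z_1$ is linearly independent of the DLA elements.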

We have constructed the DLA by hand to showcase the process. We can use
the PennyLane function `lie_closure` for
convenience. In that case, we omit the explicit use of the imaginary
factor.


In [None]:
dla2 = qml.lie_closure([X(0) @ X(1), Z(0), Z(1)])
for op in dla2:
    print(op)

The DLA obtained from the Ising generators forms the so-called special
orthogonal Lie algebra $\mathfrak{so}(4),$ which has the dimension
$4*3/2 = 6$ (see table below), equal to the number of operators we
obtain from computing the Lie closure. For more qubits $n,$ the
associated DLA for the transverse field Ising model is
$\mathfrak{so}(2n)$ for open boundary conditions and
$\mathfrak{so}(2n)^{\oplus 2}$ for cyclic boundary conditions in 1D.

We can easily verify this using `lie_closure`.


In [None]:
def IsingGenerators(n, bc="open"):
    gens = [X(i) @ X(i+1) for i in range(n-1)]
    gens += [Z(i) for i in range(n)]
    if bc == "periodic":
        gens += [X(n-1) @ X(0)]
    return gens

for n in range(2, 5):
    open_ = qml.lie_closure(IsingGenerators(n, "open"))
    periodic_ = qml.lie_closure(IsingGenerators(n, "periodic"))
    print(f"Ising for n = {n}")
    print(f"open: {len(open_)} = {n*(2*n-1)} = 2n * (2n - 1)/2")
    print(f"periodic: {len(periodic_)} = {2*n*(2*n-1)} = 2 * 2n * (2n - 1)/2")

This Ising-type Lie algebra is one of only a handful of DLAs that have
polynomial scaling (see reference 1 for a full classification in 1D) and are thus
efficiently simulatable. Less common but also relevant is the
[symplectic algebra](https://en.wikipedia.org/wiki/Symplectic_group)
$\mathfrak{sp}(2N).$

In the table below we provide the dimensions of some of the common
simple Lie algebras.


  |Lie algebra                      |   dimension |
  |:-:|:-:|
  |$\mathfrak{su}(N)$               |   $N^2-1$ |
  |$\mathfrak{so}(N)$               |   $N(N-1)/2$ |
  |$\mathfrak{sp}(N)$               |   $N(N+1)/2$ |
  
## Hamiltonian Symmetries


With this new knowledge we are now able to understand what is meant when
some Hamiltonian models are said to be symmetric under some symmetry
group. Specifically, let us look at the spin-1/2 Heisenberg model
Hamiltonian in 1D with nearest neighbor interactions,

$$H_\text{Heis} = \sum_{j=1}^{n-1} J_j \left(X_j X_{j+1} + Y_j Y_{j+1} + Z_j Z_{j+1} \right)$$

with some coupling constants $J_j \in \mathbb{R}.$ First it is important
to understand that the generators here are made up of the whole sum of
operators $X_j X_{j+1} + Y_j Y_{j+1} + Z_j Z_{j+1}$, and not each
individual term $X_j X_{j+1}$, $Y_j Y_{j+1},$ and $Z_j Z_{j+1}.$ This
Hamiltonian is said to be $SU(2)$ invariant, but what does that mean?

First, let us identify total spin components

$$S_\text{tot}^{x} = \sum_{j=1}^n X_j ; \ S_\text{tot}^{y} = \sum_{j=1}^n Y_j ; \ S_\text{tot}^{z} = \sum_{j=1}^n Z_j.$$

Together, they span a representation of $\mathfrak{su}(2)$ (more on that
below). These total spin components each commute with the system
Hamiltonian, i.e.

$$[S_\text{tot}^{x}, H_\text{Heis}] = 0 ; [S_\text{tot}^{y}, H_\text{Heis}] = 0 ; [S_\text{tot}^{z}, H_\text{Heis}] = 0$$

Let us briefly verify this for a small example for `n = 3` qubits that
readily generalizes to arbitrary sizes.


In [None]:
n = 3
H = qml.sum(*(P(i) @ P(i+1) for i in range(n-1) for P in [X, Y, Z]))

SX = qml.sum(*(X(i) for i in range(n)))
SY = qml.sum(*(Y(i) for i in range(n)))
SZ = qml.sum(*(Z(i) for i in range(n)))

print(qml.commutator(H, SX))
print(qml.commutator(H, SY))
print(qml.commutator(H, SZ))

Now that we know that the Heisenberg model Hamiltonian commutes with any
$S_\text{tot}^{\alpha}$ for $\alpha \in \{x, y, z\},$ we also know that
any observable composed of the total spin components

$$\hat{O} = c_x S^x_\text{tot} + c_y S^y_\text{tot} + c_z S^z_\text{tot}$$

commutes with the Hamiltonian,

$$[\hat{O}, H_\text{Heis}] = 0.$$

An immediate consequence of this is that also
$[e^{-i\hat{O}}, H_\text{Heis}] = 0.$ Hence, $H_\text{Heis}$ is
invariant under any action of $e^{-i \hat{O}} \in SU(2),$

$$e^{i\hat{O}} H_\text{Heis} e^{-i\hat{O}} = H_\text{Heis}.$$

Thus, $H_\text{Heis}$ is said to be $SU(2)$ symmetric.
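
The invariance can also be checked on dense matrices. The following NumPy sketch assumes $n = 3,$ couplings $J_j = 1$ and arbitrarily chosen real coefficients $c_x, c_y, c_z$; the helper `embed` is a hypothetical name for placing single-qubit operators on given sites.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
paulis = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def embed(ops_by_site, n):
    """Tensor product with ops_by_site[j] on qubit j and identity elsewhere."""
    return reduce(np.kron, [ops_by_site.get(j, I2) for j in range(n)])

n = 3
# Heisenberg Hamiltonian with all couplings set to 1
H_heis = sum(
    embed({j: P, j + 1: P}, n) for j in range(n - 1) for P in paulis.values()
)
# total spin components
S_tot = {name: sum(embed({j: P}, n) for j in range(n)) for name, P in paulis.items()}

# arbitrary real coefficients c_x, c_y, c_z
c = {"X": 0.3, "Y": -1.1, "Z": 0.8}
O = sum(c[name] * S_tot[name] for name in paulis)

# V = exp(-i O) via the eigendecomposition of the Hermitian matrix O
evals, evecs = np.linalg.eigh(O)
V = evecs @ np.diag(np.exp(-1j * evals)) @ evecs.conj().T

# e^{iO} H e^{-iO} should reproduce H
invariant = np.allclose(V.conj().T @ H_heis @ V, H_heis)
print(invariant)
```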

There are several things to note: We have so far been sloppy in equating
Lie algebras with one of many possible representations (e.g.
$\text{span}_{\mathbb{R}} \{iX, iY, iZ\} = \mathfrak{su}(2)$ above). The
total spin component operators
$S_\text{tot}^{x}, S_\text{tot}^{y}, S_\text{tot}^{z}$ span another
representation of $\mathfrak{su}(2)$ and, therefore, generate $SU(2).$
This is easily verified by looking at the commutation relation between
these operators that match
$[\hat{O}_i, \hat{O}_j] = 2i \varepsilon_{ij\ell} \hat{O}_\ell,$ the
defining property of $\mathfrak{su}(2).$


In [None]:
print(qml.commutator(SX, SY) == (2j*SZ).simplify())
print(qml.commutator(SZ, SX) == (2j*SY).simplify())
print(qml.commutator(SY, SZ) == (2j*SX).simplify())

Another perspective on the inherent $SU(2)$ symmetry of $H_\text{Heis}$
is that the expectation value of $\hat{O}$ with respect to any state
$|\psi\rangle$ is invariant under evolution of $H_\text{Heis}.$ This can
be seen by looking at

$$\langle \psi(t) | \hat{O} |\psi(t)\rangle = \langle \psi | e^{i t H_\text{Heis}} \hat{O} e^{-i t H_\text{Heis}} |\psi\rangle = \langle \psi | e^{i t H_\text{Heis}} e^{-i t H_\text{Heis}} \hat{O} |\psi\rangle = \langle \psi | \hat{O} |\psi\rangle$$

where $|\psi(t)\rangle = e^{-i t H_\text{Heis}} |\psi\rangle$ is the
evolved state under $H_\text{Heis}.$ In that sense, $\hat{O}$ is a
conserved quantity of the system. One often associates a so-called
[quantum number](https://en.wikipedia.org/wiki/Quantum_number) with each
generator of the symmetry, here
$\{S_\text{tot}^{x}, S_\text{tot}^{y}, S_\text{tot}^{z}\},$ the total
spin numbers.

Overall, we saw that $H_\text{Heis}$ is invariant under action of
$SU(2)$ and how this gives rise to conserved quantities.
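
As a small illustration of such a conserved quantity, the sketch below evolves a random two-qubit state under the two-site Heisenberg Hamiltonian and tracks $\langle S^z_\text{tot} \rangle$ at a few times. The choice of two qubits, unit coupling and the sampled times are assumptions made only to keep the example short.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

H = np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)  # two-site Heisenberg model
O = np.kron(Z, I2) + np.kron(I2, Z)                # total spin z component

rng = np.random.default_rng(2)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)                          # random normalized state

evals, evecs = np.linalg.eigh(H)

def evolve(state, t):
    """Apply exp(-i t H) to state using the eigendecomposition of H."""
    return evecs @ (np.exp(-1j * t * evals) * (evecs.conj().T @ state))

def expval(op, state):
    return np.real(np.vdot(state, op @ state))

values = [expval(O, evolve(psi, t)) for t in np.linspace(0, 2, 5)]
conserved = np.allclose(values, values[0])
print(conserved)
```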

Note:

Symmetries also play a big role in quantum phase transitions: Imagine
preparing the ground state at zero temperature of a system that has a
symmetry. Accordingly, the ground state must be invariant under that
symmetry. I.e., the expectation value of the conserved quantities must
not change by adiabatically (very slowly) changing the system parameters
while staying at zero temperature. However, there may be a critical point
in the parameter space of the Hamiltonian where a conserved quantity
does, in fact, change. That is what is called the spontaneous breaking
of the symmetry and it is associated with a quantum phase
transition.


## Conclusion


Given Hermitian operators $G = \{h_i\}$ (think Hermitian observables
like terms of a Hamiltonian), the dynamical Lie algebra $\mathfrak{g}$
can be computed via the Lie closure $\langle \cdot \rangle_\text{Lie}$
(see `pennylane.lie_closure`),

$$\mathfrak{g} = \langle \{h_i\} \rangle_\text{Lie} \subseteq \mathfrak{su}(2^n).$$

That is, by computing all possible nested commutators until no new
operators emerge. This leads to a set of operators that is closed under
commutation, hence the name. In particular, the result of the commutator
between any two elements in $\mathfrak{g}$ can be decomposed as a linear
combination of other elements in $\mathfrak{g},$

$$[h_\alpha, h_\beta] = \sum_\gamma f^\gamma_{\alpha \beta} h_\gamma.$$

The coefficients $f^\gamma_{\alpha \beta}$ are called the structure
constants of the DLA and can be computed via the standard projection in
vector spaces (as is $\mathfrak{g}$),

$$f^\gamma_{\alpha \beta} = \frac{\langle h_\gamma, [h_\alpha, h_\beta]\rangle}{\langle h_\gamma, h_\gamma\rangle}.$$

The main difference from the usual vector spaces like $\mathbb{R}^N$ or
$\mathbb{C}^N$ is that here we use the trace inner product between
operators
$\langle h_\alpha, h_\beta \rangle = \text{tr}\left[h_\alpha^\dagger h_\beta \right]$
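
As a worked instance of the projection formula, the sketch below computes the structure constants for the Hermitian basis $\{X, Y, Z\}$ with dense matrices. The function name `structure_constants_dense` and the index layout `f[gamma, alpha, beta]` are choices made for this illustration, and the projection assumes the basis is orthogonal under the trace inner product, as Pauli words are.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [X, Y, Z]

def inner(a, b):
    """Trace inner product <a, b> = tr[a^dagger b]."""
    return np.trace(a.conj().T @ b)

def structure_constants_dense(basis):
    """f[gamma, alpha, beta] = <h_gamma, [h_alpha, h_beta]> / <h_gamma, h_gamma>."""
    d = len(basis)
    f = np.zeros((d, d, d), dtype=complex)
    for a, h_a in enumerate(basis):
        for b, h_b in enumerate(basis):
            comm = h_a @ h_b - h_b @ h_a
            for g, h_g in enumerate(basis):
                f[g, a, b] = inner(h_g, comm) / inner(h_g, h_g)
    return f

f = structure_constants_dense(basis)
print(f[2, 0, 1])  # coefficient of Z in [X, Y]
```

The printed coefficient is $2i,$ matching $[X, Y] = 2i Z.$ Because the Hermitian operators are used without the imaginary unit, the coefficients come out imaginary here.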


Note:

Technically, the (dynamical) Lie algebra is formed by skew-Hermitian
operators $\{i h_i\}.$ We avoid this distinction here since for all
practical purposes one can also look at Hermitian operators and
explicitly add imaginary units in the exponents where appropriate. 

With this introduction, we hope to clarify some terminology, introduce
the basic concepts of Lie theory and motivate their relevance in quantum
physics by touching on universality and symmetries. While Lie theory and
symmetries are playing a central role in established fields such as
quantum phase transitions (see note above) and [high energy
physics](https://en.wikipedia.org/wiki/Standard_Model), they have
recently also emerged in quantum machine learning with the onset of
geometric quantum machine learning. Further, DLAs have recently become instrumental in
classifying criteria for barren plateaus and designing simulators based
on them.


## References

1. Roeland Wiersema, Efekan Kökcü, Alexander F. Kemper, Bojko N. Bakalov “Classification of dynamical Lie algebras for translation-invariant 2-local spin systems in one dimension” arXiv:2309.05690, 2023.
2. Johannes Jakob Meyer, Marian Mularski, Elies Gil-Fuster, Antonio Anna Mele, Francesco Arzani, Alissa Wilms, Jens Eisert “Exploiting symmetry in variational quantum machine learning” arXiv:2205.06217, 2022.
3. Quynh T. Nguyen, Louis Schatzki, Paolo Braccia, Michael Ragone, Patrick J. Coles, Frederic Sauvage, Martin Larocca, M. Cerezo “Theory for Equivariant Quantum Neural Networks” arXiv:2210.08566, 2022.
4. Enrico Fontana, Dylan Herman, Shouvanik Chakrabarti, Niraj Kumar, Romina Yalovetzky, Jamie Heredge, Shree Hari Sureshbabu, Marco Pistoia “The Adjoint Is All You Need: Characterizing Barren Plateaus in Quantum Ansätze” arXiv:2309.07902, 2023.
5. Michael Ragone, Bojko N. Bakalov, Frédéric Sauvage, Alexander F. Kemper, Carlos Ortiz Marrero, Martin Larocca, M. Cerezo “A Unified Theory of Barren Plateaus for Deep Parametrized Quantum Circuits” arXiv:2309.09342, 2023.
6. Matthew L. Goh, Martin Larocca, Lukasz Cincio, M. Cerezo, Frédéric Sauvage “Lie-algebraic classical simulations for variational quantum computing” arXiv:2308.01432, 2023.
7. Rolando D. Somma “Quantum Computation, Complexity, and Many-Body Physics” arXiv:quant-ph/0512209, 2005.


# g-sim: Lie-algebraic classical simulations for variational quantum computing


For the most part, we now know the phenomenon of
barren plateaus can be reduced to the dimension of the circuit's
dynamical Lie algebra (DLA). In particular, exponentially sized DLAs lead to
exponentially vanishing gradients (barren plateaus). Conversely, it has
been realized that circuits with polynomially sized DLAs can be
efficiently simulated using a technique called $\mathfrak{g}$-sim,
leading to discussions on whether all trainable parametrized circuits
are also efficiently classically simulable.

So what is all the fuss about? How does $\mathfrak{g}$-sim work? What
are its restrictions? How can I run it in PennyLane? We are going to try
to answer all these questions in the demo below.

## Introduction


Lie algebras are tightly connected to quantum physics. While Lie algebra
theory is an integral part of high energy and condensed matter physics,
recent developments have shown connections to quantum simulation and
quantum computing. In particular, the infamous
'barren plateau problem' has been fully characterized by the underlying
dynamical Lie algebra (DLA). The main result of these works is that the dimension of the
circuit's DLA is inversely proportional to the variance of the mean of
the gradient (over a uniform parameter distribution), leading to
exponentially vanishing gradients in the uniform average case whenever
the DLA scales exponentially in system size.

At the same time, there exist Lie algebraic techniques with which one
can classically simulate expectation values of circuits with a
complexity polynomial in the dimension of the circuit's DLA. Hence,
circuits with guaranteed non-exponentially vanishing gradients in the
uniform average case are classically simulable, leading to some debate
on whether the field of variational quantum computing is doomed or not.
The majority of DLAs are in fact exponentially sized, shifting this
debate towards the question of whether or not uniform average case
results are relevant in practice for variational methods, with some
arguing for better initialization methods.

In this demo, we want to focus on those cases where efficient classical
simulation is possible due to polynomially sized DLAs. These instances
are rather limited, as they mainly concern DLAs of non-interacting systems
as well as the transverse-field Ising model and variations thereof (see
reference 1 for details).



## $\mathfrak{g}$-sim theory


In Lie algebra simulation, $\mathfrak{g}$-sim, we are interested in how
expectation values of Lie algebra elements are transformed under unitary
evolution. We start from an initial expectation value vector of the
input state $\rho^0$ with respect to each DLA element,

$$(\vec{e}^0)_\alpha = \text{tr}\left[h_\alpha \rho^0 \right].$$

Graphically, we can represent this as a tensor with one leg.

![](Hands_on_6_images/e.png)

When we transform the state $\rho^0$ with a unitary evolution $U,$ we
can use the cyclic property of the trace to shift the evolution onto the
DLA element,

$$(\vec{e}^1)_\alpha = \text{tr}\left[ h_\alpha U \rho^0 U^\dagger \right] = \text{tr}\left[ U^\dagger h_\alpha U \rho^0 \right].$$

In the context of $\mathfrak{g}$-sim, we assume the unitary operator to
be generated by DLA elements $h_\mu \in \mathfrak{g};$ in particular, we
have

$$U = e^{-i \theta h_\mu}$$

with some real parameter $\theta \in \mathbb{R}.$

As a consequence of the [Baker--Campbell--Hausdorff
formula](https://en.wikipedia.org/wiki/Baker%E2%80%93Campbell%E2%80%93Hausdorff_formula),
we know that any $h_\alpha \in \mathfrak{g}$ transformed under such a
$U$ is again in $\mathfrak{g}$ (because it leads to a sum of nested
commutators between DLA elements, and the DLA is closed under
commutation). In fact, it is a well-known result that the resulting
operator is given by the exponential of the structure constants

$$e^{i \theta h_\mu} h_\alpha e^{-i \theta h_\mu} = \sum_\beta e^{-i \theta f^\mu_{\alpha \beta}} h_\beta.$$

This is the identity connecting the adjoint representations of a Lie
group, $\text{Ad}_{e^{-ih_\mu}}(x) = e^{ih_\mu} x e^{-ih_\mu},$ and the
adjoint representation of the associated Lie algebra,
$\left(\text{ad}_{h_\mu}\right)_{\alpha \beta} = f^\mu_{\alpha \beta}.$
It can be summarized as

$$\text{Ad}_{e^{-ih_\mu}} = e^{-i \text{ad}_{h_\mu}}.$$

To the best of our knowledge there is no universally accepted name for
this identity (see, e.g. [Adjoint representation
(Wikipedia)](https://en.wikipedia.org/wiki/Adjoint_representation) or
[Lemma 3.14 in Introduction to Lie Groups and Lie
Algebras](https://www.math.stonybrook.edu/~kirillov/mat552/liegroups.pdf)),
so we shall refer to it as the "adjoint identity" from here on.
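
The adjoint identity can be verified numerically on a small example. The sketch below uses the Hermitian basis $\{X, Y, Z\}$ and fixes one explicit convention, which may differ from the compact index notation above by a transpose or sign: the matrix `M[beta, alpha]` holds the coefficient of $h_\beta$ in $[h_\mu, h_\alpha],$ so that $e^{i\theta h_\mu} h_\alpha e^{-i\theta h_\mu} = \sum_\beta \left(e^{i \theta M}\right)_{\beta\alpha} h_\beta.$ The helper names `ad_matrix` and `exp_i` are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [X, Y, Z]

def ad_matrix(mu):
    """M[beta, alpha] = coefficient of h_beta in [h_mu, h_alpha]."""
    d = len(basis)
    M = np.zeros((d, d), dtype=complex)
    for alpha, h_a in enumerate(basis):
        c = basis[mu] @ h_a - h_a @ basis[mu]
        for beta, h_b in enumerate(basis):
            M[beta, alpha] = np.trace(h_b.conj().T @ c) / np.trace(h_b.conj().T @ h_b)
    return M

def exp_i(H, t):
    """exp(i t H) for a Hermitian matrix H via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(1j * t * evals)) @ evecs.conj().T

theta, mu = 0.3, 2                 # conjugate with the gate generated by h_mu = Z
U_dag = exp_i(basis[mu], theta)    # e^{+i theta h_mu}
Ad = exp_i(ad_matrix(mu), theta)   # e^{+i theta M}; M is Hermitian for this basis

identity_holds = all(
    np.allclose(
        U_dag @ h_a @ U_dag.conj().T,
        sum(Ad[beta, alpha] * h_b for beta, h_b in enumerate(basis)),
    )
    for alpha, h_a in enumerate(basis)
)
print(identity_holds)
```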

With this, we can see how the initial expectation value vector is
transformed under unitary evolution,

$$(\vec{e}^1)_\alpha = \sum_\beta e^{-i \theta f^\mu_{\alpha \beta}} \text{tr}\left[h_\beta \rho^0 \right].$$

This is simply the matrix-vector product between the adjoint
representation of the unitary gate and the initial expectation value
vector. For a unitary circuit composed of multiple gates,

$$\mathcal{U} = \prod_j e^{-i \theta_j h_j},$$

this becomes the product of multiple adjoint representations of said
gates,

$$\tilde{U} = \prod_j e^{-i \theta_j \text{ad}_{h_j}}.$$

So overall, the evolution can be summarized graphically as the
following.

![](Hands_on_6_images/Ue.png)

We are typically interested in expectation values of observables
composed of DLA elements,
$\hat{O} = \sum_\alpha w_\alpha h_\alpha.$ Overall, the
computation in $\mathfrak{g}$-sim is a vector-matrix-vector product,

$$\langle \hat{O} \rangle = \text{tr}\left[\hat{O} \mathcal{U} \rho^0 \mathcal{U}^\dagger \right] = \sum_{\alpha \beta} w_\alpha \tilde{U}_{\alpha \beta} e_\beta = \vec{w} \cdot \tilde{U} \cdot \vec{e}.$$

Or, graphically:

![](Hands_on_6_images/wUe.png)

The dimension of
$\left(\text{ad}_{h_j}\right)_{\alpha \beta} = f^j_{\alpha \beta}$ is
$\text{dim}(\mathfrak{g}) \times \text{dim}(\mathfrak{g}).$ So while we
evolve a $2^n$-dimensional complex vector in state vector simulators, we
evolve a $\text{dim}(\mathfrak{g})$-dimensional expectation vector in
$\mathfrak{g}$-sim, which is more efficient whenever
$\text{dim}(\mathfrak{g}) < 2^n.$ In general, it is efficient whenever
$\text{dim}(\mathfrak{g}) = O\left(\text{poly}(n)\right).$
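
To see the vector-matrix-vector picture at work, here is a minimal single-qubit sketch that compares $\mathfrak{g}$-sim with a direct density-matrix evolution. It assumes the DLA basis $\{X, Y, Z\},$ two arbitrarily chosen gates and the explicit convention that `M[beta, alpha]` is the coefficient of $h_\beta$ in $[h_\mu, h_\alpha],$ under which each gate updates the expectation vector with the transpose of $e^{i\theta M}.$ All helper names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [X, Y, Z]  # Hermitian basis h_alpha of the single-qubit DLA

def exp_i(H, t):
    """exp(i t H) for a Hermitian matrix H via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(1j * t * evals)) @ evecs.conj().T

def ad_matrix(mu):
    """M[beta, alpha] = coefficient of h_beta in [h_mu, h_alpha]."""
    M = np.zeros((3, 3), dtype=complex)
    for alpha, h_a in enumerate(basis):
        c = basis[mu] @ h_a - h_a @ basis[mu]
        for beta, h_b in enumerate(basis):
            M[beta, alpha] = np.trace(h_b.conj().T @ c) / np.trace(h_b.conj().T @ h_b)
    return M

# initial state |0><0| and its expectation vector e_alpha = tr[h_alpha rho]
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
e0 = np.array([np.trace(h @ rho0).real for h in basis])

gates = [(0, 0.4), (1, -0.9)]   # (generator index, angle): exp(-i 0.4 X), then exp(+i 0.9 Y)
w = np.array([0.5, 0.0, 0.5])   # observable O = 0.5 X + 0.5 Z

# g-sim: evolve the 3-dimensional expectation vector gate by gate
e_t = e0.astype(complex)
for mu, theta in gates:
    e_t = exp_i(ad_matrix(mu), theta).T @ e_t
expval_gsim = np.real(w @ e_t)

# reference: evolve the density matrix directly
rho = rho0
for mu, theta in gates:
    U = exp_i(basis[mu], -theta)  # U = exp(-i theta h_mu)
    rho = U @ rho @ U.conj().T
O = sum(w_a * h for w_a, h in zip(w, basis))
expval_exact = np.real(np.trace(O @ rho))

print(np.isclose(expval_gsim, expval_exact))
```

For a single qubit nothing is gained, since the DLA dimension (3) is not smaller than the state dimension (2); the benefit only appears when $\text{dim}(\mathfrak{g})$ grows polynomially in $n.$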

## $\mathfrak{g}$-sim in PennyLane


Let us put this into practice and write a differentiable
$\mathfrak{g}$-simulator in PennyLane. We start with some boilerplate
PennyLane imports.


In [None]:
import pennylane as qml
from pennylane import X, Z, I
import numpy as np

import jax
import jax.numpy as jnp
from jax.scipy.linalg import expm

jax.config.update("jax_enable_x64", True)
jax.config.update("jax_platform_name", "cpu")

## System DLA


As mentioned before, polynomially sized DLAs are rare, with the
transverse-field Ising model (TFIM) with nearest neighbors being one of
them. We take, for simplicity, the one-dimensional variant with open
boundary conditions,

$$H_\text{TFIM} = \sum_{j=1}^{n-1} J X_j X_{j+1} + \sum_{j=1}^{n} h Z_j.$$

We define its generators and compute their Lie closure with
`pennylane.lie_closure`.


In [None]:
n = 10 # number of qubits.
generators = [X(i) @ X(i+1) for i in range(n-1)]
generators += [Z(i) for i in range(n)]

# work with PauliSentence instances for efficiency
generators = [op.pauli_rep for op in generators]

dla = qml.lie_closure(generators, pauli=True)
dim_g = len(dla)

We are using the `pennylane.pauli.PauliSentence` representation of the operators via the `op.pauli_rep`
attribute for more efficient arithmetic and processing.

## Initial expectation value vector

With that, we can compute the initial expectation value vector for the
$\rho_0 = |0 \rangle \langle 0 |$ initial state for every DLA element.
We are doing a trick of representing the initial state as a Pauli
operator,
$|0 \rangle \langle 0 |^{\otimes n} = \prod_{i=1}^n (I_i + Z_i)/2.$ We
take advantage of the locality of the DLA elements and use the analytic,
normalized trace method
`pennylane.pauli.PauliSentence.trace`,
all to avoid having to go to the full Hilbert space.


In [None]:
# compute initial expectation value vector
e_in = np.zeros(dim_g, dtype=float)

for i, h_i in enumerate(dla):
    # initial state |0x0| = (I + Z)/2, note that trace function
    # below already normalizes by the dimension,
    # so we can omit the explicit factor /2
    rho_in = qml.prod(*(I(wire) + Z(wire) for wire in h_i.wires))
    rho_in = rho_in.pauli_rep

    e_in[i] = (h_i @ rho_in).trace()

e_in = jnp.array(e_in)
e_in
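
The trace trick above rests on a simple fact: for the all-zero state, the expectation value of a Pauli word is 1 if the word contains only $I$ and $Z$ factors and 0 otherwise. The following sketch checks this against explicit matrices for a few three-qubit words. The string encoding and the helper names `expval_zero_state` and `expval_dense` are hypothetical and only serve this illustration; they are not part of PennyLane.

```python
from functools import reduce

import numpy as np

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def expval_zero_state(word):
    """<0..0| P |0..0> for a Pauli string such as "ZIZ": 1 if only I and Z occur, else 0."""
    return 1.0 if set(word) <= {"I", "Z"} else 0.0


def expval_dense(word):
    """Same quantity via explicit matrices, tr[P * prod_i (I_i + Z_i) / 2]."""
    P = reduce(np.kron, [PAULI[c] for c in word])
    rho = reduce(np.kron, [(PAULI["I"] + PAULI["Z"]) / 2 for _ in word])
    return np.trace(P @ rho).real


words = ["ZIZ", "XXI", "IYZ", "III", "ZZZ"]
checks = [np.isclose(expval_zero_state(w), expval_dense(w)) for w in words]
print(all(checks))
```

The dense check scales exponentially with the number of qubits and is only meant as a cross-check; the rule itself needs no matrices at all.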

## Observable


We can compute the expectation value of any linear combination of DLA
elements. We choose the TFIM Hamiltonian itself,

$$\hat{O} = H_\text{TFIM} = \sum_j J X_j X_{j+1} + h Z_j.$$

That is, just the generators with some coefficients. Here we choose $J=h=0.5$
for simplicity. We generate the $\vec{w}$ vector by setting the
appropriate coefficients to `0.5`.


In [None]:
w = np.zeros(dim_g, dtype=float)
w[:len(generators)] = 0.5
w = jnp.array(w)

## Forward and backward pass


Together with the structure constants computed via
`pennylane.structure_constants` we now
have all ingredients to define the forward pass of the expectation value
computation. For demonstration purposes, we choose a random subset of
`depth=10` generators for gates from the DLA.


In [None]:
adjoint_repr = qml.pauli.structure_constants(dla)

depth = 10
gate_choice = np.random.choice(dim_g, size=depth)
gates = adjoint_repr[gate_choice]

def forward(theta):
    # simulation
    e_t = e_in
    for i in range(depth):
        e_t = expm(theta[i] * gates[i]) @ e_t

    # final expectation value
    result_g_sim = w @ e_t

    return result_g_sim.real

theta = jax.random.normal(jax.random.PRNGKey(0), shape=(10,))

gsim_forward, gsim_backward = forward(theta), jax.grad(forward)(theta)
gsim_forward, gsim_backward

As a sanity check, we compare the result with the equivalent circuit run
on a full state-vector simulator.


In [None]:
H = 0.5 * qml.sum(*[op.operation() for op in generators])

@qml.qnode(qml.device("default.qubit"), interface="jax")
def qnode(theta):
    for i, mu in enumerate(gate_choice):
        qml.exp(-1j * theta[i] * dla[mu].operation())
    return qml.expval(H)

statevec_forward, statevec_backward = qnode(theta), jax.grad(qnode)(theta)
statevec_forward, statevec_backward

We see that both simulations yield the same results. The full state
vector simulation works with a $2^n = 1024$ dimensional state vector,
whereas $\mathfrak{g}$-sim works with a $\text{dim}(\mathfrak{g}) = 2n (2n-1)/2 = 190$
dimensional expectation value vector.


In [None]:
print(
    qml.math.allclose(statevec_forward, gsim_forward), 
    qml.math.allclose(statevec_backward, gsim_backward),
)

Beyond 6 qubits, $\mathfrak{g}$-sim is more efficient in simulating
circuits generated by the TFIM Hamiltonian.


In [None]:
import matplotlib.pyplot as plt
ns = np.arange(2, 17)

plt.plot(ns, 2*ns*(2*ns-1)/2, "x-", label="dim(g)")
plt.plot(ns, 2**ns, ".-", label="2^n")
plt.yscale("log")
plt.legend()
plt.xlabel("n qubits")
plt.show()
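
The crossover visible in the plot can also be read off directly from the dimension formula $\text{dim}(\mathfrak{g}) = 2n(2n-1)/2$ quoted above. A small sketch in plain Python (the helper name `dim_tfim_dla` is hypothetical):

```python
def dim_tfim_dla(n):
    """Dimension 2n(2n-1)/2 of the open-boundary TFIM DLA quoted in the text."""
    return n * (2 * n - 1)


# largest n (starting from 2, as in the plot) where the DLA is not smaller than 2^n
last_n_without_gain = max(n for n in range(2, 40) if dim_tfim_dla(n) >= 2**n)

# (dim(g), 2^n) for a few system sizes
table = {n: (dim_tfim_dla(n), 2**n) for n in (6, 7, 10)}
print(last_n_without_gain, table)
```

At $n = 6$ the two sizes are still comparable (66 versus 64), while at $n = 10$ the expectation value vector has 190 entries compared to 1024 amplitudes.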

## VQE

Let us do a quick run of the variational quantum eigensolver (VQE) on the system at hand.

First, we define our optimization loop in JAX.


In [None]:
import optax
from datetime import datetime

def run_opt(value_and_grad, theta, n_epochs=100, lr=0.1, b1=0.9, b2=0.999, E_exact=0., verbose=True):

    optimizer = optax.adam(learning_rate=lr, b1=b1, b2=b2)
    opt_state = optimizer.init(theta)

    energy = np.zeros(n_epochs)
    gradients = []
    thetas = []

    @jax.jit
    def step(opt_state, theta):
        val, grad_circuit = value_and_grad(theta)
        updates, opt_state = optimizer.update(grad_circuit, opt_state)
        theta = optax.apply_updates(theta, updates)

        return opt_state, theta, val, grad_circuit


    t0 = datetime.now()

    ## Optimization loop
    for n in range(n_epochs):
        opt_state, theta, val, grad_circuit = step(opt_state, theta)

        energy[n] = val
        # record the gradient so that the returned list is not empty
        gradients.append(grad_circuit)
        thetas.append(theta)
    t1 = datetime.now()
    if verbose:
        print(f"final loss: {val - E_exact}; min loss: {np.min(energy) - E_exact}; after {t1 - t0}")
    
    return thetas, energy, gradients

We can use the Hamiltonian variational ansatz as a natural
parametrization of an ansatz circuit to obtain the ground-state energy.

In particular, we use the full Hamiltonian generator with a trainable
parameter for each term,

$$\prod_{\ell=1}^{10} \exp\left(-i \sum_j \left[\theta^X_{\ell j} X_j X_{j+1} + \theta^Z_{\ell j} Z_j\right]\right),$$

and repeat that over `depth=10` layers.


In [None]:
# Pick the adjoint repr of only the Hamiltonian generators
ham_terms = adjoint_repr[:len(generators)]

def forward(theta):
    # simulation
    e_t = jnp.array(e_in)

    for i in range(depth):
        e_t = expm(jnp.einsum("j,jkl->kl", theta[i], ham_terms)) @ e_t

    # final expectation values
    result_g_sim = w @ e_t

    return result_g_sim.real
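
The `einsum` contraction `"j,jkl->kl"` in the forward pass builds the layer generator as a weighted sum of adjoint matrices, one weight per Hamiltonian term. The sketch below shows the equivalence with an explicit sum, using random stand-in arrays (NumPy instead of JAX, and made-up shapes) rather than the actual `ham_terms`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_terms, dim_g = 4, 6
ham_terms = rng.normal(size=(n_terms, dim_g, dim_g))  # stand-in for adjoint matrices
theta_layer = rng.normal(size=n_terms)  # one parameter per Hamiltonian term

# contraction used in the forward pass
generator_einsum = np.einsum("j,jkl->kl", theta_layer, ham_terms)

# the same thing written as an explicit sum over terms
generator_loop = sum(theta_layer[j] * ham_terms[j] for j in range(n_terms))

print(np.allclose(generator_einsum, generator_loop))
```

Each layer therefore exponentiates a single `dim_g x dim_g` matrix, regardless of how many Hamiltonian terms enter the sum.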

Now we can run the optimization to find the ground-state energy.


In [None]:
theta = jax.random.normal(jax.random.PRNGKey(0), shape=(depth, len(generators),))

value_and_grad = jax.jit(jax.value_and_grad(forward))

value_and_grad(theta) # jit-compile first

E_exact = H.eigvals().min()

_, energies, _ = run_opt(value_and_grad, theta, E_exact=E_exact, verbose=True)

import matplotlib.pyplot as plt
plt.plot(energies-E_exact)
plt.yscale("log")
plt.ylabel("$E - E_{exact}$")
plt.xlabel("epochs")
plt.show()

We see good convergence to the true ground-state energy after `100`
epochs.


## References

1. Korbinian Kottmann “Introducing (Dynamical) Lie Algebras for quantum practitioners” PennyLane Demos, 2024.
2. Enrico Fontana, Dylan Herman, Shouvanik Chakrabarti, Niraj Kumar, Romina Yalovetzky, Jamie Heredge, Shree Hari Sureshbabu, Marco Pistoia “The Adjoint Is All You Need: Characterizing Barren Plateaus in Quantum Ansätze” arXiv:2309.07902, 2023.
3. Michael Ragone, Bojko N. Bakalov, Frédéric Sauvage, Alexander F. Kemper, Carlos Ortiz Marrero, Martin Larocca, M. Cerezo “A Unified Theory of Barren Plateaus for Deep Parametrized Quantum Circuits” arXiv:2309.09342, 2023.
4. Rolando D. Somma “Quantum Computation, Complexity, and Many-Body Physics” arXiv:quant-ph/0512209, 2005.
5. Rolando Somma, Howard Barnum, Gerardo Ortiz, Emanuel Knill “Efficient solvability of Hamiltonians and limits on the power of some quantum computational models” arXiv:quant-ph/0601030, 2006.
6. Victor Galitski “Quantum-to-Classical Correspondence and Hubbard-Stratonovich Dynamical Systems, a Lie-Algebraic Approach” arXiv:1012.2873, 2010.
7. Matthew L. Goh, Martin Larocca, Lukasz Cincio, M. Cerezo, Frédéric Sauvage “Lie-algebraic classical simulations for variational quantum computing” arXiv:2308.01432, 2023.
8. M. Cerezo, Martin Larocca, Diego García-Martín, N. L. Diaz, Paolo Braccia, Enrico Fontana, Manuel S. Rudolph, Pablo Bermejo, Aroosa Ijaz, Supanut Thanasilp, Eric R. Anschuetz, Zoë Holmes “Does provable absence of barren plateaus imply classical simulability? Or, why we need to rethink variational quantum computing” arXiv:2312.09121, 2023.
9. Roeland Wiersema, Efekan Kökcü, Alexander F. Kemper, Bojko N. Bakalov “Classification of dynamical Lie algebras for translation-invariant 2-local spin systems in one dimension” arXiv:2309.05690, 2023.
10. Guglielmo Mazzola “Quantum computing for chemistry and physics applications from a Monte Carlo perspective” arXiv:2308.07964, 2023.
11. Chae-Yeun Park, Minhyeok Kang, Joonsuk Huh “Hardware-efficient ansatz without barren plateaus in any depth” arXiv:2403.04844, 2024.


# (g + P)-sim: Extending g-sim by non-DLA observables and gates

In the previous section, we introduced the core concepts of Lie-algebraic simulation
techniques, such as $\mathfrak{g}$-sim. With that, we can compute
quantum circuit expectation values using the so-called
"dynamical Lie algebra (DLA)" of the circuit. The complexity of $\mathfrak{g}$-sim is
determined by the dimension of the corresponding Lie algebra,
$\mathfrak{g}.$ Adding operators to $\mathfrak{g}$ can transform a
polynomially sized DLA into an exponentially sized one, but we show here that
when one uses only a few non-DLA gates of a specific kind, the
increase in size is polynomial.



## Introduction

Lie-algebraic simulation techniques such as $\mathfrak{g}$-sim can be
handy in the niche cases where the
dynamical Lie algebra (DLA) scales polynomially with the number of qubits. Because those
cases essentially boil down to the transverse field Ising model (TFIM)
and variants thereof in 1D, we will do a case study on its DLA
specifically.

We are interested in the case where we want to extend the DLA
$\mathfrak{g}$ by a few additional gates that are outside the DLA. For
$n$ qubits we get a DLA dimension of
$\text{dim}(\mathfrak{g}) = 2n(2n-1)/2$ for the TFIM. Suppose we want to expand the DLA by a single operator $p$
in order to use it as a gate, and let us assume that $p$ is the product
of two DLA operators that, itself, is not part of the DLA. Adding
product operators to the TFIM DLA and computing their new Lie closure
can lead to an exponential increase with a new dimension up to
$2(2^{2n-2}-1).$ In that worst case, we get the so-called associative
algebra of $\mathfrak{g},$ that is, the algebra obtained by closing
$\mathfrak{g}$ under multiplication, which contains all possible products
of its operators. This is also a Lie algebra.

Here, we show how to extend the DLA by such a $p$ gate without going to
the exponentially large associative algebra, but instead make use of the
fact that $p$ is a product of DLA elements. We do so by looking at
moments of $\mathfrak{g}$ instead. The $m$-th order
moments are products of $(m+1)$ DLA elements. E.g.
$p = h_{\alpha_1} h_{\alpha_2} \notin \mathfrak{g}$ is a first order
moment. Depending on their order, every non-DLA moment gate increases
the highest moment order considered in the computation, $m_\text{comp}$.
The overall cost scales with this maximum order as
$\text{dim}(\mathfrak{g})^{m_\text{comp}}.$

In the worst case, each moment expands the space of operators by a
factor $\text{dim}(\mathfrak{g}),$ such that for $m$ moments, we are
dealing with a $\text{dim}(\mathfrak{g})^{m+1}$ dimensional space. In
that sense, this is similar to
Clifford+T simulators where expensive $T$ gates come with an exponential cost. A
key difference is that for a finite dimensional DLA, there is a maximum
moment $m_\text{max}.$ This corresponds to simply constructing the full
associative algebra again. In the case that the required
$m_\text{comp} = m_\text{max},$ we can just perform regular
$\mathfrak{g}$-sim with the associative algebra. Here we will consider
the case $m_\text{comp} < m_\text{max}.$

The authors of the $\mathfrak{g}$-sim paper (Goh et al., reference 4 below)
already hint at the possibility of extending
$\mathfrak{g}$-sim by expectation values of products of DLA elements. In
this demo, we extend this notion to gates generated by
moments of the DLA.

## $\mathfrak{g}$-sim


Let us briefly recap the core principles of $\mathfrak{g}$-sim. We
consider a Lie algebra $\mathfrak{g} = \{h_1, .., h_d\},$ which is
closed under commutation (see `pennylane.lie_closure`). We know that gates $e^{-i \theta h_\mu}$ transform Lie
algebra elements into Lie algebra elements,

$$e^{i \theta h_\mu} h_\alpha e^{-i \theta h_\mu} = \sum_\beta \left(e^{-i \theta \text{ad}_{h_\mu}}\right)_{\alpha \beta} h_\beta.$$

This is the adjoint identity with the adjoint representation of the Lie
algebra given by the `pennylane.structure_constants`,
$f^\mu_{\alpha \beta} = -i \left(\text{ad}_{h_\mu}\right)_{\alpha \beta}.$

This lets us evolve any expectation value of DLA elements using the
adjoint representation of the DLA. For that, we define the expectation
value vector $(\boldsymbol{e})_\alpha = \text{tr}[h_\alpha \rho].$

Also, let us write $U = e^{-i \theta \text{ad}_{h_\mu}}$ corresponding
to a unitary $\mathcal{U} = e^{-i \theta h_\mu}.$ Using the adjoint
identity above and the cyclic property of the trace, we can write an
evolved expectation value vector as

$$\text{tr}\left[h_\alpha \mathcal{U} \rho \mathcal{U}^\dagger\right] = \sum_\beta U_{\alpha \beta} \text{tr}\left[h_\beta \rho \right].$$

Hence, the expectation value vector is simply transformed by matrix
multiplication with $U$ and we have

$$\boldsymbol{e}^\text{out} = U \boldsymbol{e}^\text{in}$$

for some input expectation value vector $\boldsymbol{e}^\text{in}.$

A circuit composed of multiple unitaries $\mathcal{U}$ then simply
corresponds to evolving the expectation value vector with the corresponding matrices $U.$

We are going to concretely use the DLA of the transverse field Ising
model,

$$H_\text{Ising} = J \sum_{i=1}^{n-1} X_i X_{i+1} + h \sum_{i=1}^n Z_i.$$

This is one of the few systems that yield a polynomially sized DLA. Let
us construct its DLA via `pennylane.lie_closure`.


In [None]:
import pennylane as qml
import numpy as np

from pennylane import X, Y, Z, I
from pennylane.pauli import PauliSentence, PauliWord
from scipy.linalg import expm

import copy 

# TFIM generators
def TFIM(n):
    generators = [X(i) @ X(i+1) for i in range(n-1)]
    generators += [Z(i) for i in range(n)]
    generators = [op.pauli_rep for op in generators]

    dla = qml.pauli.lie_closure(generators, pauli=True)
    dim_dla = len(dla)
    return generators, dla, dim_dla

generators, dla, dim_g = TFIM(n=4)

In regular $\mathfrak{g}$-sim, the unitary evolution $\mathcal{U}$ of
the expectation vector is simply generated by the adjoint representation
$U.$


In [None]:
adjoint_repr = qml.structure_constants(dla)
gate = adjoint_repr[-1]
theta = 0.5

U = expm(theta * gate)

## $(\mathfrak{g}+P)$-sim


We now want to extend $\mathfrak{g}$-sim by operators that are not in
the DLA, but a product of DLA operators. Note that while the DLA is
closed under commutation, it is not closed under multiplication, such
that products of DLA elements are in general not in $\mathfrak{g}.$
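
The difference between closure under commutation and closure under multiplication can be seen in a few lines of plain Python. The sketch below represents Pauli words as strings and ignores all phases, which is sufficient here because the commutator of two Pauli words is either zero or proportional to their product. The function names (`multiply`, `commute`, `lie_closure`, `tfim_generators`) are hypothetical and unrelated to the PennyLane functions of similar name.

```python
def multiply(p, q):
    """Product of two Pauli strings, ignoring the overall phase."""
    out = []
    for a, b in zip(p, q):
        if a == "I":
            out.append(b)
        elif b == "I":
            out.append(a)
        elif a == b:
            out.append("I")
        else:
            # two different non-identity Paulis give the third one
            out.append(({"X", "Y", "Z"} - {a, b}).pop())
    return "".join(out)


def commute(p, q):
    """Pauli strings commute iff they differ on an even number of non-identity sites."""
    anti = sum(a != "I" and b != "I" and a != b for a, b in zip(p, q))
    return anti % 2 == 0


def lie_closure(generators):
    """Closure under commutators; for Pauli words [p, q] is 0 or proportional to p*q."""
    algebra = set(generators)
    frontier = set(generators)
    while frontier:
        new = set()
        for p in frontier:
            for q in algebra:
                if not commute(p, q):
                    r = multiply(p, q)
                    if r not in algebra:
                        new.add(r)
        algebra |= new
        frontier = new
    return algebra


def tfim_generators(n):
    """Open-boundary TFIM generators X_i X_{i+1} and Z_i as Pauli strings."""
    xx = ["I" * i + "XX" + "I" * (n - i - 2) for i in range(n - 1)]
    z = ["I" * i + "Z" + "I" * (n - i - 1) for i in range(n)]
    return xx + z


dla = lie_closure(tfim_generators(2))
product = multiply("ZI", "IZ")  # both factors are in the DLA
print(sorted(dla), product, product in dla)
```

For two qubits the closure contains six words, in line with $2n(2n-1)/2,$ while the product $Z_0 Z_1$ of two DLA elements is not among them.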

Let us look at the adjoint action of a gate generated by
$p = h_{\mu_1} h_{\mu_2} .. \notin \mathfrak{g},$

$$e^{i \theta h_{\mu_1} h_{\mu_2} ..} h_\alpha e^{-i \theta h_{\mu_1} h_{\mu_2} ..} = \sum_\beta \boldsymbol{P}^0_{\alpha \beta} h_\beta + \sum_{\beta_1 \beta_2} \boldsymbol{P}^1_{\alpha \beta_1 \beta_2} h_{\beta_1} h_{\beta_2} + ...$$

Here, $\boldsymbol{P}^m$ correspond to the contributions of the $m$-th
moments in $\mathfrak{g}.$ Let us look at the case where we use a first
order product, $\mathcal{P} = e^{-i \theta h_{\mu_1} h_{\mu_2}},$ and
only DLA operators and first order moments contribute to the adjoint
action.

For that, let us construct a concrete example. First we pick two
elements from $\mathfrak{g}$ such that their product is not in
$\mathfrak{g}.$


In [None]:
p = dla[-5] @ dla[-2]
p = next(iter(p)) # strip any scalar coefficients
dla_vspace = qml.pauli.PauliVSpace(dla, dtype=complex)
dla_vspace.is_independent(p.pauli_rep)


Note: 

For DLAs consisting of Pauli words --- as is the case for the TFIM ---
we can simply remove any scalar factors to avoid having additional
imaginary factors in the exponent of gates.


Now, we compute
$e^{i \theta h_{\mu_2} h_{\mu_1}} h_\alpha e^{-i \theta h_{\mu_1} h_{\mu_2}}$
and decompose it in the DLA and first moments.

Note that since the product basis is overcomplete, we only keep track of
the linearly independent elements and ignore the rest.


In [None]:
def exppw(theta, ps):
    # assert that it is indeed a pure Pauli word, not a sentence
    assert (len(ps) == 1 and isinstance(ps, PauliSentence)) or isinstance(ps, PauliWord)
    return np.cos(theta) * PauliWord({}) + 1j * np.sin(theta) * ps

theta = 0.5

P = exppw(theta, p)
P_dagger = exppw(-theta, p) # complex conjugate with p just being a hermitian pauli word

P0 = np.zeros((dim_g, dim_g), dtype=float)

for i, h1 in enumerate(dla):
    res = P @ h1 @ P_dagger
    for j, h2 in enumerate(dla):
        # decompose the result in terms of DLA elements
        # res = ∑ (res · h_j / ||h_j||^2) * h_j 
        value = (res @ h2).trace().real
        value = value / (h2 @ h2).trace().real
        P0[i, j] = value

P1 = np.zeros((dim_g, dim_g, dim_g), dtype=float)

for i, h1 in enumerate(dla):
    res = P @ h1 @ P_dagger
    dla_and_M1_vspace = copy.deepcopy(dla_vspace)
    for j, h2 in enumerate(dla):
        for l, h3 in enumerate(dla):
            prod = h2 @ h3
            
            if not dla_and_M1_vspace.is_independent(prod):
                continue

            # decompose the result in terms of products of DLA elements
            # res = ∑ (res · p_j / ||p_j||^2) * p_j 
            value = (res @ prod).trace().real
            value = value / (prod @ prod).trace().real
            P1[i, j, l] = value
            dla_and_M1_vspace.add(prod)
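
The `exppw` helper relies on the fact that a Pauli word $p$ squares to the identity, so that $e^{i \theta p} = \cos(\theta) I + i \sin(\theta) p.$ The following NumPy sketch checks this closed form against an exponential computed via eigendecomposition, for an arbitrarily chosen three-qubit word $X \otimes Z \otimes Y$ (the choice of word and the dense matrices are purely illustrative).

```python
from functools import reduce

import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

P = reduce(np.kron, [X, Z, Y])  # Pauli word X Z Y on three qubits
theta = 0.5

# closed form used by exppw: valid because P @ P is the identity
closed_form = np.cos(theta) * np.eye(8) + 1j * np.sin(theta) * P

# reference: exponentiate through the eigendecomposition of the Hermitian matrix P
lam, V = np.linalg.eigh(P)
reference = (V * np.exp(1j * theta * lam)) @ V.conj().T

print(np.allclose(P @ P, np.eye(8)), np.allclose(closed_form, reference))
```

Working with the closed form avoids any matrix exponential and keeps the whole computation at the level of Pauli words.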

We want to confirm that the adjoint action of $\mathcal{P}$ is indeed
fully described by the zeroth and first moments.

For that, we reconstruct the transformed DLA elements and compare them
with the decomposition.


In [None]:
for i, h1 in enumerate(dla):
    res = P @ h1 @ P_dagger
    res.simplify()

    reconstruct = sum([P0[i, j] * dla[j] for j in range(dim_g)])
    reconstruct += sum([P1[i, j, l] * dla[j] @ dla[l] for j in range(dim_g) for l in range(dim_g)])
    reconstruct.simplify()

    assert res == reconstruct

Now that we have successfully constructed a $\mathcal{P}$ gate, let us
look at how inserting it into a circuit transforms DLA elements (and therefore
expectation value vector elements).

$$\begin{aligned}
(\boldsymbol{e})_\alpha & = \text{tr}\left[h_\alpha \mathcal{P} \rho \mathcal{P}^\dagger \right] = \text{tr}\left[\mathcal{P}^\dagger h_\alpha \mathcal{P} \rho \right] \\
& = \sum_\beta \boldsymbol{P}^0_{\alpha \beta} \text{tr}\left[ h_\beta \rho \right] + \sum_{\beta_1 \beta_2} \boldsymbol{P}^1_{\alpha \beta_1 \beta_2} \text{tr}\left[ h_{\beta_1} h_{\beta_2} \rho \right] \\
& = \sum_\beta \boldsymbol{P}^0_{\alpha \beta} \boldsymbol{E}^0_\beta + \sum_{\beta_1 \beta_2} \boldsymbol{P}^1_{\alpha \beta_1 \beta_2} \boldsymbol{E}^1_{\beta_1 \beta_2}
\end{aligned}$$

Here we have defined the expectation tensor
$(\boldsymbol{E}^m)_{\beta_1 , .. , \beta_{m+1}} = \text{tr}\left[ h_{\beta_1} .. h_{\beta_{m+1}} \rho \right]$
for the $m$-th moment. Note that $\boldsymbol{e} = \boldsymbol{E}^0$ is
the expectation value vector for regular $\mathfrak{g}$-sim.

Such a computation corresponds to the branching off from the original
diagram, with an extra contribution coming from the higher moments.

![](Hands_on_6_images/first_split.png)

When inserting an arbitrary DLA gate $U$ before and $V$ after the
$\mathcal{P}$ gate, we obtain the following diagram. Note that $U$ and
$V$ can be compositions of multiple DLA gates again.

![](Hands_on_6_images/first_order_diagram.png)

Note that in one vertical column the $U$ correspond to the same matrix.

## Example


Let us compute an example. For that we start by computing the initial
expectation vector and tensor.


In [None]:
E0_in = np.zeros((dim_g), dtype=float)
E1_in = np.zeros((dim_g, dim_g), dtype=float)
E_in = [E0_in, E1_in]

for i, hi in enumerate(dla):
    rho_in = qml.prod(*(I(i) + Z(i) for i in hi.wires))
    rho_in = rho_in.pauli_rep

    E_in[0][i] = (hi @ rho_in).trace()

for i, hi in enumerate(dla):
    for j, hj in enumerate(dla):
        prod = hi @ hj
        if prod.wires != qml.wires.Wires([]):
            rho_in = qml.prod(*(I(i) + Z(i) for i in prod.wires))
        else:
            rho_in = PauliWord({}) # identity
        rho_in = rho_in.pauli_rep

        E_in[1][i, j] = (prod @ rho_in).trace().real

Now we need to compute the two branches from the diagram above.

![](Hands_on_6_images/first_order_diagram.png)


In [None]:
# some other DLA gate V
V = expm(0.5 * adjoint_repr[-2])

# contract first branch
# V - P - U - E^0_in
res0 = U @ E_in[0]
res0 = P0 @ res0
res0 = V @ res0

# contract second branch

# --U-==-+--------+   -+------+
#        | E^1_in | =  | res  |
# --U-==-+--------+   -+------+
res = np.einsum("ij,jl->il", U, E_in[1])
res = np.einsum("kl,il->ik", U, res)

#        +-----+-==-+------+
#  -V-==-| P^1 |    | res  |
#        +-----+-==-+------+
res = np.einsum("ijl,jl->i", P1, res)
res = V @ res

res = res + res0
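
In the second branch, the DLA gate $U$ acts on each index of the first-moment tensor, because each of the two DLA factors in the product $h_{\beta_1} h_{\beta_2}$ is transformed by the adjoint action separately. The two `einsum` calls are therefore equivalent to the matrix expression $U E^1 U^T.$ The sketch below verifies this with random stand-in arrays (the names `U_demo` and `E1_demo` are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
U_demo = rng.normal(size=(d, d))  # stand-in for the adjoint-representation gate U
E1_demo = rng.normal(size=(d, d))  # stand-in for the first-moment tensor E^1

# the two contractions used for the second branch
step = np.einsum("ij,jl->il", U_demo, E1_demo)
step = np.einsum("kl,il->ik", U_demo, step)

# equivalent matrix form: U acts on each index of E^1
matrix_form = U_demo @ E1_demo @ U_demo.T

print(np.allclose(step, matrix_form))
```

For higher moments the same pattern applies, with one copy of $U$ contracted into every index of $\boldsymbol{E}^m.$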

As a sanity check, let us compare this to the same circuit but using our
default state vector simulator in PennyLane.


In [None]:
@qml.qnode(qml.device("default.qubit"))
def true():
    qml.exp(-1j * theta * dla[-1].operation())
    qml.exp(-1j * 0.5 * p.operation())
    qml.exp(-1j * 0.5 * dla[-2].operation())
    return [qml.expval(op.operation()) for op in dla]

true_res = np.array(true())

np.allclose(true_res, res)

We find that indeed both results coincide and expectation value vectors
are correctly transformed in $(\mathfrak{g}+P)$-sim.


## Higher moments


We can extend the above approach by a second $P$ gate in the circuit.
This leads to contributions from up to the third order. The diagram for
a circuit $P V P U$ has the following five branches.

First, the zeroth- and first-order contributions. These can be seen as
branching off from the previous diagram.

![](Hands_on_6_images/2P_first_second.png)

We also obtain the third-order diagram containing both
$\boldsymbol{P}^1$ tensors.

![](Hands_on_6_images/2P_fourth.png)

To get the correct result, we also need to include the intermediate
second-order diagrams.

![](Hands_on_6_images/2P_thirds.png)

## Moment vector space


The above diagrams are handy to understand the maximum moment order
required for adding $P$ gates of a certain order. There is, however, a
lot of redundancy due to the overcompleteness of naively looking at
moments as all possible products between DLA elements.

Instead, we can also work in the vector space of unique moments

$$\mathcal{M}^m := \{p | p= h_{\alpha_1} .. h_{\alpha_{m+1}} \notin \mathcal{M}^{m-1} \} \cup \mathcal{M}^{m-1}$$

that is iteratively built from $\mathcal{M}^0 = \mathfrak{g}.$

Even though these spaces in general do not form Lie
algebras, we can still compute their (pseudo) adjoint representations
and use them for $\mathfrak{g}$-sim as long as we work in a large enough
space with the correct maximum moment order.

We now set up the moment vector spaces starting from the DLA and keep
adding linearly independent product operators.


In [None]:
def Moment_step(ops, dla):
    MomentX = qml.pauli.PauliVSpace(ops.copy())
    for i, op1 in enumerate(dla):
        for op2 in ops[i+1:]:
            prod = op1 @ op2
            # ignore scalar coefficient
            pw = next(iter(prod.keys()))

            MomentX.add(pw)
    
    return MomentX.basis

Moment0 = dla.copy()
Moment = [Moment0]
dim = [len(Moment0)]
for i in range(1, 5):
    Moment.append(Moment_step(Moment[-1], dla))
    dim.append(len(Moment[-1]))

dim

We see that for the considered example of $n = 4$ we reach the maximum
moment already for the second order (the additional operator in the
third moment space is just the identity).
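
As a cross-check that does not depend on PennyLane, the moment spaces of the TFIM can be rebuilt with phase-free Pauli strings, where linear independence simply means that two words are distinct. This is a simplified sketch under stated assumptions: it multiplies every DLA element with every element of the previous moment space and drops the identity explicitly, which differs in detail from the `Moment_step` loop above, and all helper names are hypothetical.

```python
def multiply(p, q):
    """Product of two Pauli strings, ignoring the overall phase."""
    out = []
    for a, b in zip(p, q):
        if a == "I":
            out.append(b)
        elif b == "I":
            out.append(a)
        elif a == b:
            out.append("I")
        else:
            out.append(({"X", "Y", "Z"} - {a, b}).pop())
    return "".join(out)


def commute(p, q):
    """Pauli strings commute iff they differ on an even number of non-identity sites."""
    return sum(a != "I" and b != "I" and a != b for a, b in zip(p, q)) % 2 == 0


def lie_closure(generators):
    """Closure under commutators of Pauli words (phases ignored)."""
    algebra, frontier = set(generators), set(generators)
    while frontier:
        new = set()
        for p in frontier:
            for q in algebra:
                r = multiply(p, q)
                if not commute(p, q) and r not in algebra:
                    new.add(r)
        algebra |= new
        frontier = new
    return algebra


def tfim_generators(n):
    """Open-boundary TFIM generators X_i X_{i+1} and Z_i as Pauli strings."""
    xx = ["I" * i + "XX" + "I" * (n - i - 2) for i in range(n - 1)]
    z = ["I" * i + "Z" + "I" * (n - i - 1) for i in range(n)]
    return xx + z


def moment_step(previous, dla, n):
    """Add all products h * p with h in the DLA and p in the previous moment space."""
    products = {multiply(h, p) for h in dla for p in previous}
    return previous | (products - {"I" * n})


n = 4
dla = lie_closure(tfim_generators(n))
moments = [dla]
for _ in range(2):
    moments.append(moment_step(moments[-1], dla, n))
dims = [len(m) for m in moments]
print(dims)
```

For $n = 4$ the sketch starts from the 28-dimensional DLA and yields a 98-dimensional first-order moment space, in agreement with the dimension of $\mathcal{M}^1$ used in the following cells.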

We can repeat our original computation for the first moments using the
$98$-dimensional first-order moment vector space $\mathcal{M}^1.$

The recipe follows the exact same steps as $\mathfrak{g}$-sim but using
$\mathcal{M}^1$ instead. First, we compute the input expectation value
vector.


In [None]:
comp_moment = 1

e_in = np.zeros((dim[comp_moment]), dtype=float)

for i, hi in enumerate(Moment[comp_moment]):
    rho_in = qml.prod(*(I(i) + Z(i) for i in hi.wires))
    rho_in = rho_in.pauli_rep

    e_in[i] = (hi @ rho_in).trace()

Next, we compute the (pseudo-)adjoint representation of $\mathcal{M}^1.$


In [None]:
adjoint_repr = qml.structure_constants(Moment[comp_moment])

We can now choose arbitrary DLA gates and a maximum of one
$P$ gate to evolve the expectation value vector.


In [None]:
e_t = e_in
e_t = expm(0.5 * adjoint_repr[dim_g-1]) @ e_t # the same U gate
e_t = expm(0.5 * adjoint_repr[74]) @ e_t      # the same P gate
e_t = expm(0.5 * adjoint_repr[dim_g-2]) @ e_t # the same V gate

The final result matches the state vector result again.


In [None]:
np.allclose(e_t[:dim_g], true_res)

## Limitations

We saw how we can make use of moment vector spaces to extend
$\mathfrak{g}$-sim by non-DLA elements under certain conditions. The upside
is that while the Lie closure or construction of the associative algebra
leads to an exponentially large DLA in $n,$ we get away with a polynomial cost
in $n,$ as we have $O(\text{dim}(\mathfrak{g})^{m_{\text{comp}}})$-sized
objects for some fixed maximum moment order $m_{\text{comp}}$ in the
computation, with some additional reductions due to the redundancies in
the moment spaces.

However, we argue that while this is interesting in theory, there is
little practical utility. To show that, we plot the dimensions of the
first and second moments against the associative algebra dimension.


In [None]:
dims_dla = []
dims_moment = []
dims_tempdla = []

ns = np.arange(2, 6)

for n in ns:
    _, dla, dim_g = TFIM(n)

    Moment0 = dla.copy()
    Moment = [Moment0]
    dim = [len(Moment0)]
    for i in range(1, 5):
        Moment.append(Moment_step(Moment[-1], dla))
        dim.append(len(Moment[-1]))

    tempdla = qml.lie_closure(dla + [Moment[1][-1]], pauli=True)

    dims_dla.append(dim_g)
    dims_moment.append(dim)
    dims_tempdla.append(len(tempdla))

import matplotlib.pyplot as plt

plt.title("Dimensions of $\\langle g + P \\rangle_{Lie}$ vs $\\mathcal{M}^m$")

plt.plot(ns, dims_tempdla, "o--", label="$dim(\\langle g + P \\rangle_{Lie})$", color="tab:blue")
plt.plot(ns, 2 * (2**(2*ns - 2) - 1), "-", label="$2(2^{2n-2}-1)$", color="tab:blue")

color = ["tab:orange", "tab:green", "tab:cyan"]
dims_moment = np.array(dims_moment)
for i in range(3):
    plt.plot(ns, dims_moment[:, i], "x--", label=f"$dim(\\mathcal{{M}}^{i})$", color=color[i])
    fitcoeff = np.polyfit(ns, dims_moment[:, i], i+2) # polynomial fit of order m+2
    plt.plot(ns, np.poly1d(fitcoeff)(ns), "-", label=f"$O(n^{{{i}+2}})$", color=color[i])

plt.xticks(ns)
plt.yscale("log")

plt.legend()
plt.xlabel("n")
plt.show()

First, we note that the maximum moment is quickly reached for small
system sizes. In that case we have $m_\text{comp} = m_\text{max}$ so we
might as well look at the associative algebra and perform regular
$\mathfrak{g}$-sim. Secondly, we also note that the dimensions quickly
explode and become hard to handle in reasonable times.

For example, in all cases here there are no non-trivial third-order
moments. We would have to go to $n = 6,$ for which there are $1980$
elements in $\mathcal{M}^3,$ which corresponds to iterating over
$1980^3 \approx 2^{33}$ commutators to compute the (pseudo-)adjoint
representation. This is already a stretch to accomplish with the
available tools.

Hence, this method is effectively restricted to very few non-DLA gates
of Ising-type DLAs, rendering the method overall niche for practical
applications. On the other hand, we gained some new theoretical insights
into the relationship between the simulability of a quantum circuit and
its DLA. In particular, we constructed an algorithm that is polynomial in the
number of qubits $n$ for simulating DLA circuits with up to
$m_\text{comp}$ moments in
$\mathcal{O}(\text{dim}\left(\mathfrak{g}(n)\right)^{m_\text{comp} + 2}) = \mathcal{O}(\text{poly}(n)).$


## References

1. Rolando D. Somma “Quantum Computation, Complexity, and Many-Body Physics” arXiv:quant-ph/0512209, 2005.
2. Rolando Somma, Howard Barnum, Gerardo Ortiz, Emanuel Knill “Efficient solvability of Hamiltonians and limits on the power of some quantum computational models” arXiv:quant-ph/0601030, 2006.
3. Victor Galitski “Quantum-to-Classical Correspondence and Hubbard-Stratonovich Dynamical Systems, a Lie-Algebraic Approach” arXiv:1012.2873, 2010.
4. Matthew L. Goh, Martin Larocca, Lukasz Cincio, M. Cerezo, Frédéric Sauvage “Lie-algebraic classical simulations for variational quantum computing” arXiv:2308.01432, 2023.
5. Roeland Wiersema, Efekan Kökcü, Alexander F. Kemper, Bojko N. Bakalov “Classification of dynamical Lie algebras for translation-invariant 2-local spin systems in one dimension” arXiv:2309.05690, 2023.


# Shadow Hamiltonian simulation


Shadow Hamiltonian simulation is a new approach to
quantum simulation on quantum computers. Despite its name, it has little
to do with
"classical shadows". In quantum simulation, the goal is typically to simulate
the time evolution of expectation values of $M$ observables $O_m,$ for
$m=1,\ldots ,M.$ The common approach is to evolve the wave function
$|\psi\rangle$ and then measure the desired observables after the
evolution.

In shadow Hamiltonian simulation, we instead directly encode the
expectation values in a proxy state --- the **shadow state** --- and
evolve that state accordingly. Specifically for time evolution, we can
write a shadow Schrödinger equation that governs the dynamics of the
shadow state.

![](Hands_on_6_images/OGthumbnail_shadow_hamiltonian_simulation.png)

This is fundamentally different to the common approach. Foremost, the
dimensionality of the shadow system no longer depends on the number of
constituents, $n,$ of the system. In fact, the underlying state can be
mixed or even infinite-dimensional. Instead, the shadow system\'s size
is dependent on the number of observables $M$ that we want to measure.
Note that there are conditions of completeness on the observables for
the shadow encoding to succeed, called the invariance property
in the original paper. Further, since the expectation values are encoded in the amplitudes
of states, we cannot directly measure them anymore, but need to resort
to some form of state tomography. On the other hand, this gives us
entirely new possibilities by letting us sample from the probability
distribution $p_m \propto |\langle O_m \rangle|^2$ and measure the absolute
value of all observables simultaneously in the standard Z basis.

In this demo, we are going to introduce the basic concepts of shadow
Hamiltonian simulation alongside some easy-to-follow code snippets. We
will also see later how shadow Hamiltonian simulation comes down to
g-sim, a
Lie-algebraic classical simulation tool, but run on a quantum computer
with some simplifications specifically due to considering Hamiltonian
simulation.

## Shadow Hamiltonian simulation --- Definition


In common quantum Hamiltonian simulation, we evolve a state vector
$|\psi(t)\rangle$ according to the Schrödinger equation,

$$\frac{\text{d}}{\text{dt}} |\psi(t)\rangle = -i H |\psi(t)\rangle,$$

by some Hamiltonian $H,$ and then compute expectation values of the
evolved state through measurement. In shadow Hamiltonian simulation, we
encode a set of expectation values in the amplitudes of a quantum state,
and evolve those according to some shadow Schrödinger equation.

For that, we first need to define the shadow state,

$$\begin{aligned}
|\rho\rangle = \frac{1}{\sqrt{A}} \begin{pmatrix} \langle O_1 \rangle \\ \vdots \\ \langle O_M \rangle \end{pmatrix},
\end{aligned}$$

for a set of operators $S = \{O_m\}$ and normalization constant
$A = \sum_m |\langle O_m \rangle|^2.$ This means that we can encode
these $M$ operator expectation values into $n_S$ qubits, as long as
$2^{n_S} \geq M.$ Note that $\langle O_m \rangle = \text{tr}[O_m \rho],$
so we can have mixed or even infinite-dimensional states $\rho.$
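
As a small illustration of the encoding, the sketch below pads $M$ expectation values into a normalized amplitude vector on $n_S$ qubits. The function `shadow_state` and the three input values are hypothetical, and the sketch assumes real expectation values (Hermitian observables) that are not all zero.

```python
import numpy as np


def shadow_state(expvals):
    """Pad M expectation values into a normalized vector on n_S qubits (2**n_S >= M)."""
    expvals = np.asarray(expvals, dtype=float)
    M = len(expvals)
    A = np.sum(np.abs(expvals) ** 2)  # normalization constant
    n_S = max(1, (M - 1).bit_length())  # smallest n_S with 2**n_S >= M
    amplitudes = np.zeros(2**n_S)
    amplitudes[:M] = expvals / np.sqrt(A)
    return amplitudes, A, n_S


# three hypothetical expectation values <O_1>, <O_2>, <O_3>
amplitudes, A, n_S = shadow_state([0.5, -0.5, 0.5])
probabilities = amplitudes**2  # Z-basis sampling distribution |<O_m>|^2 / A

print(n_S, A, np.isclose(np.sum(probabilities), 1.0))
```

Three observables fit into two qubits, with the unused fourth amplitude set to zero. The size of the underlying physical system does not enter anywhere.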

The shadow state evolves according to its shadow Schrödinger equation,

$$\frac{\text{d}}{\text{dt}} |\rho\rangle = - i H_S |\rho\rangle.$$

The Hamiltonian matrix $H_S$ is given by the commutation relations
between the system Hamiltonian $H$ and the operators in $S = \{O_m\},$

$$[H, O_m] = - \sum_{m'=1}^M \left( H_S \right)_{m m'} O_{m'}.$$

Let us solve for the matrix elements $(H_S)_{m m'}.$ To do this, recall
that a vector $\boldsymbol{v}$ can always be decomposed in an orthogonal
basis $\boldsymbol{e}_j$ via
$\boldsymbol{v} = \sum_j \frac{\langle \boldsymbol{e}_j, \boldsymbol{v}\rangle}{||\boldsymbol{e}_j||^2} \boldsymbol{e}_j.$
Since the operators under consideration are elements of the vector space
of Hermitian operators, we can use this to compute $H_S.$

In particular, with the trace inner product, this amounts to

$$[H, O_m] = \sum_{m'=1}^M \frac{\text{tr}\left( O_{m'} [H, O_m] \right)}{|| O_{m'} ||^2} O_{m'},$$

from which we can read off the matrix elements of $H_S,$ i.e.,

$$(H_S)_{m m'} = -\frac{\text{tr}\left( O_{m'} [H, O_m] \right)}{|| O_{m'} ||^2}.$$
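The formula for the matrix elements can be tried out with dense matrices. The following sketch is not part of the paper or of PennyLane; the helper name `shadow_hamiltonian` is hypothetical, and it assumes that the operators in `ops` are mutually orthogonal under the trace inner product.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def shadow_hamiltonian(H, ops):
    """(H_S)_{m m'} = -tr(O_{m'} [H, O_m]) / ||O_{m'}||^2 for orthogonal ops."""
    M = len(ops)
    H_S = np.zeros((M, M), dtype=complex)
    for m, Om in enumerate(ops):
        com = H @ Om - Om @ H  # commutator [H, O_m]
        for mp, Omp in enumerate(ops):
            overlap = np.trace(Omp.conj().T @ com)   # <O_{m'}, [H, O_m]>
            norm_sq = np.trace(Omp.conj().T @ Omp)   # ||O_{m'}||^2
            H_S[m, mp] = -overlap / norm_sq
    return H_S

# Single-qubit illustration with H = X and S = {Y, Z}
H_S = shadow_hamiltonian(X, [Y, Z])
print(H_S)
```

For this small choice of $H$ and $S,$ the resulting $2 \times 2$ matrix is Hermitian, as required for a unitary shadow evolution.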

Now, we can see that the operators $O_m$ need to be chosen such that all
potentially new operators $\mathcal{O} = [H, O_m],$ resulting from
taking the commutator between $H$ and $O_m,$ are decomposable in terms
of the $O_m$ again. In particular, the operators $O_m$ need to span
$\{\mathcal{O} | \mathcal{O} = [H, O_m] \}.$ Another way to say this
is that the span of $\{O_m\}$ needs to contain all nested commutators
with $H,$ i.e., $[H, [H, \ldots [H, O_m]]],$ which is similar to
`qml.lie_closure` but weaker because it revolves around just $H.$ In
the paper this is called the **invariance property**.

**Note:**

Take for example $H = X$ and $S = \{Y\}.$ Then $[H, Y] = 2iZ,$ so there
is no linear combination of elements in $S$ that can decompose $[H, Y].$
We need to extend the list such that we have $S = \{Y, Z\}.$ Now all
results from commutation, $[H, Y] = 2iZ$ and $[H, Z] = -2iY,$ are
supported by $S.$ This is similar to the Lie closure that we discussed,
but the requirements are not as strict because we only need support
with respect to commutators with $H,$ and not among all elements in
$S.$
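The invariance property of the example above can be checked numerically. The sketch below is an illustration only: the function name `is_invariant` is hypothetical, and it tests membership in the span of $S$ with a least-squares fit on the vectorized matrices.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def is_invariant(H, ops, tol=1e-10):
    """Return True if every [H, O_m] lies in the linear span of ops."""
    # Columns of `basis` are the vectorized operators in S
    basis = np.stack([O.reshape(-1) for O in ops], axis=1)
    for O in ops:
        com = (H @ O - O @ H).reshape(-1)
        coeffs, *_ = np.linalg.lstsq(basis, com, rcond=None)
        # A non-zero residual means [H, O] leaves the span of S
        if np.linalg.norm(basis @ coeffs - com) > tol:
            return False
    return True

print(is_invariant(X, [Y]))     # S = {Y} is not closed under [H, .]
print(is_invariant(X, [Y, Z]))  # S = {Y, Z} is
```

Note that this only tests commutators with $H,$ not commutators among the elements of $S,$ which is exactly what distinguishes the invariance property from a full Lie closure.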


## How this relates to g-sim


In g-sim, we
have operators $\{ g_i \}$ that are generators or observables for a
parametrized quantum circuit, e.g.
$U(\theta) = \prod_\ell \exp(-i \theta_\ell g_\ell)$ and
$\langle g_i \rangle.$ For that, we are looking at the so-called
dynamical Lie algebra (DLA) of the circuit,
$\mathfrak{g} = \langle \{ g_i \} \rangle_\text{Lie} = \{ g_1, .., g_{|\mathfrak{g}|} \},$
as well as the adjoint representation
$(-i \text{ad}_{g_\gamma})_{\alpha \beta} = f^\gamma_{\alpha \beta},$
where $f^\gamma_{\alpha \beta}$ are the structure constants
(`qml.structure_constants`) of the
DLA. They are computed via

$$f^\gamma_{\alpha \beta} = \frac{\text{tr}\left(g_\gamma [g_\alpha, g_\beta] \right)}{||g_\gamma||^2}.$$

The operators in $\mathfrak{g}$ can always be orthonormalized via the
[Gram--Schmidt
process](https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process), in
which case we can drop the denominator. Further, by means of the cyclic
property of the trace, we can rewrite this expression to obtain

$$f^\gamma_{\alpha \beta} = \text{tr}\left(g_\beta [g_\gamma, g_\alpha] \right).$$
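As a quick sanity check of the cyclic-trace rewriting, the following NumPy sketch evaluates both expressions for the single-qubit Pauli operators, rescaled by $1/\sqrt{2}$ so that they are orthonormal under the trace inner product. The choice of $\mathfrak{su}(2)$ is just an assumption made for illustration.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Orthonormal operators: tr(g_a g_b) = delta_ab
g = [P / np.sqrt(2) for P in (X, Y, Z)]

def comm(A, B):
    return A @ B - B @ A

d = len(g)
f_first = np.zeros((d, d, d), dtype=complex)   # tr(g_gamma [g_alpha, g_beta])
f_second = np.zeros((d, d, d), dtype=complex)  # tr(g_beta [g_gamma, g_alpha])

for gamma in range(d):
    for alpha in range(d):
        for beta in range(d):
            f_first[gamma, alpha, beta] = np.trace(g[gamma] @ comm(g[alpha], g[beta]))
            f_second[gamma, alpha, beta] = np.trace(g[beta] @ comm(g[gamma], g[alpha]))

print(np.allclose(f_first, f_second))
```

Both tensors agree element by element, and they are antisymmetric in $\alpha$ and $\beta$ as expected from the commutator.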

From this, we see how $H_S$ corresponds to the adjoint representation
$i \text{ad}_H$ (but we don't require the full Lie algebra here, see
below).

In g-sim, we also evolve expectation vectors
$(\vec{g})_i = \langle g_i \rangle.$ In particular, the circuit of
evolving a state according to $U(\theta)$ and computing expectation
values $\langle g_i \rangle$ then corresponds to evolving $\vec{g}$ by
$\prod_\ell \exp(-i \theta_\ell \text{ad}_{g_\ell}).$

Shadow Hamiltonian simulation can thus be viewed as g-sim with a single,
specific gate $U(\theta) = e^{-i \theta H}$ and parameter $\theta = t,$
and run on a quantum computer.

One striking difference is that, because we only have one specific
"gate", we do not need the full Lie closure of the operators whose
expectation values we want to compute. Instead, here it is sufficient to
choose $O_m$ such that they build up the full support for all
$[H, O_m].$ This could potentially be a significant difference, as the
Lie closure in most cases leads to an exponentially large DLA, though
the scaling of the span of all $[H, O_m]$ is unclear at this point.

## A simple example


The abstract concepts of shadow Hamiltonian simulation are best
illustrated with a simple and concrete example. We are interested in
simulating the Hamiltonian evolution of

$$H = X + Y$$

after a time $t = 1$ and computing the expectation values of
$S = \{X, Y, Z, I \}.$ In the standard formulation, we simply evolve the
initial quantum state $|\psi(0)\rangle = |0\rangle$ by $H$ in the
following way.


In [None]:
import pennylane as qml
import numpy as np
from pennylane import X, Y, Z, I

dev = qml.device("default.qubit")

S = [X(0), Y(0), Z(0), I(0)]
H = X(0) + Y(0)

@qml.qnode(dev)
def evolve(H, t):
    qml.evolve(H, t)
    return [qml.expval(Om) for Om in S]

t = 1.
O_t_standard = np.array(evolve(H, t))
O_t_standard

We evolved a $2^n = 2$ dimensional quantum state and performed $3$
independent (non-commuting) measurements.
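As an independent cross-check that does not rely on PennyLane, the same expectation values can be computed with dense matrices and compared with a closed-form expression. The closed form used here follows from viewing $e^{-itH}$ with $H = X + Y$ as a Bloch-sphere rotation by the angle $2\sqrt{2}t$ about the axis $(1, 1, 0)/\sqrt{2}$; treat it as a worked derivation added for illustration rather than a statement from the paper.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

H = X + Y
t = 1.0

# exp(-i t H) via the eigendecomposition of the Hermitian matrix H
evals, evecs = np.linalg.eigh(H)
U = evecs @ np.diag(np.exp(-1j * t * evals)) @ evecs.conj().T

psi_t = U @ np.array([1, 0], dtype=complex)  # evolve |0>
expvals = np.array([np.real(psi_t.conj() @ O @ psi_t) for O in (X, Y, Z, I2)])

# Bloch-vector rotation of (0, 0, 1) about (1, 1, 0)/sqrt(2) by theta
theta = 2 * np.sqrt(2) * t
analytic = np.array(
    [np.sin(theta) / np.sqrt(2), -np.sin(theta) / np.sqrt(2), np.cos(theta), 1.0]
)

print(expvals)
print(analytic)
```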

In shadow Hamiltonian simulation, we encode $4$ expectation values in a
$2^2 = 4$-dimensional quantum state, i.e., $n_S = 2.$

For this specific example, the number of operators ($M = 4$) is larger
than the dimension of the original Hilbert space ($2^n = 2$), leading to
a shadow system that is larger than the original system. This may or may
not be a clever choice, but the point here is just to illustrate the
conceptual difference between both approaches. The authors of [1] show
various examples where the resulting shadow system is significantly
smaller than the original system. It
should also be noted that having a smaller shadow system may not always
be its sole purpose, as there are conceptually new avenues one can
explore with shadow Hamiltonian simulation, such as sampling from the
distribution $p_m = |\langle O_m \rangle |^2.$

Let us first construct the initial shadow state $\boldsymbol{O}(t=0)$ by
computing
$\langle O_m \rangle_{t=0} = \text{tr}\left(O_m |\psi(0)\rangle \langle \psi(0)| \right)$
with $|\psi(0)\rangle = |0\rangle.$ The `pauli_rep` attribute of
PennyLane operators returns a `qml.pauli.PauliSentence` instance and
lets us efficiently compute the trace, where we use the trick that
$|0 \rangle \langle 0| = (I + Z)/2.$


In [None]:
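S_pauli = [op.pauli_rep for op in S]

# |0><0| = (I + Z)/2; the factor 1/2 is absorbed by the normalized trace below
psi0 = (I(0) + Z(0)).pauli_rep

O_0 = np.zeros(len(S))

for m, Om in enumerate(S_pauli):
    # PauliSentence.trace() returns the normalized trace tr(.)/2^n
    O_0[m] = (psi0 @ Om).trace()

O_0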

There are a variety of methods to encode this vector in a qubit basis,
but we will just be using `qml.StatePrep` later.

We now go on to construct the shadow Hamiltonian $H_S$ by computing the
elements
$(H_S)_{m m'} = -\frac{\text{tr}\left( O_{m'} [H, O_m] \right)}{|| O_{m'} ||^2},$
and we again make use of the `PauliSentence.trace` method.


In [None]:
H_pauli = H.pauli_rep

H_S = np.zeros((len(S), len(S)), dtype=complex)

for m, Om in enumerate(S_pauli):
    com = H_pauli.commutator(Om)  # [H, O_m]
    for mt, Omt in enumerate(S_pauli):
        # projection coefficient: v = ∑ (v · e_j / ||e_j||^2) * e_j
        value = (Omt @ com).trace()
        value = value / (Omt @ Omt).trace()
        H_S[m, mt] = value

H_S = -H_S  # sign convention of the definition, eq. (2) in [1]

In order for the shadow evolution to be unitary and implementable on a
quantum computer, we need $H_S$ to be Hermitian.


In [None]:
np.allclose(H_S, H_S.conj().T)

Knowing that, we can write the formal solution to the shadow Schrödinger
equation as

$$\boldsymbol{O}(t) = \exp\left(-i t H_S \right) \boldsymbol{O}(0).$$


In [None]:
from scipy.linalg import expm

O_t = expm(-1j * t * H_S) @ O_0
O_t

Up to this point, this is equivalent to g-sim, i.e., a purely classical
simulation. Now, the main novelty of shadow Hamiltonian simulation is to
perform this on a quantum computer by encoding the expectation values
$\langle O_m \rangle$ in the amplitudes of a quantum state, and to
translate $H_S$ accordingly.

This can be done by decomposing the numerical matrix $H_S$ into Pauli
operators, which can, in turn, be implemented on a quantum computer.


In [None]:
H_S_qubit = qml.pauli_decompose(H_S)
H_S_qubit
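To see what such a decomposition does under the hood, here is a hedged NumPy sketch that expands the $4 \times 4$ shadow Hamiltonian of this example in two-qubit Pauli strings via $c_{PQ} = \text{tr}\left((P \otimes Q) H_S\right)/4.$ It rebuilds $H_S$ from dense matrices instead of reusing the PennyLane objects, and the ordering of the tensor factors is an assumption that may differ from the wire ordering PennyLane uses.

```python
import itertools
import numpy as np

paulis = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]]),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

# Rebuild H_S for H = X + Y and S = (X, Y, Z, I) with dense matrices
ops = [paulis[k] for k in "XYZI"]
H = paulis["X"] + paulis["Y"]
H_S = np.zeros((4, 4), dtype=complex)
for m, Om in enumerate(ops):
    com = H @ Om - Om @ H
    for mp, Omp in enumerate(ops):
        H_S[m, mp] = -np.trace(Omp @ com) / np.trace(Omp @ Omp)

# Pauli coefficients from the trace inner product
coeffs = {}
for (a, P), (b, Q) in itertools.product(paulis.items(), repeat=2):
    c = np.trace(np.kron(P, Q) @ H_S) / 4
    if abs(c) > 1e-12:
        coeffs[a + b] = c

# Sum the Pauli strings back up to confirm the decomposition
reconstructed = sum(c * np.kron(paulis[k[0]], paulis[k[1]]) for k, c in coeffs.items())
print(coeffs)
```

Because $H_S$ is Hermitian, all coefficients come out real, so the decomposition is a valid qubit Hamiltonian.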

Using all these ingredients, we now are able to formulate the shadow
Hamiltonian simulation as a quantum algorithm. For the amplitude
encoding, we need to make sure that the state is normalized. We keep the
normalization factor (in the code below, `A` is the norm
$\sqrt{\sum_m |\langle O_m \rangle|^2}$ of the vector) to later retrieve
the correct result.


In [None]:
A = np.linalg.norm(O_0)

@qml.qnode(dev)
def shadow_evolve(H_S_qubit, O_0, t):
    qml.StatePrep(O_0 / A, wires=range(2))
    qml.evolve(H_S_qubit, t)
    return qml.state()

O_t_shadow = shadow_evolve(H_S_qubit, O_0, t) * A

print(O_t_standard)
print(O_t_shadow)

We see that the results of both approaches match.

The first result is coming from three independent measurements on a
quantum computer after evolution with system Hamiltonian $H.$ This is
conceptually very different from the second result where
$\boldsymbol{O}$ is encoded in the state of the shadow system (note the
`qml.state()` return), which we evolved according to $H_S.$

In the first case, the expectation values are obtained directly from
measurements. In shadow Hamiltonian simulation, however, we need to
access the amplitudes of the underlying state. This can be done naively
with state tomography, but in instances where we know that
$\langle O_m \rangle \geq 0,$ we can just sample bitstrings according to
$p_m = |\langle O_m\rangle|^2.$ The ability to sample from such a
distribution is a unique and new feature of shadow Hamiltonian
simulation.
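The sampling idea can be mimicked classically. In the sketch below, the vector `O_t` holds assumed, illustrative values rather than the output of an actual run, and the normalization constant $A$ is taken as known. Sampling only gives access to the magnitudes $|\langle O_m \rangle|,$ which is why knowledge of the signs is needed.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed (illustrative) values for <X>, <Y>, <Z>, <I>; not taken from a run
O_t = np.array([0.2, -0.2, -0.95, 1.0])

A = np.sum(np.abs(O_t) ** 2)  # normalization constant
p = np.abs(O_t) ** 2 / A      # measurement probabilities of the shadow state

# Simulate computational-basis measurements of the shadow state
shots = 100_000
samples = rng.choice(len(O_t), size=shots, p=p)
freqs = np.bincount(samples, minlength=len(O_t)) / shots

# Estimate |<O_m>| from the observed frequencies
abs_estimates = np.sqrt(freqs * A)
print(abs_estimates)
```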

We should also note that we made use of the abstract quantum
sub-routines `qml.evolve` and `qml.StatePrep`, which each warrant their
specific implementation. For example, `qml.StatePrep` can be realized
by `qml.MottonenStatePreparation` and `qml.evolve` can be realized by
`qml.TrotterProduct`, though that is not the focus of this demo.


## Conclusion


We introduced the basic concepts of shadow Hamiltonian simulation and
learned how it fundamentally differs from the common approach to
Hamiltonian simulation.

We have seen how shadow Hamiltonian simulation is tightly connected to
g-sim, but run on a quantum computer. A significant difference comes
from the fact that the authors of [1] specifically look at Hamiltonian
simulation, $\exp(-i t H),$ which allows us to just look at operators
$O_m$ that support all commutators $[H, O_m],$ instead of the full Lie
closure. This may be an advantage, because Lie algebras in quantum
computing typically scale exponentially. However, the scaling of such
sets of operators is unclear at this point and needs further
investigation.

Note that even in the case of an exponentially sized set of operators,
we have --- at least in principle --- an exponentially large state
vector to store the $M \leq 2^{n_S}$ values. In the absolute worst case
we have $\mathfrak{su}(2^n)$ with a dimension of $2^{2n}-1,$ so
$n_S = 2n$ and thus it is just doubling the number of qubits.
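The qubit count in this worst case is easy to verify. The helper name `shadow_qubits` below is hypothetical; it simply returns the smallest $n_S$ with $2^{n_S} \geq M.$

```python
def shadow_qubits(M):
    """Smallest number of qubits n_S such that 2**n_S >= M."""
    return (M - 1).bit_length()

# Worst case: the full algebra su(2^n) with dimension 4^n - 1
for n in range(1, 6):
    M = 4**n - 1
    print(n, M, shadow_qubits(M))
```

For every $n$ in this range the shadow register needs exactly $2n$ qubits.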

The biggest potential of this new perspective on Hamiltonian simulation
most likely lies in finding interesting applications like [8] or [9]
that naturally encode the problem and allow for efficient retrieval of
all the relevant information.


## References


1. Rolando D. Somma, Robbie King, Robin Kothari, Thomas O’Brien, Ryan Babbush “Shadow Hamiltonian Simulation” arXiv:2407.21775, 2024.
2. Rolando D. Somma “Quantum Computation, Complexity, and Many-Body Physics” arXiv:quant-ph/0512209, 2005.
3. Rolando Somma, Howard Barnum, Gerardo Ortiz, Emanuel Knill “Efficient solvability of Hamiltonians and limits on the power of some quantum computational models” arXiv:quant-ph/0601030, 2006.
4. Victor Galitski “Quantum-to-Classical Correspondence and Hubbard-Stratonovich Dynamical Systems, a Lie-Algebraic Approach” arXiv:1012.2873, 2010.
5. Matthew L. Goh, Martin Larocca, Lukasz Cincio, M. Cerezo, Frédéric Sauvage “Lie-algebraic classical simulations for variational quantum computing” arXiv:2308.01432, 2023.
6. Roeland Wiersema, Efekan Kökcü, Alexander F. Kemper, Bojko N. Bakalov “Classification of dynamical Lie algebras for translation-invariant 2-local spin systems in one dimension” arXiv:2309.05690, 2023.
7. Gerard Aguilar, Simon Cichy, Jens Eisert, Lennart Bittel “Full classification of Pauli Lie algebras” arXiv:2408.00081, 2024.
8. Ryan Babbush, Dominic W. Berry, Robin Kothari, Rolando D. Somma, Nathan Wiebe “Exponential quantum speedup in simulating coupled classical oscillators” arXiv:2303.13012, 2023.
9. Alice Barthe, M. Cerezo, Andrew T. Sornborger, Martin Larocca, Diego García-Martín “Gate-based quantum simulation of Gaussian bosonic circuits on exponentially many modes” arXiv:2407.06290, 2024.
