# Optimizing a Tensor Network using PyTorch


In this example we show how a general machine learning
strategy can be used to optimize tensor networks with
respect to some target loss function.

We'll take the example of maximizing the overlap of some
matrix product state with periodic boundary conditions
with a densely represented state, since this does not
have a simple, deterministic alternative.

``quimb`` makes use of ``opt_einsum``, which can contract
tensors with a variety of backends. Here we'll use
``pytorch``. Note that when this example was written pytorch
did not support complex data, so everything here is real
(which also means we don't need to conjugate using
the ``.H`` attribute).

In [1]:
import torch

import quimb as qu
import quimb.tensor as qtn

# perform all contractions with pytorch
qtn.set_contract_backend('torch')

First, find a (dense) PBC groundstate, $| gs \rangle$:

In [2]:
L = 16
H = qu.ham_heis(L, sparse=True, cyclic=True)
gs = qu.groundstate(H)

Then we convert it to a (constant) torch array:

In [3]:
# this converts the dense vector to an effective 1D tensor network
target = qtn.Dense1D(gs)

# this maps the torch.tensor function over all the data arrays, here only one
target.apply_to_arrays(torch.tensor)

Next we create a random MPS, $|\psi\rangle$, as an initial guess, also converting each
of its arrays to torch tensors (but now requiring the
gradient so that each can be optimized):

In [4]:
bond_dim = 32
mps = qtn.MPS_rand_state(L, bond_dim, cyclic=True)
mps.apply_to_arrays(lambda t: torch.tensor(t, requires_grad=True))
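
For intuition about what the cyclic MPS represents, here is a minimal numpy-only sketch (independent of ``quimb``) in which each amplitude is the trace of a product of matrices, one per site. The toy sizes, the index order ``(left bond, physical, right bond)`` and the names ``tensors`` and ``amplitude`` are assumptions made for this illustration, not ``quimb``'s internal conventions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
L, d, D = 4, 2, 3  # sites, physical dimension, bond dimension (toy sizes)

# one tensor per site with index order (left bond, physical, right bond);
# this ordering is a choice made for this sketch only
tensors = [rng.standard_normal((D, d, D)) for _ in range(L)]

def amplitude(config):
    """Amplitude of one basis configuration: the trace of a matrix product."""
    mat = np.eye(D)
    for A, s in zip(tensors, config):
        mat = mat @ A[:, s, :]
    # the trace joins the last bond back to the first (periodic boundary)
    return np.trace(mat)

# dense vector with d**L entries, first site as the most significant index
dense = np.array([amplitude(c) for c in itertools.product(range(d), repeat=L)])

# the same state from a single contraction over all (cyclic) bonds
dense_einsum = np.einsum('aib,bjc,ckd,dla->ijkl', *tensors).reshape(-1)
```

The overlap with a dense target is then just an inner product with ``dense``, but the dense vector grows as ``d**L``, which is why the tensor network form is contracted directly in the rest of this example.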

Last, we set up a ``pytorch`` optimizer, taking as the loss
the negative of the normalized target overlap $\dfrac{|\langle gs | \psi \rangle|^2} { \langle \psi | \psi \rangle }$:

In [5]:
# we give the optimizer all the tensors it should optimize
optimizer = torch.optim.Adam([t.data for t in mps], lr=0.01)

# perform 100 steps of optimization
for t in range(1, 101):
    
    # negate the overlap as we are minimizing
    loss = - (mps @ target)**2 / (mps @ mps)
    
    # reset, compute the gradient, and take an optimization step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    if t % 10 == 0:
        print(f"round: {t}, loss: {loss.item()}")

round: 10, loss: -0.9405585461294388
round: 20, loss: -0.9788279069737648
round: 30, loss: -0.9925598607587998
round: 40, loss: -0.9965893422669746
round: 50, loss: -0.997983439965575
round: 60, loss: -0.9986126769786011
round: 70, loss: -0.9989872559043187
round: 80, loss: -0.9991876671618736
round: 90, loss: -0.9993254027902169
round: 100, loss: -0.9994244319138171


We now have a fidelity of about 0.9994 between our PBC MPS ansatz and the target groundstate.
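
The same optimization pattern can be sketched with ``pytorch`` alone, which may help separate what the optimizer does from what ``quimb`` does. In this hedged toy version a plain vector ``psi`` stands in for the MPS tensors and a random unit vector stands in for the groundstate; the sizes, seed, learning rate and step count are arbitrary choices for the illustration.

```python
import torch

torch.manual_seed(0)

# a fixed, normalized target vector (stands in for the dense groundstate)
target = torch.randn(8, dtype=torch.float64)
target = target / target.norm()

# trainable parameters (stand in for the MPS tensors)
psi = torch.randn(8, dtype=torch.float64, requires_grad=True)

optimizer = torch.optim.Adam([psi], lr=0.05)

for step in range(300):
    # negative normalized overlap, so minimizing maximizes the fidelity
    loss = -(psi @ target) ** 2 / (psi @ psi)

    optimizer.zero_grad()  # clear gradients from the previous step
    loss.backward()        # backpropagate to fill psi.grad
    optimizer.step()       # update psi in place

fidelity = -loss.item()
```

In the tensor network case the only difference is that the loss is built from contractions (``mps @ target`` and ``mps @ mps``) rather than from vector dot products.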

Although the loss includes the normalization, the MPS tensors themselves are not constrained to unit norm, so the state still needs to be normalized:

In [6]:
mps /= (mps @ mps)**0.5
mps @ mps

tensor(1.0000, dtype=torch.float64, grad_fn=<AsStridedBackward>)

And finally we can check that the overlap matches the loss found:

In [7]:
(mps @ target)**2

tensor(0.9994, dtype=torch.float64, grad_fn=<PowBackward0>)
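
The reason the explicit normalization is needed, and why the plain squared overlap afterwards matches the loss, is that the normalized overlap is unchanged when $|\psi\rangle$ is rescaled: replacing $|\psi\rangle$ by $c|\psi\rangle$ multiplies numerator and denominator by the same $c^2$. A small numpy check of this, using random real vectors as stand-ins for the states (the names and sizes here are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
gs = rng.standard_normal(16)
gs /= np.linalg.norm(gs)        # normalized target
psi = rng.standard_normal(16)   # unnormalized trial state

def fidelity(psi, gs):
    """Normalized overlap <gs|psi>^2 / <psi|psi> for real vectors."""
    return (gs @ psi) ** 2 / (psi @ psi)

f_raw = fidelity(psi, gs)
f_scaled = fidelity(3.7 * psi, gs)  # rescaling leaves the value unchanged

# after normalizing, the plain squared overlap gives the same number
psi_normalized = psi / np.sqrt(psi @ psi)
f_plain = (gs @ psi_normalized) ** 2
```

Because the loss cannot see the overall scale, the optimizer has no reason to keep the norm at one, which is why the final state is normalized by hand.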

Other things to think about might be:

- playing with the optimizer type (here Adam) and settings (e.g. learning rate)
- using single precision data for GPU acceleration
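
As a rough sketch of the single precision suggestion, the conversion function handed to ``apply_to_arrays`` could set the dtype and device explicitly. The helper name ``to_torch_single`` is hypothetical, and the snippet falls back to the CPU when no CUDA device is available.

```python
import numpy as np
import torch

# use a GPU when one is visible to torch, otherwise stay on the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def to_torch_single(x, requires_grad=False):
    """Hypothetical helper: numpy array -> float32 torch tensor on `device`."""
    return torch.tensor(x, dtype=torch.float32, device=device,
                        requires_grad=requires_grad)

x = np.random.default_rng(0).standard_normal((4, 2, 4))  # float64 numpy data
t = to_torch_single(x, requires_grad=True)
```

With this in place, something like ``mps.apply_to_arrays(lambda x: to_torch_single(x, requires_grad=True))`` would replace the conversion used above. The target would presumably need the same dtype and device, since ``pytorch`` generally refuses to contract tensors of mixed precision or on different devices.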

We can also convert the ``pytorch`` arrays back to numpy with:

In [8]:
# 'detach' unlinks each tensor from the autograd graph so it can be converted
mps.apply_to_arrays(lambda t: t.detach().numpy())

In [9]:
type(mps[4].data)

numpy.ndarray