## Tutorial 3: Gauge Freedom

### As found [here](https://www.tensors.net/p-tutorial-3).

In this tutorial we will learn about manipulating the gauge freedom in TN, and how this freedom can be exploited in order to achieve an optimal decomposition of a tensor within a network. Topics include:

* Tree TN
* Gauge freedom in TN
* Shifting the center of orthogonality
* Tensor decompositions within networks

In [1]:
import numpy as np
from numpy import linalg as LA
from ncon import ncon

### Tree tensor networks

In this tutorial we shall focus on TN that do not possess closed loops, known as tree TN. These networks possess many nice properties that networks containing closed loops lack and are thus much easier to manipulate. However, most of the results presented in this tutorial regarding gauge freedom can be generalized to the case of TN containing closed loops, as discussed [here](https://arxiv.org/abs/1801.05390).

In the following figure we can see an example of a tree TN. If we select a tensor to act as the center (or root node) then it is always possible to understand the tree TN as being composed of a set of distinct **branches** extending from this chosen tensor. The right side of the following figure depicts the four branches that extend from the order-4 tensor $A$ on the left side of the figure. Importantly, connections between the different branches are not possible in networks without closed loops (hence the name tree).

<img src="img/Fig18.png" alt="drawing" width="600"/>

### Gauge freedom

Let $T$ be a TN that, under contraction of all internal indices, evaluates to some tensor $D$. In this tutorial we shall be concerned with the uniqueness of the decomposition: is there a different choice of tensors within the TN that will still evaluate to the same tensor $D$?

The answer is yes! As shown in the figure below, on any internal index one can introduce a resolution of the identity, i.e. an invertible matrix $X$ together with its inverse $X^{-1}$, without changing what the TN evaluates to. Absorbing one of these matrices into each adjoining tensor, however, does change the content of those tensors (while leaving the geometry of the TN unchanged). Thus we conclude that there are infinitely many choices of tensors such that the TN evaluates to the same fixed output tensor. We call this the gauge freedom of the TN.

<img src="img/Fig19.png" alt="drawing" width="300"/>
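
The gauge change in the figure can be checked numerically. The following is only a minimal sketch under stated assumptions: it uses `np.einsum` instead of `ncon` so that it runs with NumPy alone, and the tensor names `P`, `Q` and the gauge matrix `X` are hypothetical, chosen just for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# Two tensors sharing one internal index (last index of P, first index of Q)
P = rng.random((d, d, d))
Q = rng.random((d, d, d))

# Any invertible X gives a resolution of the identity X @ X^{-1}.
# The diagonal shift makes X strictly diagonally dominant, hence invertible.
X = rng.random((d, d)) + d * np.eye(d)
Xinv = np.linalg.inv(X)

# Absorb X into P and X^{-1} into Q
P_new = np.einsum('abk,kc->abc', P, X)
Q_new = np.einsum('ck,kde->cde', Xinv, Q)

# Contract the internal index before and after the gauge change
H_old = np.einsum('abk,kde->abde', P, Q)
H_new = np.einsum('abk,kde->abde', P_new, Q_new)

rel_diff = np.linalg.norm(H_old - H_new) / np.linalg.norm(H_old)
tensors_changed = not np.allclose(P, P_new)
print(rel_diff, tensors_changed)
```

The individual tensors differ after the gauge change, while the contracted network agrees up to floating-point error.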

While in some respects the gauge freedom is a nuisance, it can also be exploited to simplify many types of operations on TN. Indeed, most TN algorithms require fixing the gauge in a prescribed manner in order to function correctly. We now discuss several ways to fix the gauge in such a way as to create a *center of orthogonality*, and the utility of doing so.

### Creating a center of orthogonality

**Def**: Let $T:\{ A, B, C, ... \}$ be a tree TN, then a tensor $A$ is a center of orthogonality if, for every branch of the network attached to $A$, the branch forms an isometry between its open indices and the index connected to tensor $A$.

This is easier to see with an example:

<img src="img/Fig3.png" width='500'/>

Here, the tensor $A$ from the network $T$ (left-side image) is a center of orthogonality iff the constraints of the right-side image are satisfied, which demand that each of the branches connected to $A$ forms an isometry.
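
The isometry constraint on a branch can be phrased as a matrix statement: group all open indices of the branch into a row index and keep the index connected to $A$ as the column index, giving a matrix $W$; the branch is an isometry when $W^\dagger W = I$. The helper `is_isometry` below is a hypothetical name used only for this sketch, and it assumes the connecting index is stored last.

```python
import numpy as np

def is_isometry(branch, tol=1e-10):
    """Check W^dagger W = I, where the LAST index of `branch` is the one
    connected to the center and all other indices are treated as open."""
    dim_center = branch.shape[-1]
    W = branch.reshape(-1, dim_center)
    return np.allclose(W.conj().T @ W, np.eye(dim_center), atol=tol)

rng = np.random.default_rng(1)
d = 3

# A generic random tensor is not an isometry ...
generic = rng.random((d, d, d))

# ... but the Q factor of a QR decomposition is, by construction
Qmat, _ = np.linalg.qr(generic.reshape(d**2, d))
isometric = Qmat.reshape(d, d, d)

print(is_isometry(generic), is_isometry(isometric))
```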

We now discuss two different methods for changing the gauge in network $T$ to make any tensor $A$ into a center of orthogonality. Later we will reveal the significance of doing so.

#### Method 1: 'Pulling through'

Here we describe a method for setting a tensor $A$ within a network $T$ as a center of orthogonality through iterative use of the QR decomposition. (The SVD could also be used, but QR is computationally faster.)

The idea is that if we transform every individual tensor within a branch into a (properly oriented) isometry, then the entire branch collectively becomes an isometry and thus satisfies the definition of a center of orthogonality. The method can be decomposed into 3 steps:

1) Begin by orienting each index with an arrow that points towards the chosen center tensor $A$:

<img src="img/Fig5.png" width='200'/>

2) Then, starting from a tensor at the tip of a branch, perform a QR decomposition on the tensor (under the partition between incoming and outgoing arrows). Next redefine the tensor in question as the orthogonal 'Q' part of the QR decomposition and absorb the 'R' matrix into the tensor connected to the outgoing arrow:

<img src="img/Fig6.png" width='300'/>

3) Repeat, working inwards, until all tensors are isometric w.r.t. their incoming and outgoing arrows. Tensor $A$ is now a center of orthogonality.

<img src="img/Fig7.png" width='300'/>

Final result:

<img src="img/Fig8.png" width='600'/>

Now we will see an example code that implements this method and checks that the initial and final TN still contract to the same tensor. Note that we use a convention such that tensor indices are ordered from left-to-right along the bottom, then left-to-right along the top.

In [6]:
# Example: Creating a center of orthogonality by 'pulling through'
d = 3

# Define tensors
A = np.random.rand(d,d,d,d)
# branch 1 (attached to index 1 of A):
B = np.random.rand(d,d,d); D = np.random.rand(d,d,d); E = np.random.rand(d,d,d)
# branch 2 (attached to index 3 of A):
F = np.random.rand(d,d,d)
# branch 3 (attached to index 4 of A):
C = np.random.rand(d,d,d); G = np.random.rand(d,d,d)

# Iterate QR decompositions
# branch 1:
DQ, DR = LA.qr(D.reshape(d**2,d)); DQ = DQ.reshape(d,d,d)
EQ, ER = LA.qr(E.reshape(d**2,d)); EQ = EQ.reshape(d,d,d)
Bt = ncon([B,DR,ER],[[1,2,-3],[-1,1],[-2,2]]) # DULR ordering of indices

BQ, BR = LA.qr(Bt.reshape(d**2,d)); BQ = BQ.reshape(d,d,d)

# branch 2:
FQ, FR = LA.qr(F.reshape(d**2,d)); FQ = FQ.reshape(d,d,d)

# branch 3:
GQ, GR = LA.qr(G.reshape(d**2,d)); GQ = GQ.reshape(d,d,d)
Ct = ncon([C,GR],[[1,-2,-3],[-1,1]])
CQ, CR = LA.qr(Ct.reshape(d**2,d)); CQ = CQ.reshape(d,d,d)

# A'
Ap = ncon([A,BR,FR,CR],[[1,-2,2,3],[-1,1],[-3,2],[-4,3]])

# T is now formed by {Ap, BQ, CQ, DQ, EQ, FQ, GQ}


# Check that both TN evaluate to the same tensor
# I've used different index notation than the original source
connections = [[1,-5,2,3],[4,5,1],[6,-10,3],[-1,-2,4],[-3,-4,5],[-6,-7,2],[-8,-9,6]]
H0 = ncon([A,B,C,D,E,F,G], connections)
H1 = ncon([Ap,BQ,CQ,DQ,EQ,FQ,GQ], connections)

dH = LA.norm(H0-H1)/LA.norm(H0)

print(dH)

1.0230833475924413e-15
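
As a further check on the example above, one can verify that a branch really has become an isometry after the QR sweep, and that the leftover `R` factor is exactly what must be absorbed into the center. The sketch below repeats branch 1 (tensors `B`, `D`, `E`) with `np.einsum` in place of `ncon`, which is an assumption made only so that it runs with NumPy alone; the names `branch_old` and `branch_Q` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
B = rng.random((d, d, d)); D = rng.random((d, d, d)); E = rng.random((d, d, d))

# QR on the tensors at the tips of the branch
DQ, DR = np.linalg.qr(D.reshape(d**2, d)); DQ = DQ.reshape(d, d, d)
EQ, ER = np.linalg.qr(E.reshape(d**2, d)); EQ = EQ.reshape(d, d, d)

# Absorb both R factors into B, then QR the result
Bt = np.einsum('ai,bj,ijc->abc', DR, ER, B)
BQ, BR = np.linalg.qr(Bt.reshape(d**2, d)); BQ = BQ.reshape(d, d, d)

# Whole branch as a matrix: (four open indices) x (index pointing to A)
branch_old = np.einsum('pqa,rsb,abc->pqrsc', D, E, B).reshape(d**4, d)
branch_Q = np.einsum('pqa,rsb,abc->pqrsc', DQ, EQ, BQ).reshape(d**4, d)

# The new branch is an isometry, and the old branch equals new branch times BR
is_iso = np.allclose(branch_Q.conj().T @ branch_Q, np.eye(d))
same_branch = np.allclose(branch_old, branch_Q @ BR)
print(is_iso, same_branch)
```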


#### Method 2: Direct Orthogonalization

Using the TN from Method 1 as an example, we now describe a method for setting $A$ as a center of orthogonality directly, using a single eigendecomposition for each branch. The steps are:

1) Begin by computing the positive-definite density matrix $\rho$ associated to each index about the chosen center $A$; this is given by contracting the open indices from a branch with the corresponding open indices from the conjugate of the branch as seen below.

<img src="img/Fig26.png" width='600'/>

2) Then compute the principal square root $X_i$ of each of the density matrices $\rho_i = X_i^\dagger X_i$.

3) Finally, we make the change of gauge on each of the indices of tensor $A$ using the appropriate $X$ matrix and its corresponding inverse, as depicted in the figure below. Tensor $A$ is now a center of orthogonality.

<img src="img/Fig21.png" width='600'/>

In [7]:
# Example: Creating a center of orthogonality with 'direct orthogonalization'
d = 3

# Define tensors
A = np.random.rand(d,d,d,d)
# branch 1 (attached to index 1 of A):
B = np.random.rand(d,d,d); D = np.random.rand(d,d,d); E = np.random.rand(d,d,d)
# branch 2 (attached to index 3 of A):
F = np.random.rand(d,d,d)
# branch 3 (attached to index 4 of A):
C = np.random.rand(d,d,d); G = np.random.rand(d,d,d)

# Compute density matrices and their principal square roots
rho1 = ncon([B,D,E,B,D,E],[[5,6,-2],[1,2,5],[3,4,6],[7,8,-1],[1,2,7],[3,4,8]])
rho2 = ncon([F,F],[[1,2,-2],[1,2,-1]])
rho3 = ncon([C,G,C,G],[[3,5,-2],[1,2,3],[4,5,-1],[1,2,4]])

d1, u1 = LA.eigh(rho1); sq_d1 = np.sqrt(abs(d1))
d2, u2 = LA.eigh(rho2); sq_d2 = np.sqrt(abs(d2))
d3, u3 = LA.eigh(rho3); sq_d3 = np.sqrt(abs(d3))

X1 = u1 @ np.diag(sq_d1) @ u1.T; X1inv = u1 @ np.diag(1./sq_d1) @ u1.T
X2 = u2 @ np.diag(sq_d2) @ u2.T; X2inv = u2 @ np.diag(1./sq_d2) @ u2.T
X3 = u3 @ np.diag(sq_d3) @ u3.T; X3inv = u3 @ np.diag(1./sq_d3) @ u3.T

# Execute gauge changes (part 3)
Ap = ncon([A,X1,X2,X3],[[1,-2,2,3],[-1,1],[-3,2],[-4,3]])
Bp = ncon([B,X1inv],[[-1,-2,1],[1,-3]])
Fp = ncon([F,X2inv],[[-1,-2,1],[1,-3]])
Cp = ncon([C,X3inv],[[-1,-2,1],[1,-3]])

# T is now formed by: {Ap, Bp, Cp, D, E, Fp, G}

# Now check that both TN evaluate to the same tensor
connections = [[3,-5,4,5],[1,2,3],[6,-10,5],[-1,-2,1],[-3,-4,2],[-6,-7,4],[-8,-9,6]]
H0 = ncon([A,B,C,D,E,F,G],connections)
H1 = ncon([Ap,Bp,Cp,D,E,Fp,G],connections)
dH = LA.norm(H0 - H1) / LA.norm(H0)

print(dH)

1.889982462987246e-15
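
The check above confirms that the network is unchanged, but not that the branches have become isometries. A minimal sketch of that second check, for a single-tensor branch like `F`, is given below. It assumes real tensors with the connecting index stored last and uses plain matrix products instead of `ncon`; the names `Fmat` and `rho_new` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3

# Single-tensor branch; the last index is the one connected to the center
F = rng.random((d, d, d))
Fmat = F.reshape(d**2, d)

# Density matrix on the connecting index
rho = Fmat.conj().T @ Fmat

# Principal square root and its inverse from the eigendecomposition
evals, evecs = np.linalg.eigh(rho)
sq = np.sqrt(np.abs(evals))
X = evecs @ np.diag(sq) @ evecs.conj().T
Xinv = evecs @ np.diag(1.0 / sq) @ evecs.conj().T

# Gauge change: the branch absorbs X^{-1} (the center would absorb X)
Fp = (Fmat @ Xinv).reshape(d, d, d)

# After the gauge change the density matrix of the branch is the identity
rho_new = Fp.reshape(d**2, d).conj().T @ Fp.reshape(d**2, d)
print(np.allclose(X @ X, rho), np.allclose(rho_new, np.eye(d)))
```

Because $X$ is Hermitian, $X^\dagger X = X X$, which is why the sketch compares `X @ X` with `rho`.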


#### Comments:

Both methods have their own advantages and the preferred method may depend on the specific application in mind.

Nevertheless, in practice direct orthogonalization is typically computationally cheaper and easier to execute. In addition, it only requires changing the gauge on the indices connected to the center, whereas 'pulling through' changes the gauge on all indices of the TN. There are some applications where it is desirable to make every tensor into an isometry, so 'pulling through' is preferred there. Moreover, 'pulling through' can be advantageous if high precision is desired, as the errors due to floating-point arithmetic are smaller.

### Tensor decompositions within networks

In tutorial 2 we described how the SVD can be applied to optimally decompose a tensor into a product with some restricted rank. Here we take this concept a step further and describe how, by creating a center of orthogonality, a tensor within a network can be optimally decomposed so as to minimize the global error from the entire network.

Let us consider the TN we have been working with in the two methods above. If we replace tensor $A$ with some new tensor $A'$, the TN will evaluate to a different tensor. We call $H$ the tensor to which the network evaluates when it contains $A$, and $H'$ the tensor to which it evaluates when it contains $A'$.

**Theorem**: If tensor $A$ is a center of orthogonality, then the local difference between tensors $||A-A'||$ precisely equals the global difference between the networks $||H-H'||$.

**Proof**: Because $A$ is a center of orthogonality, each of the branches contracts to the identity with its Hermitian conjugate. Thus, when taking the trace, $\text{tr}(H^\dagger H) = \text{tr}(A^\dagger A)$, and likewise $\text{tr}(H'^\dagger H') = \text{tr}(A'^\dagger A')$. The branches also cancel in the scalar product of $H$ with $H'$, since we have only replaced $A$ by $A'$ and left the branches unchanged, so $\text{tr}(H^\dagger H') = \text{tr}(A^\dagger A')$. Expanding the Frobenius norm as $|| H - H' ||^2 = \text{tr}(H^\dagger H) - 2\,\text{Re}\,\text{tr}(H^\dagger H') + \text{tr}(H'^\dagger H')$, and doing the same for $A - A'$, it follows that $|| H - H' || = || A - A' ||$. Diagrammatically, all I'm saying is:

<img src="img/Fig25.png" width='600'/>
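
The theorem is easy to test numerically on a small network. The sketch below is an illustration under assumptions, not the network from the figures: it builds a hypothetical three-tensor network in which an order-3 center `A` is attached to two isometric branches `U` and `V` (generated with the hypothetical helper `random_isometry`), perturbs the center, and compares the local and global differences.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 3

def random_isometry(rows, cols):
    """Return a (rows x cols) matrix with orthonormal columns via QR."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q

# Two isometric branches; index order is (open, open, bond to center)
U = random_isometry(d**2, d).reshape(d, d, d)
V = random_isometry(d**2, d).reshape(d, d, d)

# Center tensor with index order (bond to U, open, bond to V), and a perturbed copy
A = rng.standard_normal((d, d, d))
A_prime = A + 0.1 * rng.standard_normal((d, d, d))

def contract(center):
    # H[p,q,m,r,s] = sum_{a,b} U[p,q,a] center[a,m,b] V[r,s,b]
    return np.einsum('pqa,amb,rsb->pqmrs', U, center, V)

local_diff = np.linalg.norm(A - A_prime)
global_diff = np.linalg.norm(contract(A) - contract(A_prime))
print(local_diff, global_diff)
```

The two printed numbers should agree up to floating-point error; if `U` or `V` were replaced by generic (non-isometric) tensors they generally would not.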

**Corollary**: If the center of orthogonality tensor $A$ is replaced with a product of tensors as $A' = A_L \cdot A_R$, then the optimal restricted rank approximation for $A$ is also optimal for minimizing the global difference $||H - H'||$.

This corollary turns out to be an exceptionally useful result. An important task in many TN algorithms is to decompose a tensor that resides within a network into a product of tensors in such a way as to minimize the global error. For instance, we may wish to replace $A$ with a minimal rank product $A_L \cdot A_R$ so that it minimizes $|| H - H' ||$.

This could've been a demonically hard problem, but this corollary implies a straight-forward solution. What we have to do is to appropriately fix the gauge degrees of freedom so that $A$ gets transformed into a center of orthogonality, which then implies that the global error becomes equivalent to the local error of the decomposition. We can then use the optimal single tensor decomposition based on the SVD (see tutorial 2) which will achieve the desired goal of minimizing the global error $|| H - H' ||$:

<img src="img/Fig24.png" width='600'/>
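
To make the corollary concrete, the sketch below truncates a center of orthogonality with the SVD and compares three numbers: the local error $||A - A_L A_R||$, the global error $||H - H'||$, and the norm of the discarded singular values. It is a simplified illustration under assumptions: the center is taken to be a plain $d \times d$ matrix sitting between two isometric branches `U` and `V`, and `random_isometry` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(4)
d, chi = 4, 2

def random_isometry(rows, cols):
    """Return a (rows x cols) matrix with orthonormal columns via QR."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q

# Center of orthogonality: a d x d matrix A between two isometric branches
U = random_isometry(d**2, d)
V = random_isometry(d**2, d)
A = rng.standard_normal((d, d))
H = U @ A @ V.T

# Rank-chi decomposition of the center via truncated SVD,
# splitting the singular values evenly between the two factors
u, s, vh = np.linalg.svd(A)
A_L = u[:, :chi] @ np.diag(np.sqrt(s[:chi]))
A_R = np.diag(np.sqrt(s[:chi])) @ vh[:chi, :]
H_prime = U @ (A_L @ A_R) @ V.T

local_err = np.linalg.norm(A - A_L @ A_R)
global_err = np.linalg.norm(H - H_prime)
discarded = np.sqrt(np.sum(s[chi:] ** 2))
print(local_err, global_err, discarded)
```

All three values should coincide up to floating-point error, which is the statement that the locally optimal truncation is also globally optimal.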

#### Outlook:

In this tutorial we have gained an understanding of why TN possess gauge freedom, as well as how to exploit this freedom to create a center of orthogonality, which then greatly helps us in minimizing the global error. Many important TN methods, such as the DMRG algorithm, rely heavily on these concepts. In tutorial 4 we shall consider some extensions to these ideas, focusing more thoroughly on multi-stage tensor decompositions, as well as how gauge freedom can be fixed to bring a network into canonical form.

## Problem Set 3:

We're given the following TN:

<img src="img/Fig27.png" width='600'/>

We also define tensors $B$ and $C$ to be equal to $A$, $B = A = C$, and assume that all tensor indices are of dimension $d = 12$.

**(a)** Contract the TN to form the tensor $H$ explicitly. Evaluate the norm $||H||$.

In [2]:
# (a) Define A, B, C, form H and evaluate its norm.
d = 12

A = np.zeros((d,d,d))

for i in range(d):
    for j in range(d):
        for k in range(d):
            A[i,j,k] = np.sqrt(i + 2.*j + 3.*k + 6)
            
B = A
C = A

H = ncon([A,B,C],[[-1,-2,1],[1,-3,2],[2,-4,-5]])

LH = LA.norm(H)

print(LH)

17331274.43296364


**(b)** Use the truncated SVD to optimally decompose $C$ into a rank $\chi = 2$ product of tensors, $C \to C_L \cdot C_R$, as depicted in the following picture. Contract the new TN to form a single tensor $H_1$. Compute the truncation error $\varepsilon = || H - H_1 || / ||H||$.

<img src="img/Fig28.png" width='500'/>

In [4]:
# (b) truncated SVD
um,sm,vhm = LA.svd(C.reshape(d,d**2))
chi = 2
CL = um[:,:chi] @ np.diag(np.sqrt(sm[:chi]))
CR = (np.diag(np.sqrt(sm[:chi])) @ vhm[:chi,:]).reshape(chi,d,d)

H1 = ncon([A,B,CL,CR],[[-1,-2,1],[1,-3,2],[2,3],[3,-4,-5]])
err1 = LA.norm(H - H1) / LH

print(err1)

9.458269336371587e-07


**(c)** Starting from the original network, transform tensor $C$ into a center of orthogonality using the "pulling through" method based on the QR decomposition, obtaining a new network of tensors $\{ A', B', C' \}$. Evaluate this network to a single tensor $H'$ and then check that $|| H - H' || = 0$.

In [7]:
# (c) Transform C into center of orthogonality using the "pulling through" method

AQ, AR = LA.qr(A.reshape(d**2,d)); AQ = AQ.reshape(d,d,d)

Bt = ncon([AR,B],[[-1,1],[1,-2,-3]])

BQ, BR = LA.qr(Bt.reshape(d**2,d)); BQ = BQ.reshape(d,d,d)

Ct = ncon([BR,C],[[-1,1],[1,-2,-3]])

Ht = ncon([AQ,BQ,Ct],[[-1,-2,1],[1,-3,2],[2,-4,-5]])

print(LA.norm(H - Ht)) # Should be zero up to floating-point error: with ||H|| ~ 1.7e7, an absolute difference of order 1e-9 is a relative error of order 1e-16.

6.426172660496248e-09


**(d)** Repeat the truncation step from part (b) on the transformed tensor $C' \to C'_L \cdot C'_R$, again keeping rank $\chi = 2$. Contract the new TN $\{ A', B', C'_L, C'_R \}$ to form a single tensor $H'_1$. Compute the truncation error $\varepsilon' = || H - H'_1 || / ||H||$ and confirm that it is smaller than the result from (b). Why is this the case?

In [9]:
# (d) Repeat (b) on the transformed tensor Ct obtained in (c). 
um,sm,vhm = LA.svd(Ct.reshape(d,d**2))
chi = 2
CtL = um[:,:chi] @ np.diag(np.sqrt(sm[:chi]))
CtR = (np.diag(np.sqrt(sm[:chi])) @ vhm[:chi,:]).reshape(chi,d,d)

Ht1 = ncon([AQ,BQ,CtL,CtR],[[-1,-2,1],[1,-3,2],[2,3],[3,-4,-5]])
errt = LA.norm(H - Ht1) / LH

print(errt)

# Truncation error is smaller since Ct is a center of orthogonality: the local
# truncation error of the SVD then equals the global error, so the truncation
# that is optimal for Ct is also optimal for the whole network.

1.0618346029953804e-07
