In [None]:
# Set this variable yourself.
running_on_colab = False
# Store data as reduced density matrix `rho` or eigenvector tuple `EVW`.
rho_or_EVW = 'rho'

# Machine Learning of Many Body Localization

## Introduction

Use exact diagonalization to obtain a few eigenstates near energy $E = 0$ from the Heisenberg model with a
random field, 

\begin{equation}
    H = J \sum_i \vec{S}_{i} \cdot \vec{S}_{i+1} - \sum_i h_i S^z_i
\end{equation}

where the field values $h_i \in [-W, W]$ are drawn from a uniform random distribution with "disorder strength" $W$. Moderate system sizes ($L \approx 12$) are sufficient.
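
As a rough illustration of the Hamiltonian above, here is a minimal sketch that builds it as a sparse matrix from Kronecker products. The function name `build_heisenberg_sketch` and its arguments are hypothetical; this is not the notebook's `build_H` from `data_gen`, whose internals are not shown here.

```python
import numpy as np
import scipy.sparse as sp

def build_heisenberg_sketch(L, W, J=1.0, periodic=False, rng=None):
    """Sparse random-field Heisenberg chain (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    # Spin-1/2 operators S = sigma / 2.
    sx = sp.csr_matrix(np.array([[0, 1], [1, 0]], dtype=complex) / 2)
    sy = sp.csr_matrix(np.array([[0, -1j], [1j, 0]], dtype=complex) / 2)
    sz = sp.csr_matrix(np.array([[1, 0], [0, -1]], dtype=complex) / 2)

    def site_op(op, i):
        # Identity on every site except site i (site 0 is the leftmost factor).
        left = sp.identity(2**i, format='csr')
        right = sp.identity(2**(L - i - 1), format='csr')
        return sp.kron(sp.kron(left, op), right, format='csr')

    h = rng.uniform(-W, W, size=L)  # random fields h_i in [-W, W]
    H = sp.csr_matrix((2**L, 2**L), dtype=complex)
    bonds = L if periodic else L - 1
    for i in range(bonds):
        j = (i + 1) % L
        for op in (sx, sy, sz):
            H = H + float(J) * (site_op(op, i) @ site_op(op, j))
    for i in range(L):
        H = H - float(h[i]) * site_op(sz, i)
    return H
```

For two sites without disorder this reduces to $J \, \vec{S}_0 \cdot \vec{S}_1$, whose spectrum is the singlet at $-3J/4$ and the triplet at $+J/4$, which makes a convenient sanity check.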

The exciting property of this model is that it is believed to undergo a phase transition from an extended phase (small $W$) to a localized phase (large $W$). 

We will use ML to detect this transition: pick a number of eigenstates near energy $E = 0$ and obtain the reduced density matrices $\rho^A$, where $A$ is a region of $n$ consecutive spins (a few hundred to a few thousand eigenstates across different disorder realizations). 

Now use the density matrices for $W = 0.5 J$ and $W = 8.0 J$ to train a neural network (just interpret the entries of $\rho^A$ as an image with $2^n \times 2^n$ pixels). 
Then study the output of this network for different $W$. 
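
One possible way to turn reduced density matrices into labelled "images" is sketched below. The helper names `rho_to_image` and `make_dataset` are hypothetical, and splitting real and imaginary parts into two channels is an assumption (the plots further down use `np.abs` instead).

```python
import numpy as np

def rho_to_image(rho_A):
    """Real and imaginary parts as two channels: shape (2, 2^n, 2^n)."""
    rho_A = np.asarray(rho_A)
    return np.stack([rho_A.real, rho_A.imag]).astype(np.float32)

def make_dataset(rhos_low_W, rhos_high_W):
    """Label low-disorder samples 0 and high-disorder samples 1."""
    rhos = list(rhos_low_W) + list(rhos_high_W)
    X = np.stack([rho_to_image(r) for r in rhos])
    y = np.concatenate([np.zeros(len(rhos_low_W)),
                        np.ones(len(rhos_high_W))]).astype(np.int64)
    return X, y
```

The resulting array `X` has shape `(num_samples, 2, 2^n, 2^n)`, which is the channel-first layout commonly expected by convolutional layers.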

How do the results depend on system size $L$ and block size $n$? 
At which $W_c$ do you expect the transition to occur?

### Wikipedia
[Many body localization](https://en.wikipedia.org/wiki/Many_body_localization)  
[Localization protected quantum order](https://en.wikipedia.org/wiki/Localization_protected_quantum_order)  

Many-body localization (MBL) is a dynamical phenomenon which leads to the breakdown of equilibrium statistical mechanics in isolated many-body systems. Such systems never reach local thermal equilibrium, and retain local memory of their initial conditions for infinite times.

MBL was first proposed by P.W. Anderson in 1958 as a possibility that could arise in strongly disordered quantum systems. The basic idea was that if particles all live in a random energy landscape, then any rearrangement of particles would change the energy of the system. Since energy is a conserved quantity in quantum mechanics, such a process can only be virtual and cannot lead to any transport of particle number or energy.  

The process of thermalization erases local memory of the initial conditions. In textbooks, thermalization is ensured by coupling the system to an external environment or "reservoir," with which the system can exchange energy. What happens if the system is isolated from the environment, and evolves according to its own Schrödinger equation? Does the system still thermalize?

Quantum mechanical time evolution is unitary and formally preserves all information about the initial condition in the quantum state at all times.

This question can be formalized by considering the quantum mechanical density matrix $\rho$ of the system. If the system is divided into a subregion $A$ (the region being probed) and its complement $B$ (everything else), then all information that can be extracted by measurements made on $A$ alone is encoded in the reduced density matrix $\rho_A(t) = \mathrm{Tr}_B \, \rho(t)$. If in the long time limit $\rho_A(t)$ approaches a thermal density matrix at a temperature set by the energy density in the state, then the system has "thermalized," and no local information about the initial condition can be extracted from local measurements. This process of "quantum thermalization" may be understood in terms of $B$ acting as a reservoir for $A$. In this perspective, the entanglement entropy $S = -\mathrm{Tr} \, \rho_A \log \rho_A$ of a thermalizing system in a pure state plays the role of thermal entropy. Thermalizing systems therefore generically have extensive or "volume law" entanglement entropy at any non-zero temperature.

In contrast, if $\rho_A(t)$ fails to approach a thermal density matrix even in the long time limit, and remains instead close to its initial condition $\rho_A(0)$, then the system retains forever a memory of its initial condition in local observables. This latter possibility is referred to as "many body localization," and involves $B$ failing to act as a reservoir for $A$. Eigenstates of systems exhibiting MBL do not obey the eigenstate thermalization hypothesis (ETH), and generically follow an "area law" for entanglement entropy (i.e. the entanglement entropy scales with the surface area of subregion $A$).

In thermalizing systems, energy eigenstates have volume law entanglement entropy. In MBL systems, energy eigenstates have area law entanglement entropy.
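
The entanglement entropy $S = -\mathrm{Tr} \, \rho_A \log \rho_A$ mentioned above can be evaluated from the eigenvalues of $\rho_A$. The following is a minimal sketch; the function name and the cutoff `eps` are illustrative choices, not part of the notebook's modules.

```python
import numpy as np

def entanglement_entropy(rho_A, eps=1e-12):
    """Von Neumann entropy -sum(p * log(p)) over the eigenvalues p of rho_A."""
    evals = np.linalg.eigvalsh(rho_A)   # rho_A is Hermitian
    evals = evals[evals > eps]          # drop zeros, using 0 * log(0) = 0
    return float(-np.sum(evals * np.log(evals)))
```

A maximally mixed single spin, $\rho_A = \mathbb{I}/2$, gives $S = \log 2$, while a pure $\rho_A$ gives $S = 0$.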

In thermalizing systems, entanglement entropy grows as a power law in time starting from low entanglement initial conditions. In MBL systems, entanglement entropy grows logarithmically in time starting from low entanglement initial conditions.

In thermalizing systems, the dynamics of out-of-time-ordered correlators forms a linear light cone which reflects the ballistic propagation of information. In MBL systems, the light cone is logarithmic.

What's more, while individual eigenstates aren't themselves experimentally accessible, order in eigenstates nevertheless has measurable dynamical signatures. The eigenspectrum properties change in a singular fashion as the system transitions from one type of MBL phase to another, or from an MBL phase to a thermal one, again with measurable dynamical signatures.

## Imports

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import os
import sys

os.environ['running_on_colab'] = str(running_on_colab)
# running_on_colab = (os.getenv('running_on_colab', 'False') == 'True')

if running_on_colab:
    data_root             = 'drive/MyDrive/Colab Data/MBL/'
    sys.path.append(data_root)
else:
    data_root             = './'

# Store data as reduced density matrix `rho` or eigenvector tuple `EVW`.
os.environ['rho_or_EVW'] = str(rho_or_EVW)
# rho_or_EVW = os.getenv('rho_or_EVW', 'EVW')

from file_io import *
from data_gen import *

In [None]:
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from matplotlib.ticker import MaxNLocator

dpi = 100
fig_w = 1280
fig_h = 640

%matplotlib inline

In [None]:
if running_on_colab:
    !cat /proc/cpuinfo

In [None]:
if running_on_colab:
    !pip install ipython-autotime
    %load_ext autotime

In [None]:
if running_on_colab:
    !pip install pytorch_lightning==0.7.6 torchsummary==1.5.1

## Test execution time of Exact Diagonalization

### Test solving a single Hamiltonian

In [None]:
L = 10
W = 0.5
J = 1
periodic = False
num_Hs = 10

H = build_H(L, W, J, periodic)
E, V = ED(H.toarray())
print('2^L: {}'.format(2**L))
print('#eigenvalues: {}'.format(len(E)))
print('E.shape: {}'.format(E.shape))
print('V.shape: {}'.format(V.shape))

In [None]:
%%timeit
# Solve using numpy's dense solver
E, V = ED(H.toarray())
E0, V0 = select_N_eigenvalues(E, V, 20)

In [None]:
%%timeit
# Solve using scipy's sparse solver instead.
E1, V1 = ED_sparse(H, 20)

In [None]:
# Solve using numpy's dense solver
E, V = ED(H.toarray())
E0, V0 = select_N_eigenvalues(E, V, 20)

# Solve using scipy's sparse solver instead.
E1, V1 = ED_sparse(H, 20)
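
For reference, one common way to pick the eigenpairs closest to $E = 0$ with either solver is sketched below. This is an assumption about what helpers such as `select_N_eigenvalues` and `ED_sparse` might do, not their actual implementation; the function names here are hypothetical. The sparse version uses shift-invert mode of `scipy.sparse.linalg.eigsh` with `sigma=0`.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def nearest_zero_dense(H_dense, k):
    """Full diagonalization, then keep the k eigenvalues closest to zero."""
    E, V = np.linalg.eigh(H_dense)
    idx = np.sort(np.argsort(np.abs(E))[:k])  # keep ascending energy order
    return E[idx], V[:, idx]

def nearest_zero_sparse(H_sparse, k):
    """Shift-invert Lanczos: the k eigenvalues closest to sigma = 0."""
    E, V = eigsh(H_sparse.tocsc(), k=k, sigma=0.0, which='LM')
    order = np.argsort(E)
    return E[order], V[:, order]
```

Eigenvectors returned by the two routes may differ by an overall sign or phase, so comparing eigenvalues and norms is more robust than comparing eigenvector entries directly.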

In [None]:
# Check eigenvalues are sorted correctly, and eigenvectors are selected along the correct axis.
V0_norm = []
V1_norm = []
for i in range(len(E0)):
    V0_norm.append(np.linalg.norm(V0[:,i]))
    V1_norm.append(np.linalg.norm(V1[:,i]))

for E0i, E1i, V0i, V1i in zip(E0, E1, V0_norm, V1_norm):
    print('Numpy: {:+.12f} | Numpy norm: {:.16f}'.format(E0i, V0i))
    print('Scipy: {:+.12f} | Scipy norm: {:.16f}'.format(E1i, V1i))

# Compare all norms, not only the last pair left over from the loop above.
assert np.allclose(V0_norm, V1_norm), 'Eigenvector norms evaluated using NumPy and SciPy should be identical.'

### Test solving multiple Hamiltonians

In [None]:
Hs = build_Hs(L, W, J, periodic, num_Hs)
Es, Vs = EDs(Hs)
E0s, V0s = EDs_sparse(Hs, 20)

### Test execution time, numpy vs scipy

In [None]:
# Test execution time: numpy.
J  = 1                      # Always = 1
Ws = [8]                    # Disorder strength W.
Ls = list(range(8,12))      # System size L.
ns = [1]*len(Ls)            # Number of samples for each L.
ps = [False]                # Periodic or not.
et = []                     # Execution time.
num_EV = 20                 # Number of eigenvalues near zero to save.

for L, num_Hs in zip(Ls, ns):
    for W in Ws:
        for p in ps:
            start_time = time.time()
            batch_generate_ED_data(L, W, J, p, num_Hs, num_EV, save_data=False, npsp='np')
            exec_time = time.time() - start_time
            et.append(exec_time)
            print('Computed: L={:02d} | W={:.2f} | periodic={: <5}. Execution took {: 8.2f}s or {: 6.2f}min'.format(L, W, str(p), exec_time, exec_time/60))


In [None]:
fig, axes = plt.subplots(1, 2, figsize=(fig_w/dpi,fig_h/dpi), dpi=dpi, squeeze=False)

base = 10
axes[0,0].plot(Ls, et)
axes[0,1].plot(np.array(Ls), np.log(et) / np.log(base))

axes[0,0].set_title('Numpy Execution time vs System size')
axes[0,1].set_title('Log(Numpy Execution time) vs System size')
axes[0,0].set_ylabel('Execution time')
axes[0,1].set_ylabel('Log(Execution time)')

for axe in axes:
    for ax in axe:
        ax.set_xlabel('System size L')
        # ax.legend(loc='best')
        ax.xaxis.set_major_locator(MaxNLocator(integer=True))

In [None]:
# Test execution time: scipy.
J  = 1                      # Always = 1
Ws = [8]                    # Disorder strength W.
Ls = list(range(8,16))      # System size L.
ns = [1]*len(Ls)            # Number of samples for each L.
ps = [False]                # Periodic or not.
et = []                     # Execution time.
num_EV = 20                 # Number of eigenvalues near zero to save.

for L, num_Hs in zip(Ls, ns):
    for W in Ws:
        for p in ps:
            start_time = time.time()
            batch_generate_ED_data(L, W, J, p, num_Hs, num_EV, save_data=False, npsp='sp')
            exec_time = time.time() - start_time
            et.append(exec_time)
            print('Computed: L={:02d} | W={:.2f} | periodic={: <5}. Execution took {: 8.2f}s or {: 6.2f}min'.format(L, W, str(p), exec_time, exec_time/60))


In [None]:
fig, axes = plt.subplots(1, 2, figsize=(fig_w/dpi,fig_h/dpi), dpi=dpi, squeeze=False)

base = 10
axes[0,0].plot(Ls, et)
axes[0,1].plot(np.array(Ls), np.log(et) / np.log(base))

axes[0,0].set_title('SciPy Execution time vs System size')
axes[0,1].set_title('Log(SciPy Execution time) vs System size')
axes[0,0].set_ylabel('Execution time')
axes[0,1].set_ylabel('Log(Execution time)')

for axe in axes:
    for ax in axe:
        ax.set_xlabel('System size L')
        # ax.legend(loc='best')
        ax.xaxis.set_major_locator(MaxNLocator(integer=True))

In [None]:
# Sample parameters.
J  = 1                                 # Always = 1
Ws = [0.5, 8] * 10 + [i/2 for i in range(1*2, 10*2)] # Disorder strength W.
Ls = [   8,   9,  10,  11, 12, 13, 14] # System size L.
ns = [1000, 500, 250, 100, 50, 20, 10] # Number of samples for each L.
ps = [False, True]                     # Periodic or not.

In [None]:
# Test file size. It's not feasible to store ALL eigenvectors.
J  = 1             # Always = 1
Ws = [0.5, 8] * 10 + [i/2 for i in range(1*2, 10*2)] # Disorder strength W.
Ls = [   8]        # System size L.
ns = [1000]        # Number of samples for each L.
ps = [False, True] # Periodic or not.

fs = 0
for L, num_Hs in zip(Ls, ns):
    for W in Ws:
        for p in ps:
            start_time = time.time()
            # batch_generate_ED_data(L, W, J, p, num_Hs)
            fs += 1
            exec_time = time.time() - start_time
            # print('Computed: L={:02d} | W={:.2f} | periodic={: <5}. Execution took {: 8.2f}s or {: 6.2f}min'.format(L, W, str(p), exec_time, exec_time/60))
print('Eigenvectors `Vs` dominate the file size. Assume 1000 samples are generated for each W, for around 20 Ws:')
print('For L=8, each Es file is 2MB, Hs is 23MB, Vs is 1000MB.')
print('Estimated file size of each run is thus {}MB'.format(fs*(2+23+1000)))

In [None]:
print('The full eigenvectors `Vs` dominate the file size.')
for i in range(5):
    print('For L = {:2d}, Vs is {:3d} MB.'.format(8+i, 4**i))
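
The $4^i$ scaling printed above follows from the size of the full eigenvector matrix, $2^L \times 2^L$ entries. A small sketch of the arithmetic, assuming each entry is stored as `complex128` (16 bytes); that storage type is an assumption, chosen because it reproduces the 1 MB per sample quoted for $L = 8$.

```python
def eigvec_file_size_mib(L, num_samples=1, bytes_per_entry=16):
    """Size in MiB of num_samples full eigenvector matrices of shape (2^L, 2^L)."""
    dim = 2**L
    return num_samples * dim * dim * bytes_per_entry / 2**20

for L in range(8, 13):
    print('L = {:2d}: {:6.0f} MiB per sample'.format(L, eigvec_file_size_mib(L)))
```

Each additional site doubles the matrix dimension and therefore quadruples the storage, which is why only a few eigenvectors near $E = 0$ (or the much smaller $\rho_A$) are kept.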

## Partial trace and reduced density matrix

### Irrelevant stuff

#### Summary

According to my limited understanding, a so-called partial trace is simply summing contributions of coefficients outside a chosen subsystem. 

Let's start with a simple example of two spins, divided into two subsystems $A$ and $B$, each with one spin:

\begin{equation}
    | s_1 \rangle \otimes | s_2 \rangle = | s_1 \rangle_A \otimes | s_2 \rangle_B = | A \rangle \otimes | B \rangle
\end{equation}

An eigenvector has the following basis:

\begin{equation}
    \begin{pmatrix}
        | (\downarrow)_A (\downarrow)_B \rangle \\
        | (\downarrow)_A (\uparrow)_B   \rangle \\
        | (\uparrow)_A   (\downarrow)_B \rangle \\
        | (\uparrow)_A   (\uparrow)_B   \rangle \\
    \end{pmatrix}
\end{equation}

Now add coefficients $C_i$ of the eigenvector:

\begin{equation}
    \begin{pmatrix}
        C_0 | (\downarrow)_A (\downarrow)_B \rangle \\
        C_1 | (\downarrow)_A (\uparrow)_B   \rangle \\
        C_2 | (\uparrow)_A   (\downarrow)_B \rangle \\
        C_3 | (\uparrow)_A   (\uparrow)_B   \rangle \\
    \end{pmatrix}
\end{equation}

Or $C_{i,j}$:

\begin{equation}
    \begin{pmatrix}
        C_{0,0} | (\downarrow)_A (\downarrow)_B \rangle \\
        C_{0,1} | (\downarrow)_A (\uparrow)_B   \rangle \\
        C_{1,0} | (\uparrow)_A   (\downarrow)_B \rangle \\
        C_{1,1} | (\uparrow)_A   (\uparrow)_B   \rangle \\
    \end{pmatrix}
\end{equation}

The reduced density matrix of subsystem $A$, $\rho_A$, is defined as a partial trace over $B$:

\begin{equation}
    \rho_A = Tr_B(\rho)
\end{equation}

That can be obtained simply by summing coefficients related to $B$, thereby removing its contribution:

\begin{equation}
    \begin{aligned}[c]
        \begin{pmatrix}
            (C_0 + C_1) | (\downarrow)_A \rangle \\
            (C_2 + C_3) | (\uparrow)_A   \rangle \\
        \end{pmatrix}
    \end{aligned}
    \quad or \quad
    \begin{aligned}[c]
        \begin{pmatrix}
            (C_{0,0} + C_{0,1}) | (\downarrow)_A \rangle \\
            (C_{1,0} + C_{1,1}) | (\uparrow)_A   \rangle \\
        \end{pmatrix}
    \end{aligned}
\end{equation}

We observe that if we have an eigenvector in an array, the notation $C_i$ used on the left hand side is the array index, while the right hand side is the binary representation of said index. Does this relation extend beyond two spins? (Probably not, unless each spin is its own subsystem...)

But this is clearly wrong, because this is just a vector, not a matrix. Recall from Chapter 5 of our lecture notes that the definition of a reduced density matrix is also:

\begin{equation}
    \rho_A = Tr_B(\rho) = Tr_B(| \psi \rangle \langle \psi |) = \sum_{i',i,j} \psi^\ast_{i',j} \psi_{i,j} | i' \rangle \langle i | ,
\end{equation}

where $i$ are spins/sites related to subsystem $A$, and $j$ to $B$. An element of this matrix is simply:

\begin{equation}
    {(\rho_A)}_{i',i} = Tr_B(\rho)_{i',i} = \sum_{j} \psi^\ast_{i',j} \psi_{i,j}
\end{equation}

The basis $ | i' \rangle \langle i | $ is always implicit in all calculations and never really affects anything. Its only purpose is to make everything more complicated and confusing.
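
For the two-spin case the sum over $j$ can be checked numerically by arranging the coefficients as a matrix `C[i, j]`, with rows indexing the spin in $A$ and columns the spin in $B$. The sketch below uses the convention $\rho = | \psi \rangle \langle \psi |$, so the conjugate sits on the second factor; the formula as written above places it on the first, which gives the transpose. The two coincide for real coefficients. The function name is hypothetical.

```python
import numpy as np

def rho_A_from_coefficients(C):
    """(rho_A)[a, b] = sum_j C[a, j] * conj(C[b, j]), i.e. C @ C^dagger."""
    C = np.asarray(C, dtype=complex)
    return np.einsum('aj,bj->ab', C, C.conj())

# Rows: spin A (down, up). Columns: spin B (down, up).
C_bell = np.array([[1, 0], [0, 1]]) / np.sqrt(2)  # (|dd> + |uu>) / sqrt(2)
C_prod = np.array([[1, 1], [0, 0]]) / np.sqrt(2)  # |d>_A (|d> + |u>)_B / sqrt(2)

print(rho_A_from_coefficients(C_bell).real)  # maximally mixed: identity / 2
print(rho_A_from_coefficients(C_prod).real)  # pure state |d><d| on A
```

Note that the result is a sum of products of coefficients, not a product of sums such as $(C_0 + C_1)^\ast (C_0 + C_1)$.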

Reverting back to the notation of our previous 2-spin example, with $\psi_{i,j} \to C_i$ and $| i \rangle \to | s_1 \rangle_A$, a reduced density matrix can be constructed as follows:

\begin{equation}
    \begin{pmatrix}
        (C_0 + C_1)^\ast \times (C_0 + C_1) &                     ?               \\
                            ?               & (C_2 + C_3)^\ast \times (C_2 + C_3) \\
    \end{pmatrix}
\end{equation}

Now what should the off-diagonal terms look like? We have two choices:

\begin{equation}
    \begin{aligned}[c]
        \begin{pmatrix}
            (C_0 + C_1)^\ast \times (C_0 + C_1) & (C_2 + C_3)^\ast \times (C_0 + C_1) \\
            (C_0 + C_1)^\ast \times (C_2 + C_3) & (C_2 + C_3)^\ast \times (C_2 + C_3) \\
        \end{pmatrix}
    \end{aligned}
    \quad or \quad
    \begin{aligned}[c]
        \begin{pmatrix}
            (C_0 + C_1)^\ast \times (C_0 + C_1) & (C_0 + C_1)^\ast \times (C_2 + C_3) \\
            (C_2 + C_3)^\ast \times (C_0 + C_1) & (C_2 + C_3)^\ast \times (C_2 + C_3) \\
        \end{pmatrix}
    \end{aligned}
    \quad ...? 
\end{equation}

We are using a neural network and treating this as an image, so it would not matter if we flip the off-diagonal terms. The latter appears to be correct though, because:

\begin{equation}
    \begin{pmatrix}
        a^\ast_0 \\
        a^\ast_1
    \end{pmatrix}
    \otimes
    \begin{pmatrix}
        b_0 & b_1
    \end{pmatrix}
    \; = \;
    \begin{pmatrix}
        a^\ast_0 b_0 & a^\ast_0 b_1 \\
        a^\ast_1 b_0 & a^\ast_1 b_1
    \end{pmatrix}
\end{equation}

Finally, how to efficiently identify which coefficients such as $(C_0 + C_1)$ to sum, in a vectorized manner? All this nonsense is probably a single line in programming code.

Some references:
* https://arxiv.org/pdf/1601.07458.pdf
* http://www.quantum.umb.edu/Jacobs/QMT/QMT_AppendixA.pdf
* https://physics.stackexchange.com/questions/179671/how-to-take-partial-trace

To construct a reduced density matrix by taking a partial trace of the full density matrix, the dimensions of the matrix go from $2^L \times 2^L$ to $2^n \times 2^n$. This requires a **RECTANGULAR** matrix. However, of the many definitions, tutorials and StackExchange answers I have read, none, **NONE**, explicitly states how to construct it for the general case. They either keep it in abstract notation, or have already simplified it to a specific case. **NONE** are usable.

To add insult to injury, our system is a bit more complicated than just $ | A \rangle \otimes | B \rangle $. But rather, because we want $n$ consecutive spins in the center, the system is actually $ | B_1 \rangle \otimes | A \rangle \otimes | B_2 \rangle $!

### My current understanding / implementation of the rectangular matrix is as follows:

\begin{equation}
    \mathbb{B} = | j \rangle = | 1 \rangle_{B1} \otimes \mathbb{I}_{A} \otimes | 1 \rangle_{B2} =
    \begin{pmatrix}
        1 \\
        \vdots \\
        1
    \end{pmatrix}
    \otimes
    \begin{pmatrix}
        1 & \dots & 0 \\ 
        \vdots & \ddots  & \vdots \\
        0 & \dots & 1
    \end{pmatrix}
    \otimes
    \begin{pmatrix}
        1 \\
        \vdots \\
        1
    \end{pmatrix}
\end{equation}

Then apply this matrix to the full density matrix:

\begin{equation}
    \mathbb{B}^\dagger \mathbb{\rho} \mathbb{B}
\end{equation}

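This assumes that the sum over basis states $j$ of subsystem $B$ can be taken separately on the bra side and on the ket side (which is not exact in general, since the product of the two sums also contains cross terms with $j \neq j'$), i.e.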

\begin{equation}
    Tr_B(\rho) = \sum_j (\langle B_1 | \otimes \langle A | \otimes \langle B_2 |) \; \rho \; (| B_1 \rangle \otimes | A \rangle \otimes | B_2 \rangle) = (\langle 1 |_{B1} \otimes \mathbb{I}_{A} \otimes \langle 1 |_{B2}) \; \rho \; (| 1 \rangle_{B1} \otimes \mathbb{I}_{A} \otimes | 1 \rangle_{B2})
\end{equation}
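
A small numerical check of the construction written above: the sketch compares the all-ones sandwich $\mathbb{B}^\dagger \rho \mathbb{B}$ with the sum of sandwiches over one basis state of $B$ at a time. The function names are hypothetical, dense matrices are used for clarity, and site 0 is assumed to be the leftmost Kronecker factor.

```python
import numpy as np

def ones_sandwich(psi, dB1, dA, dB2):
    """B^dagger rho B with B = |1>_{B1} (x) I_A (x) |1>_{B2} (all-ones vectors)."""
    B = np.kron(np.kron(np.ones((dB1, 1)), np.eye(dA)), np.ones((dB2, 1)))
    rho = np.outer(psi, psi.conj())
    return B.conj().T @ rho @ B

def sum_of_sandwiches(psi, dB1, dA, dB2):
    """sum over (j1, j2) of B_j^dagger rho B_j, one basis state of B at a time."""
    rho = np.outer(psi, psi.conj())
    out = np.zeros((dA, dA), dtype=complex)
    for j1 in range(dB1):
        for j2 in range(dB2):
            e1 = np.zeros((dB1, 1)); e1[j1] = 1
            e2 = np.zeros((dB2, 1)); e2[j2] = 1
            Bj = np.kron(np.kron(e1, np.eye(dA)), e2)
            out += Bj.T @ rho @ Bj
    return out
```

For a computational basis state the two agree, but for a generic superposition the all-ones version picks up cross terms between different basis states of $B$. This may be why the matrix-based result looks closer to correct at strong disorder, where eigenstates are nearer to product states.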

### In the end, tensor contraction is used!

See function `partial_trace_tensor()`.
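
A minimal sketch of the tensor-contraction idea for the $| B_1 \rangle \otimes | A \rangle \otimes | B_2 \rangle$ layout: reshape the eigenvector into a rank-3 tensor and contract the $B_1$ and $B_2$ indices. This is not the code of `partial_trace_tensor()`; the name `partial_trace_sketch` and its signature are hypothetical, and it assumes site 0 is the most significant index of the state vector.

```python
import numpy as np

def partial_trace_sketch(psi, dB1, dA, dB2):
    """rho_A for psi on B1 (x) A (x) B2, tracing out B1 and B2."""
    T = np.asarray(psi).reshape(dB1, dA, dB2)
    # rho_A[a, b] = sum_{i, j} T[i, a, j] * conj(T[i, b, j])
    return np.einsum('iaj,ibj->ab', T, T.conj())
```

For $L = 8$ with $A$ covering sites 1 to 6, the dimensions would be `dB1 = 2`, `dA = 2**6`, `dB2 = 2`. The full $2^L \times 2^L$ density matrix is never formed, which keeps memory usage proportional to the eigenvector plus the $2^n \times 2^n$ result.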

#### Other resources

Given any orthonormal basis sets $|a_i\rangle$ and $|b_i\rangle$ for $\mathcal H_a$ and $\mathcal H_b$ respectively, any operator $K$ on the space $\mathcal H_a\otimes \mathcal H_b$ can be written:

$$K = \sum_{ij k\ell} K_{ijk\ell} |a_i\rangle |b_j\rangle \langle a_k|\langle b_\ell|$$

where $K_{ijk\ell} \equiv \langle a_i|\langle b_j| K |a_k\rangle |b_\ell\rangle$, and $\langle a_i|\langle b_j| \equiv \langle a_i|\otimes \langle b_j|$ (I omit the $\otimes$ for notational clarity).  The partial trace is then defined to be

$$\mathrm{Tr}_b(K):= \sum_{ik\ell}K_{i\ell k\ell}|a_i\rangle\langle a_k|$$

which is now a linear operator on $\mathcal H_a$ alone, with coefficients $$\bigg(\mathrm{Tr}_b(K)\bigg)_{ik} = \sum_\ell K_{i\ell k\ell}$$

Source: https://physics.stackexchange.com/questions/616061/where-does-the-expression-mathrmtrk-sum-j-1n-langle-psi-jk-psi-j/
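
The coefficient formula $\big(\mathrm{Tr}_b(K)\big)_{ik} = \sum_\ell K_{i\ell k\ell}$ translates almost directly into an `einsum` call once $K$ is reshaped into a four-index array. A minimal sketch, with a hypothetical function name, assuming $K$ is stored as a dense matrix in the product basis with $\mathcal H_a$ as the leading factor:

```python
import numpy as np

def partial_trace_b(K, da, db):
    """Tr_b(K)[i, k] = sum_l K[i, l, k, l] for K acting on H_a (x) H_b."""
    K4 = np.asarray(K).reshape(da, db, da, db)
    return np.einsum('ilkl->ik', K4)
```

A quick consistency check is $K = A \otimes B$, for which the partial trace over $b$ should equal $A \, \mathrm{Tr}(B)$.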

#### Test how outer product works

In [None]:
a0 =  1 +  2j
a1 =  3 +  5j
b0 =  7 + 11j
b1 = 13 + 17j

In [None]:
a = np.array([1 +  2j,  3 +  5j])
b = np.array([7 + 11j, 13 + 17j])
np.outer(a.conj(), b)

In [None]:
(1 -  2j) * (7 + 11j)

In [None]:
(1 -  2j) * (13 + 17j)

#### Test bit shift (unused)

In [None]:
num_E     = 10 # #eigenstates, ~100 - ~1000.
num_sites = 6  # #n consecutive spins.

In [None]:
# Source: https://stackoverflow.com/questions/147713/how-do-i-manipulate-bits-in-python
def get_bit(value, n):
    return ((value >> n & 1) != 0)

def set_bit(value, n):
    return value | (1 << n)

def clear_bit(value, n):
    return value & ~(1 << n)

In [None]:
print(get_bit(32, 4))
print(get_bit(32, 5))

In [None]:
def drop_sites(L, sites):
    """Select sites to drop.
    This is only for demonstration. The mechanics is probably wrong."""

    idx = []
    # `i` is the index along an eigenvector.
    # `site` is the site to be dropped / summed over while constructing a reduced density matrix.
    for i in range(2**L):
        # Spins are numbered from left to right.
        # But bits are numbered from right to left.
        # Hence we need to flip the binary representation.
        for site in sites:
            print('Check site {} for removal.'.format(site))
            print('{:0{}b}'.format(i, L))
            if site == 0:
                print('^')
            else:
                print(' ' * (site) + '^')
            if get_bit(i, L - site - 1) == 1:
                idx.append(i)

    return idx

print(drop_sites(2, [0]))
print('='*25)
print(drop_sites(2, [1]))

### Compare implementations of partial trace

In [None]:
Tr_B = build_partial_trace_matrix(2, [0], [1], [])
print(Tr_B.shape)
print(Tr_B.toarray())

In [None]:
# Hamiltonian sample.
L  = 8
W1 = 0.5
W2 = 8
J  = 1
periodic = False
num_Hs = 10

H1 = build_H(L, W1, J, periodic=False)
E1, V1 = ED(H1.toarray())

print('2^L: {}'.format(2**L))
print('#eigenvalues: {}'.format(len(E1)))
print('E.shape: {}'.format(E1.shape))
print('V.shape: {}'.format(V1.shape))

H2 = build_H(L, W2, J, periodic=False)
E2, V2 = ED(H2.toarray())

In [None]:
rho = get_rho(V1[:,0])
print(rho.shape)
print(np.max(rho))
max_idx = np.unravel_index(rho.argmax(), rho.shape) # 2D index of np.argmax().
print(max_idx)

pos = max_idx[0] # Which 3x3 sub-matrix along the diagonal to print.
print(rho[pos-1:pos+2, pos-1:pos+2])

In [None]:
# Compare matrix-generated rho_A vs tensor-generated rho_A.
A_sites  = [1,2,3,4,5,6] # Keep n consecutive
B1_sites = [0]
B2_sites = [7]
rho_A_mat1 = partial_trace_matrix(L, A_sites, B1_sites, B2_sites, V1[:,0])
rho_A_mat2 = partial_trace_matrix(L, A_sites, B1_sites, B2_sites, V2[:,0])
print('Theoretical size of reduced density matrix rho_A: {}'.format(2**len(A_sites)))
print('Shape of computed rho_A: {}'.format(rho_A_mat1.shape))
print('Size of computed rho_A: {}'.format(rho_A_mat1.size))

In [None]:
rho_A_ten1 = partial_trace_tensor(L, A_sites, B1_sites, B2_sites, V1[:,0])
rho_A_ten2 = partial_trace_tensor(L, A_sites, B1_sites, B2_sites, V2[:,0])
print('Theoretical size of reduced density matrix rho_A: {}'.format(2**len(A_sites)))
print('Shape of computed rho_A: {}'.format(rho_A_ten1.shape))
print('Size of computed rho_A: {}'.format(rho_A_ten1.size))

In [None]:
rho_A_kev1 = partial_trace_kevin(L, A_sites, B1_sites, B2_sites, V1[:,0])
rho_A_kev2 = partial_trace_kevin(L, A_sites, B1_sites, B2_sites, V2[:,0])
print('Theoretical size of reduced density matrix rho_A: {}'.format(2**len(A_sites)))
print('Shape of computed rho_A: {}'.format(rho_A_kev1.shape))
print('Size of computed rho_A: {}'.format(rho_A_kev1.size))

In [None]:
rho_A_jon1 = partial_trace_jonas(L, A_sites, B1_sites, B2_sites, V1[:,0])
rho_A_jon2 = partial_trace_jonas(L, A_sites, B1_sites, B2_sites, V2[:,0])
print('Theoretical size of reduced density matrix rho_A: {}'.format(2**len(A_sites)))
print('Shape of computed rho_A: {}'.format(rho_A_jon1.shape))
print('Size of computed rho_A: {}'.format(rho_A_jon1.size))

In [None]:
# Check Kevin's implementation and mine.
assert np.allclose(rho_A_ten1, rho_A_kev1)
assert np.allclose(rho_A_ten2, rho_A_kev2)

In [None]:
%%timeit
rho_A_ten1 = partial_trace_tensor(L, A_sites, B1_sites, B2_sites, V1[:,0])
rho_A_ten2 = partial_trace_tensor(L, A_sites, B1_sites, B2_sites, V2[:,0])

In [None]:
%%timeit
rho_A_kev1 = partial_trace_kevin(L, A_sites, B1_sites, B2_sites, V1[:,0])
rho_A_kev2 = partial_trace_kevin(L, A_sites, B1_sites, B2_sites, V2[:,0])
# Kevin is faster by 10-20%.

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(fig_w/dpi,fig_h/dpi*2), dpi=dpi, squeeze=False)

im1 = axes[0, 0].imshow(np.abs(rho_A_mat1))
im2 = axes[0, 1].imshow(np.abs(rho_A_ten1))
im3 = axes[1, 0].imshow(np.abs(rho_A_mat2))
im4 = axes[1, 1].imshow(np.abs(rho_A_ten2))

axes[0, 0].set_title('$W=0.5 \\quad \\rho_A$ computed using matrix (seems wrong)')
axes[0, 1].set_title('$W=0.5 \\quad \\rho_A$ computed using tensor')
axes[1, 0].set_title('$W=8.0 \\quad \\rho_A$ computed using matrix (almost correct)')
axes[1, 1].set_title('$W=8.0 \\quad \\rho_A$ computed using tensor')

# plt.colorbar(im1, ax=axes[0, 0])#.set_label('Entropy')
# plt.colorbar(im2, ax=axes[0, 1])#.set_label('Entropy')

fig.tight_layout()

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(fig_w/dpi,fig_h/dpi*2), dpi=dpi, squeeze=False)

im1 = axes[0, 0].imshow(np.abs(rho_A_kev1))
im2 = axes[0, 1].imshow(np.abs(rho_A_ten1))
im3 = axes[1, 0].imshow(np.abs(rho_A_kev2))
im4 = axes[1, 1].imshow(np.abs(rho_A_ten2))

axes[0, 0].set_title('$W=0.5 \\quad \\rho_A$ computed using Kevin')
axes[0, 1].set_title('$W=0.5 \\quad \\rho_A$ computed using tensor')
axes[1, 0].set_title('$W=8.0 \\quad \\rho_A$ computed using Kevin')
axes[1, 1].set_title('$W=8.0 \\quad \\rho_A$ computed using tensor')

# plt.colorbar(im1, ax=axes[0, 0])#.set_label('Entropy')
# plt.colorbar(im2, ax=axes[0, 1])#.set_label('Entropy')

fig.tight_layout()

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(fig_w/dpi,fig_h/dpi*2), dpi=dpi, squeeze=False)

im1 = axes[0, 0].imshow(np.abs(rho_A_jon1))
im2 = axes[0, 1].imshow(np.abs(rho_A_ten1))
im3 = axes[1, 0].imshow(np.abs(rho_A_jon2))
im4 = axes[1, 1].imshow(np.abs(rho_A_ten2))

axes[0, 0].set_title('$W=0.5 \\quad \\rho_A$ computed using Jonas')
axes[0, 1].set_title('$W=0.5 \\quad \\rho_A$ computed using tensor')
axes[1, 0].set_title('$W=8.0 \\quad \\rho_A$ computed using Jonas')
axes[1, 1].set_title('$W=8.0 \\quad \\rho_A$ computed using tensor')

# plt.colorbar(im1, ax=axes[0, 0])#.set_label('Entropy')
# plt.colorbar(im2, ax=axes[0, 1])#.set_label('Entropy')

fig.tight_layout()

## Visualize reduced density matrices for different W

In [None]:
num = 4
num_plots = num * num # Number of rho_A to display.
base_samples = num_plots // 2

In [None]:
# Test drawing images of reduced density matrix.
J  = 1                      # Always = 1
Ls = [10]                   # System size L.
ps = [False]                # Periodic or not.
num_EV = 1                  # Number of eigenvalues near zero to save.

In [None]:
# Training data.
Ws_main = [0.5, 8]          # Disorder strength W.
Hs = [base_samples]         # Number of samples per L per W.
et = []                     # Execution time.
rho_As_dict = {}

for L, num_Hs in zip(Ls, Hs):
    for p in ps:
        start_time = time.time()
        rho_As = batch_gen_rho_data_main(L, Ws_main, J, p, num_Hs, num_EV, max_n=10, save_data=False)
        for n, rho_A in rho_As.items():
            if n not in rho_As_dict:
                rho_As_dict[n] = []
            rho_As_dict[n] = rho_As_dict[n] + rho_A
        exec_time = time.time() - start_time
        et.append(exec_time)
        print('Computed: L={:02d} | periodic={: <5}.'.format(L, str(p)))
        print('Execution took {: 8.2f}s or {: 6.2f}min.'.format(exec_time, exec_time/60))


In [None]:
# Random data.
Ws_rand = np.random.uniform(0.1, high=9.9, size=(2 * base_samples,)) # Disorder strength W.
Hs = [1]                    # Number of samples per L per W.
et = []                     # Execution time.
rho_As_rand = {}

for L, num_Hs in zip(Ls, Hs):
    for p in ps:
        start_time = time.time()
        rho_As = batch_gen_rho_data_rand(L, Ws_rand, J, p, num_Hs, num_EV, max_n=10, save_data=False)
        for n, rho_A in rho_As.items():
            if n not in rho_As_rand:
                rho_As_rand[n] = []
            rho_As_rand[n] = rho_As_rand[n] + rho_A
        exec_time = time.time() - start_time
        et.append(exec_time)
        print('Computed: L={:02d} | periodic={: <5}.'.format(L, str(p)))
        print('Execution took {: 8.2f}s or {: 6.2f}min.'.format(exec_time, exec_time/60))


In [None]:
n = 6
# sample_idx = np.random.randint(0, len(rho_As_dict[n]), size=num*num)
sample_idx = np.arange(num*num)

fig, axes = plt.subplots(num, num, figsize=(fig_w/dpi,fig_h/dpi*2), dpi=dpi, squeeze=False)
fig.suptitle('Visualize training $\\rho_A$ ($W=0.5,W=8.0$)', fontsize=16)

for i, idx in enumerate(sample_idx):
    axes[i%num,i//num].imshow(np.abs(rho_As_dict[n][idx][0]))
    axes[i%num,i//num].annotate('W={:3.1f}'.format(rho_As_dict[n][idx][1]), (0.5,0.5), xycoords='axes fraction', ha='center', color='w', fontsize=14)

for axe in axes:
    for ax in axe:
        # ax.legend(loc='best')
        ax.xaxis.set_ticklabels([])
        ax.yaxis.set_ticklabels([])
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)

fig.tight_layout()
fig.subplots_adjust(top=0.95)

In [None]:
n = 6
# sample_idx = np.random.randint(0, len(rho_As_rand[n]), size=num*num)
sample_idx = np.arange(num*num)

fig, axes = plt.subplots(num, num, figsize=(fig_w/dpi,fig_h/dpi*2), dpi=dpi, squeeze=False)
fig.suptitle('Visualize random $\\rho_A$ ($W \\in [0.1,9.9]$)', fontsize=16)

for i, idx in enumerate(sample_idx):
    axes[i%num,i//num].imshow(np.abs(rho_As_rand[n][idx][0]))
    axes[i%num,i//num].annotate('W={:3.1f}'.format(rho_As_rand[n][idx][1]), (0.5,0.5), xycoords='axes fraction', ha='center', color='w', fontsize=14)

for axe in axes:
    for ax in axe:
        # ax.legend(loc='best')
        ax.xaxis.set_ticklabels([])
        ax.yaxis.set_ticklabels([])
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)

fig.tight_layout()
fig.subplots_adjust(top=0.95)

## Batch generate data (execution part)

In [None]:
print('Estimated runtime to generate 10,000 samples per W:')
for i in range(7):
    print('For L = {:2d}, est runtime {:3d} min or {: 2.2f} hrs.'.format(8+i, 7 * 2**i, 7 * 2**i / 60))
print(' ')

print('Estimated runtime to generate 100,000 samples per W:')
for i in range(7):
    print('For L = {:2d}, est runtime {:4d} min or {:5.2f} hrs.'.format(8+i, 70 * 2**i, 70 * 2**i / 60))
print(' ')

print('It will therefore be safer to break down generation to segments of 1-hr:')
for i in range(7):
    print('For L = {:2d}, separate execution into {:2d} batches with {:6d} samples each.'.format(8+i, 2**i, 100000 // 2**i))

In [None]:
# Batch generate reduced density matrix.

k = 5
batches = 10
batch_resume = 1
base_sample = 10000 // k // batches # Divide by k (num_EV)
rand_sample = 100           # Samples per random W.
Ws_main = [0.5, 8]          # Disorder strength W.
Ws_rand = np.random.uniform(0.1, high=7.9, size=(2 * base_sample // rand_sample,))
J  = 1                      # Always = 1
Ls = list(range(8,13,1))    # System sizes L.
ps = [False, True]          # Periodic or not.
et = []                     # Execution time.
Hs_main = [base_sample]*len(Ls) # Number of samples per L per W.
Hs_rand = [rand_sample]*len(Ls) # Number of samples per L per W.
num_EVs = [k]               # Number of eigenvalues near zero to save.

for i in range(batches):

    if i < batch_resume:
        tqdm.write('{} | Processing batch {:03d} of {:d}:'.format(dt(), i+1, batches)) #, flush=True)
        tqdm.write('{} | Batch {:03d} skipped.'.format(dt(), i+1)) #, flush=True)
        tqdm.write(' ') #, flush=True)
        continue

    tqdm.write('='*60) #, flush=True)
    tqdm.write('{} | Processing batch {:03d} of {:d}:'.format(dt(), i+1, batches)) #, flush=True)
    tqdm.write(' ') #, flush=True)
    for L, num_Hs_m, num_Hs_r in zip(Ls, Hs_main, Hs_rand):

        tqdm.write('='*40) #, flush=True)
        for num_EV in num_EVs:
            for p in ps:
                start_time = time.time()

                tqdm.write('{} | Generating training data for L={:02d} | num_EV={} | periodic={: <5}...'.format(dt(), L, num_EV, str(p))) #, flush=True)
                batch_gen_rho_data_main(L, Ws_main, J, p, num_Hs_m, num_EV, max_n=6, clamp_zero=1e-32, save_data=True)
                tqdm.write('{} | Generating random data for L={:02d} | num_EV={} | periodic={: <5}...'.format(dt(), L, num_EV, str(p))) #, flush=True)
                batch_gen_rho_data_rand(L, Ws_rand, J, p, num_Hs_r, num_EV, max_n=6, clamp_zero=1e-32, save_data=True)

                exec_time = time.time() - start_time
                et.append(exec_time)
                tqdm.write('{} | Computed: L={:02d} | num_EV={} | periodic={: <5}.'.format(dt(), L, num_EV, str(p))) #, flush=True)
                tqdm.write('{} | Execution took {: 8.2f}s or {: 6.2f}min.'.format(dt(), exec_time, exec_time/60)) #, flush=True)
                tqdm.write(' ') #, flush=True)

        tqdm.write('='*40) #, flush=True)
        tqdm.write(' ') #, flush=True)

    tqdm.write('{} | Batch {:03d} of {:d} completed.'.format(dt(), i+1, batches)) #, flush=True)
    tqdm.write('='*60) #, flush=True)
    tqdm.write(' ') #, flush=True)

    if check_shutdown_signal():
        break