# Entanglement Feature Learning

In [2]:
%run EFL.py

## Restricted Boltzmann Machine (RBM)

### RBM Kernel
Differences from the standard RBM:
* The binary units take values $\pm1$ instead of $0,1$.
* **Unbiased**: the random tensor network (RTN) of a pure state is not biased, so the only variational parameter is the weight matrix.
* Elements of the weight matrices are **positive**, because they correspond to the logarithmic bond dimension.

Initialization of the weight matrix. The initial weights should:
* reflect locality,
* be ferromagnetic and tuned to criticality,
* contain some randomness.
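
The three wishes above can be sketched in NumPy as follows. This is only an illustration, not the initializer used in `EFL.py`: the function name `init_weights` and the parameters `scale`, `width`, and `noise` are assumptions, and the Gaussian decay on a ring is one arbitrary way to encode locality. Tuning to criticality would amount to adjusting `scale`, which is not attempted here.

```python
import numpy as np

def init_weights(n_visible, n_hidden, scale=0.5, width=1.0, noise=0.1, seed=0):
    """Hypothetical initializer: local, ferromagnetic, slightly random weights."""
    rng = np.random.default_rng(seed)
    # Place visible and hidden units on a common ring of unit circumference.
    x_v = (np.arange(n_visible) + 0.5) / n_visible
    x_h = (np.arange(n_hidden) + 0.5) / n_hidden
    # Periodic distance between visible unit i and hidden unit j.
    d = np.abs(x_v[:, None] - x_h[None, :])
    d = np.minimum(d, 1.0 - d)
    # Locality: weights decay with distance, measured in hidden-unit spacings.
    W = scale * np.exp(-(d * n_hidden / width) ** 2)
    # Randomness: multiplicative noise with |noise| < 1 keeps every weight
    # positive, i.e. ferromagnetic.
    W *= 1.0 + noise * rng.uniform(-1.0, 1.0, size=W.shape)
    return W

W0 = init_weights(8, 4)
print(W0.round(2))
```

Each visible unit couples most strongly to the hidden unit nearest to it, and all couplings stay positive.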

Energy model ($v_i,h_j=\pm1$):
$$H=-\sum_{i,j}v_iW_{ij}h_{j}.$$
The free energy (given visible spins) is:
$$F[v]=-\ln\sum_{[h]}e^{-H}=-\sum_{j}\ln \left(2\cosh\left(\sum_{i}v_iW_{ij}\right)\right).$$
Local field seen by $h_j$: $\sum_i v_i W_{ij}$; seen by $v_i$: $\sum_{j} W_{ij} h_j$. This allows the RBM to be relaxed to equilibrium by alternating Gibbs sampling.
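
A minimal NumPy sketch of the free energy and the alternating Gibbs sampling described above. It is independent of `EFL.py`; the names `free_energy`, `sample_spins`, and `gibbs_relax` are illustrative assumptions, not the notebook's API. For $\pm1$ spins in a local field $f$, the conditional probability of spin up is $e^{f}/(e^{f}+e^{-f})=(1+\tanh f)/2$.

```python
import numpy as np

def free_energy(v, W):
    """F[v] = -sum_j ln(2 cosh(sum_i v_i W_ij)), for one row or a batch of rows."""
    field = v @ W
    # ln(2 cosh x) = ln(e^x + e^-x), evaluated stably with logaddexp.
    return -np.logaddexp(field, -field).sum(axis=-1)

def sample_spins(field, rng):
    """Draw +/-1 spins with P(s = +1) = (1 + tanh(field)) / 2."""
    p_up = 0.5 * (1.0 + np.tanh(field))
    return np.where(rng.random(field.shape) < p_up, 1.0, -1.0)

def gibbs_relax(v, W, steps, rng):
    """Alternate hidden and visible updates for a number of sweeps."""
    for _ in range(steps):
        h = sample_spins(v @ W, rng)    # hidden units see sum_i v_i W_ij
        v = sample_spins(h @ W.T, rng)  # visible units see sum_j W_ij h_j
    return v

rng = np.random.default_rng(0)
W = rng.uniform(0.0, 1.0, size=(4, 2))
v_init = rng.choice([-1.0, 1.0], size=(400, 4))
v_equi = gibbs_relax(v_init, W, steps=1, rng=rng)
print(free_energy(v_init, W).mean() - free_energy(v_equi, W).mean())
```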

Cost function:
$$\text{cost}= \Delta F = F[v_\text{init}]-F[v_\text{equi}]$$
Update rule, applied once per training step ($\lambda_\text{l}$ learning rate, $\lambda_\text{f}$ forgetting rate):
$$W\leftarrow (1-\lambda_\text{f})W-\lambda_\text{l}\frac{\partial\Delta F}{\partial W}$$
* If any element in $W$ becomes negative, the element is set to zero.
* Forgetting rate can be used to control the bond dimension.
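
One training step can be sketched as below, reading the rule as a discrete update of $W$ and treating $v_\text{equi}$ as a fixed sample when differentiating (the usual contrastive-divergence approximation). Both readings are assumptions, as are the function names; the gradient follows from the free energy above, $\partial F/\partial W_{ij}=-v_i\tanh\left(\sum_k v_kW_{kj}\right)$.

```python
import numpy as np

def grad_free_energy(v, W):
    """Batch-averaged dF/dW, with dF/dW_ij = -v_i tanh(sum_k v_k W_kj)."""
    return -(v.T @ np.tanh(v @ W)) / v.shape[0]

def update_weights(W, v_init, v_equi, learning_rate, forgetting_rate):
    """One step: W <- (1 - fr) W - lr * d(Delta F)/dW, then clip at zero."""
    grad = grad_free_energy(v_init, W) - grad_free_energy(v_equi, W)
    W_new = (1.0 - forgetting_rate) * W - learning_rate * grad
    # Any element that became negative is set to zero.
    return np.maximum(W_new, 0.0)

rng = np.random.default_rng(0)
W = rng.uniform(0.1, 1.0, size=(4, 2))
v_init = rng.choice([-1.0, 1.0], size=(6, 4))
# Flipping every spin leaves F unchanged, so with v_equi = -v_init the
# gradient term cancels and only the forgetting term acts.
W_next = update_weights(W, v_init, -v_init, learning_rate=0.5, forgetting_rate=0.1)
print(W_next / W)
```

With a nonzero forgetting rate and no learning signal, the weights shrink geometrically, which is how the forgetting rate limits the bond dimension.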

#### Test by Ideal States
* Trivial product state: deep paramagnetic (PM) phase, all Ising configurations have equal weight.
* Random state (maximally thermalized): deep ferromagnetic (FM) phase, all spins up or all spins down.

In [1]:
%run EFL.py
#train_set = numpy.array([[1,1,1,1],[-1,-1,-1,-1]]*200,dtype=float)
#train_set = numpy.array([[1,-1,1,-1],[-1,1,-1,1]]*200,dtype=float)
train_set = numpy.asarray(numpy.random.randint(0,2,(400,4))*2-1,dtype=float)
rbm = RBM(W='random',method='CD')
#lr_table = [0.2,0.3,0.25,0.15,0.1]
#fr_table = [0.05,0.01,0.,0.,0.]
lr_table = [0.5,0.3,0.2,0.15,0.1,0.1,0.1]
fr_table = [0.,0.,0.,0.,0.,0.,0.]


FM case (maximally thermalized random state).

In [3]:
for epoch in range(7):
    cost, xent = rbm.train(train_set,
                           learning_rate=lr_table[epoch],
                           forgetting_rate=fr_table[epoch])
    print('Epoch %d: '%epoch, 'cost = %f, xent = %f'%(cost, xent))
rbm.W.get_value()

Epoch 0:  cost = -0.109159, xent = 0.022368
Epoch 1:  cost = -0.074178, xent = 0.010012
Epoch 2:  cost = -0.076009, xent = 0.008760
Epoch 3:  cost = -0.031306, xent = 0.008158
Epoch 4:  cost = -0.047077, xent = 0.007955
Epoch 5:  cost = -0.030862, xent = 0.007696
Epoch 6:  cost = -0.124440, xent = 0.007290


array([[ 1.5620738 ,  1.76418861],
       [ 1.55504711,  1.53255343],
       [ 1.36638278,  1.8026775 ],
       [ 1.54176937,  1.59831983]])

PM case (trivial product state).

In [6]:
for epoch in range(7):
    cost, xent = rbm.train(train_set,
                           learning_rate=lr_table[epoch],
                           forgetting_rate=fr_table[epoch])
    print('Epoch %d: '%epoch, 'cost = %f, xent = %f'%(cost, xent))
rbm.W.get_value()

Epoch 0:  cost = 0.112113, xent = 3.242719
Epoch 1:  cost = 0.022328, xent = 3.070230
Epoch 2:  cost = 0.007352, xent = 2.949821
Epoch 3:  cost = 0.013084, xent = 2.926602
Epoch 4:  cost = 0.001652, xent = 2.892917
Epoch 5:  cost = 0.005669, xent = 2.922602
Epoch 6:  cost = 0.004110, xent = 2.880515


array([[ 0.03113473,  0.06022195],
       [ 0.43808222,  0.00861   ],
       [ 0.01715053,  0.01461342],
       [ 0.03886801,  0.29126499]])

### Convolutional RBM
Assumption: translational symmetry (exact, or in the statistical sense after disorder averaging). Within each layer and each group, the kernel weights are shared across positions, which makes the algorithm scalable. For a disordered system, this learns the disorder-averaged entanglement features.
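
Weight sharing can be illustrated with a single kernel sliding along a periodic 1D chain. This is a sketch under assumptions, not the convolutional layer of `EFL.py`: the name `conv_hidden_field`, the `stride` parameter, the periodic boundary, and the use of one group only are all illustrative choices.

```python
import numpy as np

def conv_hidden_field(v, kernel, stride):
    """Local fields on hidden units from one shared kernel on a periodic chain.

    field[j] = sum_k kernel[k] * v[(j * stride + k) mod n]
    """
    n = v.shape[-1]
    n_hidden = n // stride
    # Row j lists the visible sites coupled to hidden unit j.
    idx = (np.arange(n_hidden)[:, None] * stride
           + np.arange(len(kernel))[None, :]) % n
    return v[..., idx] @ kernel

kernel = np.array([0.5, 1.0, 0.5])
v = np.array([1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0])
field = conv_hidden_field(v, kernel, stride=2)
print(field)  # fields 2, 1, -2, -1 on the four hidden units
```

The number of parameters is set by the kernel length rather than the system size, and translating the input by one stride translates the hidden fields by one site.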

For a 1D free-fermion CFT, the Rényi entropy is given by (according to Calabrese and Cardy 2004, 2009)
$$S\propto\sum_{i,j}\ln|u_i-v_j|-\sum_{i<j}\ln|u_i-u_j|-\sum_{i<j}\ln|v_i-v_j|.$$
Here $u_i$ and $v_i$ are the positions of kinks and antikinks in the Ising configuration.

In [26]:
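import numpy
from itertools import combinations
def entropy_CFT(kinks):
    # kinks: domain-wall positions in order, alternating kink, antikink, ...
    us = kinks[0::2]  # kink positions u_i
    vs = kinks[1::2]  # antikink positions v_j
    # cross term: sum_{i,j} ln|u_i - v_j|
    Suv = sum(numpy.log(abs(u-v)) for u in us for v in vs)
    # same-species terms: sum_{i<j} ln|u_i - u_j| and sum_{i<j} ln|v_i - v_j|
    Suu = sum(numpy.log(abs(u1-u2)) for u1, u2 in combinations(us,2))
    Svv = sum(numpy.log(abs(v1-v2)) for v1, v2 in combinations(vs,2))
    S = Suv-Suu-Svv
    return S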
entropy_CFT([3,10])

1.9459101490553132
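
To feed an Ising configuration into `entropy_CFT`, the kink positions have to be read off first. The helper below is a hypothetical sketch: the name `kink_positions`, the open-chain boundary, and the convention of labelling a domain wall by the site to its right are assumptions, not taken from `EFL.py`.

```python
import numpy as np

def kink_positions(spins):
    """Positions i (1 <= i < n) where spins[i-1] != spins[i], in increasing order."""
    spins = np.asarray(spins)
    return (np.nonzero(spins[:-1] != spins[1:])[0] + 1).tolist()

# One down-spin domain covering sites 3..9 of a 12-site chain.
config = [1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 1, 1]
print(kink_positions(config))  # [3, 10]
```

Because domain walls alternate along the chain, the even-indexed entries of the returned list play the role of the $u_i$ and the odd-indexed entries the role of the $v_i$, matching the slicing `kinks[0::2]` and `kinks[1::2]` used in `entropy_CFT`.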

## Deep Belief Net

In [10]:
%run EFL.py
#train_set = numpy.array([[1,1,1,1],[-1,-1,-1,-1]]*200,dtype=float)
#train_set = numpy.array([[1,-1,1,-1],[-1,1,-1,1]]*200,dtype=float)
train_set = numpy.asarray(numpy.random.randint(0,2,(400,4))*2-1,dtype=float)
rbm = RBM(W='random',method='CD')
lr_table = [0.2,0.3,0.25,0.15,0.1]
fr_table = [0.05,0.01,0.,0.,0.]
#lr_table = [0.5,0.3,0.2,0.15,0.1,0.1,0.1]
#fr_table = [0.,0.,0.,0.,0.,0.,0.]
for epoch in range(5):
    cost, xent = rbm.train(train_set,
                           learning_rate=lr_table[epoch],
                           forgetting_rate=fr_table[epoch])
    print('Epoch %d: '%epoch, 'cost = %f, xent = %f'%(cost, xent))
rbm.W.get_value()

Epoch 0:  cost = 0.123299, xent = 3.075038
Epoch 1:  cost = 0.000767, xent = 2.811155
Epoch 2:  cost = -0.000510, xent = 2.820312
Epoch 3:  cost = 0.000394, xent = 2.820871
Epoch 4:  cost = -0.000169, xent = 2.814361


array([[ 0.22448242,  0.08630163],
       [ 0.05376896,  0.10348084],
       [ 0.15858293,  0.03803954],
       [ 0.21123996,  0.01548139]])

In [20]:
rbm.bottomup(train_set);

In [16]:
rbm._bottomup

<theano.compile.function_module.Function at 0x115b82ba8>

In [63]:
%run EFL.py
train_set = numpy.array([[1,1,1,1,1,1,1,1],[-1,-1,-1,-1,-1,-1,-1,-1]]*200,dtype=float)
#train_set = numpy.array([[1,-1,1,-1],[-1,1,-1,1]]*200,dtype=float)
#train_set = numpy.asarray(numpy.random.randint(0,2,(400,8))*2-1,dtype=float)
dbn = DBN([8,4,2])
dbn.train(train_set, lrs = [0.5,0.3,0.2,0.15], frs = [0.05,0.01])

RBM layer 0 ---
Epoch 0:  cost = -0.727944, xent = 0.442594
Epoch 1:  cost = -0.532840, xent = 0.124340
Epoch 2:  cost = -0.197564, xent = 0.069346
Epoch 3:  cost = -0.276740, xent = 0.049645
Epoch 4:  cost = -0.344374, xent = 0.038671
Epoch 5:  cost = -0.123221, xent = 0.031379
Epoch 6:  cost = -0.141894, xent = 0.027855
RBM layer 1 ---
Epoch 0:  cost = -0.498184, xent = 0.195079
Epoch 1:  cost = -0.381247, xent = 0.121287
Epoch 2:  cost = -0.327675, xent = 0.066703
Epoch 3:  cost = -0.189362, xent = 0.043557
Epoch 4:  cost = -0.152801, xent = 0.033933
Epoch 5:  cost = -0.166142, xent = 0.029565
Epoch 6:  cost = -0.124329, xent = 0.026525


In [58]:
dbn.rbms[0].W.get_value()

array([[ 0.74378813,  0.69830952,  0.69364525,  0.68225864],
       [ 0.81431069,  0.69718579,  0.66758886,  0.67747641],
       [ 0.8540145 ,  0.74476177,  0.61490136,  0.59400549],
       [ 0.70656703,  0.79957898,  0.69043742,  0.56957501],
       [ 0.55321003,  0.66975958,  0.77827142,  0.68380722],
       [ 0.6162805 ,  0.63284488,  0.76471595,  0.86908277],
       [ 0.64686609,  0.63555991,  0.67143179,  0.79119637],
       [ 0.74910137,  0.75495721,  0.76042006,  0.78714136]])

In [67]:
ori = T.matrix()
rec = dbn.reconstruct(ori)
rfun = theano.function([ori],rec)

In [70]:
rfun(train_set[0:7])

array([[ 0.995034  ,  0.99171267,  0.99370661,  0.99266298,  0.99147429,
         0.99191823,  0.9942658 ,  0.99142982],
       [-0.995034  , -0.99171267, -0.99370661, -0.99266298, -0.99147429,
        -0.99191823, -0.9942658 , -0.99142982],
       [ 0.995034  ,  0.99171267,  0.99370661,  0.99266298,  0.99147429,
         0.99191823,  0.9942658 ,  0.99142982],
       [-0.995034  , -0.99171267, -0.99370661, -0.99266298, -0.99147429,
        -0.99191823, -0.9942658 , -0.99142982],
       [ 0.995034  ,  0.99171267,  0.99370661,  0.99266298,  0.99147429,
         0.99191823,  0.9942658 ,  0.99142982],
       [-0.995034  , -0.99171267, -0.99370661, -0.99266298, -0.99147429,
        -0.99191823, -0.9942658 , -0.99142982],
       [ 0.995034  ,  0.99171267,  0.99370661,  0.99266298,  0.99147429,
         0.99191823,  0.9942658 ,  0.99142982]])