## Task #2

In this task, we train a Restricted Boltzmann Machine (RBM) on data from a system of 100 Rydberg atoms. We compare the energy of the simulated system against the known exact energy. Doing so requires exploring some parameters of the Boltzmann network: both the number of hidden nodes and the number of samples matter for obtaining good results.

Imports and loading in data:

In [None]:
import numpy as np
import torch
import Rydberg_energy_calculator
from RBM_helper import RBM

training_data = torch.from_numpy(np.loadtxt("Rydberg_data.txt"))

The binary data in ```Rydberg_data.txt``` corresponds to 100 atoms. Solving the system exactly via diagonalization requires on the order of $2^N$ terms, which for $N = 100$ is far beyond any feasible calculation. Nonetheless, an RBM lets us heavily compress the problem, trading the exponential growth in the number of parameters for growth that is linear in $N$ at a fixed number of hidden nodes. To recover the wavefunction of a system with 100 atoms, we only require $100 + n_h + n_h \times 100$ numbers, where $n_h$ is the number of hidden nodes.
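The parameter count above can be checked with a few lines of arithmetic. This is a minimal sketch; the helper name `rbm_parameter_count` is hypothetical and is not part of `RBM_helper`.

```python
def rbm_parameter_count(n_visible: int, n_hidden: int) -> int:
    """Count visible biases, hidden biases, and weight-matrix entries."""
    return n_visible + n_hidden + n_visible * n_hidden


n_visible = 100  # one visible unit per atom
hilbert_dim = 2 ** n_visible  # number of basis states for 100 two-level atoms

for n_hidden in (1, 3, 6):
    n_params = rbm_parameter_count(n_visible, n_hidden)
    print(f"n_h = {n_hidden}: {n_params} parameters vs 2^100 = {hilbert_dim:.3e} amplitudes")
```

Even with as many hidden nodes as visible ones, the count stays at 10200 numbers, compared with roughly $1.27 \times 10^{30}$ basis states.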

We will evaluate the energy during training and compare it to the exact energy, using ```Rydberg_energy_calculator.py```. We arbitrarily select a learning criterion, i.e. a threshold at which we stop training and consider the result satisfactory.
The learning criterion we selected is $\vert E_{RBM} - E_{exact} \vert \leq 0.0002$, where $E_{exact} = -4.1203519096$.
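The stopping check can be written as a small standalone function. This is only a sketch: the name `meets_criterion` is hypothetical, the first two test energies are copied from the training log in this notebook, and the third is an invented value used purely to show a passing case.

```python
EXACT_ENERGY = -4.1203519096
TOLERANCE = 0.0002


def meets_criterion(rbm_energy: float,
                    exact_energy: float = EXACT_ENERGY,
                    tol: float = TOLERANCE) -> bool:
    """Return True when |E_RBM - E_exact| <= tol."""
    return abs(rbm_energy - exact_energy) <= tol


# Energies logged with one hidden unit at epochs 100 and 1000 (both miss the tolerance).
print(meets_criterion(-4.120062240258802))
print(meets_criterion(-4.1201191562144714))
# Invented energy, for illustration only (error of about 0.00015).
print(meets_criterion(-4.1202))
```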

This problem depends heavily on the number of samples we use. The more samples we use, the more complex the network needs to be in order to generalize. We first consider the entire dataset and look for the minimum number of hidden nodes required to reach the learning criterion.
Each iteration uses a different number of hidden nodes and runs for at most 1000 epochs.

In [None]:
epochs = 1000
num_samples = 20000  # number of rows in the initial state passed to draw_samples
n_vis = training_data.shape[1]
exact_energy = -4.1203519096
tolerance = 0.0002  # learning criterion
print("Exact energy: ", exact_energy)

converged = False
n_hin = 0
# Increase the number of hidden units until the learning criterion is met.
while not converged:
    n_hin += 1
    rbm = RBM(n_vis, n_hin)
    print("\n The number of hidden units is: ", n_hin)

    for e in range(1, epochs + 1):
        rbm.train(training_data)
        # Estimate the energy every 100 epochs.
        if e % 100 == 0:
            init_state = torch.zeros(num_samples, n_vis)
            RBM_samples = rbm.draw_samples(1000, init_state)
            energies = Rydberg_energy_calculator.energy(RBM_samples, rbm.wavefunction)
            error = abs(exact_energy - energies.item())
            print("Epoch:", e, ". Energy from RBM samples:", energies.item(), ". Error:", error)
            if error < tolerance:
                print("FINAL NUMBER OF HIDDEN UNITS:", n_hin)
                print("FINAL NUMBER OF EPOCHS:", e)
                print("ERROR:", error)
                converged = True
                break

Exact energy:  -4.1203519096

 The number of hidden units is:  1




Epoch: 100 . Energy from RBM samples: -4.120062240258802 . Error: 0.0002896693411980067
Epoch: 200 . Energy from RBM samples: -4.119976002260587 . Error: 0.00037590733941339494
Epoch: 300 . Energy from RBM samples: -4.120082717140181 . Error: 0.0002691924598190454
Epoch: 400 . Energy from RBM samples: -4.120004489701917 . Error: 0.00034741989808306784
Epoch: 500 . Energy from RBM samples: -4.120038590317868 . Error: 0.00031331928213251814
Epoch: 600 . Energy from RBM samples: -4.120016504978766 . Error: 0.0003354046212340478
Epoch: 700 . Energy from RBM samples: -4.120052344188419 . Error: 0.0002995654115807156
Epoch: 800 . Energy from RBM samples: -4.120031126971855 . Error: 0.0003207826281448334
Epoch: 900 . Energy from RBM samples: -4.120077148590775 . Error: 0.0002747610092255215
Epoch: 1000 . Energy from RBM samples: -4.1201191562144714 . Error: 0.000232753385528639

 The number of hidden units is:  2
Epoch: 100 . Energy from RBM samples: -4.119901615188974 . Error: 0.000450294411

We now double the number of hidden units (and hence the complexity of the network) while lowering the number of samples, in order to find the minimum number of samples required to achieve the same precision.

In [None]:
epochs = 1000
n_hin = 3 * 2  # in the previous case it converged with 3 units
n_vis = training_data.shape[1]
exact_energy = -4.1203519096
tolerance = 0.0002  # learning criterion
print("Exact energy: ", exact_energy, ". Hidden units:", n_hin, ".")

converged = False
i = 0
# Increase the number of samples in steps of 10 until the learning criterion is met.
while not converged:
    i += 1
    num_samples = 10 * i
    rbm = RBM(n_vis, n_hin)
    print("\nThe number of samples is: ", num_samples)

    for e in range(1, epochs + 1):
        rbm.train(training_data)
        # Estimate the energy every 100 epochs.
        if e % 100 == 0:
            init_state = torch.zeros(num_samples, n_vis)
            RBM_samples = rbm.draw_samples(1000, init_state)
            energies = Rydberg_energy_calculator.energy(RBM_samples, rbm.wavefunction)
            error = abs(exact_energy - energies.item())
            print("Epoch:", e, ". Energy from RBM samples:", energies.item(), ". Error:", error)
            if error < tolerance:
                print("NUMBER OF SAMPLES:", num_samples)
                print("FINAL NUMBER OF EPOCHS:", e)
                print("ERROR:", error)
                converged = True
                break

Exact energy:  -4.1203519096 . Hidden units: 6 .

The number of samples is:  10
Epoch: 100 . Energy from RBM samples: -4.114849608549276 . Error: 0.005502301050723801
Epoch: 200 . Energy from RBM samples: -4.118903595039777 . Error: 0.0014483145602230962
Epoch: 300 . Energy from RBM samples: -4.11916454590617 . Error: 0.0011873636938304344
Epoch: 400 . Energy from RBM samples: -4.119307788753898 . Error: 0.001044120846102281
Epoch: 500 . Energy from RBM samples: -4.1191645907336305 . Error: 0.00118731886636958
Epoch: 600 . Energy from RBM samples: -4.121560546042896 . Error: 0.0012086364428958163
Epoch: 700 . Energy from RBM samples: -4.119111079515618 . Error: 0.0012408300843818054
Epoch: 800 . Energy from RBM samples: -4.118059725777168 . Error: 0.00229218382283225
Epoch: 900 . Energy from RBM samples: -4.119417895696038 . Error: 0.0009340139039624162
Epoch: 1000 . Energy from RBM samples: -4.117016605105741 . Error: 0.003335304494259006

The number of samples is:  20
Epoch: 100 . En

For a better precision with the whole dataset, such as $\vert E_{RBM} - E_{exact} \vert \leq 0.0001$, the required number of hidden nodes is extremely hard to find. Iterating up to 100 hidden units did not produce any case that met this criterion, and trying even larger numbers of hidden units did not meet it either. As the number of samples decreases, the network complexity needed to meet the learning criterion gets lower.

More information on this kind of system with RBMs can be found in the work [Integrating Neural Networks with a Quantum Simulator for State Reconstruction](https://arxiv.org/pdf/1904.08441.pdf).