# Task #2

Template code for training an RBM on Rydberg atom data (the full dataset) is provided below. For the first part of this task (determining the minimum number of hidden units), start with 20 hidden units.

Imports and data loading:

In [1]:
import numpy as np
import torch
from RBM_helper import RBM

import Rydberg_energy_calculator

training_data = torch.from_numpy(np.loadtxt("Rydberg_data.txt"))

Define the RBM:

In [2]:
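n_vis = training_data.shape[1]  # one visible unit per column of the data
n_hin = 1  # number of hidden units (the task text suggests starting from 20; this run uses 1)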

rbm = RBM(n_vis, n_hin)

Train the RBM:

In [3]:
epochs = 500
num_samples = 2000

exact_energy = -4.1203519096
print("Exact energy: ",exact_energy)

for e in range(1, epochs+1):
    # do one epoch of training
    rbm.train(training_data)   
 
    # now generate samples and calculate the energy
    if e % 100 == 0:
        print("\nEpoch: ", e)
        print("Sampling...")

        init_state = torch.zeros(num_samples, n_vis)
        RBM_samples = rbm.draw_samples(100, init_state)

        print("Done sampling. Calculating energy...") 
 
        energies = Rydberg_energy_calculator.energy(RBM_samples, rbm.wavefunction) 
        print("Energy from RBM samples: ", energies.item())

Exact energy:  -4.1203519096

Epoch:  100
Sampling...
Done sampling. Calculating energy...
Energy from RBM samples:  -4.120142919012315

Epoch:  200
Sampling...
Done sampling. Calculating energy...
Energy from RBM samples:  -4.120207028500263

Epoch:  300
Sampling...
Done sampling. Calculating energy...
Energy from RBM samples:  -4.119560076406688

Epoch:  400
Sampling...
Done sampling. Calculating energy...
Energy from RBM samples:  -4.12024532814234

Epoch:  500
Sampling...
Done sampling. Calculating energy...
Energy from RBM samples:  -4.119890676021925


Rydberg Hamiltonian: $H = -\sum_{\langle i,j \rangle} V_{ij} (\sigma_i^z \sigma_j^z + \sigma_i^z + \sigma_j^z) - \Omega \sum_i \sigma_i^z - h \sum_i \sigma_i^x$
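For context, energies like the ones printed in this notebook are usually obtained as a Monte Carlo average of a local energy over the drawn samples. The expressions below are a sketch of that standard estimator, written for the Hamiltonian above. That `Rydberg_energy_calculator.energy` works this way is an assumption, and so is the notation: $N$ samples $\sigma^{(k)}$ drawn from $|\psi|^2$, $\psi$ standing for `rbm.wavefunction`, $\sigma^{[i]}$ for $\sigma$ with spin $i$ flipped, and $H_{zz}(\sigma)$ for the value of all $\sigma^z$ terms of $H$ on configuration $\sigma$.

$$E \approx \frac{1}{N} \sum_{k=1}^{N} E_{\mathrm{loc}}(\sigma^{(k)}), \qquad E_{\mathrm{loc}}(\sigma) = H_{zz}(\sigma) - h \sum_i \frac{\psi(\sigma^{[i]})}{\psi(\sigma)}$$

For independent samples the statistical error of this average shrinks as $1/\sqrt{N}$, which is why `num_samples` matters when comparing against a tolerance as small as $10^{-4}$.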

In [4]:
n_vis = training_data.shape[1]
n_hin = 0

epochs = 1000
num_samples = 2000

exact_energy = -4.1203519096
error = 1.0

# Add hidden units one at a time until the energy error drops below the tolerance.
while error > 0.0001:

    n_hin += 1
    print("Hidden Unit number :", n_hin)
    rbm = RBM(n_vis, n_hin)  # fresh RBM for each hidden-unit count

    # train for the full number of epochs before evaluating
    for e in range(1, epochs+1):
        rbm.train(training_data)

    # draw samples from the trained RBM and estimate the energy once
    init_state = torch.zeros(num_samples, n_vis)
    RBM_samples = rbm.draw_samples(100, init_state)

    energies = Rydberg_energy_calculator.energy(RBM_samples, rbm.wavefunction)
    RBM_energy = energies.item()

    error = abs(RBM_energy - exact_energy)
    print("Error = {}   RBM Energy = {}  Exact Energy = {}".format(error, RBM_energy, exact_energy))

print("Minimum number of Hidden Units required to get error < 0.0001 = ", n_hin)

Hidden Unit number : 1
Error = 9.28301796854214e-06   RBM Energy = -4.1203426265820315  Exact Energy = -4.1203519096
Minimum number of Hidden Units required to get error < 0.0001 =  1


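Interesting: only 1 hidden unit is required. GPU acceleration is unlikely to be the reason, since it changes how fast the RBM trains, not how many hidden units it needs. Note also that this result rests on a single energy estimate from 2000 samples. In the previous training cell the same 1-hidden-unit RBM gave errors between roughly $1 \times 10^{-4}$ and $8 \times 10^{-4}$ over its 500 epochs, so the $9.3 \times 10^{-6}$ error here may partly reflect sampling noise.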
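One way to check whether a single sub-threshold estimate is robust is to look at several energy estimates together instead of one. The sketch below is a minimal illustration in plain Python: `summarize_errors` is a hypothetical helper (not part of `RBM_helper` or `Rydberg_energy_calculator`), and the input is the five energies printed at epochs 100 to 500 in the training cell above, used as a stand-in. In practice one would rather pass energies from several independent `rbm.draw_samples` calls on the final model.

```python
import statistics

def summarize_errors(energies, exact_energy):
    """Summarize how far a set of energy estimates lies from the exact value."""
    abs_errors = [abs(e - exact_energy) for e in energies]
    return {
        "mean_abs_error": statistics.mean(abs_errors),
        "min_abs_error": min(abs_errors),
        "max_abs_error": max(abs_errors),
        "energy_spread": statistics.stdev(energies),  # sample standard deviation
    }

exact_energy = -4.1203519096

# Energies printed at epochs 100, 200, 300, 400 and 500 in the training cell
# (1 hidden unit). They come from different training stages, so this is only
# a rough stand-in for repeated draws from one trained model.
printed_energies = [
    -4.120142919012315,
    -4.120207028500263,
    -4.119560076406688,
    -4.12024532814234,
    -4.119890676021925,
]

tolerance = 1e-4
summary = summarize_errors(printed_energies, exact_energy)

# Require every estimate, not just one, to be within the tolerance.
passes = summary["max_abs_error"] < tolerance

print(summary)
print("All estimates within tolerance:", passes)
```

Requiring the largest error to be below the tolerance is a deliberately strict choice; comparing the mean error against the tolerance would be a milder alternative.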

In [5]:
# Multiply the number of hidden units by 2
n_hin = n_hin * 2
n_vis = training_data.shape[1]


# n starts at 400 so that the first pass (n += 100) uses 500 data points
n = 400

epochs = 1000
num_samples = 2000

exact_energy = -4.1203519096
error = 1.0

while error > 0.0001:
    
    n += 100
    print("Number of Sample Data :", n)
    trimmed_trainingData = training_data[0:n]
    
    rbm = RBM(n_vis, n_hin)
    
    for e in range(1, epochs+1):
        rbm.train(trimmed_trainingData)
        
        if e % 100 == 0:
            
            init_state = torch.zeros(num_samples, n_vis)
            RBM_samples = rbm.draw_samples(100, init_state)

            energies = Rydberg_energy_calculator.energy(RBM_samples, rbm.wavefunction) 
            RBM_energy = energies.item()

            error = abs(RBM_energy - exact_energy)
            print("Error = {}   RBM Energy = {}  Exact Energy = {}".format(error, RBM_energy, exact_energy))
            
            # stop training early once the tolerance is met;
            # the while condition then ends the scan over n
            if error <= 0.0001:
                break

    
    

print("Minimum number of Data required to get error < 0.0001 = ", n)

Number of Sample Data : 500
Error = 0.9505218109094429   RBM Energy = -3.169830098690557  Exact Energy = -4.1203519096
Error = 0.2793773204268897   RBM Energy = -3.8409745891731104  Exact Energy = -4.1203519096
Error = 0.10330758350066382   RBM Energy = -4.017044326099336  Exact Energy = -4.1203519096
Error = 0.051779171079623154   RBM Energy = -4.068572738520377  Exact Energy = -4.1203519096
Error = 0.02856264163303468   RBM Energy = -4.091789267966965  Exact Energy = -4.1203519096
Error = 0.016848553297537094   RBM Energy = -4.103503356302463  Exact Energy = -4.1203519096
Error = 0.013667881244726843   RBM Energy = -4.106684028355273  Exact Energy = -4.1203519096
Error = 0.007329712248393072   RBM Energy = -4.113022197351607  Exact Energy = -4.1203519096
Error = 0.007061478694111223   RBM Energy = -4.113290430905889  Exact Energy = -4.1203519096
Error = 0.005879168156781134   RBM Energy = -4.114472741443219  Exact Energy = -4.1203519096
Number of Sample Data : 600
Error = 0.738810447

Error = 0.0016802927682171287   RBM Energy = -4.118671616831783  Exact Energy = -4.1203519096
Error = 0.0026946412822450583   RBM Energy = -4.117657268317755  Exact Energy = -4.1203519096
Error = 0.0021015302121059065   RBM Energy = -4.118250379387894  Exact Energy = -4.1203519096
Error = 0.002257766747058021   RBM Energy = -4.118094142852942  Exact Energy = -4.1203519096
Number of Sample Data : 1400
Error = 0.12407244657911454   RBM Energy = -3.9962794630208855  Exact Energy = -4.1203519096
Error = 0.022134941974789157   RBM Energy = -4.098216967625211  Exact Energy = -4.1203519096
Error = 0.007556216547389916   RBM Energy = -4.11279569305261  Exact Energy = -4.1203519096
Error = 0.0030277368767288593   RBM Energy = -4.117324172723271  Exact Energy = -4.1203519096
Error = 0.0031144796396400665   RBM Energy = -4.11723742996036  Exact Energy = -4.1203519096
Error = 0.0025371190315297554   RBM Energy = -4.11781479056847  Exact Energy = -4.1203519096
Error = 0.001990349337764208   RBM Ene

Error = 0.006146082087671978   RBM Energy = -4.114205827512328  Exact Energy = -4.1203519096
Error = 0.002092412885985162   RBM Energy = -4.118259496714015  Exact Energy = -4.1203519096
Error = 0.0014002294602875054   RBM Energy = -4.118951680139713  Exact Energy = -4.1203519096
Error = 0.0015159543405127707   RBM Energy = -4.118835955259487  Exact Energy = -4.1203519096
Error = 0.001103482020837987   RBM Energy = -4.119248427579162  Exact Energy = -4.1203519096
Error = 0.0009576301200135973   RBM Energy = -4.1193942794799865  Exact Energy = -4.1203519096
Error = 0.0004019778624533288   RBM Energy = -4.119949931737547  Exact Energy = -4.1203519096
Error = 0.0015823485833506012   RBM Energy = -4.1187695610166495  Exact Energy = -4.1203519096
Error = 0.0010415631254501179   RBM Energy = -4.11931034647455  Exact Energy = -4.1203519096
Number of Sample Data : 2300
Error = 0.037380606106696135   RBM Energy = -4.082971303493304  Exact Energy = -4.1203519096
Error = 0.005579485526621575   RBM

Error = 0.0012694670977921874   RBM Energy = -4.119082442502208  Exact Energy = -4.1203519096
Error = 0.0012166148933099308   RBM Energy = -4.11913529470669  Exact Energy = -4.1203519096
Error = 0.000970567936632527   RBM Energy = -4.119381341663368  Exact Energy = -4.1203519096
Number of Sample Data : 3100
Error = 0.017426790401880332   RBM Energy = -4.10292511919812  Exact Energy = -4.1203519096
Error = 0.0017743058046759685   RBM Energy = -4.118577603795324  Exact Energy = -4.1203519096
Error = 0.0016018579800221033   RBM Energy = -4.118750051619978  Exact Energy = -4.1203519096
Error = 0.0009902256021367961   RBM Energy = -4.119361683997863  Exact Energy = -4.1203519096
Error = 0.0007357754478558576   RBM Energy = -4.119616134152144  Exact Energy = -4.1203519096
Error = 0.0010638602159946942   RBM Energy = -4.119288049384005  Exact Energy = -4.1203519096
Error = 0.0007540067628930558   RBM Energy = -4.119597902837107  Exact Energy = -4.1203519096
Error = 0.0009186673850472005   RBM

Error = 0.0015514493751034308   RBM Energy = -4.118800460224897  Exact Energy = -4.1203519096
Error = 0.0014573008021026013   RBM Energy = -4.1188946087978975  Exact Energy = -4.1203519096
Error = 0.0007829884798828957   RBM Energy = -4.119568921120117  Exact Energy = -4.1203519096
Error = 0.0010507062710747306   RBM Energy = -4.119301203328925  Exact Energy = -4.1203519096
Error = 0.0009811279609985846   RBM Energy = -4.1193707816390015  Exact Energy = -4.1203519096
Error = 0.000675211719365798   RBM Energy = -4.119676697880634  Exact Energy = -4.1203519096
Error = 0.0009412556417034423   RBM Energy = -4.119410653958297  Exact Energy = -4.1203519096
Error = 0.0016208453694757097   RBM Energy = -4.118731064230524  Exact Energy = -4.1203519096
Number of Sample Data : 4000
Error = 0.007580850957591423   RBM Energy = -4.112771058642409  Exact Energy = -4.1203519096
Error = 0.001343387240253513   RBM Energy = -4.119008522359747  Exact Energy = -4.1203519096
Error = 0.0014726987249975565   

Error = 0.0007451416356536456   RBM Energy = -4.119606767964346  Exact Energy = -4.1203519096
Error = 0.0007801630778745405   RBM Energy = -4.1195717465221255  Exact Energy = -4.1203519096
Error = 0.0003937439767804918   RBM Energy = -4.11995816562322  Exact Energy = -4.1203519096
Number of Sample Data : 4800
Error = 0.0042322258881455355   RBM Energy = -4.1161196837118545  Exact Energy = -4.1203519096
Error = 0.0010158211847066667   RBM Energy = -4.119336088415293  Exact Energy = -4.1203519096
Error = 0.0006071254652111335   RBM Energy = -4.119744784134789  Exact Energy = -4.1203519096
Error = 0.0005143681320145532   RBM Energy = -4.1198375414679855  Exact Energy = -4.1203519096
Error = 0.0011665801692339883   RBM Energy = -4.119185329430766  Exact Energy = -4.1203519096
Error = 0.00025981783292650107   RBM Energy = -4.120092091767074  Exact Energy = -4.1203519096
Error = 0.000714297116269691   RBM Energy = -4.11963761248373  Exact Energy = -4.1203519096
Error = 0.0004817395122174162 

Error = 0.0006988708909245744   RBM Energy = -4.1196530387090755  Exact Energy = -4.1203519096
Error = 0.0004165490589622678   RBM Energy = -4.119935360541038  Exact Energy = -4.1203519096
Error = 0.0009488572873168621   RBM Energy = -4.119403052312683  Exact Energy = -4.1203519096
Error = 0.000813949800041236   RBM Energy = -4.119537959799959  Exact Energy = -4.1203519096
Error = 0.0007687296881631056   RBM Energy = -4.119583179911837  Exact Energy = -4.1203519096
Error = 0.0009328785201612178   RBM Energy = -4.119419031079839  Exact Energy = -4.1203519096
Error = 0.0006796035818847912   RBM Energy = -4.119672306018115  Exact Energy = -4.1203519096
Error = 0.0009551843330006804   RBM Energy = -4.119396725266999  Exact Energy = -4.1203519096
Number of Sample Data : 5700
Error = 0.0040614683407236285   RBM Energy = -4.1162904412592765  Exact Energy = -4.1203519096
Error = 0.0009624096869620402   RBM Energy = -4.119389499913038  Exact Energy = -4.1203519096
Error = 0.000973493736875497  

The issue here is the combination of a small number of hidden units (`n_hin = 2`) and a sample count of `num_samples = 2000`: under these settings the RBM needs a large number of training data points before it learns the distribution of the Rydberg data well enough to pass the $10^{-4}$ threshold. Increasing `num_samples` would reduce the statistical noise in the energy estimate and may reduce the number of data points required.
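To illustrate why the sample count matters, the sketch below shows how the standard error of a sample mean shrinks as the number of samples grows. It uses synthetic Gaussian numbers, not RBM output: the `spread` value and the helper `standard_error` are assumptions chosen only for illustration, and real RBM samples need not be Gaussian or independent.

```python
import math
import random
import statistics

def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Synthetic stand-in for per-sample energies (NOT output of the RBM):
# Gaussian numbers centred on the exact energy with an arbitrary spread.
rng = random.Random(0)
exact_energy = -4.1203519096
spread = 0.05  # hypothetical per-sample spread, chosen only for illustration

stderr_by_n = {}
for n in (2000, 8000, 32000):
    fake_energies = [rng.gauss(exact_energy, spread) for _ in range(n)]
    stderr_by_n[n] = standard_error(fake_energies)
    print("num_samples = {:6d}   standard error = {:.6f}".format(n, stderr_by_n[n]))
```

Each fourfold increase in the number of samples roughly halves the standard error, so reaching a $10^{-4}$ tolerance reliably may take considerably more than 2000 samples, depending on the actual spread of the sampled energies.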