# Search for Hyperparameters

The RBM does not train well for distances $r$ other than 1.2.

Here, we will try to find good hyperparameters for a larger number of cases.

Currently the hyperparameters available to us are the learning rate and the number of hidden units.

We can select for those which maximise accuracy, minimise time, or balance the two. In the results below, `accuracy` is reported as a percentage error, so lower values are better.

We will take a random sample of hyperparameters, as this has been shown to be more advantageous than a grid sweep (Bergstra and Bengio, 2012):
http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

In [1]:
from hyperparameter_helper import H2HyperparameterSearch
from variable_r_helper import RBMOnH2Train
import pickle as pkl
import numpy as np
import os
import torch
import pandas as pd
import matplotlib.pyplot as plt

The file hyperparameter_helper.py contains a class that runs a search over the given hyperparameters.
For each hyperparameter we need to specify the start, stop, and number of points, along with the total number of samples to draw when sampling randomly.
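
As a rough illustration of what such a search involves, the sketch below builds a grid for each hyperparameter from a `(start, stop, num)` tuple and draws random combinations from it. The function name `sample_hyperparams`, the use of `np.linspace`, and the seeding are assumptions made for illustration; the actual implementation in hyperparameter_helper.py is not shown here and may build or sample its grids differently.

```python
import numpy as np

def sample_hyperparams(lr_range, n_hdn_range, n_param_samples, seed=None):
    """Draw random (learning rate, hidden units) pairs from two grids.

    Hypothetical helper: each range is a (start, stop, num) tuple, in the
    same form as the ranges passed to the search class.
    """
    rng = np.random.default_rng(seed)
    # Candidate learning rates, evenly spaced between start and stop.
    lr_grid = np.linspace(*lr_range)
    # Candidate hidden-unit counts, rounded to whole numbers.
    n_hdn_grid = np.rint(np.linspace(*n_hdn_range)).astype(int)
    # Each hyperparameter is sampled independently, with replacement.
    lrs = rng.choice(lr_grid, size=n_param_samples)
    n_hdns = rng.choice(n_hdn_grid, size=n_param_samples)
    return list(zip(lrs.tolist(), n_hdns.tolist()))

pairs = sample_hyperparams((0.0001, 0.5, 100), (2, 20, 19),
                           n_param_samples=50, seed=0)
print(pairs[:3])
```

Because the pairs are drawn with replacement, the same combination can appear more than once; a real search might want to deduplicate them.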

In [3]:
lr_range = (0.0001, 0.5, 100)
n_hdn_range = (2, 20, 19)
n_param_samples = 50

hp_search = H2HyperparameterSearch(lr_range, n_hdn_range, epochs=500, n_samples=1000,
                                   n_param_samples=n_param_samples, verbose=True)
res = hp_search.search_hyperparams()

True energy: -1.0642022250418146
Params: lr = 0.005099, n_hdn = 2

Epoch:  100
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9232028177012065

Epoch:  200
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9381345501446641

Epoch:  300
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9504060631362554

Epoch:  400
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9579996288469593

Epoch:  500
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9830290390556717
Training finished.
Final energy: -1.0025880588597966
Accuracy: 5.79%
Time taken sampling: 0.45933103561401367
Params: lr = 0.22005599999999997, n_hdn = 14

Epoch:  100
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -1.1235875633664527

Epoch:  200
Sampling the RBM...
Done sampling. Calculating energy...
Energy from R

Unnamed: 0,accuracy,epochs,final_energy,molecule,n_hidden,n_samples,time,true_energy
0,5.789705,500.0,-1.002588,H2,2.0,1000.0,32.167159,-1.064202
1,1.157698,500.0,-1.051882,H2,14.0,1000.0,44.723088,-1.064202
2,1.900086,500.0,-1.084423,H2,19.0,1000.0,49.485407,-1.064202
3,1.403128,500.0,-1.079134,H2,13.0,1000.0,46.93185,-1.064202
4,2.602004,500.0,-1.091893,H2,18.0,1000.0,51.886073,-1.064202
5,2.864759,500.0,-1.033715,H2,7.0,1000.0,43.853511,-1.064202
6,1.172951,500.0,-1.076685,H2,12.0,1000.0,45.912742,-1.064202
7,5.145459,500.0,-1.11896,H2,18.0,1000.0,50.882051,-1.064202
8,0.680145,500.0,-1.07144,H2,6.0,1000.0,39.188137,-1.064202
9,3.516163,500.0,-1.101621,H2,10.0,1000.0,49.078314,-1.064202
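
The `accuracy` values in this table are consistent with the percentage error of the final energy relative to the true energy. A quick check of that reading, assuming this is how the helper defines the column (the function name `percent_error` is hypothetical):

```python
def percent_error(final_energy, true_energy):
    """Absolute error of final_energy as a percentage of |true_energy|."""
    return abs(final_energy - true_energy) / abs(true_energy) * 100.0

# Row 0 of the table above: final energy -1.002588, true energy -1.064202.
print(round(percent_error(-1.002588, -1.064202), 2))  # prints 5.79
```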


In [8]:
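# Load the saved results of the hyperparameter search.
f_path = "./H2_hyperparam_search_0.pkl"
with open(f_path, 'rb') as f:
    results = pkl.load(f)
# Keep the runs whose final energy is within 1% of the true energy.
good = results[results['accuracy'] < 1.0]
good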

Unnamed: 0,accuracy,epochs,final_energy,molecule,n_hidden,n_samples,time,true_energy
8,0.680145,500.0,-1.07144,H2,6.0,1000.0,39.188137,-1.064202
10,0.312067,500.0,-1.060881,H2,2.0,1000.0,32.441092,-1.064202
14,0.010858,500.0,-1.064318,H2,6.0,1000.0,38.519782,-1.064202
17,0.785453,500.0,-1.072561,H2,13.0,1000.0,44.493783,-1.064202
19,0.705266,500.0,-1.056697,H2,11.0,1000.0,41.997424,-1.064202
21,0.407654,500.0,-1.059864,H2,9.0,1000.0,42.81716,-1.064202
26,0.761419,500.0,-1.056099,H2,11.0,1000.0,41.780523,-1.064202
29,0.725197,500.0,-1.07192,H2,12.0,1000.0,43.83051,-1.064202
32,0.479381,500.0,-1.069304,H2,2.0,1000.0,32.087734,-1.064202
33,0.632317,500.0,-1.057473,H2,8.0,1000.0,43.207472,-1.064202
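
Filtering on the error alone ignores run time. One possible way to combine the two criteria is to rescale each column to [0, 1] and rank by a weighted sum. The sketch below does this for three rows copied from the table above. The function `rank_by_score`, the min-max scaling, and the equal weighting are illustrative assumptions, not part of the helper modules.

```python
import pandas as pd

def rank_by_score(df, weight=0.5):
    """Rank runs by a weighted sum of min-max scaled error and time.

    Lower scores are better. `weight` is the share given to the error;
    the remainder goes to the run time.
    """
    scaled = df[["accuracy", "time"]].astype(float)
    # Guard against division by zero when a column is constant.
    span = (scaled.max() - scaled.min()).replace(0, 1.0)
    scaled = (scaled - scaled.min()) / span
    score = weight * scaled["accuracy"] + (1 - weight) * scaled["time"]
    return df.assign(score=score).sort_values("score")

# Three rows copied from the table above (index, percentage error, time).
runs = pd.DataFrame(
    {"accuracy": [0.680145, 0.312067, 0.010858],
     "time": [39.188137, 32.441092, 38.519782]},
    index=[8, 10, 14],
)
print(rank_by_score(runs))
```

With equal weights this ranks the fast run with 2 hidden units (index 10) ahead of the most accurate run (index 14); raising `weight` shifts the ranking towards accuracy.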


In [2]:
lr_range = (0.005, 0.05, 100)
n_hdn_range = (2, 15, 14)
n_param_samples = 50

hp_search = H2HyperparameterSearch(lr_range, n_hdn_range, epochs=500, n_samples=1000,
                                   n_param_samples=n_param_samples, verbose=True)
results_fine = hp_search.search_hyperparams()

True energy: -1.0642022250418146
Params: lr = 0.00725, n_hdn = 3

Epoch:  100
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9073824847259213

Epoch:  200
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9534719771085887

Epoch:  300
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9642342542760198

Epoch:  400
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -0.9914683127888095

Epoch:  500
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -1.0084551509708022
Training finished.
Final energy: -0.9991613163095685
Accuracy: 6.11%
Time taken sampling: 0.9506816864013672


Params: lr = 0.0383, n_hdn = 14

Epoch:  100
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  -1.0641605930240368

Epoch:  200
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  

In [6]:
results_fine

Unnamed: 0,accuracy,epochs,final_energy,learning_rate,molecule,n_hidden,n_samples,time,true_energy
0,6.111706,500.0,-0.999161,0.00725,H2,3.0,1000.0,75.151241,-1.064202
1,0.900474,500.0,-1.073785,0.0383,H2,14.0,1000.0,110.868697,-1.064202
2,0.382266,500.0,-1.060134,0.01445,H2,6.0,1000.0,91.826155,-1.064202
3,0.855124,500.0,-1.073302,0.03425,H2,14.0,1000.0,114.484857,-1.064202
4,0.770478,500.0,-1.056003,0.0176,H2,5.0,1000.0,84.943738,-1.064202
5,1.619794,500.0,-1.08144,0.04145,H2,13.0,1000.0,100.926268,-1.064202
6,0.162429,500.0,-1.062474,0.03785,H2,2.0,1000.0,80.626601,-1.064202
7,0.660299,500.0,-1.071229,0.0419,H2,2.0,1000.0,77.645098,-1.064202
8,0.620135,500.0,-1.070802,0.0455,H2,6.0,1000.0,89.081964,-1.064202
9,0.780544,500.0,-1.055896,0.02615,H2,14.0,1000.0,116.199287,-1.064202


In [7]:
good_fine = results_fine[results_fine['accuracy'] < 0.25]
good_fine

Unnamed: 0,accuracy,epochs,final_energy,learning_rate,molecule,n_hidden,n_samples,time,true_energy
6,0.162429,500.0,-1.062474,0.03785,H2,2.0,1000.0,80.626601,-1.064202
25,0.22035,500.0,-1.061857,0.01715,H2,6.0,1000.0,124.238229,-1.064202
27,0.157101,500.0,-1.065874,0.03245,H2,7.0,1000.0,115.151458,-1.064202
31,0.12496,500.0,-1.062872,0.04325,H2,5.0,1000.0,111.111249,-1.064202
38,0.041679,500.0,-1.064646,0.02165,H2,5.0,1000.0,90.127171,-1.064202
40,0.24079,500.0,-1.06164,0.04325,H2,13.0,1000.0,113.513227,-1.064202
41,0.092873,500.0,-1.063214,0.0437,H2,2.0,1000.0,77.78558,-1.064202
48,0.009738,500.0,-1.064306,0.02525,H2,13.0,1000.0,116.229354,-1.064202


It looks like our best learning rates fall roughly between 0.017 and 0.044, and that the number of hidden units can be as high as 13, but is generally lower.

Good results can even be found with 2 hidden units.

Let's see if we can use these results to plot the energy profile for more values of $r$.

In [2]:
r_trainer = RBMOnH2Train(lr=0.02, verbose=True, epochs=1000)
results = r_trainer.train_for_all_r()
results

r = 0.2
True Energy: 0.1442108747311382

Epoch:  100
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.17046417400769812

Epoch:  200
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.1640083243417159

Epoch:  300
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.17188552696016063

Epoch:  400
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.11633451303921195

Epoch:  500
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.15912163029779733

Epoch:  600
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.15838919743490204

Epoch:  700
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.16087116769414755

Epoch:  800
Sampling the RBM...
Done sampling. Calculating energy...
Energy from RBM samples:  0.15410334912327037

Epoch:  900
Sampling the RBM...


Unnamed: 0,accuracy,epochs,final_energy,learning_rate,molecule,n_hidden,n_samples,time,true_energy
0,7.339718,1000.0,0.154796,0.02,H2,6.0,1000.0,212.850434,0.144211
1,5.768698,1000.0,-0.305249,0.02,H2,6.0,1000.0,257.395499,-0.323935
2,2.294543,1000.0,-0.598841,0.02,H2,6.0,1000.0,305.206533,-0.612904
3,2.599812,1000.0,-0.779698,0.02,H2,6.0,1000.0,320.414579,-0.80051
4,1.786193,1000.0,-0.908731,0.02,H2,6.0,1000.0,276.513867,-0.925258
5,2.0592,1000.0,-0.988231,0.02,H2,6.0,1000.0,200.771632,-1.009009
6,3.957378,1000.0,-1.02323,0.02,H2,6.0,1000.0,195.294003,-1.065391
7,1.083623,1000.0,-1.090352,0.02,H2,6.0,1000.0,196.147779,-1.102297
8,0.347578,1000.0,-1.121675,0.02,H2,6.0,1000.0,194.021504,-1.125587
9,0.353029,1000.0,-1.13492,0.02,H2,6.0,1000.0,190.772524,-1.138941
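
To turn a table like this into an energy profile, the final and true energies can be plotted against $r$. The sketch below assumes the $r$ values are available as a separate sequence, since the table shown here has no $r$ column. `plot_energy_profile` is a hypothetical helper, not part of variable_r_helper.py.

```python
import matplotlib.pyplot as plt

def plot_energy_profile(r_values, rbm_energies, true_energies, ax=None):
    """Plot RBM and true energies against the distance r; return the axes."""
    if ax is None:
        _, ax = plt.subplots()
    ax.plot(r_values, true_energies, "-", label="True energy")
    ax.plot(r_values, rbm_energies, "o", label="RBM energy")
    ax.set_xlabel("r")
    ax.set_ylabel("Energy")
    ax.legend()
    return ax

# Usage, assuming r_values lists the distances in the same order as the rows:
# ax = plot_energy_profile(r_values, results["final_energy"], results["true_energy"])
# plt.show()
```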
