In [1]:
import numpy as np
import torch
from torch import nn
%matplotlib notebook

import qucumber
from qucumber.nn_states import PositiveWaveFunction
from qucumber.callbacks import MetricEvaluator, LivePlotting
from qucumber.utils.data import load_data
from qucumber.utils import training_statistics as ts

from masked_rbm import MaskedBinaryRBM

qucumber.set_random_seed(161)

In [2]:
data, psi = load_data("../../QuCumber/examples/Tutorial1_TrainPosRealWaveFunction/tfim1d_data.txt", 
                      "../../QuCumber/examples/Tutorial1_TrainPosRealWaveFunction/tfim1d_psi.txt")

In [3]:
nn_state = PositiveWaveFunction(10, 10, gpu=False)
space = nn_state.generate_hilbert_space()
init_params = {k: v.clone() for k, v in nn_state.rbm_am.named_parameters()}  # save the initial weights for later

In [4]:
me = MetricEvaluator(50, {"fidelity": ts.fidelity}, 
                     target_psi=psi, verbose=True, space=space)
lp = LivePlotting(50, me, "fidelity")
cbs = [me, lp]
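
The metric tracked by the callbacks above is the fidelity between the trained state and the target wavefunction. As a rough illustration, here is a minimal NumPy sketch of the usual definition, the squared overlap of two normalized state vectors. The function below is a hypothetical stand-in and is not QuCumber's `ts.fidelity`; whether the library uses exactly this convention is an assumption. The amplitudes are made-up toy values, not the TFIM data.

```python
import numpy as np


def overlap_fidelity(psi_target, psi_model):
    """Squared overlap |<target|model>|^2 of two state vectors."""
    psi_target = np.asarray(psi_target, dtype=complex)
    psi_model = np.asarray(psi_model, dtype=complex)
    # Normalize both vectors so the result lies in [0, 1].
    psi_target = psi_target / np.linalg.norm(psi_target)
    psi_model = psi_model / np.linalg.norm(psi_model)
    # np.vdot conjugates its first argument, giving <target|model>.
    return float(np.abs(np.vdot(psi_target, psi_model)) ** 2)


# Toy two-state examples (hypothetical amplitudes).
same_state = overlap_fidelity([1.0, 1.0], [2.0, 2.0])   # same state up to scale
orthogonal = overlap_fidelity([1.0, 0.0], [0.0, 1.0])   # no overlap
print(same_state, orthogonal)
```

A fidelity of 1 means the two states agree up to normalization and a global phase, while 0 means they are orthogonal.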

We begin by training the unmasked model for 500 epochs to create a mask.

In [5]:
me.clear_history()  # clear the callback's internal history
nn_state.fit(data, lr=0.01, epochs=500, progbar="notebook", callbacks=cbs)

Epoch: 50	fidelity = 0.792118
Epoch: 100	fidelity = 0.904313
Epoch: 150	fidelity = 0.937336
Epoch: 200	fidelity = 0.958010
Epoch: 250	fidelity = 0.967849
Epoch: 300	fidelity = 0.972355
Epoch: 350	fidelity = 0.977644
Epoch: 400	fidelity = 0.982219
Epoch: 450	fidelity = 0.985747
Epoch: 500	fidelity = 0.988031



In [6]:
# for each parameter tensor, mask the 50% of entries with the smallest magnitudes
masks = {k: MaskedBinaryRBM.create_mask(v, p=0.5) for k, v in nn_state.rbm_am.named_parameters()}
rbm = MaskedBinaryRBM(init_params, masks, gpu=False)
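
The source of `MaskedBinaryRBM.create_mask` is not shown in this notebook, so the following is only a sketch of what a magnitude-based mask could look like, assuming `p` is the fraction of entries to prune. The helper `magnitude_mask` is a hypothetical name, and it is written with NumPy rather than torch to keep it short.

```python
import numpy as np


def magnitude_mask(values, p=0.5):
    """Return a 0/1 mask that zeroes the fraction p of entries with the smallest |value|."""
    values = np.asarray(values, dtype=float)
    n_prune = int(round(p * values.size))
    mask = np.ones(values.size)
    if n_prune > 0:
        # Indices sorted by increasing magnitude (stable sort for reproducible ties).
        order = np.argsort(np.abs(values).ravel(), kind="stable")
        mask[order[:n_prune]] = 0.0
    return mask.reshape(values.shape)


# Toy 2x2 weight matrix (hypothetical values).
w = np.array([[0.9, -0.1], [0.05, -0.7]])
mask = magnitude_mask(w, p=0.5)
print(mask)      # the entries -0.1 and 0.05 are pruned
print(w * mask)  # masked weights
```

Multiplying a parameter tensor by such a mask keeps the large-magnitude entries and sets the rest to zero.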

In [7]:
nn_state2 = PositiveWaveFunction(10, 10, gpu=False, module=rbm)

In [8]:
me.clear_history()
nn_state2.fit(data, lr=0.01, epochs=2000, progbar="notebook", callbacks=cbs)

Epoch: 50	fidelity = 0.476748
Epoch: 100	fidelity = 0.650215
Epoch: 150	fidelity = 0.834613
Epoch: 200	fidelity = 0.916550
Epoch: 250	fidelity = 0.947965
Epoch: 300	fidelity = 0.965222
Epoch: 350	fidelity = 0.973701
Epoch: 400	fidelity = 0.978057
Epoch: 450	fidelity = 0.980836
Epoch: 500	fidelity = 0.983402
Epoch: 550	fidelity = 0.985436
Epoch: 600	fidelity = 0.986067
Epoch: 650	fidelity = 0.986876
Epoch: 700	fidelity = 0.987009
Epoch: 750	fidelity = 0.987009
Epoch: 800	fidelity = 0.988250
Epoch: 850	fidelity = 0.988368
Epoch: 900	fidelity = 0.988404
Epoch: 950	fidelity = 0.988923
Epoch: 1000	fidelity = 0.989299
Epoch: 1050	fidelity = 0.989839
Epoch: 1100	fidelity = 0.990230
Epoch: 1150	fidelity = 0.990309
Epoch: 1200	fidelity = 0.990942
Epoch: 1250	fidelity = 0.991167
Epoch: 1300	fidelity = 0.991548
Epoch: 1350	fidelity = 0.991557
Epoch: 1400	fidelity = 0.991750
Epoch: 1450	fidelity = 0.992199
Epoch: 1500	fidelity = 0.992579
Epoch: 1550	fidelity = 0.992662
Epoch: 1600	fidelity = 0.992

Note that at the 500th epoch, the masked model's fidelity (0.983) is still slightly below that of the original (unmasked) model at the same epoch (0.988); it passes that value by epoch 800. We let the model train further to make sure the fidelity doesn't plateau, and to see how well the training converges. I found that the fidelity starts to plateau below 99% when 70% or more of the weights are pruned: 65% pruning gave a plateau around 99.2% fidelity, 70% pruning around 97.5%, 80% pruning around 86%, and 90% pruning around 57%.
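
To put the pruning fractions in perspective, the short sketch below tabulates how many weights remain at each fraction next to the plateau fidelities quoted above. It assumes the fraction is applied to the 10 x 10 weight matrix only (biases are ignored); the fidelity values are copied from the text, not recomputed.

```python
n_visible, n_hidden = 10, 10
n_weights = n_visible * n_hidden  # 100 weights in the RBM's weight matrix

# Pruning fraction -> plateau fidelity, as reported in the text above.
reported_plateau = {0.65: 0.992, 0.70: 0.975, 0.80: 0.86, 0.90: 0.57}

# Number of weights left after pruning (assumes only the weight matrix is counted).
remaining = {p: round((1 - p) * n_weights) for p in reported_plateau}

for p, fid in reported_plateau.items():
    print(f"pruned {p:.0%}: {remaining[p]:3d} weights left, plateau fidelity ~ {fid}")
```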

Next, we check the fidelity of the masked model when its weights are set to their initial values. We don't worry about the biases as they are, by default, set to 0 upon initialization of the model.

In [9]:
# reuse the saved initial weights and the mask computed for them above
nn_s = PositiveWaveFunction(10, 10, gpu=False)
nn_s.rbm_am.weights = nn.Parameter(init_params["weights"].clone() * masks["weights"])

In [None]:
ts.fidelity(nn_s, psi, space)

Compare to the fidelity of a model with identical initial parameters, but with no masking applied:

In [None]:
nn_s_unmasked = PositiveWaveFunction(10, 10, gpu=False)
nn_s_unmasked.rbm_am.weights = nn.Parameter(init_params["weights"].clone())
ts.fidelity(nn_s_unmasked, psi, space)

We see that masking improves the initial model. Lastly, we train this model for 2000 epochs to compare it against the masked model trained above.

In [None]:
me.clear_history()
nn_s.fit(data, lr=0.01, epochs=2000, progbar="notebook", callbacks=cbs)
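
Note that `nn_s` is built without `module=rbm`, so its weights are only masked at initialization. A plain gradient update will generally move the pruned entries away from zero unless the mask is re-applied. How `MaskedBinaryRBM` enforces its mask is not shown here; masking the gradient, as in this NumPy sketch with a made-up stand-in gradient, is one common approach and is an assumption on my part.

```python
import numpy as np

rng = np.random.default_rng(0)

mask = np.array([[1.0, 0.0], [0.0, 1.0]])  # 0 marks a pruned entry
w = rng.normal(size=(2, 2)) * mask         # masked initial weights (toy values)
grad = rng.normal(size=(2, 2))             # stand-in for a training gradient
lr = 0.01

# Plain update: pruned entries receive a gradient and drift away from zero.
w_plain = w - lr * grad

# Masked update: the gradient is zeroed on pruned entries, so they stay at zero.
w_enforced = w - lr * grad * mask

print(w_plain[mask == 0])     # generally nonzero
print(w_enforced[mask == 0])  # zeros
```

With the plain update, the network is effectively unmasked after the first step, even though it started from masked weights.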

At epoch 2000, the unmasked model seems to have won out slightly, but the fidelities are too close together to draw any solid conclusion about relative performance. Since the mask required 500 training epochs to produce, it is fairer to compare the 1500th epoch of the masked model with the 2000th epoch of the unmasked one, but these two values are also close together. Indeed, with the right random seed, the masked model sometimes wins out.

We *can*, however, conclude that even with only half the number of weights, the RBM is still able to reconstruct the desired wavefunction with about the same accuracy.