# Training an RBM on dummy data

In this notebook, we'll go through the basics of training a restricted Boltzmann machine (RBM). The input data is synthetic.

First, import the things we need:

In [1]:
import numpy as np
import torch
from RBM_helper import RBM
import gzip
import pickle

Define the number of training epochs. In the loop further down, each epoch is one call to `rbm.train` on the full dataset:

In [2]:
epochs = 400

Now, define the dummy data. It contains the bit strings `[1,0,1,0]` and `[0,1,0,1]` in equal proportion (1000 copies of each). After we train the RBM on this data, it should generate only these two strings, each about half of the time (if the training went well!).

In [3]:
data = np.array([[1,0,1,0]]*1000 + [[0,1,0,1]]*1000)
data = torch.FloatTensor(data)
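
As a quick sanity check, the composition of the dummy data can be verified before training. The sketch below rebuilds the same array in NumPy (without the torch conversion) and counts the distinct rows. The names `data_np`, `patterns` and `counts` are introduced here for illustration and are not part of the notebook.

```python
import numpy as np

# Same dummy data as above, kept as a NumPy array for the check
data_np = np.array([[1, 0, 1, 0]] * 1000 + [[0, 1, 0, 1]] * 1000)

# Distinct rows and how often each one occurs
patterns, counts = np.unique(data_np, axis=0, return_counts=True)

# Print each distinct row with its relative frequency
for pattern, count in zip(patterns, counts):
    print(pattern, count / len(data_np))
```

Each of the two strings should show up with a relative frequency of 0.5, which is the target distribution for the trained RBM.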

Now, define the RBM. We choose 4 visible units (because the input data is 4-dimensional) and 4 hidden units.

In [4]:
n_vis = data.shape[1]  # input dimension: one visible unit per bit
n_hid = n_vis  # set the number of hidden units to the number of visible units for now

rbm = RBM(n_vis, n_hid)
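
For reference, a binary RBM in its standard textbook form assigns an energy to each joint configuration of visible units $v$ and hidden units $h$, and the probability of a visible string follows from that energy. Whether `RBM_helper` uses exactly this parametrization is an assumption. The symbols $a$, $b$ and $W$ (visible biases, hidden biases and weights) are not names taken from the notebook.

```latex
E(v, h) = -\sum_{i=1}^{n_\text{vis}} a_i v_i
          - \sum_{j=1}^{n_\text{hid}} b_j h_j
          - \sum_{i=1}^{n_\text{vis}} \sum_{j=1}^{n_\text{hid}} v_i W_{ij} h_j,
\qquad
p(v) = \frac{1}{Z} \sum_{h} e^{-E(v, h)},
\qquad
Z = \sum_{v, h} e^{-E(v, h)}
```

Under this form, 4 visible and 4 hidden units would give a 4×4 weight matrix plus 4 + 4 biases, so 24 parameters in total.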

Now train it!

In [5]:
for epoch in range(1,epochs+1):
    rbm.train(data)
    if epoch % 50 == 0:
        print("Epoch: ",epoch)

Epoch:  50
Epoch:  100
Epoch:  150
Epoch:  200
Epoch:  250
Epoch:  300
Epoch:  350
Epoch:  400
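
The internals of `rbm.train` are not shown in this notebook. A common way to train an RBM is contrastive divergence, and the sketch below shows what a single CD-1 update could look like in plain NumPy. It is an illustration under that assumption, not the `RBM_helper` implementation. The function `cd1_step`, the learning rate `lr` and the parameter names `W`, `a`, `b` are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, a, b, v0, lr, rng):
    """One CD-1 update on a batch v0 of shape (n_samples, n_vis)."""
    # Positive phase: hidden probabilities and a binary sample given the data
    p_h0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step down to the visible layer and up again
    p_v1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b)
    # Approximate gradient: data statistics minus model statistics
    n = v0.shape[0]
    W = W + lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
    a = a + lr * (v0 - v1).mean(axis=0)
    b = b + lr * (p_h0 - p_h1).mean(axis=0)
    return W, a, b

rng = np.random.default_rng(0)
data_np = np.array([[1, 0, 1, 0]] * 1000 + [[0, 1, 0, 1]] * 1000, dtype=float)

# Hypothetical initialization: small random weights, zero biases
W = rng.normal(scale=0.01, size=(4, 4))
a = np.zeros(4)
b = np.zeros(4)

for _ in range(100):
    W, a, b = cd1_step(W, a, b, data_np, lr=0.1, rng=rng)
```

The sketch uses the whole dataset as one batch per update, which mirrors the loop above where `rbm.train(data)` receives the full tensor. Whether the helper splits the data into mini-batches internally is not visible from the notebook.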


Did the RBM learn from the data at all? Let's find out. Draw 10 samples from the RBM and print them. If the training was successful, every sample should be one of the two strings in the dummy data. You can also increase the number of samples and count how often each string appears. The two strings should appear about equally often, because they are equally likely in the input data.

In [6]:
num_samples = 10
k = 20 
init_state = torch.zeros(num_samples, n_vis)
rbm_samples = rbm.draw_samples(k, init_state)
print(rbm_samples.detach().numpy())   

[[0. 1. 0. 1.]
 [1. 0. 1. 0.]
 [1. 0. 1. 0.]
 [0. 1. 0. 1.]
 [0. 1. 0. 1.]
 [0. 1. 0. 1.]
 [1. 0. 1. 0.]
 [0. 1. 0. 1.]
 [0. 1. 0. 1.]
 [0. 1. 0. 1.]]


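Not bad! Every sample is one of the two training strings, so the RBM has learned which strings belong to the data. Ten samples (7 versus 3 here) are too few to judge whether the two strings are equally likely; that takes a larger sample.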
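
Counting how often each string appears, as suggested above, can be sketched with the standard library. The helper `count_patterns` is a hypothetical name introduced here, and the `samples` list is the printed output above copied by hand so that the block runs on its own.

```python
from collections import Counter

def count_patterns(samples):
    """Count how often each row occurs in a 2D collection of 0/1 values."""
    return Counter(tuple(int(x) for x in row) for row in samples)

# The ten samples printed above, copied by hand for illustration
samples = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
]

counts = count_patterns(samples)

# Print each string with its count and relative frequency
for pattern, n in counts.most_common():
    print(pattern, n, n / len(samples))
```

In the notebook itself, the same helper should accept `rbm_samples.detach().numpy()` directly. With `num_samples` raised to, say, 1000, both frequencies should move close to 0.5 if the training went well.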