# Restricted Boltzmann Machine

A Restricted Boltzmann Machine (RBM) is a shallow neural network that learns to reconstruct data in an unsupervised manner. The first layer is called the <b>Visible Layer</b> and the second layer is called the <b>Hidden Layer</b>. It is called restricted because the neurons within the same layer are not connected to each other. An RBM is a generative model: it specifies a probability distribution over a dataset of input vectors. Generative models can be used for both supervised and unsupervised tasks.
    
- In an unsupervised task, we design the model to estimate P(x), the probability of an input vector x.
- In a supervised task, we design the model to estimate P(x|y), the probability of x given y (the label of x).
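
To make the difference between the two quantities concrete, here is a minimal sketch that estimates P(x) and P(x|y) by simple counting. The toy dataset and the function names `p_x` and `p_x_given_y` are made up for illustration; an RBM learns these distributions through its weights rather than by counting.

```python
from collections import Counter

# Hypothetical toy dataset of (x, y) pairs: x is an observed pattern, y is its label.
data = [("00", "a"), ("01", "a"), ("01", "a"),
        ("11", "b"), ("01", "b"), ("11", "b")]

def p_x(data):
    """Estimate P(x) by counting patterns and ignoring labels (unsupervised view)."""
    counts = Counter(x for x, _ in data)
    return {x: c / len(data) for x, c in counts.items()}

def p_x_given_y(data, label):
    """Estimate P(x | y=label) by counting only the examples with that label."""
    subset = [x for x, y in data if y == label]
    counts = Counter(subset)
    return {x: c / len(subset) for x, c in counts.items()}

print("P(x):      ", p_x(data))
print("P(x|y='a'):", p_x_given_y(data, "a"))
```

In this toy data the pattern "01" makes up half of all examples, but two thirds of the examples labelled "a", so conditioning on the label changes the distribution.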

## Importing stuff

We import the required libraries, along with a helper Python script (`utils`) that will help us process the outputs.

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
from PIL import Image
from utils import tile_raster_images
import matplotlib.pyplot as plt
%matplotlib inline

The RBM has two layers (visible and hidden). Here we use 7 neurons for the visible layer and 2 neurons for the hidden layer. Each neuron in a layer has its own bias.

In [3]:
v_bias = tf.placeholder('float', [7]) # visible layer bias
h_bias = tf.placeholder('float', [2]) # hidden layer bias

The weight matrix between the visible and hidden layers therefore has shape 7x2.

In [5]:
W = tf.constant(np.random.normal(loc=0.0, scale=1.0,
                                 size=(7,2)).astype(np.float32))

An RBM has two phases:
- Forward pass
- Backward pass or Reconstruction
 
1) Forward Pass: <br>
In the forward pass, the model takes an input X at the visible nodes and passes it to the hidden nodes. At hidden node j, the input X is multiplied by the weights $W_{ij}$, summed over the visible units i, and added to h_bias. The result is then fed into a sigmoid function, which gives the output $P({h_j})$, where j is the unit number.

Here $P({h_j})$ represents the probability that hidden unit j is active. Taken together, these values form the probability distribution over the hidden units.

In [9]:
sess = tf.Session()
X = tf.constant([[1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]])
v_state = X
print("INPUT: ", sess.run(v_state))

h_bias = tf.constant([0.1,0.1])
print("h_bias: ", sess.run(h_bias))
print("W: ", sess.run(W))

# hidden layer output
h_prob = tf.nn.sigmoid(tf.matmul(v_state, W) + h_bias) 
print("P(h|v): ", sess.run(h_prob))

# Drawing samples from the distribution
h_state = tf.nn.relu(tf.sign(h_prob - tf.random_uniform(tf.shape(h_prob)))) # states
print("h0 states: ", sess.run(h_state))

INPUT:  [[1. 0. 0. 1. 0. 0. 0.]]
h_bias:  [0.1 0.1]
W:  [[-1.4318179  -1.6963902 ]
 [-0.22592606 -0.36571947]
 [-1.2760918   1.0116413 ]
 [-0.14385842 -0.3269168 ]
 [-1.6623821  -1.1617765 ]
 [ 0.05079454 -0.9133033 ]
 [ 1.6075933  -0.31550974]]
P(h|v):  [[0.18608138 0.12749326]]
h0 states:  [[0. 0.]]
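
The printed `P(h|v)` can be checked by hand. The sketch below repeats the forward pass in plain Python, using the `W`, `h_bias` and input values printed above; the helper names `sigmoid`, `hidden_probs` and `sample_states` are illustrative and not part of the notebook. It also spells out what `tf.nn.relu(tf.sign(h_prob - u))` does: a unit is set to 1 when its probability exceeds a uniform random number `u`, and to 0 otherwise.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_probs(v, W, h_bias):
    """P(h_j = 1 | v) = sigmoid(sum_i v_i * W[i][j] + h_bias[j])."""
    return [
        sigmoid(sum(v[i] * W[i][j] for i in range(len(v))) + h_bias[j])
        for j in range(len(h_bias))
    ]

def sample_states(probs, rng):
    """Set each unit to 1.0 with its given probability, else 0.0."""
    return [1.0 if p > rng.random() else 0.0 for p in probs]

# Values copied from the printed output above.
W = [[-1.4318179, -1.6963902],
     [-0.22592606, -0.36571947],
     [-1.2760918, 1.0116413],
     [-0.14385842, -0.3269168],
     [-1.6623821, -1.1617765],
     [0.05079454, -0.9133033],
     [1.6075933, -0.31550974]]
v = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
h_bias = [0.1, 0.1]

h_prob = hidden_probs(v, W, h_bias)
print("P(h|v):", h_prob)  # approximately [0.18608, 0.12749]

h_state = sample_states(h_prob, random.Random(0))  # seed chosen arbitrarily
print("h states:", h_state)
```

Because only visible units 0 and 3 are on, each hidden probability reduces to `sigmoid(W[0][j] + W[3][j] + 0.1)`. With both probabilities below 0.2, a sampled state of `[0. 0.]` is the most likely outcome, but other samples are possible.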


2) Backward Pass or Reconstruction: <br>

Now the hidden layer acts as the input to the model. That is, h becomes the input in the backward pass: using the same weight matrix (transposed) and the visible layer bias, the produced output tries to reconstruct the original input.

In [10]:
vb = tf.constant([0.1,0.2,0.1,0.1,0.1,0.2,0.1])
print("bias: ", sess.run(vb))
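# Note: h_state contains a random op, so each sess.run draws a fresh sample of it;
# the hidden states used here may differ from the ones printed in the previous cell.
v_prob = sess.run(tf.nn.sigmoid(tf.matmul(h_state, tf.transpose(W)) + vb))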
print("P(vi|h): ", v_prob)
v_state = tf.nn.relu(tf.sign(v_prob - tf.random_uniform(tf.shape(v_prob))))
print("v probability states: " , sess.run(v_state))

bias:  [0.1 0.2 0.1 0.1 0.1 0.2 0.1]
P(vi|h):  [[0.5249792  0.54983395 0.5249792  0.5249792  0.5249792  0.54983395
  0.5249792 ]]
v probability states:  [[1. 0. 1. 1. 1. 0. 1.]]
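
The reconstruction probabilities above can be checked the same way. This is a minimal plain-Python sketch, assuming the hidden state was `[0, 0]` when `v_prob` was evaluated (which is consistent with the printed values); `visible_probs` is an illustrative helper, not a function from the notebook.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def visible_probs(h, W, v_bias):
    """P(v_i = 1 | h) = sigmoid(sum_j h_j * W[i][j] + v_bias[i])."""
    return [
        sigmoid(sum(h[j] * W[i][j] for j in range(len(h))) + v_bias[i])
        for i in range(len(v_bias))
    ]

# Weights and visible bias copied from the printed outputs above.
W = [[-1.4318179, -1.6963902],
     [-0.22592606, -0.36571947],
     [-1.2760918, 1.0116413],
     [-0.14385842, -0.3269168],
     [-1.6623821, -1.1617765],
     [0.05079454, -0.9133033],
     [1.6075933, -0.31550974]]
vb = [0.1, 0.2, 0.1, 0.1, 0.1, 0.2, 0.1]

# Assumed hidden state: both units off, so the weights drop out entirely.
v_prob = visible_probs([0.0, 0.0], W, vb)
print("P(v|h=[0,0]):", v_prob)  # approximately 0.52498 where bias is 0.1, 0.54983 where it is 0.2

# A different hidden state brings the first column of W into play.
print("P(v|h=[1,0]):", visible_probs([1.0, 0.0], W, vb))
```

With every hidden unit off, the reconstruction depends only on the visible bias, which is why the printed probabilities are all `sigmoid(0.1)` or `sigmoid(0.2)` and carry no information about the original input. With randomly initialised, untrained weights this is expected; training adjusts `W` and the biases so that reconstructions resemble the data.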
