# Assignment 3

Restricted Boltzmann Machine (RBM) + Contrastive Divergence


# Restricted Boltzmann Machine (RBM) and Contrastive Divergence  

## Description  
We begin by defining the steps of the **Contrastive Divergence (CD)** algorithm, used for training an RBM. Below are the detailed steps to implement a generalized RBM.  

### Iterations  

1. **Initialization**  
   Start with an input vector $v^{(0)}$ taken from the training data.  

2. **Hidden Layer Calculation**  
   $$
   P(h_j^{(0)} \mid v^{(0)}) = \sigma\left(\sum_i v_i^{(0)} w_{ij} + b_j\right)
   $$  

3. **Visible Neurons Reconstruction**  
   $$
   P(v_i^{(1)} \mid h^{(0)}) = \sigma\left(\sum_j h_j^{(0)} w_{ij} + a_i\right)
   $$  

4. **Hidden Neurons Recalculation**  
   $$
   P(h_j^{(1)} \mid v^{(1)}) = \sigma\left(\sum_i v_i^{(1)} w_{ij} + b_j\right)
   $$  

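5. **Weights and Biases Update**  
   $$
   \Delta w_{ij} = \eta \left( v_i^{(0)} h_j^{(0)} - v_i^{(1)} h_j^{(1)} \right), \qquad
   \Delta a_i = \eta \left( v_i^{(0)} - v_i^{(1)} \right), \qquad
   \Delta b_j = \eta \left( h_j^{(0)} - h_j^{(1)} \right)
   $$  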

### Notes  
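- $\sigma$: Sigmoid activation function, $\sigma(x) = 1 / (1 + e^{-x})$.  
- $\eta$: Learning rate.  
- $w_{ij}$: Weight between visible neuron $i$ and hidden neuron $j$; $a_i$ and $b_j$ are the visible and hidden biases.  
- Steps 2-4 perform one pass of **Gibbs sampling** between the two layers (CD-1), which approximates the gradient during training. Repeating steps 3 and 4 $k$ times gives CD-$k$.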
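
For context, the update in step 5 can be read as an approximation to the gradient of the log-likelihood. The sketch below assumes the standard energy function for an RBM with binary neurons, which is not stated explicitly above, so treat it as an assumption about the model being trained.

$$
E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j
$$

$$
\frac{\partial \log P(v)}{\partial w_{ij}}
= \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}
\approx v_i^{(0)} h_j^{(0)} - v_i^{(1)} h_j^{(1)}
$$

CD replaces the intractable model expectation with statistics taken from the reconstruction after one (or $k$) Gibbs steps, which is why the update is only an approximation of the true gradient.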

## Implementation
First, we define the sigmoid function:

In [None]:
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
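
For inputs with a large negative value, `np.exp(-x)` can overflow and trigger a runtime warning. One possible alternative, sketched here, only ever evaluates the exponential on non-positive arguments. The name `stable_sigmoid` is illustrative and is not used elsewhere in this notebook.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that avoids overflow in np.exp (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    # exp(-|x|) always lies in [0, 1], so it cannot overflow
    z = np.exp(-np.abs(x))
    # x >= 0: 1 / (1 + exp(-x));  x < 0: exp(x) / (1 + exp(x))
    return np.where(x >= 0, 1 / (1 + z), z / (1 + z))

out = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(out)
```

For moderate inputs it agrees with `sigmoid`, so it could be swapped in without changing the rest of the code.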

Next, we define methods to calculate the probabilities for hidden and visible neurons:

In [None]:
# Calculate hidden neuron probabilities
def prop_up(v, W, b):
    return sigmoid(np.dot(v, W) + b)

# Calculate visible neuron probabilities
def prop_down(h, W, a):
    return sigmoid(np.dot(h, W.T) + a)
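
Because both helpers rely on `np.dot`, they accept either a single vector or a batch with one example per row. The following sketch checks the resulting shapes; the sizes (6 visible neurons, 3 hidden neurons, a batch of 4) and the seed are arbitrary values chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def prop_up(v, W, b):
    return sigmoid(np.dot(v, W) + b)

def prop_down(h, W, a):
    return sigmoid(np.dot(h, W.T) + a)

rng = np.random.default_rng(0)        # seed chosen arbitrarily
W = rng.normal(0, 0.01, (6, 3))       # shape (n_visible, n_hidden)
a, b = np.zeros(6), np.zeros(3)

v_single = rng.integers(0, 2, size=6)       # one binary input vector
v_batch = rng.integers(0, 2, size=(4, 6))   # four input vectors, one per row

h_single = prop_up(v_single, W, b)    # shape (3,)
h_batch = prop_up(v_batch, W, b)      # shape (4, 3)
v_recon = prop_down(h_batch, W, a)    # shape (4, 6)
print(h_single.shape, h_batch.shape, v_recon.shape)
```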

We also need a method to sample binary values from a probability distribution:

In [None]:
def sample_prob(probs):
    return np.random.binomial(1, probs)
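
As a quick check of `sample_prob`, probabilities of exactly 0 or 1 always give the same state, while intermediate values are reproduced only on average. The probabilities, seed, and number of draws below are made up for the demonstration.

```python
import numpy as np

def sample_prob(probs):
    return np.random.binomial(1, probs)

np.random.seed(0)  # arbitrary seed, only for repeatability
probs = np.array([0.0, 0.2, 0.8, 1.0])

# Draw 10,000 independent binary vectors and average them per neuron
samples = np.array([sample_prob(probs) for _ in range(10_000)])
freq = samples.mean(axis=0)
print(freq)  # approaches probs as the number of draws grows
```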

The Contrastive Divergence algorithm can then be implemented as follows:

In [None]:
def contrastive_divergence(v0, W, a, b, lr, k=1):
    # Step 1: v0 (real input data)
    # Step 2 (Hidden layer calculation)
    h_prob_0 = prop_up(v0, W, b)
    h_state = sample_prob(h_prob_0)

    for step in range(k):
        # Step 3 (Visible Neurons Reconstruction)
        v_prob = prop_down(h_state, W, a)
        v_state = sample_prob(v_prob)

        # Step 4 (Hidden neurons recalc)
        h_prob = prop_up(v_state, W, b)
        h_state = sample_prob(h_prob)

    # Step 5: Update weights and biases
    W += lr * (np.outer(v0, h_prob_0) - np.outer(v_state, h_prob))
    a += lr * (v0 - v_state)
    b += lr * (h_prob_0 - h_prob)

    # Mean squared reconstruction error, used only to monitor training
    error = np.mean((v0 - v_prob) ** 2)

    return W, a, b, error
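
The function above processes one example at a time through `np.outer`. A common variation is to average the update over a mini-batch. The sketch below shows one way to do that for CD-1; the name `cd1_batch`, the explicit `rng` argument, and the choice to average over the batch are assumptions, not part of the assignment code.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def cd1_batch(V0, W, a, b, lr, rng):
    """One CD-1 update averaged over a mini-batch V0 of shape (n, n_visible)."""
    n = V0.shape[0]
    # Positive phase: hidden probabilities and sampled states from the data
    h_prob_0 = sigmoid(V0 @ W + b)
    h_state = rng.binomial(1, h_prob_0)
    # Negative phase: reconstruct the visible layer, then recompute hidden
    v_prob = sigmoid(h_state @ W.T + a)
    v_state = rng.binomial(1, v_prob)
    h_prob_1 = sigmoid(v_state @ W + b)
    # Batch matrix products replace the per-example outer products
    W = W + lr * (V0.T @ h_prob_0 - v_state.T @ h_prob_1) / n
    a = a + lr * (V0 - v_state).mean(axis=0)
    b = b + lr * (h_prob_0 - h_prob_1).mean(axis=0)
    error = np.mean((V0 - v_prob) ** 2)
    return W, a, b, error

rng = np.random.default_rng(0)              # arbitrary seed
V0 = rng.integers(0, 2, size=(8, 6))        # toy batch: 8 examples, 6 visible
W = rng.normal(0, 0.01, (6, 3))
a, b = np.zeros(6), np.zeros(3)
W, a, b, error = cd1_batch(V0, W, a, b, lr=0.1, rng=rng)
```

With a batch of one example, the averaged update reduces to the same form as the per-example update.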

Finally, the training method for our RBM can be written as:

In [None]:
def train_rbm(x, n_visible, n_hidden, epochs=10, k=1, lr=0.01):
    # Weight initialization from Gaussian distribution
    W = np.random.normal(0, 0.01, (n_visible, n_hidden))
    a = np.zeros(n_visible)  # Visible bias
    b = np.zeros(n_hidden)   # Hidden bias

    for epoch in range(epochs):
        error = 0
        for v in x:
            W, a, b, cd_error = contrastive_divergence(v, W, a, b, lr, k)
            error += cd_error

        error /= len(x)
        print(f"Epoch {epoch}, Error: {error:.4f}")

    return W, a, b

The function returns `W`, `a`, and `b`: the learned weights, visible biases, and hidden biases.
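
Once trained, the parameters can be used to reconstruct an input by passing it up to the hidden layer and back down. The helper name `reconstruct` and the hand-picked parameters below are hypothetical; in practice `W`, `a`, and `b` would come from `train_rbm`.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def reconstruct(v, W, a, b):
    """Deterministic reconstruction using probabilities instead of samples."""
    h_prob = sigmoid(np.dot(v, W) + b)       # hidden probabilities given v
    return sigmoid(np.dot(h_prob, W.T) + a)  # visible probabilities given h

# Hand-picked stand-ins for learned parameters: 4 visible, 2 hidden neurons.
# Hidden neuron 0 is tied to visible neurons 0-1, hidden neuron 1 to 2-3.
W = np.array([[ 4.0, -4.0],
              [ 4.0, -4.0],
              [-4.0,  4.0],
              [-4.0,  4.0]])
a = np.zeros(4)
b = np.zeros(2)

v = np.array([1, 1, 0, 0])
v_recon = reconstruct(v, W, a, b)
print(np.round(v_recon, 3))
```

Using probabilities rather than sampled states keeps the reconstruction repeatable, which is convenient when comparing inputs and outputs by eye.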