# Generating Samples
----
### Introduction
Here we generate binary samples of length $v$ by Gibbs sampling from a distribution parameterized by a $v \times v$ matrix $W$, whose entries are drawn from a known distribution. The eventual goal is to recover (learn) $W$ from these samples using minimum probability flow (MPF).

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

%matplotlib inline

We consider a network with $v$ vertices, each of which is binary, so the network has $2^v$ possible states. We draw the initial state of the network from a Bernoulli distribution with success probability $p$, independently for each vertex.
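To make the $2^v$ count concrete, the state space can be enumerated directly for small $v$. This is only an illustrative sketch; the name `all_states` is hypothetical and does not appear elsewhere in the notebook.

```python
import itertools
import numpy as np

v = 5  # number of vertices, as in the text

# Every binary vector of length v, one per row.
all_states = np.array(list(itertools.product([0, 1], repeat=v)))
print(all_states.shape)  # (32, 5), i.e. 2**v rows
```

Enumeration like this is only practical for small $v$, which is one reason to sample instead.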

In [19]:
# Number of neurons:
v = 5
# Success probability:
p = 0.5

In [20]:
np.random.seed(0)
initialState = np.random.binomial(1, p, v)
print ('Initial states: ',initialState)

Initial states:  [1 1 1 1 0]


We now initialize a $v \times v$ matrix $W$ with each entry drawn from a standard normal distribution, $N(0,1)$, and then symmetrize it. The entry $W_{ij}$ is the parameter (weight) associated with the connection from unit $i$ to unit $j$. (Open question: can it be thought of as the conditional weight of $v^{(t+1)}_i=1$ given $v^{(t)}_j=1$?) We save $W$ so that the matrix learned by MPF can later be verified against it.

(Personal note: as we will see later, initializing $W$ with a zero diagonal makes the generation of samples easier.)
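The zero-diagonal idea in the note can be sketched as follows. This is a minimal illustration rather than the notebook's own initialization: it uses NumPy's `default_rng` generator (an assumption, so the values differ from the matrix printed below) and shows that `np.fill_diagonal` gives the same result as subtracting `W * np.eye(v)`.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: not the seed/generator used in the notebook
v = 5

W = rng.normal(0, 1, (v, v))
W = 0.5 * (W + W.T)            # symmetrize

# Masking by subtraction leaves W itself untouched.
W_masked = W - W * np.eye(v)

# In-place alternative: overwrite the diagonal with zeros.
np.fill_diagonal(W, 0.0)

print(np.allclose(W, W_masked))  # True
```

Zeroing the diagonal once at initialization avoids having to mask it on every update.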

In [6]:
np.random.seed(0)
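# Get a symmetric matrix (the diagonal is NOT zeroed here; it is masked out during the update)
W = np.random.normal(0, 1, (v,v))
W = 0.5 * (W + np.transpose(W))

# To save and load W matrix
# (np.save appends '.npy' to names without that extension, so use it explicitly)
np.save('W.npy', W)
# W = np.load('W.npy')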
print (W)

[[ 1.76405235 -0.28856034  0.56139078  1.28728376 -0.34271591]
 [-0.28856034  0.95008842  0.65145815  0.69543011  0.53210855]
 [ 0.56139078  0.65145815  0.76103773 -0.04174162  0.65414972]
 [ 1.28728376  0.69543011 -0.04174162  0.3130677  -0.79813038]
 [-0.34271591  0.53210855  0.65414972 -0.79813038  2.26975462]]


### How to do Gibbs Sampling
We use Gibbs sampling to generate samples $\mathcal{S}$ from known parameters $W$, and then use MPF to learn $W$ back from $\mathcal{S}$. To sample from this multivariate distribution, we start from an initial state chosen according to some prior belief and then repeatedly sample the state of a single **vertex** from its conditional distribution given the current states of all the other vertices. If the vertices are sampled sequentially, a network with $v$ vertices therefore requires $v$ draws, each from a different conditional distribution, to produce one new state of the network.
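For reference, the conditional distribution can be written out explicitly. This is a sketch under an assumption that the text does not state: that the network follows the energy-based form $P(\mathbf{x}) \propto \exp\left(\tfrac{1}{2}\mathbf{x}^\top W \mathbf{x}\right)$ with symmetric $W$, zero diagonal, and $x_j \in \{0,1\}$. Writing $\mathbf{x}_{-j}$ for the states of all vertices except $j$, the odds of $x_j = 1$ against $x_j = 0$ are $\exp\left(\sum_{k \neq j} W_{jk} x_k\right)$, so

$$
\mathbb{P}(X_j = 1 \mid \mathbf{x}_{-j}) = \frac{\exp\left(\sum_{k \neq j} W_{jk} x_k\right)}{1 + \exp\left(\sum_{k \neq j} W_{jk} x_k\right)} = \sigma\Big(\sum_{k \neq j} W_{jk} x_k\Big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}.
$$

Only the off-diagonal weights $W_{jk}$, $k \neq j$, enter this expression, which is why a zero diagonal is convenient.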

#### Algorithm: Gibbs sampler (cycle)
1. Initialize $\mathbf{x^{(0)}}=(x_1^{(0)},\ldots,x_v^{(0)})$ based on some prior belief.
2. For $i = 1,2, \ldots$
    - sample $X_1^{(i)}\sim \mathbb{P}(X_1^{(i)}=x_1^{(i)}\mid X_2=x_2^{(i-1)},\ldots,X_v=x_v^{(i-1)})$
    - sample $X_2^{(i)}\sim \mathbb{P}(X_2^{(i)}=x_2^{(i)}\mid X_1=x_1^{(i)},X_3=x_3^{(i-1)},\ldots,X_v=x_v^{(i-1)})$
    - in general, sample $X_j^{(i)}\sim \mathbb{P}(X_{j}^{(i)}=x_{j}^{(i)}\mid X_1=x_1^{(i)},\ldots, X_{j-1}=x_{j-1}^{(i)},X_{j+1}=x_{j+1}^{(i-1)},\ldots,X_v=x_v^{(i-1)})$ for $j=1,2, \ldots, v$; once all $v$ vertices have been updated, this gives a new state of the network.

Another variant of the Gibbs sampler, the **random scan**, updates the vertices in a random order rather than in a fixed cycle. The functions below implement the Gibbs sampler.

In [15]:
def sigmoid(x):
    """
    Returns the element-wise sigmoid activation of x.
    Input:
    - x: a numpy array
    """
    return 1 / (1 + np.exp(-x))


def single_unit_update(initialState, W, v):
    """
    Returns the new state of the network and the new state of vertex v, which is
    resampled conditioned on the other units. The input state is not modified.
    Input:
    - initialState: a numpy array of binary values denoting the initial state of the nodes.
    - W: a 2d numpy array of weights that parameterize the distribution.
    - v: (int) index of the vertex to be updated.
    """
    stateSize = initialState.shape[0]
    # Copy, so that the caller's array is not silently modified in place.
    newState = initialState.copy()
    # Updating a single vertex only uses the weights Wij for i not equal to j,
    # which is why a zero diagonal would have been convenient. Since the diagonal
    # of W was not zeroed earlier, it is masked out here.
    prob = sigmoid((W - (W * np.eye(stateSize))).dot(initialState))
    newState[v] = np.random.binomial(1, prob[v])
    return newState, newState[v]


def gibbs_sample(initialState, W):
    """
    Returns the new state of the network after updating all v units sequentially
    (one full cycle), given an initial state of the network and weight matrix W.
    Input:
    - initialState: a numpy array of binary values denoting the initial state of the nodes.
    - W: a 2d numpy array.
    """
    newState = initialState.copy()
    for i in range(newState.shape[0]):
        # Each update conditions on the most recent states of the other units.
        newState, _ = single_unit_update(newState, W, i)
    return newState


def ran_gibbs_sample(initialState, W):
    """
    Random-scan variant: returns the new state of the network after v single-unit
    updates, each applied to a vertex chosen uniformly at random (one possible
    reading of "random scan"; a vertex may be updated more than once or not at all).
    Input:
    - initialState: a numpy array of binary values denoting the initial state of the nodes.
    - W: a 2d numpy array.
    """
    newState = initialState.copy()
    stateSize = newState.shape[0]
    for _ in range(stateSize):
        i = np.random.randint(stateSize)
        newState, _ = single_unit_update(newState, W, i)
    return newState


def multi_gibbs_sample(initialState, W, n):
    """
    Runs n Gibbs cycles as a single chain starting from initialState
    and stores the state after each cycle as a row.
    Input:
    - initialState: a numpy array of binary values denoting the initial state of the nodes.
    - W: a 2d numpy array of weights that parameterize the distribution.
    - n: (int) number of samples to be drawn.
    """
    stateSize = initialState.shape[0]
    sample = np.zeros((n, stateSize))
    state = initialState
    for i in range(n):
        # Continue the chain from the previous sample.
        state = gibbs_sample(state, W)
        sample[i, :] = state
    return sample

In [17]:
s = multi_gibbs_sample(initialState, W, 10)

In [18]:
s

array([[ 1.,  1.,  1.,  0.,  1.],
       [ 0.,  1.,  1.,  0.,  1.],
       [ 0.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  0.],
       [ 1.,  0.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  0.],
       [ 0.,  1.,  0.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  0.]])
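
One way to sanity-check samples like these is to compare the empirical state frequencies with the exact distribution, which is tractable here because there are only $2^5 = 32$ states. The sketch below is self-contained and rests on assumptions not stated in the notebook: the distribution is taken to be $P(\mathbf{x}) \propto \exp\left(\tfrac{1}{2}\mathbf{x}^\top W \mathbf{x}\right)$ with a zero diagonal, and the names `exact_distribution` and `gibbs_chain`, the generator seed, and the chain length are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def exact_distribution(W):
    """Exact P(x) proportional to exp(0.5 * x^T W x) over all 2^v binary states.
    Assumes W is symmetric with a zero diagonal."""
    v = W.shape[0]
    states = np.array(list(itertools.product([0, 1], repeat=v)))
    log_weights = 0.5 * np.einsum('si,ij,sj->s', states, W, states)
    weights = np.exp(log_weights - log_weights.max())  # subtract max for stability
    return states, weights / weights.sum()


def gibbs_chain(W, n_sweeps, rng):
    """Runs n_sweeps full cycles of single-unit updates and records each state."""
    v = W.shape[0]
    x = rng.integers(0, 2, size=v)
    samples = np.empty((n_sweeps, v), dtype=int)
    for t in range(n_sweeps):
        for j in range(v):
            # Zero diagonal, so x[j] does not contribute to its own input.
            x[j] = rng.random() < sigmoid(W[j] @ x)
        samples[t] = x
    return samples


rng = np.random.default_rng(0)
v = 5
W = rng.normal(0, 1, (v, v))
W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)

states, p_exact = exact_distribution(W)
samples = gibbs_chain(W, 20000, rng)

# Index of each sample in itertools.product order (first unit is the most significant bit).
idx = samples @ (2 ** np.arange(v - 1, -1, -1))
p_emp = np.bincount(idx, minlength=2 ** v) / len(samples)

tv = 0.5 * np.abs(p_emp - p_exact).sum()
print('total variation distance:', tv)
```

A small total variation distance suggests the sampler is targeting the assumed distribution; with only 10 samples, as drawn above, the comparison would be too noisy to be informative.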