In [1]:
import numpy as np
import theano.tensor as T

Here we shall implement the cost function of Minimum Probability Flow

It can be shown that, for $E_{x}(W,b)=-\frac{1}{2}(x^TWx+bx)$ we have
$$E_x(W,b)-E_{x'}(W,b)=(1/2-x_h)(Wx+b)_h$$
where $x$ and $x'$ are data vectors with a [Hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) of one. The cost function of MPF denoted by $K(\theta)$ is given by:
$$K(\theta) = \frac{\epsilon}{|D|}\sum_{x\in D}\sum_{h=1}^{d}\exp\left[(1/2-x)_h(Wx+b)_h\right]$$
where $x$ is vector in the dataset $D$, $d$ is the dimension of the vector $x$ and $W, b$ are the weights and bias to be learnt respectively.

In [2]:
data = np.load('gibbs-sample.dat.npy')
print ('Shape of data: ', data.shape)

Shape of data:  (50000, 16)


Initialise parameters for $W$ and $b$

In [3]:
# Parameters
v = 16
epsilon = 0.01
D = data.shape[0]

In [4]:
W = np.random.rand(v, v)
b = np.random.rand(v, 1)
print ('Shape of W:', W.shape)
print ('Shape of b:', b.shape)

Shape of W: (16, 16)
Shape of b: (16, 1)


In [5]:
def Kcost(data, W, b):
    return np.sum(np.exp((0.5 - data) * data.dot(W.T) + b.T)) * (epsilon/D)

In [6]:
Kcost(data, W, b)

1.0261404989342755