Assignment 3

Implement an RBM from scratch and apply it to DSET3. You must implement the RBM itself in full (both the CD-1 training and the inference steps), but you are free to use library functions for everything else (e.g. image loading and management).

1. Train an RBM with a single hidden layer, choosing the number of hidden neurons yourself, on the MNIST data (use the training split provided by the website).

2. Use the trained RBM to encode a selection of test images (e.g. one per digit) as the corresponding activations of the hidden neurons.

3. Train a simple classifier (e.g. any simple classifier in scikit-learn) to recognize the MNIST digits, using as inputs the encodings obtained as in step 2. Use the standard training/test split. Show a performance metric of your choice in the presentation/handout.
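
For reference, the CD-1 step that the assignment asks for can be written roughly as follows. This is a sketch in common notation: the symbols $W$ (weights), $b$ (visible bias), $c$ (hidden bias), $\eta$ (learning rate) and $v'$ (the reconstruction after one Gibbs step) are naming assumptions, not part of the assignment text.

```latex
\begin{aligned}
p(h_j = 1 \mid v) &= \sigma\Big(c_j + \sum_i v_i W_{ij}\Big), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}, \\
p(v_i = 1 \mid h) &= \sigma\Big(b_i + \sum_j W_{ij} h_j\Big), \\
\Delta W_{ij} &= \eta \left( \langle v_i \, p(h_j = 1 \mid v) \rangle_{\text{data}}
               - \langle v'_i \, p(h_j = 1 \mid v') \rangle_{\text{recon}} \right).
\end{aligned}
```

The bias updates follow the same pattern, with the data and reconstruction averages of $v_i$ (visible) and of $p(h_j = 1 \mid \cdot)$ (hidden).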

In [2]:
import matplotlib.pyplot as plt
import idx2numpy
tr_images=idx2numpy.convert_from_file('./dataset/train-images.idx3-ubyte')
tr_labels=idx2numpy.convert_from_file('./dataset/train-labels.idx1-ubyte')
ts_images=idx2numpy.convert_from_file('./dataset/t10k-images.idx3-ubyte')
ts_labels=idx2numpy.convert_from_file('./dataset/t10k-labels.idx1-ubyte')

In [3]:
import numpy as np
def sigmoid(x):
    return 1/(1+np.exp(-x))
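
The plain formula above can emit an overflow warning from `np.exp(-x)` when `x` is a large negative number, although the returned value is still correct. The variant below is a minimal sketch of one common way to avoid that by branching on the sign of `x`; the name `stable_sigmoid` is hypothetical and the function assumes array input.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that only ever exponentiates non-positive values (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) <= 1, so the usual formula cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, rewrite as exp(x) / (1 + exp(x)); exp(x) < 1 here.
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

values = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
```

With the small pre-activations seen in this notebook the two versions should agree, so this is optional.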

In [4]:
class RBM:
    def __init__(self,visible_size,hidden_size):

        self.visible_bias= np.zeros(visible_size,dtype='float64')
        self.hidden_bias= np.zeros(hidden_size,dtype='float64')

        self.weights=np.random.normal(scale=0.01,size=(visible_size,hidden_size))
        print(f"building an RBM with {visible_size} visible units and {hidden_size} hidden units")

    def _sample(self,prob):
        return (prob > np.random.rand(*prob.shape)).astype(np.float64)
    def sample_hidden(self,v):
        ha_prob= sigmoid(v@self.weights+self.hidden_bias)
        ha_states= self._sample(ha_prob)
        return ha_prob,ha_states
    def sample_visible(self,h):
        recon_prob=sigmoid(h@self.weights.T+self.visible_bias)
        recon_act= self._sample(recon_prob)
        return recon_prob,recon_act
    def train(self,values,eta=0.01,epochs=100,batch_size=64):
        print(f"training started with {values.shape[0]} samples \nepochs={epochs}\t batch size={batch_size}\t learning rate={eta}")
        for e in range(epochs):
            for i in range(0,values.shape[0],batch_size):
                # clamp the (already binary) data to the visible units
                clamped_data= self._sample(values[i:i+batch_size])
                # the last batch may hold fewer than batch_size samples
                n_samples=clamped_data.shape[0]
                # positive phase: sample h given v
                ha_prob,ha_states=self.sample_hidden(clamped_data)
                # positive ("wake") statistics
                wake=clamped_data.T@ha_prob
                # negative phase: sample v given h, then recompute p(h|v)
                recon_prob,recon_act=self.sample_visible(ha_states)
                active_prob=sigmoid(recon_act@self.weights+ self.hidden_bias)
                # negative ("dream") statistics
                dream=recon_act.T@active_prob
                # average over the actual number of samples in this batch
                delta_w=(wake-dream)/n_samples
                delta_bh = (np.mean(ha_prob, axis=0) - np.mean(active_prob, axis=0))
                delta_bv = (np.mean(clamped_data, axis=0) - np.mean(recon_act, axis=0))

                self.weights+=eta*delta_w
                self.hidden_bias+=eta*delta_bh
                self.visible_bias+=eta*delta_bv
            clamped_data= self._sample(values)
            ha_prob,ha_states=self.sample_hidden(clamped_data)
            recon_prob,recon_act=self.sample_visible(ha_states)
            print(f"epoch no.{e+1} reconstruction error: {np.mean((clamped_data-recon_act)**2)}")
    def encode(self,data):
        #sample h given v
        _,ha_states=self.sample_hidden(data)
        return ha_states
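
Note that `encode` returns sampled binary states, so encoding the same image twice can give different codes. A deterministic alternative is to return the activation probabilities instead. The sketch below shows that idea on random stand-in data; `encode_prob` is a hypothetical helper, not part of the class above, and the shapes (784 visible, 50 hidden) are chosen only to match this notebook.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def encode_prob(data, weights, hidden_bias):
    """Hypothetical helper: return p(h=1|v) instead of sampled binary states."""
    return sigmoid(data @ weights + hidden_bias)

# Stand-in parameters and inputs (assumptions, not a trained model).
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.01, size=(784, 50))
hidden_bias = np.zeros(50)
batch = (rng.random((4, 784)) > 0.5).astype(np.float64)

codes = encode_prob(batch, weights, hidden_bias)
print(codes.shape)
```

Whether probabilities or samples work better as classifier inputs here has not been measured in this notebook.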

In [14]:
rbm=RBM(28*28,50)

training=tr_images.reshape((-1,28*28))
#binarize the data
training=(training>127).astype(np.float64)
rbm.train(training,
          eta=0.2,
          epochs=10,
          batch_size=64
          )

buildinig a RBM with 784 visible units and 50 hidden units
training started with 60000 samples and 10 epochs and batch size of 64
epoch no.1 reconstruction error: 0.09015204081632654
epoch no.2 reconstruction error: 0.08310070153061225
epoch no.3 reconstruction error: 0.08007946428571429
epoch no.4 reconstruction error: 0.07856607142857143
epoch no.5 reconstruction error: 0.07730799319727891
epoch no.6 reconstruction error: 0.07683022959183673
epoch no.7 reconstruction error: 0.07616315901360544
epoch no.8 reconstruction error: 0.07548830782312925
epoch no.9 reconstruction error: 0.07488426870748299
epoch no.10 reconstruction error: 0.07443101615646258


In [15]:
test=ts_images.reshape((-1,28*28))
test=(test>127).astype(np.float64)


In [21]:
h_train=rbm.encode(training)
h_test=rbm.encode(test)
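
The cell above encodes the full splits. For step 2 of the assignment (one test image per digit), one possible way to pick the images is to take the first occurrence of each label. The helper name `first_index_per_class` and the toy label array are assumptions made for illustration.

```python
import numpy as np

def first_index_per_class(labels, n_classes=10):
    """Hypothetical helper: index of the first sample of each class 0..n_classes-1.

    Raises IndexError if some class does not occur in `labels`.
    """
    labels = np.asarray(labels)
    return np.array([np.flatnonzero(labels == c)[0] for c in range(n_classes)])

# Toy labels standing in for ts_labels.
labels = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4, 8])
idx = first_index_per_class(labels)
print(idx)
```

In the notebook this would presumably be used as `rbm.encode(test[first_index_per_class(ts_labels)])`.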

In [30]:
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
# baseline: Gaussian naive Bayes on the raw binarized pixels
nb=GaussianNB().fit(training,tr_labels)
pred=nb.predict(test)
print(accuracy_score(ts_labels,pred))
# same classifier on the RBM hidden encodings
nb=GaussianNB().fit(h_train,tr_labels)
pred=nb.predict(h_test)
print(accuracy_score(ts_labels,pred))

0.5391
0.6894


In [28]:
from sklearn.linear_model import LogisticRegression
# baseline: logistic regression on the raw binarized pixels
# (the default max_iter=100 is not enough for the solver to converge here)
logreg=LogisticRegression().fit(training,tr_labels)
pred=logreg.predict(test)
print(accuracy_score(ts_labels,pred))
# same classifier on the RBM hidden encodings
logreg=LogisticRegression().fit(h_train,tr_labels)
pred=logreg.predict(h_test)
print(accuracy_score(ts_labels,pred))

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


0.917
0.8797


In [29]:
from sklearn.neighbors import KNeighborsClassifier
# baseline: 5-nearest-neighbours on the raw binarized pixels
neigh = KNeighborsClassifier(n_neighbors=5).fit(training, tr_labels)
pred=neigh.predict(test)
print(accuracy_score(ts_labels,pred))
# same classifier on the RBM hidden encodings
neigh=KNeighborsClassifier(n_neighbors=5).fit(h_train,tr_labels)
pred=neigh.predict(h_test)
print(accuracy_score(ts_labels,pred))

0.958
0.9218
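
Accuracy alone hides which digits get confused with which. As a possible complement, the sketch below builds a confusion matrix and per-class recall with plain NumPy on toy predictions; `confusion_matrix_np` is a hypothetical helper, and `sklearn.metrics.confusion_matrix` provides the same counts.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, n_classes=10):
    """Hypothetical helper: cm[i, j] counts samples of true class i predicted as j."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

# Toy labels and predictions standing in for ts_labels and pred.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 2, 2, 2, 1, 0])

cm = confusion_matrix_np(y_true, y_pred, n_classes=3)
per_class_recall = cm.diagonal() / cm.sum(axis=1)
accuracy = cm.trace() / cm.sum()
print(cm)
print(per_class_recall)
```

In the notebook, `y_true` and `y_pred` would be replaced by `ts_labels` and the `pred` array of whichever classifier is being reported.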
