# Neural Networks

In this notebook we look at the MNIST dataset and classify its handwritten digits with a neural network. The Python library we are going to use is PyBrain.

For a more detailed description of the problem, see http://martin-thoma.com/classify-mnist-with-pybrain/

In [None]:
import gzip
from numpy import zeros, uint8, ravel

import pylab as plt
from pylab import imshow, show, cm

from pybrain.datasets import ClassificationDataSet
from pybrain.utilities import percentError
from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure.modules import SoftmaxLayer

import os.path
import numpy as np
import idx2numpy

%matplotlib inline  


## Data

We are going to work with the MNIST dataset (http://yann.lecun.com/exdb/mnist/) of handwritten digits. Each sample is a small 28x28 grayscale image, so we first define a function to visualize one.

In [None]:
# View a single image, possibly with the associated label
def view_image(image, label=""):
    print("Label: %s" % label)
    imshow(image, cmap=cm.gray)
    show()

Now we want to load our data. They come already split into a training set and a test set.

In [None]:
# Get test set

images = gzip.open('t10k-images-idx3-ubyte.gz', 'rb')
labels = gzip.open('t10k-labels-idx1-ubyte.gz', 'rb')

# sample size
rows=28
cols=28

# build a dictionary for the data
testing = {'x':idx2numpy.convert_from_file(images), 'y':idx2numpy.convert_from_file(labels), 'rows':rows, 'cols':cols}

print("Number of test samples: %i" % len(testing['x']))



Let's use the same procedure for training samples. For time reasons, we won't use all the training data (60K) but only a subset (1000).

In [None]:
Ntrain = 1000

images = gzip.open('train-images-idx3-ubyte.gz', 'rb')
labels = gzip.open('train-labels-idx1-ubyte.gz', 'rb')

rows=28
cols=28
training = {'x':idx2numpy.convert_from_file(images), 'y':idx2numpy.convert_from_file(labels), 'rows':rows, 'cols':cols}

# shuffle the sample indices so that the subset is drawn at random
idx = np.random.permutation(len(training['x']))

training['x'] = training['x'][idx[:Ntrain], :, :]
training['y'] = training['y'][idx[:Ntrain]]

    
print("Number of training samples: %i" % len(training['x']))

Now we can visualize some of the samples. Change the index to see different digits.

In [None]:
index = 109
view_image(training['x'][index], label=training['y'][index])

In [None]:
plt.figure(figsize=(14, 12))
nrows = 2
ncols = 4
# pick nrows * ncols random samples to display
idx = np.random.permutation(len(training['x']))
for k in range(nrows * ncols):
    # subplot positions are 1-based
    plt.subplot(nrows, ncols, k + 1)
    plt.imshow(training['x'][idx[k]], cmap=cm.gray)


## Building a Neural Network

Let's first have a look at the parameters we need to build the network and train the classifier.

In [None]:
# number of neurons in the hidden layer, which extracts features of the input
hidden_neurons = 200

# number of iterations over the dataset to train the network
epochs = 10

# how much an updating step influences the current value of the weights
learning_rate = 0.01

# factor by which the learning rate is multiplied after each training step
# (1 means the learning rate stays constant)
lrdecay = 1

# how much the weights are reduced after each update
weightdecay = 0.01

# adds a fraction of the previous weight update to the current one
momentum = 0.1
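
To make the roles of these parameters more concrete, here is a minimal sketch of a single gradient-descent update on one weight. It is an illustration under assumptions, not PyBrain's actual implementation: the function `sgd_step`, the toy loss and the exact form of the update are hypothetical, chosen to show the usual textbook interplay of learning rate, momentum and weight decay.

```python
def sgd_step(weight, gradient, velocity, learning_rate, momentum, weightdecay):
    """One illustrative update of a single weight (hypothetical helper)."""
    # weight decay adds a pull towards zero, proportional to the weight itself
    total_gradient = gradient + weightdecay * weight
    # momentum keeps a fraction of the previous step and adds the new one
    velocity = momentum * velocity - learning_rate * total_gradient
    return weight + velocity, velocity


weight, velocity = 1.0, 0.0
learning_rate, lrdecay = 0.01, 1.0
for step in range(3):
    # toy loss 0.5 * weight**2, whose gradient is simply the weight
    weight, velocity = sgd_step(weight, weight, velocity, learning_rate,
                                momentum=0.1, weightdecay=0.01)
    # multiply the learning rate by lrdecay after each step (1.0 = no decay)
    learning_rate *= lrdecay
    print("step %d: weight = %.6f" % (step, weight))
```

With a larger `momentum` the weight keeps moving in the direction of earlier steps, while a larger `weightdecay` shrinks it faster towards zero.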


Converting dataset for PyBrain usage

In [None]:
# how many features?
input_features = testing['rows'] * testing['cols']
print("Input features: %i" % input_features)

# MNIST has 10 classes (digits 0 to 9)
classes = 10
# build datasets with PyBrain
train_data = ClassificationDataSet(input_features, 1, nb_classes=classes)
test_data = ClassificationDataSet(input_features, 1, nb_classes=classes)

# add samples to training and test set
for i in range(len(testing['x'])):
    test_data.addSample(ravel(testing['x'][i]), [testing['y'][i]])
for i in range(len(training['x'])):
    train_data.addSample(ravel(training['x'][i]), [training['y'][i]])

# convert the targets to a one-of-many (one output per class) encoding;
# the original integer labels are kept in the 'class' field
train_data._convertToOneOfMany()
test_data._convertToOneOfMany()
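
As an illustration of what the one-of-many conversion produces, the sketch below turns integer labels into target rows with a single 1 at the position of the class. The helper `one_of_many` is hypothetical and written in plain Python; it only mimics the idea and is not PyBrain code.

```python
def one_of_many(labels, nb_classes):
    """Turn integer class labels into one-of-many target rows (hypothetical helper)."""
    targets = []
    for label in labels:
        row = [0] * nb_classes
        # only the output that corresponds to the class is switched on
        row[int(label)] = 1
        targets.append(row)
    return targets


print(one_of_many([3, 0], nb_classes=5))
# prints [[0, 0, 0, 1, 0], [1, 0, 0, 0, 0]]
```

For MNIST each target row has 10 entries, which is why the network below gets one output neuron per digit.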



Building the network and performing classification

In [None]:
# building the network!
net = buildNetwork(train_data.indim, hidden_neurons, train_data.outdim, outclass=SoftmaxLayer)

# backpropagation trainer
trainer = BackpropTrainer(net, dataset=train_data, momentum=momentum,
                          verbose=False, weightdecay=weightdecay,
                          learningrate=learning_rate,
                          lrdecay=lrdecay)

# training and testing the network
for i in range(epochs):
    trainer.trainEpochs(1)
    # percentage of misclassified samples on the training and test sets
    train_res = percentError(trainer.testOnClassData(),
                             train_data['class'])
    test_res = percentError(trainer.testOnClassData(dataset=test_data),
                            test_data['class'])

    print("epoch: %i" % trainer.totalepochs)
    print("train error: %.2f%%" % train_res)
    print("test error: %.2f%%" % test_res)
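
The sketch below shows, in plain Python, the two ideas behind the numbers printed above: a softmax output turns the raw scores of the last layer into class probabilities, and the error is the percentage of samples whose predicted class differs from the true one. The helpers `softmax`, `predict_class` and `percent_error` are hypothetical stand-ins written for illustration, not the PyBrain functions themselves.

```python
import math


def softmax(scores):
    """Map raw output scores to probabilities that sum to 1."""
    # subtracting the maximum keeps exp() from overflowing
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=lambda k: probabilities[k])


def percent_error(predicted, actual):
    """Percentage of samples whose predicted class is wrong."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return 100.0 * wrong / len(actual)


print(predict_class(softmax([0.1, 2.0, -1.0])))   # prints 1
print(percent_error([1, 2, 3, 4], [1, 2, 0, 4]))  # prints 25.0
```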


What do you think would happen if we used more training data? 

If you have time, build a cross-validation step to see how much the error changes when the network is trained and evaluated on different chunks of the data.
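
As a possible starting point, here is a minimal sketch of how the sample indices could be split into folds. The generator `kfold_indices`, its parameters and the choice of 5 folds are assumptions made for illustration; building the PyBrain datasets from each split and training on them is left to you.

```python
import random


def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train, validation) index lists, one pair per fold (hypothetical helper)."""
    indices = list(range(n_samples))
    # shuffle once so that every fold is a random chunk of the data
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        validation = folds[k]
        # all remaining folds together form the training part
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, validation


for train_idx, val_idx in kfold_indices(n_samples=1000, n_folds=5):
    print("train: %d samples, validation: %d samples"
          % (len(train_idx), len(val_idx)))
```

Each sample ends up in the validation part exactly once, so averaging the validation error over the folds gives an estimate that depends less on one particular split.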