In [1]:
import cPickle as pickle
import gzip
import numpy as np
import nnetwork as nn

The MNIST data set can be found, for example, on this webpage:
http://deeplearning.net/tutorial/gettingstarted.html
Here, we simply store the data file in the datasets directory.

In [2]:
fp = './datasets/mnist.pkl.gz'
with gzip.open(fp, 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f)

We need to transform the target values into a one-hot encoding.
For this, we define a small helper function.

In [3]:
def one_hot_encoding(targets):
    n_examples = targets.shape[0]
    one_hot_targets = np.zeros((n_examples, 10))
    idx = np.arange(n_examples)
    one_hot_targets[idx, targets] = 1.
    return one_hot_targets

target_train = one_hot_encoding(train_set[1])
target_valid = one_hot_encoding(valid_set[1])
target_test = one_hot_encoding(test_set[1])
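
To make the indexing trick in the helper more concrete, here is a small self-contained sketch on three toy labels. The n_classes argument is an addition for illustration and is not part of the helper above.

```python
import numpy as np

def one_hot_encoding(targets, n_classes=10):
    # Same idea as the helper above; n_classes is a hypothetical extra argument.
    n_examples = targets.shape[0]
    one_hot_targets = np.zeros((n_examples, n_classes))
    # Row i gets a 1 in column targets[i] (integer array indexing).
    one_hot_targets[np.arange(n_examples), targets] = 1.
    return one_hot_targets

labels = np.array([3, 0, 9])
encoded = one_hot_encoding(labels)
# Taking the argmax of each row recovers the original labels.
recovered = np.argmax(encoded, axis=1)
```

Each row of encoded contains exactly one 1, so the encoding can always be inverted with argmax.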

We are going to implement a very basic loop for hyperparameter tuning of the L2 regularisation.
In each iteration of the loop, we create a neural network with a different l2 value and train it until the stopping criterion is met: training stops when the loss after an epoch has improved on the previous epoch's loss by less than the value of the precision argument, which we pass to the train function of the neural network. We then evaluate the accuracy on the validation set and choose the best l2 value based on this metric. (We could also use the loss on the validation set.)
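
The stopping rule described above could look roughly like the following sketch. It is not the code of the nnetwork module: train_one_epoch and max_epochs are hypothetical names, and we assume that the improvement is measured as an absolute difference between consecutive losses.

```python
def train_until_converged(train_one_epoch, precision=0.005, max_epochs=1000):
    """Run epochs until the loss improves by less than `precision`.

    train_one_epoch is a hypothetical callable that performs one epoch
    of training and returns the loss after it.
    """
    previous_loss = float('inf')
    losses = []
    for epoch in range(max_epochs):
        loss = train_one_epoch()
        losses.append(loss)
        # Stop once the absolute improvement drops below the threshold.
        # This also stops if the loss got worse (negative improvement).
        # Assumption: the real criterion might be relative instead.
        if previous_loss - loss < precision:
            break
        previous_loss = loss
    return losses

# Toy stand-in for an epoch: a loss that decays geometrically towards 1.0.
state = {'loss': 16.0}

def fake_epoch():
    state['loss'] = 1.0 + 0.5 * (state['loss'] - 1.0)
    return state['loss']

history = train_until_converged(fake_epoch, precision=0.005)
```

A smaller precision lets the training run for more epochs; the max_epochs cap is only a safeguard against a loop that never converges.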

To initialise the network, we also have to define the number of layers and the number of nodes in each of them. Note that only DenseLayers are implemented so far. We use a very basic network to test our implementation: the input layer is connected directly to the output layer.

The parameters of the network are initialised randomly from a Gaussian distribution with mean = 0. and sigma = 0.5.
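
As an illustration only, the initialisation for a 784-to-10 network might be sketched like this. The function init_weights, the fixed seed and the omission of bias terms are assumptions made for the example; the nnetwork module may organise its parameters differently.

```python
import numpy as np

def init_weights(layer_sizes, sigma=0.5, seed=0):
    # One weight matrix per pair of consecutive layers,
    # drawn from a Gaussian with mean 0 and standard deviation sigma.
    rng = np.random.RandomState(seed)
    return [rng.normal(0., sigma, size=(n_out, n_in))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

weights = init_weights([784, 10])
x = np.zeros((784, 5))       # five dummy examples, shape (n_features, n_examples)
scores = weights[0].dot(x)   # shape (10, 5): one score per class and example
```

With two layers there is exactly one parameter matrix, of shape (10, 784).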

Before training, we subtract the mean of the training set from the data. (We need to take the transpose to match our notation here: the input needs the shape (n_features, n_examples).)

In [4]:
mean = np.mean(train_set[0], axis=0)
x_train = (train_set[0] - mean).T
x_valid = (valid_set[0] - mean).T
x_test = (test_set[0] - mean).T
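
The effect of this preprocessing can be checked on synthetic data. The sketch below uses random arrays as a stand-in for MNIST (an assumption, so that it runs without the data file) and verifies the shapes and the centring.

```python
import numpy as np

rng = np.random.RandomState(0)
train = rng.rand(100, 784)   # synthetic stand-in, shape (n_examples, n_features)
valid = rng.rand(20, 784)

mean = np.mean(train, axis=0)   # per-pixel mean over the training examples
x_train = (train - mean).T      # shape (784, 100)
x_valid = (valid - mean).T      # centred with the training mean, shape (784, 20)

# After centring, every feature of the training data has mean close to zero.
max_abs_train_mean = np.abs(x_train.mean(axis=1)).max()
```

Note that the validation and test data are centred with the mean of the training set, so the three sets go through exactly the same transformation.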

In [5]:
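n_layers = np.array([784, 10])  # 784 input pixels, 10 output classes
l2s = [0.01, 0.1, 1.]           # candidate l2 regularisation strengths
best_acc = 0.
best_net = None
for l2 in l2s:
    # train a fresh network for each l2 value
    net = nn.NeuralNet(n_layers, l2)
    net.train(x_train, target_train.T, precision=0.005)
    # accuracy (in percent) on the validation set
    y_pred = net.predict(x_valid)
    acc = 100 * np.mean(y_pred == valid_set[1])
    print 'final validation accuracy: {}'.format(acc)
    # keep the network with the best validation accuracy so far
    if acc >= best_acc:
        best_net = net
        best_acc = acc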



NEURAL NETWORK created!
2 layers in total (including In- and Output-layer)
1 parameter matrices

0 1.2 16.725429711608854
10 1.2 2.671846101435424
20 1.2 1.766095572005696
30 1.2 1.4600830873914332
40 1.2 1.3014568377464524
50 1.2 1.2024202124167436
60 1.2 1.1337871998275162
Final Cost: 1.1016339929932715
final validation accuracy: 85.8

NEURAL NETWORK created!
2 layers in total (including In- and Output-layer)
1 parameter matrices

0 1.2 15.152282475275971
10 1.2 2.620502120372509
20 1.2 1.7489864952331369
30 1.2 1.456645764689112
40 1.2 1.3039962257359063
50 1.2 1.2077457756605023
60 1.2 1.1404189546883947
Final Cost: 1.108666011691292
final validation accuracy: 85.78

NEURAL NETWORK created!
2 layers in total (including In- and Output-layer)
1 parameter matrices

0 1.2 16.524092187516043
10 1.2 2.687052534857849
20 1.2 1.7919482691303896
30 1.2 1.490592703839368
40 1.2 1.3327939255476424
50 1.2 1.2332068682435917
60 1.2 1.1635654428268551
Final Cost: 1.125770833429142
final validat

Now that we have trained the networks and found the best value of the l2 parameter, we would like to evaluate the final accuracies on the training, validation and test sets.
For this, we simply use the predict function of the neural network:

In [6]:
y_pred = best_net.predict(x_train)
print 'final train accuracy: {}'.format(100 * np.mean(y_pred == train_set[1]))

final train accuracy: 83.792


In [7]:
y_pred = best_net.predict(x_valid)
print 'final validation accuracy: {}'.format(100 * np.mean(y_pred == valid_set[1]))

final validation accuracy: 85.8


In [8]:
y_pred = best_net.predict(x_test)
print 'final test accuracy: {}'.format(100 * np.mean(y_pred == test_set[1]))

final test accuracy: 85.11
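
The accuracy expression is repeated for each of the three data sets. A small helper, sketched here under the hypothetical name accuracy, would avoid the duplication. It assumes, as the comparisons above imply, that predict returns integer class labels.

```python
import numpy as np

def accuracy(y_pred, labels):
    # Percentage of predictions that match the integer class labels.
    return 100. * np.mean(np.asarray(y_pred) == np.asarray(labels))

# Toy check: three of the four predictions are correct.
acc = accuracy(np.array([1, 2, 3, 4]), np.array([1, 2, 0, 4]))
```

With this helper, the test accuracy would be computed as accuracy(best_net.predict(x_test), test_set[1]).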


We see that our implementation of neural networks reaches a decent accuracy on hand-written digit recognition. This is a good indication that the implementation, including the backpropagation, is correct.

It is surprising that simply connecting the input layer to the output layer already gives an accuracy of about 85% on the test set.