
# Handwritten digit recognition using 2-Layer NN

The input dataset `digits` consists of 8x8 pixel images of handwritten digits. The goal is to classify each image as one of the digits 0 to 9.

* Since an input image has 8 x 8 pixels, the dataset has 8 x 8 = 64 features in total.
* We assume that the single hidden layer has m neurons.
* Since there are 10 digits, the output layer (the second layer) consists of 10 neurons.

Therefore the weight matrices for the two layers have shapes (64, m) and (m, 10).
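
The shapes above can be checked with a small NumPy forward pass. This is only a sketch under assumptions: the value `m = 16`, the random inputs and weights, and the choice of ReLU and softmax are picked for illustration and are not taken from the experiment below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 samples, 64 features, m = 16 hidden neurons, 10 classes
n_samples, n_features, m, n_classes = 5, 64, 16, 10

# Random stand-ins for flattened 8x8 images with pixel values in [0, 16]
X = rng.uniform(0, 16, size=(n_samples, n_features))

# Layer 1: (64, m) weights; layer 2: (m, 10) weights
W1 = rng.normal(scale=0.1, size=(n_features, m))
b1 = np.zeros(m)
W2 = rng.normal(scale=0.1, size=(m, n_classes))
b2 = np.zeros(n_classes)

hidden = np.maximum(0, X @ W1 + b1)            # ReLU, shape (n_samples, m)
logits = hidden @ W2 + b2                      # shape (n_samples, 10)
logits -= logits.max(axis=1, keepdims=True)    # shift for numerical stability
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax

print(hidden.shape, probs.shape)
```

Each row of `probs` holds one score per digit and sums to 1.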

Next, we load the `digits` dataset and split it randomly into a training set and a test set.

In [1]:
from sklearn import datasets
digits = datasets.load_digits()
X = digits.data
Y = digits.target
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33)
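
The split above is random, so its contents change from run to run. A possible variant, shown here as a sketch, fixes the seed and stratifies by label; `random_state=0` and `stratify=Y` are assumptions added for illustration and are not used in the cell above.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, Y = datasets.load_digits(return_X_y=True)

# random_state makes the split repeatable; stratify keeps the digit
# proportions similar in the training and test sets
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=0, stratify=Y
)

print(X.shape, X_train.shape, X_test.shape)
```

With 1797 images and `test_size=0.33`, the test set has 594 samples, which agrees with the total count in the confusion matrix later in this notebook.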

### Initialize `MLPClassifier`

Note the value of the `hidden_layer_sizes` argument: the hidden layer has only 2 neurons.

In [2]:
from sklearn.neural_network import MLPClassifier
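mm = 10  # not used below; the hidden layer size is set directly
mlp = MLPClassifier(
    hidden_layer_sizes=(2,),  # one hidden layer with 2 neurons
    alpha=0,                  # no L2 regularization
    solver="sgd",             # stochastic gradient descent
    verbose=True,             # print the loss at every iteration
    learning_rate_init=0.1,   # initial learning rate
)
mlp.fit(X_train, Y_train)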

Iteration 1, loss = 2.67129034
Iteration 2, loss = 2.38383385
Iteration 3, loss = 2.34046651
Iteration 4, loss = 2.31745457
Iteration 5, loss = 2.31713061
Iteration 6, loss = 2.31370710
Iteration 7, loss = 2.31042903
Iteration 8, loss = 2.30847421
Iteration 9, loss = 2.31647243
Iteration 10, loss = 2.31219381
Iteration 11, loss = 2.30728487
Iteration 12, loss = 2.30752042
Iteration 13, loss = 2.30666700
Iteration 14, loss = 2.30826543
Iteration 15, loss = 2.30975356
Iteration 16, loss = 2.30838748
Iteration 17, loss = 2.30912299
Iteration 18, loss = 2.30834181
Iteration 19, loss = 2.30787380
Iteration 20, loss = 2.30946118
Iteration 21, loss = 2.31086564
Iteration 22, loss = 2.31184143
Iteration 23, loss = 2.31858411
Iteration 24, loss = 2.31708724
Training loss did not improve more than tol=0.000100 for 10 consecutive epochs. Stopping.


MLPClassifier(alpha=0, hidden_layer_sizes=(2,), learning_rate_init=0.1,
              solver='sgd', verbose=True)
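
To see how `hidden_layer_sizes` determines the weight shapes, the fitted attributes `coefs_` and `intercepts_` can be inspected. The sketch below trains a separate throwaway model named `demo` (a hypothetical name) for a few iterations only; `max_iter=5` and `random_state=0` are assumptions chosen to keep it quick and repeatable.

```python
import warnings

from sklearn import datasets
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

X, Y = datasets.load_digits(return_X_y=True)

# Same architecture as above (one hidden layer with 2 neurons), short training
demo = MLPClassifier(hidden_layer_sizes=(2,), solver="sgd", max_iter=5, random_state=0)
with warnings.catch_warnings():
    # 5 iterations are not enough to converge; silence the expected warning
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    demo.fit(X, Y)

weight_shapes = [w.shape for w in demo.coefs_]
bias_shapes = [b.shape for b in demo.intercepts_]
print(weight_shapes)  # one weight matrix per layer
print(bias_shapes)    # one bias vector per layer
```

With m = 2 the weight matrices have shapes (64, 2) and (2, 10), matching the (64, m) and (m, 10) pattern described at the top.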

In [3]:
import numpy as np
from sklearn import metrics
Y_pred = mlp.predict(X_test)
print('Accuracy = %0.4f'%np.mean(Y_pred == Y_test))
print("Confusion matrix:\n%s" % metrics.confusion_matrix(Y_test, Y_pred))

Accuracy = 0.0909
Confusion matrix:
[[ 0  0  0  0 63  0  0  0  0  0]
 [ 0  0  0  0 54  0  0  0  0  0]
 [ 0  0  0  0 61  0  0  0  0  0]
 [ 0  0  0  0 60  0  0  0  0  0]
 [ 0  0  0  0 54  0  0  0  0  0]
 [ 0  0  0  0 67  0  0  0  0  0]
 [ 0  0  0  0 60  0  0  0  0  0]
 [ 0  0  0  0 61  0  0  0  0  0]
 [ 0  0  0  0 59  0  0  0  0  0]
 [ 0  0  0  0 55  0  0  0  0  0]]
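
The confusion matrix shows that every test image was assigned to class 4. The accuracy can be recomputed from the matrix as the sum of the diagonal divided by the total count. The sketch below rebuilds the matrix from the printed values; the variable names are chosen for illustration.

```python
import numpy as np

# Rebuild the printed confusion matrix: only column 4 is non-zero
cm = np.zeros((10, 10), dtype=int)
cm[:, 4] = [63, 54, 61, 60, 54, 67, 60, 61, 59, 55]

accuracy = np.trace(cm) / cm.sum()          # correct predictions / all predictions
predicted_counts = cm.sum(axis=0)           # how often each class was predicted
predicted_classes = np.flatnonzero(predicted_counts)

print(f"Accuracy = {accuracy:.4f}")
print("Classes ever predicted:", predicted_classes)
```

Only the 54 true 4s are classified correctly out of 594 test images, which gives the reported accuracy of 0.0909.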


Class probabilities can be predicted using the `predict_proba` method as shown below.

In [4]:
mlp.predict_proba(X_test[np.newaxis, 0])

array([[0.10852528, 0.09449283, 0.08766922, 0.12291573, 0.1330957 ,
        0.08034748, 0.10257861, 0.08362342, 0.09771277, 0.08903896]])

In [5]:
mlp.predict(X_test[np.newaxis, 0])

array([4])
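
The predicted label is the class with the highest predicted probability. The sketch below checks this against the `predict_proba` output printed above; it assumes the class labels are 0 to 9 in order, as they are for the `digits` targets.

```python
import numpy as np

# Probabilities copied from the predict_proba output above
proba = np.array([[0.10852528, 0.09449283, 0.08766922, 0.12291573, 0.1330957,
                   0.08034748, 0.10257861, 0.08362342, 0.09771277, 0.08903896]])

classes = np.arange(10)  # assumed label order, 0 to 9

# Pick the class with the largest probability in each row
predicted = classes[np.argmax(proba, axis=1)]

print(predicted)
print(proba.sum())
```

The largest probability is about 0.13, only slightly above the uniform value of 0.1, so the model is barely more confident in class 4 than in any other digit.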