# Python-MLearning: Digits recognition using Neural Network (NN) and Sklearn Library

## Model: Digits 0-9 approach using Train_Test_Split method


By: Hector Alvaro Rojas &nbsp;&nbsp;|&nbsp;&nbsp; Data Science, Visualizations and Applied Statistics &nbsp;&nbsp;|&nbsp;&nbsp; April 14, 2018<br>
    Url: [http://www.arqmain.net]   &nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;   GitHub: [https://github.com/arqmain]
    <hr>

## I IMPORT REQUIRED PACKAGES

In [1]:
%matplotlib inline
import os
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
from IPython.display import Image
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from datetime import datetime


## II LOADING DATA

In [2]:
#Checking working directory
# import os
os.getcwd()

'C:\\Users\\Alvaro\\Documents\\R-Python-Projects_April042018\\Python_Projects\\Machine-Learning\\NNetwork\\NN2\\DigitSklearn\\Python'

In [5]:
#List files in a directory
os.listdir()

['.ipynb_checkpoints',
 'mnist_My.csv',
 'PYTHON-MLearning_NN2.ipynb',
 'PYTHON-MLearning_NN2_GridSearchCV.ipynb',
 'PYTHON-MLearning_NN2_KFold.ipynb',
 'PYTHON-MLearning_NN2_RandomizedSearchCV.ipynb']

In [6]:
# read csv (comma separated value) into data
data=pd.read_csv('mnist_My.csv')
data.columns

Index(['label', 'pixel0', 'pixel1', 'pixel2', 'pixel3', 'pixel4', 'pixel5',
       'pixel6', 'pixel7', 'pixel8',
       ...
       'pixel774', 'pixel775', 'pixel776', 'pixel777', 'pixel778', 'pixel779',
       'pixel780', 'pixel781', 'pixel782', 'pixel783'],
      dtype='object', length=785)

## III NN MODELING

## Train and Test Datasets

In [7]:
# train test split
X=data.iloc[:,1:]
y=data.iloc[:,0]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)

# Train and Test dataset size details
print("X Shape :: ", X.shape)
print("y :: ", y.shape)
print("X_train Shape :: ", X_train.shape)
print("y_train Shape :: ", y_train.shape)
print("X_test Shape :: ", X_test.shape)
print("y_test Shape :: ", y_test.shape)

X Shape ::  (70000, 784)
y ::  (70000,)
X_train Shape ::  (56000, 784)
y_train Shape ::  (56000,)
X_test Shape ::  (14000, 784)
y_test Shape ::  (14000,)


## Build Model

### Fit the model and evaluate it 

#### Fitting the Model

The NN classification object (`MLPClassifier`) exposes many tunable options, such as the activation function (`activation`) and the number of layers and of neurons per layer (both set through `hidden_layer_sizes`). The full list of tunable parameters is available [here](http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier).

To illustrate the Train_Test_Split method, we keep the default values for most parameters and set only the activation function and a small architecture of two hidden layers with 16 neurons each.
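As a quick check of which settings in the fitting cell differ from scikit-learn's defaults, one option is to compare the `get_params()` output of a default `MLPClassifier` with that of the configured one. This is a minimal sketch; the names `default_params`, `chosen_params` and `changed` are illustrative and not part of the original notebook.

```python
from sklearn.neural_network import MLPClassifier

# Parameters of an untouched estimator versus the one configured in this notebook
default_params = MLPClassifier().get_params()
chosen_params = MLPClassifier(solver='adam', activation='logistic',
                              hidden_layer_sizes=(16, 16)).get_params()

# Keep only the entries whose value differs from the default
changed = {name: (default_params[name], value)
           for name, value in chosen_params.items()
           if default_params[name] != value}

for name, (default, chosen) in sorted(changed.items()):
    print("{}: default={!r}, chosen={!r}".format(name, default, chosen))
```

No model is fitted here, so the comparison runs instantly. It shows that `solver='adam'` is already the default, while `activation` and `hidden_layer_sizes` are the two settings that change.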

In [8]:
# Fitting NN model
from sklearn.neural_network import MLPClassifier
startTime = datetime.now()
mlp = MLPClassifier(solver='adam',activation = 'logistic', hidden_layer_sizes=(16,16))
mlp.fit(X_train,y_train)
print ('Total running time (H: M: S. ThS)', datetime.now()-startTime, 'seconds.')


Total running time (H: M: S. ThS) 0:01:07.792877 seconds.


#### Evaluating the model

In [9]:
# Evaluating NN model: accuracy on the training set
print('With Neural Network () accuracy is: ',round(mlp.score(X_train,y_train),4)) # training accuracy

With Neural Network () accuracy is:  0.9014


In [10]:
# Confusion matrix with NN
# (classification_report, confusion_matrix and accuracy_score were imported in section I)


prediction = mlp.predict(X_test)
# compute the overall accuracy and display the classification report
print("Model --> NEURAL NETWORK (NN)")
print("Overall Accuracy: {}".format(accuracy_score(y_test, prediction)))
cm = confusion_matrix(y_test,prediction)
print('Confusion matrix: \n',cm)
print('Classification report: \n',classification_report(y_test,prediction))


Model --> NEURAL NETWORK (NN)
Overall Accuracy: 0.8984285714285715
Confusion matrix: 
 [[1314    0    9    6    8   10    9    9   12    1]
 [   1 1511   13   15    3   11    1    1   24   12]
 [  16   14 1202   19   18    4   15   30   47    2]
 [  11    5   42 1241    1   49    4   36   24   11]
 [   3    4    7    0 1239    2   13    5    6   83]
 [  25    1   17   68   10 1076   10    6   33   18]
 [  12    4   12    1   16   36 1326    0    6    0]
 [   7    5   20   11   15    1    1 1264   14   53]
 [  15   11   25   33   18   70   12   25 1121   17]
 [  12    3    2   23   73    5    3   51    6 1284]]
Classification report: 
              precision    recall  f1-score   support

          0       0.93      0.95      0.94      1378
          1       0.97      0.95      0.96      1592
          2       0.89      0.88      0.89      1367
          3       0.88      0.87      0.87      1424
          4       0.88      0.91      0.90      1362
          5       0.85      0.85      

Precision, recall and f1-score are per-class metrics of how well a classification model performs. A general explanation is available on [Wikipedia](https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers).

The model misclassified 1422 of the 14000 test images, which gives an accuracy of 89.84% (with 90% precision and 90% recall).
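The figures above can be recomputed directly from the printed confusion matrix. The sketch below copies that matrix by hand and assumes scikit-learn's convention that rows are true digits and columns are predicted digits. The variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows: true digit, columns: predicted digit)
cm = np.array([
    [1314,    0,    9,    6,    8,   10,    9,    9,   12,    1],
    [   1, 1511,   13,   15,    3,   11,    1,    1,   24,   12],
    [  16,   14, 1202,   19,   18,    4,   15,   30,   47,    2],
    [  11,    5,   42, 1241,    1,   49,    4,   36,   24,   11],
    [   3,    4,    7,    0, 1239,    2,   13,    5,    6,   83],
    [  25,    1,   17,   68,   10, 1076,   10,    6,   33,   18],
    [  12,    4,   12,    1,   16,   36, 1326,    0,    6,    0],
    [   7,    5,   20,   11,   15,    1,    1, 1264,   14,   53],
    [  15,   11,   25,   33,   18,   70,   12,   25, 1121,   17],
    [  12,    3,    2,   23,   73,    5,    3,   51,    6, 1284],
])

total = cm.sum()                 # number of test images
correct = np.trace(cm)           # diagonal entries are correct predictions
misclassified = total - correct
accuracy = correct / total

# Per-class metrics: precision uses column sums, recall uses row sums
precision = np.diag(cm) / cm.sum(axis=0)
recall = np.diag(cm) / cm.sum(axis=1)
f1 = 2 * precision * recall / (precision + recall)

print("Misclassified:", misclassified, "of", total)
print("Accuracy:", round(accuracy, 4))
for digit in range(10):
    print(digit, round(precision[digit], 2), round(recall[digit], 2), round(f1[digit], 2))
```

The row sums of the matrix equal the `support` column of the classification report, which is a handy way to confirm the row/column orientation.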

To extract the MLP weights and biases after training the model, use its public attributes <b>coefs_</b> and <b>intercepts_</b>.

<b>coefs_</b> is a list of weight matrices, where the weight matrix at index i holds the weights between layer i and layer i+1.

<b>intercepts_</b> is a list of bias vectors, where the vector at index i holds the bias values added to layer i+1.
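To make the role of these two lists concrete, here is a minimal NumPy sketch of a forward pass through a network with the same layout as the one fitted above (784 inputs, two hidden layers of 16 neurons, 10 outputs). It is an illustration under assumptions, not scikit-learn's internal code: the weights are random stand-ins for `mlp.coefs_` and `mlp.intercepts_`, and the helper names `logistic`, `softmax` and `forward` are hypothetical. It uses the logistic activation on hidden layers, as configured above, and a softmax output, which is what `MLPClassifier` applies for multi-class problems.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for mlp.coefs_ and mlp.intercepts_ (random values, same shapes)
layer_sizes = [784, 16, 16, 10]
coefs = [rng.normal(scale=0.1, size=(n_in, n_out))
         for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
intercepts = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def logistic(z):
    """Logistic (sigmoid) activation used on the hidden layers."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(X, coefs, intercepts):
    """Propagate X layer by layer: coefs[i] and intercepts[i] map layer i to layer i+1."""
    activation = X
    for W, b in zip(coefs[:-1], intercepts[:-1]):
        activation = logistic(activation @ W + b)
    return softmax(activation @ coefs[-1] + intercepts[-1])

X_demo = rng.random((5, 784))            # five fake "images"
proba = forward(X_demo, coefs, intercepts)
print(len(coefs), [W.shape for W in coefs])
print(proba.shape)
```

With a fitted model, passing `mlp.coefs_` and `mlp.intercepts_` to such a function should give values close to `mlp.predict_proba`, although that equivalence is not checked here.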

In [9]:
len(mlp.coefs_)

3

In [10]:
mlp.coefs_

[array([[ 9.60888334e-243,  2.63443793e-242,  1.63742258e-242, ...,
          2.62135788e-248,  2.85354244e-244,  1.29356226e-243],
        [-5.81074585e-249, -1.39882699e-244,  1.90901638e-242, ...,
         -4.77292375e-246, -3.21027308e-242,  8.81845281e-249],
        [-1.97546114e-244,  2.87589506e-248, -6.97017510e-245, ...,
         -4.48198433e-244,  1.41331372e-245, -1.45792721e-249],
        ...,
        [-1.34739769e-248, -4.98126981e-247,  9.57416815e-249, ...,
          7.24648480e-247,  7.44622543e-245, -5.21786464e-243],
        [-2.89146059e-250,  9.41086518e-242, -8.68046951e-247, ...,
          2.03263489e-244,  2.57196180e-242, -1.28909927e-249],
        [ 6.45799605e-247, -7.91338584e-250, -1.75720962e-247, ...,
          1.20935843e-249, -9.06216706e-248, -1.99529584e-245]]),
 array([[ 0.25052344, -0.36966599,  0.27683375,  1.4818212 ,  1.29767826,
         -1.03674985,  1.74836517, -0.60300324, -1.52107402, -0.16124084,
         -0.7970972 ,  0.82655135, -0.9959017

In [11]:
len(mlp.coefs_[0])

784

In [12]:
mlp.coefs_[0]

array([[ 9.60888334e-243,  2.63443793e-242,  1.63742258e-242, ...,
         2.62135788e-248,  2.85354244e-244,  1.29356226e-243],
       [-5.81074585e-249, -1.39882699e-244,  1.90901638e-242, ...,
        -4.77292375e-246, -3.21027308e-242,  8.81845281e-249],
       [-1.97546114e-244,  2.87589506e-248, -6.97017510e-245, ...,
        -4.48198433e-244,  1.41331372e-245, -1.45792721e-249],
       ...,
       [-1.34739769e-248, -4.98126981e-247,  9.57416815e-249, ...,
         7.24648480e-247,  7.44622543e-245, -5.21786464e-243],
       [-2.89146059e-250,  9.41086518e-242, -8.68046951e-247, ...,
         2.03263489e-244,  2.57196180e-242, -1.28909927e-249],
       [ 6.45799605e-247, -7.91338584e-250, -1.75720962e-247, ...,
         1.20935843e-249, -9.06216706e-248, -1.99529584e-245]])

## Make Predictions

### Based on the training dataset

In [11]:
startTime = datetime.now()
prediction = mlp.predict(X_train)
print('Prediction:', prediction)
print ('Total running time (H: M: S. ThS)', datetime.now()-startTime, 'seconds.')

Prediction: [0 6 7 ... 6 6 3]
Total running time (H: M: S. ThS) 0:00:00.751043 seconds.


### Based on the test dataset

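The `predict` function is the usual way to get predictions on a new dataset. In our case, the new dataset is X_test. Note that the next cell first refits the model, with default parameters, on the full dataset (X, y), which includes X_test, so these predictions are not made on unseen data.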
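As a small illustration of what `predict` returns, the sketch below shows that the predicted label is the class with the highest probability from `predict_proba`. To keep it fast and self-contained, it assumes scikit-learn's bundled 8x8 digits dataset (`load_digits`) as a stand-in for `mnist_My.csv`. The names `demo`, `Xs_new` and `labels_from_proba` are illustrative.

```python
import warnings

import numpy as np
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small stand-in dataset: 1797 images of 8x8 pixels
X_small, y_small = load_digits(return_X_y=True)
Xs_train, Xs_new, ys_train, ys_new = train_test_split(
    X_small, y_small, test_size=0.2, random_state=4)

demo = MLPClassifier(hidden_layer_sizes=(16, 16), activation='logistic',
                     max_iter=200, random_state=0)
with warnings.catch_warnings():
    # The short training run may stop before convergence; that is fine here
    warnings.simplefilter("ignore", ConvergenceWarning)
    demo.fit(Xs_train, ys_train)

labels = demo.predict(Xs_new)            # one predicted digit per row
proba = demo.predict_proba(Xs_new)       # one probability per class per row
labels_from_proba = demo.classes_[np.argmax(proba, axis=1)]

print(labels[:10])
print(np.array_equal(labels, labels_from_proba))
```

Here `Xs_new` is held out from training, so these predictions are made on data the model has not seen.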


In [12]:
# train your model using all data.
startTime = datetime.now()
mlp = MLPClassifier()
mlp.fit(X, y) 
print ('Total running time (H: M: S: ThS)', datetime.now()-startTime, 'seconds.')


Total running time (H: M: S: ThS) 0:03:35.998355 seconds.


In [15]:
predictions = mlp.predict(X_test)
print('Prediction:', predictions)

Prediction: [0 6 5 ... 7 4 8]

