## EXERCISE - MNIST Neural Networks

In the Python code below, replace the question marks "?" with the proper code to train a neural network on the MNIST dataset that predicts the digits 0-9.

Find a sufficient number of nodes for the hidden layer, and use ReLU as the activation function. Make sure the classes stay balanced, so that each digit is proportionally represented in both the training and testing sets!

Use a learning rate of 0.1 and a maximum of 480 iterations for stochastic gradient descent. Set aside 1/3 of the data for testing, then evaluate performance with a confusion matrix.
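
As a hint for the class-balance requirement, here is a minimal sketch of what a stratified split does. It uses hypothetical toy labels (90 samples of class 0 and 30 of class 1), not the MNIST data, and only illustrates that the `stratify` argument of `train_test_split` keeps the class proportions roughly the same in both sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced toy labels: 90 samples of class 0, 30 of class 1
y = np.array([0] * 90 + [1] * 30)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature column

# Hold out 1/3 of the samples, preserving class proportions in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1.0 / 3.0, random_state=10, stratify=y)

# Count how many samples of each class landed in each set
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)

train_share = train_counts[1] / train_counts.sum()
test_share = test_counts[1] / test_counts.sum()

print("Class 1 share in training set: %f" % train_share)
print("Class 1 share in testing set:  %f" % test_share)
```

Both shares should come out close to the 25% that class 1 holds in the full toy dataset.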


```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix

df = pd.read_csv('https://bit.ly/3ilJc2C', compression='zip', delimiter=",")

X = df.values[:, :-1] / 1000.0  # this rescale helps the training
Y = df.values[:, -1]


# Separate training and testing data
# Note that I use the 'stratify' parameter to ensure
# each class is proportionally represented in both sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y,
    test_size=1.0/3.0, random_state=10, stratify=Y)

# Fit a neural network classifier
nn = MLPClassifier(solver='sgd',
                   hidden_layer_sizes=(100, ),
                   activation='relu',
                   max_iter=480,
                   learning_rate_init=.1)

nn.fit(X_train, Y_train)

# Evaluate the test dataset
print("Test set score: %f" % nn.score(X_test, Y_test))

cf = confusion_matrix(y_true=Y_test, y_pred=nn.predict(X_test))
print(cf)
```

```
Test set score: 0.976429
[[2266    0    5    0    1    7    9    1    8    4]
 [   0 2594   10    6    2    2    2    5    3    2]
 [   5    5 2277    9    4    2    3   11   11    3]
 [   1    2   24 2306    0   26    0    7   10    5]
 [   2    3    6    1 2218    0    7    5    4   29]
 [   5    1    4   12    1 2055    8    2    6   10]
 [  11    4    3    0    4   15 2250    0    5    0]
 [   3   10    8    3    7    2    0 2376    6   16]
 [   8   10    6   16    3   11    8    3 2193   17]
 [   6    1    1   12   24    6    2   13    5 2249]]
```
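
To read the confusion matrix above, recall that scikit-learn puts the true digit on the rows and the predicted digit on the columns, so the diagonal holds the correct predictions. The sketch below copies the printed matrix by hand and derives the overall accuracy, per-digit recall and precision, and the most frequent mix-up. The variable names are illustrative only.

```python
import numpy as np

# Confusion matrix copied from the printed output
# (rows: true digit, columns: predicted digit)
cf = np.array([
    [2266,    0,    5,    0,    1,    7,    9,    1,    8,    4],
    [   0, 2594,   10,    6,    2,    2,    2,    5,    3,    2],
    [   5,    5, 2277,    9,    4,    2,    3,   11,   11,    3],
    [   1,    2,   24, 2306,    0,   26,    0,    7,   10,    5],
    [   2,    3,    6,    1, 2218,    0,    7,    5,    4,   29],
    [   5,    1,    4,   12,    1, 2055,    8,    2,    6,   10],
    [  11,    4,    3,    0,    4,   15, 2250,    0,    5,    0],
    [   3,   10,    8,    3,    7,    2,    0, 2376,    6,   16],
    [   8,   10,    6,   16,    3,   11,    8,    3, 2193,   17],
    [   6,    1,    1,   12,   24,    6,    2,   13,    5, 2249],
])

# Overall accuracy: correct predictions (diagonal) over all test samples
accuracy = np.trace(cf) / cf.sum()

# Recall per digit: correct predictions over all true samples of that digit
recall = np.diag(cf) / cf.sum(axis=1)

# Precision per digit: correct predictions over all predictions of that digit
precision = np.diag(cf) / cf.sum(axis=0)

print("Accuracy: %f" % accuracy)
for digit in range(10):
    print("digit %d  recall: %f  precision: %f"
          % (digit, recall[digit], precision[digit]))

# Most frequent mistake: largest off-diagonal entry
off_diag = cf.copy()
np.fill_diagonal(off_diag, 0)
true_digit, predicted_digit = (int(i) for i in
                               np.unravel_index(off_diag.argmax(), off_diag.shape))
print("Most common error: true %d predicted as %d (%d times)"
      % (true_digit, predicted_digit, off_diag[true_digit, predicted_digit]))
```

The accuracy computed this way matches the printed test set score, and the largest off-diagonal entry shows that the digit 4 is most often mistaken for a 9.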


### SCROLL DOWN FOR ANSWER
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
v 

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix

df = pd.read_csv('https://bit.ly/3ilJc2C', compression='zip', delimiter=",")

X = df.values[:, :-1] / 1000.0  # this rescale helps the training 
Y = df.values[:, -1]


# Separate training and testing data
# Note that I use the 'stratify' parameter to ensure
# each class is proportionally represented in both sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y,
    test_size=1.0/3.0, random_state=10, stratify=Y)

# Fit a neural network classifier 
nn = MLPClassifier(solver='sgd',
                   hidden_layer_sizes=(100, ),
                   activation='relu',
                   max_iter=480,
                   learning_rate_init=.1)

nn.fit(X_train, Y_train)

# Evaluate the test dataset
print("Test set score: %f" % nn.score(X_test, Y_test))

cf = confusion_matrix(y_true=Y_test, y_pred=nn.predict(X_test))
print(cf)
```
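
The answer settles on 100 hidden nodes. One way to judge whether a size is "sufficient" is to compare test scores across a few candidate sizes. The sketch below does that under some assumptions: it swaps in scikit-learn's small built-in `load_digits` dataset (8x8 images) so it runs quickly and offline, the candidate sizes 5, 25, and 100 are arbitrary, and `random_state` is fixed on the classifier only to make the comparison repeatable. It is not the MNIST CSV used in the exercise, so the scores will differ.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in dataset (assumption): 8x8 digit images with pixel values 0-16
digits = load_digits()
X = digits.data / 16.0  # rescale pixel values to the range 0-1
Y = digits.target

# Same split settings as the exercise: 1/3 held out, stratified by class
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=1.0 / 3.0, random_state=10, stratify=Y)

# Train one network per candidate hidden layer size and record its test score
scores = {}
for n_hidden in (5, 25, 100):
    nn = MLPClassifier(solver='sgd',
                       hidden_layer_sizes=(n_hidden, ),
                       activation='relu',
                       max_iter=480,
                       learning_rate_init=.1,
                       random_state=10)
    nn.fit(X_train, Y_train)
    scores[n_hidden] = nn.score(X_test, Y_test)
    print("hidden nodes: %3d  test score: %f" % (n_hidden, scores[n_hidden]))
```

If the score stops improving as nodes are added, the smaller size is likely sufficient for that dataset.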