## Shallow Neural Network in Keras
Train a shallow neural network on the MNIST handwritten digit images. Keras is a high-level API that runs on top of TensorFlow. Each image is 28x28 pixels.

#### Set the seed for reproducibility

In [1]:
import numpy as np
np.random.seed(42)  # call the function; assigning to it would replace it without seeding
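
As a quick check of what seeding provides, the sketch below (plain NumPy, not part of the original notebook) reseeds the global generator with the same value and confirms that the same random numbers come back. The variable names `first` and `second` are illustrative only.

```python
import numpy as np

# Seed the global NumPy generator, then draw three numbers
np.random.seed(42)
first = np.random.rand(3)

# Reseeding with the same value restarts the same sequence
np.random.seed(42)
second = np.random.rand(3)

print(np.array_equal(first, second))
```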

#### Load dependencies

In [2]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

ModuleNotFoundError: No module named 'keras'

Each 28x28 image is flattened into a one-dimensional array of 784 pixel values (28 rows of 28 pixels). This array is the input to the neural network. The output layer is a 10-dimensional array, with a 1 at the position of the matching digit and 0 elsewhere.
Dense means the layer is fully connected: every node in the hidden layer is connected to every node in the input layer and in the output layer.
The first choice is to use 64 nodes in the hidden layer.
SGD is stochastic gradient descent.
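
To illustrate what a fully connected layer computes, here is a minimal NumPy sketch of one forward pass through a 64-node hidden layer with a sigmoid activation. The input `x` and the weights `W` and `b` are random stand-ins chosen for this example, not values from the notebook or from Keras.

```python
import numpy as np

def sigmoid(z):
    """Squash each value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

x = rng.random(784)                          # stand-in for one flattened 28x28 image
W = rng.normal(scale=0.05, size=(784, 64))   # hypothetical weights: one column per hidden node
b = np.zeros(64)                             # hypothetical biases: one per hidden node

# Every hidden node receives a weighted sum of all 784 inputs
hidden = sigmoid(x @ W + b)
print(hidden.shape)
```

Each of the 64 hidden values depends on all 784 inputs, which is what "fully connected" means here.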

#### Load data

The training set contains 60,000 digits and the test set contains 10,000.

In [None]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train.shape

In [None]:
y_train.shape

#### Preprocessing the data
Flatten each 28x28 matrix into a one-dimensional array of 784 values, and scale the pixel values from the 0 to 255 range down to the 0 to 1 range, so that 255 becomes 1.

In [None]:
X_train = X_train.reshape(60000, 784).astype('float32')
X_test = X_test.reshape(10000, 784).astype('float32')
X_train /= 255
X_test /= 255
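
The same two steps can be tried without downloading MNIST. The sketch below uses a small random array, `images`, as a hypothetical stand-in for the real data and checks the resulting shape and value range.

```python
import numpy as np

# Stand-in for 5 MNIST images: unsigned 8-bit pixel values in 0..255
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into 784 values and convert to float
flat = images.reshape(len(images), 28 * 28).astype('float32')

# Scale so that 0 stays 0 and 255 becomes 1
flat /= 255

print(flat.shape, flat.dtype)
```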

Transform the labels y into one-hot categorical vectors.

In [None]:
n_classes = 10
y_train = keras.utils.to_categorical(y_train, n_classes)
y_test = keras.utils.to_categorical(y_test, n_classes)

In [None]:
y_train[0]
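
For intuition, one-hot encoding can be reproduced in plain NumPy by indexing rows of an identity matrix. This is an illustrative sketch with made-up labels, not how Keras implements `to_categorical`.

```python
import numpy as np

labels = np.array([5, 0, 4])  # example digit labels, chosen for illustration

# Row k of the 10x10 identity matrix is the one-hot vector for digit k
one_hot = np.eye(10, dtype='float32')[labels]

print(one_hot[0])  # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```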


#### Design a neural network
Use the sigmoid activation function in the hidden layer and softmax in the output layer. The network takes 784 inputs, has 64 hidden nodes, and produces 10 outputs.

In [None]:
model = Sequential()
model.add(Dense(64, activation='sigmoid', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))

In [None]:
model.summary()

The hidden layer has 50,240 parameters: 784 * 64 weights + 64 biases (28 x 28 = 784 inputs).

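
The parameter counts for both layers follow the same pattern (weights plus one bias per node). A short sketch of the arithmetic, using variable names introduced only for this example:

```python
n_inputs, n_hidden, n_outputs = 784, 64, 10

# Each layer has one weight per (input, node) pair plus one bias per node
hidden_params = n_inputs * n_hidden + n_hidden     # 50240
output_params = n_hidden * n_outputs + n_outputs   # 650
total_params = hidden_params + output_params       # 50890

print(hidden_params, output_params, total_params)
```
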
#### Configure the model
Use stochastic gradient descent with a learning rate of 0.01, and track the accuracy metric.

In [None]:
model.compile(loss='mean_squared_error', optimizer=SGD(lr=0.01), metrics=['accuracy'])
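
To make the loss and the metric concrete, here is a small NumPy sketch of mean squared error and accuracy on one-hot targets. The predictions are made up, and only 3 classes are used for brevity; it illustrates the definitions and is not Keras's internal code.

```python
import numpy as np

# 4 samples, 3 classes: true labels 0, 1, 2, 1 as one-hot rows
y_true = np.eye(3)[[0, 1, 2, 1]]

# Made-up predicted probabilities for the same 4 samples
y_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],   # wrong: predicts class 1, true class is 2
    [0.2, 0.5, 0.3],
])

# Mean squared error over every output value
mse = np.mean((y_true - y_pred) ** 2)

# Accuracy: fraction of samples whose largest output matches the true class
accuracy = np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1))

print(accuracy)  # 0.75
```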

#### Train and Test the model
With 10 possible outputs, random guessing gives an accuracy of about 10%.
To improve on this, set the number of passes over the training data with the epochs argument.

In [None]:
model.fit(X_train, y_train, batch_size=128, epochs=200, verbose=1, validation_data=(X_test, y_test))
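
As a rough guide to how much training these settings imply, the sketch below counts the weight updates, assuming one update per batch and that the final, smaller batch of each epoch is kept.

```python
import math

n_samples, batch_size, epochs = 60000, 128, 200

# 60000 / 128 = 468.75, so each epoch has 468 full batches plus one partial batch
updates_per_epoch = math.ceil(n_samples / batch_size)   # 469
last_batch_size = n_samples - (updates_per_epoch - 1) * batch_size  # 96
total_updates = updates_per_epoch * epochs              # 93800

print(updates_per_epoch, last_batch_size, total_updates)
```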

After 200 epochs the model reaches about 85% accuracy: loss: 0.0297 - acc: 0.8514 - val_loss: 0.0288 - val_acc: 0.8583