# MNIST 2.0 - five layers, sigmoid

This notebook uses Keras to implement the second model of the tutorial ["Tensorflow and deep learning - without a PhD"](https://github.com/martin-gorner/tensorflow-mnist-tutorial). The original TensorFlow implementation can be found [here](https://github.com/martin-gorner/tensorflow-mnist-tutorial/blob/master/mnist_2.0_five_layers_sigmoid.py).

The model consists of five fully connected layers using the sigmoid activation function for the hidden layers and a softmax activation function for the output layer:
```
neural network with 5 layers

· · · · · · · · · ·    (input data, flattened pixels)  X[batch, 784]
\x/x\x/x\x/x\x/x\x/ -- fully connected layer (sigmoid) W1[784, 200], B1[200]
 · · · · · · · · ·                                     Y1[batch, 200]
  \x/x\x/x\x/x\x/   -- fully connected layer (sigmoid) W2[200, 100], B2[100]
   · · · · · · ·                                       Y2[batch, 100]
    \x/x\x/x\x/     -- fully connected layer (sigmoid) W3[100, 60], B3[60]
     · · · · ·                                         Y3[batch, 60]
      \x/x\x/       -- fully connected layer (sigmoid) W4[60, 30], B4[30]
       · · ·                                           Y4[batch, 30]
        \x/         -- fully connected layer (softmax) W5[30, 10], B5[10]
         ·                                             Y5[batch, 10]
```
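To make the shapes in the diagram concrete, here is a minimal NumPy sketch of the forward pass. It is not the Keras implementation: the weights are random placeholders, and the helper names `sigmoid`, `softmax` and `forward` are made up for this illustration.

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function used by the four hidden layers
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalize each row
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, weights, biases):
    # Hidden layers: Y_k = sigmoid(Y_{k-1} . W_k + B_k)
    Y = X
    for W, B in zip(weights[:-1], biases[:-1]):
        Y = sigmoid(Y @ W + B)
    # Output layer: softmax over the 10 classes
    return softmax(Y @ weights[-1] + biases[-1])

rng = np.random.default_rng(0)
sizes = [784, 200, 100, 60, 30, 10]  # layer widths taken from the diagram
# Random placeholder weights (assumption: any small random init works for a shape check)
weights = [rng.normal(0.0, 0.1, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

X = rng.random((4, 784))          # a fake batch of 4 flattened images
Y5 = forward(X, weights, biases)
print(Y5.shape)                   # (4, 10)
```

Each row of `Y5` is a probability distribution over the ten digit classes, so its entries sum to 1.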

In [3]:
from keras.models import Sequential
from keras.layers import Dense, Activation

import numpy as np
np.random.seed(123)  # for reproducibility
from keras.utils import np_utils

from keras.optimizers import SGD

Using TensorFlow backend.


Load and preprocess data

In [4]:
from keras.datasets import mnist

# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Preprocess input data
X_train = X_train.reshape(X_train.shape[0], 784)
X_test = X_test.reshape(X_test.shape[0], 784)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

# Preprocess class labels
Y_train = np_utils.to_categorical(y_train, 10)
Y_test = np_utils.to_categorical(y_test, 10)
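As a rough illustration of what the preprocessing above does, the following NumPy-only sketch flattens and scales a toy image and one-hot encodes a few labels. The `one_hot` helper is a hypothetical stand-in for `np_utils.to_categorical`, not its actual implementation, and the toy arrays are invented for the example.

```python
import numpy as np

def one_hot(labels, num_classes):
    # One row per label, with a single 1.0 in the column of the class index
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# Toy stand-in for MNIST: one 2x2 "image" with uint8 pixel values
images = np.array([[[0, 255], [128, 64]]], dtype="uint8")

# Flatten each image to a vector and scale pixel values to [0, 1]
flat = images.reshape(images.shape[0], -1).astype("float32") / 255

labels = np.array([3, 0, 9])
Y = one_hot(labels, 10)
print(flat.shape, Y.shape)
```

For the real data, the same steps turn `X_train` from shape `(60000, 28, 28)` into `(60000, 784)` and `y_train` from shape `(60000,)` into `(60000, 10)`.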

Define model architecture

In [8]:
model = Sequential()
# Hidden layers use the sigmoid activation, as described above
model.add(Dense(200, batch_input_shape=(None, 784), activation='sigmoid'))
model.add(Dense(100, activation='sigmoid'))
model.add(Dense(60, activation='sigmoid'))
model.add(Dense(30, activation='sigmoid'))
model.add(Dense(10, activation='softmax'))
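As a sanity check on the architecture, the number of trainable parameters can be worked out by hand: each fully connected layer has one weight per input/output pair plus one bias per output. The helper `dense_params` below is a name chosen for this sketch.

```python
# Layer widths: 784 inputs, four hidden layers, 10 outputs
layer_sizes = [784, 200, 100, 60, 30, 10]

def dense_params(n_in, n_out):
    # Weight matrix of shape [n_in, n_out] plus a bias vector of length n_out
    return n_in * n_out + n_out

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [157000, 20100, 6060, 1830, 310]
print(total)      # 185300
```

Calling `model.summary()` should report the same per-layer counts.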

Compile the model

In [4]:
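sgd = SGD(lr=0.003, decay=0.0)
# Pass the configured SGD instance; the string 'sgd' would create a
# default optimizer and silently ignore lr=0.003
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])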
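For intuition about what is being minimized, here is a small NumPy sketch of the categorical cross-entropy loss and of a plain gradient descent update with a learning rate of 0.003. It does not reproduce Keras internals: the function names are invented for this example, and the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 and 1 so the logarithm stays finite
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Per-sample loss is -sum(y_true * log(y_pred)); average over the batch
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def sgd_step(param, grad, lr=0.003):
    # Plain SGD without momentum or decay: move against the gradient
    return param - lr * grad

# Two samples, three classes (toy values)
y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.5, 0.25, 0.25]])

loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))  # 0.4581, i.e. (-ln 0.8 - ln 0.5) / 2

updated = sgd_step(np.array([1.0]), np.array([2.0]))
print(updated)         # [0.994]
```

Only the predicted probability of the true class contributes to the loss, because the one-hot target is zero everywhere else.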

Fit the model on training data

In [6]:
model.fit(X_train, Y_train, batch_size=1000, nb_epoch=2000, verbose=1)
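To get a feel for how long this training run is, the number of weight updates can be estimated from the batch size and epoch count used above together with the 60,000 images in the MNIST training set. The variable names are chosen for this sketch only.

```python
import math

num_train_samples = 60000  # size of the MNIST training set
batch_size = 1000
num_epochs = 2000

# One weight update per mini-batch; the last batch of an epoch may be smaller
updates_per_epoch = math.ceil(num_train_samples / batch_size)
total_updates = updates_per_epoch * num_epochs

print(updates_per_epoch)  # 60
print(total_updates)      # 120000
```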

Evaluate model on test data

In [None]:
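# With metrics=['accuracy'], evaluate returns [loss, accuracy]
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])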