# MNIST 1.0 - Softmax
This notebook uses Keras to implement the first model of the tutorial ["Tensorflow and deep learning - without a PhD"](https://github.com/martin-gorner/tensorflow-mnist-tutorial). The original TensorFlow implementation can be found [here](https://github.com/martin-gorner/tensorflow-mnist-tutorial/blob/master/mnist_1.0_softmax.py).

The model consists of a fully connected layer using the softmax activation function:
```
neural network with 1 layer of 10 softmax neurons

· · · · · · · · · ·  (input data, flattened pixels)      X [batch, 784]
\x/x\x/x\x/x\x/x\x/  -- fully connected layer (softmax)  W [784, 10], b[10]
  · · · · · · · ·                                        Y [batch, 10]
```
The model is:

$Y = \mathrm{softmax}(X \cdot W + b)$

- X: matrix of 100 grayscale images of 28x28 pixels, each flattened into one row of 784 values (there are 100 images in a mini-batch)
- W: weight matrix with 784 rows and 10 columns
- b: bias vector with 10 elements
- +: addition with broadcasting: the vector is added to each row of the matrix (as in NumPy)
- softmax(matrix) applies softmax to each row
- softmax(row) applies exp to each value, then divides by the sum of the resulting row (its L1 norm), so the values add up to 1
- Y: output matrix with 100 rows and 10 columns
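The formula and shapes above can be checked with a few lines of NumPy. This is only a sketch under stated assumptions: the input and weights are random stand-ins rather than MNIST data or trained parameters, and the `softmax` helper is written here for illustration, not taken from Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

batch, n_pixels, n_classes = 100, 784, 10
X = rng.random((batch, n_pixels))                        # stand-in for flattened images in [0, 1)
W = rng.normal(scale=0.01, size=(n_pixels, n_classes))   # illustrative random weights
b = np.zeros(n_classes)                                  # bias vector

def softmax(logits):
    """Row-wise softmax: exponentiate, then divide each row by its sum."""
    # Subtracting the row maximum does not change the result but avoids overflow in exp.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

Y = softmax(X @ W + b)   # b is broadcast across all 100 rows

print(Y.shape)           # (100, 10)
print(Y.sum(axis=1)[:3]) # each row sums to 1, up to rounding
```

Each row of `Y` can be read as a probability distribution over the 10 digit classes for one image.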



In [3]:
from keras.models import Sequential
from keras.layers import Dense, Activation

import numpy as np
np.random.seed(123)  # for reproducibility
from keras.utils import np_utils

from keras.optimizers import SGD

Using TensorFlow backend.


## Load and preprocess data

In [6]:
from keras.datasets import mnist

# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Preprocess input data
X_train = X_train.reshape(X_train.shape[0], 784)
X_test = X_test.reshape(X_test.shape[0], 784)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

# Preprocess class labels
Y_train = np_utils.to_categorical(y_train, 10)
Y_test = np_utils.to_categorical(y_test, 10)
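To make the preprocessing concrete, here is a small NumPy-only sketch of the same two steps: flattening and scaling the pixels, and one-hot encoding the labels. The `images` and `labels` arrays are hypothetical stand-ins, and the identity-matrix indexing is an illustration of what one-hot encoding produces, not the Keras implementation.

```python
import numpy as np

# Hypothetical stand-in for a few MNIST samples: uint8 images and integer labels.
images = np.random.default_rng(0).integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
labels = np.array([5, 0, 4, 1, 9])

# Flatten each 28x28 image into a 784-vector and scale the pixels to [0, 1].
flat = images.reshape(images.shape[0], 784).astype("float32") / 255

# One-hot encode: row i of the identity matrix is the encoding of class i.
one_hot = np.eye(10, dtype="float32")[labels]

print(flat.shape)   # (5, 784)
print(one_hot[0])   # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```

The one-hot rows have the same shape as the softmax output, which is what lets the loss compare predictions and labels element by element.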

## Define model architecture

In [4]:
model = Sequential()
model.add(Dense(10, batch_input_shape=(None, 784)))
model.add(Activation('softmax'))
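As a quick sanity check on the architecture, the number of trainable parameters follows directly from the shapes W [784, 10] and b [10]. The arithmetic below is a sketch; the result should agree with the total reported by `model.summary()`, since the softmax activation adds no parameters of its own.

```python
n_inputs, n_units = 784, 10

n_weights = n_inputs * n_units   # entries of W [784, 10]
n_biases = n_units               # entries of b [10]
n_params = n_weights + n_biases

print(n_params)  # 7850
```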

## Compile the model

In [4]:
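sgd = SGD(lr=0.005, decay=0.0)  # plain SGD with learning rate 0.005 and no decay
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,  # pass the configured instance; the string 'sgd' would use the default settings
              metrics=['accuracy'])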
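What the loss and optimizer do on each update can be sketched in NumPy. This is an illustration under assumptions, not Keras internals: the mini-batch and labels are random stand-ins, the weights start at zero for simplicity, and the helper functions `softmax` and `cross_entropy` are written here for the example.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets):
    """Mean over the batch of -sum_k targets[k] * log(probs[k])."""
    return -np.mean(np.sum(targets * np.log(probs), axis=1))

rng = np.random.default_rng(0)
X = rng.random((32, 784))                        # hypothetical mini-batch
targets = np.eye(10)[rng.integers(0, 10, 32)]    # hypothetical one-hot labels
W = np.zeros((784, 10))                          # zero start, chosen for simplicity
b = np.zeros(10)
lr = 0.005                                       # same learning rate as above

probs = softmax(X @ W + b)
loss_before = cross_entropy(probs, targets)

# For softmax followed by cross-entropy, the gradient with respect to
# the logits is (probs - targets), averaged over the batch.
grad_logits = (probs - targets) / X.shape[0]
W -= lr * (X.T @ grad_logits)        # gradient descent step on the weights
b -= lr * grad_logits.sum(axis=0)    # gradient descent step on the biases

loss_after = cross_entropy(softmax(X @ W + b), targets)
print(loss_before, loss_after)
```

With zero weights every class gets probability 0.1, so the starting loss equals ln(10); one small step in the direction of the negative gradient then lowers it.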

## Fit the model on training data

In [6]:
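# Each update uses 1000 images (the description above assumes 100).
# nb_epoch is the Keras 1 argument name; Keras 2 renamed it to `epochs`.
model.fit(X_train, Y_train, batch_size=1000, nb_epoch=2000, verbose=1)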

## Evaluate model on test data

In [None]:
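score = model.evaluate(X_test, Y_test, verbose=0)
print('Test loss: %.4f, test accuracy: %.4f' % (score[0], score[1]))  # [loss, accuracy], as configured in compile()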
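The accuracy value returned by the evaluation compares the most probable predicted class with the true class for each sample. A minimal NumPy sketch of that comparison, using made-up probabilities and labels for 4 samples and 3 classes (assumptions for illustration only):

```python
import numpy as np

# Hypothetical predicted probabilities for 4 samples and 3 classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])
# Hypothetical one-hot ground truth; the third sample's true class is 0.
targets = np.eye(3)[[0, 1, 0, 2]]

predicted = probs.argmax(axis=1)    # most probable class per sample
actual = targets.argmax(axis=1)     # true class per sample
accuracy = np.mean(predicted == actual)

print(accuracy)  # 0.75
```

Three of the four samples are classified correctly here; the third is predicted as class 1 while its label is class 0.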