Example adapted from [this online post](https://nextjournal.com/gkoehler/digit-recognition-with-keras).

In [1]:
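from sklearn.datasets import fetch_openml
# fetch_mldata was removed from scikit-learn; fetch_openml serves the same 70,000 MNIST digits (labels arrive as strings)
mnist = fetch_openml('mnist_784', version=1, as_frame=False)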

In [2]:
X, y = mnist["data"], mnist["target"]
X.shape

(70000, 784)

In [3]:
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

In [4]:
import numpy as np

shuffle_index = np.random.permutation(60000)
X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]
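
Indexing both arrays with the same permutation keeps every image paired with its label. A minimal sketch on a small made-up dataset (the names `X_demo`, `y_demo`, and the fixed seed are illustrative assumptions, not part of the notebook):

```python
import numpy as np

# Toy stand-ins for images and labels: row i of X_demo belongs to label y_demo[i].
X_demo = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])
y_demo = np.array([0, 1, 2, 3])

rng = np.random.default_rng(0)        # seeded so the shuffle is repeatable
perm = rng.permutation(len(X_demo))   # a random ordering of the indices 0..3

# Apply the same index array to both so the pairing survives the shuffle.
X_shuffled, y_shuffled = X_demo[perm], y_demo[perm]

# In this toy data each row's first value equals its label, so we can check the pairing.
pairs_intact = bool(np.all(X_shuffled[:, 0] == y_shuffled))
print(pairs_intact)
```

Shuffling the two arrays with separate calls to a random permutation would break this pairing, which is why the notebook builds `shuffle_index` once and reuses it.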

In [5]:
X_train.shape

(60000, 784)

In [6]:
X_test.shape

(10000, 784)

In [7]:
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

In [8]:
X_train /= 255
X_test /= 255

One-hot encode the target labels using the `to_categorical` utility from Keras.

In [9]:
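# to_categorical is exposed directly on keras.utils; the np_utils module is absent from recent Keras releases
from keras.utils import to_categorical
y_train = y_train.astype('float32')
y_test = y_test.astype('float32')

n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
# each label becomes a length-10 vector with a 1 at the digit's index
y_train = to_categorical(y_train, n_classes)
y_test = to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", y_train.shape)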

Using TensorFlow backend.


Shape before one-hot encoding:  (60000,)
Shape after one-hot encoding:  (60000, 10)
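
To make the shape change above concrete, here is a rough NumPy sketch of what one-hot encoding does. The helper `one_hot` is a hypothetical stand-in written for illustration; it is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, n_classes), dtype="float32")
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

demo = one_hot([5, 0, 4], n_classes=10)
print(demo.shape)  # (3, 10)
print(demo[0])     # 1.0 at index 5, zeros elsewhere
```

Each row sums to 1, so the encoded labels have the same form as the softmax outputs the network produces.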


Build a linear stack of densely connected layers with the Keras `Sequential` model.

![](nn_example.png)

In [10]:
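from keras.models import Sequential, load_model
# Dense and Activation are available directly from keras.layers
from keras.layers import Dense, Activation

model = Sequential()

# first hidden layer: 784 input pixels -> 512 units
model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))

# second hidden layer: 512 -> 512 units
model.add(Dense(512))
model.add(Activation('relu'))

# output layer: one unit per digit class, softmax turns scores into probabilities
model.add(Dense(10))
model.add(Activation('softmax'))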
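
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand: each dense layer has one weight per input-output pair plus one bias per output. The sketch below assumes only the layer sizes used above; `dense_params` is a hypothetical helper, and the result can be compared against what `model.summary()` reports.

```python
# Layer widths of the network above: input, two hidden layers, output.
layer_sizes = [784, 512, 512, 10]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one fully connected layer."""
    return n_in * n_out + n_out

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [401920, 262656, 5130]
print(total)      # 669706
```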

In [11]:
# compiling the sequential model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
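
For intuition about the `categorical_crossentropy` loss chosen above, here is a small NumPy sketch of softmax followed by cross-entropy on two made-up samples. The function names, the toy logits, and the clipping constant `eps` are assumptions for illustration; Keras's internal implementation may differ in detail.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_prob)); eps guards against log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

# Two hypothetical samples with three classes each.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1))  # each row sums to 1
print(loss)
```

Because the labels are one-hot, only the predicted probability of the correct class contributes to each sample's loss, so the loss shrinks as that probability approaches 1.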

In [12]:
model.fit(X_train, y_train,
          batch_size=128, epochs=10,
          verbose=2,
          validation_data=(X_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
 - 16s - loss: 0.2200 - acc: 0.9349 - val_loss: 0.1154 - val_acc: 0.9634
Epoch 2/10
 - 15s - loss: 0.0806 - acc: 0.9754 - val_loss: 0.0787 - val_acc: 0.9742
Epoch 3/10
 - 13s - loss: 0.0518 - acc: 0.9833 - val_loss: 0.0726 - val_acc: 0.9791
Epoch 4/10
 - 13s - loss: 0.0345 - acc: 0.9893 - val_loss: 0.0700 - val_acc: 0.9804
Epoch 5/10
 - 15s - loss: 0.0284 - acc: 0.9907 - val_loss: 0.0736 - val_acc: 0.9792
Epoch 6/10
 - 15s - loss: 0.0220 - acc: 0.9931 - val_loss: 0.0767 - val_acc: 0.9800
Epoch 7/10
 - 14s - loss: 0.0192 - acc: 0.9935 - val_loss: 0.0907 - val_acc: 0.9756
Epoch 8/10
 - 15s - loss: 0.0157 - acc: 0.9948 - val_loss: 0.0796 - val_acc: 0.9801
Epoch 9/10
 - 14s - loss: 0.0171 - acc: 0.9944 - val_loss: 0.0821 - val_acc: 0.9804
Epoch 10/10
 - 13s - loss: 0.0115 - acc: 0.9960 - val_loss: 0.0805 - val_acc: 0.9822


<keras.callbacks.History at 0x1a37ad3908>

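Compute the model's loss and accuracy on the 10,000 test examples. These are the same examples passed as `validation_data` during training, so the results match the validation metrics of the final epoch.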

In [13]:
loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)

print("Test Loss", loss_and_metrics[0])
print("Test Accuracy", loss_and_metrics[1])

Test Loss 0.08052582735941678
Test Accuracy 0.9822
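
The accuracy reported above is the fraction of examples whose highest-probability class matches the true label. A minimal sketch with made-up predictions for four samples and three classes (the arrays are hypothetical, not model output):

```python
import numpy as np

# Hypothetical predicted class probabilities, one row per sample.
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.3, 0.4],
                   [0.6, 0.3, 0.1]])

# One-hot true labels for classes 0, 1, 2, 1.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

predicted = y_prob.argmax(axis=1)  # most probable class per sample
actual = y_true.argmax(axis=1)     # index of the 1 in each one-hot row

accuracy = float(np.mean(predicted == actual))
print(accuracy)  # 0.75
```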


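Save the model in HDF5 format, an open binary format designed for storing large numerical arrays. `model.save` writes the architecture, the weights, and the optimizer state to a single file, which is generally more compact and portable than a Python pickle.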

In [14]:
model.save("./keras_mnist_first.h5")

In [15]:
# to restore the saved model later, uncomment:
# mnist_model = load_model("./keras_mnist_first.h5")