# MNIST 3.1. - convolutional bigger dropout

This notebook is part of a series that uses Keras to follow the tutorial ["Tensorflow and deep learning - without a PhD"](https://github.com/martin-gorner/tensorflow-mnist-tutorial). The original TensorFlow implementation can be found [here](https://github.com/martin-gorner/tensorflow-mnist-tutorial/blob/master/mnist_3.1_convolutional_bigger_dropout.py).

The network consists of three convolutional layers with 6, 12 and 24 output channels, followed by a fully connected layer with dropout and a final layer of 10 softmax neurons:

```
· · · · · · · · · ·      (input data, 1-deep)                 X [batch, 28, 28, 1]
@ @ @ @ @ @ @ @ @ @   -- conv. layer 5x5x1=>6 stride 1        W1 [5, 5, 1, 6]        B1 [6]
∶∶∶∶∶∶∶∶∶∶∶∶∶∶∶∶∶∶∶                                           Y1 [batch, 28, 28, 6]
  @ @ @ @ @ @ @ @     -- conv. layer 5x5x6=>12 stride 2       W2 [5, 5, 6, 12]        B2 [12]
  ∶∶∶∶∶∶∶∶∶∶∶∶∶∶∶                                             Y2 [batch, 14, 14, 12]
    @ @ @ @ @ @       -- conv. layer 4x4x12=>24 stride 2      W3 [4, 4, 12, 24]       B3 [24]
    ∶∶∶∶∶∶∶∶∶∶∶                                               Y3 [batch, 7, 7, 24] 
                                                               => reshaped to YY [batch, 7*7*24]
     \x/x\x\x/ ✞      -- fully connected layer (relu+dropout) W4 [7*7*24, 200]       B4 [200]
      · · · ·                                                 Y4 [batch, 200]
      \x/x\x/         -- fully connected layer (softmax)      W5 [200, 10]           B5 [10]
       · · · 
```
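The spatial sizes in the diagram (28, 28, 14, 7) follow from the strides alone. With `'same'` padding the output size is the input size divided by the stride, rounded up, so the kernel size does not affect it. A minimal sketch of that arithmetic, where the helper name `same_padding_output_size` is made up for illustration:

```python
import math


def same_padding_output_size(input_size, stride):
    """Spatial output size of a convolution that uses 'same' padding."""
    return math.ceil(input_size / stride)


size = 28                      # MNIST images are 28x28
sizes = [size]
for stride in (1, 2, 2):       # strides of the three convolutional layers
    size = same_padding_output_size(size, stride)
    sizes.append(size)

# 24 channels come out of the third convolutional layer
flattened = sizes[-1] * sizes[-1] * 24

print(sizes)      # [28, 28, 14, 7]
print(flattened)  # 1176
```

The value 1176 is the `7*7*24` input width of the first fully connected layer.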

In [2]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Convolution2D, Reshape

import numpy as np
np.random.seed(123)  # for reproducibility
from keras.utils import np_utils

from keras.optimizers import Adam

Using TensorFlow backend.


Load and preprocess data

In [3]:
from keras.datasets import mnist

# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Preprocess input data
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

# Preprocess class labels
Y_train = np_utils.to_categorical(y_train, 10)
Y_test = np_utils.to_categorical(y_test, 10)
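To make the two preprocessing steps concrete, here is a small NumPy-only sketch of what they produce. The label and pixel values are made up for illustration, and `np.eye` indexing is used only as a stand-in to show the one-hot layout; it is not how `to_categorical` is implemented.

```python
import numpy as np

# Illustrative digit labels (not read from the dataset)
labels = np.array([5, 0, 4])

# One-hot encoding: row i has a single 1 in column labels[i]
one_hot = np.eye(10, dtype='float32')[labels]

# Illustrative raw pixel intensities in the 0..255 range
pixels = np.array([0, 128, 255], dtype='uint8')

# Scaling to [0, 1], as done for X_train and X_test
scaled = pixels.astype('float32') / 255

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # 1.0 at index 5, 0.0 elsewhere
print(scaled)
```

Each row of the one-hot matrix matches the 10 softmax outputs of the network, which is what `categorical_crossentropy` expects as a target.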

Define model architecture

In [4]:
# three convolutional layers (output depths K, L, M), a fully connected
# layer with dropout, and a final layer of 10 softmax neurons

K = 6  # first convolutional layer output depth
L = 12  # second convolutional layer output depth
M = 24  # third convolutional layer output depth
N = 200  # number of neurons in the fully connected layer

stride_1 = 1 # stride value for convolutional layer 1
stride_2 = 2 # stride value for convolutional layer 2
stride_3 = 2 # stride value for convolutional layer 3

pdrop = .25 # probability for dropout


model = Sequential()
model.add(Convolution2D(K, 5, 5, border_mode='same', input_shape=(28, 28, 1), activation='relu', 
                        subsample = (stride_1,stride_1)))
model.add(Convolution2D(L, 5, 5, border_mode='same', activation='relu', subsample = (stride_2,stride_2)))
model.add(Convolution2D(M, 4, 4, border_mode='same', activation='relu', subsample = (stride_3,stride_3)))
model.add(Reshape((7*7*M,)))
model.add(Dense(N, activation='relu'))
model.add(Dropout(pdrop))
model.add(Dense(10, activation='softmax'))
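As a sanity check on the model above, the number of trainable parameters can be worked out by hand: each layer has one weight per input-output connection plus one bias per output. The sketch below assumes the kernel sizes used in the code cell (5x5, 5x5, 4x4); the helper names are made up for illustration. If the model is built, `model.summary()` should report the same total.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus biases of a 2D convolutional layer."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


counts = {
    'conv1': conv_params(5, 5, 1, 6),
    'conv2': conv_params(5, 5, 6, 12),
    'conv3': conv_params(4, 4, 12, 24),
    'dense1': dense_params(7 * 7 * 24, 200),
    'dense2': dense_params(200, 10),
}
total = sum(counts.values())

for name, count in counts.items():
    print(name, count)
print('total', total)  # total 244010
```

Almost all of the parameters sit in the first fully connected layer, which is why dropout is applied there.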

Compile the model

In [None]:
optimizer = Adam(lr=0.003, decay=0.002)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
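For a single example, the `categorical_crossentropy` loss is the negative log of the probability the network assigns to the correct class. A minimal pure-Python sketch follows; it uses three classes instead of ten to stay short, the probabilities are made up, and the clipping constant `eps` is an assumption for numerical safety rather than the exact value Keras uses.

```python
import math


def categorical_crossentropy(one_hot_target, predicted_probs, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(one_hot_target, predicted_probs)
    )


target = [0, 0, 1]               # the correct class is index 2
confident = [0.05, 0.05, 0.90]   # high probability on the correct class
unsure = [0.40, 0.30, 0.30]      # low probability on the correct class

loss_confident = categorical_crossentropy(target, confident)
loss_unsure = categorical_crossentropy(target, unsure)

print(round(loss_confident, 3))
print(round(loss_unsure, 3))
```

The loss shrinks toward zero as the predicted probability of the correct class approaches 1.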

Fit the model on training data

In [None]:
model.fit(X_train, Y_train, batch_size=1000, nb_epoch=2000, verbose=1)

Epoch 1/2000
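With 60,000 training images and `batch_size=1000`, each epoch performs 60 weight updates. The sketch below estimates how the learning rate shrinks over training, assuming the `decay` argument of this Keras version applies time-based decay per update, `lr / (1 + decay * iterations)`. That formula is an assumption about the optimizer's behaviour and is worth checking against the installed Keras source; `decayed_lr` is a made-up helper name.

```python
def decayed_lr(initial_lr, decay, iterations):
    """Time-based decay: the rate shrinks with the number of updates so far."""
    return initial_lr / (1.0 + decay * iterations)


train_samples = 60000
batch_size = 1000
epochs = 2000

steps_per_epoch = train_samples // batch_size   # 60 updates per epoch
total_steps = steps_per_epoch * epochs          # 120000 updates overall

lr_start = decayed_lr(0.003, 0.002, 0)
lr_after_one_epoch = decayed_lr(0.003, 0.002, steps_per_epoch)
lr_final = decayed_lr(0.003, 0.002, total_steps)

print(steps_per_epoch, total_steps)
print(lr_start, lr_after_one_epoch, lr_final)
```

Under this assumption the base rate falls from 0.003 to roughly 1.2e-5 by the end of training, so most of the learning happens in the early epochs.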

Evaluate model on test data

In [None]:
# score is [loss, accuracy] because the model was compiled with metrics=['accuracy']
score = model.evaluate(X_test, Y_test, verbose=0)
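Because the model was compiled with `metrics=['accuracy']`, `score` holds the test loss followed by the test accuracy. The accuracy itself is simply the fraction of examples whose most probable predicted class matches the label. A small NumPy sketch with made-up probabilities and three classes:

```python
import numpy as np

# Made-up softmax outputs for three examples (each row sums to 1)
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4]])

# Made-up one-hot targets for the same three examples
targets = np.array([[0, 1, 0],
                    [1, 0, 0],
                    [0, 1, 0]])

predicted = probs.argmax(axis=1)    # most probable class per example
true = targets.argmax(axis=1)       # index of the 1 in each one-hot row
accuracy = (predicted == true).mean()

print(predicted)  # [1 0 2]
print(true)       # [1 0 1]
print(accuracy)   # 2 of 3 correct
```

The same comparison could be applied to the output of `model.predict(X_test)` and `Y_test` to inspect which test digits are misclassified.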