### Convolutional Neural Network Example Workflow (Keras)

This example builds CNN models with `keras`. The data comes from the [Kaggle MNIST Digit Recognizer competition](https://www.kaggle.com/c/digit-recognizer).

#### 1. Preparation

Read the data, preprocess it (reshape the pixel rows into images, scale the pixel values, one-hot encode the labels), and set the model parameters.

In [1]:
import numpy as np
import pandas as pd

In [2]:
# read data
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')

In [3]:
# check data
train.head(3)

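| | label | pixel0 | pixel1 | pixel2 | pixel3 | pixel4 | pixel5 | pixel6 | pixel7 | pixel8 | ... | pixel774 | pixel775 | pixel776 | pixel777 | pixel778 | pixel779 | pixel780 | pixel781 | pixel782 | pixel783 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |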


In [4]:
test.head(3)

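| | pixel0 | pixel1 | pixel2 | pixel3 | pixel4 | pixel5 | pixel6 | pixel7 | pixel8 | pixel9 | ... | pixel774 | pixel775 | pixel776 | pixel777 | pixel778 | pixel779 | pixel780 | pixel781 | pixel782 | pixel783 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |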


In [5]:
train.shape, test.shape

((42000, 785), (28000, 784))

In [6]:
# split into training and validation set
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(train.iloc[:, 1:].values, train.label.values, test_size=.05, random_state=218)
X_train.shape, X_val.shape

((39900, 784), (2100, 784))
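
The 5% hold-out above amounts to shuffling the row indices and slicing off a validation part. The block below is a minimal NumPy sketch of that idea; `split_indices` is a hypothetical helper, and it will not reproduce the exact rows that `train_test_split` selects for `random_state=218`, only the same set sizes.

```python
import numpy as np

def split_indices(n_samples, test_size, seed):
    """Return (train_idx, val_idx) from a seeded random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # shuffled row indices
    n_val = int(round(n_samples * test_size))   # size of the validation part
    return order[n_val:], order[:n_val]

# 42000 labelled rows, 5% held out for validation
train_idx, val_idx = split_indices(42000, 0.05, seed=218)
print(len(train_idx), len(val_idx))
```

Indexing the feature and label arrays with `train_idx` and `val_idx` would then give the two subsets.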

In [7]:
X_test = test.values
X_test.shape

(28000, 784)

In [8]:
import keras
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Flatten, Input
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.optimizers import Adam
from keras import backend as K

Using TensorFlow backend.


In [9]:
# settings for the CNN
batch_size, num_classes, epochs = 128, 10, 20
img_rows, img_cols = 28, 28
input_shape = (img_rows, img_cols, 1)  # channels_last: (rows, cols, channels)

In [10]:
# reshape each flat 784-pixel row into a 28x28 single-channel image
X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
X_val = X_val.reshape(X_val.shape[0], img_rows, img_cols, 1)
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)

In [11]:
# pre-processing: cast to float32 and scale pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32')
X_val = X_val.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255
X_val /= 255
X_test /= 255

In [12]:
# pre-processing: convert target to one-hot vector
y_train = to_categorical(y_train, num_classes)
y_val = to_categorical(y_val, num_classes)
y_train.shape, y_val.shape

((39900, 10), (2100, 10))
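
The preprocessing steps above can be checked on a small synthetic batch without Keras. The sketch below uses random pixel values and made-up labels (assumptions, not the Kaggle data) and builds the one-hot matrix by indexing `np.eye`, which should give the same layout as `to_categorical` for integer labels in the range 0 to 9.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for four flattened 28x28 images (pixel values 0..255).
flat = rng.integers(0, 256, size=(4, 784))
labels = np.array([1, 0, 4, 9])

# Reshape to (samples, rows, cols, channels) and scale to [0, 1].
images = flat.reshape(flat.shape[0], 28, 28, 1).astype('float32') / 255

# One-hot encode: row i of the identity matrix is the vector for class i.
one_hot = np.eye(10, dtype='float32')[labels]

print(images.shape, one_hot.shape)
print(one_hot.argmax(axis=1))  # recovers the original labels
```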

#### 2. Build model

Build the CNN with the `Sequential` API, the most common way to define a model in `keras`.

In [13]:
def cnn_model():
    '''
    Simple CNN model for MNIST digit classification task
    conv2d -> conv2d -> maxpooling -> dropout -> flatten -> dense (fully connected) -> dropout -> dense (softmax)
    Built with Sequential(), which only supports a plain stack of layers;
    networks with branches or multiple inputs/outputs need the functional API
    '''
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
    model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(.25))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(num_classes, activation='softmax'))
    
    return model
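
To see how the layers in `cnn_model` fit together, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the `Conv2D` defaults used above (stride 1, `'valid'` padding) and a 28x28x1 input; the helper functions are hypothetical and only do the arithmetic, they do not call Keras.

```python
def conv_out(size, kernel):
    # 'valid' padding with stride 1: each side shrinks by kernel - 1
    return size - kernel + 1

def conv_params(kernel, c_in, c_out):
    # kernel*kernel*c_in weights plus one bias, for each output filter
    return (kernel * kernel * c_in + 1) * c_out

def dense_params(n_in, n_out):
    # n_in weights plus one bias, for each output unit
    return (n_in + 1) * n_out

side = conv_out(28, 3)     # 26 after the first 3x3 convolution
side = conv_out(side, 3)   # 24 after the second 3x3 convolution
side = side // 2           # 12 after 2x2 max pooling
flat = side * side * 64    # length of the flattened vector fed to Dense(128)

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 128) + dense_params(128, 10))
print(side, flat, total)
```

Dropout, pooling, and flatten layers have no trainable weights, so they do not contribute to the total. Calling `model.summary()` on the built model should report the same shapes and counts.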

In [14]:
model = cnn_model()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(X_val, y_val))

Train on 39900 samples, validate on 2100 samples
Epoch 1/20
...
Epoch 20/20


<keras.callbacks.History at 0x4aa9e780>

In [15]:
score = model.evaluate(X_val, y_val, verbose=0)
print('Validation loss:', score[0])
print('Validation accuracy:', score[1])

Validation loss: 0.0318565379559
Validation accuracy: 0.988571428571


In [16]:
# predicted class probabilities for the test set; argmax picks the most likely digit
pred = model.predict(X_test)
pred_label = pred.argmax(axis=-1)
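
The `argmax` step turns each row of softmax probabilities into a single class label. A small illustration with made-up probabilities for three images and four classes (hypothetical values, not model output):

```python
import numpy as np

# Each row is one image; each column is the probability of one class.
probs = np.array([[0.10, 0.70, 0.10, 0.10],
                  [0.60, 0.20, 0.10, 0.10],
                  [0.05, 0.05, 0.10, 0.80]])

labels = probs.argmax(axis=-1)     # index of the largest probability per row
confidence = probs.max(axis=-1)    # the probability assigned to that class
print(labels, confidence)
```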

In [17]:
# build the submission file (ImageId starts at 1)
submission = pd.DataFrame({'ImageId': np.arange(1, test.shape[0] + 1), 
                           'Label': pred_label})
submission.to_csv('submission_simple_cnn.csv', index=False)

In [18]:
def lenet():
    '''
    LeNet-style CNN
    conv2d -> maxpooling -> conv2d -> maxpooling -> flatten -> dense -> dropout -> dense -> dropout -> dense (softmax)
    '''
    model = Sequential()
    model.add(Conv2D(filters=12, kernel_size=(5, 5), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(filters=25, kernel_size=(5, 5), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(180, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(100, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(num_classes, activation='softmax'))
    
    return model

In [19]:
lenet_model = lenet()
lenet_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
lenet_model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs + 50, verbose=1, validation_data=(X_val, y_val))

Train on 39900 samples, validate on 2100 samples
Epoch 1/70
...
Epoch 70/70


<keras.callbacks.History at 0x4c69f7f0>

In [20]:
pred = lenet_model.predict(X_test)
pred_label = pred.argmax(axis=-1)

In [21]:
submission['Label'] = pred_label
submission.to_csv('submission_lenet_seq.csv', index=False)

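A LeNet-style network can also be defined with the functional API (`Model`), which connects layers by calling them on tensors: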

In [22]:
def lenet2(input_shape):
    '''
    Another way to define a CNN: the functional API with Model()
    '''
    X_input = Input(input_shape)
    X = Conv2D(64, (5, 5), activation='relu')(X_input)
    X = MaxPooling2D((2, 2))(X)
    X = Conv2D(128, (5, 5), activation='relu')(X)
    X = MaxPooling2D((2, 2))(X)
    X = Flatten()(X)
    X = Dense(256, activation='relu')(X)
    X = Dropout(.5)(X)
    X = Dense(100, activation='relu')(X)
    X = Dropout(.5)(X)
    X = Dense(10, activation='softmax')(X)
    
    model = Model(inputs=[X_input], outputs=X)
    return model

In [23]:
lenet_model2 = lenet2(input_shape)  # build the functional-API model defined above
lenet_model2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
lenet_model2.fit(X_train, y_train, batch_size=batch_size, epochs=epochs + 30, verbose=1, validation_data=(X_val, y_val))

Train on 39900 samples, validate on 2100 samples
Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x4d5de978>

In [24]:
pred = lenet_model2.predict(X_test)
pred_label = pred.argmax(axis=-1)

In [25]:
submission['Label'] = pred_label
submission.to_csv('submission_lenet_model.csv', index=False)