# Data science challenge: 

I completed this work with candidate 1037061. Implementation details are described below. 

## Data preprocessing:

Unlike the preprocessing for our neural networks above, we add a dimension to the data rather than flattening it. Instead of the shape (784, 1), each training example now has the shape (1, 28, 28): the first value is the number of color channels, and the last two are the height and width of the image in pixels. The channel count is usually 3 for RGB images, which have red, green, and blue channels; it is 1 here because the MNIST images are greyscale and have a single intensity channel. The code below uses this channels-first layout; the equivalent channels-last layout would be (28, 28, 1).
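As a small illustration of the two possible layouts, the following sketch reshapes a synthetic batch instead of the real MNIST data. The array `flat` and its values are made up for the example; only the shapes matter.

```python
import numpy as np

# Hypothetical stand-in for a batch of 3 flattened 28x28 greyscale images.
flat = np.arange(3 * 784, dtype="float32").reshape(3, 784)

# Channels-first layout: (samples, channels, height, width).
channels_first = flat.reshape(flat.shape[0], 1, 28, 28)

# Channels-last layout: (samples, height, width, channels).
channels_last = flat.reshape(flat.shape[0], 28, 28, 1)

print(channels_first.shape)  # (3, 1, 28, 28)
print(channels_last.shape)   # (3, 28, 28, 1)

# With a single channel, both layouts hold the same pixels in the same order.
same_pixels = np.array_equal(channels_first[:, 0, :, :], channels_last[:, :, :, 0])
print(same_pixels)  # True
```

Whichever layout is chosen, it has to match the image data format that the Keras backend is configured to expect.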

In [1]:
import keras
import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D
from sklearn.preprocessing import OneHotEncoder
from keras import backend as b

Using TensorFlow backend.


In [3]:
# Load the MNIST dataset (already split into training and test sets)
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Reshape to (samples, channels, height, width) and cast to float32
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')

# Normalize pixel values from the range 0-255 to 0-1 by dividing by the maximum of 255
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
# number of classes
num_classes = y_test.shape[1]
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

(60000, 1, 28, 28) (10000, 1, 28, 28) (60000, 10) (10000, 10)
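For readers unfamiliar with the normalization and one-hot encoding steps, here is a minimal NumPy sketch on a tiny made-up example. The helper `one_hot` is a hypothetical stand-in that mimics what `to_categorical` is used for above; it is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# Made-up pixel values spanning the 0-255 range of 8-bit greyscale images.
pixels = np.array([0, 51, 255], dtype="float32")
scaled = pixels / 255  # values now lie in the range [0, 1]

# Made-up digit labels.
labels = np.array([3, 0, 9])
encoded = one_hot(labels, num_classes=10)

print(scaled)
print(encoded.shape)  # (3, 10)
print(encoded[0])     # 1 at index 3, 0 elsewhere
```

Each row of the encoded matrix has the same length as the softmax output of the network, which is what the categorical cross-entropy loss compares against.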


## Model:
When building our CNN, we generally follow the architecture described above. For feature learning, we include a convolutional layer with ReLU activation (Conv2D) that has 32 filters, each of size $5 \times 5$, followed by a max pooling layer with a $2 \times 2$ window. As with our Keras neural network above, we also add a dropout layer to reduce overfitting. Our classification layers consist of a Flatten layer followed by two dense layers with the same hyperparameters as in our earlier network. As before, we compile the model with categorical cross-entropy loss and the Adam optimizer.
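To make the layer sizes concrete, the following sketch works out the output shape and trainable parameter count of each layer by hand. It assumes the Keras defaults (no padding and stride 1 for the convolution, non-overlapping pooling windows) along with the 128 hidden units and 10 classes used in the code; `conv_output_size` is a helper written for this illustration only.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumptions: 28x28 single-channel input, no padding, stride 1 for the
# convolution, and a non-overlapping 2x2 pooling window (stride 2).
channels_in, filters = 1, 32
conv_size = conv_output_size(28, kernel=5)                    # 24
pool_size = conv_output_size(conv_size, kernel=2, stride=2)   # 12
flat_units = filters * pool_size * pool_size                  # 4608

# Trainable parameters: weights plus one bias per filter or output unit.
conv_params = filters * (5 * 5 * channels_in + 1)             # 832
dense_params = flat_units * 128 + 128                         # 589952
output_params = 128 * 10 + 10                                 # 1290
total_params = conv_params + dense_params + output_params     # 592074

print(conv_size, pool_size, flat_units)
print(total_params)
```

Almost all of the parameters sit in the first dense layer, which is why pooling before flattening keeps the model small. These numbers can be compared against the output of `model.summary()`.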

In [4]:
#inspired by: https://books.google.co.uk/books?id=bXJiDwAAQBAJ&pg=PA98&lpg=PA98&dq=model.add(Conv2D(32,+(5,+5),+input_shape%3D(1,+28,+28),+activation%3D%27relu%27))+model.add(MaxPooling2D(pool_size%3D(2,+2)))+model.add(Dropout(0.2))+model.add(Flatten())+model.add(Dense(128,+activation%3D%27relu%27))+model.add(Dense(num_classes,+activation%3D%27softmax%27))+%23+Compile+model+model.compile(loss%3D%27categorical_crossentropy%27,+optimizer%3D%27adam%27,+metrics%3D%5B%27accuracy%27%5D)+%23+Fit+the+model+model.fit(X_train,+y_train,+validation_split%3D0.2,+epochs%3D10,+batch_size%3D200,+verbose%3D1)&source=bl&ots=9SMORye4bV&sig=G4aPjzm6NCwoq0zJQOH5JcAB3ZM&hl=en&sa=X&ved=2ahUKEwjmpaqp-urfAhVDalAKHdcCAYUQ6AEwEHoECAsQAQ#v=onepage&q=model.add(Conv2D(32%2C%20(5%2C%205)%2C%20input_shape%3D(1%2C%2028%2C%2028)%2C%20activation%3D'relu'))%20model.add(MaxPooling2D(pool_size%3D(2%2C%202)))%20model.add(Dropout(0.2))%20model.add(Flatten())%20model.add(Dense(128%2C%20activation%3D'relu'))%20model.add(Dense(num_classes%2C%20activation%3D'softmax'))%20%23%20Compile%20model%20model.compile(loss%3D'categorical_crossentropy'%2C%20optimizer%3D'adam'%2C%20metrics%3D%5B'accuracy'%5D)%20%23%20Fit%20the%20model%20model.fit(X_train%2C%20y_train%2C%20validation_split%3D0.2%2C%20epochs%3D10%2C%20batch_size%3D200%2C%20verbose%3D1)&f=false
# Use the channels-first layout (channels, height, width) to match the reshaped data
b.set_image_data_format('channels_first')

# create model
model = Sequential()

#add layers
model.add(Conv2D(32, (5, 5), input_shape=(1, 28, 28), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
    
model.add(Flatten())
model.add(Dense(128, activation='relu'))
    
model.add(Dense(num_classes, activation='softmax'))
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model
model.fit(X_train, y_train, validation_split=0.2, epochs=10, batch_size=200, verbose=1)

# Save model
model.save('model_convolution.h5')

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("CNN Error: %.2f%%" % (100-scores[1]*100))

Train on 48000 samples, validate on 12000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10




CNN Error: 1.08%


## Model performance: 
Our new model achieves a test accuracy of 98.93 percent, which is slightly higher than our previous model's 98.19 percent.

In [21]:
# Load the saved model and re-evaluate it on the test set
from keras.models import load_model

model = load_model('model_convolution.h5')

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("CNN Performance: %s" % (scores[1]))

CNN Performance: 0.9893
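The accuracy reported here is the fraction of test examples whose highest-probability class matches the true label, and the error printed after training is 100 minus that value as a percentage. A minimal sketch of the calculation, using made-up predictions and targets rather than the model's actual outputs:

```python
import numpy as np

# Made-up softmax outputs for 4 examples and 3 classes (each row sums to 1).
probabilities = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# Made-up one-hot targets; the last example is misclassified.
targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])

predicted = probabilities.argmax(axis=1)  # most likely class per example
actual = targets.argmax(axis=1)           # true class per example
accuracy = float((predicted == actual).mean())
error_percent = 100 - accuracy * 100

print(accuracy)       # 0.75
print(error_percent)  # 25.0
```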
