MNIST is a dataset of handwritten digits from 0 to 9. The training images are in train.csv, one image per row. The first column holds the image label, and the remaining 784 columns each hold one pixel value, so every image is of size 28x28 (width x height).
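
To make the row layout concrete, here is a minimal sketch that uses a synthetic row instead of the real train.csv. The name fake_row is hypothetical, and the sketch assumes the pixels are stored row by row, so the pixel at image row i and column j sits at flat index i * 28 + j.

```python
import numpy as np

# Hypothetical stand-in for one row of train.csv: a label followed by 784 pixels.
fake_row = np.concatenate(([7], np.arange(784)))

label = fake_row[0]               # first column is the label
pixels = fake_row[1:]             # remaining 784 columns are pixel values
image = pixels.reshape(28, 28)    # row-major: flat index = i * 28 + j

print(label, image.shape)
print(image[2, 5] == pixels[2 * 28 + 5])
```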

We start by importing the required libraries into our notebook.

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten, BatchNormalization
from keras.layers import Dense, Dropout

from keras.preprocessing.image import ImageDataGenerator
from keras.utils import np_utils
import keras

We load the training data, i.e. train.csv, into a pandas dataframe.

In [None]:
train_data = pd.read_csv("../input/train.csv")

We check the first few rows of our pandas dataframe containing the training data.

In [None]:
train_data.head()

The shape gives the number of rows and columns: 42000 rows and 785 columns (1 label column plus 784 pixel columns).

In [None]:
train_data.shape

I load the values of the pandas dataframe into two variables.
images contains only the pixel values, leaving out the first column, which holds the labels.
labels contains only the labels, so we select just the first column for all the rows.

In [None]:
images = train_data.iloc[:,1:].values
labels = train_data.iloc[:,0].values

We plot an image to get a visual representation of the data.

In [None]:
def show_image(number):
    image = images[number]
    plt.axis('off')
    plt.imshow(image.reshape(28,28))
    plt.title(labels[number])

In the images below, the label is shown as the title above each image.

In [None]:
show_image(24)

In [None]:
show_image(890)

In [None]:
show_image(32190)

We now split the data into training and validation sets. We choose an 8:2 ratio for training and validation.

In [None]:
X_train, X_val, Y_train, Y_val = train_test_split(images, labels, test_size=0.2, random_state=0)
print("Length of X_train:", len(X_train))
print("Length of Y_train:", len(Y_train))
print("Length of X_val:", len(X_val))
print("Length of Y_val:", len(Y_val))
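
The idea behind an 8:2 split can be sketched with plain NumPy. This is only an illustration of a shuffled split, not the exact procedure train_test_split uses internally, so it will not reproduce the same indices. The names n_samples, shuffled, train_idx and val_idx are hypothetical.

```python
import numpy as np

n_samples = 42000          # number of rows in train.csv
test_size = 0.2

rng = np.random.RandomState(0)          # fixed seed for reproducibility
shuffled = rng.permutation(n_samples)   # shuffled row indices

n_val = int(round(n_samples * test_size))
val_idx = shuffled[:n_val]              # 20% of the rows for validation
train_idx = shuffled[n_val:]            # remaining 80% for training

print(len(train_idx), len(val_idx))
```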

Convolution2D expects a 4D input of shape (samples, height, width, channels). As each image is of size 28x28 with a single grayscale channel, we reshape the data to (-1, 28, 28, 1).

In [None]:
X_train = X_train.reshape(-1,28,28,1)
X_val = X_val.reshape(-1,28,28,1)
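
A small sketch of what this reshape does, using a hypothetical batch of five blank images. It assumes the channels-last layout, where the final axis is the channel.

```python
import numpy as np

# Hypothetical batch of 5 flattened images, 784 pixels each.
flat_batch = np.zeros((5, 784), dtype=np.uint8)

# -1 lets NumPy infer the number of samples; the trailing 1 is the channel axis.
batch_4d = flat_batch.reshape(-1, 28, 28, 1)

print(flat_batch.shape, "->", batch_4d.shape)
```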

Keras has the functionality to one-hot encode the labels using to_categorical.

In [None]:
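# keras.utils.to_categorical is the same function, reached without the deprecated np_utils module
Y_train_one_hot = keras.utils.to_categorical(Y_train, 10)
Y_validation_one_hot = keras.utils.to_categorical(Y_val, 10)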
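
To see what one-hot encoding produces, here is a NumPy-only sketch with hypothetical labels. It is not how to_categorical is implemented; it only yields the same pattern of zeros and ones for integer labels.

```python
import numpy as np

digit_labels = np.array([3, 0, 9])   # hypothetical example labels
num_classes = 10

# Row k of the identity matrix is the one-hot vector for class k.
one_hot = np.eye(num_classes)[digit_labels]

print(one_hot)
```

Each row contains a single 1 at the position of its label, so taking the argmax of a row recovers the original digit.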

I build the model here. Since the layers are stacked one after another, we use a Sequential model.
I add the first Convolution2D layer, which produces 16 feature maps using filters of size 3x3.
A MaxPooling2D layer with a pool size of 2x2 is used next, followed by a Dropout layer with a rate of 0.2.

A second block of Convolution2D (32 feature maps), MaxPooling2D and Dropout layers is added in the same way.
The output is then flattened and batch-normalized, and a dense (hidden) layer with 64 neurons is added.
The final dense layer has 10 neurons with a softmax activation, as we expect one probability per digit.
I rescale the training and validation pixel values by 1./255 to normalize them to the range [0, 1]. The training generator also applies random shear and zoom as augmentation.
Then we fit the model using the standard adam optimizer.

In [None]:
classifier = Sequential()

# The 3x3 kernel is passed as a tuple; in the Keras 2 API a third positional argument is the stride
classifier.add(Convolution2D(16, (3,3), input_shape=(28,28,1), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2)))
classifier.add(Dropout(0.2))

# input_shape is only needed on the first layer
classifier.add(Convolution2D(32, (3,3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2)))
classifier.add(Dropout(0.2))

classifier.add(Flatten())

classifier.add(BatchNormalization())
# units replaces the older output_dim argument
classifier.add(Dense(units = 64, activation='relu'))
classifier.add(Dense(units = 10, activation='softmax'))

classifier.summary()

train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=False)
train_set = train_datagen.flow(X_train, Y_train_one_hot, batch_size=32)

validation_datagen = ImageDataGenerator(rescale=1./255)
validation_set = validation_datagen.flow(X_val, Y_validation_one_hot, batch_size=32)


classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

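# steps_per_epoch counts batches, not samples; len() of each iterator is its number of batches per pass
classifier.fit_generator(train_set,
                    steps_per_epoch=len(train_set), epochs=5,
                    validation_data=validation_set, validation_steps=len(validation_set), shuffle=True)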
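
The feature-map sizes reported by classifier.summary() can be checked by hand. The sketch below assumes 3x3 kernels with stride 1 and no padding, and non-overlapping 2x2 pooling. The helper names conv_out and pool_out are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping max pooling (floor division)."""
    return size // pool

size = 28
after_conv1 = conv_out(size)          # first 3x3 convolution
after_pool1 = pool_out(after_conv1)   # first 2x2 max pooling
after_conv2 = conv_out(after_pool1)   # second 3x3 convolution
after_pool2 = pool_out(after_conv2)   # second 2x2 max pooling

# 32 feature maps come out of the second convolution.
flattened = after_pool2 * after_pool2 * 32

print(after_conv1, after_pool1, after_conv2, after_pool2, flattened)
```

Under these assumptions the spatial size goes 28, 26, 13, 11, 5, so the Flatten layer passes 5 * 5 * 32 = 800 values to the dense layers.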

Now I load the test file.

In [None]:
X_test = pd.read_csv("../input/test.csv")
X_test.head()

In [None]:
X_test = X_test.iloc[:,:].values

In [None]:
# Scale pixel values to [0, 1], matching the rescale=1./255 used for training and validation
X_test = X_test / 255.0
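
As a quick illustration of this scaling, the sketch below applies it to three hypothetical raw pixel values. It assumes the pixels are 8-bit integers in the range 0 to 255.

```python
import numpy as np

raw_pixels = np.array([0, 128, 255], dtype=np.uint8)   # hypothetical raw values

scaled = raw_pixels / 255.0    # true division promotes the result to float64

print(scaled.min(), scaled.max())
```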

In [None]:
X_test = X_test.reshape(-1,28,28,1)

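I predict the labels of the test images using predict. It returns 10 class probabilities per image, and argmax then picks the most likely digit.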

In [None]:
predictions = classifier.predict(X_test)

In [None]:
predictions = np.argmax(predictions,axis = 1)
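
The argmax step can be illustrated with two hypothetical rows of softmax output. The values in probs are made up for the example and are not taken from the model.

```python
import numpy as np

# Hypothetical softmax outputs for 2 images (each row sums to 1).
probs = np.array([
    [0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01],
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.60, 0.03, 0.02],
])

# axis=1 takes the index of the largest value within each row.
predicted_digits = np.argmax(probs, axis=1)

print(predicted_digits)
```

The first row peaks at index 2 and the second at index 7, so those are the predicted digits.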

In [None]:
# ImageId runs from 1 to the number of test images (28000 here)
ImageId = np.arange(1, len(X_test) + 1)

In [None]:
test_images = X_test
test_labels = predictions

In [None]:
def show_test_image(number):
    test_image = test_images[number]
    plt.axis('off')
    plt.imshow(test_image.reshape(28,28))
    plt.title("Predicted:{}".format(test_labels[number]))

In [None]:
show_test_image(54)

In [None]:
show_test_image(3439)

In [None]:
show_test_image(15439)

In [None]:
Label = predictions

In [None]:
submission = pd.DataFrame()

In [None]:
submission['ImageId'] = ImageId
submission['Label'] = Label

In [None]:
submission.to_csv("MNIST2.csv",index=False)