# Recognizing Digits with CNNs

This kernel takes a simple approach to recognizing handwritten digits in the famous MNIST dataset using convolutional neural networks (CNNs). Thanks go to Yassine Ghouzam, whose kernel [Introduction to CNN Keras - 0.997 (top 6%)](https://www.kaggle.com/yassineghouzam/introduction-to-cnn-keras-0-997-top-6) served as a very helpful guide on how to approach this task.

Let's begin by loading and taking a peek into our dataset.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, Dropout, MaxPooling2D
from keras.losses import categorical_crossentropy
from keras.utils import to_categorical
from keras.callbacks import EarlyStopping

In [2]:
train = pd.read_csv("../input/train.csv")
test = pd.read_csv("../input/test.csv")

In [3]:
train.head()

In [4]:
train.describe()

Let's also check to see if there is a class imbalance in our training set.

In [5]:
# Pass the labels as x; recent seaborn releases no longer accept them positionally
sns.countplot(x=train["label"])

And let's also confirm whether or not there are any missing values in our training or test sets.

In [6]:
train.isnull().any().describe()

In [7]:
test.isnull().any().describe()
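
To make the output of that check easier to read, here is a minimal sketch on a toy frame. The frame and its column names are made up for illustration and are not part of the competition data. `isnull().any()` gives one boolean per column, and `describe()` then summarizes those booleans, so a result with a single unique value of `False` means no column has missing entries.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical data) with one missing value in the second column
toy = pd.DataFrame({"pixel0": [0, 255, 12], "pixel1": [7, np.nan, 3]})

# One boolean per column: does the column contain any missing value?
per_column = toy.isnull().any()

# Summary of those booleans, in the same form as the cells above
summary = per_column.describe()

# A more direct alternative: the total number of missing cells
total_missing = int(toy.isnull().sum().sum())

print(per_column)
print(summary)
print(total_missing)
```

Counting the missing cells directly is an alternative to the `describe()` idiom; both answer the same question.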

Since CNNs achieve excellent results on this dataset, let's build one to model this data. We know from the dataset description that these are 28 x 28 images, so each row of pixel values will be reshaped into a 28 x 28 single-channel image for training. We will also normalize our data by dividing it all by 255, which scales the pixel intensities to the range [0, 1].

In [8]:
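# Shuffle the rows so that the tail later used for validation is a random sample
train = train.sample(frac=1, random_state=0)
# Separate the pixel columns from the labels and scale the pixels to [0, 1]
X_train = train.drop(labels=["label"], axis=1)
X_train = X_train / 255
Y_train = train["label"]
num_classes = len(Y_train.unique())
# Reshape each flat row into a 28 x 28 image with a single channel
img_rows, img_cols = 28, 28
X_train = X_train.values.reshape(-1, img_rows, img_cols, 1)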
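
As a quick sanity check of the preprocessing above, here is a small sketch on synthetic data. The random array stands in for the pixel columns and is an assumption for illustration only. It shows the resulting shape and how a flat pixel index maps to a row and column after the reshape.

```python
import numpy as np

# Synthetic stand-in for the pixel columns: 5 images, 784 values each in 0..255
rng = np.random.default_rng(0)
flat_pixels = rng.integers(0, 256, size=(5, 784))

img_rows, img_cols = 28, 28

# Scale to [0, 1], then reshape to (samples, rows, cols, channels)
scaled = flat_pixels / 255
images = scaled.reshape(-1, img_rows, img_cols, 1)

print(images.shape)

# The reshape is row-major: flat pixel k lands at row k // 28, column k % 28
k = 3 * img_cols + 5
assert images[0, 3, 5, 0] == scaled[0, k]
```

The trailing dimension of 1 is the channel axis that `Conv2D` expects for grayscale images.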

Let's visualize a data instance just to get an idea of what it looks like.

In [9]:
plt.imshow(X_train[0][:, :, 0], cmap="Greys")

Let's now build a CNN, using the Keras MNIST example as a template. That example can be found [here](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py).

In [10]:
model = Sequential()
model.add(Conv2D(32, (3, 3), activation="relu", input_shape=(img_rows, img_cols, 1)))
model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation="softmax"))
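
To see how the image shrinks as it passes through these layers, here is a rough sketch of the shape and parameter arithmetic in plain Python. It assumes the Keras defaults for `Conv2D` (no padding, stride 1, one bias per filter) and 10 classes; the helper names are made up for this illustration.

```python
def conv2d_valid(size, channels_in, filters, kernel=3):
    """Output size, output channels and parameter count of an unpadded, stride-1 convolution."""
    out_size = size - kernel + 1
    params = kernel * kernel * channels_in * filters + filters  # weights + biases
    return out_size, filters, params


def dense_params(units_in, units_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return units_in * units_out + units_out


size, channels = 28, 1
size, channels, p_conv1 = conv2d_valid(size, channels, 32)  # 26 x 26 x 32
size, channels, p_conv2 = conv2d_valid(size, channels, 64)  # 24 x 24 x 64
size = size // 2                                            # 2 x 2 max pooling: 12 x 12 x 64
flat_units = size * size * channels                         # input width of the first Dense layer
p_dense1 = dense_params(flat_units, 128)
p_dense2 = dense_params(128, 10)                            # assumes num_classes == 10

total_params = p_conv1 + p_conv2 + p_dense1 + p_dense2
print(flat_units, total_params)
```

Dropout and pooling layers have no trainable parameters, so they do not appear in the total. Calling `model.summary()` on the model above is a way to compare these numbers against what Keras reports.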

Let's now compile said model.

In [11]:
model.compile(loss=categorical_crossentropy, optimizer="adam", metrics=["accuracy"])

Let's now fit the model for up to 50 epochs, holding out 10% of the data instances as a validation set. Since the validation_split argument takes the last 10% of the samples in our training set to use for validation, let's ensure that there is no class imbalance in this small validation set.

In [12]:
# Pass the labels as x; recent seaborn releases no longer accept them positionally
sns.countplot(x=Y_train.tail(int(0.1 * len(Y_train))))
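
For a numeric view of the same check, here is a minimal sketch of how a trailing validation slice can be taken and its class proportions computed. The labels are synthetic, and the split index is an approximation of what Keras does internally rather than a copy of its code.

```python
import numpy as np

# Synthetic, already shuffled labels standing in for Y_train (hypothetical data)
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)

validation_split = 0.1

# The validation samples come from the end of the array, before any shuffling
split_at = int(len(labels) * (1 - validation_split))
train_labels, val_labels = labels[:split_at], labels[split_at:]

# Share of each digit in the validation slice
val_counts = np.bincount(val_labels, minlength=10)
val_share = val_counts / val_counts.sum()

print(len(train_labels), len(val_labels))
print(val_share.round(3))
```

Because the slice is positional, an unshuffled training set sorted by label would give a validation set with only a few classes, which is why the rows are shuffled before training.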

Let's now one-hot encode the labels and train the model, stopping early if the model's validation accuracy fails to improve for 10 consecutive epochs.

In [13]:
Y_train = to_categorical(Y_train, num_classes)
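
To illustrate what this encoding produces, here is a small NumPy sketch of the same idea. It is not how `to_categorical` is implemented, just an equivalent construction on three made-up labels.

```python
import numpy as np

labels = np.array([3, 0, 9])   # hypothetical digit labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes)[labels]

# argmax along the class axis recovers the original labels
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(one_hot[0])
print(recovered)
```

Each row has a single 1 in the position of its class, which matches the shape of the softmax output that `categorical_crossentropy` compares it against.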

In [14]:
model.fit(X_train,
          Y_train,
          epochs=50,
          # Keras 2.3+ and tf.keras 2.x report this metric as "val_accuracy"
          callbacks=[EarlyStopping(monitor="val_acc", patience=10)],
          validation_split=0.1)
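
The patience rule used by the callback can be sketched in plain Python. This is a simplified illustration that ignores options such as `min_delta`, not the Keras implementation; the function name and the accuracy history are made up.

```python
def stopping_epoch(val_accuracies, patience):
    """Return the 1-based epoch at which training would stop, or None if it runs to the end."""
    best = float("-inf")
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            # A new best score resets the counter
            best = acc
            wait = 0
        else:
            # No improvement: count the epoch and stop once patience is used up
            wait += 1
            if wait >= patience:
                return epoch
    return None


history = [0.90, 0.95, 0.97, 0.96, 0.97, 0.96]  # hypothetical validation accuracies
print(stopping_epoch(history, patience=3))
```

Note that matching the best score does not count as an improvement here, so a plateau uses up patience just like a drop does.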

Let's now use our model to make predictions on the test data and submit those predictions.

In [None]:
test = test / 255
X_test = test.values.reshape(-1, img_rows, img_cols, 1)
predictions = model.predict(X_test)

In [None]:
submission = pd.DataFrame(np.argmax(predictions, axis=1)).reset_index()
submission.columns = ["ImageId", "Label"]
submission["ImageId"] += 1
submission.to_csv("submission.csv", index=False)
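
As a check on the submission format, here is a small sketch that builds the same two columns from made-up class probabilities. It constructs the frame directly instead of using `reset_index`, which is an alternative to the cell above, not a replacement for it.

```python
import numpy as np
import pandas as pd

# Hypothetical softmax outputs for three images over three classes
probabilities = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
])

# The predicted label is the class with the highest probability
labels = np.argmax(probabilities, axis=1)

# Image ids start at 1, matching the "+= 1" step in the cell above
demo_submission = pd.DataFrame({
    "ImageId": np.arange(1, len(labels) + 1),
    "Label": labels,
})

print(demo_submission)
```

Both versions yield one row per test image with ids starting at 1.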