### MNIST Digit Recognizer using CNN ###

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

**Importing Libraries**

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Conv2D,Flatten,MaxPooling2D,Dropout,BatchNormalization
import matplotlib.pyplot as plt 
import seaborn as sns

**Exploring The Dataset**
The data consists of two CSV files: train and test.

In [None]:
filepath_train = "/kaggle/input/digit-recognizer/train.csv"
filepath_test = "/kaggle/input/digit-recognizer/test.csv"

In [None]:
df = pd.read_csv(filepath_train)
df.head()

**Every row is one example: a label followed by 784 pixel values (columns pixel0 to pixel783) that form a 28 x 28 image of a digit.**
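
As a minimal sketch of how a flat row of 784 values maps onto a 28 x 28 grid, the snippet below uses a made-up row (`flat_row` is a hypothetical stand-in, not data from train.csv) and assumes the usual row-major layout, where pixel index = row * 28 + column.

```python
import numpy as np

# Hypothetical stand-in for one row of pixel columns (pixel0 ... pixel783).
flat_row = np.arange(784)

# Row-major reshape: the first 28 values become the top row of the image.
image = flat_row.reshape(28, 28)

print(image.shape)    # (28, 28)
print(image[1, 0])    # 28, the first pixel of the second row
```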

In [None]:
df.shape

**The training data has 42000 example images of digits.**

In [None]:
df.isnull().sum().sort_values(ascending = False) 

**No Null Values here**

In [None]:
df_test = pd.read_csv(filepath_test)
df_test.shape

**The test data has 28000 images of 784 pixels (28 x 28) each.**

**Converting the dataframe into arrays and separating the training data from the labels**

In [None]:
X_train,y_train = np.asarray(df.drop('label',axis= 1)), np.asarray(df.loc[:,'label'])

In [None]:
print(X_train.shape, y_train.shape)

**42000 flattened images (784 = 28 x 28 pixels each) and 42000 labels, one per image.**

In [None]:
X_test = np.asarray(df_test)

In [None]:
print(X_test.shape)

**The test data is now an array of 28000 examples with 784 pixels (28 x 28) per image.**

**Let's check the distribution of the labels**

In [None]:
plt.figure(figsize = (20,15))

plt.xticks(size=15)
# Pass the labels by keyword; recent seaborn versions no longer accept them positionally
sns.countplot(x=y_train, linewidth=3, edgecolor=sns.color_palette())
plt.title('Distribution of labels in the train dataset', fontdict={'color' : 'Black' , 'fontsize' : 30})

plt.show()

**There are nearly equal numbers of examples for each label (digits 0 to 9), so the data is balanced.**
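
The balance can also be checked numerically rather than read off the plot. The sketch below counts examples per class with `np.bincount`; the `labels` array is a small hypothetical stand-in for the integer `y_train` at this point in the notebook.

```python
import numpy as np

# Hypothetical labels standing in for y_train (before one-hot encoding).
labels = np.array([0, 1, 1, 2, 2, 2, 9])

# Count how often each digit 0..9 occurs, then convert to fractions.
counts = np.bincount(labels, minlength=10)
share = counts / counts.sum()

print(counts)  # [1 2 3 0 0 0 0 0 0 1]
```

With the real labels, each entry of `share` should be close to 0.1 if the classes are balanced.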

**As discussed, each image is stored as a row of pixel columns, so we have to reshape the values to visualise them as images. The CNN also expects its input as images (reshaped 2D matrices with a channel axis), not flat rows.**

In [None]:
print("Before Reshaping : ")
print("Shape of X_train :" ,X_train.shape)
print("Shape of y_train :" ,y_train.shape)
print("Shape of X_test :" ,X_test.shape)

In [None]:
X_train = X_train.reshape(len(X_train), 28,28,1)
X_test = X_test.reshape(len(X_test), 28,28,1)

y_train = tf.keras.utils.to_categorical(y_train)

In [None]:
print("After Reshaping : ")
print("Shape of X_train :" ,X_train.shape)
print("Shape of y_train :" ,y_train.shape)
print("Shape of X_test :" ,X_test.shape)

**The data has been reshaped: the training data is now 42000 images of size 28 x 28 with one channel, and the labels are one-hot encoded, so their shape changed from (42000,) to (42000, 10).**
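
To make the one-hot step concrete, here is a minimal NumPy sketch of the same idea. The helper `one_hot` is hypothetical and only illustrates the effect of `to_categorical`; it is not how Keras implements it.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1.0 per row."""
    labels = np.asarray(labels)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype="float32")[labels]

encoded = one_hot([3, 0, 9])
print(encoded.shape)            # (3, 10)
print(encoded.argmax(axis=1))   # [3 0 9]
```

Taking `argmax` along axis 1 recovers the original integer labels, which is handy whenever the digit itself is needed after encoding.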

**Visualising the images**

In [None]:
L = 5
W = 5
fig, axes = plt.subplots(L, W, figsize = (15,15))
axes = axes.ravel()

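for i in range(0, L * W):
    # Drop the channel axis for plotting; y_train is one-hot, so argmax recovers the digit
    axes[i].imshow(X_train[i].squeeze(), cmap='gray')
    axes[i].set_title("Digit = " + str(np.argmax(y_train[i])))
    axes[i].axis('off')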
plt.subplots_adjust(wspace=0.5)

**Perfect! Let's build a simple CNN model.**

In [None]:
def get_model(input_shape):
    model = Sequential()
    model.add(Conv2D(filters = 16,kernel_size = (3,3),activation = 'relu',input_shape = input_shape))
    model.add(BatchNormalization())

    model.add(Conv2D(filters = 32,kernel_size = (3,3),activation = 'relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D((2,2)))
    # model.add(Dropout(0.2))

    model.add(Conv2D(filters = 128,kernel_size = (3,3),activation = 'relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D((2,2)))
    # model.add(Dropout(0.2))

    model.add(Conv2D(filters = 256,kernel_size = (3,3),activation = 'relu'))
    model.add(MaxPooling2D((2,2)))
    
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(10,activation='softmax')) 
    
    return model

In [None]:
model = get_model(input_shape = (28,28,1))
model.summary()
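
The spatial sizes reported by the summary can be reproduced by hand. The sketch below assumes the Keras defaults used in `get_model` (3x3 convolutions with 'valid' padding and stride 1, 2x2 max pooling with stride 2); the helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel=3):
    # 'valid' padding, stride 1: each side shrinks by kernel - 1.
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling with stride 2 floors the division.
    return size // pool

# Layer order from get_model: conv, conv, pool, conv, pool, conv, pool.
size = 28
trace = [size]
for layer in ["conv", "conv", "pool", "conv", "pool", "conv", "pool"]:
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

print(trace)  # [28, 26, 24, 12, 10, 5, 3, 1]
```

The final feature map is 1 x 1 with 256 channels, so `Flatten` hands 256 values to the first `Dense` layer.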

In [None]:
model.compile(optimizer='adam',metrics = ['accuracy'],loss = 'categorical_crossentropy')
history = model.fit( X_train, y_train, batch_size = 300  , epochs = 30)
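
The fit call above trains on all of `X_train`, so the loss and accuracy it reports are training metrics only. One possible way to also track generalisation is to hold out part of the data. The sketch below is an assumption, not part of the original notebook: `holdout_split` is a hypothetical helper, and the demo arrays are zeros with the same layout as `X_train` and `y_train`.

```python
import numpy as np

def holdout_split(X, y, val_fraction=0.1, seed=0):
    """Shuffle once and split off a validation set of the given fraction."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Small hypothetical arrays with the same layout as X_train / y_train.
X_demo = np.zeros((100, 28, 28, 1), dtype="float32")
y_demo = np.zeros((100, 10), dtype="float32")

X_tr, y_tr, X_val, y_val = holdout_split(X_demo, y_demo)
print(X_tr.shape, X_val.shape)  # (90, 28, 28, 1) (10, 28, 28, 1)
```

Passing `validation_data=(X_val, y_val)` to `model.fit` would then add `val_loss` and `val_accuracy` to `history.history`, which can be plotted next to the training curves.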

In [None]:
#Visualizing the training performance
plt.figure(figsize=(6, 4))

plt.subplot(1, 2, 1)
plt.title('Loss')
plt.plot(history.history['loss'], label='Loss')
plt.legend()
plt.grid()

plt.subplot(1, 2, 2)
plt.title('Accuracy')
plt.plot(history.history['accuracy'], label='accuracy')
plt.legend()
plt.grid()


**The training loss and accuracy are converging satisfactorily; let's make predictions on the test set.**

In [None]:
predictions = model.predict(X_test)

In [None]:
results = np.argmax(predictions, axis= 1)
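
The softmax layer returns one probability per class for every image, and `np.argmax` along axis 1 picks the most likely class. A tiny illustration with made-up probabilities for three classes (the `probs` array is hypothetical, not model output):

```python
import numpy as np

# Each row is one image, each column the predicted probability of one class.
probs = np.array([
    [0.05, 0.90, 0.05],
    [0.10, 0.20, 0.70],
])

# Index of the largest probability in every row.
labels = np.argmax(probs, axis=1)
print(labels)  # [1 2]
```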

In [None]:
results = pd.Series(results, name="Label")
results.head()

In [None]:
# ImageId runs from 1 to the number of test images (28000 here)
submission = pd.concat([pd.Series(range(1, len(results) + 1), name="ImageId"), results], axis=1)
submission.to_csv("submission.csv", index=False)

**Submit this CSV to the competition and let's see where we stand.**

**Please like and comment if you find this useful, Thanks!!!**