# Importing Libraries

## Upvote if you liked the work

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# Loading the dataset
train = pd.read_csv('../input/digit-recognizer/train.csv')
test = pd.read_csv('../input/digit-recognizer/test.csv')

### Getting a general idea about the dataset

In [None]:
train.shape

In [None]:
train.info()

In [None]:
train.head()

In [None]:
# Defining the target and features for the dataset
y_train = train['label']
x_train = train.drop('label',axis=1)

In [None]:
x_train.head()

In [None]:
y_train.head()

# Modifying the dataset for CNN

The train and test images (28px x 28px) are stored in a pandas.DataFrame as 1D vectors of 784 values. We reshape all of the data into 28x28x1 3D arrays.
<br>
Keras requires an extra dimension at the end, which corresponds to the channels. MNIST images are grayscale, so they use only one channel. RGB images have 3 channels; for those, each row would hold 28 x 28 x 3 = 2352 values and we would reshape them into 28x28x3 3D arrays.
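
As a minimal sketch of the reshape described above, the snippet below uses a made-up array `flat` (a hypothetical stand-in, not the competition data) to show how 784-value rows become 28x28x1 images and how many values an RGB image of the same size would need.

```python
import numpy as np

# Hypothetical stand-in for two flattened grayscale images (784 values each).
flat = np.arange(2 * 784, dtype=np.float32).reshape(2, 784)

# -1 lets NumPy infer the number of images; the trailing 1 is the channel axis.
images = flat.reshape(-1, 28, 28, 1)
print(images.shape)  # (2, 28, 28, 1)

# Row-major order: the pixel at row r, column c comes from flat index r * 28 + c.
assert images[0, 3, 5, 0] == flat[0, 3 * 28 + 5]

# An RGB image of the same size needs 28 * 28 * 3 = 2352 values per row.
rgb_flat = np.zeros((2, 28 * 28 * 3), dtype=np.float32)
rgb_images = rgb_flat.reshape(-1, 28, 28, 3)
print(rgb_images.shape)  # (2, 28, 28, 3)
```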

In [None]:
# Scaling the pixel values from the range 0-255 to the range 0-1
x_train = x_train/255.0
test = test/255.0

In [None]:
x_train

In [None]:
type(x_train)

In [None]:
x_train = x_train.values.reshape(-1,28,28,1)
test = test.values.reshape(-1,28,28,1)

In [None]:
# Plotting one training image (squeeze drops the channel axis, giving a 28x28 array)
plt.imshow(x_train[4].squeeze(), cmap='Greys')

# Defining the CNN Model

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPool2D

In [None]:
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                 activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                 activation='relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation = "relu"))
model.add(Dropout(0.5))
model.add(Dense(10, activation = "softmax"))

The first two layers are convolutional layers. I set 32 filters for each of them, with a kernel size of 5 x 5. Because the padding is 'same', the output of each layer keeps the 28 x 28 height and width. <br>


Each filter transforms the image into a feature map. From these feature maps, the CNN can isolate features that are useful wherever they appear in the image.<br>
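
To make the idea of a feature map concrete, here is a rough sketch of what a single filter does to a single-channel image. The helper `conv2d_same` and the hand-picked edge kernel are illustrative assumptions; in the model above, Keras learns the kernel values during training and handles all 32 filters at once.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive single-channel cross-correlation with zero padding ('same')."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2  # padding that preserves size for odd kernels
    padded = np.pad(image, ((ph, ph), (pw, pw)))  # pads with zeros by default
    out = np.zeros_like(image, dtype=np.float64)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return out

# Toy image: left half dark, right half bright, so there is one vertical edge.
image = np.zeros((28, 28))
image[:, 14:] = 1.0

# Hand-picked 5x5 kernel that responds to dark-to-bright transitions.
kernel = np.zeros((5, 5))
kernel[:, 0] = -1.0
kernel[:, 4] = 1.0

feature_map = conv2d_same(image, kernel)
print(feature_map.shape)    # (28, 28): 'same' padding keeps height and width
print(feature_map[14, 14])  # 5.0: strong response on the edge
print(feature_map[14, 5])   # 0.0: no response in the flat dark region
```

The same kernel slides over every position, which is why a feature it detects is found regardless of where it appears in the image.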

The next layer after these two is the pooling layer. Pooling layers consolidate the features learned in the convolutional layers' feature maps: here, each 2 x 2 window is replaced by its maximum, which halves the height and width (28 x 28 becomes 14 x 14). By compressing and generalizing the features in the feature map, pooling lowers the computational cost and helps reduce overfitting during training. <br>

By combining convolutional and pooling layers, CNNs are able to combine local features and learn more global features of the image.<br>
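
The pooling step can be sketched in a few lines of NumPy. The function `max_pool_2x2` below is a hypothetical helper written for illustration; it mirrors what `MaxPool2D(pool_size=(2,2))` does to one feature map, assuming even height and width.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a (height, width) array with even sides."""
    h, w = feature_map.shape
    # Split the array into non-overlapping 2x2 blocks, then take each block's max.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [7, 1, 2, 8]])
pooled = max_pool_2x2(fm)
print(pooled.tolist())  # [[4, 2], [7, 8]]

# A 28x28 feature map shrinks to 14x14.
print(max_pool_2x2(np.zeros((28, 28))).shape)  # (14, 14)
```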

Next is the dropout layer. Dropout is a regularization method that approximates training a large number of neural networks with different architectures in parallel.
<br>

During training, some of the layer's outputs are randomly ignored, or “dropped out”. This makes the layer look like, and be treated like, a layer with a different number of nodes and different connectivity to the prior layer. In effect, each update to a layer during training is performed with a different “view” of the configured layer. In this model the dropout rates are 0.25 after the pooling layer and 0.5 after the dense layer; dropout is only active during training, not at prediction time.
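
A minimal sketch of the mechanism, assuming the common "inverted dropout" formulation in which the surviving outputs are scaled up by 1 / (1 - rate) (the Keras `Dropout` layer documents the same scaling). The function name `dropout` and the fixed seed are choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def dropout(activations, rate, training):
    """Inverted dropout: zero a random fraction `rate` of units while training."""
    if not training:
        return activations  # at prediction time every unit is used unchanged
    keep_mask = rng.random(activations.shape) >= rate
    # Scale the survivors so the expected activation stays the same.
    return activations * keep_mask / (1.0 - rate)

x = np.ones(10_000)
dropped = dropout(x, rate=0.25, training=True)
print((dropped == 0).mean())  # fraction of dropped units, close to 0.25
print(dropped.mean())         # mean activation, close to 1.0
```

Each call draws a new random mask, which is what gives every training update a different "view" of the layer.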

In a neural network, the activation function transforms a node's summed weighted input into the node's activation, that is, its output for that input.

The rectified linear activation function, or ReLU for short, is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. It has become the default activation function for many types of neural networks because a model that uses it is easier to train and often achieves better performance.
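
As a small illustration of the two paragraphs above, the sketch below applies ReLU to a few values and to the summed weighted input of one node. The weights, inputs, and bias are made-up numbers chosen for the example.

```python
import numpy as np

def relu(z):
    """Return z where it is positive and 0 elsewhere."""
    return np.maximum(z, 0.0)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(z).tolist())  # [0.0, 0.0, 0.0, 1.5, 3.0]

# One node: summed weighted input first, then the activation function.
w = np.array([0.5, -1.0])  # made-up weights
x = np.array([2.0, 3.0])   # made-up inputs
b = 0.5                    # made-up bias
pre_activation = float(np.dot(w, x) + b)
activation = float(relu(pre_activation))
print(pre_activation, activation)  # -1.5 0.0
```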

Flattening converts the data into a 1-dimensional array so that it can be fed to the next layer. We flatten the output of the convolutional layers to create a single long feature vector, which is then connected to the final classification model, made up of the fully-connected (Dense) layers.
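
To see how long that feature vector is, the shapes and parameter counts of the model defined above can be worked out by hand. This is plain arithmetic under the usual assumption that every convolutional filter and dense unit has one bias; the helper `conv_params` is written for this sketch, and the numbers can be compared against `model.summary()`.

```python
# Input image: height, width, channels.
h, w, c = 28, 28, 1

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters

# Conv2D with 'same' padding keeps height and width; channels become 32.
conv1 = conv_params(5, c, 32)    # 832
conv2 = conv_params(5, 32, 32)   # 25632

# MaxPool2D(2, 2) halves height and width; Dropout and Flatten add no parameters.
h, w, c = h // 2, w // 2, 32
flat = h * w * c                 # 14 * 14 * 32 = 6272

dense1 = flat * 256 + 256        # 1605888
dense2 = 256 * 10 + 10           # 2570

total = conv1 + conv2 + dense1 + dense2
print(flat, total)  # 6272 1634922
```

Most of the parameters sit in the first dense layer, which is one reason the stronger dropout rate (0.5) is applied there.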


In [None]:
model.compile(optimizer='adam',
             loss = 'sparse_categorical_crossentropy',
             metrics=['accuracy'])
model.fit(x_train,y_train,epochs=10)

In [None]:
model.predict(test[0].reshape(1,28,28,1)).argmax()

In [None]:
plt.imshow(test[0].squeeze(), cmap='Greys')  # squeeze drops the channel axis

In [None]:
test[0].shape

# Submitting the predictions

In [None]:
# Predict every test image in one batched call instead of looping image by image
prediction = model.predict(test).argmax(axis=1)
# ImageId starts at 1, as in the per-image loop this replaces
submissions = pd.DataFrame({"ImageId": np.arange(1, len(prediction) + 1), "Label": prediction})
submissions.to_csv("submission.csv", index=False, header=True)