**This is my first kernel and I want all of your support.  Please upvote or comment your views if you find it useful. Your views will surely motivate me to write and work on more kernels. Thanks so much.**

**What is Deep Learning?** 

Deep learning refers to neural networks with multiple hidden layers that learn increasingly abstract representations of the input data. A neural network with many hidden layers is known as a deep neural network. For image data, such a network learns progressively more complex features from the raw pixels:
* the first hidden layers might learn local features, such as edges, from the input image.
* each subsequent hidden layer then learns more complex features.
* the last layer classifies the image into categories.

Deep neural networks that learn these features using convolutional layers are called **Convolutional Neural Networks (CNN)**.

**Where is Deep Learning used?**

Deep learning is widely used in image recognition. It also has many other applications, such as:
* Speech recognition.
* Natural language processing.
* Self-driving vehicles.
* Computer vision and so on.

This kernel walks through building a convolutional neural network (CNN) in Python using the Digit Recognizer (MNIST) dataset.
A CNN can be built using the following steps:
* Data Preparation
* Data Modeling using CNN
* Prediction and Submission

Let's begin the journey!


In [None]:
# importing required libraries. 
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.utils import to_categorical

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

**1. Data Preparation**
* **Load Data**

In [None]:
# Load Train and Test data
train = pd.read_csv("../input/digit-recognizer/train.csv")
test = pd.read_csv("../input/digit-recognizer/test.csv")

In [None]:
#checking the shape of train and test dataset
train.shape

In [None]:
test.shape

In [None]:
train_label = train.label.values

# Dropping the 'label' column from the train dataset
train = train.drop("label", axis = 1).values

train.shape

* **Check for missing and null values**

In [None]:
# train dataset null value check
np.isnan(train).sum()

In [None]:
# test dataset null value check
test.isnull().any().describe()

* **Normalization**

Normalization speeds up training. Pixel values range from 0 to 255, and dividing by 255 rescales them to the range 0 to 1. Keeping the inputs in a small, consistent range makes the computations better behaved and helps the optimizer converge faster.
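As a small illustration of the rescaling described above, the sketch below applies the same division to a made-up patch of pixel values. The array `pixels` is hypothetical and is not taken from the dataset.

```python
import numpy as np

# Hypothetical 2x3 patch of 8-bit grayscale pixel values (0 = black, 255 = white).
pixels = np.array([[0, 51, 102],
                   [153, 204, 255]], dtype=np.uint8)

# Dividing by 255.0 converts to float and maps every value into [0, 1].
scaled = pixels / 255.0

print(scaled.min(), scaled.max())
print(scaled.dtype)
```

The relative brightness of the pixels is unchanged; only the scale differs.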

In [None]:
train = train/255.0
test= test/255.0

* **Visualization of train dataset images**

In [None]:
fig1, ax1 = plt.subplots(1,15, figsize=(15,10))
for i in range(15):
    # reshaping the images into 28*28 shape
    ax1[i].imshow(train[i].reshape((28,28)))
    ax1[i].axis('off')
    ax1[i].set_title(train_label[i]) 

* **Reshaping**

Reshape each image into 3 dimensions, 28x28x1. Here **'1'** is the number of channels. MNIST images are grayscale, so they use only one channel.

In [None]:
train_image =np.array(train).reshape(-1,28,28,1)
test_image =np.array(test).reshape(-1,28,28,1)

In [None]:
train_image.shape

In [None]:
test_image.shape

* **Label Encoding**

In [None]:
# first checking the shape of train_label
train_label.shape

In [None]:
# One-hot encoding the labels
train_label = to_categorical(train_label)

In [None]:
# again checking the shape of train_label
train_label.shape

As there are 10 categories, from 0 to 9, one-hot encoding turns train_label into 10 columns, one per category. Each row has a 1 in the column of its digit and 0 everywhere else.
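To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding produces. The helper `one_hot` and the `sample_labels` list are hypothetical names for illustration; the kernel itself relies on Keras `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1.0 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # For row i, set the column given by labels[i] to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample_labels = [3, 0, 9]  # hypothetical digit labels
encoded = one_hot(sample_labels)

print(encoded.shape)
print(encoded[0])
```

Taking `np.argmax` along each row recovers the original digit, which is the same idea used later to turn predicted probabilities back into labels.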

**2. Data Modeling using CNN**

The following are the steps in building Convolutional Neural Network (CNN):

**1. Convolution**

**2. Pooling**

**3. Flattening**

**4. Full Connection**

<img src="https://imgur.com/nME3ghU.jpg">

* **Convolution**

There are **3** steps in this process of Convolution:

1. **Input Image**: The image to be classified, represented as a matrix of pixel values.

2. **Feature detector**: Also called a kernel or filter. It is a small matrix, generally of size 3x3.

3. **Feature map**: The feature detector slides over the input image. At each position, the overlapping patch of the image is **multiplied** element-wise with the feature detector and the products are summed to give one value of the feature map. Without padding, the feature map is smaller than the input image, which makes later processing faster; with `padding='same'`, the size is preserved.

Some of the features of the image are lost in this step. However, the main features of the image that are important in image detection are retained. These features are the ones that are unique to identifying that specific object. The way we prevent loss of image information is by having **many feature maps**. Each feature map detects the location of certain features in the image.
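The convolution step can be sketched in plain NumPy as follows. This is only an illustration under simplifying assumptions (stride 1, no padding, a single channel); the `image`, `kernel`, and function name `convolve2d_valid` are hypothetical and this is not how Keras implements `Conv2D` internally.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + kh, c:c + kw]
            # Multiply element-wise, then sum to get one feature-map value.
            feature_map[r, c] = np.sum(patch * kernel)
    return feature_map

# Hypothetical 5x5 image: dark on the left (0), bright on the right (1).
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hypothetical 3x3 feature detector that responds to vertical edges.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = convolve2d_valid(image, kernel)
print(feature_map)
```

The feature map is large where the dark-to-bright edge sits and zero in the uniformly bright region, which is what "detecting the location of a feature" means in practice.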

* **Apply ReLU (Rectified Linear Unit)**:
In this step we apply the rectifier function to **increase non-linearity** in the CNN. ReLU replaces every negative value in the feature map with zero and leaves positive values unchanged. Without a non-linear activation, stacked convolution layers would collapse into a single linear transformation, while image classification is actually a non-linear problem.
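A minimal sketch of the rectifier function, using made-up values in `pre_activation` for illustration:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: keep positive values, replace negatives with 0."""
    return np.maximum(0, x)

# Hypothetical feature-map values before the activation is applied.
pre_activation = np.array([[-2.0, 0.5],
                           [3.0, -0.1]])
activated = relu(pre_activation)
print(activated)
```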

In [None]:
# Initialising the CNN
classifier = Sequential()

# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), padding = 'same', activation="relu", input_shape=(28, 28, 1)))

* **Pooling**

Pooling makes the CNN less sensitive to small shifts and distortions in the image, so a feature can still be detected when its exact position varies slightly from one picture to another.

There are different types of pooling, for example, **max pooling** and **min pooling**. 

Max pooling works by placing a matrix of 2x2 on the feature map and picking the largest value in that box. The 2x2 matrix is moved from left to right through the entire feature map picking the largest value in each pass. These values then form a new matrix called a **Pooled Feature Map**. Max pooling works to preserve the main features while also reducing the size of the image. This helps reduce overfitting, which would occur if the CNN is given too much information, especially if that information is not relevant in classifying the image.
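The max pooling described above can be sketched in NumPy like this. It assumes non-overlapping 2x2 windows (stride 2), which matches the default behaviour of `MaxPooling2D(pool_size=(2, 2))`; the values in `feature_map` and the helper name `max_pool_2x2` are hypothetical.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling; odd trailing rows/columns are dropped."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Split into 2x2 blocks and take the largest value inside each block.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

# Hypothetical 4x4 feature map.
feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 1, 5, 6],
                        [2, 2, 7, 8]])
pooled = max_pool_2x2(feature_map)
print(pooled)
```

The 4x4 map shrinks to 2x2 while the strongest response in each region is kept.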

In [None]:
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))

In [None]:
# Adding another convolutional layer. Since this is not the first layer,
# we do not need to pass the input_shape parameter.
classifier.add(Conv2D(32, (3, 3), activation="relu"))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

* **Flattening**

Flattening involves transforming the entire pooled feature map matrix into a single column which is then fed to the neural network for processing.
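The sketch below traces how the image size changes through the layers and how long the flattened vector ends up. It assumes the layer settings used in this kernel (3x3 convolutions with stride 1, the first with "same" padding and the second without padding, 2x2 pooling, 32 filters); the helper functions are hypothetical and only do the size arithmetic.

```python
import numpy as np

def conv_output_size(size, kernel_size, padding):
    """Spatial size after a stride-1 convolution."""
    if padding == "same":
        return size
    return size - kernel_size + 1  # "valid": no padding

def pool_output_size(size, pool_size):
    """Spatial size after non-overlapping pooling."""
    return size // pool_size

size = 28
size = conv_output_size(size, 3, "same")   # 28
size = pool_output_size(size, 2)           # 14
size = conv_output_size(size, 3, "valid")  # 12
size = pool_output_size(size, 2)           # 6

num_filters = 32
# Placeholder for the stack of pooled feature maps of one image.
pooled_maps = np.zeros((size, size, num_filters), dtype=np.float32)

# Flattening keeps every value and only changes the shape to one long vector.
flat = pooled_maps.reshape(-1)
print(size, flat.shape)
```

Under these assumptions each image reaches the fully connected part of the network as a vector of 6 x 6 x 32 = 1152 values.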

In [None]:
# Step 3 - Flattening
classifier.add(Flatten())

* **Full Connection**

This step is made up of the **input layer**, **the fully connected layer,** and **the output layer**. The flattened vector acts as the input layer. In the fully connected layer, every neuron is connected to every value of the previous layer, just like a hidden layer in an ordinary artificial neural network (ANN). The output layer is where we get the predicted class or category.
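The output layer in this kernel uses a softmax activation, which turns the 10 raw scores into probabilities. A minimal sketch of that computation, with made-up scores in `logits`:

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical raw scores for the 10 digit classes (index = digit).
logits = np.array([1.0, 0.5, 0.1, 4.0, 0.2, 0.3, 0.0, 2.0, 0.4, 0.6])
probs = softmax(logits)

# The predicted class is the one with the highest probability.
predicted_digit = int(np.argmax(probs))
print(predicted_digit, probs.sum())
```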

In [None]:
# Step 4 - Full connection.
# Here units is the number of neurons in the layer
classifier.add(Dense(units = 256, activation = 'relu'))

In [None]:
# Output layer
# here we are using 10 units (neurons) because there are 10 classes
classifier.add(Dense(units = 10, activation = 'softmax'))

In [None]:
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

In [None]:
epochs=30
batch_size=90

# fitting the CNN model 
classifier.fit(train_image, train_label, batch_size=batch_size, epochs=epochs)

**3. Prediction and Submission**

In [None]:
#Prediction
results = classifier.predict(test_image)

In [None]:
# Submission
# Pick the class with the highest predicted probability for each test image
predictions = np.argmax(results, axis=1)

sample_submission = pd.read_csv('../input/digit-recognizer/sample_submission.csv')

result=pd.DataFrame({'ImageId':sample_submission.ImageId, 'Label':predictions})
result.to_csv("submission.csv",index=False)
print(result)