# **Introduction**
In this kernel I walk through a CNN model in detail. The notebook is meant as a tutorial for beginners who are working with CNNs for the first time. I start by preparing the data so that it can be fed to the model. Then, for the model itself, I explain each layer so that you understand what it is actually doing. After that we look at optimizers, callbacks and data augmentation in detail. Finally, we evaluate our model and check how well it performs.<br>
This notebook scores 0.9956. Play with some of the hyperparameters and you might get a better score. I will also update this notebook if I get a better score.

<h2><font color = "red">Please do an upvote if you find the kernel useful.</font></h2>

## What is CNN?
A Convolutional Neural Network (CNN) is a deep learning model that is particularly powerful for image processing. CNNs can perform both generative and descriptive tasks and are used mostly in machine vision, which includes image and video recognition, as well as in recommender systems and natural language processing (NLP). A CNN works much like a multilayer perceptron, but it is designed for reduced processing requirements: a convolutional layer reuses the same small set of filter weights across the whole image instead of connecting every pixel to every neuron. The layers of a CNN consist of an input layer, an output layer and hidden layers in between, which include convolutional layers, pooling layers, fully connected layers and normalization layers. Removing these limitations and increasing efficiency results in a system that is far more effective for image processing and simpler to train.

<a id='0'></a>
<font size="+2" color="blue"><b>My other kernels</b></font><br>
<div class="row">
    <div class="col-sm-4">
    <div class = "card">
      <div class = "card-body" style = "width: 20rem; ">
        <h5 class = "card-title" style = "font-size: 1.2em;"align="center">Natural Language Processing</h5>
          <img src="https://media-exp1.licdn.com/dms/image/C561BAQGEbzpXZ34-gQ/company-background_10000/0?e=2159024400&v=beta&t=o3vOn3Ye-qpqlDH64A1of1_aRAQ8TunahPQ4ZWuISRI" class = "card_img-top" style = "padding: 2% 0;width:20rem;height:12rem;"  alt="...">
        <p class="card-text" style = "font-size: 1.0em;text-align: center "><b>NLP model: The easiest way !</b></p>
        <a href = "https://www.kaggle.com/utcarshagrawal/nlp-model-the-easiest-way" class = "btn btn-info btn-lg active"  role = "button" style = "color: white; margin: 0 15% 0 25%" data-toggle = "popover" title = "Click">Click here</a>
      </div>
    </div>
  </div>
    <div class="col-sm-4">
      <div class="card">
        <div class="card-body" style="width: 20rem;">
          <h5 class = "card-title"  style = "font-size: 1.2em; " align="center" > Tutorial on Spark ML </h5>
            <img src = "https://miro.medium.com/max/650/1*mzhf9OccFn7DeVHI9dPiLQ.jpeg" class = "card_img-top" style = " padding: 2% 0;width:20rem;height:12rem;"  alt="...">
          <p class="card-text" style = "font-size: 1.0em;text-align: center "><b>Titanic:Spark ML Magic + EDA & Feature Engineering</b></p>
          <a href = "https://www.kaggle.com/utcarshagrawal/titanic-spark-ml-magic-eda-feature-engineering" class = "btn btn-info btn-lg active"  role = "button" style = "color: white; margin: 0 15% 0 25%" data-toggle = "popover" title = "Click">Click here</a>
        </div>
      </div>    
    </div>
      <div class="col-sm-4">
        <div class="card">
          <div class="card-body" style="width: 20rem;">
            <h5 class = "card-title"  style = "font-size: 1.2em; " align="center" > More on Spark ML </h5>
              <img src = "https://cdn.hswstatic.com/gif/water-life-crop.jpg" class = "card_img-top" style = " padding: 2% 0;width:20rem;height:12rem;border-radius:20%"  alt="...">
            <p class="card-text" style = "font-size: 1.0em;text-align: center "><b>Water Quality Prediction using Spark ML</b></p>
            <a href = "https://www.kaggle.com/utcarshagrawal/water-quality-prediction-using-sparkml/notebook" class = "btn btn-info btn-lg active"  role = "button" style = "color: white; margin: 0 15% 0 25%" data-toggle = "popover" title = "Click">Click here</a>
          </div>
        </div>    
      </div>
    </div>    
    

# **Table of Contents**
<ul style="list-style-type: square;">
    <li><a href='#1'>Importing Libraries</a></li>
    <li><a href='#2'>Reading the data</a></li>
    <li><a href='#3'>Data Preparation</a></li>
    <ul>
        <li><a href='#6'>Visualization</a></li>
        <li><a href='#7'>Normalization</a></li>
        <li><a href='#8'>Reshaping</a></li>
        <li><a href='#9'>Encoding</a></li>
        <li><a href='#10'>Splitting the data</a></li>
    </ul>
    <li><a href='#4'>CNN Model</a></li>
    <ul>
        <li><a href='#11'>Explanation of layers</a></li>
        <li><a href='#12'>Optimizer</a></li>
        <li><a href='#13'>Callback Technique</a></li>
        <li><a href='#14'>Data Augmentation</a></li>
        <li><a href='#15'>Training our model</a></li>
        <li><a href='#16'>Evaluating our model</a></li>
    </ul>
    <li><a href='#5'>Submission</a></li>
</ul>

<a id = '1'></a>
# **Importing Libraries**

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import itertools

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D, BatchNormalization
from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
from keras.utils import plot_model
from IPython.display import Image
from keras.utils import to_categorical  # the np_utils submodule is not available in newer Keras releases

<a id='2'></a>
# **Reading the data**

In [None]:
df1 = pd.read_csv('../input/digit-recognizer/train.csv')
df2 = pd.read_csv('../input/digit-recognizer/test.csv')

In [None]:
df1.head()

<a id='3'></a>
# **Data Preparation**

In [None]:
Y_train = df1["label"]
X_train = df1.drop(labels = ["label"],axis = 1).values 

<a id='6'></a>
### Plotting the first six training images

In [None]:
fig = plt.figure(figsize=(20,20))
for i in range(6):
    ax = fig.add_subplot(1, 6, i+1, xticks=[], yticks=[])
    ax.imshow(X_train[i].reshape(28,28), cmap='gray')
    ax.set_title(str(Y_train[i]))

### Let us visualize how images are stored in matrix form

In [None]:
def visualize_input(img, ax):
    ax.imshow(img, cmap='gray')
    width, height = img.shape
    thresh = img.max()/2.5
    for x in range(width):
        for y in range(height):
            ax.annotate(str(round(img[x][y],2)), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if img[x][y]<thresh else 'black')

fig = plt.figure(figsize = (12,12)) 
ax = fig.add_subplot(111)
visualize_input(X_train[9].reshape(28,28), ax)

### Let us see how many examples of each digit are present

In [None]:
g = sns.countplot(x=Y_train)  # pass the labels by keyword; newer seaborn versions require it

<a id='7'></a>
## Normalization
To represent a color image, the red, green and blue (RGB) channels must be specified for each pixel, so the pixel value is actually a vector of three numbers ranging from 0 to 255. The digit images in this dataset are grayscale, so each pixel is a single number in that same range.

One common preprocessing step in machine learning is to standardize the dataset, meaning subtract the mean of the whole numpy array from each example and then divide by its standard deviation. But for picture datasets, it is simpler, more convenient and works almost as well to just divide every row of the dataset by 255 (the maximum value of a pixel channel).

In [None]:
X_train = X_train/255.0
X_test = df2/255.0
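
To make the difference concrete, here is a small NumPy sketch that compares standardization with the simple division by 255 used above. The tiny `pixels` array is made up for illustration and is not taken from the competition data.

```python
import numpy as np

# Hypothetical 2x3 "image" with raw grayscale intensities in [0, 255]
pixels = np.array([[0, 51, 102],
                   [153, 204, 255]], dtype=np.float64)

# Scaling used in this notebook: every value lands in [0, 1]
scaled = pixels / 255.0

# Standardization: zero mean and unit standard deviation
standardized = (pixels - pixels.mean()) / pixels.std()

print("scaled range:", scaled.min(), scaled.max())
print("standardized mean/std:", round(float(standardized.mean()), 6),
      round(float(standardized.std()), 6))
```

Both approaches bring the inputs to a small, comparable range; dividing by 255 just avoids having to compute and store dataset statistics.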

<a id='8'></a>
## Reshaping our data

In [None]:
X_train = X_train.reshape(-1,28,28,1)
X_test = X_test.values.reshape(-1,28,28,1)

<a id='9'></a>
## Encoding the Label column
The labels are the 10 digits from 0 to 9. We need to encode these labels as one-hot vectors using the to_categorical function from Keras.<br> (e.g. 5 -> [0,0,0,0,0,1,0,0,0,0]).

In [None]:
Y_train = to_categorical(Y_train, num_classes = 10)
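
For intuition, one-hot encoding can also be written by hand in a few lines of NumPy. This is only a sketch of the idea; `one_hot` is a hypothetical helper and not part of Keras.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Put a 1 in the column that matches each label
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([5, 0, 9])
print(example[0])  # the row for label 5
```

Taking `np.argmax` along the last axis reverses the encoding, which is how the predicted class is recovered from the model output later on.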

<a id='10'></a>
## Splitting the data
Let's split our data into a training set and a validation set using the train_test_split function from the sklearn library. With test_size = 0.1, 10% of the labelled images are held out for validation.

In [None]:
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=7)

<a id='4'></a>
# **CNN Model**

<a id='11'></a>
#### Now we come to the most important part of our notebook.

First, we define our model as a Sequential model.
Then we add our first layer, a convolutional (Conv2D) layer. So what does a convolutional layer do?
- Takes an input volume 
- Applies a filter at every position of the input
- Outputs another volume (usually of different size)

## **Like this:-** 

![Convolution_schematic.gif](attachment:Convolution_schematic.gif)
<caption><center><font color='purple'><u>Convolution operation</u><br> with a filter of 3x3 and a stride of 1 (stride = amount you move the window each time you slide)</font></center></caption>

<br>In a computer vision application like this competition, each value in the matrix on the left corresponds to a single pixel value, and we convolve a 3x3 filter with the image by multiplying its values element-wise with the original matrix, then summing them up and adding a bias.
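
The multiply, sum and add-bias step described above can be sketched in plain NumPy. This is a minimal illustration for a single channel without padding; `conv2d_single`, the 5x5 `image` and the averaging `kernel` are made up for the example, and the Keras layer is of course implemented far more efficiently.

```python
import numpy as np

def conv2d_single(image, kernel, bias=0.0, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Window of the image currently covered by the filter
            window = image[i * stride:i * stride + kh,
                           j * stride:j * stride + kw]
            # Element-wise product, summed up, plus the bias
            output[i, j] = np.sum(window * kernel) + bias
    return output

image = np.arange(25, dtype=float).reshape(5, 5)  # made-up 5x5 "image"
kernel = np.ones((3, 3)) / 9.0                    # simple averaging filter
feature_map = conv2d_single(image, kernel)
print(feature_map.shape)
print(feature_map)
```

Without padding, a 5x5 input and a 3x3 filter give a 3x3 output. The model in this notebook pads the input so that the output keeps the same height and width as the input.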

Next we add a BatchNormalization layer. Batch normalization is a technique for training very deep neural networks that standardizes the inputs to a layer for each mini-batch. This has the effect of stabilizing the learning process and dramatically reducing the number of training epochs required to train deep networks.
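
As a rough sketch of what "standardizes the inputs for each mini-batch" means at training time, the NumPy function below normalizes each feature over the batch and then applies a scale (`gamma`) and shift (`beta`). The function name and the toy batch are assumptions for illustration; the real layer learns `gamma` and `beta` and uses running averages at inference time.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize each feature (column) over the mini-batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero
    return gamma * x_hat + beta

# Made-up mini-batch: 3 samples, 2 features on very different scales
batch = np.array([[1.0, 200.0],
                  [2.0, 400.0],
                  [3.0, 600.0]])
normalized = batch_norm(batch)
print(normalized)
```

After normalization both features have roughly zero mean and unit variance, regardless of their original scale.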
<br>  
Then we add another convolutional layer and another batch normalization layer. After that we add a pooling layer, here a MaxPool2D layer. The pooling (POOL) layer reduces the height and width of the input. It reduces computation and also helps make the feature detectors more invariant to the position of a feature in the input.
<br>(Max-pooling layer: slides an (f,f) window over the input and stores the max value of the window in the output.)
![max_pool1.png](attachment:max_pool1.png)
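
The max-pooling rule can be sketched in NumPy as follows. `max_pool2d` and the 4x4 `feature_map` are made up for illustration and handle a single channel only.

```python
import numpy as np

def max_pool2d(feature_map, pool_size=2, stride=2):
    """Slide a (pool_size, pool_size) window and keep the max of each window."""
    out_h = (feature_map.shape[0] - pool_size) // stride + 1
    out_w = (feature_map.shape[1] - pool_size) // stride + 1
    pooled = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + pool_size,
                                 j * stride:j * stride + pool_size]
            pooled[i, j] = window.max()
    return pooled

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 0],
                        [1, 8, 3, 4]], dtype=float)
pooled = max_pool2d(feature_map)
print(pooled)
```

A 2x2 window with a stride of 2 halves both the height and the width, which is why a 28x28 input becomes 14x14 after one pooling layer.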
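<br> Then we add a Dropout layer. During training, a Dropout layer randomly sets input units to 0 with a frequency of `rate` at each step, which helps prevent overfitting. The units that are kept are scaled up by 1/(1 - rate) so that the expected sum of the inputs stays unchanged.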
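
A minimal NumPy sketch of this behaviour (often called inverted dropout) is shown below. The `dropout` function and the all-ones `activations` array are assumptions for illustration, not the Keras implementation.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Zero each unit with probability `rate` and rescale the survivors."""
    if not training or rate == 0.0:
        return x  # dropout is only active during training
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((1000, 10))          # made-up activations
dropped = dropout(activations, rate=0.25, rng=rng)

print("fraction set to zero:", float((dropped == 0).mean()))
print("mean after dropout:", float(dropped.mean()))
```

Roughly a quarter of the units are zeroed, while the mean stays close to 1 because the surviving units are scaled by 1/(1 - 0.25).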
<br>   
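We repeat the whole process one more time, this time with 64 filters of size 3x3. Then we add a Flatten layer. A Flatten layer collapses the spatial and channel dimensions of each example into a single one-dimensional vector so that it can be fed to the dense layers.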
<br>    
Lastly, we add two dense layers, each followed by a batch normalization layer, and then one dropout layer. The last layer (Dense(10,activation="softmax")) outputs a probability distribution over the 10 classes.
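
To see how softmax turns raw scores into a probability distribution, here is a small NumPy sketch. The `logits` values are made up; in the model they would be the outputs of the last dense layer before the activation.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up scores for the 10 digit classes
logits = np.array([1.0, 2.0, 0.5, 4.0, 0.1, 0.3, 1.5, 0.2, 0.0, 2.5])
probs = softmax(logits)

print(probs.round(3))
print("predicted digit:", int(np.argmax(probs)))
```

The class with the largest score also gets the largest probability, so the predicted digit is simply the index of the maximum.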

In [None]:
model = Sequential()

# Block 1: two 5x5 convolutions with 32 filters, each followed by batch normalization
model.add(Conv2D(filters=32, kernel_size=(5,5), padding='same', activation='relu', input_shape=(28,28,1)))
model.add(BatchNormalization())
model.add(Conv2D(filters=32, kernel_size=(5,5), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2)))  # 28x28 -> 14x14
model.add(Dropout(0.25))

# Block 2: two 3x3 convolutions with 64 filters, each followed by batch normalization
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))  # 14x14 -> 7x7
model.add(Dropout(0.25))

# Classifier head: flatten, two dense layers, then one probability per digit
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
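
As a sanity check on the architecture, the short sketch below follows the tensor shape through the stack in plain Python. `trace_shapes` is a hypothetical helper; it assumes that every convolution pads its input so the height and width are unchanged, and that each 2x2 max pooling halves them.

```python
def trace_shapes(height=28, width=28):
    """Return (name, shape) pairs for the main stages of the model."""
    shapes = [("input", (height, width, 1))]
    filters = 1
    for block, filters in enumerate((32, 64), start=1):
        # Convolutions with padding keep the spatial size and set the channel count
        shapes.append((f"conv block {block}", (height, width, filters)))
        # 2x2 max pooling with stride 2 halves height and width
        height, width = height // 2, width // 2
        shapes.append((f"max pool {block}", (height, width, filters)))
    # Flatten turns (height, width, channels) into one long vector
    shapes.append(("flatten", (height * width * filters,)))
    shapes.append(("dense", (256,)))
    shapes.append(("output", (10,)))
    return shapes

for name, shape in trace_shapes():
    print(f"{name:>12}: {shape}")
```

The flatten stage ends up with 7 * 7 * 64 = 3136 features, which is the input size of the first dense layer.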

Let us look at the structure of our model. Here we plot it using plot_model, which draws each layer together with its input and output shapes.

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=True)
Image("model.png")

In [None]:
epochs = 30
batch_size = 64

<a id='12'></a>
## Define the optimizer
Optimizers are algorithms that change the attributes of your neural network, such as its weights, in order to reduce the loss; the learning rate controls how large each change is. A good optimization strategy reduces the loss and gives the most accurate results possible. We have many options for optimizers; I tried several and RMSprop gave me the best result.
<br>
Then we compile our model with the RMSprop optimizer and categorical_crossentropy as the loss function.

In [None]:
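# `lr` was renamed to `learning_rate`; `decay=0.0` was a no-op and is not accepted by newer Keras releases
optimizer = RMSprop(learning_rate=0.001, rho=0.9, epsilon=1e-08)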
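
To give a feel for what RMSprop does with `rho` and `epsilon`, here is a sketch of one common formulation of its update rule in NumPy, applied to a toy one-dimensional problem. `rmsprop_step` and the function f(w) = w^2 are assumptions for illustration; the exact placement of `epsilon` differs slightly between library versions.

```python
import numpy as np

def rmsprop_step(param, grad, avg_sq_grad,
                 learning_rate=0.001, rho=0.9, epsilon=1e-8):
    """One RMSprop update: scale the step by a running RMS of the gradients."""
    # Exponential moving average of the squared gradient
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad ** 2
    # Large recent gradients shrink the step, small ones enlarge it
    param = param - learning_rate * grad / (np.sqrt(avg_sq_grad) + epsilon)
    return param, avg_sq_grad

# Toy problem: minimize f(w) = w^2, whose gradient is 2w
w, cache = 1.0, 0.0
for _ in range(2000):
    w, cache = rmsprop_step(w, 2.0 * w, cache)

print("w after 2000 steps:", w)
```

Because each step is divided by the recent gradient magnitude, the step size stays close to the learning rate, so `w` moves steadily towards the minimum at 0 and then hovers near it.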

In [None]:
model.compile(optimizer = optimizer , loss = "categorical_crossentropy", metrics=["accuracy"])
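
The categorical cross-entropy loss used above compares the one-hot label with the predicted probabilities. A small NumPy sketch, with made-up `y_true` and `y_pred` values for a 3-class problem:

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-example loss: minus the log of the probability given to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Two made-up examples with 3 classes each
y_true = np.array([[0, 0, 1],
                   [1, 0, 0]], dtype=float)
y_pred = np.array([[0.1, 0.1, 0.8],    # confident and correct
                   [0.3, 0.4, 0.3]])   # unsure and wrong
losses = categorical_crossentropy(y_true, y_pred)
print(losses)
```

The confident, correct prediction gets a small loss and the wrong one a much larger loss, which is the signal the optimizer uses to adjust the weights.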

<a id='13'></a>
## Callback Technique
You define and use a callback when you want to automate some task at certain points during training, for example after every epoch, which gives you control over the training process. This includes stopping training when you reach a certain accuracy/loss score, saving your model as a checkpoint after each successful epoch, adjusting the learning rate over time, and more.
Here I am using the ReduceLROnPlateau callback, which reduces the learning rate when a metric has stopped improving. With the settings below, the learning rate is halved whenever the validation accuracy has not improved for 2 epochs, down to a minimum of 0.00001.

In [None]:
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy', 
                                            patience=2, 
                                            verbose=1, 
                                            factor=0.5, 
                                            min_lr=0.00001)
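
The effect of these settings can be simulated in plain Python. This is a simplified sketch, not the Keras implementation: `simulate_reduce_lr` is a hypothetical helper that treats any increase as an improvement and ignores options such as `min_delta` and `cooldown`, and the accuracy values are made up.

```python
def simulate_reduce_lr(val_accuracies, initial_lr=0.001, factor=0.5,
                       patience=2, min_lr=0.00001):
    """Return the learning rate in effect after each epoch."""
    lr = initial_lr
    best = float("-inf")
    wait = 0  # epochs since the last improvement
    history = []
    for acc in val_accuracies:
        if acc > best:
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: shrink the learning rate, but not below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history

made_up_accuracies = [0.980, 0.985, 0.984, 0.985, 0.990, 0.989, 0.988]
print(simulate_reduce_lr(made_up_accuracies))
```

In this toy run the learning rate is halved after the fourth epoch and again after the seventh, each time following two epochs without a new best validation accuracy.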

<a id='14'></a>
## Data Augmentation
The performance of deep learning neural networks often improves with the amount of data available.
Many people add the full MNIST dataset to increase the amount of training data. But a more appropriate way to increase the data is to apply data augmentation.

Data augmentation is a technique to artificially create new training data from existing training data. This is done by applying domain-specific techniques to examples from the training data that create new and different training examples.

Image data augmentation is perhaps the most well-known type of data augmentation and involves creating transformed versions of images in the training dataset that belong to the same class as the original image.

Transforms include a range of operations from the field of image manipulation, such as shifts, flips, zooms, and much more.

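I apply data augmentation using the ImageDataGenerator class from Keras. What I have done is:

* Randomly rotate training images by up to 10 degrees
* Randomly zoom in or out by up to 10%
* Randomly shear images (`shear_range=0.1`, which Keras interprets as a shear angle in degrees rather than a percentage)
* Randomly shift images horizontally by up to 10% of the width
* Randomly shift images vertically by up to 10% of the height

Flips are left disabled, since a mirrored digit is usually no longer a valid example of its class.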

In [None]:
image_gen = ImageDataGenerator(rotation_range=10, zoom_range=0.1, shear_range=0.1,
                               width_shift_range=0.1, height_shift_range=0.1,
                               horizontal_flip=False, vertical_flip=False,
                               fill_mode='nearest')

In [None]:
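# fit() returns None, so there is nothing to assign. It only computes dataset statistics,
# which are needed for featurewise normalization or ZCA whitening (not enabled here).
image_gen.fit(X_train)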
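
To illustrate what one of these transformations does to a single image, here is a NumPy sketch of a random shift of up to 10% with edge pixels repeated to fill the gap, similar in spirit to `fill_mode='nearest'`. `shift_image` and the rectangular `digit` blob are made up for the example and are not how Keras implements the transform.

```python
import numpy as np

def shift_image(image, dx, dy):
    """Shift a 2D image right by dx and down by dy pixels (negative = left/up)."""
    h, w = image.shape
    pad = max(abs(dx), abs(dy))
    # Repeat the border pixels so that the uncovered area is filled
    padded = np.pad(image, pad, mode="edge")
    top, left = pad - dy, pad - dx
    return padded[top:top + h, left:left + w]

rng = np.random.default_rng(42)
digit = np.zeros((28, 28))
digit[10:18, 12:16] = 1.0            # made-up blob standing in for a digit

max_shift = int(0.1 * 28)            # 10% of the width, rounded down to 2 pixels
dx, dy = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
augmented = shift_image(digit, dx, dy)

print("shift:", dx, dy, "shape:", augmented.shape)
```

The shifted image still shows the same digit, so it keeps its label, but to the network it is a new training example.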

<a id='15'></a>
### So we have completed all the preprocessing steps. Now it's time to train our model.

In [None]:
# fit_generator is deprecated in recent Keras releases; fit accepts the generator directly
model.fit(image_gen.flow(X_train, Y_train, batch_size=batch_size), epochs=epochs,
          validation_data=(X_val, Y_val), callbacks=[learning_rate_reduction])

<a id='16'></a>
## Evaluating our model

In [None]:
metrics=pd.DataFrame(model.history.history)
metrics

In [None]:
metrics[['loss' , 'val_loss']].plot()
plt.show()

In [None]:
metrics[['accuracy' , 'val_accuracy']].plot()
plt.show()

### Let's see some of our predictions

In [None]:
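np.random.seed(16)
# Pick one random index that is valid for the validation set
random_selection=np.random.randint(0,len(X_val),size=1)
random_sample=X_val[random_selection]
print('Prediction:')
# predict_classes is not available in recent Keras releases; argmax over the softmax output gives the class
print(np.argmax(model.predict(random_sample.reshape(1,28,28,1)),axis=1)[0])
plt.imshow(random_sample.reshape(28,28),cmap='binary')
plt.show()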

In [None]:
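np.random.seed(9)
# Pick one random index that is valid for the validation set
random_selection=np.random.randint(0,len(X_val),size=1)
random_sample=X_val[random_selection]
print('Prediction:')
# predict_classes is not available in recent Keras releases; argmax over the softmax output gives the class
print(np.argmax(model.predict(random_sample.reshape(1,28,28,1)),axis=1)[0])
plt.imshow(random_sample.reshape(28,28),cmap='binary')
plt.show()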

In [None]:
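np.random.seed(27)
# Pick one random index that is valid for the validation set
random_selection=np.random.randint(0,len(X_val),size=1)
random_sample=X_val[random_selection]
print('Prediction:')
# predict_classes is not available in recent Keras releases; argmax over the softmax output gives the class
print(np.argmax(model.predict(random_sample.reshape(1,28,28,1)),axis=1)[0])
plt.imshow(random_sample.reshape(28,28),cmap='binary')
plt.show()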

## Confusion Matrix

In [None]:
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """Plot a confusion matrix; normalize=True shows row-wise fractions instead of counts."""
    # Normalize before drawing so that the colors and the cell labels agree
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    # Use white text on dark cells and black text on light cells
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

Y_pred = model.predict(X_val)
Y_pred_classes = np.argmax(Y_pred,axis = 1) 
Y_true = np.argmax(Y_val,axis = 1) 
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes) 
plot_confusion_matrix(confusion_mtx, classes = range(10)) 
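
To make clear how the matrix is built and read, here is a small NumPy sketch on a made-up 3-class example. `confusion_matrix_np` is a hypothetical helper that follows the same convention as above: rows are true labels and columns are predicted labels.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Made-up labels and predictions for 7 examples
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0])

cm = confusion_matrix_np(y_true, y_pred, num_classes=3)
per_class_recall = cm.diagonal() / cm.sum(axis=1)   # correct / total per true class
accuracy = cm.trace() / cm.sum()

print(cm)
print("recall per class:", per_class_recall.round(2))
print("accuracy:", round(float(accuracy), 3))
```

The diagonal holds the correct predictions, so large off-diagonal entries point directly at the pairs of digits the model confuses most often.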

<a id='5'></a>
# Submission

In [None]:
results = model.predict(X_test)

In [None]:
results = np.argmax(results,axis = 1)
results = pd.Series(results,name="Label")

In [None]:
# ImageId starts at 1; derive the count from the predictions instead of hard-coding 28000
submission = pd.concat([pd.Series(range(1,len(results)+1),name = "ImageId"),results],axis = 1)
submission.to_csv("submission.csv",index=False)

### So this is it. I hope I have explained everything clearly. But if you still have any doubts
![1_yIPIuNIn6ar7MvQnNqlWlQ.jpeg](attachment:1_yIPIuNIn6ar7MvQnNqlWlQ.jpeg)
### Just comment below and I will definitely try to solve your problem.
<b>Also, if you want to learn about Spark ML and some of its techniques, or about NLP, you can refer to my other kernels:</b> [My other kernels](#0) 

<h3><font color = "red">Thanks a lot for having a look at this notebook. I would really appreciate an upvote if you liked the kernel.</font></h3>