# Build your first Convolutional Neural Network using Keras
### This notebook will be your first step towards CNNs and Deep Learning.
* **1. Introduction**
* **2. Data**
    * 2.1 Importing Data
    * 2.2 Exploring Data
    * 2.3 Preprocessing
    * 2.4 Splitting
* **3. CNN**
    * 3.1 Defining the model
    * 3.2 Compiling the model with the right Optimizer, loss and metric.
    * 3.3 Training the model
* **4. Evaluation**
    * 4.1 Evaluating the model
    * 4.2 Visualizing the model's performance
* **5. Prediction and submission**

# Introduction
CNNs, short for Convolutional Neural Networks, are a class of neural networks mostly used for image and video data.<br />
Convolutional networks were inspired by biological processes: the connectivity pattern between their neurons resembles the organization of the animal visual cortex.<br />
CNNs apply filters to images to extract features, both small (edges, curves, etc.) and large (whole shapes, patterns, etc.).<br />
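
To make the idea of a filter concrete, here is a minimal NumPy sketch that is not part of the original notebook. The toy image, the hand-written `vertical_edge` kernel and the helper `convolve2d_valid` are illustrative assumptions; a trained CNN learns its kernel values from data. Like Keras' Conv2D, the helper slides the kernel without flipping it (strictly speaking, a cross-correlation).

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 image: dark left half (0), bright right half (1).
toy_image = np.zeros((6, 6))
toy_image[:, 3:] = 1.0

# Hand-written vertical-edge filter (hypothetical values, chosen for illustration).
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])

feature_map = convolve2d_valid(toy_image, vertical_edge)
print(feature_map.shape)  # (4, 4)
print(feature_map[0])     # [0. 3. 3. 0.]: the response is non-zero only around the edge
```

The output is large only where the window straddles the dark-to-bright boundary, which is what "extracting an edge feature" means in practice.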

# Data
### Importing Data
The first thing to do is import all the libraries you're going to need.

In [None]:
import tensorflow as tf 
import pandas as pd 
import matplotlib.pyplot as plt 
import numpy as np 
%matplotlib inline

In [None]:
#Import the input files
train = pd.read_csv('../input/digit-recognizer/train.csv') 
evaluation = pd.read_csv('../input/digit-recognizer/test.csv')
sample = pd.read_csv('../input/digit-recognizer/sample_submission.csv')

print(f'train shape = {train.shape}', f'test shape = {evaluation.shape}', sep='\n')

### Exploring Data
After printing the shapes of 'train' and 'evaluation', you can see that the train set contains 1 additional column. Use .head() to look at the first 5 rows of the data.

In [None]:
train.head()

As shown, the first column, "label", is the class of each instance, which is the output we want to predict. <br />
Now, let's see if our data contains any flaws. <br />
Check whether the dataset contains any null values.

In [None]:
print(train.isnull().any().sum())
print(evaluation.isnull().any().sum())

### Preprocessing
Take the 'label' column out and use .describe() to get some insight into the data.

In [None]:
targets = train['label']
train = train.drop('label',axis = 1)

In [None]:
train.describe()

As we know, the images come in a grayscale format where all values are between 0 and 255. It is good practice to normalize the data, which makes it easier for the model to converge. <br />
Normalization here means rescaling all values to the 0-1 range. Since our values lie in the 0-255 range, you can do this by dividing every value by 255.

In [None]:
# Scale pixel values from 0-255 to 0-1 (the result is a float DataFrame)
train = train / 255.0
evaluation = evaluation / 255.0
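
Dividing by 255 is a special case of min-max scaling (often called normalization), whereas standardization usually means subtracting the mean and dividing by the standard deviation. The sketch below contrasts the two on a made-up pixel array; the helper `min_max_scale` and the sample values are assumptions for illustration only.

```python
import numpy as np

def min_max_scale(values, low=0.0, high=255.0):
    """Map values from the range [low, high] to [0, 1]."""
    return (values.astype("float32") - low) / (high - low)

# Hypothetical pixel intensities.
pixels = np.array([0, 51, 128, 255], dtype=np.uint8)

scaled = min_max_scale(pixels)  # 51 / 255 = 0.2
print(scaled.min(), scaled.max())

# Standardization, for comparison: zero mean and unit standard deviation.
as_float = pixels.astype("float32")
standardized = (as_float - as_float.mean()) / as_float.std()
print(standardized.mean(), standardized.std())
```

Min-max scaling keeps the values inside a fixed range, while standardization centers them around zero. The notebook uses the former.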

Now, let's describe our dataset again and see whether the standard deviation (std) has changed.

In [None]:
train.describe()

The standard deviation is much lower now, good job! <br />
Let's check out one random sample using the .imshow() function.

In [None]:
index = np.random.randint(0, len(train))  # pick a random row without hard-coding the dataset size
test_image = train.values[index].reshape(28,28)
plt.imshow(test_image, cmap = 'bone')
plt.title(targets.values[index])
plt.show()

In [None]:
train = train.values.reshape(-1,28,28,1)
evaluation = evaluation.values.reshape(-1,28,28,1)
targets = targets.values.reshape(-1,1)
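
The reshape above turns each flat row of 784 pixel values into a 28x28x1 array, the (height, width, channels) layout that Conv2D layers expect. Passing -1 lets NumPy infer the number of images. A small sketch on made-up data (the array `flat` is hypothetical):

```python
import numpy as np

# Two fake "images", each stored as a flat row of 784 values.
flat = np.arange(2 * 784).reshape(2, 784)

# -1 means "work this dimension out from the total size".
images = flat.reshape(-1, 28, 28, 1)
print(images.shape)  # (2, 28, 28, 1)

# Rows are filled in order: flat pixel k lands at (k // 28, k % 28).
print(images[0, 1, 0, 0] == flat[0, 28])  # True
```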

### Splitting
Split the data into train and test segments. Holding out 10-20% of the data for testing works well in most cases.

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(train, targets, stratify = targets, test_size = 0.1, random_state = 42)
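
The `stratify = targets` argument asks for a split in which every class keeps roughly the same share in both segments. The NumPy sketch below illustrates that idea on toy labels. It is not scikit-learn's actual implementation, and the helper `stratified_split_indices` is a hypothetical name.

```python
import numpy as np

def stratified_split_indices(labels, test_size=0.1, seed=42):
    """Return (train_idx, test_idx), taking about `test_size` of every class for the test set."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        # Shuffle the positions of this class, then cut off the test share.
        cls_idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(len(cls_idx) * test_size))
        test_idx.extend(cls_idx[:n_test])
        train_idx.extend(cls_idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Toy imbalanced labels: 80 zeros and 20 ones.
toy_labels = np.array([0] * 80 + [1] * 20)
train_idx, test_idx = stratified_split_indices(toy_labels, test_size=0.1)

print(np.bincount(toy_labels[train_idx]))  # [72 18]
print(np.bincount(toy_labels[test_idx]))   # [8 2]
```

Both segments keep the original 80/20 class ratio, which a purely random split does not guarantee.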

# CNN
### Defining the model
Our model will be 2 blocks of (2x Conv2D, 1x MaxPooling, 1x BatchNorm, 1x Dropout), followed by 2 Dense layers and 1 output layer. <br />
You can experiment with the number of layers and the parameters of each layer to see how they affect your evaluation score.

In [None]:
from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Dense, Dropout, Flatten
from keras.optimizers import Adam

In [None]:
model = Sequential()

model.add(Conv2D(32, input_shape = (28,28,1), kernel_size = (3,3), activation = 'relu'))
model.add(Conv2D(32, kernel_size = (3,3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(BatchNormalization())
model.add(Dropout(0.1))

model.add(Conv2D(64, kernel_size = (3,3), activation = 'relu'))
model.add(Conv2D(64, kernel_size = (3,3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(BatchNormalization())
model.add(Dropout(0.1))


model.add(Flatten())
model.add(Dense(256, activation = 'relu'))
model.add(BatchNormalization())
model.add(Dropout(0.4))
model.add(Dense(128, activation = 'relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(Dense(10, activation = 'softmax'))
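
The shapes flowing through the model above can be checked by hand. The sketch below assumes the Keras defaults used in that cell (no padding and stride 1 for Conv2D, pooling stride equal to the pool size); the helpers `conv_out` and `pool_out` are hypothetical names introduced for illustration.

```python
def conv_out(size, kernel=3, stride=1):
    """Output size along one axis of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output size along one axis of max pooling with stride equal to the pool size."""
    return size // pool

size = 28
size = conv_out(conv_out(size))  # block 1 convolutions: 28 -> 26 -> 24
size = pool_out(size)            # block 1 pooling: 24 -> 12
size = conv_out(conv_out(size))  # block 2 convolutions: 12 -> 10 -> 8
size = pool_out(size)            # block 2 pooling: 8 -> 4

flattened = size * size * 64     # 64 filters in the last Conv2D layer
first_conv_params = (3 * 3 * 1 + 1) * 32    # weights plus one bias per filter
first_dense_params = flattened * 256 + 256  # weights plus biases

print(size, flattened)                        # 4 1024
print(first_conv_params, first_dense_params)  # 320 262400
```

These numbers should match the corresponding rows printed by `model.summary()`.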



### Compiling the model with the right Optimizer, loss and metric.
You can try other optimizers such as SGD, but Adam works very well.

In [None]:
optimizer = Adam(learning_rate=0.001)  # `lr` is a deprecated alias in recent Keras versions
model.compile(optimizer = optimizer,
             loss = 'sparse_categorical_crossentropy',
             metrics = ['accuracy'])

model.summary()
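
The `sparse_categorical_crossentropy` loss takes integer class labels directly, which is why the targets were never one-hot encoded. A rough NumPy sketch of what it computes (the function below is an illustrative re-implementation under that assumption, not the Keras source, and the probabilities are made up):

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -log(probability the model assigned to the true class)."""
    y_true = np.asarray(y_true).reshape(-1)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0)
    picked = y_prob[np.arange(len(y_true)), y_true]
    return float(-np.log(picked).mean())

# Two samples, three classes; each row is a softmax-like output.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])  # integer class ids, not one-hot vectors

loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))  # 0.2899
```

The loss shrinks towards 0 as the probability assigned to the correct class approaches 1.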

### Training the model.


In [None]:
EPOCHS = 15
BATCH_SIZE = 256

In [None]:
history = model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs = EPOCHS, batch_size = BATCH_SIZE)

# Evaluation
### Evaluating the model

In [None]:
model.evaluate(X_test, y_test)

### Visualizing the model's performance

In [None]:
plt.figure(figsize=(9,6))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend(['train','validation'])
plt.show()

# Prediction and submission

In [None]:
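evaluation = evaluation.reshape(-1,28,28,1)
# predict_classes was removed from recent Keras versions;
# take the argmax over the softmax probabilities instead
results = np.argmax(model.predict(evaluation), axis=-1)

results = pd.Series(results, name="Label")
# ImageId starts at 1 and runs to the number of predictions
submission = pd.concat([pd.Series(range(1, len(results) + 1), name = "ImageId"), results], axis = 1)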

submission.to_csv("submission.csv", index=False)
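
The model's softmax layer outputs one probability per digit, and the submitted label is the digit with the highest probability. A toy sketch of that step with made-up probabilities (`toy_probs` is hypothetical and uses 3 classes instead of 10 for brevity):

```python
import numpy as np

# Each row is one image, each column the probability of one class.
toy_probs = np.array([[0.05, 0.90, 0.05],
                      [0.60, 0.30, 0.10]])

predicted = np.argmax(toy_probs, axis=-1)     # index of the largest value per row
image_ids = np.arange(1, len(predicted) + 1)  # ids start at 1, as in the submission file

print(predicted)  # [1 0]
print(image_ids)  # [1 2]
```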

### Thank you for reading!