### Convolutional Neural Networks (CNN)

Because they are computationally efficient and effective, CNNs are arguably the most popular deep learning architecture.

These networks are well suited to images because they can detect different features of an image, such as edges and shapes.

CNNs also require fewer parameters than a fully connected artificial neural network, because each filter's weights are shared across the whole image.
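
To make the parameter comparison concrete, here is a small arithmetic sketch. The layer sizes (a dense layer with 128 units, a convolutional layer with 32 filters of size 3 x 3) are illustrative assumptions and are not taken from this notebook.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus one bias per filter for a 2D convolutional layer."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters


# Hypothetical layer sizes, chosen only for illustration.
dense_count = dense_params(28 * 28, 128)   # every pixel connects to every unit
conv_count = conv2d_params(3, 3, 1, 32)    # each 3x3 filter is reused across the image

print(dense_count, conv_count)
```

Under these assumptions the dense layer needs 100,480 parameters while the convolutional layer needs 320, since the filter weights do not grow with the image size.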

Convolutional networks differ from ordinary artificial neural networks in that the input passes through convolutional and pooling layers before it reaches the fully connected layers.

The convolutional layers process the image using kernels (also called filters), which are generally small in spatial dimensionality.
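
As a rough illustration of how a small kernel slides over an image, the following NumPy sketch applies a 3 x 3 filter with stride 1 and no padding. The function name `conv2d_valid`, the toy 5 x 5 image, and the edge-detecting kernel are all assumptions made for this example; Keras implements the operation far more efficiently.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


# Toy image: two dark columns on the left, three bright columns on the right.
image = np.zeros((5, 5))
image[:, 2:] = 1.0

# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]])

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The resulting 3 x 3 feature map is large where the window straddles the edge and zero where the window covers a uniform region.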

### ReLU Activation function:

![ReLU activation function](https://www.researchgate.net/profile/Hossam_H_Sultan/publication/333411007/figure/fig7/AS:766785846525952@1559827400204/ReLU-activation-function.png)

ReLU introduces non-linearity by setting all negative values to zero while leaving positive values unchanged.
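
A minimal NumPy sketch of the function, using made-up input values:

```python
import numpy as np


def relu(x):
    """Element-wise ReLU: negative values become 0, others pass through."""
    return np.maximum(0, x)


values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])  # illustrative inputs
activated = relu(values)
print(activated)
```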

### Pooling Layers

Pooling layers reduce computational cost by shrinking the spatial dimensions of the feature map, which lowers the number of parameters needed in later layers. They also help reduce overfitting by providing an abstracted form of the original feature map.

In max pooling, only the maximum value in each region of the feature map is kept; it marks the location in that region where the feature is most strongly present.
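
The sketch below shows one way to compute 2 x 2 max pooling in NumPy. The helper name `max_pool2d` and the 4 x 4 feature map are assumptions for illustration; it uses non-overlapping windows and drops any leftover rows or columns.

```python
import numpy as np


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling over `size` x `size` windows."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    # Drop rows/columns that do not fill a complete window.
    trimmed = feature_map[:out_h * size, :out_w * size]
    # Split each axis into (window index, position inside window).
    blocks = trimmed.reshape(out_h, size, out_w, size)
    return blocks.max(axis=(1, 3))


feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 1, 5, 6],
                        [2, 2, 7, 3]])

pooled = max_pool2d(feature_map)
print(pooled)
```

Each 2 x 2 block is replaced by its largest value, so the 4 x 4 map shrinks to 2 x 2.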

### Fully Connected Layers

Fully connected (FC) layers are responsible for taking the extracted features and processing them to obtain a final probability for each class the image could belong to. In other words, the FC layers are in charge of classification, whereas the convolutional and pooling layers are responsible for feature extraction.

FC layers work the same way as a multilayer perceptron: every node in one layer is connected to every node in the next layer, and each connection has its own weight.
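
The classification step can be sketched as a single fully connected layer followed by a softmax. Everything here is a stand-in: the 8-element feature vector, the random weights, and the helper names `dense_forward` and `softmax` are assumptions for illustration, not the model built in this notebook.

```python
import numpy as np


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = z - np.max(z)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def dense_forward(features, weights, bias):
    """Fully connected layer: every input feeds every output unit."""
    return features @ weights + bias


rng = np.random.default_rng(0)
features = rng.random(8)             # hypothetical flattened feature vector
weights = rng.normal(size=(8, 10))   # one weight per (input, class) pair
bias = np.zeros(10)

probs = softmax(dense_forward(features, weights, bias))
predicted_class = int(np.argmax(probs))
print(probs.round(3), predicted_class)
```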

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from keras.utils import to_categorical
import random

Using TensorFlow backend.


In [None]:
np.random.seed(0)

In [None]:
(X_train, y_train), (X_test, y_test)= mnist.load_data()
 
print(X_train.shape)
print(X_test.shape)
assert(X_train.shape[0] == y_train.shape[0]), "The number of images is not equal to the number of labels."
assert(X_train.shape[1:] == (28,28)), "The dimensions of the images are not 28 x 28."
assert(X_test.shape[0] == y_test.shape[0]), "The number of images is not equal to the number of labels."
assert(X_test.shape[1:] == (28,28)), "The dimensions of the images are not 28 x 28."

In [None]:
num_of_samples=[]
 
cols = 5
num_classes = 10
 
fig, axs = plt.subplots(nrows=num_classes, ncols=cols, figsize=(5,10))
fig.tight_layout()
 
for i in range(cols):
    for j in range(num_classes):
      x_selected = X_train[y_train == j]
      axs[j][i].imshow(x_selected[random.randint(0,(len(x_selected) - 1)), :, :], cmap=plt.get_cmap('gray'))
      axs[j][i].axis("off")
      if i == 2:
        axs[j][i].set_title(str(j))
        num_of_samples.append(len(x_selected))

In [None]:
print(num_of_samples)
plt.figure(figsize=(12, 4))
plt.bar(range(0, num_classes), num_of_samples)
plt.title("Distribution of the train dataset")
plt.xlabel("Class number")
plt.ylabel("Number of images")
plt.show()

### Image preprocessing:

Notice that the difference between the CNN and the earlier artificial neural network lies in how we reshape the image.

Previously, each image was flattened into a one-dimensional array whose length is the total number of pixels in the image (784). Now we keep it as a 28 by 28 image but also add a depth (channel) dimension of 1.
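
The two layouts can be compared on a small stand-in array. The batch of four blank images below is an assumption used only to show the resulting shapes.

```python
import numpy as np

# Stand-in for a batch of 4 grayscale MNIST-sized images.
batch = np.zeros((4, 28, 28), dtype=np.uint8)

# Layout for a fully connected network: one row of 784 pixels per image.
flat = batch.reshape(batch.shape[0], 28 * 28)

# Layout for a CNN: keep height and width, add a single channel axis.
with_channel = batch.reshape(batch.shape[0], 28, 28, 1)

print(flat.shape, with_channel.shape)
```

Keeping the two spatial axes is what lets the convolutional filters see neighbouring pixels together.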

In [None]:
X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)

In [None]:
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

In [None]:
X_train = X_train/255
X_test = X_test/255