## Convolutional Neural Networks - Implementation
Your goal in this project is to classify handwritten digits using Convolutional Neural Networks. Here are a few examples of images from the input data:

![](https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png)

As output labels, we have a set of integer values ranging from 0 to 9, one per digit class. We are going to follow these steps in this notebook:

1. **Preparing the input data**
2. **Training a simple fully connected model**
3. **Extending to CNNs**
4. **Visualizing predictions**


### 1. Preparing the input data

In [None]:
import warnings
warnings.filterwarnings("ignore")

from keras.datasets import mnist

# The data, split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [None]:
print("Shape of the training set: {}".format(X_train.shape))
print("Shape of the test set: {}".format(X_test.shape))

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

# Index to be visualized
IDX = 2
plt.imshow(X_train[IDX], cmap='gray')
plt.title("Label: {}".format(y_train[IDX]))
plt.show()

Now, let's normalize the data using standardization:

In [None]:
# Mean pixel value over the whole training set
train_mean = X_train.mean()
train_mean

In [None]:
# Standard deviation of pixel values over the whole training set
train_std = X_train.std()
train_std

In [None]:
X_train = (X_train - train_mean)/train_std
X_test = (X_test - train_mean)/train_std

In [None]:
print(f'Training Mean {X_train.mean():.3f}')
print(f'Training Std {X_train.std():.3f}')
print(f'Test Mean {X_test.mean():.3f}')
print(f'Test Std {X_test.std():.3f}')

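Normalization puts all inputs on a comparable scale, which generally makes training better behaved. Say there are two inputs to your network, x1 and x2, where x1 varies from 0 to 0.5 and x2 varies from 0 to 1000. A change of 0.5 in x1 covers 100% of its range, whereas a change of 0.5 in x2 covers only 0.05% of its range. Without normalization, the same numerical change would therefore mean very different things for the two inputs.

Note that the test set is standardized with the training mean and standard deviation, so its mean and standard deviation end up close to, but not exactly, 0 and 1.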
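The effect of standardization can be checked on a toy example. The sketch below uses made-up arrays (`toy_train` and `toy_test` are hypothetical stand-ins, not the MNIST data) and, as in the cells above, reuses the training statistics for the test set.

```python
import numpy as np

# Hypothetical toy data standing in for pixel intensities (not MNIST itself)
rng = np.random.default_rng(0)
toy_train = rng.integers(0, 256, size=(100, 28, 28)).astype("float64")
toy_test = rng.integers(0, 256, size=(20, 28, 28)).astype("float64")

# Statistics are computed on the training set only
toy_mean = toy_train.mean()
toy_std = toy_train.std()

# The same shift and scale are applied to both sets
toy_train_norm = (toy_train - toy_mean) / toy_std
toy_test_norm = (toy_test - toy_mean) / toy_std

print(f"Train mean {toy_train_norm.mean():.3f}, std {toy_train_norm.std():.3f}")
print(f"Test mean {toy_test_norm.mean():.3f}, std {toy_test_norm.std():.3f}")
```

The training set comes out with mean 0 and standard deviation 1 up to floating-point error, while the test set is only approximately standardized.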

### 2. Training using only Fully Connected layers first

In [None]:
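from keras.models import Sequential
from keras.layers import Dense, Flatten

# One possible baseline (the hidden layer size is an assumption, not prescribed here):
# flatten each 28x28 image into a 784-vector, then classify with dense layers
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

In [None]:
# Integer labels (0 to 9) call for the sparse variant of categorical cross-entropy
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

In [None]:
model.fit(X_train, y_train, epochs=2)

In [None]:
model.evaluate(X_test, y_test)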

### 3. Extending to CNNs

Now, let's replicate the architecture from the visualization shown in the previous video:
![Screen Shot 2019-05-14 at 12 58 15](https://user-images.githubusercontent.com/5733246/57713463-e8627400-7648-11e9-8c64-3745519dbb20.png)

The analogous architecture that we are going to use is:
- Conv. Layer with 6 filters
- Maxpooling
- Conv. Layer with 16 filters
- Maxpooling
- Fully connected layer with 120 units
- Fully connected layer with 100 units
- Output layer with 10 units
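To get a feel for this architecture, the sketch below traces the spatial size of the data through each layer and counts the trainable parameters. The list above does not fix kernel or pooling sizes, so the values used here are assumptions taken from the code cells later in this notebook: 3x3 kernels, stride 1, no padding, and 2x2 pooling. The helpers `conv_out` and `pool_out` are hypothetical names introduced for illustration.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1 (assumed settings)."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

size = 28                      # MNIST images are 28x28
size = conv_out(size)          # Conv. layer, 6 filters  -> 26
size = pool_out(size)          # Maxpooling              -> 13
size = conv_out(size)          # Conv. layer, 16 filters -> 11
size = pool_out(size)          # Maxpooling              -> 5
flat = size * size * 16        # Flatten                 -> 400

# Trainable parameters: (weights per unit + 1 bias) * number of units
params = {
    "conv1": (3 * 3 * 1 + 1) * 6,
    "conv2": (3 * 3 * 6 + 1) * 16,
    "dense120": (flat + 1) * 120,
    "dense100": (120 + 1) * 100,
    "output": (100 + 1) * 10,
}
print(flat)
print(params)
print(sum(params.values()))
```

Under these assumptions most of the parameters sit in the first fully connected layer, not in the convolutional layers.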

#### 3.1 One Conv-Layer

In [None]:
X_train.shape

In [None]:
X_train = X_train.reshape(60000,28,28,1)

In [None]:
X_test = X_test.reshape(10000,28,28,1)

In [None]:
from keras.layers import Conv2D, MaxPool2D

model = Sequential([
    Conv2D(6, kernel_size=(3,3), activation='relu', input_shape=(28,28,1)),
    MaxPool2D(),
    Conv2D(16, kernel_size=(3,3), activation='relu'),
    MaxPool2D(),
    Flatten(),
    Dense(300, activation='relu'),
    Dense(10, activation='softmax')
])


In [None]:

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])


In [None]:
model.fit(X_train, y_train)

In [None]:
model.evaluate(X_test, y_test)

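A shape error can occur if the model is trained on the original 3D arrays, because `Conv2D` expects a 4D input that includes a channel dimension; this is why the data was reshaped before training above. We can confirm the expected input shape in the [documentation of Conv2D](https://keras.io/layers/convolutional/):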
> 4D tensor with shape: (batch, channels, rows, cols) if data_format is "channels_first" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is "channels_last".

The data format can be verified this way:

In [None]:
from keras import backend as K
K.image_data_format()

By default, the data format in Keras with the TensorFlow backend is 'channels_last'; hence, the input data must have the shape '(batch, rows, cols, channels)'. The current shape is:

In [None]:
print("Shape of the training set: {}".format(X_train.shape))
print("Shape of the test set: {}".format(X_test.shape))

If the shapes printed above have only three dimensions, the channel dimension is missing. Since these images are grayscale, the number of channels is just one. The input data can be reshaped in the following way:

In [None]:
X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)
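The reshape above hard-codes the number of samples. As a sketch of an alternative that does not depend on the dataset size, the channel axis can be appended with `np.expand_dims` or with a reshape that infers the batch dimension. The array `images` below is a hypothetical stand-in for `X_train` or `X_test`.

```python
import numpy as np

# Hypothetical stand-in for a batch of grayscale images: (batch, rows, cols)
images = np.zeros((5, 28, 28))

# Append a trailing channel axis: (batch, rows, cols) -> (batch, rows, cols, 1)
with_channel = np.expand_dims(images, axis=-1)

# Equivalent reshape; -1 lets NumPy infer the batch size
also_with_channel = images.reshape(-1, 28, 28, 1)

print(with_channel.shape)
print(also_with_channel.shape)
```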

Now let's train again:

In [None]:
from keras.layers import Conv2D

model = Sequential([
    Conv2D(6, kernel_size=(3, 3), activation='relu', input_shape=(28,28,1)),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=2)
score = model.evaluate(X_test, y_test)

print('Test loss:', score[0])
print('Test accuracy:', score[1])
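To illustrate what a single filter of the convolutional layer computes, here is a minimal NumPy sketch. It assumes one input channel, stride 1, and no padding, and it leaves out the bias term; `conv2d_valid` is a hypothetical helper, not a Keras function, and the inputs are random placeholders.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel and the patch under it, summed
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical inputs: a random 28x28 "image" and one 3x3 filter
rng = np.random.default_rng(0)
image = rng.random((28, 28))
kernel = rng.random((3, 3))

# ReLU activation applied to the filter response
feature_map = np.maximum(conv2d_valid(image, kernel), 0.0)
print(feature_map.shape)  # (26, 26)
```

With 6 such filters, the `Flatten` layer in the model above receives 26 * 26 * 6 = 4056 values per image.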

#### 3.2 One Conv-Layer + Maxpooling

Now, let's add maxpooling:

In [None]:
from keras.layers import MaxPooling2D

model = Sequential([
    Conv2D(6, kernel_size=(3, 3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=2)
score = model.evaluate(X_test, y_test)

print('Test loss:', score[0])
print('Test accuracy:', score[1])
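As a sketch of what 2x2 max pooling does to a single feature map, the NumPy example below keeps the largest value in each non-overlapping 2x2 block. The function `max_pool_2x2` is a hypothetical helper that assumes even side lengths, and the input values are made up.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling over a 2D array with even sides."""
    h, w = feature_map.shape
    # Split rows and columns into blocks of 2, then take the max of each block
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
])
pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 2]
#  [2 8]]
```

In the model above, pooling shrinks each 26x26 feature map to 13x13, so `Flatten` receives 13 * 13 * 6 = 1014 values instead of 4056.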

#### 3.3 Two Conv-Layers + Maxpooling

And now, let's finish the architecture:

In [None]:
model = Sequential([
    Conv2D(6, kernel_size=(3, 3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(16, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(120, activation='relu'),
    Dense(100, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=2)
score = model.evaluate(X_test, y_test)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

### 4. Visualizing predictions
Finally, let's visualize some predictions:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline


# Show the first five test images with their true and predicted labels
for idx in range(5):
    plt.imshow(X_test[idx].reshape(28,28), cmap='gray')
    out = model.predict(X_test[idx].reshape(1,28,28,1))
    plt.title("True: {}, Pred: {}".format(y_test[idx], np.argmax(out)))
    plt.show()
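The loop above calls `model.predict` on one image at a time. Predictions can also be decoded for a whole batch at once by taking the argmax along the class axis. In the sketch below, `probs` is a hypothetical stand-in for the output of `model.predict` on three images, and `labels` is a made-up set of true labels.

```python
import numpy as np

# Hypothetical softmax outputs for 3 images over 10 classes (each row sums to 1)
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 2] = 0.82
probs[2, 1] = 0.82

# Hypothetical true labels for the same 3 images
labels = np.array([7, 2, 4])

# argmax along the class axis gives one predicted digit per image
preds = np.argmax(probs, axis=1)

# Fraction of predictions that match the true labels
accuracy = np.mean(preds == labels)

print(preds)
print(accuracy)
```

Here two of the three made-up predictions match their labels, so the accuracy is 2/3.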