# Convolutional Neural Networks

In this notebook, we will get a general overview of CNNs and what can be done with them.
We will use the MNIST dataset.
At the end of the notebook, as an optional extra, you can try to implement something similar on the CIFAR-10 dataset.

Please note that this notebook is not an advanced implementation of CNNs. Its purpose is to teach you how to implement a simple CNN from scratch, without using any pre-trained network.

## MNIST Dataset

The MNIST dataset is a large database of handwritten digits. It contains 60,000 training images and 10,000 test images.

### Data Preparation

**Import the packages that you may need.**

In [2]:
from keras import layers
from keras import models
from keras.datasets import mnist
from keras.utils import to_categorical

Using TensorFlow backend.


**Load the MNIST dataset.**

In [5]:
(train_images,train_labels),(test_images, test_labels)= mnist.load_data()

**Perform some data pre-processing on both inputs and labels. Hint: reshape each input image to dimension (28, 28, 1).**

In [7]:

train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
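
To make the label step concrete, the sketch below reproduces in plain NumPy what `to_categorical` does to integer class labels. The helper name `one_hot` and the sample labels are made up for illustration; in the notebook itself you would keep using `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Hypothetical stand-in for to_categorical: integer labels -> one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Put a 1 in the column selected by each label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample_labels = [5, 0, 4]          # made-up labels, not read from MNIST
encoded = one_hot(sample_labels)
print(encoded.shape)               # (3, 10)
print(encoded.argmax(axis=1))      # [5 0 4]
```

Taking the `argmax` of each row recovers the original digit, which is also how the 10 softmax outputs of the network are turned back into a predicted class.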


**Print the shape of the data and some samples to visualize them.**

In [8]:
print('Shape training data:')
print(train_images.shape)
print(train_labels.shape)
print('\nShape test data:')
print(test_images.shape)
print(test_labels.shape)

Shape training data:
(60000, 28, 28, 1)
(60000, 10)

Shape test data:
(10000, 28, 28, 1)
(10000, 10)
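
The cell above prints the shapes but does not show any sample. As a quick way to look at one image without a plotting library, the sketch below renders a 28x28 array as text. The helper `ascii_image` and the synthetic `demo` image are assumptions made for illustration; with the data loaded you could pass `train_images[0]` instead.

```python
import numpy as np

def ascii_image(image, chars=" .:*#"):
    """Render a 2-D array with values in [0, 1] as text (hypothetical helper)."""
    image = np.asarray(image, dtype="float32").squeeze()  # drop the channel axis
    # Map each intensity to an index into `chars` (dark -> ' ', bright -> '#').
    levels = np.clip((image * len(chars)).astype(int), 0, len(chars) - 1)
    return "\n".join("".join(chars[v] for v in row) for row in levels)

# Made-up 28x28x1 image with a bright vertical bar, standing in for a digit.
demo = np.zeros((28, 28, 1), dtype="float32")
demo[4:24, 13:15, 0] = 1.0
rendered = ascii_image(demo)
print(rendered)
```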


## Vanilla CNN

This is the most basic CNN: you will build a convolutional neural network composed of 2 convolutional layers and 2 fully connected layers. Use appropriate activation functions.

**Set the batch size and the number of epochs.**

In [19]:
size_batches = 256
epochs = 3

**Build the Vanilla CNN model.**

In [20]:
model_cnn = models.Sequential()
model_cnn.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model_cnn.add(layers.Conv2D(64, (3, 3), activation='relu'))
model_cnn.add(layers.Flatten())
model_cnn.add(layers.Dense(64, activation='relu'))
model_cnn.add(layers.Dense(10, activation='softmax'))

**Get a summary of the model.**

In [21]:
model_cnn.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_3 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
flatten_2 (Flatten)          (None, 36864)             0         
_________________________________________________________________
dense_3 (Dense)              (None, 64)                2359360   
_________________________________________________________________
dense_4 (Dense)              (None, 10)                650       
Total params: 2,378,826
Trainable params: 2,378,826
Non-trainable params: 0
_________________________________________________________________
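
The numbers in the summary above can be checked by hand. The sketch below assumes the Keras defaults used by the model (stride 1 and no padding for `Conv2D`) and recomputes the output sizes and parameter counts; the helper function names are made up for illustration.

```python
def conv2d_output(height, width, kernel=3):
    """Spatial size after a stride-1 convolution with no padding."""
    return height - kernel + 1, width - kernel + 1

def conv2d_params(in_channels, filters, kernel=3):
    """kernel * kernel * in_channels weights per filter, plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per (input, unit) pair, plus one bias per unit."""
    return inputs * units + units

h, w = conv2d_output(28, 28)          # first Conv2D: 26 x 26
p1 = conv2d_params(1, 32)             # 320
h, w = conv2d_output(h, w)            # second Conv2D: 24 x 24
p2 = conv2d_params(32, 64)            # 18496
flat = h * w * 64                     # Flatten: 36864
p3 = dense_params(flat, 64)           # 2359360
p4 = dense_params(64, 10)             # 650
total = p1 + p2 + p3 + p4
print(total)                          # 2378826
```

Almost all of the parameters sit in the first `Dense` layer, because it is fed the full 24x24x64 feature map.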


**Configure the model with an optimizer and a loss.**

In [22]:
model_cnn.compile(optimizer='rmsprop',loss='categorical_crossentropy',metrics=['accuracy'])

**Train the model.**

In [24]:
# Pass batch_size and epochs by keyword: the third positional argument of fit() is batch_size,
# so fit(x, y, epochs, size_batches) would train for 256 epochs with a batch size of 3.
model_cnn.fit(train_images, train_labels, batch_size=size_batches, epochs=epochs)

### CNN with Max Pooling and Dropout

Let's implement the same CNN as above, this time adding Max Pooling and Dropout.

**Build the new network with max pooling and dropout. Think about where the Max Pooling and Dropout layers should be inserted.**

**Get a summary of the model.**

**Configure the network.**

**Train the network.**

**Evaluate the model on the test data.**

**Print the test accuracy.**
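
One possible way to approach the steps above is sketched below. It is not a reference solution: the placement of the layers, the dropout rates (0.25 and 0.5) and the names `build_pooled_cnn` and `model_pool` are assumptions chosen for illustration. A common pattern is to put a pooling layer after each convolution and dropout before the fully connected layers.

```python
from keras import layers, models

def build_pooled_cnn(input_shape=(28, 28, 1), num_classes=10):
    """Vanilla CNN plus pooling and dropout (one possible placement, an assumption)."""
    model = models.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(layers.MaxPooling2D((2, 2)))    # 26x26 -> 13x13
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))    # 11x11 -> 5x5
    model.add(layers.Dropout(0.25))           # assumed rate
    model.add(layers.Flatten())               # 5 * 5 * 64 = 1600 values
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dropout(0.5))            # assumed rate
    model.add(layers.Dense(num_classes, activation='softmax'))
    return model

model_pool = build_pooled_cnn()
model_pool.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model_pool.summary()
```

With the data prepared as in the MNIST section, training and evaluation could then look like `model_pool.fit(train_images, train_labels, batch_size=size_batches, epochs=epochs)` followed by `test_loss, test_acc = model_pool.evaluate(test_images, test_labels)` and `print(test_acc)`. Pooling shrinks the flattened feature map from 36,864 to 1,600 values, so the first `Dense` layer becomes much smaller.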

## Extra: More complex CNN with CIFAR-10

As an extra part, you can also load the CIFAR-10 dataset, apply pre-processing similar to that used for MNIST, and implement a suitable CNN. This dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. You will therefore need a slightly deeper network, with 4 convolutional layers.

This part is not guided like the previous one; it's up to you to start from scratch and try out the implementation. However, the procedure is very similar.
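
If you want a starting point, the sketch below shows one way a network with 4 convolutional layers could be laid out for 32x32 color inputs. Everything in it is an assumption made for illustration (filter counts, `padding='same'` on alternate layers, dropout rates, the 512-unit dense layer, and the names `build_cifar_cnn` and `cifar_model`); treat it as a scaffold to experiment with, not as the expected answer.

```python
from keras import layers, models

def build_cifar_cnn(input_shape=(32, 32, 3), num_classes=10):
    """Two blocks of two convolutions each, followed by a small classifier (a sketch)."""
    model = models.Sequential()
    # Block 1: 32x32 -> 32x32 -> 30x30 -> 15x15
    model.add(layers.Conv2D(32, (3, 3), activation='relu', padding='same',
                            input_shape=input_shape))
    model.add(layers.Conv2D(32, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Dropout(0.25))
    # Block 2: 15x15 -> 15x15 -> 13x13 -> 6x6
    model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Dropout(0.25))
    # Classifier
    model.add(layers.Flatten())
    model.add(layers.Dense(512, activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(num_classes, activation='softmax'))
    return model

cifar_model = build_cifar_cnn()
cifar_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
cifar_model.summary()
```

The data can be loaded with `cifar10.load_data()` from `keras.datasets`. The images already have shape (32, 32, 3), so no reshape is needed: scaling to [0, 1] and applying `to_categorical` to the labels is enough.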