# Chapter 6: Convolutional Neural Networks

In [1]:
'''
W1 assignment

First, we import the Keras module and check its version.
'''
import keras
keras.__version__

'2.6.0'

Basic elements of a convolutional neural network

In [2]:
'''
W1 assignment

We start with the two basic building blocks of a convolutional neural network:
a convolutional layer followed by a pooling layer.

After importing the submodules we need (layers and models), we define a
Sequential model that consists of:
  * a Conv2D layer with 32 filters of size 5x5 and a 'relu' activation. It
  receives 28x28 grayscale images (input_shape=(28, 28, 1)) and, since no
  padding is used, outputs 24x24 feature maps (28 - 5 + 1 = 24).
  * a MaxPooling2D layer with a 2x2 window, which halves each spatial
  dimension: 24x24 becomes 12x12.

Finally, we print the summary of our model to have a better understanding of its
architecture.

The number of parameters of our model is composed of the following:
  * Conv2D layer: (5 * 5 * 1 weights + 1 bias) * 32 filters = 832
  * MaxPooling2D layer: 0 (pooling has no trainable parameters)

This results in 832 parameters.
'''
from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 24, 24, 32)        832       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 32)        0         
Total params: 832
Trainable params: 832
Non-trainable params: 0
_________________________________________________________________
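
The shape and parameter figures in the summary above can be reproduced by hand.
The following is a minimal sketch in plain Python, assuming an unpadded
('valid') convolution with stride 1, which is what Conv2D uses when no padding
or stride is specified. The helper names conv2d_output_shape and conv2d_params
are hypothetical and are not part of Keras.

```python
def conv2d_output_shape(height, width, kernel, filters, stride=1):
    """Output (height, width, channels) of an unpadded ('valid') convolution."""
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return out_h, out_w, filters


def conv2d_params(kernel, in_channels, filters):
    """Each filter has kernel * kernel * in_channels weights plus one bias."""
    return (kernel * kernel * in_channels + 1) * filters


# First layer of the model: 32 filters of 5x5 over a 28x28x1 image.
conv_shape = conv2d_output_shape(28, 28, kernel=5, filters=32)
conv_params = conv2d_params(kernel=5, in_channels=1, filters=32)

print(conv_shape)   # (24, 24, 32)
print(conv_params)  # 832
```

Both values match the conv2d row of the summary: an output shape of
(None, 24, 24, 32), where None is the batch dimension, and 832 parameters.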


Basic CNN model

In [3]:
'''
W1 assignment

Now we build a basic CNN by stacking two convolution + pooling blocks.

The model consists of:
  * a Conv2D layer with 32 filters of size 5x5 and a 'relu' activation, which
  turns the 28x28x1 input into 24x24x32 feature maps.
  * a MaxPooling2D layer with a 2x2 window, which reduces them to 12x12x32.
  * a second Conv2D layer with 64 filters of size 5x5 and a 'relu' activation,
  which outputs 8x8x64 feature maps (12 - 5 + 1 = 8).
  * a second MaxPooling2D layer with a 2x2 window, which reduces them to 4x4x64.

Finally, we print the summary of our model to have a better understanding of its
architecture.

The number of parameters of our model is composed of the following:
  * First Conv2D layer: (5 * 5 * 1 weights + 1 bias) * 32 filters = 832
  * Second Conv2D layer: (5 * 5 * 32 weights + 1 bias) * 64 filters = 51264

This results in 832 + 51264 = 52096 parameters.
'''
model = models.Sequential()
model.add(layers.Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (5, 5), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 24, 24, 32)        832       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 8, 8, 64)          51264     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64)          0         
Total params: 52,096
Trainable params: 52,096
Non-trainable params: 0
_________________________________________________________________
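
Both pooling rows in the summary show each spatial dimension being halved
(24 to 12, then 8 to 4) with no parameters. As a rough illustration of why, here
is a minimal pure-Python sketch of 2x2 max pooling with stride 2 on a single
small feature map. The function max_pool_2x2 and the sample values are made up
for this example, and this is not how Keras implements the layer.

```python
def max_pool_2x2(grid):
    """Apply 2x2 max pooling with stride 2 to a 2D list with even dimensions."""
    pooled = []
    for r in range(0, len(grid), 2):
        row = []
        for c in range(0, len(grid[0]), 2):
            # Keep only the largest value of each non-overlapping 2x2 window.
            window = (grid[r][c], grid[r][c + 1],
                      grid[r + 1][c], grid[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 4, 8],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 8]]
```

The 4x4 input becomes a 2x2 output, and since the operation only selects
maxima there is nothing to learn, hence the 0 in the Param # column.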


In [4]:

'''
W1 assignment

Now we add the classifier on top of the convolutional base.

We add two more layers to the model:
  * a Flatten layer, which unrolls the 4x4x64 feature maps into a vector of
  4 * 4 * 64 = 1024 values.
  * a Dense output layer of 10 neurons, one per digit class. Since it is a
  classification problem, using 'softmax' as its activation function is
  convenient.

The number of parameters added is composed of the following:
  * Flatten layer: 0
  * Dense layer: 1024 inputs * 10 neurons + 1 bias * 10 = 10250

Together with the 52096 parameters of the convolutional layers, this results in
52096 + 10250 = 62346 parameters.
'''
model.add(layers.Flatten())
model.add(layers.Dense(10, activation='softmax'))

In [5]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 24, 24, 32)        832       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 8, 8, 64)          51264     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                10250     
Total params: 62,346
Trainable params: 62,346
Non-trainable params: 0
__________________________________________________
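
The totals in this summary can be checked with a few lines of arithmetic. The
sketch below only restates the layer sizes shown above; the variable names are
chosen for illustration.

```python
# Convolutional layers: (kernel_h * kernel_w * in_channels + 1 bias) * filters
conv1_params = (5 * 5 * 1 + 1) * 32    # 832
conv2_params = (5 * 5 * 32 + 1) * 64   # 51264

# Flatten turns the 4x4x64 feature maps into a single vector.
flattened_size = 4 * 4 * 64            # 1024

# Dense layer: one weight per input per neuron, plus one bias per neuron.
dense_params = flattened_size * 10 + 10  # 10250

total_params = conv1_params + conv2_params + dense_params
print(total_params)  # 62346
```

Note that most of the parameters sit in the second convolutional layer, not in
the dense classifier.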

In [7]:
'''
W1 assignment

Now we load and prepare the MNIST dataset.

After importing the dataset and the to_categorical helper, we:
  * load the 60000 training images and 10000 test images, each 28x28 pixels.
  * reshape them to (samples, 28, 28, 1), adding the channel dimension that
  the Conv2D input layer expects.
  * convert the pixel values to 'float32' and divide by 255, so that they lie
  in the range [0, 1].
  * convert the integer labels (0 to 9) into one-hot vectors of length 10, as
  required by the categorical_crossentropy loss.
'''
from keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

print(train_images.shape)
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
(60000, 28, 28)
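
To make the two preprocessing steps concrete, here is a minimal pure-Python
sketch of what scaling and one-hot encoding do to individual values. The
helpers one_hot and scale_pixels are hypothetical names for this illustration;
the cell above relies on NumPy arithmetic and Keras's to_categorical instead.

```python
def one_hot(label, num_classes=10):
    """Return a vector of zeros with a single 1.0 at the index of the label."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


def scale_pixels(pixels):
    """Map 8-bit pixel intensities (0..255) to floats in [0, 1]."""
    return [p / 255 for p in pixels]


encoded = one_hot(3)
scaled = scale_pixels([0, 51, 255])

print(encoded)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(scaled)   # [0.0, 0.2, 1.0]
```

After encoding, each label has the same length as the model's 10-way softmax
output, which is what lets the loss compare them element by element.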


In [8]:
'''
W1 assignment

Now that we've defined our model it's time to compile and fit it to our dataset.
We also have to define our hyperparameters, such as:
  * batch size
  * number of epochs
  * loss function
  * optimizer

In our case we compile our model with the following hyper-parameters:
  * loss function: categorical_crossentropy. This is the loss used in Keras
  for multi-class classification problems with one-hot encoded labels.
  * optimizer: sgd. Stochastic Gradient Descent is a simple and widely used
  optimizer.
  * metrics: accuracy. We want to know how well our model performs in terms of
  accuracy when classifying our samples, in contrast with regression problems,
  where we want to know the amount of error obtained.

After compiling, we proceed to fit our model by indicating both the training
images and labels and also the number of epochs (5) and the batch size (100).
'''
batch_size = 100
epochs = 5

model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

model.fit(train_images, train_labels,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1
          )

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7fd44115f1d0>
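
With 60000 training images and a batch size of 100, each epoch performs 600
weight updates. The sketch below works this out and also shows, in plain
Python, what the categorical cross-entropy loss measures for a single sample.
The function is written for illustration only (the eps clipping value is an
assumption made here to avoid log(0)), and the probabilities are made up.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    # eps is an illustrative safeguard against taking log(0).
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


steps_per_epoch = math.ceil(60000 / 100)
print(steps_per_epoch)  # 600

y_true = [0.0, 1.0, 0.0]  # the correct class is index 1
confident = categorical_crossentropy(y_true, [0.05, 0.90, 0.05])
unsure = categorical_crossentropy(y_true, [0.40, 0.30, 0.30])

print(confident < unsure)  # True
```

The loss is low when the model puts most of its probability on the correct
class and grows as that probability shrinks, which is the quantity SGD tries to
reduce at every step.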

Model evaluation

In [None]:
'''
W1 assignment

Now that the model is trained, we evaluate it on the test set, that is, on the
10000 images and labels that were not used during training.

model.evaluate returns the loss followed by the metrics we passed to compile,
in our case the accuracy.

We obtain a test loss of about 0.11 and a test accuracy of 0.9673 (96.73%).
'''
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test loss:', test_loss)
print('Test accuracy:', test_acc)

Test loss: 0.11137229395359755
Test accuracy: 0.9673
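
An accuracy of 0.9673 on 10000 test images means that 9673 of them were
classified correctly. As a rough sketch of how such a figure is obtained, the
plain-Python example below picks the most probable class for each prediction
and compares it with the one-hot label. The helper names and the tiny
three-class sample data are invented for illustration and are not taken from
the run above.

```python
def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, one_hot_labels):
    """Fraction of samples whose most probable class matches the label."""
    correct = sum(argmax(p) == argmax(t)
                  for p, t in zip(predictions, one_hot_labels))
    return correct / len(predictions)


# Made-up softmax outputs and labels for four samples and three classes.
predictions = [
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],  # wrong: the label below says class 1
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
]

sample_accuracy = accuracy(predictions, labels)
print(sample_accuracy)  # 0.75
```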
