<a href="https://colab.research.google.com/github/zelal-Eizaldeen/deeplearning_course/blob/main/4_4Tensorflow_Programming_Example_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this Google Colab notebook, we will demonstrate how to do **image classification using a convolutional neural network implemented in TensorFlow**. We start by importing the TensorFlow modules we need. We will train the network for **32 epochs with a batch size of 32**.

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Conv2D
import numpy as np
import logging
tf.get_logger().setLevel(logging.ERROR)

EPOCHS = 32
BATCH_SIZE = 32

We **load the CIFAR-10 dataset with keras.datasets.cifar10**.

In [2]:
# Load dataset.
cifar_dataset = keras.datasets.cifar10
(train_images, train_labels), (test_images,
    test_labels) = cifar_dataset.load_data()


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 6s 0us/step


As before, we want to **standardize the dataset**, so we **compute the mean and standard deviation of the training images** and use them to **standardize both the training images and the test images**.

In [3]:
# Standardize dataset.
mean = np.mean(train_images)
stddev = np.std(train_images)
train_images = (train_images - mean) / stddev
test_images = (test_images - mean) / stddev
print('mean: ', mean)
print('stddev: ', stddev)

mean:  120.70756512369792
stddev:  64.1500758911213
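To make the idea concrete, here is a minimal sketch of the same standardization on tiny made-up arrays (the `toy_` names and values are assumptions for illustration, not CIFAR-10 data). The point it shows is that the statistics come from the training split only and are then reused for the test split.

```python
import numpy as np

# Tiny stand-in arrays (hypothetical values, not CIFAR-10 data).
toy_train = np.array([[0.0, 50.0], [100.0, 250.0]])
toy_test = np.array([[25.0, 75.0]])

# Statistics are computed from the training split only.
toy_mean = np.mean(toy_train)
toy_std = np.std(toy_train)

# The same mean and standard deviation are applied to both splits.
toy_train_std = (toy_train - toy_mean) / toy_std
toy_test_std = (toy_test - toy_mean) / toy_std

# The training split ends up with mean close to 0 and std close to 1;
# the test split generally does not, because it uses the training statistics.
print(toy_train_std.mean(), toy_train_std.std())
print(toy_test_std.mean())
```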


We also want to **one-hot encode our labels**, and we do that using the to_categorical function. We specify 10 classes, **so the network will later need 10 outputs**.

In [4]:
# Change labels to one-hot.
train_labels = to_categorical(train_labels,
                              num_classes=10)
test_labels = to_categorical(test_labels,
                             num_classes=10)
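As a rough illustration of what one-hot encoding produces, the sketch below reimplements the idea in plain NumPy. The helper `one_hot` is a hypothetical name and is not how to_categorical is implemented internally; it only mimics the result for integer labels stored in a column, which is the shape CIFAR-10 labels are loaded in.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (num_labels, num_classes) array with a single 1.0 per row."""
    labels = np.asarray(labels).reshape(-1)          # flatten (N, 1) to (N,)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0    # set the class column
    return encoded

# Three example labels in the (N, 1) column layout.
example = one_hot([[3], [0], [9]], num_classes=10)
print(example.shape)
print(example[0])
```

Each row has a 1 in the column of its class and 0 everywhere else, which is the format the categorical cross-entropy loss expects.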

# Create the CNN

Let's now look at how **we can define our network**. It is a **sequential** network as before, where we stack a number of layers on top of each other. It consists of **two convolutional layers** and **one fully connected layer**. **One difference between this and the digit classification network is that we don't have a Flatten layer at the very beginning of the network**.

In the digit classification network we needed the Flatten layer because the **fully connected layer, which was the first layer, assumed a 1D array as input**. **But a convolutional layer assumes a 3D array as input, which is what the image is**: **two spatial dimensions plus the color channels make three.**

Let's look at the **first convolutional layer**:
- We say that **we want 64 output channels**, **a kernel size of five by five, and a stride of two by two**. We use **ReLU activation and padding equals same**.
- With padding "same", if we had **a stride of one by one, the output dimension would be the same** as the input dimension.
- But given that we have **a stride of two**, the **output width and height will be exactly half of the input width and height**. So with an **input shape of 32 by 32 by 3, the output will be 16 by 16 by the number of channels**, **which is 64**.
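The size arithmetic above can be checked with a small helper. With padding "same", the output width and height are the input size divided by the stride, rounded up. The function name `same_padding_output_size` is an assumption for this sketch.

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size of a convolution with padding='same'."""
    return math.ceil(input_size / stride)

size_stride_1 = same_padding_output_size(32, 1)   # stride 1 keeps the size
size_stride_2 = same_padding_output_size(32, 2)   # stride 2 halves it

print(size_stride_1)  # 32
print(size_stride_2)  # 16
```

Note that the kernel size does not appear in the formula: with "same" padding, the padding is chosen so that only the stride affects the output size.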


We use **He normal initialization for the weights**, and we initialize the **biases** to zero.

In [5]:
# Model with two convolutional and one fully connected layer.
model = Sequential()
model.add(Conv2D(64, (5, 5), strides=(2,2),
                 activation='relu', padding='same',
                 input_shape=(32, 32, 3),
                 kernel_initializer='he_normal',
                 bias_initializer='zeros'))




For the **second convolutional layer**, we also use **64 channels**. We have a smaller **kernel size, three by three**, and again a *stride of two by two*. **Again, we use ReLU, same padding, He normal, and zeros.**

In [6]:
model.add(Conv2D(64, (3, 3), strides=(2,2),
                 activation='relu', padding='same',
                 kernel_initializer='he_normal',
                 bias_initializer='zeros'))

Now we go **from a convolutional layer to a fully connected layer**. There we have a **mismatch in dimensions, because the convolutional layer outputs a 3D structure and the fully connected layer assumes a 1D input, so we need to insert a Flatten layer**.

In [7]:
model.add(Flatten())
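To see what flattening does to the shapes, here is a small NumPy sketch. It assumes the second convolutional layer outputs 8 by 8 by 64 per image, which follows from halving 32 twice with stride two and keeping 64 channels. The array is a zero-filled placeholder, not real activations.

```python
import numpy as np

# Placeholder output of the second convolutional layer for one image:
# (batch, height, width, channels).
feature_maps = np.zeros((1, 8, 8, 64))

# Flattening keeps the batch dimension and merges the other three into one.
flat = feature_maps.reshape(feature_maps.shape[0], -1)

print(flat.shape)  # (1, 4096)
```

So the fully connected output layer sees 8 * 8 * 64 = 4096 inputs per image.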

This is **our output layer**, which has **10 neurons**. The **activation is softmax**. We use **Glorot initialization for the weights and zeros for the biases**.

In [8]:
model.add(Dense(10, activation='softmax',
                kernel_initializer='glorot_uniform',
                bias_initializer='zeros'))
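For intuition about the output layer, the following is a minimal NumPy sketch of the softmax function. The input values are made up, and subtracting the maximum is a common numerical-stability trick rather than something stated in this notebook.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # avoid overflow
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Hypothetical raw scores for 10 classes.
scores = np.array([2.0, 1.0, 0.1, -1.0, 0.0, 0.5, -0.5, 1.5, 0.3, -2.0])
probs = softmax(scores)

print(probs.sum())           # approximately 1.0
print(int(np.argmax(probs))) # index of the largest score
```

The class with the largest raw score also gets the largest probability, so the predicted class is unchanged by softmax.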

Since we use softmax, we want to use **categorical cross-entropy as the loss function**. We use the **Adam optimizer**. We also want to report the **accuracy**, and then we print **the summary of the model**.

In [10]:
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics =['accuracy'])
model.summary()
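As a rough sketch of what the loss measures, categorical cross-entropy for a one-hot label is the negative log of the probability assigned to the correct class. The helper below is a simplified NumPy version written for illustration (the clipping constant `eps` is an assumption), not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-example cross-entropy between one-hot labels and probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)               # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Made-up 3-class example: the true class is index 1.
y_true = np.array([[0.0, 1.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05]])
unsure = np.array([[0.40, 0.30, 0.30]])

loss_confident = categorical_crossentropy(y_true, confident)[0]
loss_unsure = categorical_crossentropy(y_true, unsure)[0]

print(loss_confident, loss_unsure)
```

A confident correct prediction gives a loss near zero, while spreading probability over the wrong classes gives a larger loss.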

Then comes the training of the network. We call the **fit function with the training images and training labels, and pass the test images and test labels as validation_data**. We also specify the number of **epochs**, the **batch** size, and so on.
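As a quick check on what these settings imply, the number of weight updates per epoch is the number of training images divided by the batch size, rounded up. The sketch assumes the standard CIFAR-10 split of 50,000 training images; the variable names are chosen for this example.

```python
import math

num_train_images = 50_000   # standard CIFAR-10 training split (assumed here)
batch_size = 32
epochs = 32

# The last batch of each epoch is smaller when the division is not exact.
steps_per_epoch = math.ceil(num_train_images / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)  # 1563
print(total_steps)
```

The 1563 here is the same number that shows up as 1563/1563 in the training log.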

In [11]:
history = model.fit(
    train_images, train_labels, validation_data =
    (test_images, test_labels), epochs=EPOCHS,
    batch_size=BATCH_SIZE, verbose=2, shuffle=True)

Epoch 1/32
1563/1563 - 50s - 32ms/step - accuracy: 0.5050 - loss: 1.4025 - val_accuracy: 0.5889 - val_loss: 1.1703
Epoch 2/32
1563/1563 - 45s - 29ms/step - accuracy: 0.6323 - loss: 1.0587 - val_accuracy: 0.6315 - val_loss: 1.0599
Epoch 3/32
1563/1563 - 48s - 30ms/step - accuracy: 0.6791 - loss: 0.9205 - val_accuracy: 0.6364 - val_loss: 1.0527
Epoch 4/32
1563/1563 - 77s - 49ms/step - accuracy: 0.7139 - loss: 0.8240 - val_accuracy: 0.6491 - val_loss: 1.0293
Epoch 5/32
1563/1563 - 84s - 54ms/step - accuracy: 0.7406 - loss: 0.7447 - val_accuracy: 0.6313 - val_loss: 1.1068
Epoch 6/32
1563/1563 - 44s - 28ms/step - accuracy: 0.7652 - loss: 0.6753 - val_accuracy: 0.6532 - val_loss: 1.0832
Epoch 7/32
1563/1563 - 81s - 52ms/step - accuracy: 0.7850 - loss: 0.6154 - val_accuracy: 0.6402 - val_loss: 1.1679
Epoch 8/32
1563/1563 - 46s - 29ms/step - accuracy: 0.8013 - loss: 0.5634 - val_accuracy: 0.6450 - val_loss: 1.2029
Epoch 9/32
1563/1563 - 85s - 54ms/step - accuracy: 0.8185 - loss: 0.5131 - val_a

We see now, **after 32 epochs**, that the accuracy on the training data is pretty good: 95%.

However, if we look at the **validation accuracy**, which is **the accuracy on the test dataset, we only reach 61%**. That is a clear **indication of overfitting**: **the network learns the training dataset but does less well on the test dataset**.
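One simple way to see the overfitting is to track the gap between training and validation accuracy per epoch. The sketch below uses the first eight epochs copied from the log above; in the notebook itself the same lists would normally be available as history.history['accuracy'] and history.history['val_accuracy'].

```python
# Accuracies for epochs 1 to 8, copied from the training log above.
train_acc = [0.5050, 0.6323, 0.6791, 0.7139, 0.7406, 0.7652, 0.7850, 0.8013]
val_acc = [0.5889, 0.6315, 0.6364, 0.6491, 0.6313, 0.6532, 0.6402, 0.6450]

# Gap between training and validation accuracy for each epoch.
gaps = [t - v for t, v in zip(train_acc, val_acc)]

# Epoch (1-based) with the best validation accuracy among these eight.
best_epoch = val_acc.index(max(val_acc)) + 1

for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: gap = {gap:+.4f}")
print("best validation accuracy at epoch", best_epoch)
```

The training accuracy keeps climbing while the validation accuracy stalls in the mid-60s, so the gap grows from epoch to epoch.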

# Modified CNN Version

We're trying to **classify each image into one of 10 categories**, and in 61% of the cases we get it right with a simple network, but I think we can do better. So let's move on to a slightly more complex network.

The initial code is the same as before, but now the network definition is different. We have a **first convolutional layer, followed by a dropout layer for regularization, another convolutional layer, and then another dropout**, **then another convolution, a dropout, and a final convolution**. So we **have four convolutional layers**. After that we **have a max pooling layer**, **a dropout again**, and then **we flatten the output** and add **three fully connected layers**, with dropout after each of the first two. So we have **four convolutional layers and three fully connected layers**. We **use ReLU neurons for all the convolutional and fully connected layers, except for the last one, which uses softmax activation**.

In [15]:
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import MaxPooling2D



In [16]:
# Model with 4 convolutional and 3 fully-connected layers
# using dropout and max-pooling.
model = Sequential()
model.add(Conv2D(64, (4, 4), activation='relu', padding='same',
                 input_shape=(32, 32, 3)))
model.add(Dropout(0.2))
model.add(Conv2D(64, (2, 2), activation='relu', padding='same',
                 strides=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
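The two new layer types can be sketched in a few lines of NumPy. This is an illustrative approximation under stated assumptions, not the Keras code: `dropout` zeroes each value with probability `rate` and rescales the rest by 1 / (1 - rate), as is done during training, and `max_pool_2x2` handles a single 2D feature map with even height and width.

```python
import numpy as np

def dropout(x, rate, rng):
    """Training-time dropout: zero out values at random, rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a single (height, width) map."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)

# Dropout on a map of ones: each entry becomes either 0 or 1 / 0.8 = 1.25.
dropped = dropout(np.ones((4, 4)), rate=0.2, rng=rng)

# Max pooling keeps the largest value in each 2x2 block.
grid = np.arange(16.0).reshape(4, 4)
pooled = max_pool_2x2(grid)

print(dropped)
print(pooled)  # [[ 5.  7.] [13. 15.]]
```

At inference time dropout is switched off, which is why the rescaling is applied during training.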


The rest is the same as before. We print the summary of the network, which shows the **number of trainable parameters** for each of the layers.

In [17]:
# Compile and train the model.
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics =['accuracy'])
model.summary()
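The parameter counts in the summary can be worked out by hand. The sketch below does that arithmetic for both models; the helper names and the layer labels in the dictionary are assumptions for this example (Keras assigns its own layer names), and dropout, pooling, and flatten layers are left out because they have no trainable parameters.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels):
    """Kernel weights plus one bias per output channel."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

# Deeper model: spatial size goes 32 -> 32 -> 16 -> 16 -> 16 -> 8 (pooling),
# so the first dense layer sees 8 * 8 * 32 inputs.
deep_layers = {
    "conv_1": conv2d_params(4, 4, 3, 64),
    "conv_2": conv2d_params(2, 2, 64, 64),
    "conv_3": conv2d_params(3, 3, 64, 32),
    "conv_4": conv2d_params(3, 3, 32, 32),
    "dense_1": dense_params(8 * 8 * 32, 64),
    "dense_2": dense_params(64, 64),
    "dense_3": dense_params(64, 10),
}
deep_total = sum(deep_layers.values())

# First model, for comparison: 32 -> 16 -> 8 spatially with 64 channels.
simple_total = (conv2d_params(5, 5, 3, 64)
                + conv2d_params(3, 3, 64, 64)
                + dense_params(8 * 8 * 64, 10))

for name, count in deep_layers.items():
    print(f"{name}: {count}")
print("deeper model total:", deep_total)   # 183242
print("first model total:", simple_total)  # 82762
```

If the arithmetic matches the shapes used here, these totals should agree with the trainable parameter counts that model.summary() reports.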

After training for 32 epochs, the **training accuracy is at 75%** and the **validation accuracy is at 75.6%**, so almost 76%.

So with **this deeper network with more layers, and with dropout regularization added, not only did we manage to get the overfitting under control, but we also improved the validation accuracy** significantly.

So now in 76% of the cases, we manage to **classify the images in this dataset correctly**.

In [18]:
history = model.fit(
    train_images, train_labels, validation_data =
    (test_images, test_labels), epochs=EPOCHS,
    batch_size=BATCH_SIZE, verbose=2, shuffle=True)

Epoch 1/32
1563/1563 - 221s - 141ms/step - accuracy: 0.3843 - loss: 1.6769 - val_accuracy: 0.5155 - val_loss: 1.3512
Epoch 2/32
1563/1563 - 268s - 171ms/step - accuracy: 0.5288 - loss: 1.3172 - val_accuracy: 0.6170 - val_loss: 1.1050
Epoch 3/32
1563/1563 - 273s - 175ms/step - accuracy: 0.5861 - loss: 1.1698 - val_accuracy: 0.6579 - val_loss: 0.9968
Epoch 4/32
1563/1563 - 222s - 142ms/step - accuracy: 0.6208 - loss: 1.0812 - val_accuracy: 0.6689 - val_loss: 0.9621
Epoch 5/32
1563/1563 - 225s - 144ms/step - accuracy: 0.6402 - loss: 1.0267 - val_accuracy: 0.6868 - val_loss: 0.9160
Epoch 6/32
1563/1563 - 223s - 143ms/step - accuracy: 0.6561 - loss: 0.9836 - val_accuracy: 0.6936 - val_loss: 0.8937
Epoch 7/32
1563/1563 - 220s - 141ms/step - accuracy: 0.6670 - loss: 0.9492 - val_accuracy: 0.6978 - val_loss: 0.8731
Epoch 8/32
1563/1563 - 267s - 171ms/step - accuracy: 0.6760 - loss: 0.9262 - val_accuracy: 0.6948 - val_loss: 0.8845
Epoch 9/32
1563/1563 - 217s - 139ms/step - accuracy: 0.6846 - lo