
## Examples
In this section we will take a look at some simple implementations of common CNNs.

#### Building a CNN from scratch

In [2]:
import tensorflow as tf

In [3]:
import numpy as np

In [4]:
# extra code – loads the Fashion MNIST dataset, adds the channels axis to the inputs,
#              scales the values to the 0-1 range, and splits the dataset
mnist = tf.keras.datasets.fashion_mnist.load_data()
(X_train_full, y_train_full), (X_test, y_test) = mnist
X_train_full = np.expand_dims(X_train_full, axis=-1).astype(np.float32) / 255
X_test = np.expand_dims(X_test.astype(np.float32), axis=-1) / 255
X_train, X_valid = X_train_full[:-5000], X_train_full[-5000:]
y_train, y_valid = y_train_full[:-5000], y_train_full[-5000:]


Here we have a basic implementation of a CNN with dropout layers to improve network regularization.

In [5]:
from functools import partial

tf.random.set_seed(42)  # extra code – ensures reproducibility
DefaultConv2D = partial(tf.keras.layers.Conv2D, kernel_size=3, padding="same",
                        activation="relu", kernel_initializer="he_normal")
model = tf.keras.Sequential([
    DefaultConv2D(filters=64, kernel_size=7, input_shape=[28, 28, 1]),
    tf.keras.layers.MaxPool2D(),
    DefaultConv2D(filters=128),
    DefaultConv2D(filters=128),
    tf.keras.layers.MaxPool2D(),
    DefaultConv2D(filters=256),
    DefaultConv2D(filters=256),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=128, activation="relu",
                          kernel_initializer="he_normal"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(units=64, activation="relu",
                          kernel_initializer="he_normal"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(units=10, activation="softmax")
])
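
To make the role of the `Dropout(0.5)` layers more concrete, here is a minimal framework-free sketch of inverted dropout, which is our understanding of what the Keras layer does: during training each activation is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)`, while at inference the layer is the identity. The function name `dropout` and the seed are illustrative assumptions, not part of the Keras API.

```python
import random

def dropout(values, rate, training, rng):
    """Inverted dropout applied to a plain list of activations (illustrative)."""
    if not training:
        return list(values)          # inference: the layer passes inputs through
    keep_scale = 1.0 / (1.0 - rate)  # rescale survivors to preserve the expected sum
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(42)  # hypothetical seed, only for repeatability
activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, rate=0.5, training=True, rng=rng)
unchanged = dropout(activations, rate=0.5, training=False, rng=rng)
print(dropped)
print(unchanged)
```

Because a different random subset of neurons is silenced at every training step, the dense layers cannot rely on any single activation, which is what gives the regularizing effect.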



In [11]:
model.summary()
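
Before reading the summary, it can help to follow the feature-map size by hand. The sketch below is plain Python, not Keras: it assumes that `padding="same"` with a stride of 1 keeps the spatial size and that `MaxPool2D()` uses its default 2x2 pooling window. The helper names `conv_same` and `max_pool` are hypothetical.

```python
def conv_same(size, stride=1):
    """Spatial output size of a convolution with padding="same"."""
    return -(-size // stride)  # ceil(size / stride)

def max_pool(size, pool=2):
    """Spatial output size of max pooling with a pool x pool window and no padding."""
    return size // pool

size = 28                      # Fashion MNIST images are 28x28
trace = [size]
for convs_in_block in (1, 2, 2):  # conv layers before each pooling layer
    for _ in range(convs_in_block):
        size = conv_same(size)
    size = max_pool(size)
    trace.append(size)

flattened = size * size * 256  # the last convolutional block has 256 filters
print(trace)      # [28, 14, 7, 3]
print(flattened)  # 2304
```

The `Flatten` layer therefore feeds 2304 values into the first dense layer.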

In [12]:
# compiles, fits, evaluates, and uses the model to make predictions
model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam",
              metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_valid, y_valid))
score = model.evaluate(X_test, y_test)
X_new = X_test[:10]  # pretend we have new images
y_pred = model.predict(X_new)

# avg time needed for training:  2h 20m
# final validation accuracy: 91%

Epoch 1/10
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 775s 447ms/step - accuracy: 0.6223 - loss: 1.0761 - val_accuracy: 0.8756 - val_loss: 0.3639
Epoch 2/10
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 768s 447ms/step - accuracy: 0.8561 - loss: 0.4359 - val_accuracy: 0.8936 - val_loss: 0.2994
Epoch 3/10
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 801s 446ms/step - accuracy: 0.8804 - loss: 0.3557 - val_accuracy: 0.8976 - val_loss: 0.2847
Epoch 4/10
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 757s 440ms/step - accuracy: 0.8950 - loss: 0.3187 - val_accuracy: 0.8998 - val_loss: 0.2730
Epoch 5/10
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 806s 443ms/step - accuracy: 0.9023 - loss: 0.2846 - val_accuracy: 0.9028 - val_loss: 0.2761
Epoch 6/10
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 816s 451ms/step - accuracy: 0.9084 - loss: 0.2721 - val_accuracy: 0.9064 - val_loss:

Now we build a ResNet-34 by hand, that is, a residual network with 34 layers, counting only the convolutional layers and the fully connected layer. The network is a stack of residual units (RUs). Each RU consists of two convolutional layers with 3x3 filters, whose number varies along the network; the stride is 1, except in the first layer of an RU that downsamples. The output of the convolutional layers is added to the output of the skip connection, which carries the input of the RU.

First, we create a custom class using the low-level Keras API: a new layer, built from scratch, that inherits all the methods of the Keras `Layer` class. This will be our residual unit. To do so, we only have to define two things:

1. How to initialize the layer, defining the `__init__` method
2. How the forward step of this layer works, by writing a custom `call` method.
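
The two steps above can be sketched without any framework. The class below is only a stand-in with the same two-method shape as a custom layer: `TinyResidualUnit` and its `scale` parameter are hypothetical, and a simple multiplication replaces the two convolutional layers.

```python
def relu(vector):
    """Element-wise ReLU on a plain list."""
    return [max(0.0, v) for v in vector]

class TinyResidualUnit:
    """Illustrative stand-in for a residual unit, not a Keras layer."""

    def __init__(self, scale):
        # Step 1: initialization stores whatever the forward pass needs.
        self.scale = scale  # hypothetical parameter standing in for the conv weights

    def main_path(self, inputs):
        # Stand-in for the two convolutional layers of the unit.
        return [self.scale * v for v in inputs]

    def call(self, inputs):
        # Step 2: forward pass. Add the skip connection (the unchanged input)
        # to the main-path output, then apply the activation.
        z = self.main_path(inputs)
        return relu([a + b for a, b in zip(z, inputs)])

unit = TinyResidualUnit(scale=0.5)
output = unit.call([2.0, -4.0])
print(output)  # [3.0, 0.0]
```

Note that when the main path outputs zeros, the unit simply passes its (non-negative) input through, which is one intuition for why residual networks are easier to train.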

In [13]:
DefaultConv2D = partial(tf.keras.layers.Conv2D, kernel_size=3, strides=1,
                        padding="same", kernel_initializer="he_normal",
                        use_bias=False)

class ResidualUnit(tf.keras.layers.Layer): # this new class inherits all the methods of its parent class, i.e. the Layer class
    def __init__(self, filters, strides=1, activation="relu", **kwargs):
        super().__init__(**kwargs)
        self.activation = tf.keras.activations.get(activation)
        self.main_layers = [
            DefaultConv2D(filters, strides=strides),
            tf.keras.layers.BatchNormalization(),
            self.activation,
            DefaultConv2D(filters),
            tf.keras.layers.BatchNormalization()
        ]
        self.skip_layers = []
        if strides > 1: # the main path downsamples, so the skip path needs a 1x1 convolution to match its output shape
            self.skip_layers = [
                DefaultConv2D(filters, kernel_size=1, strides=strides),
                tf.keras.layers.BatchNormalization()
            ]

    def call(self, inputs): # defines the forward pass of the residual unit
        Z = inputs
        for layer in self.main_layers:
            Z = layer(Z)
        skip_Z = inputs
        for layer in self.skip_layers:
            skip_Z = layer(skip_Z)
        return self.activation(Z + skip_Z)

Now we can use our new layer to build the network, following the structure of the ResNet-34.

In [14]:
from tensorflow import keras
# building a ResNet
# a ResNet-34 stacks 16 residual units; on top we add global average pooling and an output layer
# suited to the problem (in this case a softmax layer with 10 neurons)
model = keras.models.Sequential()
model.add(keras.layers.Conv2D(64,7,strides=2, input_shape = [224,224,3], padding = "same", use_bias = False))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Activation("relu"))
model.add(keras.layers.MaxPool2D(pool_size = 3, strides = 2, padding = "same"))
prev_filters = 64
for filters in [64] * 3 + [128] * 4 + [256] * 6 + [512] * 3: # 3, 4, 6 and 3 residual units with 64, 128, 256 and 512 filters
    strides = 1 if filters == prev_filters else 2
    model.add(ResidualUnit(filters, strides=strides))
    prev_filters = filters
model.add(keras.layers.GlobalAvgPool2D())
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(10, activation = "softmax"))




In [15]:
# 21 million parameters
model.summary()
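
As a sanity check on the "34 layers" and "21 million parameters" figures, the following plain-Python sketch replays the loop used to build the network and counts layers and convolution weights. It is a rough estimate under stated assumptions: only convolution kernels are counted (no biases, since `use_bias=False`), and batch-normalization and dense-layer parameters are left out, as are the 1x1 skip convolutions in the layer count.

```python
filters_per_unit = [64] * 3 + [128] * 4 + [256] * 6 + [512] * 3

conv_layers = 1                # the initial 7x7 convolution
conv_weights = 7 * 7 * 3 * 64  # its kernel: 7x7 window, 3 input channels, 64 filters
strides_per_unit = []
prev_filters = 64
for filters in filters_per_unit:
    strides = 1 if filters == prev_filters else 2
    strides_per_unit.append(strides)
    conv_layers += 2                                 # two 3x3 convs on the main path
    conv_weights += 3 * 3 * prev_filters * filters   # first 3x3 convolution
    conv_weights += 3 * 3 * filters * filters        # second 3x3 convolution
    if strides > 1:
        conv_weights += prev_filters * filters       # 1x1 convolution on the skip path
    prev_filters = filters

total_layers = conv_layers + 1  # plus the dense output layer
print(len(filters_per_unit), "residual units,", total_layers, "layers")
print(f"{conv_weights:,} convolution weights")
```

The convolution kernels alone account for a little over 21 million weights, which matches the order of magnitude reported by `model.summary()`.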

#### Using pretrained models from Keras

Keras offers pretrained networks that can be loaded with a single line of code. For example, we can load a ResNet50 network trained on the ImageNet database with the following code:

In [16]:
model = keras.applications.resnet50.ResNet50(weights = "imagenet")



To use this model we need to address two key points:

1. the input must have the right shape
2. the input must be preprocessed in the right way

For the first point, all we have to do is check the Keras online documentation and resize the input to the expected shape:

In [None]:
image_resized = tf.image.resize(image, [224,224])

For the second point, Keras provides a `preprocess_input()` function for each model that takes care of it:

In [None]:
# these functions assume that the pixel values range from 0 to 255
inputs = keras.applications.resnet50.preprocess_input(image_resized*255)
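
To give an idea of what this preprocessing does, here is a framework-free sketch for a single pixel. It assumes, to the best of our knowledge, that the ResNet50 `preprocess_input()` converts RGB to BGR and subtracts the per-channel ImageNet means without any further scaling; the helper `preprocess_pixel` is hypothetical and the Keras documentation should be checked for the authoritative behaviour.

```python
# Per-channel ImageNet means in BGR order (assumed values for this style of preprocessing).
IMAGENET_BGR_MEANS = (103.939, 116.779, 123.68)

def preprocess_pixel(rgb):
    """Convert one RGB pixel with values in 0-255 to zero-centered BGR."""
    r, g, b = rgb
    bgr = (b, g, r)  # swap the channel order
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_BGR_MEANS))

pure_red = preprocess_pixel((255.0, 0.0, 0.0))
print(pure_red)
```

This is why the pixel values must be in the 0 to 255 range before the call: the means being subtracted are expressed on that scale.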

Then, to use the model, we just have to call `predict()`:

In [None]:
Y_proba = model.predict(inputs)
# the output is a matrix with one row per input image and one column per class; each entry is
# the probability, estimated by the model, that the image belongs to that class.
top_k = keras.applications.resnet50.decode_predictions(Y_proba, top = 3)
for image_index in range(len(top_k)):  # one entry per input image
  print("Image #{}".format(image_index))
  for class_id, name, y_proba in top_k[image_index]:
    print(" {} - {:12s} {:.2f}%".format(class_id, name, y_proba * 100))
  print()


The `decode_predictions()` function returns, for each image, a list of the top k predictions, where each prediction is a tuple containing the class identifier, its name and the corresponding confidence score.
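
The idea behind this decoding step can be sketched in plain Python. The function `decode_top_k`, the class names and the probabilities below are all made up for illustration; the real Keras function works on ImageNet classes and uses WordNet identifiers rather than integer indices.

```python
def decode_top_k(probabilities, class_names, k=3):
    """Return the k most probable classes as (index, name, probability) tuples."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(i, class_names[i], probabilities[i]) for i in ranked[:k]]

names = ["cat", "dog", "car", "tree"]  # hypothetical classes
row = [0.10, 0.60, 0.05, 0.25]         # one row of a probability matrix
top = decode_top_k(row, names, k=3)
for class_id, name, proba in top:
    print(" {} - {:12s} {:.2f}%".format(class_id, name, proba * 100))
```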

To take a look at all the already available pretrained models in keras, visit the following link:
https://keras.io/api/applications/

#### Transfer Learning with Keras

When little training data is available, one can use the pretrained Keras models for transfer learning. The idea is to take a pretrained model, train only the new top layers first, and then unfreeze all its layers to complete the training for the task at hand.

In [20]:
import tensorflow_datasets as tfds

# importing the dataset
dataset, info = tfds.load("tf_flowers", as_supervised = True, with_info = True)
dataset_size = info.splits["train"].num_examples # 3670
class_names = info.features["label"].names # ['dandelion',...]
n_classes = info.features["label"].num_classes # 5

# splitting the dataset in training, validation and test set
test_set, valid_set, train_set = tfds.load(
    "tf_flowers",
    split=["train[:10%]", "train[10%:25%]", "train[25%:]"],
    as_supervised = True
    )

# preprocessing the images: the CNN we will import expects 224x224 images
def preprocess(image, label):
  resized_image = tf.image.resize(image, [224,224])
  final_image = keras.applications.xception.preprocess_input(resized_image)
  return final_image, label

# applying the function to the three datasets we created, shuffling the training set and adding batching and prefetching
batch_size = 32
train_set = train_set.shuffle(1000)
train_set = train_set.map(preprocess).batch(batch_size).prefetch(1)
valid_set = valid_set.map(preprocess).batch(batch_size).prefetch(1)
test_set = test_set.map(preprocess).batch(batch_size).prefetch(1)

# importing an Xception pretrained model on ImageNet, excluding the top of the network (the global average pooling layer and the dense output layer)
# note that we assume to use the Keras functional API
base_model = keras.applications.xception.Xception(weights = "imagenet", include_top = False)
avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
output = keras.layers.Dense(n_classes, activation = "softmax")(avg)
model = keras.Model(inputs = base_model.input, outputs = output)

# freeze the weights of the pretrained layer for the first training epochs
for layer in base_model.layers:
  layer.trainable = False

# compiling the model
optimizer = keras.optimizers.SGD(learning_rate= 0.2, momentum = 0.9)
model.compile(loss = "sparse_categorical_crossentropy", optimizer = optimizer, metrics = ["accuracy"])

# training the model
history = model.fit(train_set, epochs = 5, validation_data = valid_set)

Epoch 1/5
86/86 ━━━━━━━━━━━━━━━━━━━━ 850s 10s/step - accuracy: 0.6992 - loss: 1.7484 - val_accuracy: 0.7731 - val_loss: 2.3354
Epoch 2/5
86/86 ━━━━━━━━━━━━━━━━━━━━ 768s 9s/step - accuracy: 0.8522 - loss: 1.3104 - val_accuracy: 0.8185 - val_loss: 1.5946
Epoch 3/5
86/86 ━━━━━━━━━━━━━━━━━━━━ 800s 9s/step - accuracy: 0.9191 - loss: 0.6064 - val_accuracy: 0.8457 - val_loss: 1.4804
Epoch 4/5
86/86 ━━━━━━━━━━━━━━━━━━━━ 803s 9s/step - accuracy: 0.9475 - loss: 0.3429 - val_accuracy: 0.8348 - val_loss: 1.3990
Epoch 5/5
86/86 ━━━━━━━━━━━━━━━━━━━━ 803s 9s/step - accuracy: 0.9595 - loss: 0.1944 - val_accuracy: 0.8348 - val_loss: 1.4965


After the first few epochs of training, once the model reaches an accuracy of about 75%-80%, the top layers are reasonably well trained, so we can unfreeze the base model layers and continue training.

In [None]:
for layer in base_model.layers:
  layer.trainable = True

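# compiling the model
# the legacy `lr` and `decay` arguments are not accepted by recent Keras optimizers;
# InverseTimeDecay with decay_steps = 1 reproduces lr / (1 + decay * step)
lr_schedule = keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate = 0.2, decay_steps = 1, decay_rate = 0.01)
optimizer = keras.optimizers.SGD(learning_rate = lr_schedule, momentum = 0.9)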
model.compile(loss = "sparse_categorical_crossentropy", optimizer = optimizer, metrics = ["accuracy"])

# training the model
history = model.fit(train_set, epochs = 5, validation_data = valid_set)

# final accuracy: 95%
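
The learning-rate decay used in this fine-tuning step (initial rate 0.2, decay 0.01) can be sketched as inverse-time decay applied at every batch update. This is our understanding of how the legacy Keras `decay` argument behaved; the function name `decayed_learning_rate` is hypothetical.

```python
def decayed_learning_rate(initial_lr, decay, step):
    """Inverse-time decay: the rate shrinks as 1 / (1 + decay * step)."""
    return initial_lr / (1.0 + decay * step)

steps_per_epoch = 86  # number of batches per epoch, taken from the training log above
for epoch in range(5):
    step = epoch * steps_per_epoch
    rate = decayed_learning_rate(0.2, 0.01, step)
    print(f"start of epoch {epoch + 1}: learning rate = {rate:.4f}")
```

With these values the learning rate is already halved after 100 updates, so the pretrained weights are modified more and more gently as fine-tuning proceeds.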

#### A quick brush-up on data augmentation

Data augmentation is a common practice in machine learning: it refers to the process of modifying the original images in various ways to obtain new ones. Keras provides built-in image modification tools for this purpose. As example images we will take the ones previously used to train the Xception network.

In [None]:
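# displays the first 9 images in the validation set
import matplotlib.pyplot as plt

# reload the raw splits: the datasets used above were already preprocessed and batched
test_set, valid_set, train_set = tfds.load(
    "tf_flowers",
    split=["train[:10%]", "train[10%:25%]", "train[25%:]"],
    as_supervised = True
    )

plt.figure(figsize=(12, 10))
index = 0
for image, label in valid_set.take(9):
    index += 1
    plt.subplot(3, 3, index)
    plt.imshow(image)
    plt.title(f"Class: {class_names[label]}")
    plt.axis("off")

plt.show()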

In [None]:
# preprocessing the images
batch_size = 32
preprocess = tf.keras.Sequential([
    tf.keras.layers.Resizing(height=224, width=224, crop_to_aspect_ratio=True), # resize, cropping to preserve the aspect ratio
    tf.keras.layers.Lambda(tf.keras.applications.xception.preprocess_input) # Xception-specific preprocessing
])
train_set = train_set.map(lambda X, y: (preprocess(X), y)) # applying the preprocess network to the images
train_set = train_set.shuffle(1000, seed=42).batch(batch_size).prefetch(1)
valid_set = valid_set.map(lambda X, y: (preprocess(X), y)).batch(batch_size)
test_set = test_set.map(lambda X, y: (preprocess(X), y)).batch(batch_size)


In [None]:
# let's take a look at the preprocessed images
plt.figure(figsize=(12, 12))
for X_batch, y_batch in valid_set.take(1):
    for index in range(9):
        plt.subplot(3, 3, index + 1)
        plt.imshow((X_batch[index] + 1) / 2)  # rescale to 0–1 for imshow()
        plt.title(f"Class: {class_names[y_batch[index]]}")
        plt.axis("off")

plt.show()
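
The `(X_batch[index] + 1) / 2` rescaling in the plot above relies on the Xception preprocessing mapping pixel values from the 0 to 255 range to the -1 to 1 range. A minimal sketch of that mapping and of its inverse for display, assuming this is indeed what `xception.preprocess_input` does (the function names below are hypothetical):

```python
def xception_scale(pixel):
    """Map a pixel value from [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0

def to_unit_range(scaled):
    """Map a preprocessed value from [-1, 1] back to [0, 1] for plotting."""
    return (scaled + 1.0) / 2.0

for pixel in (0.0, 127.5, 255.0):
    scaled = xception_scale(pixel)
    print(pixel, "->", scaled, "->", to_unit_range(scaled))
```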

Now we define a data augmentation network using the augmentation layers already implemented in Keras (the full list is available here: https://keras.io/api/layers/preprocessing_layers/image_augmentation/).

In [None]:
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip(mode="horizontal", seed=42), # flip the images randomly
    tf.keras.layers.RandomRotation(factor=0.05, seed=42), # rotate the images randomly
    tf.keras.layers.RandomContrast(factor=0.2, seed=42) # apply a random contrast to the image
])
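
To make the effect of these layers more tangible, here is a plain-Python sketch of two of the transformations on a tiny grayscale image stored as nested lists. It is only an approximation of the Keras layers under stated assumptions: the flip happens with probability 0.5, the contrast factor is drawn uniformly from `[1 - 0.2, 1 + 0.2]` and applied around the image mean, and the rotation is left out. All function names are hypothetical.

```python
import random

def flip_horizontal(image):
    """Mirror the image left to right by reversing every row."""
    return [list(reversed(row)) for row in image]

def adjust_contrast(image, factor):
    """Stretch (factor > 1) or squeeze (factor < 1) pixel values around the mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [[(p - mean) * factor + mean for p in row] for row in image]

def augment(image, rng, contrast_range=0.2):
    """Apply a random horizontal flip followed by a random contrast change."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    factor = rng.uniform(1.0 - contrast_range, 1.0 + contrast_range)
    return adjust_contrast(image, factor)

image = [[0.0, 0.5], [1.0, 0.5]]          # made-up 2x2 grayscale image
augmented = augment(image, random.Random(42))
print(augmented)
```

Each call with a different random state yields a slightly different image with the same label, which is how augmentation enlarges the effective training set.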

In [None]:
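# let's take a look at the augmented images
plt.figure(figsize=(12, 12))
for X_batch, y_batch in valid_set.take(1):
    # augmentation layers are only active in training mode
    X_batch_augmented = data_augmentation(X_batch, training=True)
    for index in range(9):
        plt.subplot(3, 3, index + 1)
        # rescale to 0–1 for imshow() and clip, since RandomContrast can push values out of range
        plt.imshow(np.clip((X_batch_augmented[index] + 1) / 2, 0, 1))
        plt.title(f"Class: {class_names[y_batch[index]]}")
        plt.axis("off")

plt.show()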