# Basics of image processing with Python

## Lesson 6

In this lesson, we will look at the fundamentals of deep learning. Please treat this cheat sheet as a very basic introduction without any claim to completeness. The aim of this lecture is to give an overview of the principles, not a step-by-step guide to a fully functional deep learning pipeline.

We will be using TensorFlow in the examples because its environment is simple to set up. However, if you want to dive deeper into deep learning, you should use PyTorch instead.

To start with, please create a new environment from the yml file included in this lesson's folder. You will then need to import the following packages:

In [None]:
import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt

We first define some constants for the image dimensions and the training batch size:

In [None]:
IMG_HEIGHT = 256
IMG_WIDTH = 256
BATCH_SIZE = 16

We then specify the path to our data. We can either give a single path to the training data and have it split into training and validation data, or keep the validation data in a separate folder. Additionally, it is always good practice to hold back some independent test data to verify the model performance. As a rule of thumb, about 60-80% of your data should be used for training, 10-20% for validation and 10-20% for testing. The exact ratio may vary depending on your application and the availability of data. Here, we assume that we split the training data.

In [None]:
train_dir = 'path/to/training/data'
# optionally, you can also specify a path to the validation data if it is not split off from train_dir
# val_dir = 'path/to/validation/data'
test_dir = 'path/to/test/data'
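
As a rough illustration of the ratios mentioned above, the following sketch computes how many images end up in each subset. It is plain Python and independent of TensorFlow; the helper name `split_counts`, the 70/15/15 ratio and the dataset size of 1000 images are assumptions chosen for illustration only.

```python
def split_counts(n_images, train_frac=0.7, val_frac=0.15):
    """Return (train, val, test) image counts for a dataset of n_images.

    Hypothetical helper: the test set receives whatever remains after
    the training and validation shares have been rounded.
    """
    if not 0 < train_frac + val_frac <= 1:
        raise ValueError("train_frac + val_frac must be in (0, 1]")
    n_train = round(n_images * train_frac)
    # Clamp so that rounding can never hand out more images than exist
    n_val = min(round(n_images * val_frac), n_images - n_train)
    n_test = n_images - n_train - n_val
    return n_train, n_val, n_test


# Example with an assumed dataset of 1000 images
print(split_counts(1000))
```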

Next, we load the datasets with `tf.keras.utils.image_dataset_from_directory` (this function may be deprecated in the future; we include it here because it is very simple to use). It accepts the following arguments:
* `directory` : directory where the image data is saved
* `validation_split` : optional, float between 0 and 1 giving the fraction of the input data to reserve for validation. If not specified, the dataset is not split.
* `subset` : if `validation_split` is used, this selects which part is returned: `"training"`, `"validation"` or `"both"`. With `"both"`, you receive two datasets, see below.
* `seed` : seed for the random shuffling. Set it whenever `validation_split` is used together with shuffling, so that the training and validation subsets do not overlap.
* `image_size` : all images in a batch must have the same size, so every image is resized to this size, given as `(height, width)`. If not given, it defaults to `(256, 256)`.
* `batch_size` : the number of images per batch. During training, the weights are updated once per batch. Small batches give more frequent but noisier updates; large batches give smoother updates but need more memory.

More parameters can be specified; please check the TensorFlow documentation if you are interested. The class labels are added automatically if the folders given as training and test directories are further split into one sub-folder per class.

In [None]:
train_ds, val_ds = tf.keras.utils.image_dataset_from_directory(
    directory=train_dir,
    validation_split=0.2,
    subset="both",
    seed=123,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE
)

test_ds = tf.keras.utils.image_dataset_from_directory(
    directory=test_dir,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE
)
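
The labels are inferred from the folder layout. As a minimal sketch of that convention (plain Python, not the TensorFlow implementation), the hypothetical helper `list_class_names` below lists the sub-folders of a data directory in sorted order, which mirrors how class indices are assigned by default. The class names `cats` and `dogs` are made-up examples.

```python
import tempfile
from pathlib import Path


def list_class_names(data_dir):
    """Return the sorted names of all sub-folders of data_dir."""
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())


# Build a throwaway directory with the expected layout:
#   <data_dir>/cats/...   -> label 0
#   <data_dir>/dogs/...   -> label 1
with tempfile.TemporaryDirectory() as tmp:
    for name in ("dogs", "cats"):
        (Path(tmp) / name).mkdir()
    class_names = list_class_names(tmp)

for index, name in enumerate(class_names):
    print(index, name)
```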

Now, the input images have to be rescaled to a value range from 0 to 1. Assuming the input data is 8-bit encoded, you can use `tf.keras.layers.Rescaling` and pass `1./255` as argument, which maps the range 0-255 to the range 0-1.

In [None]:
normalization_layer = tf.keras.layers.Rescaling(1./255)

Next, this transformation is applied to the individual datasets by mapping the normalization onto each input x, while the label y remains unchanged:

In [None]:
normalized_train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
normalized_val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))
normalized_test_ds = test_ds.map(lambda x, y: (normalization_layer(x), y))
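
The rescaling itself is simple arithmetic: each pixel value is multiplied by a scale factor, and an optional offset is added. The sketch below reproduces this in plain Python for single values. The helper name `rescale` is an assumption, and the second loop shows a scale and offset that would map 0-255 to the range -1 to 1 instead.

```python
def rescale(value, scale=1.0 / 255, offset=0.0):
    """Apply value * scale + offset, the arithmetic of a rescaling step."""
    return value * scale + offset


# 8-bit pixel values mapped to the range 0..1
for pixel in (0, 128, 255):
    print(pixel, "->", round(rescale(pixel), 3))

# Alternative mapping to the range -1..1
for pixel in (0, 255):
    print(pixel, "->", round(rescale(pixel, scale=1.0 / 127.5, offset=-1.0), 3))
```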

To enhance the variation within the dataset, it is good practice to apply data augmentation like flipping, rotating or zooming of the images. This is applied randomly within a specified range. This step is not strictly necessary, but may help to achieve a more robust model.

In [None]:
# data_augmentation = tf.keras.Sequential([
#     tf.keras.layers.RandomFlip('horizontal'),
#     tf.keras.layers.RandomRotation(0.2),
#     tf.keras.layers.RandomZoom(0.2),
# ])
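
To get a feeling for what such a layer does, here is a plain-Python sketch of a random horizontal flip on a tiny image stored as nested lists. It is not the TensorFlow implementation; the helper names and the 50% flip probability are assumptions for illustration. The last helper converts a rotation factor into degrees, assuming the factor is interpreted as a fraction of a full turn.

```python
import random


def flip_horizontal(image):
    """Mirror an image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def random_flip_horizontal(image, rng, probability=0.5):
    """Flip the image with the given probability, else return a copy."""
    if rng.random() < probability:
        return flip_horizontal(image)
    return [list(row) for row in image]


def rotation_range_degrees(factor):
    """Convert a rotation factor (fraction of a full turn) to degrees."""
    return -factor * 360.0, factor * 360.0


image = [[1, 2, 3],
         [4, 5, 6]]
rng = random.Random(123)  # fixed seed so the example is repeatable
print(flip_horizontal(image))
print(random_flip_horizontal(image, rng))
print(rotation_range_degrees(0.2))
```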

Next, we will specify the actual model. Using `models.Sequential`, we can define a list of layers in sequential order. If we would like to apply data augmentation, we can also add it here. The most commonly used layer types are:
* `layers.InputLayer` : input layer to the network. The `shape` argument gives the shape of the input image
* `layers.Conv2D` : 2D-convolutional layer is the "heart" of the CNN. In our example, we pass the following arguments:
    * `filters` : the number of filters to be learned, also corresponding to the dimension of the output space
    * `kernel_size` : the dimensions of the kernel, e.g. `(3, 3)`
    * `activation` : the type of activation function used. If not given, no activation function will be used.

* `layers.MaxPooling2D` : applies max pooling to the result. Here we only use the `pool_size` argument, which specifies the size of the pooled pixel neighborhood.
* `layers.Dropout` : dropout layer that randomly ignores nodes during training. We only use the `rate` argument here, which specifies the fraction of inputs that are dropped.
* `layers.Flatten` : "flattens" a multidimensional input to a 1D output. Does not require any arguments.
* `layers.Dense` : dense, fully connected layer. Here, we only use the `units` argument specifying the number of nodes and the `activation` argument specifying the activation function.

More arguments are possible; please check the TensorFlow documentation if you are interested.
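
To make the roles of `Conv2D`, the activation and `MaxPooling2D` more concrete, the following plain-Python sketch applies one hand-written 3x3 filter, a ReLU and a 2x2 max pooling to a tiny single-channel image. In a real network the filter values are learned; the vertical-edge kernel and the helper names here are assumptions for illustration, and the sketch ignores channels, biases and batches.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' mode)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h) for dj in range(k_w))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Set all negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# 6x6 image: dark left half (0), bright right half (1)
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written filter that responds to dark-to-bright vertical edges
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = relu(conv2d_valid(image, kernel))   # 4x4 feature map
pooled = max_pool(features)                    # 2x2 after pooling
print(features)
print(pooled)
```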

The architecture shown here is just a simple example; it is not claimed to solve any particular application!


In [None]:
model = models.Sequential([
    layers.InputLayer(shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    #data_augmentation,
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.1),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.1),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.1),
    layers.Conv2D(256, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.1),
    layers.Flatten(),
    layers.Dense(16, activation='relu'),
    layers.Dropout(0.1),
    layers.Dense(1, activation='sigmoid')
])
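
To see how the image size shrinks through this stack, the sketch below traces the feature map size and the number of trainable parameters in plain Python. It assumes the Keras defaults used above (no padding and stride 1 for the convolutions, non-overlapping 2x2 pooling). The variable names are made up for this illustration, and `model.summary()` remains the authoritative source.

```python
IMG_SIZE = 256          # height and width of the input image
IN_CHANNELS = 3         # RGB
KERNEL = 3              # 3x3 convolution kernels
POOL = 2                # 2x2 max pooling
FILTERS = [32, 64, 128, 256]
DENSE_UNITS = [16, 1]

size, channels, total_params = IMG_SIZE, IN_CHANNELS, 0

for filters in FILTERS:
    # Conv2D without padding shrinks each side by (kernel size - 1)
    size = size - KERNEL + 1
    # Weights: kernel_h * kernel_w * in_channels per filter, plus one bias
    total_params += (KERNEL * KERNEL * channels + 1) * filters
    channels = filters
    # MaxPooling2D divides each side by the pool size (rounded down)
    size = size // POOL
    print(f"after block with {filters} filters: {size} x {size} x {channels}")

# Flatten and Dropout have no trainable parameters
flat_features = size * size * channels
inputs = flat_features
for units in DENSE_UNITS:
    # Dense layer: one weight per input and unit, plus one bias per unit
    total_params += (inputs + 1) * units
    inputs = units

print("flattened features:", flat_features)
print("trainable parameters:", total_params)
```

Under these assumptions, most of the parameters sit in the first dense layer, because it connects every flattened feature to every one of its units.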

Next, we need to compile the model. Here, we also specify the optimizer, the loss function and the metrics. To learn about the different optimizers, loss functions and metrics, please check the documentation and more detailed sources; this is beyond the scope of the course.

In [None]:
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
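
For orientation, the loss and metric chosen here can be written down in a few lines. The following plain-Python sketch computes the binary cross-entropy and the accuracy for a handful of predictions. The helper names, the clipping constant and the 0.5 decision threshold are assumptions of this sketch, and the example values are made up.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_pred))
    return hits / len(y_true)


labels = [1, 0, 1, 0]                 # made-up ground truth
predictions = [0.9, 0.2, 0.4, 0.1]    # made-up sigmoid outputs

print(round(binary_crossentropy(labels, predictions), 4))
print(binary_accuracy(labels, predictions))
```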

Now, we can finally train the model using `model.fit`. We assign the result to the variable `history` so that we can later go back to the individual training epochs and assess the intermediate results. We pass the following arguments:
* `x` : input data for the training. In our case, that's the `normalized_train_ds` we created earlier, i.e. a dataset that yields (inputs, labels) tuples
* `validation_data` : validation data for the training. In our case, that's the `normalized_val_ds` we created earlier, i.e. a dataset that yields (inputs, labels) tuples
* `epochs`: the number of epochs we want to train our model for

In [None]:
EPOCHS = 40
history = model.fit(
    x=normalized_train_ds,
    validation_data=normalized_val_ds,
    epochs=EPOCHS
)
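
How many weight updates does such a training run perform? The sketch below works this out in plain Python, assuming one update per batch. The figure of 800 training images is a made-up example (e.g. 1000 images with 20% held out for validation), and `steps_per_epoch` is a hypothetical helper, not a TensorFlow function.

```python
import math

BATCH_SIZE = 16
EPOCHS = 40


def steps_per_epoch(n_images, batch_size):
    """Number of batches per epoch; the last batch may be smaller."""
    return math.ceil(n_images / batch_size)


n_train_images = 800  # assumed number of training images
steps = steps_per_epoch(n_train_images, BATCH_SIZE)
print("batches per epoch:", steps)
print("weight updates in total:", steps * EPOCHS)
```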

Now, you will see the training running and the classification accuracy (hopefully) increasing step by step. Since the weights are randomly initialized, the results will differ in each run. It is thus good practice to compare the results of several runs to judge the model stability. It can also happen that the model gets "stuck" in a local optimum and does not learn well. This effect becomes visible when the results of multiple runs are compared.
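
One simple way to compare several runs is to collect the final accuracy of each run and look at the mean and the spread. A minimal sketch with the standard library follows; the accuracy values are placeholders, not measured results, and `summarize_runs` is a hypothetical helper.

```python
import statistics


def summarize_runs(accuracies):
    """Return mean, sample standard deviation, minimum and maximum."""
    return {
        "mean": statistics.mean(accuracies),
        "std": statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0,
        "min": min(accuracies),
        "max": max(accuracies),
    }


# Placeholder values standing in for the accuracy of five separate runs
run_accuracies = [0.91, 0.89, 0.93, 0.62, 0.90]
summary = summarize_runs(run_accuracies)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

A run that falls far below the others, like the fourth placeholder value, would hint at a training that got stuck.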

We can now evaluate the model quality on the test data, i.e. data that the model has not "seen" before and that has not yet been used to assess the model quality. To do so, we use `model.evaluate` and pass the dataset we want to evaluate as argument.

In [None]:
test_loss, test_acc = model.evaluate(normalized_test_ds)
print(f'Testing Accuracy: {test_acc * 100:.2f}%')

Finally, it is helpful to track the training progress of the model. For that, we can access the relevant parameters from the history and plot them. In that way, it is also easy to identify overfitting, which typically shows up as a validation loss that rises again while the training loss keeps falling.

In [None]:
training_accuracy = history.history['accuracy']
validation_accuracy = history.history['val_accuracy']
training_loss = history.history['loss']
validation_loss = history.history['val_loss']

epochs_range = range(EPOCHS)

plt.figure()
plt.plot(epochs_range, training_accuracy, label='Training Accuracy')
plt.plot(epochs_range, validation_accuracy, label='Validation Accuracy')
plt.legend()
plt.title('Training and Validation Accuracy')

plt.figure()
plt.plot(epochs_range, training_loss, label='Training Loss')
plt.plot(epochs_range, validation_loss, label='Validation Loss')
plt.legend()
plt.title('Training and Validation Loss')
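
Besides inspecting the curves by eye, the same lists can be checked programmatically. The sketch below finds the epoch with the lowest validation loss and counts how many epochs have passed since; a growing count while the training loss keeps falling is a typical sign of overfitting. The loss values are placeholders and the helper name is an assumption.

```python
def best_epoch(validation_loss):
    """Return the index of the epoch with the lowest validation loss."""
    return min(range(len(validation_loss)), key=validation_loss.__getitem__)


# Placeholder curves in the same format as history.history['loss'] etc.
training_loss = [0.70, 0.55, 0.42, 0.33, 0.25, 0.18]
validation_loss = [0.72, 0.60, 0.52, 0.50, 0.56, 0.63]

best = best_epoch(validation_loss)
epochs_since_best = len(validation_loss) - 1 - best
print("best epoch (0-based):", best)
print("epochs without improvement:", epochs_since_best)
```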

If you are happy with your model and would like to save it for later use, you can save it using `model.save`, passing the desired file name including the file extension as argument. This example uses the `.keras` file format; other possibilities can be found in the documentation. To re-load the model, you can use `tf.keras.models.load_model` and pass the path to your model as argument. The model can then be used as described above. To display the model architecture, you can use `model.summary`.

In [None]:
model.save('myAwesomeCNN.keras')

myAwesomeCNN = tf.keras.models.load_model('myAwesomeCNN.keras')

myAwesomeCNN.summary()
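
Because the final layer is a single sigmoid unit, the model outputs one value between 0 and 1 per image. As a sketch of how such outputs could be turned into class names (plain Python; the class names, the 0.5 threshold and the example outputs are all assumptions):

```python
def to_class_names(probabilities, class_names, threshold=0.5):
    """Map sigmoid outputs to class names: above threshold -> class 1."""
    return [class_names[int(p > threshold)] for p in probabilities]


class_names = ["cats", "dogs"]        # hypothetical, list index = label
outputs = [0.03, 0.81, 0.49, 0.50]    # made-up sigmoid outputs
print(to_class_names(outputs, class_names))
```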