# Overfitting Experimentation

In model_selection.ipynb we selected convolutional neural networks as our model class and identified overfitting as a major issue with this dataset. The purpose of this notebook is to evaluate different ways to reduce overfitting.

In this notebook we document the experiments we performed to evaluate these measures. Overall, we performed the following experiments:
- Experiment 1: testing different network configuration measures, such as dropout and regularization on a simple network
- Experiment 2: testing the effects of increased data variety via image generators
- Experiment 3: testing increased network complexity. This test builds on Experiment 2 and tests whether the increased data variety requires an increase in network complexity.
- Experiment 4: retesting Experiment 1 with a more complex network. We decided this experiment is necessary based on the inconclusive results of Experiment 1.

Each experiment is handled in its own section. Afterwards, we give a short conclusion of the experiments reported here.

But first ... the usual notebook busywork of imports ...

In [1]:
# Correct the working directory, but only once per kernel session.
if "working_directory_corrected" not in vars():
    %cd ..
    working_directory_corrected = True

import numpy as np
import matplotlib.pyplot as plt 
import tensorflow as tf

from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential  # Model type to be used
from keras.layers import (  # Types of layers to be used in our model
    Activation,
    Conv2D,
    Conv3D,
    Dense,
    Flatten,
    GlobalAveragePooling2D,
    MaxPooling2D,
    MaxPooling3D,
    ZeroPadding2D,
)

from models.cnn_builder import CNNBuilder
from evaluation.multi_run_evaluation import MultiRunEvaluation
from evaluation.evaluation_metrics import EvaluationMetrics
from data.dataset import Dataset

dataset = Dataset()

c:\Users\frank\Documents\Teaching\LU\Spring2023\Machine Learning² Unit\Machine Learning Example Project


And let's also load the data we will use in all of our experiments. 

In addition to loading our dataset we will use image generators for the remainder of the experiment. For now, we will generate dummy image generators that don't include any changes for the early experiments before we explore them in more detail in Experiment 2.

In [2]:
X, y = dataset.get_prepared_data()
# Hold out 20% of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
y_train_prob = np.array([y_train[i] for i in range(len(y_train))])
# Pass-through generator: no augmentation is applied yet (see Experiment 2).
train_datagen = ImageDataGenerator(
    horizontal_flip=False,
    vertical_flip=False,
    fill_mode="nearest"
)
train_generator = train_datagen.flow(x=X_train, y=y_train_prob)

# Experiment 1: Network Configuration

In this experiment we measure the influence of network configuration settings on generalization. For this purpose we test the following configurations:
- Dropout Layers
- Batch Normalization
- Regularization
- Weight Constraints

## Experiment Protocol:
We will train a separate convolutional network for each of the four tested configurations, plus a baseline without any of them, and compare the average Hamming scores on the test set. Each variation is run ten times to reduce the influence of statistical variance. The network architecture is the one identified in model_selection.ipynb.
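To make the metric concrete, here is a minimal sketch of one common definition of the Hamming score for multi-label predictions: the per-sample ratio of correctly predicted labels to all labels that are either true or predicted, averaged over samples. This is an assumption for illustration; the project's `EvaluationMetrics` class may compute the score differently, and the function name `hamming_score` is hypothetical.

```python
def hamming_score(y_true, y_pred):
    """Mean per-sample |true AND predicted| / |true OR predicted|.

    Assumed definition for illustration; rows are binary label vectors.
    """
    scores = []
    for true_row, pred_row in zip(y_true, y_pred):
        intersection = sum(1 for t, p in zip(true_row, pred_row) if t and p)
        union = sum(1 for t, p in zip(true_row, pred_row) if t or p)
        # If a sample has no true and no predicted labels, count it as a perfect match.
        scores.append(1.0 if union == 0 else intersection / union)
    return sum(scores) / len(scores)


# Toy example: the first sample gets one of two labels right, the second is exact.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_score(y_true, y_pred))  # 0.75
```

Under this definition a score of 1.0 means every sample's label set was predicted exactly, so higher is better.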

The experiment is carried out by executing the experiment cell below with different settings and recording the results in the Results section. The code uses two utility classes:
- *MultiRunEvaluation* runs an experiment multiple times, records the evaluation metrics and calculates their minimum, maximum, mean and standard deviation.
- *CNNBuilder* assembles our model in a less verbose and less error-prone way. This class supports our four tested configurations.


Based on the collected data we will discuss which of the settings could improve our results.


In [3]:
cnn_builder = CNNBuilder(
    convolutional_layers=[16],
    fully_connected_layers=[100],
    in_shape=(X_train.shape[1], X_train.shape[2], X_train.shape[3]),    # type: ignore
    out_shape=y_train.shape[1]  # type: ignore
)
# Toggle these flags to select the configuration under test (all four are enabled here).
cnn_builder.apply_regularization = True
cnn_builder.apply_dropout = True
cnn_builder.apply_batch_normalization = True
cnn_builder.weight_constraints = True

evaluator = MultiRunEvaluation(model_creation=cnn_builder.build_model)
evaluator.evaluate(
    nr_runs=10,
    epochs=100,
    early_stopping_patience=5,
    train_generator= train_generator,
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test
)
evaluator.print_metrics()


Running experiment 1 of, 10


2023-01-03 15:02:39.518155: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


Epoch 1/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 6s - loss: 2.1905 - val_loss: 2.3641 - 6s/epoch - 263ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1849 - val_loss: 2.3904 - 5s/epoch - 251ms/step
Epoch 3/100
21/21 - 5s - loss: 2.1300 - val_loss: 2.2889 - 5s/epoch - 230ms/step
Epoch 4/100
21/21 - 5s - loss: 2.0662 - val_loss: 2.2136 - 5s/epoch - 250ms/step
Epoch 5/100
21/21 - 5s - loss: 1.9981 - val_loss: 2.1762 - 5s/epoch - 247ms/step
Epoch 6/100
21/21 - 5s - loss: 1.9165 - val_loss: 2.0583 - 5s/epoch - 240ms/step
Epoch 7/100
21/21 - 5s - loss: 1.8500 - val_loss: 1.9977 - 5s/epoch - 233ms/step
Epoch 8/100
21/21 - 5s - loss: 1.8002 - val_loss: 1.9161 - 5s/epoch - 235ms/step
Epoch 9/100
21/21 - 5s - loss: 1.7687 - val_loss: 1.8344 - 5s/epoch - 239ms/step
Epoch 10/100
21/21 - 5s - loss: 1.7476 - val_loss: 1.7632 - 5s/epoch - 240ms/step
Epoch 11/100
21/21 - 5s - loss: 1.7404 - val_loss: 1.7525 - 5s/epoch - 245ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7328 - val_loss: 1.7423 - 5s/epoch - 248ms/step
Epoch 13/100
21/21 - 5s - loss: 1.728

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 2.1446 - val_loss: 2.2035 - 5s/epoch - 261ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1265 - val_loss: 2.1588 - 5s/epoch - 237ms/step
Epoch 3/100
21/21 - 5s - loss: 2.0728 - val_loss: 2.0726 - 5s/epoch - 251ms/step
Epoch 4/100
21/21 - 5s - loss: 2.0070 - val_loss: 2.0225 - 5s/epoch - 249ms/step
Epoch 5/100
21/21 - 5s - loss: 1.9303 - val_loss: 1.9618 - 5s/epoch - 248ms/step
Epoch 6/100
21/21 - 6s - loss: 1.8582 - val_loss: 1.8577 - 6s/epoch - 263ms/step
Epoch 7/100
21/21 - 6s - loss: 1.8019 - val_loss: 1.8010 - 6s/epoch - 264ms/step
Epoch 8/100
21/21 - 6s - loss: 1.7632 - val_loss: 1.7619 - 6s/epoch - 296ms/step
Epoch 9/100
21/21 - 6s - loss: 1.7407 - val_loss: 1.7464 - 6s/epoch - 274ms/step
Epoch 10/100
21/21 - 6s - loss: 1.7342 - val_loss: 1.7377 - 6s/epoch - 274ms/step
Epoch 11/100
21/21 - 5s - loss: 1.7277 - val_loss: 1.7391 - 5s/epoch - 257ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7261 - val_loss: 1.7345 - 5s/epoch - 262ms/step
Epoch 13/100
21/21 - 6s - loss: 1.725

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 2.1746 - val_loss: 2.4624 - 5s/epoch - 259ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1617 - val_loss: 2.4732 - 5s/epoch - 234ms/step
Epoch 3/100
21/21 - 6s - loss: 2.1231 - val_loss: 2.4426 - 6s/epoch - 265ms/step
Epoch 4/100
21/21 - 6s - loss: 2.0529 - val_loss: 2.4136 - 6s/epoch - 274ms/step
Epoch 5/100
21/21 - 6s - loss: 1.9896 - val_loss: 2.3406 - 6s/epoch - 272ms/step
Epoch 6/100
21/21 - 5s - loss: 1.9090 - val_loss: 2.2383 - 5s/epoch - 257ms/step
Epoch 7/100
21/21 - 6s - loss: 1.8429 - val_loss: 2.1082 - 6s/epoch - 288ms/step
Epoch 8/100
21/21 - 5s - loss: 1.7972 - val_loss: 2.0145 - 5s/epoch - 233ms/step
Epoch 9/100
21/21 - 5s - loss: 1.7608 - val_loss: 1.9066 - 5s/epoch - 240ms/step
Epoch 10/100
21/21 - 5s - loss: 1.7450 - val_loss: 1.8497 - 5s/epoch - 246ms/step
Epoch 11/100
21/21 - 6s - loss: 1.7368 - val_loss: 1.8137 - 6s/epoch - 263ms/step
Epoch 12/100
21/21 - 6s - loss: 1.7328 - val_loss: 1.7904 - 6s/epoch - 264ms/step
Epoch 13/100
21/21 - 5s - loss: 1.729

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 2.2219 - val_loss: 2.3427 - 5s/epoch - 255ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1581 - val_loss: 2.3229 - 5s/epoch - 248ms/step
Epoch 3/100
21/21 - 5s - loss: 2.1258 - val_loss: 2.2860 - 5s/epoch - 242ms/step
Epoch 4/100
21/21 - 6s - loss: 2.0772 - val_loss: 2.2622 - 6s/epoch - 267ms/step
Epoch 5/100
21/21 - 6s - loss: 2.0183 - val_loss: 2.2016 - 6s/epoch - 272ms/step
Epoch 6/100
21/21 - 7s - loss: 1.9463 - val_loss: 2.1194 - 7s/epoch - 312ms/step
Epoch 7/100
21/21 - 6s - loss: 1.8759 - val_loss: 2.0187 - 6s/epoch - 307ms/step
Epoch 8/100
21/21 - 6s - loss: 1.8210 - val_loss: 1.9371 - 6s/epoch - 274ms/step
Epoch 9/100
21/21 - 5s - loss: 1.7810 - val_loss: 1.8704 - 5s/epoch - 250ms/step
Epoch 10/100
21/21 - 5s - loss: 1.7543 - val_loss: 1.8168 - 5s/epoch - 235ms/step
Epoch 11/100
21/21 - 5s - loss: 1.7423 - val_loss: 1.7833 - 5s/epoch - 259ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7383 - val_loss: 1.7637 - 5s/epoch - 257ms/step
Epoch 13/100
21/21 - 5s - loss: 1.732

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 2.2158 - val_loss: 2.2591 - 5s/epoch - 252ms/step
Epoch 2/100
21/21 - 6s - loss: 2.1773 - val_loss: 2.3073 - 6s/epoch - 289ms/step
Epoch 3/100
21/21 - 5s - loss: 2.1093 - val_loss: 2.3236 - 5s/epoch - 245ms/step
Epoch 4/100
21/21 - 5s - loss: 2.0546 - val_loss: 2.2658 - 5s/epoch - 249ms/step
Epoch 5/100
21/21 - 5s - loss: 1.9892 - val_loss: 2.2391 - 5s/epoch - 252ms/step
Epoch 6/100
21/21 - 5s - loss: 1.9123 - val_loss: 2.1417 - 5s/epoch - 243ms/step
Epoch 7/100
21/21 - 5s - loss: 1.8441 - val_loss: 2.0465 - 5s/epoch - 253ms/step
Epoch 8/100
21/21 - 6s - loss: 1.7900 - val_loss: 1.9779 - 6s/epoch - 276ms/step
Epoch 9/100
21/21 - 5s - loss: 1.7616 - val_loss: 1.8875 - 5s/epoch - 260ms/step
Epoch 10/100
21/21 - 6s - loss: 1.7463 - val_loss: 1.8351 - 6s/epoch - 276ms/step
Epoch 11/100
21/21 - 6s - loss: 1.7351 - val_loss: 1.8019 - 6s/epoch - 275ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7286 - val_loss: 1.7731 - 5s/epoch - 254ms/step
Epoch 13/100
21/21 - 5s - loss: 1.729

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 2.1745 - val_loss: 2.4444 - 5s/epoch - 237ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1585 - val_loss: 2.3407 - 5s/epoch - 232ms/step
Epoch 3/100
21/21 - 5s - loss: 2.1140 - val_loss: 2.2730 - 5s/epoch - 234ms/step
Epoch 4/100
21/21 - 5s - loss: 2.0468 - val_loss: 2.2408 - 5s/epoch - 231ms/step
Epoch 5/100
21/21 - 5s - loss: 1.9612 - val_loss: 2.1373 - 5s/epoch - 233ms/step
Epoch 6/100
21/21 - 5s - loss: 1.8914 - val_loss: 2.0339 - 5s/epoch - 233ms/step
Epoch 7/100
21/21 - 5s - loss: 1.8215 - val_loss: 1.9245 - 5s/epoch - 232ms/step
Epoch 8/100
21/21 - 5s - loss: 1.7777 - val_loss: 1.8537 - 5s/epoch - 232ms/step
Epoch 9/100
21/21 - 5s - loss: 1.7503 - val_loss: 1.8853 - 5s/epoch - 235ms/step
Epoch 10/100
21/21 - 5s - loss: 1.7369 - val_loss: 1.8517 - 5s/epoch - 234ms/step
Epoch 11/100
21/21 - 5s - loss: 1.7333 - val_loss: 1.8135 - 5s/epoch - 234ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7296 - val_loss: 1.7925 - 5s/epoch - 237ms/step
Epoch 13/100
21/21 - 5s - loss: 1.727

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 8s - loss: 2.1573 - val_loss: 2.3442 - 8s/epoch - 373ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1526 - val_loss: 2.3149 - 5s/epoch - 249ms/step
Epoch 3/100
21/21 - 5s - loss: 2.1048 - val_loss: 2.2965 - 5s/epoch - 241ms/step
Epoch 4/100
21/21 - 5s - loss: 2.0435 - val_loss: 2.2060 - 5s/epoch - 242ms/step
Epoch 5/100
21/21 - 5s - loss: 1.9710 - val_loss: 2.1961 - 5s/epoch - 252ms/step
Epoch 6/100
21/21 - 5s - loss: 1.8980 - val_loss: 2.0854 - 5s/epoch - 240ms/step
Epoch 7/100
21/21 - 5s - loss: 1.8298 - val_loss: 2.0311 - 5s/epoch - 243ms/step
Epoch 8/100
21/21 - 5s - loss: 1.7889 - val_loss: 1.8901 - 5s/epoch - 247ms/step
Epoch 9/100
21/21 - 5s - loss: 1.7611 - val_loss: 1.8402 - 5s/epoch - 247ms/step
Epoch 10/100
21/21 - 5s - loss: 1.7395 - val_loss: 1.8066 - 5s/epoch - 244ms/step
Epoch 11/100
21/21 - 5s - loss: 1.7384 - val_loss: 1.7817 - 5s/epoch - 253ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7320 - val_loss: 1.7680 - 5s/epoch - 232ms/step
Epoch 13/100
21/21 - 5s - loss: 1.726

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 7s - loss: 2.1461 - val_loss: 2.2141 - 7s/epoch - 332ms/step
Epoch 2/100
21/21 - 7s - loss: 2.1515 - val_loss: 2.1527 - 7s/epoch - 324ms/step
Epoch 3/100
21/21 - 8s - loss: 2.0926 - val_loss: 2.1155 - 8s/epoch - 374ms/step
Epoch 4/100
21/21 - 6s - loss: 2.0337 - val_loss: 2.0749 - 6s/epoch - 289ms/step
Epoch 5/100
21/21 - 5s - loss: 1.9710 - val_loss: 2.0697 - 5s/epoch - 249ms/step
Epoch 6/100
21/21 - 5s - loss: 1.8948 - val_loss: 1.9918 - 5s/epoch - 239ms/step
Epoch 7/100
21/21 - 5s - loss: 1.8311 - val_loss: 1.9141 - 5s/epoch - 245ms/step
Epoch 8/100
21/21 - 5s - loss: 1.7871 - val_loss: 1.8495 - 5s/epoch - 254ms/step
Epoch 9/100
21/21 - 6s - loss: 1.7587 - val_loss: 1.8034 - 6s/epoch - 278ms/step
Epoch 10/100
21/21 - 7s - loss: 1.7444 - val_loss: 1.8246 - 7s/epoch - 335ms/step
Epoch 11/100
21/21 - 7s - loss: 1.7347 - val_loss: 1.7964 - 7s/epoch - 323ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7297 - val_loss: 1.7752 - 5s/epoch - 241ms/step
Epoch 13/100
21/21 - 5s - loss: 1.730

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 2.1918 - val_loss: 2.2254 - 5s/epoch - 253ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1660 - val_loss: 2.2279 - 5s/epoch - 238ms/step
Epoch 3/100
21/21 - 5s - loss: 2.1187 - val_loss: 2.2260 - 5s/epoch - 250ms/step
Epoch 4/100
21/21 - 5s - loss: 2.0615 - val_loss: 2.1850 - 5s/epoch - 249ms/step
Epoch 5/100
21/21 - 5s - loss: 2.0042 - val_loss: 2.1297 - 5s/epoch - 251ms/step
Epoch 6/100
21/21 - 5s - loss: 1.9301 - val_loss: 2.0910 - 5s/epoch - 246ms/step
Epoch 7/100
21/21 - 5s - loss: 1.8534 - val_loss: 2.0075 - 5s/epoch - 256ms/step
Epoch 8/100
21/21 - 6s - loss: 1.8007 - val_loss: 1.9246 - 6s/epoch - 264ms/step
Epoch 9/100
21/21 - 5s - loss: 1.7700 - val_loss: 1.8682 - 5s/epoch - 243ms/step
Epoch 10/100
21/21 - 5s - loss: 1.7536 - val_loss: 1.8290 - 5s/epoch - 245ms/step
Epoch 11/100
21/21 - 5s - loss: 1.7421 - val_loss: 1.9013 - 5s/epoch - 244ms/step
Epoch 12/100
21/21 - 5s - loss: 1.7339 - val_loss: 1.8543 - 5s/epoch - 260ms/step
Epoch 13/100
21/21 - 5s - loss: 1.729

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 2.1501 - val_loss: 2.5161 - 5s/epoch - 237ms/step
Epoch 2/100
21/21 - 5s - loss: 2.1422 - val_loss: 2.4427 - 5s/epoch - 233ms/step
Epoch 3/100
21/21 - 5s - loss: 2.0801 - val_loss: 2.3630 - 5s/epoch - 253ms/step
Epoch 4/100
21/21 - 6s - loss: 2.0200 - val_loss: 2.2726 - 6s/epoch - 286ms/step
Epoch 5/100
21/21 - 5s - loss: 1.9488 - val_loss: 2.1499 - 5s/epoch - 255ms/step
Epoch 6/100
21/21 - 6s - loss: 1.8706 - val_loss: 2.0375 - 6s/epoch - 279ms/step
Epoch 7/100
21/21 - 7s - loss: 1.8085 - val_loss: 1.9166 - 7s/epoch - 332ms/step
Epoch 8/100
21/21 - 6s - loss: 1.7714 - val_loss: 1.8427 - 6s/epoch - 277ms/step
Epoch 9/100
21/21 - 7s - loss: 1.7482 - val_loss: 1.8127 - 7s/epoch - 318ms/step
Epoch 10/100
21/21 - 7s - loss: 1.7377 - val_loss: 1.7817 - 7s/epoch - 320ms/step
Epoch 11/100
21/21 - 7s - loss: 1.7326 - val_loss: 1.7649 - 7s/epoch - 322ms/step
Epoch 12/100
21/21 - 6s - loss: 1.7274 - val_loss: 1.7621 - 6s/epoch - 280ms/step
Epoch 13/100
21/21 - 6s - loss: 1.727

## Results

Running the experiment yielded the following Hamming scores on the test set.

| Configuration          | Min  | Max  | Mean  | St. Dev |
| ---------------------- | ---- | ---- | ----- | ------- |
| No Configuration       | 0.17 | 0.21 | 0.195 | 0.012   |
| Regularization         | 0.16 | 0.21 | 0.184 | 0.017   |
| Dropout                | 0.17 | 0.21 | 0.19  | 0.011   |
| Batch Normalization    | 0.02 | 0.16 | 0.109 | 0.040   |
| Weight Constraints     | 0.11 | 0.20 | 0.163 | 0.029   |
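
The four summary columns can be derived from the per-run scores as in the following sketch. It is not the `MultiRunEvaluation` implementation: the function name `summarize_runs` and the example scores are hypothetical, and the use of the population standard deviation is an assumption (the class may use the sample standard deviation instead).

```python
import statistics


def summarize_runs(scores):
    """Summarize one score per run as min, max, mean and standard deviation."""
    return {
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.fmean(scores),
        # Assumption: population standard deviation (divides by n, not n - 1).
        "std": statistics.pstdev(scores),
    }


# Hypothetical scores for illustration only, not measured results.
hypothetical_scores = [0.25, 0.5, 0.75]
summary = summarize_runs(hypothetical_scores)
print(summary)
```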

## Discussion
None of the configurations improved the Hamming score on the test set. The scores for regularization and dropout are close to the baseline, so their differences could be a result of random variation, whereas batch normalization led to clearly worse performance.

This does not necessarily mean that these configurations are not useful. It may mean that their effect is overshadowed by the general issue of having a very small data set. During test runs we have seen dropout increase the Hamming score slightly, so it does not seem to strictly make it worse.
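
As a rough check on which differences exceed run-to-run noise, the sketch below computes Welch's t statistic from the means and standard deviations in the results table. It assumes ten independent runs per configuration and treats the reported standard deviations as estimates of the per-run spread; the helper name `welch_t` is hypothetical, and this check was not part of the original experiment.

```python
import math


def welch_t(mean_a, std_a, mean_b, std_b, n_runs=10):
    """Welch's t statistic for two groups with n_runs observations each."""
    standard_error = math.sqrt(std_a**2 / n_runs + std_b**2 / n_runs)
    return (mean_a - mean_b) / standard_error


# (mean, standard deviation) taken from the results table above.
baseline = (0.195, 0.012)
configurations = {
    "Regularization": (0.184, 0.017),
    "Dropout": (0.19, 0.011),
    "Batch Normalization": (0.109, 0.040),
    "Weight Constraints": (0.163, 0.029),
}

t_values = {
    name: welch_t(*baseline, mean, std)
    for name, (mean, std) in configurations.items()
}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

With about ten runs per group, a magnitude well above 2 suggests a difference larger than noise. The statistic comes out well above 2 for batch normalization and below 2 for regularization and dropout, which matches the reading above.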

## Conclusions

While Batch Normalization clearly performs badly, the other configurations are less clear cut. For now this experiment is inconclusive. We will revisit it after increasing the variety of the data set with data generators, hoping that this makes the learning problem complex enough for these measures to make a consistent difference.



# Experiment 2: Image Generators

In this experiment we test the effects of introducing variability into the training set by using image generators.

Specifically, we test the following augmentations:
- Mirroring the image horizontally
- Rotating the image by up to 30 degrees in either direction
- Shifting the image by up to 10% of the image size in horizontal or vertical direction
- Changing the brightness by up to 20%
- Zooming in or out by up to 20%

## Experiment Setup:
We will train ten separate convolutional networks for each of the five tested configurations, plus a baseline without any augmentation, and compare the average Hamming scores on the test set. The network architecture is the one identified in model_selection.ipynb.

The experiment will be carried out by running the cell below with different configurations and recording the results in the Results section.

We will compare the achieved average Hamming scores to determine which configurations perform well. If multiple configurations improve the score, we will also test combinations of them. These additional configurations will be determined based on the initial results.
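
Instead of commenting generator arguments in and out by hand, the tested configurations could be kept in a lookup table and combined on demand, as sketched here. The parameter values are the ones used in the experiment cell; the names `AUGMENTATIONS` and `generator_kwargs` are hypothetical helpers, not part of the project code.

```python
# Keyword arguments for ImageDataGenerator, one entry per tested augmentation.
AUGMENTATIONS = {
    "mirroring": {"horizontal_flip": True},
    "rotation": {"rotation_range": 30},
    "shift": {"width_shift_range": 0.1, "height_shift_range": 0.1},
    "brightness": {"brightness_range": [0.8, 1.2]},
    "zoom": {"zoom_range": 0.20},
}


def generator_kwargs(*names):
    """Merge the keyword arguments of the named augmentations."""
    kwargs = {"fill_mode": "nearest"}
    for name in names:
        kwargs.update(AUGMENTATIONS[name])
    return kwargs


print(generator_kwargs("rotation", "shift"))
# In the notebook, this could be passed on as:
# train_datagen = ImageDataGenerator(**generator_kwargs("rotation", "shift"))
```

Calling `generator_kwargs()` without names yields the pass-through baseline, so the same cell can cover single augmentations, combinations and the baseline.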

In [4]:
train_datagen = ImageDataGenerator(
    #rotation_range=30,
    #zoom_range = 0.20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    # brightness_range= [0.8, 1.2],
    #horizontal_flip=True,
    fill_mode="nearest"
)
train_generator = train_datagen.flow(x=X_train, y=y_train_prob)

cnn_builder = CNNBuilder(
    convolutional_layers=[16],
    fully_connected_layers=[100],
    in_shape=(X_train.shape[1], X_train.shape[2], X_train.shape[3]),  # type: ignore
    out_shape=y_train.shape[1]  # type: ignore
)

evaluator = MultiRunEvaluation(model_creation=cnn_builder.build_model)
evaluator.evaluate(
    nr_runs=10,
    epochs=100,
    early_stopping_patience=5,
    train_generator= train_generator,
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test
)
evaluator.print_metrics()


Running experiment 1 of, 10
Epoch 1/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 0.3134 - val_loss: 0.2849 - 5s/epoch - 231ms/step
Epoch 2/100
21/21 - 5s - loss: 0.2810 - val_loss: 0.2813 - 5s/epoch - 246ms/step
Epoch 3/100
21/21 - 5s - loss: 0.2790 - val_loss: 0.2813 - 5s/epoch - 236ms/step
Epoch 4/100
21/21 - 4s - loss: 0.2767 - val_loss: 0.2793 - 4s/epoch - 209ms/step
Epoch 5/100
21/21 - 4s - loss: 0.2750 - val_loss: 0.2754 - 4s/epoch - 210ms/step
Epoch 6/100
21/21 - 4s - loss: 0.2725 - val_loss: 0.2743 - 4s/epoch - 207ms/step
Epoch 7/100
21/21 - 5s - loss: 0.2672 - val_loss: 0.2736 - 5s/epoch - 215ms/step
Epoch 8/100
21/21 - 5s - loss: 0.2656 - val_loss: 0.2723 - 5s/epoch - 258ms/step
Epoch 9/100
21/21 - 5s - loss: 0.2626 - val_loss: 0.2678 - 5s/epoch - 220ms/step
Epoch 10/100
21/21 - 5s - loss: 0.2599 - val_loss: 0.2692 - 5s/epoch - 228ms/step
Epoch 11/100
21/21 - 5s - loss: 0.2580 - val_loss: 0.2669 - 5s/epoch - 240ms/step
Epoch 12/100
21/21 - 5s - loss: 0.2536 - val_loss: 0.2691 - 5s/epoch - 226ms/step
Epoch 13/100
21/21 - 5s - loss: 0.250

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 0.3207 - val_loss: 0.2844 - 5s/epoch - 258ms/step
Epoch 2/100
21/21 - 4s - loss: 0.2803 - val_loss: 0.2822 - 4s/epoch - 209ms/step
Epoch 3/100
21/21 - 4s - loss: 0.2775 - val_loss: 0.2791 - 4s/epoch - 208ms/step
Epoch 4/100
21/21 - 5s - loss: 0.2746 - val_loss: 0.2783 - 5s/epoch - 241ms/step
Epoch 5/100
21/21 - 4s - loss: 0.2713 - val_loss: 0.2742 - 4s/epoch - 213ms/step
Epoch 6/100
21/21 - 5s - loss: 0.2695 - val_loss: 0.2733 - 5s/epoch - 223ms/step
Epoch 7/100
21/21 - 5s - loss: 0.2665 - val_loss: 0.2714 - 5s/epoch - 228ms/step
Epoch 8/100
21/21 - 5s - loss: 0.2634 - val_loss: 0.2711 - 5s/epoch - 222ms/step
Epoch 9/100
21/21 - 4s - loss: 0.2611 - val_loss: 0.2668 - 4s/epoch - 209ms/step
Epoch 10/100
21/21 - 4s - loss: 0.2587 - val_loss: 0.2649 - 4s/epoch - 210ms/step
Epoch 11/100
21/21 - 4s - loss: 0.2566 - val_loss: 0.2658 - 4s/epoch - 209ms/step
Epoch 12/100
21/21 - 4s - loss: 0.2546 - val_loss: 0.2653 - 4s/epoch - 210ms/step
Epoch 13/100
21/21 - 5s - loss: 0.250

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 0.3135 - val_loss: 0.2865 - 5s/epoch - 236ms/step
Epoch 2/100
21/21 - 5s - loss: 0.2783 - val_loss: 0.2815 - 5s/epoch - 227ms/step
Epoch 3/100
21/21 - 5s - loss: 0.2741 - val_loss: 0.2813 - 5s/epoch - 259ms/step
Epoch 4/100
21/21 - 5s - loss: 0.2749 - val_loss: 0.2780 - 5s/epoch - 237ms/step
Epoch 5/100
21/21 - 5s - loss: 0.2699 - val_loss: 0.2774 - 5s/epoch - 259ms/step
Epoch 6/100
21/21 - 5s - loss: 0.2670 - val_loss: 0.2759 - 5s/epoch - 227ms/step
Epoch 7/100
21/21 - 5s - loss: 0.2652 - val_loss: 0.2746 - 5s/epoch - 216ms/step
Epoch 8/100
21/21 - 4s - loss: 0.2635 - val_loss: 0.2735 - 4s/epoch - 214ms/step
Epoch 9/100
21/21 - 4s - loss: 0.2616 - val_loss: 0.2751 - 4s/epoch - 213ms/step
Epoch 10/100
21/21 - 5s - loss: 0.2607 - val_loss: 0.2756 - 5s/epoch - 220ms/step
Epoch 11/100
21/21 - 5s - loss: 0.2602 - val_loss: 0.2723 - 5s/epoch - 252ms/step
Epoch 12/100
21/21 - 5s - loss: 0.2552 - val_loss: 0.2701 - 5s/epoch - 259ms/step
Epoch 13/100
21/21 - 5s - loss: 0.255

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 0.3131 - val_loss: 0.2845 - 5s/epoch - 218ms/step
Epoch 2/100
21/21 - 5s - loss: 0.2759 - val_loss: 0.2803 - 5s/epoch - 229ms/step
Epoch 3/100
21/21 - 5s - loss: 0.2721 - val_loss: 0.2788 - 5s/epoch - 223ms/step
Epoch 4/100
21/21 - 5s - loss: 0.2693 - val_loss: 0.2771 - 5s/epoch - 244ms/step
Epoch 5/100
21/21 - 5s - loss: 0.2652 - val_loss: 0.2758 - 5s/epoch - 236ms/step
Epoch 6/100
21/21 - 5s - loss: 0.2620 - val_loss: 0.2772 - 5s/epoch - 239ms/step
Epoch 7/100
21/21 - 5s - loss: 0.2599 - val_loss: 0.2702 - 5s/epoch - 249ms/step
Epoch 8/100
21/21 - 5s - loss: 0.2574 - val_loss: 0.2734 - 5s/epoch - 215ms/step
Epoch 9/100
21/21 - 5s - loss: 0.2548 - val_loss: 0.2682 - 5s/epoch - 230ms/step
Epoch 10/100
21/21 - 5s - loss: 0.2544 - val_loss: 0.2687 - 5s/epoch - 247ms/step
Epoch 11/100
21/21 - 5s - loss: 0.2505 - val_loss: 0.2695 - 5s/epoch - 249ms/step
Epoch 12/100
21/21 - 5s - loss: 0.2479 - val_loss: 0.2677 - 5s/epoch - 224ms/step
Epoch 13/100
21/21 - 4s - loss: 0.247

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 0.3147 - val_loss: 0.2858 - 5s/epoch - 237ms/step
Epoch 2/100
21/21 - 5s - loss: 0.2807 - val_loss: 0.2841 - 5s/epoch - 240ms/step
Epoch 3/100
21/21 - 5s - loss: 0.2785 - val_loss: 0.2822 - 5s/epoch - 227ms/step
Epoch 4/100
21/21 - 5s - loss: 0.2774 - val_loss: 0.2807 - 5s/epoch - 222ms/step
Epoch 5/100
21/21 - 5s - loss: 0.2731 - val_loss: 0.2754 - 5s/epoch - 235ms/step
Epoch 6/100
21/21 - 6s - loss: 0.2694 - val_loss: 0.2732 - 6s/epoch - 306ms/step
Epoch 7/100
21/21 - 5s - loss: 0.2653 - val_loss: 0.2715 - 5s/epoch - 220ms/step
Epoch 8/100
21/21 - 5s - loss: 0.2631 - val_loss: 0.2696 - 5s/epoch - 222ms/step
Epoch 9/100
21/21 - 5s - loss: 0.2608 - val_loss: 0.2688 - 5s/epoch - 216ms/step
Epoch 10/100
21/21 - 5s - loss: 0.2598 - val_loss: 0.2660 - 5s/epoch - 227ms/step
Epoch 11/100
21/21 - 5s - loss: 0.2567 - val_loss: 0.2646 - 5s/epoch - 240ms/step
Epoch 12/100
21/21 - 5s - loss: 0.2548 - val_loss: 0.2642 - 5s/epoch - 220ms/step
Epoch 13/100
21/21 - 5s - loss: 0.251

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 4s - loss: 0.3233 - val_loss: 0.2827 - 4s/epoch - 208ms/step
Epoch 2/100
21/21 - 4s - loss: 0.2765 - val_loss: 0.2789 - 4s/epoch - 203ms/step
Epoch 3/100
21/21 - 4s - loss: 0.2723 - val_loss: 0.2757 - 4s/epoch - 203ms/step
Epoch 4/100
21/21 - 4s - loss: 0.2681 - val_loss: 0.2739 - 4s/epoch - 202ms/step
Epoch 5/100
21/21 - 4s - loss: 0.2665 - val_loss: 0.2737 - 4s/epoch - 204ms/step
Epoch 6/100
21/21 - 4s - loss: 0.2649 - val_loss: 0.2743 - 4s/epoch - 200ms/step
Epoch 7/100
21/21 - 4s - loss: 0.2623 - val_loss: 0.2735 - 4s/epoch - 202ms/step
Epoch 8/100
21/21 - 4s - loss: 0.2586 - val_loss: 0.2681 - 4s/epoch - 204ms/step
Epoch 9/100
21/21 - 4s - loss: 0.2566 - val_loss: 0.2689 - 4s/epoch - 201ms/step
Epoch 10/100
21/21 - 4s - loss: 0.2544 - val_loss: 0.2667 - 4s/epoch - 203ms/step
Epoch 11/100
21/21 - 4s - loss: 0.2504 - val_loss: 0.2645 - 4s/epoch - 203ms/step
Epoch 12/100
21/21 - 4s - loss: 0.2541 - val_loss: 0.2667 - 4s/epoch - 202ms/step
Epoch 13/100
21/21 - 4s - loss: 0.250

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 4s - loss: 0.3199 - val_loss: 0.2857 - 4s/epoch - 211ms/step
Epoch 2/100
21/21 - 4s - loss: 0.2781 - val_loss: 0.2829 - 4s/epoch - 204ms/step
Epoch 3/100
21/21 - 4s - loss: 0.2731 - val_loss: 0.2784 - 4s/epoch - 204ms/step
Epoch 4/100
21/21 - 4s - loss: 0.2696 - val_loss: 0.2753 - 4s/epoch - 205ms/step
Epoch 5/100
21/21 - 4s - loss: 0.2667 - val_loss: 0.2723 - 4s/epoch - 203ms/step
Epoch 6/100
21/21 - 4s - loss: 0.2627 - val_loss: 0.2724 - 4s/epoch - 203ms/step
Epoch 7/100
21/21 - 4s - loss: 0.2592 - val_loss: 0.2688 - 4s/epoch - 205ms/step
Epoch 8/100
21/21 - 4s - loss: 0.2552 - val_loss: 0.2677 - 4s/epoch - 206ms/step
Epoch 9/100
21/21 - 4s - loss: 0.2545 - val_loss: 0.2688 - 4s/epoch - 202ms/step
Epoch 10/100
21/21 - 4s - loss: 0.2510 - val_loss: 0.2672 - 4s/epoch - 206ms/step
Epoch 11/100
21/21 - 4s - loss: 0.2518 - val_loss: 0.2682 - 4s/epoch - 204ms/step
Epoch 12/100
21/21 - 4s - loss: 0.2480 - val_loss: 0.2650 - 4s/epoch - 203ms/step
Epoch 13/100
21/21 - 4s - loss: 0.246

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 4s - loss: 0.3359 - val_loss: 0.2892 - 4s/epoch - 208ms/step
Epoch 2/100
21/21 - 4s - loss: 0.2820 - val_loss: 0.2851 - 4s/epoch - 208ms/step
Epoch 3/100
21/21 - 4s - loss: 0.2767 - val_loss: 0.2813 - 4s/epoch - 206ms/step
Epoch 4/100
21/21 - 4s - loss: 0.2738 - val_loss: 0.2758 - 4s/epoch - 203ms/step
Epoch 5/100
21/21 - 4s - loss: 0.2701 - val_loss: 0.2737 - 4s/epoch - 206ms/step
Epoch 6/100
21/21 - 4s - loss: 0.2661 - val_loss: 0.2702 - 4s/epoch - 204ms/step
Epoch 7/100
21/21 - 4s - loss: 0.2627 - val_loss: 0.2665 - 4s/epoch - 203ms/step
Epoch 8/100
21/21 - 4s - loss: 0.2582 - val_loss: 0.2696 - 4s/epoch - 211ms/step
Epoch 9/100
21/21 - 4s - loss: 0.2563 - val_loss: 0.2635 - 4s/epoch - 206ms/step
Epoch 10/100
21/21 - 4s - loss: 0.2528 - val_loss: 0.2613 - 4s/epoch - 205ms/step
Epoch 11/100
21/21 - 4s - loss: 0.2489 - val_loss: 0.2625 - 4s/epoch - 203ms/step
Epoch 12/100
21/21 - 4s - loss: 0.2467 - val_loss: 0.2637 - 4s/epoch - 203ms/step
Epoch 13/100
21/21 - 4s - loss: 0.245

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 4s - loss: 0.3228 - val_loss: 0.2892 - 4s/epoch - 205ms/step
Epoch 2/100
21/21 - 4s - loss: 0.2767 - val_loss: 0.2804 - 4s/epoch - 205ms/step
Epoch 3/100
21/21 - 4s - loss: 0.2730 - val_loss: 0.2785 - 4s/epoch - 210ms/step
Epoch 4/100
21/21 - 4s - loss: 0.2687 - val_loss: 0.2739 - 4s/epoch - 204ms/step
Epoch 5/100
21/21 - 4s - loss: 0.2657 - val_loss: 0.2751 - 4s/epoch - 202ms/step
Epoch 6/100
21/21 - 4s - loss: 0.2630 - val_loss: 0.2727 - 4s/epoch - 203ms/step
Epoch 7/100
21/21 - 4s - loss: 0.2588 - val_loss: 0.2734 - 4s/epoch - 204ms/step
Epoch 8/100
21/21 - 4s - loss: 0.2573 - val_loss: 0.2731 - 4s/epoch - 203ms/step
Epoch 9/100
21/21 - 4s - loss: 0.2536 - val_loss: 0.2720 - 4s/epoch - 202ms/step
Epoch 10/100
21/21 - 4s - loss: 0.2524 - val_loss: 0.2684 - 4s/epoch - 202ms/step
Epoch 11/100
21/21 - 4s - loss: 0.2500 - val_loss: 0.2667 - 4s/epoch - 205ms/step
Epoch 12/100
21/21 - 4s - loss: 0.2483 - val_loss: 0.2669 - 4s/epoch - 204ms/step
Epoch 13/100
21/21 - 4s - loss: 0.246

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 5s - loss: 0.3137 - val_loss: 0.2854 - 5s/epoch - 216ms/step
Epoch 2/100
21/21 - 4s - loss: 0.2786 - val_loss: 0.2814 - 4s/epoch - 210ms/step
Epoch 3/100
21/21 - 4s - loss: 0.2778 - val_loss: 0.2793 - 4s/epoch - 206ms/step
Epoch 4/100
21/21 - 4s - loss: 0.2737 - val_loss: 0.2761 - 4s/epoch - 205ms/step
Epoch 5/100
21/21 - 4s - loss: 0.2693 - val_loss: 0.2725 - 4s/epoch - 207ms/step
Epoch 6/100
21/21 - 4s - loss: 0.2662 - val_loss: 0.2714 - 4s/epoch - 208ms/step
Epoch 7/100
21/21 - 5s - loss: 0.2646 - val_loss: 0.2703 - 5s/epoch - 241ms/step
Epoch 8/100
21/21 - 5s - loss: 0.2594 - val_loss: 0.2663 - 5s/epoch - 225ms/step
Epoch 9/100
21/21 - 5s - loss: 0.2565 - val_loss: 0.2658 - 5s/epoch - 225ms/step
Epoch 10/100
21/21 - 5s - loss: 0.2550 - val_loss: 0.2665 - 5s/epoch - 229ms/step
Epoch 11/100
21/21 - 5s - loss: 0.2507 - val_loss: 0.2668 - 5s/epoch - 233ms/step
Epoch 12/100
21/21 - 5s - loss: 0.2489 - val_loss: 0.2647 - 5s/epoch - 231ms/step
Epoch 13/100
21/21 - 5s - loss: 0.248

## Results

Running the experiment yielded the following Hamming scores on the test set:

| Configuration          | Min  | Max  | Mean  | St. Dev |
| ---------------------- | ---- | ---- | ----- | ------- |
| No Configuration       | 0.17 | 0.21 | 0.195 | 0.012   |
| Mirroring              | 0.17 | 0.22 | 0.191 | 0.015   |
| Rotation               | 0.19 | 0.24 | 0.214 | 0.014   |
| Shift                  | 0.21 | 0.25 | 0.231 | 0.015   |
| Brightness             | 0.08 | 0.13 | 0.111 | 0.019   |
| Zoom                   | 0.17 | 0.22 | 0.198 | 0.018   |

## Discussion

Our results indicate that both rotation and shift increase the hamming score, while changing brightness dramatically decreased it. Mirroring and zoom did not change the score significantly.

Based on these results we identified the following follow-up questions:
* How do the results change if we activate all configurations except brightness?
* How do the results change if we activate only rotation and shift?
* Is the score increase from shift consistent? 

To answer these questions, the following additional configurations were recorded:
- All except brightness: An image data generator with all configurations except the brightness adjustment.
- Rotation and Shift: Combining rotation and shift.
- Shift (rerun): a rerun of the shift experiment to get insight into the consistency of the result quality.

The recorded data can be found in the table below.

| Configuration          | Min  | Max  | Mean  | St. Dev |
| ---------------------- | ---- | ---- | ----- | ------- |
| No Configuration       | 0.17 | 0.21 | 0.195 | 0.012   |
| All except brightness  | 0.19 | 0.25 | 0.214 | 0.017   |
| Rotation + Shift       | 0.18 | 0.24 | 0.218 | 0.022   |
| Shift (rerun)          | 0.18 | 0.24 | 0.222 | 0.018   |

From these results we can see that activating all configurations except brightness yields roughly the same quality as activating only rotation or only shift. The same holds for activating rotation and shift together. The effects of these preprocessing techniques do not seem to compound, or at least this is not visible in the relatively simple model we applied.

Rerunning shift also yielded a slightly lower mean than before (0.222 instead of 0.231).
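
One way to judge whether such a gap is more than run-to-run noise is Welch's t statistic, computed directly from the table values. The sketch below assumes that the St. Dev column is a sample standard deviation over ten independent runs; the notebook does not state this, and `welch_t` is a hypothetical helper.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for two independent samples, given summary statistics."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error


# Means and standard deviations taken from the two tables above, ten runs each.
t_shift_vs_rerun = welch_t(0.231, 0.015, 10, 0.222, 0.018, 10)
t_shift_vs_baseline = welch_t(0.231, 0.015, 10, 0.195, 0.012, 10)

print(round(t_shift_vs_rerun, 2))     # 1.21
print(round(t_shift_vs_baseline, 2))  # 5.93
```

With about 18 degrees of freedom, an absolute value above roughly 2.1 would indicate a difference at the 5% level. Under these assumptions, the two shift runs are compatible with each other, while shift still differs clearly from the unaugmented baseline.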

## Conclusion

For future experiments we decided to use data generators with rotation and shift. Additionally using mirroring and zoom would also have been an option. However, since both variants perform similarly, we decided to go with the one that leads to shorter calculation times.

As a side note: our rerun of the shift preprocessing also shows that we should be careful when interpreting results, even when they are averaged over ten runs. We will take this into account in bigger experiments by using more runs.


# Experiment 3: Increasing complexity
In this experiment we test whether the increased complexity in data introduced by the data generator also requires an increase in network complexity.  

## Experiment Protocol
For our experiment we use the image data generator configuration selected in Experiment 2 (rotation and shift) and test different increases in network complexity. We test:
- adding convolutional layers
- increasing the number of convolutional patterns
- increasing the number of fully connected neurons
- adding more fully connected layers.

Each configuration will be tested ten times. We will compare them based on the average hamming score on the test set.

The experiment is conducted by configuring the experiment cell below and executing it. The results are noted in the markdown cell after it.

Due to technical difficulties, we had to restart the notebook environment for this test. This means that the data recorded here was produced with a different training/test split than in the other experiments. The experiment itself was executed entirely with the same training/test split.
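
A fixed seed would keep the split identical across restarts. The sketch below illustrates the idea using only the standard library; `seeded_split`, the test fraction and the seed value are hypothetical, and how the project's `Dataset` class actually creates its split is not shown in this notebook. scikit-learn's `train_test_split` accepts a `random_state` argument for the same purpose.

```python
import random


def seeded_split(n_samples, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices); the same seed always gives the same split."""
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split independent of global random state.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = seeded_split(100)
train_idx_again, test_idx_again = seeded_split(100)
print(test_idx == test_idx_again)  # True
```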

In [5]:
train_datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
    fill_mode="nearest"
)
train_generator = train_datagen.flow(x=X_train, y=y_train_prob)

cnn_builder = CNNBuilder(
    convolutional_layers=[64, 32, 16],
    fully_connected_layers=[100],
    in_shape=(X_train.shape[1], X_train.shape[2], X_train.shape[3]),  # type: ignore
    out_shape=y_train.shape[1]  # type: ignore
)

evaluator = MultiRunEvaluation(model_creation=cnn_builder.build_model)
evaluator.evaluate(
    nr_runs=10,
    epochs=100,
    early_stopping_patience=5,
    train_generator=train_generator,
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test
)
evaluator.print_metrics()


Running experiment 1 of, 10
Epoch 1/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 13s - loss: 0.3390 - val_loss: 0.2854 - 13s/epoch - 630ms/step
Epoch 2/100
21/21 - 13s - loss: 0.2817 - val_loss: 0.2854 - 13s/epoch - 601ms/step
Epoch 3/100
21/21 - 13s - loss: 0.2813 - val_loss: 0.2841 - 13s/epoch - 624ms/step
Epoch 4/100
21/21 - 13s - loss: 0.2803 - val_loss: 0.2849 - 13s/epoch - 605ms/step
Epoch 5/100
21/21 - 13s - loss: 0.2810 - val_loss: 0.2845 - 13s/epoch - 605ms/step
Epoch 6/100
21/21 - 13s - loss: 0.2808 - val_loss: 0.2851 - 13s/epoch - 598ms/step
Epoch 7/100
21/21 - 13s - loss: 0.2806 - val_loss: 0.2826 - 13s/epoch - 616ms/step
Epoch 8/100
21/21 - 13s - loss: 0.2800 - val_loss: 0.2826 - 13s/epoch - 619ms/step
Epoch 9/100
21/21 - 13s - loss: 0.2809 - val_loss: 0.2821 - 13s/epoch - 607ms/step
Epoch 10/100
21/21 - 13s - loss: 0.2796 - val_loss: 0.2841 - 13s/epoch - 613ms/step
Epoch 11/100
21/21 - 13s - loss: 0.2788 - val_loss: 0.2821 - 13s/epoch - 599ms/step
Epoch 12/100
21/21 - 13s - loss: 0.2786 - val_loss: 0.2850 - 13s/epoch - 604ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 12s - loss: 0.3535 - val_loss: 0.2915 - 12s/epoch - 582ms/step
Epoch 2/100
21/21 - 12s - loss: 0.2829 - val_loss: 0.2894 - 12s/epoch - 592ms/step
Epoch 3/100
21/21 - 13s - loss: 0.2807 - val_loss: 0.2844 - 13s/epoch - 615ms/step
Epoch 4/100
21/21 - 14s - loss: 0.2791 - val_loss: 0.2819 - 14s/epoch - 653ms/step
Epoch 5/100
21/21 - 12s - loss: 0.2775 - val_loss: 0.2827 - 12s/epoch - 579ms/step
Epoch 6/100
21/21 - 14s - loss: 0.2764 - val_loss: 0.2800 - 14s/epoch - 657ms/step
Epoch 7/100
21/21 - 13s - loss: 0.2753 - val_loss: 0.2776 - 13s/epoch - 628ms/step
Epoch 8/100
21/21 - 13s - loss: 0.2735 - val_loss: 0.2747 - 13s/epoch - 599ms/step
Epoch 9/100
21/21 - 13s - loss: 0.2718 - val_loss: 0.2737 - 13s/epoch - 624ms/step
Epoch 10/100
21/21 - 13s - loss: 0.2703 - val_loss: 0.2751 - 13s/epoch - 637ms/step
Epoch 11/100
21/21 - 13s - loss: 0.2681 - val_loss: 0.2715 - 13s/epoch - 626ms/step
Epoch 12/100
21/21 - 12s - loss: 0.2658 - val_loss: 0.2720 - 12s/epoch - 589ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 12s - loss: 0.3380 - val_loss: 0.2890 - 12s/epoch - 594ms/step
Epoch 2/100
21/21 - 12s - loss: 0.2813 - val_loss: 0.2856 - 12s/epoch - 590ms/step
Epoch 3/100
21/21 - 13s - loss: 0.2802 - val_loss: 0.2846 - 13s/epoch - 596ms/step
Epoch 4/100
21/21 - 12s - loss: 0.2798 - val_loss: 0.2842 - 12s/epoch - 589ms/step
Epoch 5/100
21/21 - 12s - loss: 0.2803 - val_loss: 0.2837 - 12s/epoch - 583ms/step
Epoch 6/100
21/21 - 12s - loss: 0.2788 - val_loss: 0.2832 - 12s/epoch - 590ms/step
Epoch 7/100
21/21 - 13s - loss: 0.2778 - val_loss: 0.2819 - 13s/epoch - 595ms/step
Epoch 8/100
21/21 - 12s - loss: 0.2780 - val_loss: 0.2833 - 12s/epoch - 590ms/step
Epoch 9/100
21/21 - 12s - loss: 0.2796 - val_loss: 0.2823 - 12s/epoch - 595ms/step
Epoch 10/100
21/21 - 12s - loss: 0.2777 - val_loss: 0.2817 - 12s/epoch - 581ms/step
Epoch 11/100
21/21 - 12s - loss: 0.2774 - val_loss: 0.2806 - 12s/epoch - 592ms/step
Epoch 12/100
21/21 - 12s - loss: 0.2760 - val_loss: 0.2777 - 12s/epoch - 588ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 14s - loss: 0.3307 - val_loss: 0.2879 - 14s/epoch - 663ms/step
Epoch 2/100
21/21 - 13s - loss: 0.2814 - val_loss: 0.2835 - 13s/epoch - 637ms/step
Epoch 3/100
21/21 - 14s - loss: 0.2806 - val_loss: 0.2822 - 14s/epoch - 682ms/step
Epoch 4/100
21/21 - 13s - loss: 0.2795 - val_loss: 0.2841 - 13s/epoch - 616ms/step
Epoch 5/100
21/21 - 13s - loss: 0.2803 - val_loss: 0.2830 - 13s/epoch - 616ms/step
Epoch 6/100
21/21 - 12s - loss: 0.2790 - val_loss: 0.2810 - 12s/epoch - 569ms/step
Epoch 7/100
21/21 - 12s - loss: 0.2778 - val_loss: 0.2811 - 12s/epoch - 563ms/step
Epoch 8/100
21/21 - 12s - loss: 0.2782 - val_loss: 0.2820 - 12s/epoch - 564ms/step
Epoch 9/100
21/21 - 12s - loss: 0.2770 - val_loss: 0.2819 - 12s/epoch - 563ms/step
Epoch 10/100
21/21 - 12s - loss: 0.2767 - val_loss: 0.2768 - 12s/epoch - 549ms/step
Epoch 11/100
21/21 - 12s - loss: 0.2756 - val_loss: 0.2800 - 12s/epoch - 563ms/step
Epoch 12/100
21/21 - 12s - loss: 0.2766 - val_loss: 0.2775 - 12s/epoch - 582ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 14s - loss: 0.3281 - val_loss: 0.2894 - 14s/epoch - 666ms/step
Epoch 2/100
21/21 - 14s - loss: 0.2825 - val_loss: 0.2848 - 14s/epoch - 644ms/step
Epoch 3/100
21/21 - 13s - loss: 0.2810 - val_loss: 0.2848 - 13s/epoch - 621ms/step
Epoch 4/100
21/21 - 12s - loss: 0.2810 - val_loss: 0.2870 - 12s/epoch - 560ms/step
Epoch 5/100
21/21 - 12s - loss: 0.2823 - val_loss: 0.2835 - 12s/epoch - 556ms/step
Epoch 6/100
21/21 - 11s - loss: 0.2805 - val_loss: 0.2842 - 11s/epoch - 546ms/step
Epoch 7/100
21/21 - 11s - loss: 0.2800 - val_loss: 0.2800 - 11s/epoch - 532ms/step
Epoch 8/100
21/21 - 11s - loss: 0.2796 - val_loss: 0.2819 - 11s/epoch - 544ms/step
Epoch 9/100
21/21 - 11s - loss: 0.2796 - val_loss: 0.2814 - 11s/epoch - 541ms/step
Epoch 10/100
21/21 - 11s - loss: 0.2790 - val_loss: 0.2795 - 11s/epoch - 545ms/step
Epoch 11/100
21/21 - 11s - loss: 0.2768 - val_loss: 0.2795 - 11s/epoch - 546ms/step
Epoch 12/100
21/21 - 11s - loss: 0.2772 - val_loss: 0.2807 - 11s/epoch - 537ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 13s - loss: 0.3444 - val_loss: 0.2844 - 13s/epoch - 632ms/step
Epoch 2/100
21/21 - 13s - loss: 0.2828 - val_loss: 0.2852 - 13s/epoch - 616ms/step
Epoch 3/100
21/21 - 13s - loss: 0.2801 - val_loss: 0.2832 - 13s/epoch - 607ms/step
Epoch 4/100
21/21 - 13s - loss: 0.2795 - val_loss: 0.2814 - 13s/epoch - 616ms/step
Epoch 5/100
21/21 - 13s - loss: 0.2795 - val_loss: 0.2828 - 13s/epoch - 609ms/step
Epoch 6/100
21/21 - 13s - loss: 0.2791 - val_loss: 0.2839 - 13s/epoch - 611ms/step
Epoch 7/100
21/21 - 13s - loss: 0.2791 - val_loss: 0.2810 - 13s/epoch - 611ms/step
Epoch 8/100
21/21 - 13s - loss: 0.2778 - val_loss: 0.2824 - 13s/epoch - 606ms/step
Epoch 9/100
21/21 - 13s - loss: 0.2776 - val_loss: 0.2805 - 13s/epoch - 616ms/step
Epoch 10/100
21/21 - 13s - loss: 0.2766 - val_loss: 0.2804 - 13s/epoch - 612ms/step
Epoch 11/100
21/21 - 13s - loss: 0.2763 - val_loss: 0.2797 - 13s/epoch - 616ms/step
Epoch 12/100
21/21 - 13s - loss: 0.2764 - val_loss: 0.2812 - 13s/epoch - 611ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 13s - loss: 0.3409 - val_loss: 0.2844 - 13s/epoch - 621ms/step
Epoch 2/100
21/21 - 13s - loss: 0.2814 - val_loss: 0.2848 - 13s/epoch - 618ms/step
Epoch 3/100
21/21 - 13s - loss: 0.2811 - val_loss: 0.2831 - 13s/epoch - 622ms/step
Epoch 4/100
21/21 - 13s - loss: 0.2799 - val_loss: 0.2854 - 13s/epoch - 624ms/step
Epoch 5/100
21/21 - 13s - loss: 0.2788 - val_loss: 0.2834 - 13s/epoch - 617ms/step
Epoch 6/100
21/21 - 13s - loss: 0.2799 - val_loss: 0.2837 - 13s/epoch - 616ms/step
Epoch 7/100
21/21 - 13s - loss: 0.2784 - val_loss: 0.2820 - 13s/epoch - 618ms/step
Epoch 8/100
21/21 - 13s - loss: 0.2792 - val_loss: 0.2854 - 13s/epoch - 602ms/step
Epoch 9/100
21/21 - 12s - loss: 0.2782 - val_loss: 0.2805 - 12s/epoch - 571ms/step
Epoch 10/100
21/21 - 12s - loss: 0.2774 - val_loss: 0.2797 - 12s/epoch - 560ms/step
Epoch 11/100
21/21 - 12s - loss: 0.2777 - val_loss: 0.2793 - 12s/epoch - 567ms/step
Epoch 12/100
21/21 - 12s - loss: 0.2761 - val_loss: 0.2795 - 12s/epoch - 563ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 13s - loss: 0.3354 - val_loss: 0.2854 - 13s/epoch - 618ms/step
Epoch 2/100
21/21 - 13s - loss: 0.2820 - val_loss: 0.2834 - 13s/epoch - 621ms/step
Epoch 3/100
21/21 - 13s - loss: 0.2794 - val_loss: 0.2824 - 13s/epoch - 618ms/step
Epoch 4/100
21/21 - 14s - loss: 0.2798 - val_loss: 0.2819 - 14s/epoch - 665ms/step
Epoch 5/100
21/21 - 14s - loss: 0.2797 - val_loss: 0.2825 - 14s/epoch - 688ms/step
Epoch 6/100
21/21 - 13s - loss: 0.2797 - val_loss: 0.2826 - 13s/epoch - 642ms/step
Epoch 7/100
21/21 - 15s - loss: 0.2787 - val_loss: 0.2833 - 15s/epoch - 694ms/step
Epoch 8/100
21/21 - 13s - loss: 0.2775 - val_loss: 0.2818 - 13s/epoch - 628ms/step
Epoch 9/100
21/21 - 13s - loss: 0.2794 - val_loss: 0.2809 - 13s/epoch - 619ms/step
Epoch 10/100
21/21 - 13s - loss: 0.2768 - val_loss: 0.2803 - 13s/epoch - 625ms/step
Epoch 11/100
21/21 - 14s - loss: 0.2779 - val_loss: 0.2796 - 14s/epoch - 646ms/step
Epoch 12/100
21/21 - 13s - loss: 0.2779 - val_loss: 0.2800 - 13s/epoch - 630ms/step
Epoch 13/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 13s - loss: 0.3376 - val_loss: 0.2890 - 13s/epoch - 598ms/step
Epoch 2/100
21/21 - 13s - loss: 0.2811 - val_loss: 0.2840 - 13s/epoch - 597ms/step
Epoch 3/100
21/21 - 12s - loss: 0.2798 - val_loss: 0.2848 - 12s/epoch - 561ms/step
Epoch 4/100
21/21 - 12s - loss: 0.2791 - val_loss: 0.2828 - 12s/epoch - 553ms/step
Epoch 5/100
21/21 - 12s - loss: 0.2789 - val_loss: 0.2818 - 12s/epoch - 572ms/step
Epoch 6/100
21/21 - 12s - loss: 0.2789 - val_loss: 0.2847 - 12s/epoch - 556ms/step
Epoch 7/100
21/21 - 12s - loss: 0.2782 - val_loss: 0.2787 - 12s/epoch - 557ms/step
Epoch 8/100
21/21 - 12s - loss: 0.2771 - val_loss: 0.2825 - 12s/epoch - 562ms/step
Epoch 9/100
21/21 - 12s - loss: 0.2771 - val_loss: 0.2810 - 12s/epoch - 559ms/step
Epoch 10/100
21/21 - 12s - loss: 0.2763 - val_loss: 0.2788 - 12s/epoch - 559ms/step
Epoch 11/100
21/21 - 12s - loss: 0.2772 - val_loss: 0.2810 - 12s/epoch - 553ms/step
Epoch 12/100
21/21 - 12s - loss: 0.2769 - val_loss: 0.2795 - 12s/epoch - 555ms/step

Running expe

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 12s - loss: 0.3260 - val_loss: 0.2901 - 12s/epoch - 576ms/step
Epoch 2/100
21/21 - 12s - loss: 0.2836 - val_loss: 0.2833 - 12s/epoch - 576ms/step
Epoch 3/100
21/21 - 12s - loss: 0.2803 - val_loss: 0.2854 - 12s/epoch - 574ms/step
Epoch 4/100
21/21 - 12s - loss: 0.2795 - val_loss: 0.2826 - 12s/epoch - 580ms/step
Epoch 5/100
21/21 - 12s - loss: 0.2776 - val_loss: 0.2809 - 12s/epoch - 590ms/step
Epoch 6/100
21/21 - 12s - loss: 0.2771 - val_loss: 0.2820 - 12s/epoch - 577ms/step
Epoch 7/100
21/21 - 12s - loss: 0.2779 - val_loss: 0.2822 - 12s/epoch - 575ms/step
Epoch 8/100
21/21 - 12s - loss: 0.2775 - val_loss: 0.2795 - 12s/epoch - 579ms/step
Epoch 9/100
21/21 - 13s - loss: 0.2757 - val_loss: 0.2779 - 13s/epoch - 601ms/step
Epoch 10/100
21/21 - 12s - loss: 0.2750 - val_loss: 0.2794 - 12s/epoch - 592ms/step
Epoch 11/100
21/21 - 12s - loss: 0.2760 - val_loss: 0.2818 - 12s/epoch - 594ms/step
Epoch 12/100
21/21 - 12s - loss: 0.2741 - val_loss: 0.2778 - 12s/epoch - 583ms/step
Epoch 13/100


## Results


Running the experiment yielded the following hamming scores on the test set. The table should be read as follows:
- Convolutional Layers: a list of convolutional patterns per layer. A value of [32,16] means there are two convolutional layers with 32 and 16 patterns respectively.
- Fully Connected Layers: a list of neurons per layer. A value of [100, 200] means there are two fully connected layers with 100 and 200 neurons respectively.


| Test | Convolutional Layers | Fully Connected Layers | Min  | Max  | Mean  | St. Dev |
| ---- | -------------------- | ---------------------- | ---- | ---- | ----- | ------- |
| 1    | [16]                 | [100]                  | 0.20 | 0.23 | 0.217 | 0.012   |
| 2    | [32]                 | [100]                  | 0.16 | 0.22 | 0.201 | 0.021   |
| 3    | [32,16]              | [100]                  | 0.20 | 0.23 | 0.215 | 0.013   |
| 4    | [16]                 | [1000]                 | 0.15 | 0.23 | 0.204 | 0.023   |
| 5    | [16]                 | [100,100]              | 0.15 | 0.24 | 0.213 | 0.025   |
| 6    | [32,16]              | [100,100]              | 0.15 | 0.24 | 0.19  | 0.031   |
| 7    | [32,16]              | [20,20]                | 0.09 | 0.22 | 0.175 | 0.038   |
| 8    | [64,32,16]           | [100,100]              | 0.11 | 0.15 | 0.125 | 0.014   |
| 9    | [64,32,16]           | [100]                  | 0.10 | 0.23 | 0.173 | 0.03    |


## Discussion

The initial configuration is still the best configuration, albeit with several close neighbors such as tests 3 and 5. This shows that increasing the complexity did not lead to improved results.

In some cases we even observed significantly worse results. Specifically, increasing the number of convolutional layers to three (tests 8 and 9) seems to lead to a significant decrease in the score.
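
The notion of a "close neighbor" can be made explicit with a small script over the table values. In the sketch below, a neighbor is any configuration whose mean lies within one standard deviation of the best mean. That threshold is an illustrative assumption, not a criterion used in the experiment.

```python
# (test id, convolutional layers, fully connected layers, mean, st. dev) from the table above
results = [
    (1, [16], [100], 0.217, 0.012),
    (2, [32], [100], 0.201, 0.021),
    (3, [32, 16], [100], 0.215, 0.013),
    (4, [16], [1000], 0.204, 0.023),
    (5, [16], [100, 100], 0.213, 0.025),
    (6, [32, 16], [100, 100], 0.19, 0.031),
    (7, [32, 16], [20, 20], 0.175, 0.038),
    (8, [64, 32, 16], [100, 100], 0.125, 0.014),
    (9, [64, 32, 16], [100], 0.173, 0.03),
]

# Best configuration by mean hamming score.
best = max(results, key=lambda row: row[3])

# Assumed criterion: within one standard deviation of the best configuration's mean.
threshold = best[3] - best[4]
close_neighbors = [row[0] for row in results if row is not best and row[3] >= threshold]

print(best[0], close_neighbors)  # 1 [3, 5]
```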


## Conclusion
For future experiments we will stay within the bounds tested in this experiment, as increasing the complexity does not seem to improve results. We will also exclude configurations with more than two convolutional layers.



# Experiment 4: Network configuration in complex networks

In this experiment we rerun Experiment 1 for a more complex network. The goal is to test whether the results are dependent on the complexity of the network.


## Experiment Protocol

For experimentation we used configuration 6 from the previous experiment. While this configuration yielded worse scores than others, it is a combination of the two promising configurations 3 and 5. We combined the two to achieve a more complex but promising network to test.

Otherwise, the experiment follows the same procedure as Experiment 1. The results are recorded below the Python cell. 
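
The one-measure-at-a-time procedure can be written down as a list of flag settings. The sketch below reuses the attribute names from the experiment cell; the helper `one_at_a_time` is hypothetical, and it is an assumption that each measure is enabled by setting its attribute to `True`.

```python
# Attribute names as they appear on the CNN builder in the experiment cell.
FLAGS = [
    "apply_regularization",
    "apply_dropout",
    "apply_batch_normalization",
    "weight_constraints",
]


def one_at_a_time(flags):
    """Yield (label, settings): a baseline with all flags off, then each flag on alone."""
    yield "No Configuration", {flag: False for flag in flags}
    for active in flags:
        yield active, {flag: flag == active for flag in flags}


configurations = list(one_at_a_time(FLAGS))
for label, settings in configurations:
    print(label, settings)
```

Each settings dictionary could then be applied to the builder, for example with `setattr(cnn_builder, name, value)`, before the ten runs for that configuration are executed.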


In [6]:
train_datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
    fill_mode="nearest"
)
train_generator = train_datagen.flow(x=X_train, y=y_train_prob)

cnn_builder = CNNBuilder(
    convolutional_layers=[32,16],
    fully_connected_layers=[100,100],
    in_shape=(X_train.shape[1], X_train.shape[2], X_train.shape[3]),  # type: ignore
    out_shape=y_train.shape[1]  # type: ignore
)
cnn_builder.apply_regularization = False
cnn_builder.apply_dropout = False
cnn_builder.apply_batch_normalization = False
cnn_builder.weight_constraints = False

evaluator = MultiRunEvaluation(model_creation=cnn_builder.build_model)
evaluator.evaluate(
    nr_runs=10,
    epochs=100,
    early_stopping_patience=5,
    train_generator=train_generator,
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test
)
evaluator.print_metrics()


Running experiment 1 of, 10
Epoch 1/100


  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 7s - loss: 0.3422 - val_loss: 0.2840 - 7s/epoch - 342ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2825 - val_loss: 0.2860 - 7s/epoch - 340ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2806 - val_loss: 0.2841 - 7s/epoch - 347ms/step
Epoch 4/100
21/21 - 7s - loss: 0.2807 - val_loss: 0.2827 - 7s/epoch - 332ms/step
Epoch 5/100
21/21 - 7s - loss: 0.2785 - val_loss: 0.2833 - 7s/epoch - 328ms/step
Epoch 6/100
21/21 - 7s - loss: 0.2765 - val_loss: 0.2798 - 7s/epoch - 334ms/step
Epoch 7/100
21/21 - 7s - loss: 0.2748 - val_loss: 0.2811 - 7s/epoch - 329ms/step
Epoch 8/100
21/21 - 7s - loss: 0.2717 - val_loss: 0.2756 - 7s/epoch - 339ms/step
Epoch 9/100
21/21 - 7s - loss: 0.2704 - val_loss: 0.2732 - 7s/epoch - 346ms/step
Epoch 10/100
21/21 - 8s - loss: 0.2670 - val_loss: 0.2721 - 8s/epoch - 367ms/step
Epoch 11/100
21/21 - 8s - loss: 0.2637 - val_loss: 0.2688 - 8s/epoch - 367ms/step
Epoch 12/100
21/21 - 7s - loss: 0.2654 - val_loss: 0.2735 - 7s/epoch - 352ms/step
Epoch 13/100
21/21 - 7s - loss: 0.262

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 8s - loss: 0.3344 - val_loss: 0.2860 - 8s/epoch - 361ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2827 - val_loss: 0.2870 - 7s/epoch - 352ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2817 - val_loss: 0.2861 - 7s/epoch - 341ms/step
Epoch 4/100
21/21 - 7s - loss: 0.2797 - val_loss: 0.2839 - 7s/epoch - 337ms/step
Epoch 5/100
21/21 - 7s - loss: 0.2783 - val_loss: 0.2806 - 7s/epoch - 352ms/step
Epoch 6/100
21/21 - 7s - loss: 0.2759 - val_loss: 0.2786 - 7s/epoch - 346ms/step
Epoch 7/100
21/21 - 7s - loss: 0.2734 - val_loss: 0.2757 - 7s/epoch - 349ms/step
Epoch 8/100
21/21 - 7s - loss: 0.2731 - val_loss: 0.2812 - 7s/epoch - 335ms/step
Epoch 9/100
21/21 - 8s - loss: 0.2715 - val_loss: 0.2750 - 8s/epoch - 368ms/step
Epoch 10/100
21/21 - 8s - loss: 0.2672 - val_loss: 0.2734 - 8s/epoch - 362ms/step
Epoch 11/100
21/21 - 7s - loss: 0.2659 - val_loss: 0.2702 - 7s/epoch - 351ms/step
Epoch 12/100
21/21 - 7s - loss: 0.2651 - val_loss: 0.2672 - 7s/epoch - 334ms/step
Epoch 13/100
21/21 - 7s - loss: 0.265

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


Epoch 1/100
21/21 - 10s - loss: 0.3428 - val_loss: 0.2858 - 10s/epoch - 481ms/step
Epoch 2/100
21/21 - 9s - loss: 0.2816 - val_loss: 0.2822 - 9s/epoch - 419ms/step
Epoch 3/100
21/21 - 9s - loss: 0.2801 - val_loss: 0.2806 - 9s/epoch - 413ms/step
Epoch 4/100
21/21 - 8s - loss: 0.2789 - val_loss: 0.2806 - 8s/epoch - 390ms/step
Epoch 5/100
21/21 - 8s - loss: 0.2783 - val_loss: 0.2804 - 8s/epoch - 367ms/step
Epoch 6/100
21/21 - 8s - loss: 0.2757 - val_loss: 0.2783 - 8s/epoch - 371ms/step
Epoch 7/100
21/21 - 8s - loss: 0.2748 - val_loss: 0.2796 - 8s/epoch - 384ms/step
Epoch 8/100
21/21 - 8s - loss: 0.2745 - val_loss: 0.2771 - 8s/epoch - 366ms/step
Epoch 9/100
21/21 - 8s - loss: 0.2732 - val_loss: 0.2765 - 8s/epoch - 395ms/step
Epoch 10/100
21/21 - 8s - loss: 0.2728 - val_loss: 0.2786 - 8s/epoch - 370ms/step
Epoch 11/100
21/21 - 9s - loss: 0.2709 - val_loss: 0.2731 - 9s/epoch - 439ms/step
Epoch 12/100
21/21 - 8s - loss: 0.2695 - val_loss: 0.2760 - 8s/epoch - 383ms/step
Epoch 13/100
21/21 - 8s

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 7s - loss: 0.3340 - val_loss: 0.2871 - 7s/epoch - 344ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2830 - val_loss: 0.2862 - 7s/epoch - 346ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2812 - val_loss: 0.2852 - 7s/epoch - 342ms/step
Epoch 4/100
21/21 - 7s - loss: 0.2800 - val_loss: 0.2824 - 7s/epoch - 342ms/step
Epoch 5/100
21/21 - 7s - loss: 0.2791 - val_loss: 0.2881 - 7s/epoch - 340ms/step
Epoch 6/100
21/21 - 7s - loss: 0.2791 - val_loss: 0.2819 - 7s/epoch - 347ms/step
Epoch 7/100
21/21 - 7s - loss: 0.2770 - val_loss: 0.2814 - 7s/epoch - 345ms/step
Epoch 8/100
21/21 - 7s - loss: 0.2771 - val_loss: 0.2789 - 7s/epoch - 342ms/step
Epoch 9/100
21/21 - 7s - loss: 0.2750 - val_loss: 0.2755 - 7s/epoch - 351ms/step
Epoch 10/100
21/21 - 7s - loss: 0.2701 - val_loss: 0.2773 - 7s/epoch - 346ms/step
Epoch 11/100
21/21 - 7s - loss: 0.2688 - val_loss: 0.2765 - 7s/epoch - 348ms/step
Epoch 12/100
21/21 - 7s - loss: 0.2675 - val_loss: 0.2678 - 7s/epoch - 346ms/step
Epoch 13/100
21/21 - 8s - loss: 0.265

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 8s - loss: 0.3297 - val_loss: 0.2896 - 8s/epoch - 358ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2812 - val_loss: 0.2841 - 7s/epoch - 351ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2800 - val_loss: 0.2827 - 7s/epoch - 348ms/step
Epoch 4/100
21/21 - 7s - loss: 0.2780 - val_loss: 0.2827 - 7s/epoch - 349ms/step
Epoch 5/100
21/21 - 7s - loss: 0.2764 - val_loss: 0.2809 - 7s/epoch - 349ms/step
Epoch 6/100
21/21 - 7s - loss: 0.2746 - val_loss: 0.2832 - 7s/epoch - 353ms/step
Epoch 7/100
21/21 - 7s - loss: 0.2728 - val_loss: 0.2799 - 7s/epoch - 357ms/step
Epoch 8/100
21/21 - 7s - loss: 0.2708 - val_loss: 0.2782 - 7s/epoch - 345ms/step
Epoch 9/100
21/21 - 8s - loss: 0.2705 - val_loss: 0.2761 - 8s/epoch - 362ms/step
Epoch 10/100
21/21 - 7s - loss: 0.2689 - val_loss: 0.2754 - 7s/epoch - 352ms/step
Epoch 11/100
21/21 - 7s - loss: 0.2677 - val_loss: 0.2741 - 7s/epoch - 355ms/step
Epoch 12/100
21/21 - 7s - loss: 0.2676 - val_loss: 0.2721 - 7s/epoch - 349ms/step
Epoch 13/100
21/21 - 8s - loss: 0.265

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 7s - loss: 0.3475 - val_loss: 0.2867 - 7s/epoch - 330ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2821 - val_loss: 0.2854 - 7s/epoch - 330ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2823 - val_loss: 0.2846 - 7s/epoch - 332ms/step
Epoch 4/100
21/21 - 7s - loss: 0.2796 - val_loss: 0.2827 - 7s/epoch - 329ms/step
Epoch 5/100
21/21 - 7s - loss: 0.2798 - val_loss: 0.2813 - 7s/epoch - 329ms/step
Epoch 6/100
21/21 - 7s - loss: 0.2785 - val_loss: 0.2806 - 7s/epoch - 332ms/step
Epoch 7/100
21/21 - 7s - loss: 0.2758 - val_loss: 0.2782 - 7s/epoch - 328ms/step
Epoch 8/100
21/21 - 7s - loss: 0.2743 - val_loss: 0.2747 - 7s/epoch - 327ms/step
Epoch 9/100
21/21 - 7s - loss: 0.2716 - val_loss: 0.2748 - 7s/epoch - 328ms/step
Epoch 10/100
21/21 - 7s - loss: 0.2681 - val_loss: 0.2749 - 7s/epoch - 331ms/step
Epoch 11/100
21/21 - 7s - loss: 0.2671 - val_loss: 0.2677 - 7s/epoch - 327ms/step
Epoch 12/100
21/21 - 7s - loss: 0.2638 - val_loss: 0.2709 - 7s/epoch - 329ms/step
Epoch 13/100
21/21 - 7s - loss: 0.262

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 7s - loss: 0.3320 - val_loss: 0.2859 - 7s/epoch - 330ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2806 - val_loss: 0.2865 - 7s/epoch - 324ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2791 - val_loss: 0.2878 - 7s/epoch - 326ms/step
Epoch 4/100
21/21 - 7s - loss: 0.2797 - val_loss: 0.2839 - 7s/epoch - 324ms/step
Epoch 5/100
21/21 - 7s - loss: 0.2780 - val_loss: 0.2826 - 7s/epoch - 326ms/step
Epoch 6/100
21/21 - 7s - loss: 0.2766 - val_loss: 0.2815 - 7s/epoch - 327ms/step
Epoch 7/100
21/21 - 7s - loss: 0.2766 - val_loss: 0.2775 - 7s/epoch - 324ms/step
Epoch 8/100
21/21 - 7s - loss: 0.2727 - val_loss: 0.2758 - 7s/epoch - 328ms/step
Epoch 9/100
21/21 - 7s - loss: 0.2707 - val_loss: 0.2759 - 7s/epoch - 326ms/step
Epoch 10/100
21/21 - 7s - loss: 0.2693 - val_loss: 0.2765 - 7s/epoch - 328ms/step
Epoch 11/100
21/21 - 7s - loss: 0.2668 - val_loss: 0.2744 - 7s/epoch - 328ms/step
Epoch 12/100
21/21 - 7s - loss: 0.2663 - val_loss: 0.2727 - 7s/epoch - 330ms/step
Epoch 13/100
21/21 - 7s - loss: 0.265

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 7s - loss: 0.3366 - val_loss: 0.2846 - 7s/epoch - 357ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2809 - val_loss: 0.2899 - 7s/epoch - 348ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2806 - val_loss: 0.2850 - 7s/epoch - 341ms/step
Epoch 4/100
21/21 - 8s - loss: 0.2794 - val_loss: 0.2807 - 8s/epoch - 385ms/step
Epoch 5/100
21/21 - 9s - loss: 0.2758 - val_loss: 0.2807 - 9s/epoch - 426ms/step
Epoch 6/100
21/21 - 8s - loss: 0.2745 - val_loss: 0.2785 - 8s/epoch - 392ms/step
Epoch 7/100
21/21 - 8s - loss: 0.2714 - val_loss: 0.2759 - 8s/epoch - 397ms/step
Epoch 8/100
21/21 - 8s - loss: 0.2697 - val_loss: 0.2741 - 8s/epoch - 398ms/step
Epoch 9/100
21/21 - 8s - loss: 0.2676 - val_loss: 0.2719 - 8s/epoch - 396ms/step
Epoch 10/100
21/21 - 8s - loss: 0.2666 - val_loss: 0.2711 - 8s/epoch - 372ms/step
Epoch 11/100
21/21 - 9s - loss: 0.2649 - val_loss: 0.2705 - 9s/epoch - 437ms/step
Epoch 12/100
21/21 - 8s - loss: 0.2649 - val_loss: 0.2696 - 8s/epoch - 382ms/step
Epoch 13/100
21/21 - 8s - loss: 0.263

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


21/21 - 7s - loss: 0.3422 - val_loss: 0.2873 - 7s/epoch - 329ms/step
Epoch 2/100
21/21 - 7s - loss: 0.2823 - val_loss: 0.2844 - 7s/epoch - 322ms/step
Epoch 3/100
21/21 - 7s - loss: 0.2809 - val_loss: 0.2850 - 7s/epoch - 323ms/step
Epoch 4/100
21/21 - 7s - loss: 0.2808 - val_loss: 0.2838 - 7s/epoch - 321ms/step
Epoch 5/100
21/21 - 7s - loss: 0.2803 - val_loss: 0.2830 - 7s/epoch - 319ms/step
Epoch 6/100
21/21 - 7s - loss: 0.2808 - val_loss: 0.2829 - 7s/epoch - 323ms/step
Epoch 7/100
21/21 - 7s - loss: 0.2796 - val_loss: 0.2820 - 7s/epoch - 352ms/step
Epoch 8/100
21/21 - 7s - loss: 0.2792 - val_loss: 0.2829 - 7s/epoch - 339ms/step
Epoch 9/100
21/21 - 7s - loss: 0.2784 - val_loss: 0.2794 - 7s/epoch - 342ms/step
Epoch 10/100
21/21 - 7s - loss: 0.2771 - val_loss: 0.2782 - 7s/epoch - 348ms/step
Epoch 11/100
21/21 - 7s - loss: 0.2738 - val_loss: 0.2762 - 7s/epoch - 333ms/step
Epoch 12/100
21/21 - 7s - loss: 0.2730 - val_loss: 0.2743 - 7s/epoch - 355ms/step
Epoch 13/100
21/21 - 9s - loss: 0.268

  history = model.fit_generator(train_generator, epochs=epochs, validation_data=(X_test, y_test),callbacks = [early_stopping],verbose=verbose)


Epoch 1/100
21/21 - 8s - loss: 0.3374 - val_loss: 0.2861 - 8s/epoch - 396ms/step
Epoch 2/100
21/21 - 8s - loss: 0.2816 - val_loss: 0.2853 - 8s/epoch - 399ms/step
Epoch 3/100
21/21 - 9s - loss: 0.2799 - val_loss: 0.2834 - 9s/epoch - 415ms/step
Epoch 4/100
21/21 - 8s - loss: 0.2811 - val_loss: 0.2863 - 8s/epoch - 401ms/step
Epoch 5/100
21/21 - 9s - loss: 0.2796 - val_loss: 0.2834 - 9s/epoch - 439ms/step
Epoch 6/100
21/21 - 11s - loss: 0.2793 - val_loss: 0.2842 - 11s/epoch - 530ms/step
Epoch 7/100
21/21 - 10s - loss: 0.2784 - val_loss: 0.2869 - 10s/epoch - 453ms/step
Epoch 8/100
21/21 - 8s - loss: 0.2780 - val_loss: 0.2805 - 8s/epoch - 397ms/step
Epoch 9/100
21/21 - 9s - loss: 0.2755 - val_loss: 0.2788 - 9s/epoch - 406ms/step
Epoch 10/100
21/21 - 9s - loss: 0.2730 - val_loss: 0.2771 - 9s/epoch - 418ms/step
Epoch 11/100
21/21 - 9s - loss: 0.2709 - val_loss: 0.2755 - 9s/epoch - 424ms/step
Epoch 12/100
21/21 - 9s - loss: 0.2684 - val_loss: 0.2734 - 9s/epoch - 417ms/step
Epoch 13/100
21/21 - 

## Results

Running our Experiment yielded the following results:

| Configuration             | Min  | Max  | Mean  | St. Dev |
| ------------------------- | ---- | ---- | ----- | ------- |
| No Configuration (again!) | 0.12 | 0.24 | 0.194 | 0.032   |
| Regularization            | 0.13 | 0.19 | 0.175 | 0.018   |
| Dropout                   | 0.10 | 0.23 | 0.177 | 0.047   |
| Batch Normalization       | 0.07 | 0.16 | 0.117 | 0.026   |
| Weight Constraints        | 0.16 | 0.21 | 0.184 | 0.016   |


## Interpretation

The results mirror those of Experiment 1. Most of the configurations we tested do not seem to make a big difference to the hamming score. The exception is batch normalization, which worsens the score significantly.

## Conclusion

In future experiments, we will use regularization, dropout and weight constraints. We will do so mainly in recognition of good practice, as this experiment gave no reason to assume that they will improve matters drastically.

# Conclusions

The experiments performed in this notebook showed that we can improve the generalization of our models by using image generators to increase the variety of input images. The network configuration measures we adopted (regularization, dropout and weight constraints) did not improve results but also did not worsen them significantly.

As an outcome of the experiments, we derive the following knowledge for further experiments:
- Experiment 1 and 4: We will use regularization, dropout and weight constraints for our models.
- Experiment 2: We will use image generators with rotation and shift for image data preprocessing.
- Experiment 3: We will focus on models in the low-complexity end of CNNs. This means we will test:
    - no more than 2 convolutional layers
    - no more than 1000 neurons in fully connected layers
    - no more than 2 fully connected layers.

We will use these for future experiments.

In addition, we noted that even with 10 runs, the numbers can differ significantly when rerunning an experiment. For this reason, we will use a higher number of runs in the future. 
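
How many runs are needed can be estimated from the standard error of the mean, which shrinks with the square root of the number of runs. The following sketch assumes independent runs and uses the standard deviation of the shift rerun from Experiment 2 (0.018). The target standard error of 0.003 is a hypothetical value chosen for illustration, and `runs_needed` is not part of the project code.

```python
import math


def standard_error(st_dev, n_runs):
    """Standard error of the mean over n_runs independent runs."""
    return st_dev / math.sqrt(n_runs)


def runs_needed(st_dev, target_standard_error):
    """Smallest number of runs n with st_dev / sqrt(n) <= target_standard_error."""
    # Rounding guards against floating-point noise pushing an exact result
    # just above an integer before the ceiling is taken.
    return math.ceil(round((st_dev / target_standard_error) ** 2, 9))


current_se = standard_error(0.018, 10)
print(round(current_se, 4))        # 0.0057
print(runs_needed(0.018, 0.003))   # 36
```

Because of the square-root relationship, halving the standard error requires four times as many runs.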