# FeedForward 🥸

In this notebook we explore the use of a multilayer perceptron (MLP) neural network for this classification task. We use three different architectures: a baseline network with two hidden layers and 1100 hidden neurons in total, and two variations obtained by doubling either the number of neurons or the number of layers.

### Libraries

Import all the required libraries.

In [None]:
import utilities as ff


In [None]:
import tensorflow as tf

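# Import only the names this notebook uses instead of wildcard imports.
from keras.layers import Dense, Flatten
from keras.models import Sequential
from keras.losses import SparseCategoricalCrossentropy
from keras.optimizers import Adam
from keras.utils import plot_model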


### Dataset

Set the parameters.

In [None]:
img_size = 100
color_mode = "grayscale"
epochs = 20


Split the dataset into the examples used for hyperparameter tuning and those used for risk estimation.

In [None]:
train, test = ff.ready_to_be_used_dataset(
    image_size=img_size,
    color_mode=color_mode,
)


Compute the number of batches used for training during hyperparameter tuning (four fifths of the total); the remaining batches are used for validation.

In [None]:
true_train_size = len(train) * 4 // 5
true_train_size


Split the training part into training and validation sets.

In [None]:
train, valid = train.take(true_train_size), train.skip(true_train_size)
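
The `take`/`skip` split above works on whole batches, in order. A minimal pure-Python sketch of the same logic, using a hypothetical list of batch identifiers in place of the real dataset, could look like this:

```python
from itertools import islice


def split_batches(batches, numerator=4, denominator=5):
    """Split a sequence of batches into (train, valid), like take()/skip()."""
    batches = list(batches)
    # Integer division mirrors `len(train) * 4 // 5` in the cell above.
    train_size = len(batches) * numerator // denominator
    train_part = list(islice(batches, train_size))        # like .take(n)
    valid_part = list(islice(batches, train_size, None))  # like .skip(n)
    return train_part, valid_part


# Hypothetical stand-in for a batched dataset: 12 batch identifiers.
toy_batches = [f"batch_{i}" for i in range(12)]
train_part, valid_part = split_batches(toy_batches)
print(len(train_part), len(valid_part))
```

One caveat worth checking: if the underlying dataset is reshuffled on every iteration before `take`/`skip` are applied, the two parts can exchange examples between epochs. Whether that applies here depends on how `ff.ready_to_be_used_dataset` builds the dataset, which is not shown in this notebook.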


Check the dimensions of the data.

In [None]:
image_batch, labels_batch = next(iter(valid))
print(
    f"Size of a batch of images {image_batch.shape}",
    f"Size of a batch of labels {labels_batch.shape}",
)


### Baseline model

#### Model definition

In [None]:
layer_list = [
    Flatten(input_shape=(img_size, img_size, 1)),
    Dense(1000, activation="sigmoid"),
    Dense(100, activation="sigmoid"),
    Dense(2, activation="softmax"),
]

model = Sequential(layer_list)
model.summary()


In [None]:
plot_model(
    model,
    show_shapes=True,
    show_dtype=False,
    show_layer_names=True,
    rankdir="TB",
    expand_nested=False,
    dpi=96,
    layer_range=None,
    show_layer_activations=True,
)


In [None]:
model.compile(
    optimizer=Adam(),
    loss=SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)
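
The model ends in a softmax layer and is trained with a sparse categorical cross-entropy loss, which expects integer class labels. As a rough illustration of what that combination computes for a single example (a plain-Python sketch, not the Keras implementation, which also averages over the batch):

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probs, label):
    """Negative log-probability assigned to the integer class `label`."""
    return -math.log(probs[label])


# Toy two-class scores, chosen only for illustration.
probs = softmax([2.0, 0.0])
loss_correct = sparse_categorical_crossentropy(probs, 0)
loss_wrong = sparse_categorical_crossentropy(probs, 1)
print(probs, loss_correct, loss_wrong)
```

The loss is small when the true class receives a high probability and grows as that probability approaches zero.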


#### Hyperparameter tuning

In [None]:
history = model.fit(train, validation_data=valid, epochs=epochs)


In [None]:
ff.performance_plot(history)

**UNDERFITTING!!**

Try another activation function.
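
One common explanation for a sigmoid network that fits poorly is gradient saturation: the sigmoid derivative is at most 0.25 and shrinks quickly for large inputs, while the ReLU derivative stays at 1 for positive inputs. This notebook does not verify that this is the cause here; the sketch below only illustrates the general effect.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the logistic function: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU (taken as 0 at x == 0)."""
    return 1.0 if x > 0 else 0.0


for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")
```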

In [None]:
layer_list = [
    Flatten(input_shape=(img_size, img_size, 1)),
    Dense(1000, activation="relu"),
    Dense(100, activation="relu"),
    Dense(2, activation="softmax"),
]
model = Sequential(layer_list)


In [None]:
model.compile(
    optimizer=Adam(),
    loss=SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)


In [None]:
history = model.fit(train, validation_data=valid, epochs=epochs)


In [None]:
ff.performance_plot(history)

The model seems to overfit after epoch 11 or 12.
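
The point where overfitting starts can also be read off numerically as the epoch with the lowest validation loss. A small sketch, assuming the per-epoch values are available as a plain list (in Keras they are typically found in `history.history["val_loss"]`); the numbers below are made up for illustration and are not results from this notebook:

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1


# Made-up numbers for illustration only; NOT results from this notebook.
toy_val_loss = [0.70, 0.62, 0.55, 0.51, 0.49, 0.50, 0.53, 0.58]
print(best_epoch(toy_val_loss))
```

Keras also offers an `EarlyStopping` callback that stops training once the monitored validation metric stops improving, which would be an alternative to picking the epoch by eye.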

Let's modify the learning rate!

In [None]:
learning_rate = 1e-6

In [None]:
layer_list = [
    Flatten(input_shape=(img_size, img_size, 1)),
    Dense(1000, activation="relu"),
    Dense(100, activation="relu"),
    Dense(2, activation="softmax"),
]
model = Sequential(layer_list)


In [None]:
model.compile(
    optimizer=Adam(learning_rate=learning_rate),
    loss=SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)


In [None]:
history = model.fit(train, validation_data=valid, epochs=epochs)


In [None]:
ff.performance_plot(history)

With a much lower learning rate the neural network reaches its best performance faster and the learning curve is smoother.

This suggests that the default learning rate caused a lot of oscillation around the minimum; with a learning rate of 0.000001 we obtain better exploitation!
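
The link between step size and oscillation can be seen on a toy problem. The sketch below uses plain gradient descent on f(x) = x², not Adam (which adapts its step sizes), so it is only an analogy for the behaviour observed above; the learning rates are chosen for the toy function and are unrelated to the values used in this notebook.

```python
def gradient_descent(lr, x0=1.0, steps=10):
    """Minimise f(x) = x**2 with plain gradient descent; return visited points."""
    xs = [x0]
    for _ in range(steps):
        grad = 2.0 * xs[-1]            # derivative of x**2
        xs.append(xs[-1] - lr * grad)  # standard update rule
    return xs


def sign_changes(xs):
    """Count how often consecutive iterates jump across the minimum at 0."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0)


large = gradient_descent(lr=0.9)  # overshoots the minimum on every step
small = gradient_descent(lr=0.1)  # approaches the minimum from one side
print(sign_changes(large), sign_changes(small))
```

With the large step the iterate flips sides on every update, while the small step gives a smooth, monotone approach.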

#### Risk estimation

In [None]:
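# Same architecture as the baseline tuned above (two hidden layers: 1000 and 100 units).
layer_list = [
    Flatten(input_shape=(img_size, img_size, 1)),
    Dense(1000, activation="relu"),
    Dense(100, activation="relu"),
    Dense(2, activation="softmax"),
]
def_model = Sequential(layer_list)


In [None]:
# Pass the tuned learning rate, as in the other risk estimation cells.
accuracies = ff.k_fold_cross_validation(
    "DEF",
    dataset=test,
    model_name=layer_list,
    learning_rate=learning_rate,
    epochs=epochs,
)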


avg_loss = 1 - sum(accuracies) / len(accuracies)
print("Risk estimation (average zero one loss): ", avg_loss)
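
The implementation of `ff.k_fold_cross_validation` is not shown in this notebook. As a rough sketch of the general idea, assuming contiguous folds and one accuracy value per fold, the fold construction and the zero-one risk could be written as follows (the accuracies are made up for illustration):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def zero_one_risk(accuracies):
    """Average zero-one loss, i.e. one minus the mean accuracy over folds."""
    return 1 - sum(accuracies) / len(accuracies)


folds = list(k_fold_indices(n_samples=10, k=3))
print([len(test_idx) for _, test_idx in folds])

# Made-up per-fold accuracies, for illustration only.
toy_risk = zero_one_risk([0.90, 0.80, 1.00])
print(toy_risk)
```

Each example is used for testing exactly once, so the averaged zero-one loss is an estimate of the risk of the learning procedure rather than of a single trained model.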


### Many neurons model

#### Model definition

Double the number of hidden neurons in each layer.

In [None]:
layer_list = [
    Flatten(input_shape=(img_size, img_size, 1)),
    Dense(2000, activation="relu"),
    Dense(200, activation="relu"),
    Dense(2, activation="softmax"),
]

model = Sequential(layer_list)
model.summary()


In [None]:
plot_model(
    model,
    show_shapes=True,
    show_dtype=False,
    show_layer_names=True,
    rankdir="TB",
    expand_nested=False,
    dpi=96,
    layer_range=None,
    show_layer_activations=True,
)


In [None]:
model.compile(
    optimizer=Adam(learning_rate=learning_rate),
    loss=SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)


#### Hyperparameter tuning

In [None]:
history = model.fit(train, validation_data=valid, epochs=epochs)


In [None]:
ff.performance_plot(history)

#### Risk estimation


### Many layers model

#### Model definition

In [None]:
layer_list = [
    Flatten(input_shape=(img_size, img_size, 1)),
    Dense(750, activation="relu"),
    Dense(250, activation="relu"),
    Dense(75, activation="relu"),
    Dense(25, activation="relu"),
    Dense(2, activation="softmax"),
]
model = Sequential(layer_list)
model.summary()
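
To compare the size of the three architectures defined in this notebook, the number of trainable parameters can be counted by hand: a dense layer with `fan_in` inputs and `units` outputs has `fan_in * units` weights plus `units` biases, and the flatten layer has none. The helper below is an illustrative sketch (its name is not part of the notebook); its totals should agree with what `model.summary()` reports.

```python
def dense_param_count(input_dim, layer_sizes):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for units in layer_sizes:
        total += fan_in * units + units  # weight matrix + bias vector
        fan_in = units
    return total


input_dim = 100 * 100 * 1  # img_size x img_size x 1 grayscale channel

architectures = {
    "baseline": [1000, 100, 2],
    "many neurons": [2000, 200, 2],
    "many layers": [750, 250, 75, 25, 2],
}

for name, sizes in architectures.items():
    hidden = sum(sizes[:-1])  # every layer except the softmax output
    params = dense_param_count(input_dim, sizes)
    print(f"{name:13s} hidden units: {hidden:5d}  parameters: {params:,}")
```

Note that the many layers model keeps the same 1100 hidden units as the baseline but has fewer parameters, because its first layer, which dominates the count, is narrower.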


In [None]:
plot_model(
    model,
    show_shapes=True,
    show_dtype=False,
    show_layer_names=True,
    rankdir="TB",
    expand_nested=False,
    dpi=96,
    layer_range=None,
    show_layer_activations=True,
)


In [None]:
model.compile(
    optimizer=Adam(learning_rate=learning_rate),
    loss=SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)


#### Hyperparameter tuning

In [None]:
history = model.fit(train, validation_data=valid, epochs=epochs)


In [None]:
ff.performance_plot(history)

#### Risk estimation

In [None]:
accuracies = ff.k_fold_cross_validation(
    "MANY_L",
    test,
    layer_list,
    learning_rate=learning_rate,
    epochs=epochs,
)

avg_loss = 1 - sum(accuracies) / len(accuracies)
print("Risk estimation (average zero one loss): ", avg_loss)
