**Group 22**

Name  | Surname | Email  
---------|-------------------|---------
Julio|Vigueras|20220661@novaims.unl.pt 
Ariel|Pérez|20220662@novaims.unl.pt
Miguelanguel|Mayuare|20220665@novaims.unl.pt
Ayotunde|Aribo|20221012@novaims.unl.pt

# Model handcrafted "B"
---

The previous model "A" overfitted very quickly. Because the model is already simple, increasing its complexity would only worsen the overfitting, so the first approach is to add regularization. The regularization technique used in this notebook is dropout. During training, a dropout layer sets a defined fraction of its input activations (not the weights themselves) to zero; the dropped units are selected at random on every training step.

*Srivastava, Nitish, et al. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting." Journal of Machine Learning Research, vol. 15, no. 1, 2014, pp. 1929-1958.*
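To make the mechanism concrete, the following is a minimal pure-Python sketch of "inverted" dropout, the variant in which surviving activations are scaled by `1 / (1 - rate)` during training so that their expected value is unchanged. The function name `inverted_dropout` and the toy input are illustrative assumptions, not part of the notebook or of Keras.

```python
import random


def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and rescale the survivors.

    Hypothetical helper for illustration only; it mimics what a dropout
    layer does at training time, not how Keras implements it internally.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)           # fixed seed so the sketch is repeatable
activations = [1.0] * 10_000     # toy input: every unit outputs 1.0
dropped = inverted_dropout(activations, rate=0.3, rng=rng)

zero_fraction = sum(1 for a in dropped if a == 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(f"fraction zeroed:    {zero_fraction:.3f}")
print(f"mean after dropout: {mean_after:.3f}")
```

Roughly 30% of the values are zeroed while the mean stays close to 1.0, which is why no rescaling is needed at inference time, when dropout is switched off.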

In [None]:
# Make the imports
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import load_model
from tensorflow.keras.utils import image_dataset_from_directory

import plotly.express as px
from plotly.subplots import make_subplots
import pandas as pd

import pathlib
import warnings

warnings.filterwarnings('ignore')

First, the dataset is downloaded. The architecture of the previous model is then redefined with the necessary modifications.

In [None]:
!wget https://www.dropbox.com/s/n3320qxwdn3rs19/moths.zip?dl=0 -O moths.zip
!unzip moths.zip



In [None]:
def base_dropout(blocks=4, input_shape=(224, 224, 3)):
    """Build the handcrafted CNN with dropout layers added.

    Note: `blocks` sets the filter count of the first convolution, 2**(blocks + 1).
    The network always has four Conv2D/MaxPooling2D/Dropout blocks.
    """
    inputs = keras.Input(shape=input_shape)
    x = layers.Rescaling(1./255)(inputs)
    for i in range(blocks + 1, blocks + 5):
        x = layers.Conv2D(filters=2**i, kernel_size=3, activation="relu")(x)
        x = layers.MaxPooling2D(pool_size=2)(x)
        # Dropout with a rate of 30% is applied to the pooled feature maps
        # before they are passed to the next convolution layer.
        x = layers.Dropout(0.3)(x)
    x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
    x = layers.Flatten()(x)
    # Dropout with a rate of 50% is applied to the flattened output of the
    # final convolution layer before it is passed to the output dense layer.
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(30, activation="softmax")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

In [None]:
model = base_dropout(3)
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 222, 222, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 111, 111, 16)     0         
 )                                                               
                                                                 
 dropout (Dropout)           (None, 111, 111, 16)      0         
                                                                 
 conv2d_1 (Conv2D)           (None, 109, 109, 32)      4640      
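
The rows shown in this summary can be reproduced by hand. The sketch below assumes the Keras defaults used by `base_dropout` (`padding="valid"` and stride 1 for `Conv2D`, stride equal to `pool_size` for `MaxPooling2D`) and traces the spatial size and parameter count of each layer. The helper names are hypothetical; dropout layers are omitted because they change neither the shape nor the parameter count.

```python
def conv_valid(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def max_pool(size, pool_size=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool_size


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of a Conv2D layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def trace_base_dropout(blocks=3, size=224, channels=3, classes=30):
    """Return (layer, spatial size, channels, params) rows for base_dropout."""
    rows = []
    for i in range(blocks + 1, blocks + 5):
        filters = 2 ** i
        params = conv_params(channels, filters)
        size = conv_valid(size)
        rows.append(("conv", size, filters, params))
        size = max_pool(size)
        rows.append(("pool", size, filters, 0))
        channels = filters
    params = conv_params(channels, 256)
    size = conv_valid(size)
    rows.append(("conv", size, 256, params))
    flat = size * size * 256
    rows.append(("dense", 1, classes, flat * classes + classes))
    return rows


rows = trace_base_dropout(blocks=3)
for name, size, channels, params in rows:
    print(f"{name:5s} size={size:3d} channels={channels:3d} params={params}")
```

The first and third rows agree with the `conv2d` and `conv2d_1` lines above (222 and 109 pixels, 448 and 4640 parameters); the remaining rows are this sketch's arithmetic and were not checked against the original output.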
                                                             


In [None]:
# Compile model
model.compile(loss="sparse_categorical_crossentropy", 
              optimizer="adam", 
              metrics=["accuracy"])
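
As a reading aid for the loss values reported during training, here is a minimal sketch of what sparse categorical cross-entropy computes for a single example: the negative log of the probability assigned to the true class, with the label given as an integer index. The function below is a hypothetical pure-Python stand-in, not the Keras implementation (which also averages over the batch and clips probabilities).

```python
import math


def sparse_categorical_crossentropy(probabilities, true_class):
    """Per-example loss: -log of the probability given to the true class."""
    return -math.log(probabilities[true_class])


num_classes = 30  # the dataset in this notebook has 30 classes

# A model that guesses uniformly assigns 1/30 to every class.
uniform = [1.0 / num_classes] * num_classes
chance_loss = sparse_categorical_crossentropy(uniform, true_class=7)

# A model that puts 90% of the probability on the correct class.
confident = [0.9] + [0.1 / (num_classes - 1)] * (num_classes - 1)
confident_loss = sparse_categorical_crossentropy(confident, true_class=0)

print(f"chance-level loss: {chance_loss:.4f}")
print(f"confident loss:    {confident_loss:.4f}")
```

The chance-level value, `ln(30)`, is a useful baseline: a loss near it means the model is doing no better than guessing uniformly among the 30 classes.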

In [None]:
dataset_path = pathlib.Path("moths")
input_shape = (224, 224, 3)

In [None]:
# Load the train, validation, and test splits
train_dataset = image_dataset_from_directory(
    dataset_path / "train",
    image_size=input_shape[:2],
    batch_size=64)
validation_dataset = image_dataset_from_directory(
    dataset_path / "valid",
    image_size=input_shape[:2],
    batch_size=64)
test_dataset = image_dataset_from_directory(
    dataset_path / "test",
    image_size=input_shape[:2],
    batch_size=64)

Found 3558 files belonging to 30 classes.
Found 445 files belonging to 30 classes.
Found 408 files belonging to 30 classes.


In [None]:
# Callbacks
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="saved_models/model_handcrafted_B.keras",
        save_best_only=True,
        monitor="val_loss"),
    keras.callbacks.EarlyStopping(
        monitor="val_loss",
        patience=15)
]

In [None]:
# Train model (the datasets are already batched, so batch_size is not passed here)
history = model.fit(
    train_dataset,
    epochs=100,
    validation_data=validation_dataset,
    callbacks=callbacks)

Epoch 1/100




Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
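
The log stops at epoch 27 of 100, which is consistent with the `EarlyStopping` callback halting training. The sketch below is a simplified model of the patience rule, assuming the default `min_delta=0` and that any strictly lower `val_loss` counts as an improvement. The loss values are made up for illustration and are not the ones from this run.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch after which training would stop.

    Simplified stand-in for the patience logic: stop once `patience`
    consecutive epochs have passed without a new best validation loss.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)


# Hypothetical curve: improves for 12 epochs, then stays worse than the best.
val_losses = [3.0 - 0.1 * e for e in range(12)] + [2.5] * 88
stopped_at = early_stopping_epoch(val_losses, patience=15)
print(f"training stops after epoch {stopped_at}")
```

Under these assumptions, stopping after epoch 27 with `patience=15` implies that the last improvement in `val_loss` happened at epoch 12, which is the checkpoint `ModelCheckpoint` would have kept.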


In [None]:
# Visualization
hist_df = pd.DataFrame(history.history)
loss = px.scatter(hist_df['loss'])
val_loss = px.line(hist_df['val_loss'])
accuracy = px.scatter(hist_df['accuracy'])
val_accuracy = px.line(hist_df['val_accuracy'])

fig = make_subplots(cols=2, rows=1, subplot_titles=("Loss", "Accuracy"))
fig.add_trace(loss.data[0], col=1, row=1)
fig.add_trace(val_loss.data[0], col=1, row=1)
fig.add_trace(accuracy.data[0], col=2, row=1)
fig.add_trace(val_accuracy.data[0], col=2, row=1)

fig.show()

![Accuracy and loss](https://www.dropbox.com/s/4k2kkpzd5txugj9/modelb.png?raw=1)

Compared with the plots for the previous model, these curves are much better: model B takes longer to start overfitting.

Below are both models compared with the test set.

In [None]:
model_A = load_model("saved_models/model_handcrafted_A.keras")
model_B = load_model("saved_models/model_handcrafted_B.keras")

In [None]:
_, model_A_acc = model_A.evaluate(test_dataset)
_, model_B_acc = model_B.evaluate(test_dataset)

print(f"Model A (without regularization): {model_A_acc * 100:.2f}% of accuracy\n"
      f"Model B (with regularization): {model_B_acc * 100:.2f}% of accuracy")



Model A (without regularization): 69.85% of accuracy
Model B (with regularization): 64.95% of accuracy


---
### Model comparison
Model A shows better test accuracy (69.85% versus 64.95%), but model B takes longer to start overfitting.
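
To put the gap in perspective, the reported accuracies can be converted back into image counts on the 408-image test set and given a rough binomial standard error. This is a back-of-the-envelope sketch: the standard-error formula treats each test image as an independent trial, and the variable names are illustrative.

```python
import math

n_test = 408     # test images found by image_dataset_from_directory
acc_a = 0.6985   # Model A, without regularization
acc_b = 0.6495   # Model B, with dropout


def standard_error(accuracy, n):
    """Binomial standard error of an accuracy measured on n examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


correct_a = round(acc_a * n_test)
correct_b = round(acc_b * n_test)
se_a = standard_error(acc_a, n_test)
se_b = standard_error(acc_b, n_test)

print(f"Model A: {correct_a}/{n_test} correct, standard error {se_a * 100:.1f} points")
print(f"Model B: {correct_b}/{n_test} correct, standard error {se_b * 100:.1f} points")
print(f"Difference: {correct_a - correct_b} images")
```

The difference amounts to 20 test images, and each accuracy carries a standard error of a little over two percentage points, so the advantage of model A on a test set of this size should be read with some caution.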