### ***Convolutional Neural Network - Data Augmentation***

#### ***Initial operations***

First, we carried out some setup operations, such as connecting to our Google Drive folder and importing the libraries and files used in the project.

In [None]:
import pandas as pd
import numpy as np
import importlib
import itertools
import csv
import sys
import os

from iterstrat.ml_stratifiers import MultilabelStratifiedKFold

from tensorflow import keras
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from tensorflow.keras.models import load_model
from keras.regularizers import l2

parent_folder = os.path.abspath('../../')
sys.path.append(parent_folder)

from src.utils import text_vectorization
# importlib.reload(text_vectorization)

from src.utils import embedding
# importlib.reload(embedding)

from src.utils import kfold_cv
# importlib.reload(kfold_cv)

In [None]:
network_name = 'conv-aug'
model_name = 'conv'

#### ***Training and test set***

In this first stage, we have:
- Read the **training and test set**;
- Calculated the **number of unique categories**, so the number of classes in the text classification;
- Converted the labels associated with the articles to a **one-hot encoding representation**, which is standard practice for multi-class text classification tasks.

In [None]:
train_set = pd.read_csv('../../data/augmented/train-set-cat1-augmented.csv')

In [None]:
test_set = pd.read_csv('../../data/processed/test-set-cat1-processed.csv')

# Number of different categories
number_of_categories = len(train_set['label'].unique())

# One-hot encoding of the labels
label_train = to_categorical(train_set['label'], num_classes = number_of_categories, dtype = 'int64')
label_test = to_categorical(test_set['label'], num_classes = number_of_categories, dtype = 'int64')
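As an illustration of what the one-hot conversion produces, here is a minimal pure-Python sketch. The helper `one_hot` and the toy labels are hypothetical and only mimic the idea behind `to_categorical`; they are not the library implementation.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with a single 1 at the label's index."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        row = [0] * num_classes
        row[label] = 1
        rows.append(row)
    return rows


# Hypothetical integer category ids (not taken from the real dataset)
labels = [2, 0, 1, 2]

# Same idea as counting the unique labels of the training set
num_classes = len(set(labels))

encoded = one_hot(labels, num_classes)
print(encoded)  # [[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Note that counting unique labels only gives the right number of classes when the labels are the contiguous integers 0 to n-1 and every class appears in the training set.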

#### ***Text vectorization and embedding***

Firstly, the following **parameters** are defined:
- **Size of the vocabulary** to create;
- **Number of words** considered for each text (article);
- **Dimension of the embedding**.

In [None]:
vocabulary_size = 50000
words_per_sentence = 200
embedding_dim = 100
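To make the role of the first two parameters concrete, the following sketch shows how a vocabulary size and a fixed number of words per text typically interact. It is a simplified stand-in, not the project's `text_vectorization` module: the functions `build_vocabulary` and `vectorize` are hypothetical, and reserving index 0 for padding and index 1 for unknown words is an assumption modelled on how Keras's `TextVectorization` layer behaves in integer mode.

```python
from collections import Counter


def build_vocabulary(texts, vocabulary_size):
    """Keep the most frequent tokens; index 0 is padding, index 1 is unknown."""
    counts = Counter(token for text in texts for token in text.lower().split())
    kept = max(vocabulary_size - 2, 0)  # two slots are reserved
    most_common = [token for token, _ in counts.most_common(kept)]
    return ["", "[UNK]"] + most_common


def vectorize(text, vocabulary, words_per_sentence):
    """Map tokens to ids, then truncate or zero-pad to a fixed length."""
    index = {token: i for i, token in enumerate(vocabulary)}
    ids = [index.get(token, 1) for token in text.lower().split()]
    ids = ids[:words_per_sentence]
    return ids + [0] * (words_per_sentence - len(ids))


# Toy corpus, for illustration only
texts = ["the cat sat", "the cat", "the"]
vocabulary = build_vocabulary(texts, vocabulary_size=4)

print(vocabulary)                               # ['', '[UNK]', 'the', 'cat']
print(vectorize("the cat sat", vocabulary, 5))  # [2, 3, 1, 0, 0]
```

With the values used in this notebook, each article would become a sequence of exactly 200 integer ids drawn from a vocabulary of at most 50000 entries.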

Then, we have carried out two embedding approaches:

- **Keras vectorization and GloVe embedding**

  - The *vectorization* (and so the creation of the *vocabulary*) is carried out using the **Keras built-in function**, with the text vectorizer adapted on the training set;
  - For the *embedding matrix*, we have used a pre-trained solution, named **GloVe**, with 100 dimensions;
  - Finally, we have created the final **vectorized feature** for the training phase.

In [None]:
text_vectorizer_keras = text_vectorization.createTextVectorizer(vocabulary_size, words_per_sentence, train_set['text'])
vocabulary_keras = text_vectorizer_keras.get_vocabulary()

embedding_matrix_glove = embedding.buildEmbeddingMatrix(embedding_dim, vocabulary_keras)
embedding_layer_glove = embedding.createEmbeddingLayer(embedding_matrix_glove, None)

In [None]:
feature_train_glove = text_vectorization.textVectorization(train_set['text'], text_vectorizer_keras)

- **Word2Vec vectorization and embedding**

  - This strategy creates a text vectorizer, a vocabulary and an embedding using a *Word2Vec model* trained directly on our training set.

In [None]:
word2vec = text_vectorization.createTextVectorizerWord2Vec(train_set['text'], vocabulary_size, embedding_dim)
text_vectorizer_word2vec = word2vec['text_vectorizer']
vocabulary_word2vec = list(word2vec['vocabulary_embedding'].key_to_index)

embedding_matrix_word2vec = embedding.buildingEmbeddingMatrixWord2Vec(embedding_dim, vocabulary_word2vec, word2vec['vocabulary_embedding'])
embedding_layer_word2vec = embedding.createEmbeddingLayer(embedding_matrix_word2vec, None)

In [None]:
feature_train_word2vec = text_vectorization.textVectorizationWord2Vec(train_set['text'], text_vectorizer_word2vec, words_per_sentence)
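Both approaches end with an embedding matrix whose row `i` holds the vector of the word with index `i` in the vocabulary. The sketch below shows that alignment in plain Python. The function `build_embedding_matrix` and the toy vectors are hypothetical; the project's `embedding` module may differ, for example in how it handles words without a pre-trained vector (zeros are assumed here).

```python
def build_embedding_matrix(vocabulary, word_vectors, embedding_dim):
    """Return one row per vocabulary entry, aligned with the token ids."""
    matrix = []
    for token in vocabulary:
        vector = word_vectors.get(token)
        if vector is None or len(vector) != embedding_dim:
            # Assumption: words without a usable vector get an all-zero row
            vector = [0.0] * embedding_dim
        matrix.append(list(vector))
    return matrix


# Toy 2-dimensional vectors, for illustration only
word_vectors = {"the": [0.1, 0.2], "cat": [0.3, 0.4]}
vocabulary = ["", "[UNK]", "the", "cat", "dog"]

matrix = build_embedding_matrix(vocabulary, word_vectors, embedding_dim=2)
for token, row in zip(vocabulary, matrix):
    print(repr(token), row)
```

The matrix has `len(vocabulary)` rows and `embedding_dim` columns, which is the shape an embedding layer expects for its weights.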

#### ***Neural network architectures***

Here, we have defined a **set of Convolutional Neural Network architectures** (models), using different combinations of hyperparameters:

- *Embedding layer*: Glove or Word2Vec (as explained before);
- *Number of convolutional hidden layers*: 1, 2 or 3;
- *Number of filters in the convolutional layers*: 32, 64 or 128;
- *Kernel size of the convolutional layers*: 3, 4 or 5;
- *Regularization*: L2 regularization in the convolutional layers.

The following parameters keep the same value in every architecture:
- *Learning rate*: 0.5;
- *Batch size*: 512.

Below is the **list of neural network architectures**, each with a brief description:

##### **Neural network A - 3 Layers, Same Number of Filters, Decrease Kernel Size**
  - *Glove embedding*;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 5;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 4;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape = (words_per_sentence,), dtype = 'int64')

x = embedding_layer_glove(input_layer)

x = keras.layers.Conv1D(filters = 128, kernel_size = 5, activation = 'relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 4, activation = 'relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation = 'relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_A = keras.Model(input_layer, output_layer, name = model_name)

conv_network_A_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Glove',
    'regularization': False,

    'number_of_layers': 3,

    'layer1_filters': 128,
    'layer1_kernel_size': 5,

    'layer2_filters': 128,
    'layer2_kernel_size': 4,

    'layer3_filters': 128,
    'layer3_kernel_size': 3

}

del input_layer, x, output_layer
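Because `Conv1D` uses `'valid'` padding and a stride of 1 by default, each convolutional layer shortens the sequence by `kernel_size - 1` positions. The following sketch works through that arithmetic for the kernel sizes of network A, starting from 200 words per text. The helper names are hypothetical.

```python
def conv1d_output_length(input_length, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding (no padding)."""
    if input_length < kernel_size:
        raise ValueError("the kernel is longer than the input sequence")
    return (input_length - kernel_size) // stride + 1


def lengths_through_stack(input_length, kernel_sizes):
    """Sequence length before the first layer and after each layer."""
    lengths = [input_length]
    for kernel_size in kernel_sizes:
        lengths.append(conv1d_output_length(lengths[-1], kernel_size))
    return lengths


# Network A: kernel sizes 5, 4 and 3 applied to 200 positions
print(lengths_through_stack(200, [5, 4, 3]))  # [200, 196, 193, 191]
```

The global max pooling layer then collapses the remaining positions, so the dense layer only sees one value per filter regardless of these lengths.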

##### **Neural network B - 3 Layers, Same Number of Filters, Same Kernel Size**
  - *Glove embedding*;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape = (words_per_sentence,), dtype = 'int64')

x = embedding_layer_glove(input_layer)

x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_B = keras.Model(input_layer, output_layer, name = model_name)

conv_network_B_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Glove',
    'regularization': False,

    'number_of_layers': 3,

    'layer1_filters': 128,
    'layer1_kernel_size': 3,

    'layer2_filters': 128,
    'layer2_kernel_size': 3,

    'layer3_filters': 128,
    'layer3_kernel_size': 3

}

del input_layer, x, output_layer
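To compare the size of the architectures, it can help to count the trainable parameters of each convolutional layer: `kernel_size * input_channels * filters` weights plus one bias per filter. The sketch below applies this to network B, assuming the embedding layer outputs 100-dimensional vectors (the `embedding_dim` set earlier). The helper names are hypothetical.

```python
def conv1d_parameter_count(in_channels, filters, kernel_size, use_bias=True):
    """Trainable parameters of a single 1D convolutional layer."""
    weights = kernel_size * in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases


def stack_parameter_counts(embedding_dim, layers):
    """Parameter count of each layer in a stack of (filters, kernel_size)."""
    counts = []
    in_channels = embedding_dim
    for filters, kernel_size in layers:
        counts.append(conv1d_parameter_count(in_channels, filters, kernel_size))
        in_channels = filters  # the next layer reads this layer's filters
    return counts


# Network B: three layers with 128 filters and kernel size 3
network_b = [(128, 3), (128, 3), (128, 3)]
print(stack_parameter_counts(100, network_b))  # [38528, 49280, 49280]
```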

##### **Neural network C - 3 Layers, Glove versus Word2Vec**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_C = keras.Model(input_layer, output_layer, name = model_name)

conv_network_C_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 3,

    'layer1_filters': 128,
    'layer1_kernel_size': 3,

    'layer2_filters': 128,
    'layer2_kernel_size': 3,

    'layer3_filters': 128,
    'layer3_kernel_size': 3

}

del input_layer, x, output_layer

##### **Neural network D - 3 Layers, Increase Number of Filters**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 32 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 32, kernel_size = 3, activation='relu')(x)
x = keras.layers.Conv1D(filters = 64, kernel_size = 3, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_D = keras.Model(input_layer, output_layer, name = model_name)

conv_network_D_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 3,

    'layer1_filters': 32,
    'layer1_kernel_size': 3,

    'layer2_filters': 64,
    'layer2_kernel_size': 3,

    'layer3_filters': 128,
    'layer3_kernel_size': 3

}

del input_layer, x, output_layer

##### **Neural network E - 3 Layers, Increase Number of Filters, Kernel Size 5**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 32 and kernel size 5;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 5;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 5;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 32, kernel_size = 5, activation='relu')(x)
x = keras.layers.Conv1D(filters = 64, kernel_size = 5, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 5, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_E = keras.Model(input_layer, output_layer, name = model_name)

conv_network_E_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 3,

    'layer1_filters': 32,
    'layer1_kernel_size': 5,

    'layer2_filters': 64,
    'layer2_kernel_size': 5,

    'layer3_filters': 128,
    'layer3_kernel_size': 5

}

del input_layer, x, output_layer

##### **Neural network F - 3 Layers, With Regularization**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 32 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *With regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 32, kernel_size = 3, activation='relu', kernel_regularizer = l2(0.01))(x)
x = keras.layers.Conv1D(filters = 64, kernel_size = 3, activation='relu', kernel_regularizer = l2(0.01))(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu', kernel_regularizer = l2(0.01))(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_F = keras.Model(input_layer, output_layer, name = model_name)

conv_network_F_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': True,

    'number_of_layers': 3,

    'layer1_filters': 32,
    'layer1_kernel_size': 3,

    'layer2_filters': 64,
    'layer2_kernel_size': 3,

    'layer3_filters': 128,
    'layer3_kernel_size': 3

}

del input_layer, x, output_layer

##### **Neural network G - 2 Layers, Same Number of Filters, Decrease Kernel Size**
  - *Glove embedding*;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 4;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_glove(input_layer)

x = keras.layers.Conv1D(filters = 128, kernel_size = 4, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_G = keras.Model(input_layer, output_layer, name = model_name)

conv_network_G_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Glove',
    'regularization': False,

    'number_of_layers': 2,

    'layer1_filters': 128,
    'layer1_kernel_size': 4,

    'layer2_filters': 128,
    'layer2_kernel_size': 3,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network H - 2 Layers, Same Number of Filters, Same Kernel Size**
  - *Glove embedding*;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 5;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 5;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_glove(input_layer)

x = keras.layers.Conv1D(filters = 64, kernel_size = 5, activation='relu')(x)
x = keras.layers.Conv1D(filters = 64, kernel_size = 5, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_H = keras.Model(input_layer, output_layer, name = model_name)

conv_network_H_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Glove',
    'regularization': False,

    'number_of_layers': 2,

    'layer1_filters': 64,
    'layer1_kernel_size': 5,

    'layer2_filters': 64,
    'layer2_kernel_size': 5,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network I - 2 Layers, Glove versus Word2Vec**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 5;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 5;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 64, kernel_size = 5, activation='relu')(x)
x = keras.layers.Conv1D(filters = 64, kernel_size = 5, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_I = keras.Model(input_layer, output_layer, name = model_name)

conv_network_I_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 2,

    'layer1_filters': 64,
    'layer1_kernel_size': 5,

    'layer2_filters': 64,
    'layer2_kernel_size': 5,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network L - 2 Layers, Increase Number of Filters**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 64, kernel_size = 3, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_L = keras.Model(input_layer, output_layer, name = model_name)

conv_network_L_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 2,

    'layer1_filters': 64,
    'layer1_kernel_size': 3,

    'layer2_filters': 128,
    'layer2_kernel_size': 3,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network M - 2 Layers, Increase Number of Filters, Kernel Size 5**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 5;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 5;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 64, kernel_size = 5, activation='relu')(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 5, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_M = keras.Model(input_layer, output_layer, name = model_name)

conv_network_M_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 2,

    'layer1_filters': 64,
    'layer1_kernel_size': 5,

    'layer2_filters': 128,
    'layer2_kernel_size': 5,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network N - 2 Layers, With Regularization**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 3;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *With regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 64, kernel_size = 3, activation='relu', kernel_regularizer = l2(0.01))(x)
x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu', kernel_regularizer = l2(0.01))(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_N = keras.Model(input_layer, output_layer, name = model_name)

conv_network_N_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': True,

    'number_of_layers': 2,

    'layer1_filters': 64,
    'layer1_kernel_size': 3,

    'layer2_filters': 128,
    'layer2_kernel_size': 3,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network O - 1 Layer, Glove**
  - *Glove embedding*;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_glove(input_layer)

x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_O = keras.Model(input_layer, output_layer, name = model_name)

conv_network_O_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Glove',
    'regularization': False,

    'number_of_layers': 1,

    'layer1_filters': 128,
    'layer1_kernel_size': 3,

    'layer2_filters': None,
    'layer2_kernel_size': None,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network P - 1 Layer, Word2Vec**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 128, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_P = keras.Model(input_layer, output_layer, name = model_name)

conv_network_P_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 1,

    'layer1_filters': 128,
    'layer1_kernel_size': 3,

    'layer2_filters': None,
    'layer2_kernel_size': None,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network Q - 1 Layer, Increase Kernel Size**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 128 and kernel size 5;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 128, kernel_size = 5, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_Q = keras.Model(input_layer, output_layer, name = model_name)

conv_network_Q_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 1,

    'layer1_filters': 128,
    'layer1_kernel_size': 5,

    'layer2_filters': None,
    'layer2_kernel_size': None,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network R - 1 Layer, Decrease Number of Filters**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 3;
  - *Without regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 64, kernel_size = 3, activation='relu')(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_R = keras.Model(input_layer, output_layer, name = model_name)

conv_network_R_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': False,

    'number_of_layers': 1,

    'layer1_filters': 64,
    'layer1_kernel_size': 3,

    'layer2_filters': None,
    'layer2_kernel_size': None,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural network S - 1 Layer, With Regularization**
  - *Word2Vec embedding*;
  - *1 convolutional layer*: number of filters equal to 64 and kernel size 5;
  - *With regularization*.

In [None]:
input_layer = keras.Input(shape=(words_per_sentence,), dtype='int64')

x = embedding_layer_word2vec(input_layer)

x = keras.layers.Conv1D(filters = 64, kernel_size = 5, activation='relu', kernel_regularizer = l2(0.01))(x)

x = keras.layers.GlobalMaxPooling1D()(x)

x = keras.layers.Dropout(0.5)(x)

x = keras.layers.Dense(number_of_categories, activation = 'softmax')(x)
output_layer = x

conv_network_S = keras.Model(input_layer, output_layer, name = model_name)

conv_network_S_info = {

    'network': model_name,
    'data_aug': True,

    'embedding': 'Word2Vec',
    'regularization': True,

    'number_of_layers': 1,

    'layer1_filters': 64,
    'layer1_kernel_size': 5,

    'layer2_filters': None,
    'layer2_kernel_size': None,

    'layer3_filters': None,
    'layer3_kernel_size': None

}

del input_layer, x, output_layer

##### **Neural networks sets and other information**

In [None]:
network_models_set = [
    conv_network_A, conv_network_B, conv_network_C, conv_network_D,
    conv_network_E, conv_network_F, conv_network_G, conv_network_H,
    conv_network_I, conv_network_L, conv_network_M, conv_network_N,
    conv_network_O, conv_network_P, conv_network_Q, conv_network_R,
    conv_network_S
]

In [None]:
initial_weights = [
    conv_network_A.get_weights(), conv_network_B.get_weights(), conv_network_C.get_weights(), conv_network_D.get_weights(),
    conv_network_E.get_weights(), conv_network_F.get_weights(), conv_network_G.get_weights(), conv_network_H.get_weights(),
    conv_network_I.get_weights(), conv_network_L.get_weights(), conv_network_M.get_weights(), conv_network_N.get_weights(),
    conv_network_O.get_weights(), conv_network_P.get_weights(), conv_network_Q.get_weights(), conv_network_R.get_weights(),
    conv_network_S.get_weights()
]

In [None]:
network_info_set = [
    conv_network_A_info, conv_network_B_info, conv_network_C_info, conv_network_D_info,
    conv_network_E_info, conv_network_F_info, conv_network_G_info, conv_network_H_info,
    conv_network_I_info, conv_network_L_info, conv_network_M_info, conv_network_N_info,
    conv_network_O_info, conv_network_P_info, conv_network_Q_info, conv_network_R_info,
    conv_network_S_info
]
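The information dictionaries above all share the same keys, with unused layers set to `None`. As a possible way to reduce the repetition, the sketch below generates such a dictionary from a list of `(filters, kernel_size)` pairs. The function `make_network_info` is hypothetical and not part of the project code.

```python
def make_network_info(embedding, conv_layers, regularization=False,
                      network="conv", data_aug=True, max_layers=3):
    """Build an info dictionary with the same keys as the ones written by hand."""
    if not 1 <= len(conv_layers) <= max_layers:
        raise ValueError(f"expected between 1 and {max_layers} layers")

    info = {
        "network": network,
        "data_aug": data_aug,
        "embedding": embedding,
        "regularization": regularization,
        "number_of_layers": len(conv_layers),
    }

    # Missing layers are recorded as None so every dictionary has the same keys
    for position in range(1, max_layers + 1):
        if position <= len(conv_layers):
            filters, kernel_size = conv_layers[position - 1]
        else:
            filters, kernel_size = None, None
        info[f"layer{position}_filters"] = filters
        info[f"layer{position}_kernel_size"] = kernel_size

    return info


# Reproduces the dictionary of network G (two layers, GloVe embedding)
info_g = make_network_info("Glove", [(128, 4), (128, 3)])
print(info_g)
```

Keeping identical keys in every dictionary matters here because the CSV header is taken from the keys of the first result.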

#### ***K-Fold Cross Validation***

In this part of the project, we have implemented **K-Fold Cross Validation** as a strategy to find the **best hyperparameters** for the neural network and to obtain a **performance estimate** of the model on new, unseen data. Our approach follows this logic:

*   Firstly, we set the **number of epochs** to *30*. This is an upper bound on the number of epochs actually used to train the model, because we apply an *early stopping rule*: if the performance does not improve for 3 consecutive epochs, the training in that K-Fold iteration ends and we keep the epoch with the best performance as a hyperparameter;

*   The **number of folds K** has been set to *3* and a multi-label stratified approach has been carried out;

*   Since this is a text classification task with categorical labels, the **loss function** is the **categorical cross entropy**. Our goal is to **minimize** this loss in order to improve the performance of the model, so we use it as our performance proxy. We also take the **accuracy** into account;

*   To evaluate a single model (one combination of hyperparameters), we compute the **average performance** over the K iterations;

*   *The best network architecture is the one that leads to the lowest loss*;

*   *We write a CSV file with all the network architectures and the performance each one obtained in the K-Fold CV*.
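The early stopping rule described above can be sketched in a few lines of plain Python. This only illustrates the logic (stop after 3 epochs without improvement, keep the best epoch). The function name and the validation losses are hypothetical, and the project's `kfold_cv` module may implement the rule differently, for example through a Keras callback.

```python
def best_epoch_with_early_stopping(val_losses, patience=3):
    """Return (best_epoch, best_loss), stopping after `patience` bad epochs."""
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # later epochs are never trained

    return best_epoch, best_loss


# Hypothetical validation losses for one fold
val_losses = [0.90, 0.70, 0.65, 0.66, 0.70, 0.68, 0.60]
print(best_epoch_with_early_stopping(val_losses))  # (3, 0.65)
```

In this toy run the loss stops improving after epoch 3, so training halts at epoch 6 and the final value of 0.60 is never reached. This is the usual trade-off of a patience-based rule.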

In [None]:
epochs = 30
k_fold = 3
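Stratification means that every fold keeps roughly the same class proportions as the full training set. The sketch below shows a simplified single-label version that deals the samples of each class to the folds in turn. It is not the `MultilabelStratifiedKFold` algorithm imported at the top of the notebook, and it does not shuffle; the function name and the labels are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds, class by class, in round-robin order."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        for position, index in enumerate(by_class[label]):
            folds[position % k].append(index)
    return folds


# Toy labels: three classes with three samples each
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
folds = stratified_folds(labels, k=3)
print(folds)  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

Each fold in turn serves as the validation set while the remaining folds are used for training.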

In [None]:
kfold_results = []

# For each neural network architecture
for index, (model, info) in enumerate(zip(network_models_set, network_info_set)):

    # Print progress information
    print(f"Neural network {index}")

    # Select the training features that match the network's embedding
    if info['embedding'] == 'Glove':
        feature_train = feature_train_glove
    elif info['embedding'] == 'Word2Vec':
        feature_train = feature_train_word2vec
    else:
        raise ValueError(f"Unknown embedding: {info['embedding']}")

    # Perform the K-Fold Cross Validation
    kfold_result = kfold_cv.kfoldCrossValidation(k_fold, feature_train, label_train, model, info, epochs)
    kfold_results.append(kfold_result)

In [None]:
# Write the K-Fold CV results to a CSV file
with open('../../results/data/kfold-' + network_name + '.csv', mode = 'w', newline = '') as file:

    writer = csv.DictWriter(file, fieldnames = list(kfold_results[0].keys()))
    writer.writeheader()

    for row_data in kfold_results:
        writer.writerow(row_data)

#### ***Convolutional Neural Network - Final Architecture***

Here, we have created the **neural network architecture** model with the best hyperparameters found in the K-Fold Cross Validation.

In [None]:
# Get the best hyperparameters combination
best_loss = float('inf')
best_network = None
best_network_info = None

for index, element in enumerate(kfold_results):

    if(element['loss'] < best_loss):

        best_loss = element['loss']

        best_network = network_models_set[index]
        best_network.set_weights(initial_weights[index])

        best_network_info = network_info_set[index]
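The same selection can be written without an initial sentinel value by asking for the index of the smallest loss directly. This is a minimal sketch with hypothetical toy results. It assumes, as the loop above does, that each result dictionary has a `'loss'` key.

```python
def index_of_lowest_loss(results):
    """Return the position of the result with the smallest 'loss'."""
    if not results:
        raise ValueError("no K-Fold results to compare")
    return min(range(len(results)), key=lambda i: results[i]["loss"])


# Toy values, for illustration only
toy_results = [{"loss": 0.9}, {"loss": 0.4}, {"loss": 0.6}]

best_index = index_of_lowest_loss(toy_results)
print(best_index)  # 1
```

The returned index can then be used to look up the matching model, initial weights and information dictionary in the three parallel lists.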

In [None]:
# Compiling the network
best_network.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
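For reference, the categorical cross entropy used as the loss is the average, over the samples, of the negative log-probability that the network assigns to the true class. The sketch below computes it in plain Python on hypothetical values. The clipping constant `epsilon` is an assumption added to avoid taking the logarithm of zero.

```python
import math


def categorical_cross_entropy(one_hot_labels, predicted_probabilities, epsilon=1e-7):
    """Mean over samples of -sum(target * log(probability))."""
    total = 0.0
    for target, probabilities in zip(one_hot_labels, predicted_probabilities):
        total -= sum(
            t * math.log(max(p, epsilon))
            for t, p in zip(target, probabilities)
        )
    return total / len(one_hot_labels)


# Two hypothetical samples with three classes each
one_hot_labels = [[0, 1, 0], [1, 0, 0]]
predicted = [[0.1, 0.8, 0.1], [0.5, 0.25, 0.25]]

loss = categorical_cross_entropy(one_hot_labels, predicted)
print(round(loss, 3))  # 0.458
```

A confident correct prediction (0.8 for the true class) contributes little to the loss, while a hesitant one (0.5) contributes more. This is why a lower loss is taken as a sign of a better model.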

#### ***Training***

Here, we train the neural network model on all the training data and save the model as an H5 file.

In [None]:
# Training (fit the neural network on the features matching its embedding)
if best_network_info['embedding'] == 'Glove':
    feature_train = feature_train_glove
elif best_network_info['embedding'] == 'Word2Vec':
    feature_train = feature_train_word2vec
else:
    raise ValueError(f"Unknown embedding: {best_network_info['embedding']}")

training_history = best_network.fit(
    x = feature_train,
    y = label_train,
    batch_size = 512,
    epochs = int(best_network_info['best_number_epochs'])
)

In [None]:
best_network.save('../../results/models/' + network_name +'.h5')

#### ***Testing***

Testing the neural network model with the test set.

*   The text in the test set has been vectorized with the text vectorizer built on the training set (the Keras one for GloVe or the Word2Vec one, depending on the embedding of the best network), so that training and test features are consistent;

*   We have evaluated the performance using the **categorical cross entropy loss**, **global accuracy** and **single class accuracy**.

In [None]:
best_network = load_model('../../results/models/' + network_name +'.h5')

In [None]:
if(best_network_info['embedding'] == 'Glove'):

  feature_test_glove = text_vectorization.textVectorization(test_set['text'], text_vectorizer_keras)
  score = best_network.evaluate(feature_test_glove, label_test, verbose = 0)

if(best_network_info['embedding'] == 'Word2Vec'):

  feature_test_word2vec = text_vectorization.textVectorizationWord2Vec(test_set['text'], text_vectorizer_word2vec, words_per_sentence)
  score = best_network.evaluate(feature_test_word2vec, label_test, verbose = 0)

In [None]:
# Performance metrics
test_loss = round(score[0], 3)
test_accuracy = round(score[1], 3)
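The `evaluate` call above returns only the loss and the global accuracy. One way to obtain a single class accuracy is to compare true and predicted class ids class by class, as in the sketch below. The interpretation is an assumption: accuracy per class is read here as the fraction of test articles of that class that were predicted correctly. The function name and the toy ids are hypothetical; in practice the predicted ids would come from taking the most probable class of each model prediction.

```python
def per_class_accuracy(true_labels, predicted_labels, num_classes):
    """Fraction of correctly predicted samples for each true class."""
    correct = [0] * num_classes
    total = [0] * num_classes

    for true_label, predicted_label in zip(true_labels, predicted_labels):
        total[true_label] += 1
        if true_label == predicted_label:
            correct[true_label] += 1

    # Classes with no test samples have no defined accuracy
    return [c / t if t else None for c, t in zip(correct, total)]


# Toy class ids, for illustration only
true_labels = [0, 0, 1, 1, 2]
predicted_labels = [0, 1, 1, 1, 0]

print(per_class_accuracy(true_labels, predicted_labels, num_classes=3))
# [0.5, 1.0, 0.0]
```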

In [None]:
# Write the testing performance on the global final results CSV
with open('../../results/data/results.csv', mode = 'a', newline = '') as file:

    writer = csv.writer(file)
    writer.writerow([
        best_network_info['network'],
        best_network_info['embedding'],
        best_network_info['data_aug'],
        best_network_info['regularization'],
        best_network_info['number_of_layers'],
        test_loss,
        test_accuracy
    ])