# Multilayer Perceptron (MLP) <a name="multilayer"></a>


We are going to use a **Multilayer Perceptron (MLP)** because it is a flexible neural network architecture that is well suited to **classification problems**.

For this model we will define the architecture and the training strategy, which consist of:
- **Number of layers**
- **Number of neurons of each layer**
- **Choice of the activation functions**
- **Optimizer** 
- **Learning hyperparameters** (e.g., learning rate, mini-batch size, number of epochs, etc.)
- **Regularization techniques to adopt** (e.g., early stopping, weight regularization, dropout)

The network works by processing data through **multiple layers**, with each layer learning to capture different features of the input data.
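
To make the layer-by-layer processing concrete, here is a minimal sketch of a forward pass in plain Python. The input, weights and biases are made-up toy values, not taken from our model, and the helper names (`dense`, `relu`, `softmax`) are illustrative only.

```python
import math

def dense(x, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(w_row, x)) + b
        outputs.append(activation(z))
    return outputs

def relu(z):
    return max(0.0, z)

def softmax(zs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: 2 input features -> 2 hidden neurons -> 3 classes
x = [1.0, -2.0]
hidden = dense(x, weights=[[0.5, 0.1], [-0.3, 0.8]], biases=[0.0, 0.1], activation=relu)
logits = dense(hidden, weights=[[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]],
               biases=[0.0, 0.0, 0.0], activation=lambda z: z)
probs = softmax(logits)
print(probs)
```

Each hidden layer transforms the output of the previous one, and the final softmax layer converts the last set of scores into class probabilities.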

### Model architecture definition and Training Strategy <a name="archi-train"></a>
[[go back to the top]](#multilayer)

As mentioned above, the architecture of our MLP is defined by the number of layers, the number of neurons in each layer, and the choice of activation functions, such as ReLU, softmax and tanh.

We used **dictionaries** to organize and store different options for **hyperparameters**. This allows us to easily experiment with different configurations and manage the settings efficiently.
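
As an illustration of this idea, the sketch below expands a dictionary of options into the full list of configurations with `itertools.product`. The grid shown is a small example and the function name `expand_grid` is hypothetical; it is not the exact grid used in our experiments.

```python
import itertools

def expand_grid(grid):
    """Return one dict per combination of the option lists in `grid`."""
    keys = list(grid)
    value_lists = [grid[k] for k in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]

# Small example grid (illustrative values)
example_grid = {
    "dropout_rate": [0.3, 0.5],
    "learning_rate": [0.001, 0.0001],
    "regularization": [None, "l1", "l2"],
}

configs = expand_grid(example_grid)
print(len(configs))  # 2 * 2 * 3 = 12 combinations
print(configs[0])
```

The number of configurations is the product of the list lengths, so every added option multiplies the cost of the search.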

To optimize our model, we ran a grid search to **select the best hyperparameter combination in a first iteration**, training each configuration for one epoch only to keep the search cheap. In other words, we first test several combinations of hyperparameters to find the one that performs best. This lets us quickly narrow down the most promising model for our task and improve the **accuracy** of the predictions.

Additionally, we will use the **Adam** optimizer, a popular choice for training neural networks because it adapts the learning rate during training.
We also implemented **early stopping** to prevent overfitting by monitoring the model's performance and halting training when it stops improving.
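
The patience rule behind early stopping can be sketched in plain Python as follows. This is a simplified illustration and not the Keras implementation (which has further options such as `min_delta`); the function name and the validation losses are made up for the example.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once `patience` consecutive epochs have passed
    without the validation loss improving on its best value.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Ran out of epochs before the patience was exhausted
    return len(val_losses) - 1, best_epoch

# Made-up validation losses for illustration
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch, best_epoch)
```

Note that when a run lasts only one epoch, a patience of 5 can never be exhausted, so early stopping only becomes relevant in the longer training runs.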

In this way, the process of **testing and updating** in the first iteration helps us fine-tune the model efficiently, and **selecting the best combination** ensures we are using the most effective settings for our dataset.

The following table defines the possible combinations of hyperparameters we tested:

| <span style="color: #C70039;">**Hyperparameter**</span> | <span style="color: #C70039;">**Options**</span>        |
|-----------------------------------------------------|-------------------------------------------------------|
| <span style="color: #00bfae;">**Hidden Units**</span> | [128, 64, 32], [256, 128, 64], [256, 128, 64, 32]                |
| <span style="color: #00bfae;">**Activation Functions**</span> | ReLU, sigmoid, tanh                             |
| <span style="color: #00bfae;">**Dropout Rate**</span> | 0.3, 0.5                                          |
| <span style="color: #00bfae;">**Batch Size**</span>   | 32                                               |
| <span style="color: #00bfae;">**Epochs**</span>       | 1                                                 |
| <span style="color: #00bfae;">**Regularization**</span>       | None, L1 (Lasso), L2 (Ridge)                                            |
| <span style="color: #00bfae;">**Learning Rate**</span> | 0.001, 0.0001                                     |


<span style="color: #C70039;">**Note:**</span>
- <span style="color: #00bfae;">**Hidden Units**</span> specifies the number of hidden layers and the number of neurons in each layer. For example, [256, 128, 64] defines 3 hidden layers with 256, 128 and 64 neurons, respectively.
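
To give a feel for what a hidden-units list implies, the sketch below counts the trainable parameters (weights plus biases) of a fully connected network. The input dimension of 40 is an assumption made only for this example, since the real number of features depends on the dataset; the output dimension of 10 matches the number of classes used in the code, and `count_parameters` is a hypothetical helper.

```python
def count_parameters(input_dim, hidden_units, output_dim):
    """Count weights and biases of a fully connected network."""
    sizes = [input_dim, *hidden_units, output_dim]
    # Each layer has (n_in * n_out) weights plus n_out biases
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))

# input_dim=40 is an assumed value for illustration only
small = count_parameters(40, [128, 64, 32], 10)
large = count_parameters(40, [256, 128, 64], 10)
print(small, large)  # 15914 52298
```

Under these assumptions, the larger layout has roughly three times as many parameters as the smaller one, which is part of why dropout and weight regularization are included in the search.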


In [8]:
import tensorflow as tf
import numpy as np
import pandas as pd
import itertools
from pathos.multiprocessing import Pool

# MLP class
class MLP(tf.keras.Model):
    def __init__(self, input_dim, output_dim, hidden_units, dropout_rate, activations):
        super(MLP, self).__init__()
        # One Dense + Dropout pair per hidden layer
        self.hidden_layers = []
        for units, activation in zip(hidden_units, activations):
            self.hidden_layers.append(tf.keras.layers.Dense(units, activation=activation))
            self.hidden_layers.append(tf.keras.layers.Dropout(dropout_rate))
        self.output_layer = tf.keras.layers.Dense(output_dim, activation='softmax')  # Multi-class classification

    def call(self, inputs):
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        return self.output_layer(x)
    
    
def generate_configs(configurations):
    keys, values = zip(*configurations.items())
    return [dict(zip(keys, v)) for v in itertools.product(*values)]

# Generate the hyperparameter combinations
import random

def generate_activation_combinations(hidden_units_list, activations_list=('sigmoid', 'relu', 'tanh'), num_combinations=4):
    """
    Generate random combinations of activation functions for a list of hidden-layer layouts.

    Args:
        hidden_units_list (list): List of lists giving the number of neurons per layer.
        activations_list (sequence): Available activation functions.
        num_combinations (int): Number of combinations to generate per layer layout.

    Returns:
        list: List of (hidden_units, activations) tuples with random activations.
    """
    activation_combinations = []
    for hidden_units in hidden_units_list:
        layers = len(hidden_units)
        # Generate random activation combinations for this number of layers
        for _ in range(num_combinations):
            random_combination = [random.choice(activations_list) for _ in range(layers)]
            activation_combinations.append((hidden_units, random_combination))  # Stored as a tuple
    return activation_combinations




hidden_units_list = [[128, 64, 32], [256, 128, 64], [64, 32], [512, 256, 128], [256, 128, 64, 32]]
#activation_functions = ['relu', 'sigmoid', 'tanh']  # Possible activation functions

# Combine hidden_units with activations
activation_configs = generate_activation_combinations(hidden_units_list)


# Load the data for a specific fold
def load_fold_data(fold_number, files):
    data = pd.read_csv(files[fold_number])
    labels = data.pop('Label').values  # Extract the labels directly
    features = data.values  # Extract the features as a NumPy array
    return features, labels

# Train and evaluate the model
def train_evaluate_model(config, X_train, y_train, X_val, y_val):
    model = MLP(input_dim=X_train.shape[1],
                output_dim=10,
                hidden_units=config['hidden_units'],
                dropout_rate=config['dropout_rate'],
                activations=config['activations'])
    
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=config['learning_rate']),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )
    
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=config['batch_size'],
        epochs=config['epochs'],
        callbacks=[early_stopping],
        verbose=0
    )
    
    return max(history.history['val_accuracy']) 

## Cross-validation, single iteration only (1 fold)
def cross_validate_model(config, files):
    # Use only the first fold for validation
    fold_number = 0
    X_val, y_val = load_fold_data(fold_number, files)
    X_train, y_train = [], []
    
    # Train on the remaining folds
    for i in range(len(files)):
        if i != fold_number:
            X_temp, y_temp = load_fold_data(i, files)
            X_train.append(X_temp)
            y_train.append(y_temp)
    
    X_train = np.concatenate(X_train, axis=0)
    y_train = np.concatenate(y_train, axis=0)
    
    # Train and evaluate for this fold
    accuracy = train_evaluate_model(config, X_train, y_train, X_val, y_val)
    return accuracy  # Accuracy for this single fold


# Function for parallel evaluation
def evaluate_config_parallel(args):
    config, files = args
    accuracy = cross_validate_model(config, files)
    print(f"Configuration: {config} | Accuracy: {accuracy}")
    return config, accuracy

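# Hyperparameter definitions
# Note: generate_configs takes the Cartesian product of these lists, so every
# hidden_units entry is combined with every activations entry, not only its own.
# MLP zips the two lists, so the longer one is silently truncated.
configurations = {
    "hidden_units": [config[0] for config in activation_configs],
    "activations": [config[1] for config in activation_configs],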
    "dropout_rate": [0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "batch_size": [32, 64],
    "epochs": [20, 50],
    "learning_rate": [0.001, 0.0001],
}



files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1,11)] 

# Generate all configuration combinations
all_configs = generate_configs(configurations)

# Run the tuning in parallel
if __name__ == '__main__':
    num_workers = 8
    with Pool(num_workers) as pool:
        results = pool.map(evaluate_config_parallel, [(config, files) for config in all_configs])

    # Find the best configuration
    best_config, best_accuracy = max(results, key=lambda x: x[1])
    print(f"Best configuration: {best_config}, Best accuracy: {best_accuracy}")

Configuration: {'hidden_units': [128, 64, 32], 'activations': ['tanh', 'relu', 'tanh'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6838487982749939
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['relu', 'sigmoid', 'relu'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6254295706748962
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['relu', 'tanh', 'tanh'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6517754793167114
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['tanh', 'tanh', 'relu'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.636884331703186
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['relu', 'tanh', 'sigmoid'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6689575910568237
Configuration: {'h

KeyboardInterrupt: 

In [None]:
import tensorflow as tf
import numpy as np
import pandas as pd
import itertools
from pathos.multiprocessing import Pool

class MLP(tf.keras.Model):
    def __init__(self, input_dim, output_dim, hidden_units, dropout_rate, activations, regularization_type=None, regularization_value=0.01):
        super(MLP, self).__init__()
        self.hidden_layers = []
        self.regularization_type = regularization_type
        self.regularization_value = regularization_value

        # Build the hidden layers
        for units, activation in zip(hidden_units, activations):
            self.hidden_layers.append(
                tf.keras.layers.Dense(units, activation=activation)
            )
            self.hidden_layers.append(tf.keras.layers.Dropout(dropout_rate))
        
        self.output_layer = tf.keras.layers.Dense(output_dim, activation='softmax')  # Multi-class classification


    def call(self, inputs):
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        return self.output_layer(x)
    
    def compute_regularization_loss(self):
        regularization_loss = 0.0
        if self.regularization_type:
            for layer in self.hidden_layers:
                if isinstance(layer, tf.keras.layers.Dense):
                    weights = layer.kernel
                    if self.regularization_type == 'l1':
                        regularization_loss += tf.reduce_sum(tf.abs(weights)) * self.regularization_value
                    elif self.regularization_type == 'l2':
                        regularization_loss += tf.reduce_sum(tf.square(weights)) * self.regularization_value
        return regularization_loss

    
    
def generate_configs(configurations):
    keys, values = zip(*configurations.items())
    return [dict(zip(keys, v)) for v in itertools.product(*values)]

# Load the data for a specific fold
def load_fold_data(fold_number, files):
    data = pd.read_csv(files[fold_number])
    labels = data.pop('Label').values  # Extract the labels directly
    features = data.values  # Extract the features as a NumPy array
    return features, labels

# Train and evaluate the model
def train_evaluate_model(config, X_train, y_train, X_val, y_val):
    model = MLP(
        input_dim=X_train.shape[1],
        output_dim=10,
        hidden_units=config['hidden_units'],
        dropout_rate=config['dropout_rate'],
        activations=config['activations'],
        regularization_type=config.get('regularization_type', None),
        regularization_value=config.get('regularization_value', 0.01)
    )
    
    # Loss function with the regularization penalty added
    def loss_with_regularization(y_true, y_pred):
        base_loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
        regularization_loss = model.compute_regularization_loss()
        return base_loss + regularization_loss

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=config['learning_rate']),
        loss=loss_with_regularization,
        metrics=['accuracy']
    )
    
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=config['batch_size'],
        epochs=config['epochs'],
        callbacks=[early_stopping],
        verbose=0
    )
    
    return max(history.history['val_accuracy'])


## Cross-validation, single iteration only (1 fold)
def cross_validate_model(config, files):
    # Use only the first fold for validation
    fold_number = 0
    X_val, y_val = load_fold_data(fold_number, files)
    X_train, y_train = [], []
    
    # Train on the remaining folds
    for i in range(len(files)):
        if i != fold_number:
            X_temp, y_temp = load_fold_data(i, files)
            X_train.append(X_temp)
            y_train.append(y_temp)
    
    X_train = np.concatenate(X_train, axis=0)
    y_train = np.concatenate(y_train, axis=0)
    
    # Train and evaluate for this fold
    accuracy = train_evaluate_model(config, X_train, y_train, X_val, y_val)
    return accuracy  # Accuracy for this single fold


# Function for parallel evaluation
def evaluate_config_parallel(args):
    config, files = args
    accuracy = cross_validate_model(config, files)
    print(f"Configuration: {config} | Accuracy: {accuracy}")
    return config, accuracy

import random  # needed by generate_activation_combinations below

def generate_activation_combinations(hidden_units_list, activations_list=('relu',), num_combinations=1):
    activation_combinations = []
    for hidden_units in hidden_units_list:
        layers = len(hidden_units)
        # Generate random activation combinations for this number of layers
        for _ in range(num_combinations):
            random_combination = [random.choice(activations_list) for _ in range(layers)]
            activation_combinations.append((hidden_units, random_combination)) 
    return activation_combinations


hidden_units_list = [[128, 64, 32], [256, 128, 64]]

# Generate combinations of hidden_units and activations
activation_combinations = generate_activation_combinations(hidden_units_list)

# Split hidden_units and activations into separate lists
hidden_units = [combo[0] for combo in activation_combinations]
activations = [combo[1] for combo in activation_combinations]

# Corrected hyperparameter definitions
# Note: generate_configs takes the Cartesian product of these lists, so every
# hidden_units entry is combined with every activations entry, not only its own.
configurations = {
    "hidden_units": hidden_units,  # Hidden-layer sizes only
    "activations": activations,    # Activation lists only
    "dropout_rate": [0.2, 0.3, 0.4],
    "batch_size": [32],
    "epochs": [20,50],
    "learning_rate": [0.001, 0.0001],
    "regularization_type": [None, 'l1', 'l2'],
    "regularization_value": [0.01, 0.001],
}

files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1,11)] 

# Generate all configuration combinations
all_configs = generate_configs(configurations)

# Run the tuning in parallel
if __name__ == '__main__':
    num_workers = 8
    with Pool(num_workers) as pool:
        results = pool.map(evaluate_config_parallel, [(config, files) for config in all_configs])

    # Find the best configuration
    best_config, best_accuracy = max(results, key=lambda x: x[1])
    print(f"Best configuration: {best_config}, Best accuracy: {best_accuracy}")

Configuration: {'hidden_units': [128, 64, 32], 'activations': ['tanh', 'relu', 'relu'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.6139748096466064
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['sigmoid', 'tanh', 'tanh'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.19473081827163696
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['sigmoid', 'tanh', 'relu'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.3654066324234009

Configuration: {'hidden_units': [128, 64, 32], 'activations': ['tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regularization_value': 0.01} | Ac

2024-11-28 10:11:48.307278: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: Incompatible shapes: [10] vs. [0]
	 [[{{function_node __inference_one_step_on_data_963970}}{{node adam/Mul_31}}]]


Configuration: {'hidden_units': [128, 64, 32], 'activations': ['sigmoid', 'tanh', 'tanh'], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.18442153930664062
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['sigmoid', 'tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1', 'regularization_value': 0.01} | Accuracy: 0.11454753577709198
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['tanh', 'relu', 'relu'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.5910652875900269
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l2', 'regularizatio

2024-11-28 10:16:08.042515: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: Incompatible shapes: [10] vs. [0]
	 [[{{function_node __inference_one_step_on_data_1067458}}{{node adam/Mul_31}}]]


Configuration: {'hidden_units': [128, 64, 32], 'activations': ['sigmoid', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.1878579556941986
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['sigmoid', 'tanh', 'relu'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l2', 'regularization_value': 0.001} | Accuracy: 0.25544100999832153
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['sigmoid', 'tanh', 'tanh'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l2', 'regularization_value': 0.01} | Accuracy: 0.11454753577709198
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['relu', 'tanh', 'tanh', 'sigmoid'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l1', 'regularization_

2024-11-28 10:23:16.924016: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: Incompatible shapes: [0] vs. [256,128]
	 [[{{function_node __inference_one_step_on_data_1229186}}{{node adam/truediv_5}}]]


Configuration: {'hidden_units': [256, 128, 64], 'activations': ['relu', 'tanh', 'tanh', 'sigmoid'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l2', 'regularization_value': 0.01} | Accuracy: 0.5910652875900269
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'relu', 'sigmoid', 'tanh'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l1', 'regularization_value': 0.01} | Accuracy: 0.11454753577709198
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['sigmoid', 'sigmoid', 'sigmoid'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1', 'regularization_value': 0.01} | Accuracy: 0.11454753577709198
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'sigmoid', 'tanh'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1', 'regula

2024-11-28 10:27:24.922712: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: Incompatible shapes: [256,128] vs. [0]
	 [[{{function_node __inference_one_step_on_data_1299019}}{{node adam/Mul_11}}]]


Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'sigmoid', 'tanh'], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l2', 'regularization_value': 0.01} | Accuracy: 0.22107674181461334
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['sigmoid', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.23596793413162231
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['sigmoid', 'tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.14432989060878754
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['relu', 'tanh', 'tanh', 'sigmoid'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1

2024-11-28 10:28:04.597691: F tensorflow/core/framework/tensor.cc:844] Check failed: dtype() == expected_dtype (1 vs. 9) int64 expected, got float


Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'relu', 'relu'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.6426116824150085
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['tanh', 'tanh', 'sigmoid', 'relu'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1', 'regularization_value': 0.001} | Accuracy: 0.1935853362083435
Configuration: {'hidden_units': [128, 64, 32], 'activations': ['sigmoid', 'tanh', 'tanh'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l2', 'regularization_value': 0.01} | Accuracy: 0.23253150284290314
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'relu', 'relu'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.001} |

2024-11-28 10:33:05.969420: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: Trying to assign to variable with tensor with wrong shape. Expected [] got [0]
	 [[{{function_node __inference_one_step_on_data_405724}}{{node AssignVariableOp}}]]


Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'relu', 'sigmoid', 'tanh'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.5475372076034546
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['sigmoid', 'tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1', 'regularization_value': 0.001} | Accuracy: 0.11454753577709198
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['sigmoid', 'tanh', 'tanh'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1', 'regularization_value': 0.001} | Accuracy: 0.22680412232875824
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'relu', 'sigmoid', 'tanh'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularizatio

2024-11-28 10:45:03.277843: F tensorflow/core/framework/tensor.cc:844] Check failed: dtype() == expected_dtype (1 vs. 9) int64 expected, got float


Configuration: {'hidden_units': [256, 128, 64], 'activations': ['relu', 'tanh', 'tanh', 'sigmoid'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1', 'regularization_value': 0.001} | Accuracy: 0.49140894412994385
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'tanh'], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.5990836024284363
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['relu', 'tanh', 'tanh', 'sigmoid'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l2', 'regularization_value': 0.01} | Accuracy: 0.4639175236225128
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'tanh'], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, '

2024-11-28 11:04:02.639201: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: Incompatible shapes: [256,128] vs. [0]
	 [[{{function_node __inference_one_step_on_data_1823939}}{{node adam/truediv_5}}]]


Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l2', 'regularization_value': 0.001} | Accuracy: 0.19129438698291779
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'tanh'], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l1', 'regularization_value': 0.001} | Accuracy: 0.24169529974460602
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.47766321897506714
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['relu', 'relu', 'tanh'], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regu

2024-11-28 11:13:12.692996: F tensorflow/core/framework/tensor.cc:844] Check failed: dtype() == expected_dtype (1 vs. 9) int64 expected, got float


Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'sigmoid', 'tanh'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regularization_value': 0.001} | Accuracy: 0.3413516581058502
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'relu', 'relu'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l1', 'regularization_value': 0.01} | Accuracy: 0.11454753577709198




Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'tanh', 'sigmoid', 'relu'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l2', 'regularization_value': 0.001} | Accuracy: 0.4135166108608246
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regularization_value': 0.01} | Accuracy: 0.210767462849617
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': None, 'regularization_value': 0.001} | Accuracy: 0.25773194432258606
Configuration: {'hidden_units': [256, 128, 64], 'activations': ['tanh', 'sigmoid', 'tanh'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.0001, 'regularization_type': 'l1

2024-11-28 11:20:20.727209: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: Incompatible shapes: [128,64] vs. [0]
	 [[{{function_node __inference_one_step_on_data_628592}}{{node adam/Mul_19}}]]


Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'relu', 'sigmoid', 'tanh'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l1', 'regularization_value': 0.01} | Accuracy: 0.11454753577709198
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'sigmoid', 'sigmoid'], 'dropout_rate': 0.2, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l2', 'regularization_value': 0.001} | Accuracy: 0.4650630056858063
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['sigmoid', 'tanh', 'tanh'], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type': 'l1', 'regularization_value': 0.01} | Accuracy: 0.10996563732624054
Configuration: {'hidden_units': [256, 128, 64, 32], 'activations': ['tanh', 'relu', 'sigmoid', 'tanh'], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 1, 'learning_rate': 0.001, 'regularization_type

KeyboardInterrupt: 

Based on the best combination of hyperparameters identified: ........., we will now train the model while determining the optimal number of epochs and batch size. Previously, we limited training to just 1 epoch to keep the initial exploration cheap.