# Multilayer Perceptron (MLP) <a name="multilayer"></a>


We are going to use a **Multilayer Perceptron (MLP)** because it is a flexible neural network architecture that is well suited to **classification problems**.

First, we will define the model architecture. This step covers:
- **Number of layers**
- **Number of neurons of each layer**
- **Choice of the activation functions**


Then we will define the training strategy, which includes:
- **Optimizer**
- **Learning hyperparameters** (e.g., learning rate, mini-batch size, number of epochs, etc.)
- **Regularization techniques to adopt** (e.g., early stopping, weight regularization, dropout, data augmentation, etc.)
- **Possibility of using transfer learning**

The network works by processing data through **multiple layers**, with each layer learning to capture different features of the input data.
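
As a rough illustration of how data flows through the layers, here is a minimal NumPy sketch of an MLP forward pass. The layer sizes and random weights are assumptions chosen for illustration only; this is not the model trained in this notebook.

```python
import numpy as np

def relu(x):
    # Element-wise rectified linear unit
    return np.maximum(0.0, x)

def softmax(z):
    # Subtract the row maximum for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mlp_forward(x, weights, biases):
    # Hidden layers use ReLU; the last layer uses softmax to produce class probabilities
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(x @ W + b)
    return softmax(x @ weights[-1] + biases[-1])

rng = np.random.default_rng(0)
layer_sizes = [40, 128, 64, 10]  # illustrative: 40 features, two hidden layers, 10 classes
weights = [rng.normal(scale=0.1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

probs = mlp_forward(rng.normal(size=(5, 40)), weights, biases)
print(probs.shape)  # (5, 10)
```

Each row of `probs` sums to 1, so it can be read as a probability distribution over the classes.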

### Model architecture definition <a name="architecture"></a>
[[go back to the top]](#multilayer)

As mentioned above, defining the architecture of our MLP means choosing the number of layers, the number of neurons per layer, and the activation functions (for example ReLU, tanh, or softmax).


The following table lists the architecture options for our model:

| <span style="color: #C70039;">**Architecture**</span> | <span style="color: #C70039;">**Options**</span>  |
|-----------------------------------------------------|-------------------------------------------------------|
| <span style="color: #00bfae;">**hidden_units**</span> | [[128, 64], [256, 128, 64], [64, 32]]               |
| <span style="color: #00bfae;">**Activation functions**</span> | ['relu', 'relu', 'tanh']                                  |   

- **hidden_units** specifies the number of hidden layers and the number of neurons in each one. For example, [256, 128, 64] defines 3 hidden layers with 256, 128 and 64 neurons, respectively.
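
To get a feel for what a `hidden_units` choice implies, the sketch below counts the trainable parameters of a stack of fully connected layers. The helper name `count_dense_parameters` is hypothetical, and the input and output sizes (40 features, 10 classes) are the example values used in this notebook.

```python
def count_dense_parameters(input_dim, hidden_units, output_dim):
    # Each dense layer has n_in * n_out weights plus n_out biases
    sizes = [input_dim, *hidden_units, output_dim]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

total = count_dense_parameters(40, [256, 128, 64], 10)
print(total)  # 52298
```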


In [7]:
import tensorflow as tf
import numpy as np
import pandas as pd
import itertools
from pathos.multiprocessing import Pool

# Class that defines the MLP model architecture
class MLP(tf.keras.Model):
    def __init__(self, input_dim, output_dim, hidden_units):
        super(MLP, self).__init__()
        self.hidden_layers = []
        for units in hidden_units:
            self.hidden_layers.append(tf.keras.layers.Dense(units, activation='relu'))
        self.output_layer = tf.keras.layers.Dense(output_dim, activation='softmax')  # Multi-class output

    def call(self, inputs):
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        return self.output_layer(x)
    
    def build(self, input_shape):
        super(MLP, self).build(input_shape)
        # Tell TensorFlow the expected input dimensions
        self.call(tf.keras.Input(shape=input_shape[1:]))


input_dim = 40  # Example: number of features
output_dim = 10  # Example: number of classes
hidden_units = [128, 64, 32]

model = MLP(input_dim=input_dim, output_dim=output_dim, hidden_units=hidden_units)

# Build the model explicitly
model.build((None, input_dim))
model.summary()

### Training Strategy <a name="training"></a>
[[go back to the top]](#multilayer)

We used **dictionaries** to organize the candidate values for each **hyperparameter**. This makes it easy to generate, compare, and manage different configurations.

To keep the search affordable, we decided to **evaluate every hyperparameter combination on the first cross-validation iteration only**: one fold is held out for validation and the remaining folds are used for training. The combination with the highest validation **accuracy** is then selected.

Additionally, we use the **Adam** optimizer, a popular choice for training neural networks because it adapts the learning rate for each parameter.
We also use **early stopping** to limit overfitting: the validation loss is monitored and training halts when it stops improving.
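
The early-stopping rule can be sketched in plain Python as follows. This is a simplified illustration of the patience logic, not the Keras implementation; the function name and the loss values are made up for the example.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training ran for all epochs
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: the best value is reached at epoch 3
losses = [2.1, 1.9, 1.7, 1.6, 1.65, 1.7, 1.62, 1.66, 1.8, 1.9]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=5)
print(stop_epoch, best_epoch)  # 8 3
```

Restoring the weights from `best_epoch` after stopping corresponds to the `restore_best_weights=True` option used in the training code.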

In this way, **testing every combination** on a single iteration lets us narrow down the candidates quickly, and **selecting the best combination** gives us the settings that performed best on that validation fold of our dataset.
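
A minimal sketch of this first-iteration selection is shown below. It assumes 10 folds with fold 0 held out for validation; `score_fn` and `toy_score` are hypothetical stand-ins for the step that trains a model and returns its validation accuracy.

```python
def first_iteration_split(k=10, val_fold=0):
    # Fold `val_fold` is held out; the remaining k - 1 folds are used for training
    train_folds = [j for j in range(k) if j != val_fold]
    return train_folds, val_fold

def select_best_on_first_fold(configs, score_fn, k=10):
    # Score every configuration once, on the same split, and keep the best one
    train_folds, val_fold = first_iteration_split(k)
    scored = [(cfg, score_fn(cfg, train_folds, val_fold)) for cfg in configs]
    return max(scored, key=lambda pair: pair[1])

# Toy scoring function (an assumption for illustration): prefers learning rates close to 0.001
def toy_score(cfg, train_folds, val_fold):
    return -abs(cfg["learning_rate"] - 0.001)

configs = [{"learning_rate": lr} for lr in (0.01, 0.001, 0.0001)]
best_cfg, best_score = select_best_on_first_fold(configs, toy_score)
print(best_cfg)  # {'learning_rate': 0.001}
```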

---



### Hyperparameter Configuration

The following table lists the candidate values for each hyperparameter; every combination of these values is one configuration to test:

| <span style="color: #C70039;">**Hyperparameter**</span> | <span style="color: #C70039;">**Options**</span>        |
|-----------------------------------------------------|-------------------------------------------------------|
| <span style="color: #00bfae;">**hidden_units**</span> | [[128, 64], [256, 128, 64], [64, 32]]                 |
| <span style="color: #00bfae;">**dropout_rate**</span> | [0.3, 0.5]                                            |
| <span style="color: #00bfae;">**batch_size**</span>   | [32]                                                  |
| <span style="color: #00bfae;">**epochs**</span>       | [20]                                                  |
| <span style="color: #00bfae;">**learning_rate**</span> | [0.001, 0.0001]                                       |
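
As a quick check on the size of this search space, the sketch below expands the table above into individual configurations with `itertools.product`. The variable names are illustrative.

```python
import itertools

# Candidate values copied from the table above
grid = {
    "hidden_units": [[128, 64], [256, 128, 64], [64, 32]],
    "dropout_rate": [0.3, 0.5],
    "batch_size": [32],
    "epochs": [20],
    "learning_rate": [0.001, 0.0001],
}

keys = list(grid)
# One dictionary per combination: 3 * 2 * 1 * 1 * 2 configurations
combinations = [dict(zip(keys, values)) for values in itertools.product(*grid.values())]

print(len(combinations))  # 12
print(combinations[0])
```

Every value added to a hyperparameter multiplies the number of models to train, which is why the search is limited to a single cross-validation iteration.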

In [None]:
import tensorflow as tf
import numpy as np
import pandas as pd
import itertools
from pathos.multiprocessing import Pool

# MLP class
class MLP(tf.keras.Model):
    def __init__(self, input_dim, output_dim, hidden_units, dropout_rate):
        super(MLP, self).__init__()
        self.hidden_layers = []
        for units in hidden_units:
            self.hidden_layers.append(tf.keras.layers.Dense(units, activation='relu'))
            self.hidden_layers.append(tf.keras.layers.Dropout(dropout_rate))
        self.output_layer = tf.keras.layers.Dense(output_dim, activation='softmax')  # Multi-class classification

    def call(self, inputs):
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        return self.output_layer(x)

# Generate all hyperparameter combinations
def generate_configs(configurations):
    keys, values = zip(*configurations.items())
    return [dict(zip(keys, v)) for v in itertools.product(*values)]

# Load the data for a specific fold
def load_fold_data(fold_number, files):
    data = pd.read_csv(files[fold_number])
    labels = data.pop('Label').values  # Extract the labels directly
    features = data.values  # Extract the features as a NumPy array
    return features, labels

# Train and evaluate the model
def train_evaluate_model(config, X_train, y_train, X_val, y_val):
    model = MLP(input_dim=X_train.shape[1],
                output_dim=10,
                hidden_units=config['hidden_units'],
                dropout_rate=config['dropout_rate'])
    
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=config['learning_rate']),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )
    
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=config['batch_size'],
        epochs=config['epochs'],
        callbacks=[early_stopping],
        verbose=0
    )
    
    return max(history.history['val_accuracy'])  # Best validation accuracy

# Cross-validation, first iteration only
def cross_validate_model(config, files, k=10):
    for i in range(k):
        X_val, y_val = load_fold_data(i, files)
        X_train, y_train = [], []
        for j in range(k):
            if j != i:
                X_temp, y_temp = load_fold_data(j, files)
                X_train.append(X_temp)
                y_train.append(y_temp)
        X_train = np.concatenate(X_train, axis=0)
        y_train = np.concatenate(y_train, axis=0)

        accuracy = train_evaluate_model(config, X_train, y_train, X_val, y_val)
        return accuracy  # Return after the first iteration

# Function for parallel evaluation
def evaluate_config_parallel(args):
    config, files = args
    accuracy = cross_validate_model(config, files, k=10)
    print(f"Configuration: {config} | Accuracy: {accuracy}")
    return config, accuracy

# Hyperparameter definitions
configurations = {
    "hidden_units": [[128, 64, 32], [256, 128, 64], [64, 32],[512,256,128], [256,128,64,32]],
    "dropout_rate": [0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "batch_size": [32,64],
    "epochs": [20,50],
    "learning_rate": [0.001, 0.0001]
    
}

files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1,11)] 

# Generate all configuration combinations
all_configs = generate_configs(configurations)

# Run the tuning in parallel
if __name__ == '__main__':
    num_workers = 8  # number of worker processes
    with Pool(num_workers) as pool:
        results = pool.map(evaluate_config_parallel, [(config, files) for config in all_configs])

    # Find the best configuration
    best_config, best_accuracy = max(results, key=lambda x: x[1])
    print(f"Best configuration: {best_config}, Best accuracy: {best_accuracy}")

Configuration: {'hidden_units': [128, 64, 32], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6380297541618347
Configuration: {'hidden_units': [128, 64, 32], 'dropout_rate': 0.5, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6655212044715881
Configuration: {'hidden_units': [128, 64, 32], 'dropout_rate': 0.4, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6735395193099976
Configuration: {'hidden_units': [256, 128, 64], 'dropout_rate': 0, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.636884331703186
Configuration: {'hidden_units': [256, 128, 64], 'dropout_rate': 0.1, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6666666865348816
Configuration: {'hidden_units': [128, 64, 32], 'dropout_rate': 0.3, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6758304834365845
Configuration: {'hidden_units': [128, 64, 32], 'dropout_rate': 0.1, 'batc

2024-11-27 11:49:55.790946: W tensorflow/core/framework/op_kernel.cc:1840] OP_REQUIRES failed at strided_slice_op.cc:117 : INVALID_ARGUMENT: Expected begin, end, and strides to be 1D equal size tensors, but got shapes [0], [1], and [1] instead.
2024-11-27 11:49:55.791240: I tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: INVALID_ARGUMENT: Expected begin, end, and strides to be 1D equal size tensors, but got shapes [0], [1], and [1] instead.
	 [[{{function_node __inference_one_step_on_data_209295}}{{node strided_slice}}]]


Configuration: {'hidden_units': [256, 128, 64, 32], 'dropout_rate': 0.1, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.648339033126831
Configuration: {'hidden_units': [512, 256, 128], 'dropout_rate': 0.3, 'batch_size': 64, 'epochs': 50, 'learning_rate': 0.0001} | Accuracy: 0.6895761489868164
Configuration: {'hidden_units': [64, 32], 'dropout_rate': 0.5, 'batch_size': 64, 'epochs': 50, 'learning_rate': 0.0001} | Accuracy: 0.6620847582817078
Configuration: {'hidden_units': [512, 256, 128], 'dropout_rate': 0.5, 'batch_size': 64, 'epochs': 50, 'learning_rate': 0.0001} | Accuracy: 0.6918671131134033
Configuration: {'hidden_units': [256, 128, 64, 32], 'dropout_rate': 0.2, 'batch_size': 32, 'epochs': 20, 'learning_rate': 0.001} | Accuracy: 0.6597937941551208
Configuration: {'hidden_units': [512, 256, 128], 'dropout_rate': 0.2, 'batch_size': 64, 'epochs': 50, 'learning_rate': 0.0001} | Accuracy: 0.6941580772399902
Configuration: {'hidden_units': [512, 256, 128], 'dropou

InvalidArgumentError: Graph execution error:

Detected at node strided_slice defined at (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main

  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel_launcher.py", line 17, in <module>

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/traitlets/config/application.py", line 1043, in launch_instance

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel/kernelapp.py", line 725, in start

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/tornado/platform/asyncio.py", line 215, in start

  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 596, in run_forever

  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 1890, in _run_once

  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/events.py", line 80, in _run

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel/kernelbase.py", line 513, in dispatch_queue

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel/kernelbase.py", line 502, in process_one

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel/kernelbase.py", line 409, in dispatch_shell

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel/kernelbase.py", line 729, in execute_request

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel/ipkernel.py", line 422, in do_execute

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/ipykernel/zmqshell.py", line 540, in run_cell

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/IPython/core/interactiveshell.py", line 2961, in run_cell

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/IPython/core/interactiveshell.py", line 3016, in _run_cell

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/IPython/core/interactiveshell.py", line 3221, in run_cell_async

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/IPython/core/interactiveshell.py", line 3400, in run_ast_nodes

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/IPython/core/interactiveshell.py", line 3460, in run_code

  File "/var/folders/qx/kgqzhwb50b7flqgr05m062t40000gn/T/ipykernel_10906/1556981529.py", line 106, in <module>

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/pool.py", line 212, in __init__

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/pool.py", line 303, in _repopulate_pool

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/pool.py", line 326, in _repopulate_pool_static

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/process.py", line 121, in start

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/context.py", line 277, in _Popen

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/popen_fork.py", line 19, in __init__

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/popen_fork.py", line 71, in _launch

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/process.py", line 315, in _bootstrap

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/process.py", line 108, in run

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/pool.py", line 125, in worker

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/multiprocess/pool.py", line 48, in mapstar

  File "/var/folders/qx/kgqzhwb50b7flqgr05m062t40000gn/T/ipykernel_10906/1556981529.py", line 84, in evaluate_config_parallel

  File "/var/folders/qx/kgqzhwb50b7flqgr05m062t40000gn/T/ipykernel_10906/1556981529.py", line 78, in cross_validate_model

  File "/var/folders/qx/kgqzhwb50b7flqgr05m062t40000gn/T/ipykernel_10906/1556981529.py", line 54, in train_evaluate_model

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/keras/src/backend/tensorflow/trainer.py", line 320, in fit

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/keras/src/backend/tensorflow/trainer.py", line 121, in one_step_on_iterator

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/keras/src/backend/tensorflow/trainer.py", line 108, in one_step_on_data

  File "/Users/franciscamihalache/Library/Python/3.9/lib/python/site-packages/keras/src/backend/tensorflow/trainer.py", line 62, in train_step

Expected begin, end, and strides to be 1D equal size tensors, but got shapes [0], [1], and [1] instead.
	 [[{{node strided_slice}}]] [Op:__inference_one_step_on_iterator_209348]

In [None]:
I have the function from the previous cell, and I want the model to run cross-validation only once so that it tests only the network architecture parameters: the number of layers, the number of neurons per layer, and the activation function.
I want to do a grid search to see which initial combinations are the best starting point for the network architecture.
Only 1 cross-validation iteration should be run to determine which combination is best.

At the moment the grid search also covers dropout, learning rate, and so on, but in that cross-validation function I only want these parameters: number of layers, neurons per layer, and activation function.

The other values should stay at their standard defaults, so that hyperparameter tuning can be done later, while training the MLP.
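
One way to set up such an architecture-only search is sketched below: the training hyperparameters are held at fixed defaults and only the architecture keys vary. The helper names and the default values (learning rate 0.001, batch size 32, 20 epochs) are assumptions for illustration.

```python
import itertools

# Training hyperparameters held fixed while architectures are compared (illustrative defaults)
DEFAULTS = {"learning_rate": 0.001, "batch_size": 32, "epochs": 20}

def expand(grid):
    # Expand a dict of candidate lists into one dict per combination
    keys = list(grid)
    return [dict(zip(keys, values)) for values in itertools.product(*grid.values())]

def architecture_stage_configs(architecture_grid, defaults=DEFAULTS):
    # Every architecture candidate is paired with the same default training settings
    return [{**defaults, **arch} for arch in expand(architecture_grid)]

architecture_grid = {
    "hidden_units": [[128, 64], [256, 128, 64], [64, 32]],
    "activation": ["relu", "tanh"],
}

stage_one = architecture_stage_configs(architecture_grid)
print(len(stage_one))  # 6
print(stage_one[0])
```

Once the best architecture is known, a second grid over the training hyperparameters can be expanded with the same `expand` helper while the architecture stays fixed.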



In [6]:
import tensorflow as tf
import numpy as np
import pandas as pd
import itertools
from pathos.multiprocessing import Pool

# MLP class
class MLP(tf.keras.Model):
    def __init__(self, input_dim, output_dim, hidden_units, activation):
        super(MLP, self).__init__()
        self.hidden_layers = [
            tf.keras.layers.Dense(units, activation=activation)
            for units in hidden_units
        ]
        self.output_layer = tf.keras.layers.Dense(output_dim, activation='softmax')

    def call(self, inputs):
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        return self.output_layer(x)

# Generate all hyperparameter combinations
def generate_configs(configurations):
    keys, values = zip(*configurations.items())
    return [dict(zip(keys, v)) for v in itertools.product(*values)]

# Load the data for a specific fold
def load_fold_data(fold_number, files):
    data = pd.read_csv(files[fold_number])
    labels = data.pop('Label').values
    features = data.values
    return features, labels

# Train and evaluate the model
def train_evaluate_model(config, X_train, y_train, X_val, y_val):
    model = MLP(
        input_dim=X_train.shape[1],
        output_dim=10,
        hidden_units=config['hidden_units'],
        activation=config['activation']
    )
    
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # Default value
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )
    
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=32,  # Default value
        epochs=20,      # Default value
        callbacks=[early_stopping],
        verbose=0
    )
    
    return max(history.history['val_accuracy'])  # Best validation accuracy

# Cross-validation, first iteration only
def cross_validate_model(config, files, k=10):
    X_val, y_val = load_fold_data(0, files)
    X_train, y_train = [], []
    for j in range(1, k):  # Skip the validation fold (fold 0)
        X_temp, y_temp = load_fold_data(j, files)
        X_train.append(X_temp)
        y_train.append(y_temp)
    X_train = np.concatenate(X_train, axis=0)
    y_train = np.concatenate(y_train, axis=0)

    accuracy = train_evaluate_model(config, X_train, y_train, X_val, y_val)
    return accuracy

# Function for parallel evaluation
def evaluate_config_parallel(args):
    config, files = args
    accuracy = cross_validate_model(config, files, k=10)
    print(f"Configuration: {config} | Accuracy: {accuracy}")
    return config, accuracy

# Initial architecture - grid search
architecture_configs = {
    "hidden_units": [[128, 64], [256, 128, 64], [64, 32], [512, 256, 128], [256, 128, 64, 32], [128, 64, 32]],
    "activation": ['relu', 'tanh']
}

files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1,11)]

# Step 1: find the best architecture
if __name__ == '__main__':
    num_workers = 4
    all_architecture_configs = generate_configs(architecture_configs)
    
    with Pool(num_workers) as pool:
        architecture_results = pool.map(
            evaluate_config_parallel,
            [(config, files) for config in all_architecture_configs]
        )
    
    best_architecture, best_arch_accuracy = max(
        architecture_results, key=lambda x: x[1]
    )
    print(f"Best architecture: {best_architecture}, Best accuracy: {best_arch_accuracy}")
    
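    # Step 2: fine-tune the training hyperparameters
    tuning_configs = {
        "batch_size": [16, 32, 64],
        "epochs": [20, 50, 100],
        "learning_rate": [0.001, 0.0001, 0.01],
        # Note: dropout_rate is listed here but never applied below, because this MLP has no Dropout layers
        "dropout_rate": [0.1, 0.2, 0.3, 0.5]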
    }
    
    all_tuning_configs = generate_configs(tuning_configs)
    
    def evaluate_tuning_config(args):
        config, files = args
        X_val, y_val = load_fold_data(0, files)
        X_train, y_train = [], []
        for j in range(1, 10):
            X_temp, y_temp = load_fold_data(j, files)
            X_train.append(X_temp)
            y_train.append(y_temp)
        X_train = np.concatenate(X_train, axis=0)
        y_train = np.concatenate(y_train, axis=0)

        config_model = MLP(
            input_dim=X_train.shape[1],
            output_dim=10,
            hidden_units=best_architecture['hidden_units'],  # Best architecture
            activation=best_architecture['activation']
        )
        config_model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=config['learning_rate']),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy']
        )
        history = config_model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            batch_size=config['batch_size'],
            epochs=config['epochs'],
            verbose=0
        )
        accuracy = max(history.history['val_accuracy'])
        return config, accuracy

    # Run the tuning in parallel
    with Pool(num_workers) as pool:
        tuning_results = pool.map(
            evaluate_tuning_config,
            [(config, files) for config in all_tuning_configs]
        )
    
    best_tuning_config, best_tuning_accuracy = max(
        tuning_results, key=lambda x: x[1]
    )
    print(f"Best tuning config: {best_tuning_config}, Best tuning accuracy: {best_tuning_accuracy}")


Configuration: {'hidden_units': [128, 64], 'activation': 'tanh'} | Accuracy: 0.6632302403450012
Configuration: {'hidden_units': [256, 128, 64], 'activation': 'tanh'} | Accuracy: 0.674685001373291
Configuration: {'hidden_units': [128, 64], 'activation': 'relu'} | Accuracy: 0.6723940372467041
Configuration: {'hidden_units': [256, 128, 64], 'activation': 'relu'} | Accuracy: 0.6334478855133057
Configuration: {'hidden_units': [64, 32], 'activation': 'relu'} | Accuracy: 0.6597937941551208
Configuration: {'hidden_units': [64, 32], 'activation': 'tanh'} | Accuracy: 0.6712485551834106
Configuration: {'hidden_units': [512, 256, 128], 'activation': 'relu'} | Accuracy: 0.636884331703186
Configuration: {'hidden_units': [256, 128, 64, 32], 'activation': 'tanh'} | Accuracy: 0.6597937941551208
Configuration: {'hidden_units': [256, 128, 64, 32], 'activation': 'relu'} | Accuracy: 0.6334478855133057
Configuration: {'hidden_units': [512, 256, 128], 'activation': 'tanh'} | Accuracy: 0.6529209613800049
Conf

In [15]:
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow.keras.callbacks import EarlyStopping

# Train the final model with the best parameters found
def final_train_model(best_architecture, best_tuning_config, files):
    # Load the data
    X_val, y_val = load_fold_data(0, files)  # Validation data
    X_train, y_train = [], []
    for j in range(1, 10):  # Use the remaining folds for training
        X_temp, y_temp = load_fold_data(j, files)
        X_train.append(X_temp)
        y_train.append(y_temp)
    X_train = np.concatenate(X_train, axis=0)
    y_train = np.concatenate(y_train, axis=0)

    # Build the model with the final architecture and hyperparameters
    model = MLP(
        input_dim=X_train.shape[1],
        output_dim=10,  # Number of classes (10)
        hidden_units=best_architecture['hidden_units'],
        activation=best_architecture['activation']
    )
    
    # Compile the model
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=best_tuning_config['learning_rate']),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

    # Define the regularization callbacks
    early_stopping = EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )

    # Train the model
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=best_tuning_config['batch_size'],
        epochs=best_tuning_config['epochs'],
        callbacks=[early_stopping],
        verbose=1
    )

    # Evaluate the final model
    final_accuracy = model.evaluate(X_val, y_val, verbose=0)[1]
    print(f"Final validation accuracy: {final_accuracy:.4f}")

    # Print the hyperparameters used for training
    print("\nBest hyperparameters used for training:")
    print(f"Architecture: {best_architecture}")
    print(f"Tuning configuration: {best_tuning_config}")

# Use the best parameters found
best_architecture = {'hidden_units': [128, 64, 32], 'activation': 'tanh'}
best_tuning_config = {
    'batch_size': 64,
    'epochs': 50,
    'learning_rate': 0.0001,
    'dropout_rate': 0.1
}

# Data files (adjust to the correct path)
files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1, 11)]

# Train the model with the best hyperparameters found
final_train_model(best_architecture, best_tuning_config, files)


Epoch 1/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.1790 - loss: 2.2603 - val_accuracy: 0.3093 - val_loss: 2.0903
Epoch 2/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 709us/step - accuracy: 0.3680 - loss: 2.0648 - val_accuracy: 0.4444 - val_loss: 1.9112
Epoch 3/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 711us/step - accuracy: 0.4151 - loss: 1.9054 - val_accuracy: 0.4662 - val_loss: 1.7642
Epoch 4/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 716us/step - accuracy: 0.4288 - loss: 1.7873 - val_accuracy: 0.4731 - val_loss: 1.6522
Epoch 5/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 821us/step - accuracy: 0.4630 - loss: 1.6734 - val_accuracy: 0.5132 - val_loss: 1.5542
Epoch 6/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 724us/step - accuracy: 0.4908 - loss: 1.5979 - val_accuracy: 0.5326 - val_loss: 1.4865
Epoch 7/50
123

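The `EarlyStopping` callback used in these cells stops training once `val_loss` has failed to improve for `patience` consecutive epochs. The block below is a minimal pure-Python sketch of that rule, not the Keras implementation: the function name `simulate_early_stopping` and the loss values are hypothetical, and options such as `min_delta` are ignored. It also assumes training is actually stopped early, since the behaviour when training reaches the last epoch may depend on the Keras version.

```python
def simulate_early_stopping(val_losses, patience, restore_best_weights):
    """Simplified model of patience-based early stopping.

    Returns (epochs_run, kept_epoch), both 1-based:
    - epochs_run: the last epoch that was actually trained
    - kept_epoch: the epoch whose weights the model ends up with
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0          # epochs since the last improvement
    epochs_run = 0

    for epoch, loss in enumerate(val_losses, start=1):
        epochs_run = epoch
        if loss < best_loss:
            # Strict improvement: remember it and reset the counter
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                break  # no improvement for `patience` epochs in a row

    # Assumption: with restore_best_weights the best epoch's weights are kept;
    # without it the model keeps the weights of the last epoch trained.
    kept_epoch = best_epoch if restore_best_weights else epochs_run
    return epochs_run, kept_epoch


# Hypothetical validation losses, for illustration only
losses = [2.0, 1.8, 1.6, 1.7, 1.65, 1.61, 1.5]
print(simulate_early_stopping(losses, patience=3, restore_best_weights=True))
print(simulate_early_stopping(losses, patience=3, restore_best_weights=False))
```

With `restore_best_weights=False` (the Keras default), the model evaluated after `fit` carries the weights of the last epoch trained, so the reported final accuracy can be slightly worse than that of the best epoch.
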
In [14]:
import tensorflow as tf
import numpy as np
import pandas as pd  # required by load_fold_data (pd.read_csv)

# Load the data for one fold (adjust as needed for your case)
def load_fold_data(fold, files):
    # Read the fold's file; `fold` is a 0-based index into `files` (assumes CSV)
    data = pd.read_csv(files[fold])
    X = data.drop('Label', axis=1).values  # Features; adjust to your data layout
    y = data['Label'].values  # Labels; adjust to your data layout
    return X, y

# MLP training function
def train_MLP(mlp_model, train_data, train_labels, test_data, test_labels, patience, batch_size, num_epochs):
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=patience)
    history = mlp_model.fit(train_data, train_labels,
                            epochs=num_epochs,
                            batch_size=batch_size,
                            callbacks=[early_stopping],
                            validation_data=(test_data, test_labels),
                            verbose=1)
    return history

# Train the final model with the best parameters
def final_train_model(best_architecture, best_tuning_config, files):
    # Load the data
    X_val, y_val = load_fold_data(0, files)  # Validation data (first file in `files`)
    X_train, y_train = [], []
    for j in range(1, 10):  # Use the remaining folds for training
        X_temp, y_temp = load_fold_data(j, files)
        X_train.append(X_temp)
        y_train.append(y_temp)
    X_train = np.concatenate(X_train, axis=0)
    y_train = np.concatenate(y_train, axis=0)

    # Build the model with the final architecture and hyperparameters
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=(X_train.shape[1],)))

    # Add the hidden layers according to the best architecture
    for units in best_architecture['hidden_units']:
        model.add(tf.keras.layers.Dense(units, activation=best_architecture['activation']))
        model.add(tf.keras.layers.Dropout(best_tuning_config['dropout_rate']))

    # Output layer
    model.add(tf.keras.layers.Dense(10, activation='softmax'))  # Adjust to the number of classes

    # Compile the model
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=best_tuning_config['learning_rate']),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

    # Train the model
    history = train_MLP(
        mlp_model=model,
        train_data=X_train,
        train_labels=y_train,
        test_data=X_val,
        test_labels=y_val,
        patience=5,  # Early-stopping patience, in epochs
        batch_size=best_tuning_config['batch_size'],
        num_epochs=best_tuning_config['epochs']
    )

    # Evaluate the final model on the validation fold
    final_accuracy = model.evaluate(X_val, y_val, verbose=0)[1]
    print(f"Final validation accuracy: {final_accuracy:.4f}")

# Use the best parameters found
best_architecture = {'hidden_units': [128, 64, 32], 'activation': 'tanh'}
best_tuning_config = {
    'batch_size': 64,
    'epochs': 50,
    'learning_rate': 0.0001,
    'dropout_rate': 0.1
}

# Load the data files (adjust to the correct path)
files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1, 11)]

# Train the model with the best hyperparameters found
final_train_model(best_architecture, best_tuning_config, files)


Epoch 1/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.0988 - loss: 2.3518 - val_accuracy: 0.3219 - val_loss: 2.1218
Epoch 2/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 852us/step - accuracy: 0.2189 - loss: 2.1499 - val_accuracy: 0.4570 - val_loss: 1.9662
Epoch 3/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 833us/step - accuracy: 0.3117 - loss: 2.0179 - val_accuracy: 0.4891 - val_loss: 1.8180
Epoch 4/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 818us/step - accuracy: 0.3625 - loss: 1.9027 - val_accuracy: 0.5155 - val_loss: 1.6871
Epoch 5/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.4075 - loss: 1.7900 - val_accuracy: 0.5315 - val_loss: 1.5818
Epoch 6/50
123/123 ━━━━━━━━━━━━━━━━━━━━ 0s 836us/step - accuracy: 0.4492 - loss: 1.6801 - val_accuracy: 0.5487 - val_loss: 1.4986
Epoch 7/50
123/123