### Deep Fool Algorithm 

We have achieved some results with the CNN and MLP models, but we would like to explore how these models perform when faced with adversarial examples. To do so, we aply DeepFool.

**DeepFool** is a widely used method for generating adversarial examples, designed to evaluate the robustness of machine learning models, particularly in classification tasks. 

By iteratively finding the minimal perturbation needed to alter a model's prediction, DeepFool provides insights into a **model's vulnerability to adversarial attacks**. This method is crucial for developing more **robust and reliable models**, as it helps identify potential weaknesses and informs strategies for improving their defense mechanisms

![Descrição da imagem](./images/deepfool.png)


1. **Initialization:** Starting with the original image \( $x_0$ \), and setting the iteration counter \($ i $\) to 0.

2. **Perturbation Calculation:** For each iteration:
   - The algorithm calculates the perturbation required to change the model's prediction, iteratively adjusting the image.
   - It computes the gradients and the necessary perturbation \( $r_i $\) for the class boundary.

3. **Repeat Until Misclassification:** The algorithm repeats this process until the image is misclassified by the model.

4. **Return Perturbation:** The final perturbation is the sum of all the adjustments \( $r_i$ \) made during the iterations.

The result is the minimal perturbation $ r $ that causes a misclassification.


In [1]:
# import of relevant libraries

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import models, layers, regularizers, optimizers
from tensorflow.python.client import device_lib
import pandas as pd
import numpy as np
import pickle
import os
import librosa
from copy import deepcopy

In [2]:
def get_gradient(model, x, target_class):
    """
    Calcula o gradiente da saída da classe-alvo com respeito à entrada.

    Args:
        model: O modelo neural treinado.
        x: Entrada para o modelo (tensor).
        target_class: Índice da classe-alvo.

    Returns:
        Gradiente calculado.
    """
    with tf.GradientTape() as tape:
        tape.watch(x)
        logits = model(x)
        target_logits = logits[:, target_class]
    return tape.gradient(target_logits, x)


In [3]:
import numpy as np

def deepfool(model, x0, eta=1e-2, max_iter=50, num_classes=10):
    """
    Implementa o algoritmo DeepFool para calcular a menor perturbação.

    Args:
        model: O modelo neural treinado.
        x0: Entrada inicial (tensor).
        eta: Parâmetro de overshoot.
        max_iter: Número máximo de iterações.
        num_classes: Número de classes no modelo.

    Returns:
        r_sum: Perturbação acumulada.
        loop_i: Número de iterações realizadas.
        label_xi: Nova previsão após a perturbação.
    """
    x = tf.convert_to_tensor(x0, dtype=tf.float32)
    r_sum = tf.zeros_like(x)
    label_xi = tf.argmax(model(x), axis=1).numpy()[0]

    for loop_i in range(max_iter):
        gradients = []
        logits = model(x)
        current_label = tf.argmax(logits, axis=1).numpy()[0]

        if current_label != label_xi:
            break

        for k in range(num_classes):
            grad = get_gradient(model, x, k)
            gradients.append(grad)

        gradients = tf.stack(gradients)
        logits = tf.squeeze(logits)

        smallest_perturbation = float('inf')
        for k in range(num_classes):
            if k == label_xi:
                continue
            w_k = gradients[k] - gradients[label_xi]
            f_k = logits[k] - logits[label_xi]
            perturbation = tf.abs(f_k) / tf.norm(w_k, ord=2)
            if perturbation < smallest_perturbation:
                smallest_perturbation = perturbation
                r_i = (perturbation + eta) * w_k / tf.norm(w_k, ord=2)

        r_sum += r_i
        x = x + r_i

    return r_sum, loop_i, current_label


In [4]:
def example_robustness(r, x):
    """
    Calcula a robustez de um exemplo.

    Args:
        r: Perturbação adversarial aplicada.
        x: Entrada original.

    Returns:
        Valor de robustez (ρ).
    """
    norm_r = tf.norm(r)
    norm_x = tf.norm(x)
    return norm_r / norm_x


In [5]:
def model_robustness(model, X_test, y_test):
    """
    Avalia a robustez média do modelo em relação ao conjunto de testes.

    Args:
        model: O modelo neural treinado.
        X_test: Conjunto de dados de teste.
        y_test: Labels do conjunto de teste.

    Returns:
        Média e desvio padrão da robustez.
    """
    rho_values = []
    for i in range(len(X_test)):
        x = tf.expand_dims(X_test[i], axis=0)
        r, _, _ = deepfool(model, x)
        rho = example_robustness(r, x)
        rho_values.append(rho.numpy())
    
    mean_rho = np.mean(rho_values)
    std_rho = np.std(rho_values)
    return mean_rho, std_rho


In [6]:
# Avaliar robustez do modelo
mean_rho, std_rho = model_robustness(model, X_test, y_test)
print(f"Medium robustness: {mean_rho:.4f}")
print(f"Standard deviation of robustness: {std_rho:.4f}")

NameError: name 'model' is not defined

In [None]:
class MLP(tf.keras.Model):
    def __init__(self, input_dim, output_dim, hidden_units, dropout_rate, activations, regularization_type=None, regularization_value=0.01):
        super(MLP, self).__init__()
        self.hidden_layers = []
        self.regularization_type = regularization_type
        self.regularization_value = regularization_value

        for units, activation in zip(hidden_units, activations):
            self.hidden_layers.append(
                tf.keras.layers.Dense(units, activation=activation)
            )
            self.hidden_layers.append(tf.keras.layers.Dropout(dropout_rate))
        
        self.output_layer = tf.keras.layers.Dense(output_dim, activation='softmax')  

    def call(self, inputs):
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        return self.output_layer(x)
    
    def compute_regularization_loss(self):
        regularization_loss = 0.0
        if self.regularization_type:
            for layer in self.hidden_layers:
                if isinstance(layer, tf.keras.layers.Dense):
                    weights = layer.kernel
                    if self.regularization_type == 'l1':
                        regularization_loss += tf.reduce_sum(tf.abs(weights)) * self.regularization_value
                    elif self.regularization_type == 'l2':
                        regularization_loss += tf.reduce_sum(tf.square(weights)) * self.regularization_value
        return regularization_loss

def load_fold_data(fold_index, files):
    # Adjust fold_index to be zero-based
    data = pd.read_csv(files[fold_index-1]).to_numpy()

    if np.isnan(data).any():
        print(f"Warning: Missing values detected in file {files[fold_index - 1]}.")
        data = data[~np.isnan(data).any(axis=1)]  # Remove rows with NaN values
    X = data[:, :-1]  # Features
    y = data[:, -1].astype(int)  # Labels
    if (y < 0).any() or (y >= 10).any():
        raise ValueError(f"Invalid label values detected in file {files[fold_index - 1]}. Labels: {np.unique(y)}")
    return X, y

files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1, 11)]

# Define the test fold
fold_test = 1
X_test, y_test = load_fold_data(fold_test, files)

# Define the training folds
X_train, y_train = [], []
for i in range(1, 11):  # Total of 10 folds
    if i != fold_test:
        X_temp, y_temp = load_fold_data(i, files)
        X_train.append(X_temp)
        y_train.append(y_temp)

# Concatenate the training data
X_train = np.concatenate(X_train, axis=0)
y_train = np.concatenate(y_train, axis=0)

# Hyperparameters
best_config = {
    'hidden_units': [256, 128, 64],
    'activations': ['relu', 'relu', 'relu'],
    'dropout_rate': 0.3,
    'batch_size': 64,
    'epochs': 20,
    'learning_rate': 0.0001,
    'regularization_type': None,
    'regularization_value': 0.01
}

# Initialize and train the model
model = MLP(
    input_dim=X_train.shape[1],
    output_dim=10,  # Classes from 0 to 9
    hidden_units=best_config['hidden_units'],
    dropout_rate=best_config['dropout_rate'],
    activations=best_config['activations']
)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=best_config['learning_rate']),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=['accuracy']
)

# Example validation split
X_val, y_val = X_train[:len(X_train)//10], y_train[:len(y_train)//10]

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    batch_size=best_config['batch_size'],
    epochs=best_config['epochs']
)


Epoch 1/20
[1m123/123[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.1875 - loss: 2.2475 - val_accuracy: 0.2917 - val_loss: 2.1037
Epoch 2/20
[1m123/123[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.3056 - loss: 2.0502 - val_accuracy: 0.3567 - val_loss: 1.9065
Epoch 3/20
[1m123/123[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.4009 - loss: 1.8247 - val_accuracy: 0.4420 - val_loss: 1.7311
Epoch 4/20
[1m123/123[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.4830 - loss: 1.6741 - val_accuracy: 0.4420 - val_loss: 1.6252
Epoch 5/20
[1m123/123[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.5298 - loss: 1.5154 - val_accuracy: 0.4943 - val_loss: 1.4976
Epoch 6/20
[1m123/123[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.5504 - loss: 1.4208 - val_accuracy: 0.5121 - val_loss: 1.4090
Epoch 7/20
[1m123/123[0m 

In [40]:
def load_fold_data(fold_number, files):
    data = pd.read_csv(files[fold_number - 1]) 
    if data.empty:
        print(f"Erro: O arquivo {files[fold_number - 1]} está vazio ou não foi carregado corretamente.")
    labels = data.pop('Label').values
    features = data.values
    return features, labels


def train_evaluate_model(config, X_train, y_train, X_val, y_val):
    model = MLP(
        input_dim=X_train.shape[1],
        output_dim=10,
        hidden_units=config['hidden_units'],
        dropout_rate=config['dropout_rate'],
        activations=config['activations'],
        regularization_type=config.get('regularization_type', None),
        regularization_value=config.get('regularization_value', 0.01)
    )
    
    def loss_with_regularization(y_true, y_pred):
        base_loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
        regularization_loss = model.compute_regularization_loss()
        return base_loss + regularization_loss

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=config['learning_rate']),
        loss=loss_with_regularization,
        metrics=['accuracy']
    )
    
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )
    
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=config['batch_size'],
        epochs=config['epochs'],
        callbacks=[early_stopping],
        verbose=0
    )
    return history 

def deepfool_mlp(model, x0, y0, max_iter=50, epsilon=1e-6):
    x_adv = tf.identity(x0)  # Tensor inicial
    for i in range(max_iter):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            logits = model(x_adv)
            pred_label = tf.argmax(logits, axis=-1).numpy()[0]

        if pred_label != y0:
            return x_adv.numpy()  # Retorna o adversarial se o modelo errar

        gradients = tape.gradient(logits, x_adv).numpy()[0]
        logits = logits.numpy()[0]
        current_label = pred_label

        perturbations = []
        for k in range(len(logits)):
            if k != y0:
                diff_grad = gradients[k] - gradients[y0]
                diff_logits = logits[k] - logits[y0]
                perturbation = abs(diff_logits) / (np.linalg.norm(diff_grad) + epsilon)
                perturbations.append((perturbation, k))

        perturbations.sort(key=lambda x: x[0])  # Ordena pelo menor deslocamento necessário
        r_min, _ = perturbations[0]
        x_adv += r_min * gradients

    return x_adv.numpy()  # Retorna a última iteração como fallback


def cross_validation_mlp_deepfool(datasets, model_builder, params):
    folds = list(range(1, 11))
    accuracy_values = []
    loss_values = []
    robustness_values = []

    for test_fold in folds:
        train_val_folds = [fold for fold in folds if fold != test_fold]
        train_folds = train_val_folds[:-1]
        val_fold = train_val_folds[-1]
        
        # Carregar dados
        X_train, y_train = [], []
        for fold in train_folds:
            X_temp, y_temp = load_fold_data(fold, files)
            X_train.append(X_temp)
            y_train.append(y_temp)
        X_train = np.concatenate(X_train, axis=0)
        y_train = np.concatenate(y_train, axis=0)

        X_val, y_val = load_fold_data(val_fold, files)
        X_test, y_test = load_fold_data(test_fold, files)

        # Treinar modelo
        model = model_builder(input_dim=X_train.shape[1], best_config=best_config)
        model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            batch_size=params['batch_size'],
            epochs=params['epochs'],
            verbose=0
        )

        # Avaliação de desempenho
        fold_loss, fold_accuracy = model.evaluate(X_test, y_test, verbose=0)
        accuracy_values.append(fold_accuracy)
        loss_values.append(fold_loss)

        # Robustez com DeepFool
        adversarial_success = 0
        for idx in range(len(X_test)):
            x0 = np.expand_dims(X_test[idx], axis=0)
            y0 = y_test[idx]
            x_adv = deepfool_mlp(model, x0, y0)
            adv_pred = tf.argmax(model(x_adv), axis=-1).numpy()[0]

            if adv_pred != y0:
                adversarial_success += 1

        robustness = 1 - (adversarial_success / len(X_test))
        robustness_values.append(robustness)
        print(f"Fold {test_fold}: Accuracy={fold_accuracy:.4f}, Loss={fold_loss:.4f}, Robustness={robustness:.4f}")

    return accuracy_values, loss_values, robustness_values


In [41]:
params = {
    'hidden_units': [256, 128, 64],
    'activations': ['relu', 'relu', 'relu'],
    'dropout_rate': 0.3,
    'batch_size': 64,
    'epochs': 20,
    'learning_rate': 0.0001,
    'regularization_type': None,
    'regularization_value': 0.01
}

In [42]:
def build_mlp_model(input_dim, best_config):
    """
    Build and compile an MLP model based on the provided configuration.

    Args:
        input_dim (int): Number of input features.
        best_config (dict): Dictionary containing hyperparameters.

    Returns:
        tf.keras.Model: Compiled MLP model.
    """
    model = MLP(
                input_dim=X_train.shape[1],
                output_dim=10,
                hidden_units=best_config['hidden_units'],
                dropout_rate=best_config['dropout_rate'],
                activations=best_config['activations'],
                regularization_type=best_config.get('regularization_type', None),
                regularization_value=best_config.get('regularization_value', 0.01)
            )

    # Compile the model
    model.compile(
                optimizer=tf.keras.optimizers.Adam(learning_rate=best_config['learning_rate']),
                loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                metrics=['accuracy']
            )

    return model


In [43]:
# Load all folds into a list
files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1, 11)]
folds = [pd.read_csv(file) for file in files]

In [44]:
accuracy_values, loss_values, robustness_values = cross_validation_mlp_deepfool(
    datasets=files,            
    model_builder=build_mlp_model,  
    params=best_config        
)

print("Resultados da Validação Cruzada:")
for i in range(len(accuracy_values)):
    print(f"Fold {i + 1}:")
    print(f" - Acurácia: {accuracy_values[i]:.4f}")
    print(f" - Perda: {loss_values[i]:.4f}")
    print(f" - Robustez: {robustness_values[i]:.4f}")

Fold 1: Accuracy=0.6472, Loss=1.1020, Robustness=0.5613
Fold 2: Accuracy=0.5405, Loss=1.3149, Robustness=0.4944
Fold 3: Accuracy=0.6000, Loss=1.1624, Robustness=0.5319
Fold 4: Accuracy=0.5970, Loss=1.1801, Robustness=0.5131
Fold 5: Accuracy=0.6132, Loss=1.0799, Robustness=0.5192
Fold 6: Accuracy=0.5699, Loss=1.4149, Robustness=0.5140
Fold 7: Accuracy=0.5656, Loss=1.3666, Robustness=0.5179
Fold 8: Accuracy=0.6414, Loss=1.2023, Robustness=0.5434
Fold 9: Accuracy=0.5172, Loss=1.3547, Robustness=0.4301
Fold 10: Accuracy=0.5974, Loss=1.2022, Robustness=0.5030
Resultados da Validação Cruzada:
Fold 1:
 - Acurácia: 0.6472
 - Perda: 1.1020
 - Robustez: 0.5613
Fold 2:
 - Acurácia: 0.5405
 - Perda: 1.3149
 - Robustez: 0.4944
Fold 3:
 - Acurácia: 0.6000
 - Perda: 1.1624
 - Robustez: 0.5319
Fold 4:
 - Acurácia: 0.5970
 - Perda: 1.1801
 - Robustez: 0.5131
Fold 5:
 - Acurácia: 0.6132
 - Perda: 1.0799
 - Robustez: 0.5192
Fold 6:
 - Acurácia: 0.5699
 - Perda: 1.4149
 - Robustez: 0.5140
Fold 7:
 - Acurá

In [48]:
def cross_validation_mlp_deepfool(datasets, model_builder, params):
    folds = list(range(1, 11))
    accuracy_values = []
    loss_values = []
    robustness_values = []

    for test_fold in folds:
        train_val_folds = [fold for fold in folds if fold != test_fold]
        train_folds = train_val_folds[:-1]
        val_fold = train_val_folds[-1]
        
        # Load data
        X_train, y_train = [], []
        for fold in train_folds:
            X_temp, y_temp = load_fold_data(fold, files)
            X_train.append(X_temp)
            y_train.append(y_temp)
        X_train = np.concatenate(X_train, axis=0)
        y_train = np.concatenate(y_train, axis=0)

        X_val, y_val = load_fold_data(val_fold, files)
        X_test, y_test = load_fold_data(test_fold, files)

        # Train model
        model = model_builder(input_dim=X_train.shape[1], best_config=best_config)
        model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            batch_size=params['batch_size'],
            epochs=params['epochs'],
            verbose=0
        )

        # Evaluate performance
        fold_loss, fold_accuracy = model.evaluate(X_test, y_test, verbose=0)
        accuracy_values.append(fold_accuracy)
        loss_values.append(fold_loss)

        # Robustness with DeepFool
        adversarial_success = 0
        for idx in range(len(X_test)):
            x0 = np.expand_dims(X_test[idx], axis=0)
            y0 = y_test[idx]
            x_adv = deepfool_mlp(model, x0, y0)
            adv_pred = tf.argmax(model(x_adv), axis=-1).numpy()[0]

            if adv_pred != y0:
                adversarial_success += 1

        robustness = 1 - (adversarial_success / len(X_test))
        robustness_values.append(robustness)
        print(f"Fold {test_fold}: Accuracy={fold_accuracy:.4f}, Loss={fold_loss:.4f}, Robustness={robustness:.4f}")

    # Calculate averages
    avg_accuracy = np.mean(accuracy_values)
    avg_loss = np.mean(loss_values)
    avg_robustness = np.mean(robustness_values)

    # Display averages
    print("\nCross-Validation Summary:")
    print(f"Average Accuracy: {avg_accuracy:.4f}")
    print(f"Average Loss: {avg_loss:.4f}")
    print(f"Average Robustness: {avg_robustness:.4f}")

    return accuracy_values, loss_values, robustness_values

accuracy_values, loss_values, robustness_values = cross_validation_mlp_deepfool(
    datasets=files,            
    model_builder=build_mlp_model,  
    params=best_config        
)

Fold 1: Accuracy=0.6804, Loss=1.1354, Robustness=0.5945
Fold 2: Accuracy=0.5439, Loss=1.2987, Robustness=0.4899
Fold 3: Accuracy=0.6032, Loss=1.1992, Robustness=0.5178
Fold 4: Accuracy=0.5808, Loss=1.2251, Robustness=0.4980
Fold 5: Accuracy=0.6154, Loss=1.0490, Robustness=0.5331
Fold 6: Accuracy=0.5176, Loss=1.5814, Robustness=0.4520
Fold 7: Accuracy=0.5525, Loss=1.3745, Robustness=0.5060
Fold 8: Accuracy=0.6154, Loss=1.2348, Robustness=0.5186
Fold 9: Accuracy=0.5417, Loss=1.3398, Robustness=0.4608
Fold 10: Accuracy=0.6045, Loss=1.1937, Robustness=0.5293

Cross-Validation Summary:
Average Accuracy: 0.5855
Average Loss: 1.2632
Average Robustness: 0.5100
