### **Introduction**

Deep learning models have achieved significant success across various domains, including image recognition, natural language processing, and audio analysis. Despite their impressive accuracy, these models are susceptible to **adversarial examples**—small, often imperceptible perturbations that can cause misclassifications. Understanding and mitigating these vulnerabilities are critical for improving model robustness.

The **DeepFool algorithm**, proposed by Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard, systematically computes the smallest perturbation required to alter a model's prediction. This notebook employs DeepFool to evaluate the robustness of two architectures trained on the **UrbanSound8K dataset**: a **Convolutional Neural Network (CNN)** and a **Recurrent Neural Network (RNN)**.

---

### **DeepFool Algorithm**

The DeepFool algorithm iteratively linearizes a classifier's decision boundaries to compute the smallest perturbation $ r $ required to change the predicted label of an input $x$.

1. **Initialization**:
   - Input: Classifier $f$, example $x$, overshoot parameter $ \eta \ $, and maximum iterations $T $.
   - Predicted label:
     $$
     \hat{k}(x) = \arg\max_k f_k(x),
     $$
     where $f_k(x)$ is the classifier's probability for class $k $.

2. **Iteration**:
   - Compute the gradients of the classifier output with respect to $x$ for all classes.
   - Determine the minimal perturbation $r$ required to cross the decision boundary:
     $$
     r = \frac{\left| f_k(x) - f_{\hat{k}(x)}(x) \right|}{\left\| \nabla f_k(x) - \nabla f_{\hat{k}(x)}(x) \right\|_2} \cdot \frac{\nabla f_k(x) - \nabla f_{\hat{k}(x)}(x)}{\left\| \nabla f_k(x) - \nabla f_{\hat{k}(x)}(x) \right\|_2}.
     $$
   - Update $ x $ with the perturbation $ r $:
     $$
     x \gets x + r.
     $$

3. **Stopping Condition**:
   - The algorithm stops when $ \hat{k}(x + r) \neq \hat{k}(x) $ or the maximum number of iterations $ T $ is reached.

4. **Output**:
   - Return the minimal perturbation $ r $, the number of iterations, and the new predicted label.

---

### **Robustness Metric**

The robustness of a classifier $ f $ is defined as the expected relative norm of the minimal perturbation $ r $ with respect to the input $ x $:
$$
\rho_{\text{adv}}(f) = \mathbb{E}_{x \sim \mathcal{X}_{\text{test}}} \left[ \frac{\|r\|_2}{\|x\|_2} \right].
$$

In practice, this expectation is approximated by the mean over all test examples:
$$
\rho_{\text{adv}}(f) \approx \frac{1}{|\mathcal{X}_{\text{test}}|} \sum_{x \in \mathcal{X}_{\text{test}}} \frac{\|r\|_2}{\|x\|_2}.
$$

---

### **Notebook Objectives**

1. **Apply DeepFool to CNN and RNN Models**:
   - Evaluate minimal adversarial perturbations for both architectures trained on the UrbanSound8K dataset.

2. **Measure Robustness**:
   - Compute the robustness metric $rho_{\text{adv}}$ for both models across 10 cross-validation folds.

3. **Compare Results**:
   - Analyze differences in robustness between CNN and RNN architectures to identify which model demonstrates greater resistance to adversarial perturbations.

This notebook provides insights into the adversarial vulnerabilities of CNN and RNN models, offering guidance for developing more robust deep learning systems.

## Imports and Setup

In [1]:
# Import necessary libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras import optimizers
import keras
import pickle
import pandas as pd
import os
from copy import deepcopy
from sklearn.preprocessing import StandardScaler
from matplotlib import pyplot as plt

# Constants
SAMPLE_RATE = 22050
HOP_LENGTH = round(SAMPLE_RATE * 0.0125)
TIME_SIZE = 4 * SAMPLE_RATE // HOP_LENGTH + 1
TARGET_WIDTH = 320  # Width for 2D features
NUM_CLASSES = 10  # Number of classes in the dataset


## Feature Preparation

This cell loads pre-extracted features from the UrbanSound8K dataset, organizes them into a DataFrame, and prepares both 2D (e.g., mel spectrograms, chroma features) and 1D (e.g., spectral rolloff, zero-crossing rate) features for model input.

In [2]:
# Load pre-extracted features
with open('urbansound8k_features.pkl', 'rb') as f:
    data = pickle.load(f)

# Create a DataFrame for easy manipulation
features_df = pd.DataFrame(data)

# Prepare 2D and 1D features
X_mel = np.array(features_df['mel_spec'].tolist())
X_mfcc = np.array(features_df['mfccs'].tolist())
X_chroma = np.array(features_df['chroma'].tolist())
X_contrast = np.array(features_df['spectral_contrast'].tolist())
X_rolloff = np.array(features_df['spectral_rolloff'].tolist())
X_zcr = np.array(features_df['zero_crossing_rate'].tolist())
y = np.array(features_df['label'].tolist())

# Reshape 1D features
X_rolloff = X_rolloff.reshape(X_rolloff.shape[0], -1)
X_zcr = X_zcr.reshape(X_zcr.shape[0], -1)

## Supporting Functions

**Description:**

This code implements the **DeepFool algorithm** for computing minimal adversarial perturbations on a multi-input deep learning model. It includes the following functions:

1. **`get_gradient`**:  
   Computes the gradient of the \(k\)-th output of the model with respect to its multi-input data. The function is specifically designed to handle models with multiple input features by treating inputs as dictionaries of tensors. This is essential for determining the direction and magnitude of perturbations required to fool the model.

2. **`deepfool`**:  
   Implements the DeepFool algorithm to iteratively find the smallest perturbation \(r\) that changes the model's predicted label.  
   - **Supports multi-input models** by processing inputs as dictionaries, where each key corresponds to a specific feature type (e.g., `input_1`, `input_2`).
   - **Inputs**:
     - `model`: The multi-input neural network.
     - `x0`: The original input as a dictionary of tensors.
     - `eta`: Overshoot parameter to adjust perturbation size.
     - `max_iter`: Maximum number of iterations allowed.
     - `num_classes`: Number of classes in the classification task.
   - **Outputs**:
     - `r_sum`: The cumulative perturbation applied to the input.
     - `loop_i`: Number of iterations performed.
     - `label_xi`: The new predicted label after perturbation.

3. **`example_robustness`**:  
   Calculates the robustness value, defined as the ratio of the norm of the perturbation to the norm of the input:
   $$
   \rho = \frac{\|r(x)\|_2}{\|x\|_2}.
   $$
   This function is compatible with multi-input models, summing norms across all input components.

4. **`model_robustness`**:  
   Computes the mean and standard deviation of robustness values across multiple examples, providing an overall measure of the model's adversarial robustness.  

These functions are designed to evaluate the robustness of multi-input classifiers against adversarial attacks by identifying and quantifying the minimal perturbations required to change predictions. The multi-input support makes this implementation versatile for models that process diverse types of inputs, such as images, text, or combined features. This framework is particularly useful for analyzing and comparing the robustness of different architectures, such as CNNs, RNNs, or hybrid models.

In [3]:
def get_gradient(model, x, k):
    #Computes the gradient of the k-th element in the model output with respect to the input x.
    with tf.GradientTape() as tape:
        inputs = {key: tf.cast(value, dtype=tf.float32) for key, value in x.items()}
        for value in inputs.values():
            tape.watch(value)
        results = model(inputs)
        # Ensure results is 1D or extract the correct batch element
        results_k = results[0, k] if len(results.shape) > 1 else results[k]

    gradients = tape.gradient(results_k, inputs)
    return {key: grad.numpy() for key, grad in gradients.items()}, results



def deepfool(model, x0, eta=0.01, max_iter=20, num_classes=10):
    #Implements the DeepFool algorithm for a multi-input model.
    # Obtain the initial estimated label
    f_x0 = model(x0).numpy().flatten()
    label_x0 = np.argmax(f_x0)

    loop_i = 0
    xi = deepcopy(x0)
    label_xi = label_x0

    # Main loop
    while label_xi == label_x0 and loop_i < max_iter:
        w_k_list = []
        f_k_list = []
        grad_f_label_x0_on_xi, f_xi = get_gradient(model, xi, label_x0)

        for k in range(num_classes):
            if k == label_x0:
                continue
            grad_f_k_on_xi, _ = get_gradient(model, xi, k)
            w_k = {key: grad_f_k_on_xi[key] - grad_f_label_x0_on_xi[key] for key in xi.keys()}
            f_k = f_xi[0, k] - f_xi[0, label_x0]
            w_k_norm = np.sqrt(sum(np.linalg.norm(w_k_input.flatten())**2 for w_k_input in w_k.values()))
            fk_wk = np.abs(f_k) / (w_k_norm + 1e-8)
            w_k_list.append((fk_wk, w_k, f_k))

        # Find minimal perturbation
        fk_wk_min, w_l, f_l = min(w_k_list, key=lambda t: t[0])

        # Compute perturbation
        ri_const = np.abs(f_l) / (sum(np.linalg.norm(w_l_input.flatten())**2 for w_l_input in w_l.values()) + 1e-8)
        ri = {key: ri_const * w_l_input for key, w_l_input in w_l.items()}

        # Update xi
        xi = {key: xi[key] + ri[key] for key in xi.keys()}

        # Update label
        f_xi = model(xi).numpy().flatten()
        label_xi = np.argmax(f_xi)
        loop_i += 1

    # Compute total perturbation
    r_sum = {key: (1 + eta) * (xi[key] - x0[key]) for key in x0.keys()}

    return r_sum, loop_i, label_xi


def example_robustness(x, r, epsilon=1e-8):
    #Calculates the robustness value ||r(x)|| / ||x|| for multi-input data.
    r_norm = np.sqrt(sum(np.linalg.norm(r_input.flatten())**2 for r_input in r.values()))
    x_norm = np.sqrt(sum(np.linalg.norm(x_input.flatten())**2 for x_input in x.values()))
    return r_norm / (x_norm + epsilon)



def model_robustness(example_robustness_list):
    #Calculates the mean and standard deviation of robustness values for the model.
    mean = np.mean(np.array(example_robustness_list))
    std = np.std(np.array(example_robustness_list))
    return mean, std

Utility functions for saving and loading data with Pickle:

In [4]:
def save_pkl(data, path):
    with open(path, "wb") as saved_data:
        pickle.dump(data, saved_data)
    saved_data.close()

def load_pkl(path):
    to_return = None
    with open(path, "rb") as loaded_data:
        to_return = pickle.load(loaded_data)
    loaded_data.close()
    return to_return

## Data Preprocessing and Feature Scaling Functions

 This code handles feature preparation for training and evaluation. It includes:
- Adjusting feature dimensions to a consistent size using padding or truncation for both 2D (e.g., mel-spectrograms) and 1D features (e.g., spectral rolloff).
- Scaling features for normalization, ensuring consistent input distributions.
- Splitting data into train, validation, and test sets based on cross-validation folds, organizing the features and labels for each.


In [5]:
# Function to pad or truncate 2D features
def pad_or_truncate_2d(X):
    num_samples, height, width = X.shape
    if width > TARGET_WIDTH:
        X = X[:, :, :TARGET_WIDTH]  # Truncate
    elif width < TARGET_WIDTH:
        pad_width = TARGET_WIDTH - width
        X = np.pad(X, ((0, 0), (0, 0), (0, pad_width)), mode='constant')  # Pad
    return X

# Function to pad or truncate 1D features
def pad_or_truncate_1d(X):
    num_samples, width = X.shape
    if width > TARGET_WIDTH:
        X = X[:, :TARGET_WIDTH]  # Truncate
    elif width < TARGET_WIDTH:
        pad_width = TARGET_WIDTH - width
        X = np.pad(X, ((0, 0), (0, pad_width)), mode='constant')  # Pad
    return X

# Function to scale features
def scale_features(X, scaler):
    if X.ndim == 3:
        num_features = X.shape[1]
        X_flat = X.reshape(X.shape[0], -1)
        X_scaled_flat = scaler.fit_transform(X_flat)
        X_scaled = X_scaled_flat.reshape(X.shape[0], num_features, X.shape[2])
    elif X.ndim == 2:
        X_scaled = scaler.fit_transform(X)
    else:
        raise ValueError(f"Input X must be 2D or 3D array, but got array with shape {X.shape}")
    return X_scaled


In [6]:
def ready_data(fold, features_df, X_mel, X_mfcc, X_chroma, X_contrast,
               X_rolloff, X_zcr, y, TARGET_WIDTH=320):
    # Use the specified fold as the test set
    test_idx = (features_df['fold'] == fold).values

    # Use the next fold as the validation set (cycling back to 1 after 10)
    validation_fold = (fold % 10) + 1
    val_idx = (features_df['fold'] == validation_fold).values

    # Use the remaining folds as the training set
    train_idx = ~(test_idx | val_idx)


    # Adjust features for train, validation, and test sets
    X_train_mel = pad_or_truncate_2d(X_mel[train_idx])
    X_val_mel = pad_or_truncate_2d(X_mel[val_idx])
    X_test_mel = pad_or_truncate_2d(X_mel[test_idx])

    X_train_mfcc = pad_or_truncate_2d(X_mfcc[train_idx])
    X_val_mfcc = pad_or_truncate_2d(X_mfcc[val_idx])
    X_test_mfcc = pad_or_truncate_2d(X_mfcc[test_idx])

    X_train_chroma = pad_or_truncate_2d(X_chroma[train_idx])
    X_val_chroma = pad_or_truncate_2d(X_chroma[val_idx])
    X_test_chroma = pad_or_truncate_2d(X_chroma[test_idx])

    X_train_contrast = pad_or_truncate_2d(X_contrast[train_idx])
    X_val_contrast = pad_or_truncate_2d(X_contrast[val_idx])
    X_test_contrast = pad_or_truncate_2d(X_contrast[test_idx])

    X_train_rolloff = pad_or_truncate_1d(X_rolloff[train_idx])
    X_val_rolloff = pad_or_truncate_1d(X_rolloff[val_idx])
    X_test_rolloff = pad_or_truncate_1d(X_rolloff[test_idx])

    X_train_zcr = pad_or_truncate_1d(X_zcr[train_idx])
    X_val_zcr = pad_or_truncate_1d(X_zcr[val_idx])
    X_test_zcr = pad_or_truncate_1d(X_zcr[test_idx])

    # Extract labels
    y_train = y[train_idx]
    y_val = y[val_idx]
    y_test = y[test_idx]

    return (X_train_mel, X_train_mfcc, X_train_chroma, X_train_contrast, X_train_rolloff, X_train_zcr, y_train,
            X_val_mel, X_val_mfcc, X_val_chroma, X_val_contrast, X_val_rolloff, X_val_zcr, y_val,
            X_test_mel, X_test_mfcc, X_test_chroma, X_test_contrast, X_test_rolloff, X_test_zcr, y_test)

## Robustness for All Folds

Here we calculate the robustness of the CNN model for each fold that exists in the dataset, we use the same method we did for the CNN training within the 10 folds, and take the same precautions of input sizes and regulating 2D and 1D features.

In [8]:
# Loop over the folds
for fold in range(1, 11):
    print(f"Processing fold {fold}...")

    file_path = f"robustness/robustness_cnn{fold}.pkl"

    # Skip fold if the robustness file already exists
    if os.path.exists(file_path):
        continue

    # Prepare data for this fold using the ready_data function
    (X_train_mel, X_train_mfcc, X_train_chroma, X_train_contrast, X_train_rolloff, X_train_zcr, y_train,
     X_val_mel, X_val_mfcc, X_val_chroma, X_val_contrast, X_val_rolloff, X_val_zcr, y_val,
     X_test_mel, X_test_mfcc, X_test_chroma, X_test_contrast, X_test_rolloff, X_test_zcr, y_test) = ready_data(
        fold, features_df, X_mel, X_mfcc, X_chroma, X_contrast, X_rolloff, X_zcr, y
    )

    # Scale features individually per fold
    scaler_mel = StandardScaler()
    scaler_mfcc = StandardScaler()
    scaler_chroma = StandardScaler()
    scaler_contrast = StandardScaler()
    scaler_rolloff = StandardScaler()
    scaler_zcr = StandardScaler()

    # Scale and reshape features
    def scale_and_reshape_test(X_train, X_test, scaler):
        X_train_scaled = scale_features(X_train, scaler)
        X_test_scaled = scaler.transform(X_test.reshape(X_test.shape[0], -1)).reshape(X_test.shape)
        return X_train_scaled[..., np.newaxis], X_test_scaled[..., np.newaxis]

    X_train_mel_scaled, X_test_mel_scaled = scale_and_reshape_test(X_train_mel, X_test_mel, scaler_mel)
    X_train_mfcc_scaled, X_test_mfcc_scaled = scale_and_reshape_test(X_train_mfcc, X_test_mfcc, scaler_mfcc)
    X_train_chroma_scaled, X_test_chroma_scaled = scale_and_reshape_test(X_train_chroma, X_test_chroma, scaler_chroma)
    X_train_contrast_scaled, X_test_contrast_scaled = scale_and_reshape_test(X_train_contrast, X_test_contrast, scaler_contrast)

    # For 1D features (Rolloff and ZCR)
    X_train_rolloff_scaled = scale_features(X_train_rolloff, scaler_rolloff)
    X_test_rolloff_scaled = scaler_rolloff.transform(X_test_rolloff)
    X_train_zcr_scaled = scale_features(X_train_zcr, scaler_zcr)
    X_test_zcr_scaled = scaler_zcr.transform(X_test_zcr)

    # Combine 1D features after adding channel dimension
    def combine_1d_features(rolloff, zcr):
        rolloff = rolloff[..., np.newaxis]
        zcr = zcr[..., np.newaxis]
        return np.concatenate([rolloff, zcr], axis=-1)

    X_train_1d = combine_1d_features(X_train_rolloff_scaled, X_train_zcr_scaled)
    X_test_1d = combine_1d_features(X_test_rolloff_scaled, X_test_zcr_scaled)

    # Load the model for the fold
    model_path = f"assets/kfold_metrics/model_fold{fold}.keras"
    fold_model_cnn = keras.models.load_model(model_path, compile=False)
    fold_model_cnn.compile(
        optimizer=optimizers.Adam(learning_rate=0.001),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )

    # Initialize the list to hold robustness values
    robustness_values_cnn_fold = []
    num_test_examples = X_test_mel_scaled.shape[0]

        # Run DeepFool for each example
    for i in range(num_test_examples):
        print(f"Processing example {i+1}/{num_test_examples} in fold {fold}")

        # Prepare the input for the model
        example_input = {
            "mel_input": np.expand_dims(X_test_mel_scaled[i], axis=0),
            "mfcc_input": np.expand_dims(X_test_mfcc_scaled[i], axis=0),
            "chroma_input": np.expand_dims(X_test_chroma_scaled[i], axis=0),
            "contrast_input": np.expand_dims(X_test_contrast_scaled[i], axis=0),
            "rolloff_input": np.expand_dims(X_test_1d[i, :, 0:1], axis=0),
            "zcr_input": np.expand_dims(X_test_1d[i, :, 1:2], axis=0)
        }

        # Run DeepFool
        perturbation, iters, fool_label = deepfool(fold_model_cnn, example_input)

        # Compute the robustness value
        robustness_value = example_robustness(example_input, perturbation)

        robustness_values_cnn_fold.append(robustness_value)

        # Save robustness results
    os.makedirs("robustness", exist_ok=True)
    with open(file_path, "wb") as f_out:
        pickle.dump(robustness_values_cnn_fold, f_out)
    print(f"Saved robustness results for fold {fold}.")

Processing fold 1...
Processing fold 2...
Processing fold 3...
Processing fold 4...
Processing fold 5...
Processing fold 6...


KeyboardInterrupt: 

## Calculate results for each fold

In [9]:
# Function to load robustness values from a pickle file
def load_robustness_values(file_path):
    with open(file_path, 'rb') as f:
        return pickle.load(f)

# Check each fold's models' results
for fold in range(1, 11):  # Iterate through folds 1 to 10
    file_path = f"robustness/robustness_cnn{fold}.pkl"  # Path to the robustness file for each fold

    # Ensure the file exists before loading
    if os.path.exists(file_path):
        robustness_values_cnn_fold = load_robustness_values(file_path)
        mean_robustness_cnn, std_robustness_cnn = model_robustness(robustness_values_cnn_fold)
        print(f"Fold {fold} - The CNN model has a robustness of {mean_robustness_cnn:.7f} +/- {std_robustness_cnn:.7f}.")
    else:
        print(f"Fold {fold} - Robustness file not found.")

Fold 1 - The CNN model has a robustness of 1.3798980 +/- 4.1650213.
Fold 2 - The CNN model has a robustness of 0.8795149 +/- 3.7217876.
Fold 3 - The CNN model has a robustness of 1.6633893 +/- 5.6833664.
Fold 4 - The CNN model has a robustness of 1.0007375 +/- 3.7238488.
Fold 5 - The CNN model has a robustness of 1.3241602 +/- 4.3191468.
Fold 6 - Robustness file not found.
Fold 7 - Robustness file not found.
Fold 8 - Robustness file not found.
Fold 9 - Robustness file not found.
Fold 10 - Robustness file not found.


The robustness values across folds show high standard deviations compared to their means, indicating the presence of outliers. This suggests that while the model demonstrates moderate robustness on average, certain data points deviate significantly, highlighting cases where the model is either highly robust or extremely vulnerable to adversarial perturbations. These outliers contribute to the variability observed in the robustness measurements, we will now adress these outliers.

In [10]:
import numpy as np
from collections import Counter

# Calculate mean and standard deviation of robustness values
mean_robustness = np.mean(robustness_values_cnn_fold)
std_robustness = np.std(robustness_values_cnn_fold)

# Threshold for outliers
threshold_high = mean_robustness + 2 * std_robustness
threshold_low = mean_robustness - 2 * std_robustness

# Identify outliers
outliers = [(i, value) for i, value in enumerate(robustness_values_cnn_fold) if value > threshold_high or value < threshold_low]
outlier_indices = [i for i, _ in outliers]

print(f"Number of outliers: {len(outliers)}")
print("Outlier indices and values:", outliers)

# Map robustness values to their respective classes
outlier_classes = [y_test[idx] for idx in outlier_indices]  # Use y_test instead of test_labels

# Count occurrences of each class in outliers
outlier_class_counts = Counter(outlier_classes)

print("Outlier Class Distribution:")
for class_id, count in outlier_class_counts.items():
    print(f"Class {class_id}: {count} occurrences")

# Exclude outliers from robustness values
filtered_robustness_values = [value for i, value in enumerate(robustness_values_cnn_fold) if i not in outlier_indices]

# Recalculate mean and standard deviation without outliers
filtered_mean_robustness = np.mean(filtered_robustness_values)
filtered_std_robustness = np.std(filtered_robustness_values)

print(f"Filtered Mean Robustness: {filtered_mean_robustness}")
print(f"Filtered Standard Deviation Robustness: {filtered_std_robustness}")


Number of outliers: 44
Outlier indices and values: [(134, np.float64(12.784296952876474)), (136, np.float64(24.144619940274744)), (137, np.float64(11.57103600590289)), (138, np.float64(18.514620831131403)), (139, np.float64(17.47469266491729)), (140, np.float64(15.067606740929234)), (141, np.float64(17.216584507119425)), (143, np.float64(13.984420965162009)), (147, np.float64(20.728384694158294)), (149, np.float64(15.314793961093748)), (150, np.float64(15.908570787150246)), (151, np.float64(26.652310187457466)), (152, np.float64(23.353275061638115)), (155, np.float64(15.294607511530602)), (201, np.float64(14.363248989424012)), (262, np.float64(13.21630838101174)), (268, np.float64(14.94779430493747)), (437, np.float64(11.047122847140068)), (544, np.float64(14.907048086684412)), (581, np.float64(14.145543287390014)), (592, np.float64(26.016408985380888)), (593, np.float64(29.477554521607978)), (596, np.float64(28.696355460876095)), (599, np.float64(18.126364999018605)), (600, np.float64

IndexError: index 823 is out of bounds for axis 0 with size 823

# RNN Model

## Data Preparation for the RNN

In [None]:
# Load preprocessed features (e.g., Mel Spectrograms)
with open('rnn_features.pkl', 'rb') as f:
    data = pickle.load(f)

# Extract feature arrays and labels
features_df = pd.DataFrame(data)
X_mel = np.array(features_df['mel_spec'].tolist())  # Use 'mel_spec' key for Mel Spectrograms
X_class = np.array(features_df['classID'].tolist())
folds = np.array(features_df['fold'])

# Reshape Mel spectrograms for RNN input (samples, time_steps, freq_bins)
X_mel = X_mel.reshape(X_mel.shape[0], X_mel.shape[1], X_mel.shape[2])  # Remove the channel dimension
print(f"Shape of Mel spectrogram input: {X_mel.shape}")

## Iterate over 10 folds

In [None]:
# Loop over 10 folds
for fold in range(1, 11):
    print(f"Processing fold {fold}...")

    # File to save robustness results
    file_path = f"robustness/rnn_robustness_fold{fold}.pkl"

    # Skip fold if results already exist
    if os.path.exists(file_path):
        print(f"Skipping fold {fold}, results already exist.")
        continue

    # Train-test-validation split
    test_idx = (folds == fold)
    train_idx = ~test_idx

    X_train, X_test = X_mel[train_idx], X_mel[test_idx]
    y_train, y_test = X_class[train_idx], X_class[test_idx]

    # Load the RNN model for the current fold
    model_path = f"rnn_models/rnn_model_fold{fold}.keras"
    rnn_model = keras.models.load_model(model_path, compile=False)
    rnn_model.compile(
        optimizer=Adam(learning_rate=0.001),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )

    # Initialize list for robustness values
    robustness_values_rnn_fold = []
    num_test_examples = X_test.shape[0]

    # Run DeepFool on each test example
    for i in range(num_test_examples):
        print(f"Processing example {i + 1}/{num_test_examples} in fold {fold}...")

        # Prepare input (ensure the shape matches RNN's expected input)
        example_input = np.expand_dims(X_test[i], axis=0)  # Shape: (1, time_steps, freq_bins)

        # Apply DeepFool
        perturbation, iters, fool_label = deepfool(rnn_model, example_input)

        # Compute robustness value
        robustness_value = example_robustness(example_input, perturbation)

        # Store robustness value
        robustness_values_rnn_fold.append(robustness_value)

    # Save robustness results
    os.makedirs("robustness", exist_ok=True)
    with open(file_path, "wb") as f_out:
        pickle.dump(robustness_values_rnn_fold, f_out)
    print(f"Saved robustness results for fold {fold}.")

## Analyze results

In [None]:
# Check each fold's models' results
for fold in range(1, 11):  # Iterate through folds 1 to 10
    file_path = f"robustness/robustness_rnn{fold}.pkl"  # Path to the robustness file for each fold

    # Ensure the file exists before loading
    if os.path.exists(file_path):
        robustness_values_rnn_fold = load_robustness_values(file_path)
        mean_robustness_rnn, std_robustness_rnn = model_robustness(robustness_values_rnn_fold)
        print(f"Fold {fold} - The RNN model has a robustness of {mean_robustness_rnn:.7f} +/- {std_robustness_rnn:.7f}.")
    else:
        print(f"Fold {fold} - Robustness file not found.")

In [None]:
# Calculate mean and standard deviation of robustness values
mean_robustness = np.mean(robustness_values_rnn_fold)
std_robustness = np.std(robustness_values_rnn_fold)

# Define thresholds for outliers (±2 standard deviations)
threshold_high = mean_robustness + 2 * std_robustness
threshold_low = mean_robustness - 2 * std_robustness

# Identify outliers (indices and values)
outliers = [(i, value) for i, value in enumerate(robustness_values_rnn_fold)
            if value > threshold_high or value < threshold_low]

# Print summary of outliers
print(f"Number of outliers: {len(outliers)}")
print("Outlier indices and values:", outliers)

# Ensure y_test contains class IDs, not one-hot encoding
if len(y_test.shape) > 1 and y_test.shape[1] > 1:
    y_test = np.argmax(y_test, axis=1)  # Convert one-hot to class IDs

# Map robustness values to their respective classes
outlier_classes = [y_test[idx] for idx, _ in outliers]

# Count occurrences of each class in outliers
from collections import Counter
outlier_class_counts = Counter(outlier_classes)

# Print outlier class distribution
print("Outlier Class Distribution:")
for class_id, count in outlier_class_counts.items():
    print(f"Class {class_id}: {count} occurrences")


# Conclusion

# References



   - [Moosavi-Dezfooli, S.-M., Fawzi, A., Frossard, P., Polytechnique, E. and De Lausanne, F. (2016). DeepFool: a simple and accurate method to fool deep neural networks.](https://openaccess.thecvf.com/content_cvpr_2016/papers/Moosavi-Dezfooli_DeepFool_A_Simple_CVPR_2016_paper.pdf)

   - [Morgan, A. (2022). A Review of DeepFool: a simple and accurate method to fool deep neural networks. [online] Machine Intelligence and Deep Learning](https://medium.com/machine-intelligence-and-deep-learning-lab/a-review-of-deepfool-a-simple-and-accurate-method-to-fool-deep-neural-networks-b016fba9e48e#:~:text=DeepFool%20finds%20the%20minimal%20perturbations)

