## DeepFool Algorithm

For a given classifier and example, DeepFool computes the minimal perturbation that is sufficient to change the estimated label. The remainder of this section uses the following notation:

- $f$: a given classifier that outputs a vector of class probabilities, where the entry at index $k$ is the probability assigned to class $k$.

- $x$: a given example.

- $X$: the domain of the examples.

- $T$: the domain of test examples available.

- $k$: a possible class of the considered problem. The probability that an example $x$ belongs to class $k$ is therefore $f_k(x)$.

- $\hat{k}(x)$: the estimated classification of a given example, defined as $\hat{k}(x) = \arg\max_k f_k(x)$.

- $\hat{r}(x)$: the minimal perturbation for which $\hat{k}(x) \ne \hat{k}(x+\hat{r}(x))$.

The DeepFool algorithm only outputs the value of the minimal perturbation $\hat{r}(x)$ of a given example $x$.
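As a worked illustration of what "minimal perturbation" means, consider the simplest case discussed in the DeepFool paper: a binary affine classifier $f(x) = w^T x + b$ whose label is the sign of $f(x)$. This is an assumption made for illustration only and is not the model used in this notebook. In that case the minimal perturbation has the closed form $-\frac{f(x)}{\|w\|_2^2} w$, which projects $x$ onto the decision boundary. The function name `minimal_perturbation_affine` and the numbers below are hypothetical.

```python
import numpy as np

def minimal_perturbation_affine(w, b, x):
    """Smallest L2 perturbation that moves x onto the hyperplane w.x + b = 0."""
    f_x = np.dot(w, x) + b
    return -f_x / np.dot(w, w) * w

# toy values (hypothetical): f(x) = 3*3 + 4*4 - 5 = 20
w = np.array([3.0, 4.0])
b = -5.0
x = np.array([3.0, 4.0])

r = minimal_perturbation_affine(w, b, x)
print(r)                      # approximately [-2.4, -3.2]
print(np.linalg.norm(r))      # approximately 4.0, i.e. |f(x)| / ||w||
print(np.dot(w, x + r) + b)   # approximately 0: x + r lies on the boundary
```

For a non-linear classifier no such closed form exists, which is why DeepFool repeats a linearized version of this step until the label changes.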

Finally, the proposed formal definition of a classifier's robustness to adversarial examples is the expected value, over the domain of examples, of the norm of an example's minimal perturbation divided by the norm of that same example. In practice, this expected value is approximated by the mean over all examples in the classifier's available test domain:

$\rho_{adv}(f) = \mathbb{E}_X \frac{\|\hat{r}(x)\|_2}{\|x\|_2} \approx \frac{1}{|T|} \sum_{x \in T} \frac{\|\hat{r}(x)\|_2}{\|x\|_2}$
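The empirical approximation above can be checked by hand on a tiny example. The sketch below uses two made-up examples and perturbations (not data from this notebook) and a hypothetical helper `rho_adv_estimate`; it only illustrates the arithmetic of the formula.

```python
import numpy as np

def rho_adv_estimate(examples, perturbations):
    """Mean of ||r(x)||_2 / ||x||_2 over the given (x, r) pairs."""
    ratios = [
        np.linalg.norm(np.ravel(r)) / np.linalg.norm(np.ravel(x))
        for x, r in zip(examples, perturbations)
    ]
    return float(np.mean(ratios))

# made-up values: ratios are 0.5 / 5 = 0.1 and 2 / 10 = 0.2
examples = [np.array([3.0, 4.0]), np.array([0.0, 10.0])]
perturbations = [np.array([0.3, 0.4]), np.array([0.0, 2.0])]

rho = rho_adv_estimate(examples, perturbations)
print(rho)  # approximately 0.15
```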

Function to calculate the gradient of a single class output with respect to the input (called once per class)

In [43]:
import pickle
import pandas as pd
import numpy as np
import os
import librosa
import tensorflow as tf
from copy import deepcopy
import keras
from tensorflow.keras import models, layers, regularizers, optimizers
from tensorflow.python.client import device_lib


In [45]:
def save_pkl(data, path):
    # serialize data to path; the with block closes the file automatically
    try:
        with open(path, "wb") as saved_data:
            pickle.dump(data, saved_data)
    except Exception as error:
        print(f"Failed to save data: {error}")

def load_pkl(path):
    # deserialize and return the object stored at path, or None on failure
    try:
        with open(path, "rb") as loaded_data:
            return pickle.load(loaded_data)
    except Exception as error:
        print(f"Failed to load data: {error}")
        return None

In [46]:
def get_gradient(model, x, k):

    # compute the k-th value of the model output using x as the input under the watch of tensorflow's GradientTape
    with tf.GradientTape(persistent=True) as tape:
        inputs = [tf.cast(input_value, dtype=tf.float64) for input_value in x]
        for input_value in inputs:
            tape.watch(input_value)
        results = model(inputs)
        results_k = results[0,k]

    # obtain the gradient of the k-th element in the model output w.r.t. the input
    gradients = tape.gradient(results_k, inputs)
    del tape
    return [grad.numpy() for grad in gradients], results
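To make concrete what the gradient of the $k$-th output with respect to the input looks like, the sketch below uses a small softmax-linear model in NumPy, for which the gradient has the closed form $p_k (W_k - \sum_j p_j W_j)$, and checks it against central finite differences. The model, the helper names `softmax` and `class_probability_gradient`, and the random values are assumptions for illustration; the notebook itself relies on `tf.GradientTape` for an arbitrary Keras model.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)              # shift for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

def class_probability_gradient(W, b, x, k):
    """Gradient of the k-th softmax probability w.r.t. the input x."""
    p = softmax(W @ x + b)
    return p[k] * (W[k] - p @ W), p

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))        # 3 classes, 4 input features
b = rng.normal(size=3)
x = rng.normal(size=4)
k = 1

grad, p = class_probability_gradient(W, b, x, k)

# central finite differences along each input coordinate
eps = 1e-6
numeric = np.array([
    (softmax(W @ (x + eps * e_i) + b)[k] - softmax(W @ (x - eps * e_i) + b)[k]) / (2 * eps)
    for e_i in np.eye(4)
])
print(np.max(np.abs(grad - numeric)))  # should be very small
```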


Implementation of the DeepFool algorithm

In [47]:
# function that implements the DeepFool algorithm
# model: the model to be used in the algorithm
# x0: the initial input without any perturbation (a list with one array per model input)
# eta: overshoot value; each iteration's perturbation is scaled by (1 + eta)
# max_iter: maximum number of iterations the algorithm is allowed to execute
def deepfool(model, x0, eta=0.01, max_iter=20):

    # obtain the initial estimated label
    f_x0 = model(x0).numpy().flatten()
    label_x0 = f_x0.argsort()[::-1][0]

    loop_i = 0
    xi = deepcopy(x0)
    label_xi = label_x0
    r = []

    # main loop
    while label_xi == label_x0 and loop_i < max_iter:
        w_l = [np.zeros(x_input.shape) for x_input in x0]
        f_l = 0
        fk_wk_min = np.inf
        grad_f_label_x0_on_xi, f_xi = get_gradient(model, xi, label_x0)

        for k in range(10): # k = 0, ..., 9 (possible classes in the problem considered for this project)
            if (k == label_x0):
                continue
            grad_f_k_on_xi, f_xi = get_gradient(model, xi, k)
            w_k = [g_f_k - g_f_label for g_f_k, g_f_label in zip(grad_f_k_on_xi, grad_f_label_x0_on_xi)]
            w_k_norm = np.sqrt(np.sum(np.fromiter([np.linalg.norm(w_k_input)**2 for w_k_input in w_k], dtype=np.float32)))
            f_k = f_xi[0,k] - f_xi[0,label_x0]
            fk_wk = np.linalg.norm(f_k) / (w_k_norm + 1e-3)
            if fk_wk < fk_wk_min:
                # keep track of the closest decision boundary found so far
                fk_wk_min = fk_wk
                w_l, f_l = w_k, f_k
        
        w_l_squared_norm = np.sum(np.fromiter([np.linalg.norm(w_l_input)**2 for w_l_input in w_l], dtype=np.float32))
        f_l_norm = np.linalg.norm(f_l)
        ri_const = f_l_norm / (w_l_squared_norm + 1e-3)
        ri = [ri_const * w_l_input for w_l_input in w_l]
        r.append(ri)
        xi_new = [xi_item + (1+eta)*ri_item for xi_item, ri_item in zip(xi, ri)]
        xi = xi_new
        label_xi = model(xi).numpy().flatten().argsort()[::-1][0]
        loop_i += 1

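    # main loop finished: accumulate the per-iteration perturbations for each model input
    # (the overshoot factor (1 + eta) applied to xi is not included in the returned perturbation)
    r_sum = [np.zeros(x_input.shape) for x_input in x0]
    for i in range(len(x0)):
        for r_i in r:
            r_sum[i] += r_i[i]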

    # return the value of r(x), number of iterations performed, and the new label obtained by adding the perturbation to the input
    return r_sum, loop_i, label_xi
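The class-selection logic inside the main loop can be seen in isolation on an affine multiclass classifier $f(x) = Wx + b$, where the gradients are simply the rows of $W$ and a single DeepFool step is enough to cross the nearest decision boundary. The sketch below is a NumPy illustration under that assumption, with a hypothetical function name `deepfool_affine_step` and made-up weights whose rows are distinct. It is not the notebook's implementation, which linearizes a Keras model with `get_gradient`.

```python
import numpy as np

def deepfool_affine_step(W, b, x, eta=0.01):
    """One DeepFool step for the affine classifier argmax(W @ x + b)."""
    scores = W @ x + b
    label = int(np.argmax(scores))

    best_ratio, best_w, best_f = np.inf, None, None
    for k in range(W.shape[0]):
        if k == label:
            continue
        w_k = W[k] - W[label]                    # gradient difference
        f_k = scores[k] - scores[label]          # output difference (negative)
        ratio = abs(f_k) / np.linalg.norm(w_k)   # distance to boundary k
        if ratio < best_ratio:
            best_ratio, best_w, best_f = ratio, w_k, f_k

    # minimal step onto the closest boundary, then overshoot by (1 + eta)
    r = abs(best_f) / np.dot(best_w, best_w) * best_w
    x_adv = x + (1 + eta) * r
    return r, label, int(np.argmax(W @ x_adv + b))

# made-up 3-class problem in 2 dimensions
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([2.0, 1.0])

r, label, new_label = deepfool_affine_step(W, b, x)
print(r, label, new_label)
```

Here the original scores are `[2, 1, -3]`, so the closest boundary is the one shared with class 1, and the small overshoot pushes the example just past it.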


In [48]:
def example_robustness(x, r):
    """
    Calculate robustness measure for an individual example.
    """
    r_norm = np.sqrt(np.sum([np.linalg.norm(r_input)**2 for r_input in r]))
    x_norm = np.sqrt(np.sum([np.linalg.norm(x_input)**2 for x_input in x]))
    return r_norm / x_norm


def model_robustness(example_robustness_list):
    """
    Calculate mean and standard deviation of robustness for the model.
    """
    mean = np.mean(example_robustness_list)
    std = np.std(example_robustness_list)
    return mean, std

Running DeepFool for the RNN-LSTM

In [49]:
df = pd.read_csv("../UrbanSound8K/metadata/UrbanSound8K.csv")

In [34]:
def padding(path, duration = 4, sr = 44100):
    files = librosa.util.find_files(path)
    data = []

    for index, file_path in enumerate(files):
        try:
            audio, sr = librosa.load(file_path, sr=sr, mono=True)

            if len(audio) < duration*sr: # when the audio is shorter than `duration` seconds, zero-pad it at the end
                audio = np.concatenate([audio,np.zeros(shape = (duration*sr - len(audio), ))])

            elif len(audio) > duration*sr: # when the audio is longer than `duration` seconds, truncate it
                audio = audio[:duration*sr]
        
            file_name = os.path.basename(file_path)
            data.append([file_name, audio])
        
        except Exception as error:
            print(f"Error in processing file {file_path}: {error}")

    return data
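The pad-or-truncate rule used above can be tried without any audio files. The following sketch applies the same rule to synthetic arrays; the helper name `pad_or_truncate` and the toy lengths are hypothetical.

```python
import numpy as np

def pad_or_truncate(audio, target_len):
    """Zero-pad at the end or cut so that the result has exactly target_len samples."""
    if len(audio) < target_len:
        return np.concatenate([audio, np.zeros(target_len - len(audio))])
    return audio[:target_len]

short_clip = np.ones(3)          # shorter than the target: gets two trailing zeros
long_clip = np.arange(10.0)      # longer than the target: keeps the first five samples

padded = pad_or_truncate(short_clip, 5)
truncated = pad_or_truncate(long_clip, 5)
print(padded, truncated)
```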

In [35]:
def feature_extraction(dataframe, audios, hop_length = 512, n_fft = 256):
    log_spectograms = []
    labels = []

    for index in range(len(audios)):
        try:
            file_name = audios[index][0]
            if file_name:
                # look up the metadata row that matches this audio slice
                row = dataframe.loc[dataframe["slice_file_name"] == file_name]

                if not row.empty:
                    label = row.iloc[0,6]
                    # magnitude of the short-time Fourier transform
                    spectogram = np.abs(librosa.stft(
                        y = np.array(audios[index][1]),
                        hop_length = hop_length,
                        n_fft = n_fft
                    ))
                    # convert the amplitudes to decibels
                    log_spectogram = librosa.amplitude_to_db(spectogram)
                    log_spectograms.append(log_spectogram)
                    labels.append(label)
        except Exception as error:
            print(f"Error in processing file {audios[index][0]}: {error}")

    log_spectograms = np.array(log_spectograms)
    labels = np.array(labels)
    return log_spectograms, labels
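The spectrogram dimensions that result from these parameters can be worked out by hand. Assuming the STFT is computed with centered frames (the `librosa.stft` default), a signal of $n$ samples yields $1 + \lfloor n\_fft / 2 \rfloor$ frequency bins and $1 + \lfloor n / hop\_length \rfloor$ frames. The helper `stft_shape` below is hypothetical and only performs this arithmetic.

```python
def stft_shape(n_samples, n_fft=256, hop_length=512):
    """Expected (frequency bins, frames) of a centered STFT."""
    n_freq_bins = 1 + n_fft // 2
    n_frames = 1 + n_samples // hop_length
    return n_freq_bins, n_frames

# 4 seconds of audio at 44100 Hz, as produced by the padding step
shape = stft_shape(4 * 44100)
print(shape)  # (129, 345)
```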

In [36]:
fold_paths = ["../UrbanSound8K/audio/fold1",
              "../UrbanSound8K/audio/fold2",
              "../UrbanSound8K/audio/fold3",
              "../UrbanSound8K/audio/fold4",
              "../UrbanSound8K/audio/fold5",
              "../UrbanSound8K/audio/fold6",
              "../UrbanSound8K/audio/fold7",
              "../UrbanSound8K/audio/fold8",
              "../UrbanSound8K/audio/fold9",
              "../UrbanSound8K/audio/fold10"]

In [37]:
import gc
features = np.empty((10,), dtype=object)
label = np.empty((10,), dtype=object)

for i, fold in enumerate(fold_paths):
    print(f"Processing Fold Number {i+1}")
    audio_data = padding(fold)
    log_spectograms, labels = feature_extraction(df, audio_data)

    # normalize the data of this fold to the range [0, 1]
    log_spectograms_normalized = (log_spectograms - np.min(log_spectograms)) / (np.max(log_spectograms) - np.min(log_spectograms))

    # one-hot encode the labels
    encoded_labels = np.zeros((len(labels), 10))
    encoded_labels[np.arange(len(labels)), labels] = 1
    
    features[i] = log_spectograms_normalized
    label[i] = encoded_labels
    print("Features Shape: ",features[i].shape)
    print("Labels Shape: ",label[i].shape,"\n")
    
    del log_spectograms
    del log_spectograms_normalized
    del labels
    del encoded_labels
    gc.collect() # free memory

save_pkl(features,"data/features_rnn.pkl")
save_pkl(label,"data/labels_rnn.pkl")

Processing Fold Number 1
Features Shape:  (873, 129, 345)
Labels Shape:  (873, 10) 

Processing Fold Number 2
Features Shape:  (888, 129, 345)
Labels Shape:  (888, 10) 

Processing Fold Number 3
Features Shape:  (925, 129, 345)
Labels Shape:  (925, 10) 

Processing Fold Number 4
Features Shape:  (990, 129, 345)
Labels Shape:  (990, 10) 

Processing Fold Number 5
Features Shape:  (936, 129, 345)
Labels Shape:  (936, 10) 

Processing Fold Number 6
Features Shape:  (823, 129, 345)
Labels Shape:  (823, 10) 

Processing Fold Number 7
Features Shape:  (838, 129, 345)
Labels Shape:  (838, 10) 

Processing Fold Number 8
Features Shape:  (806, 129, 345)
Labels Shape:  (806, 10) 

Processing Fold Number 9
Features Shape:  (816, 129, 345)
Labels Shape:  (816, 10) 

Processing Fold Number 10
Features Shape:  (837, 129, 345)
Labels Shape:  (837, 10) 
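The min-max normalization and one-hot encoding applied to each fold in the cell above can be checked on a toy array. The helper names `min_max_normalize` and `one_hot` and the values are hypothetical, and the sketch assumes the array is not constant (otherwise the denominator would be zero).

```python
import numpy as np

def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = np.min(values), np.max(values)
    return (values - lo) / (hi - lo)

def one_hot(labels, num_classes):
    """Encode integer class labels as rows of a 0/1 matrix."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1
    return encoded

# toy decibel values and class IDs
toy_spectrograms = np.array([[-80.0, -40.0], [-20.0, 0.0]])
toy_labels = np.array([2, 0])

normalized = min_max_normalize(toy_spectrograms)
encoded = one_hot(toy_labels, 3)
print(normalized)
print(encoded)
```

Because the minimum and maximum are taken over the whole array passed in, normalizing fold by fold means each fold is scaled with its own extremes.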



In [38]:
df_data = load_pkl("data/features_rnn.pkl")

In [39]:
df_data

array([array([[[0.44941983, 0.41713092, 0.44372114, ..., 0.58563778,
                0.60813849, 0.52276851],
               [0.43885333, 0.43250569, 0.48991636, ..., 0.68623683,
                0.65796393, 0.74013259],
               [0.4039603 , 0.44462505, 0.45634013, ..., 0.65961627,
                0.67031856, 0.74947155],
               ...,
               [0.36152731, 0.36152731, 0.36152731, ..., 0.36152731,
                0.36152731, 0.36152731],
               [0.36152731, 0.36152731, 0.36152731, ..., 0.36152731,
                0.36152731, 0.36152731],
               [0.36152731, 0.36152731, 0.36152731, ..., 0.36152731,
                0.36152731, 0.36152731]],

              [[0.52118725, 0.49201721, 0.40953051, ..., 0.5151961 ,
                0.57059886, 0.57214958],
               [0.51119143, 0.45011978, 0.437294  , ..., 0.48364943,
                0.53344393, 0.55138207],
               [0.47889804, 0.39415313, 0.44182374, ..., 0.35703329,
                0.35541366, 0

In [None]:

for f in range(1, 10+1):

    # this list will hold the ||r(x)|| / ||x|| values defined at the start of the notebook
    robustness_values_rnn_fold = []

    # use only the data of the f-th fold (df_data is 0-indexed, so fold f is stored at index f - 1)
    X_fold = df_data[f - 1]

    # the data to be input into the RNN model (spectrogram)
    # explicit cast to ensure type integrity of the data before being passed to the model
    

    # load the model from the performance evaluation cross validation run that used the f-th fold as the test fold
    # the compile keyword argument was set to false such that the model can be explicitly re-compiled using the compile settings defined on LSTM.ipynb
    fold_model_rnn = keras.models.load_model(f"kfold_metrics_LSTM/model_fold{f}.keras", compile=False)
    fold_model_rnn.compile(
        optimizer=optimizers.Adam(learning_rate=0.001),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )

    # begin the run of each example in the f-th fold and the corresponding model on the DeepFool algorithm
    for i in range(len(X_fold)):

        # prepare the input for the implemented DeepFool function
        example_input = [np.expand_dims(X_fold[i], axis = 0)]

        # run DeepFool
        # eta keyword argument was set to 10^6 due to the extremely small order of magnitude of each iteration's perturbation (between 10^-8 and 10^-5)
        perturbation, iters, fool_label = deepfool(fold_model_rnn, example_input, eta=1e6)

        # save the ||r(x)|| / ||x|| value into the list defined in the beginning of the "fold" for loop
        robustness_values_rnn_fold.append(example_robustness(example_input, perturbation))

    # to save the results of each fold
    save_pkl(robustness_values_rnn_fold, f"robustness/robustness_values_rnn_fold{f}.pkl")

# check the results of each fold's model
for f in range(1, 10+1):
    robustness_values_rnn_fold = load_pkl(f"robustness/robustness_values_rnn_fold{f}.pkl")
    mean_robustness_rnn, std_robustness_rnn = model_robustness(robustness_values_rnn_fold)
    print(f"Fold {f} - The RNN model has a robustness of {mean_robustness_rnn: .7f} +/- {std_robustness_rnn: .7f}.")



## References

- Moosavi-Dezfooli, S.-M., Fawzi, A. and Frossard, P. (2016). DeepFool: a simple and accurate method to fool deep neural networks. [online] Available at: https://openaccess.thecvf.com/content_cvpr_2016/papers/Moosavi-Dezfooli_DeepFool_A_Simple_CVPR_2016_paper.pdf.

- TensorFlow. (2023). Introduction to gradients and automatic differentiation | TensorFlow Core. [online] Available at: https://www.tensorflow.org/guide/autodiff.