# Adversarial Robustness Audit

This notebook runs the adversarial robustness evaluation using the `audit_tool/robustness.py` module. We load the data and trained models, configure the attack parameters, run the attacks, visualize the results, identify the features most frequently perturbed, and generate a report of the findings.


In [None]:
# Credit Risk Audit Tool - Adversarial Robustness Phase
# Dataset: Prosper Loan Data

import sys
import os
# Add the project root to the Python path so we can import audit_tool
sys.path.append(os.path.abspath('..'))

import pandas as pd
import numpy as np
import joblib

# Import our robustness functions
from audit_tool import robustness as rb  # module lives at audit_tool/robustness.py

## 1. Load Data and Models

We first load our pre-processed test set and the three serialized models (Logistic Regression, Random Forest, XGBoost) that we will audit.


# Load the processed test set once; keep feature names for vulnerability analysis
X_test_df = pd.read_csv('../data/processed/prosperloan/X_test.csv')
feature_names = X_test_df.columns.tolist()
X_test = X_test_df.values
y_test = pd.read_csv('../data/processed/prosperloan/y_test.csv')['target'].values

# Load trained models
models = {
    "logreg": joblib.load('../models/prosperloan/logistic_regression.pkl'),
    "rf":     joblib.load('../models/prosperloan/random_forest.pkl'),
    "xgb":    joblib.load('../models/prosperloan/xgboost.pkl')
}

print("Loaded test data and models:")
print(" - X_test shape:", X_test.shape)
print(" - y_test shape:", y_test.shape)
print(" - Models:", list(models.keys()))


## 2. Configure Attacks and Epsilon Sweep

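Define which attack methods to run and the list of perturbation budgets ε to sweep for the white-box attacks (FGSM and PGD). HopSkipJump and Boundary are decision-based black-box attacks.  
We also set the directory where outputs are saved.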


In [None]:
# Supported attacks in robustness.py
attacks  = ["fgsm", "pgd", "hopskipjump", "boundary"]

# Epsilon values for FGSM and PGD
epsilons = [0.01, 0.05, 0.1, 0.2]

# Output directory for CSVs, plots, and report
output_dir = "../outputs/reports/robustness"
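To make the role of ε concrete, here is a minimal sketch of a one-step FGSM attack on a binary logistic regression model, written in plain NumPy. It assumes ε is an L∞ bound on the perturbation, which is the usual convention for FGSM and PGD; `robustness.py` may implement the attack differently. The helper `fgsm_logreg` and the toy weights and inputs are hypothetical and only serve the illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_logreg(X, y, w, b, eps):
    """One-step FGSM against a binary logistic regression model (illustrative).

    For cross-entropy loss, the gradient with respect to the input is
    (p - y) * w, so each row is shifted by eps * sign((p - y) * w).
    """
    p = sigmoid(X @ w + b)                 # predicted P(y = 1)
    grad = (p - y)[:, None] * w[None, :]   # dLoss/dX, shape (n_samples, n_features)
    return X + eps * np.sign(grad)

# Toy model and inputs (hypothetical values, not the Prosper data)
w = np.array([2.0, -1.0])
b = 0.0
X = np.array([[0.1, 0.0], [-0.1, 0.05]])
y = np.array([1.0, 0.0])

for eps in [0.01, 0.05, 0.1, 0.2]:
    X_adv = fgsm_logreg(X, y, w, b, eps)
    preds = (sigmoid(X_adv @ w + b) >= 0.5).astype(int)
    linf = np.abs(X_adv - X).max()         # largest per-feature change
    print(f"eps={eps}: predictions={preds}, max |delta|={linf:.2f}")
```

In this toy setup both predictions stay correct at the two smaller budgets and flip at the two larger ones, which is the kind of transition the ε sweep is meant to expose.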

## 3. Run Robustness Evaluation

Generate adversarial examples, compute success rates and average distortions,  
and collect the raw adversarial data for feature-vulnerability analysis.


In [None]:
df_results, adv_examples = rb.run_robustness(
    models=models,
    X=X_test,
    y=y_test,
    attacks=attacks,
    epsilons=epsilons,
    output_dir=output_dir
)

# Display the first rows of the results DataFrame
df_results.head()
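For reference, the two summary metrics can be sketched as follows. This is one common convention, not necessarily the one `rb.run_robustness` uses: only samples the model classified correctly before the attack are eligible, an attack succeeds when the adversarial prediction no longer matches the true label, and distortion is the mean L2 norm of the perturbation over successful examples. The function name `attack_metrics` and the toy arrays are hypothetical.

```python
import numpy as np

def attack_metrics(y_true, y_pred_clean, y_pred_adv, X, X_adv):
    """Return (success_rate, mean L2 distortion) under the convention above."""
    eligible = y_pred_clean == y_true            # correctly classified before the attack
    success = eligible & (y_pred_adv != y_true)  # misclassified after the attack
    n_eligible = int(eligible.sum())
    success_rate = success.sum() / n_eligible if n_eligible else float("nan")
    if success.any():
        deltas = X_adv[success] - X[success]
        avg_l2 = float(np.linalg.norm(deltas, axis=1).mean())
    else:
        avg_l2 = float("nan")
    return float(success_rate), avg_l2

# Toy example: 4 samples, 2 features (hypothetical values)
y_true       = np.array([0, 1, 1, 0])
y_pred_clean = np.array([0, 1, 0, 0])   # sample 2 is already wrong, so it is not eligible
y_pred_adv   = np.array([1, 1, 1, 0])   # only sample 0 is flipped away from its label
X     = np.zeros((4, 2))
X_adv = X.copy()
X_adv[0] = [3.0, 4.0]                   # perturbation with L2 norm 5

rate, dist = attack_metrics(y_true, y_pred_clean, y_pred_adv, X, X_adv)
print(f"success rate = {rate:.3f}, mean L2 distortion = {dist:.1f}")
```

Excluding samples that were already misclassified keeps the success rate from being inflated by errors the model makes without any attack.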


## 4. Plot Robustness Curves

Visualize how attack success rate changes with ε for each model and attack.  
The plots will also be saved into `outputs/reports/robustness/`.


In [None]:
rb.plot_robustness(df_results, output_dir=output_dir)

## 5. Analyze Feature Vulnerabilities

Identify the top 5 input features most frequently modified in successful adversarial examples  
for each (model, attack, ε) scenario.


In [None]:
vuln = rb.analyze_feature_vulnerability(
    adv_examples=adv_examples,
    feature_names=feature_names,
    top_k=5
)

# Display the top-5 features for one example scenario
first_key = next(iter(vuln))
print(f"Top 5 features for {first_key}:")
for feat, count in vuln[first_key]:
    print(f" - {feat}: {count} modifications")
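The counting behind such a ranking can be sketched in a few lines of NumPy. This is an assumption about the approach, not a copy of `rb.analyze_feature_vulnerability`: a feature counts as modified when its absolute change exceeds a small tolerance, and counts are taken over successful adversarial examples only. The helper `top_modified_features`, the tolerance, and the feature names are hypothetical.

```python
import numpy as np

def top_modified_features(X, X_adv, success_mask, feature_names, top_k=5, tol=1e-6):
    """Rank features by how often they change in successful adversarial examples."""
    changed = np.abs(X_adv - X) > tol              # boolean, (n_samples, n_features)
    counts = changed[success_mask].sum(axis=0)     # modifications per feature
    order = np.argsort(-counts, kind="stable")[:top_k]
    # Return (feature, count) pairs, skipping features that were never modified
    return [(feature_names[i], int(counts[i])) for i in order if counts[i] > 0]

# Toy example with hypothetical feature names
names = ["income", "debt_ratio", "term"]
X = np.zeros((3, 3))
X_adv = np.array([
    [0.1, 0.0, 0.0],
    [0.1, 0.2, 0.0],
    [0.5, 0.5, 0.5],   # unsuccessful attack, ignored below
])
success_mask = np.array([True, True, False])

ranking = top_modified_features(X, X_adv, success_mask, names, top_k=2)
for feat, count in ranking:
    print(f" - {feat}: {count} modifications")
```

A tolerance is used instead of an exact inequality so that floating-point noise introduced by the attack pipeline is not counted as a modification.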


## 6. Generate Markdown Report

Produce a comprehensive Markdown report (`robustness_report.md`)  
that includes the summary table, vulnerability rankings, and recommendations.


In [None]:
report_path = os.path.join(output_dir, "robustness_report.md")
rb.generate_markdown_report(
    df=df_results,
    vuln=vuln,
    output_path=report_path
)
print(f"Report written to: {report_path}")
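As an illustration of the summary-table part of such a report, the sketch below renders a list of result rows as a Markdown table using only the standard library. It does not reproduce `rb.generate_markdown_report`; the helper `to_markdown_table`, the column names, and the row values are hypothetical placeholders, not results from this audit.

```python
def to_markdown_table(rows, columns):
    """Render a list of dicts as a Markdown table with the given column order."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(str(row[c]) for c in columns) + " |" for row in rows]
    return "\n".join([header, divider, *body])

# Placeholder rows (hypothetical column names and values)
columns = ["model", "attack", "epsilon", "success_rate"]
rows = [
    {"model": "logreg", "attack": "fgsm", "epsilon": 0.1, "success_rate": 0.5},
    {"model": "rf", "attack": "boundary", "epsilon": "n/a", "success_rate": 0.25},
]

table = to_markdown_table(rows, columns)
print(table)
```

If `df_results` is a pandas DataFrame, `DataFrame.to_markdown()` is an alternative, though it requires the optional `tabulate` dependency.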
