# Deploying Multiple MIA Attacks in Synth-MIA

This notebook demonstrates how to deploy multiple Membership Inference Attack (MIA) methods and compare their performance across different metrics.

## Overview

When auditing the privacy of synthetic data, it's often valuable to test multiple attack strategies to get a comprehensive view of potential vulnerabilities. Different attacks may perform better under different conditions or reveal different types of privacy leakage.

This notebook shows how to:
- Load and prepare data for multiple attacks
- Initialize various MIA attackers with different strategies
- Run all attacks systematically
- Compare and analyze results across different methods

## Import Libraries and Load Data

First, let's import the necessary libraries and load our example dataset.

In [1]:
import pandas as pd
from synth_mia.attackers import *

# Load the datasets from the housing example
mem = pd.read_csv('../example_data/housing/mem.csv').values
non_mem = pd.read_csv('../example_data/housing/non_mem.csv').values
synth = pd.read_csv('../example_data/housing/synth.csv').values
ref = pd.read_csv('../example_data/housing/ref.csv').values

print("Data loaded successfully:")
print(f"  Members: {mem.shape}")
print(f"  Non-members: {non_mem.shape}")
print(f"  Synthetic: {synth.shape}")
print(f"  Reference: {ref.shape}")

Data loaded successfully:
  Members: (200, 9)
  Non-members: (200, 9)
  Synthetic: (200, 9)
  Reference: (200, 9)
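Because `.values` returns whatever dtype pandas infers, it can help to confirm that the four arrays are numeric and share the same number of columns before passing them to an attacker. The helper below is a minimal sketch, not part of Synth-MIA: `check_compatible` is a hypothetical name, and the zero-filled arrays stand in for the housing data.

```python
import numpy as np

def check_compatible(**arrays):
    """Return the shared column count, or raise if the arrays disagree."""
    n_cols = None
    for name, arr in arrays.items():
        if arr.ndim != 2:
            raise ValueError(f"{name} must be 2-D, got shape {arr.shape}")
        if not np.issubdtype(arr.dtype, np.number):
            raise ValueError(f"{name} has non-numeric dtype {arr.dtype}")
        if n_cols is None:
            n_cols = arr.shape[1]
        elif arr.shape[1] != n_cols:
            raise ValueError(f"{name} has {arr.shape[1]} columns, expected {n_cols}")
    return n_cols

# Stand-in arrays with the same shapes as the housing example
toy_mem = np.zeros((200, 9))
toy_non_mem = np.zeros((200, 9))
toy_synth = np.zeros((200, 9))
toy_ref = np.zeros((200, 9))

n_cols = check_compatible(mem=toy_mem, non_mem=toy_non_mem, synth=toy_synth, ref=toy_ref)
print(f"All arrays share {n_cols} columns")
```

With the real data, the same call would take `mem`, `non_mem`, `synth`, and `ref` as loaded above.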


## Initialize Multiple Attackers

Next, we create instances of the attack methods available in Synth-MIA. Each attacker implements a different strategy for inferring membership, and each accepts its own hyperparameters (for example, `k_nearest` for `GenLRA` or `method` for `DensityEstimate`), which you can adjust as needed.

In [None]:
# Initialize instances of various attackers with different strategies
att1 = GenLRA(k_nearest=5)
att2 = DCR()
att3 = DPI()
att4 = LOGAN()
att5 = DCRDiff()
att6 = DOMIAS()
att7 = MC()
att8 = DensityEstimate(method="kde")
att9 = LocalNeighborhood()
att10 = Classifier()

# Create a list of all attacker instances for easy iteration
attackers = [att1, att2, att3, att4, att5, att6, att7, att8, att9, att10]

print(f"Initialized {len(attackers)} different attack methods:")
for i, attacker in enumerate(attackers, 1):
    print(f"  {i}. {attacker.name}")

Initialized 10 different attack methods:
  1. Gen-LRA
  2. DCR
  3. DPI
  4. LOGAN
  5. DCR-Diff
  6. DOMIAS
  7. MC
  8. Density Estimator
  9. Local Neighborhood
  10. Classifier
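To give a sense of what one such strategy looks like, here is a minimal sketch of a distance-to-closest-record score, the general idea behind DCR-style attacks: a test record that lies unusually close to some synthetic record is scored as more likely to be a member. This is an illustration written with plain NumPy under that assumption, not Synth-MIA's implementation; `dcr_scores` and the toy arrays are hypothetical.

```python
import numpy as np

def dcr_scores(test, synth):
    """Score each test record by the negated distance to its closest synthetic record."""
    # Pairwise Euclidean distances, shape (n_test, n_synth)
    diffs = test[:, None, :] - synth[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Negate so that a higher score means "more likely a member"
    return -dists.min(axis=1)

rng = np.random.default_rng(0)
toy_synth = rng.normal(size=(50, 3))
# Toy "members" sit almost on top of synthetic records; toy "non-members" are far away
toy_members = toy_synth[:5] + 0.01 * rng.normal(size=(5, 3))
toy_non_members = rng.normal(size=(5, 3)) + 10.0

toy_scores = dcr_scores(np.vstack([toy_members, toy_non_members]), toy_synth)
print(toy_scores.round(3))
```

In this exaggerated toy setup the first five scores are clearly higher than the last five. On real data the separation is usually much weaker, which is why comparing several attacks is informative.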


## Execute All Attacks

Now we'll systematically run each attack method and collect the results. Each attacker will:
1. Compute attack scores for the test data
2. Evaluate the attack using ROC metrics
3. Store results for comparison

In [9]:
# Dictionary to store evaluation results for each attacker
results = {}

for attacker in attackers:
    # Execute the attack
    true_labels, scores = attacker.attack(mem, non_mem, synth, ref)
    
    # Evaluate the attack using ROC metrics
    eval_results = attacker.eval(true_labels, scores, metrics=['roc'])
    
    # Store the evaluation results
    results[attacker.name] = eval_results
    
results_df = pd.DataFrame(results).T


Processing Test dataset: 100%|██████████| 400/400 [00:00<00:00, 1559.89it/s]
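When running many attacks in one loop, a single failing attacker would otherwise abort the whole run. The sketch below shows one way to guard against that by collecting failures separately. `ToyAttacker` is a hypothetical stand-in that only mimics the `attack`/`eval` call pattern used above, and `run_all` is an illustrative helper, not a Synth-MIA function.

```python
import numpy as np

class ToyAttacker:
    """Hypothetical stand-in mimicking the attack/eval interface used above."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    def attack(self, mem, non_mem, synth, ref):
        if self.fail:
            raise RuntimeError("simulated failure")
        # Assumption: members are labelled 1 and non-members 0
        test = np.vstack([mem, non_mem])
        true_labels = np.concatenate([np.ones(len(mem)), np.zeros(len(non_mem))])
        # Toy score: negated distance to the synthetic mean
        scores = -np.linalg.norm(test - synth.mean(axis=0), axis=1)
        return true_labels, scores

    def eval(self, true_labels, scores, metrics=None):
        # Toy metric only: mean score gap between members and non-members
        gap = scores[true_labels == 1].mean() - scores[true_labels == 0].mean()
        return {"mean_score_gap": float(gap)}

def run_all(attackers, mem, non_mem, synth, ref):
    """Run every attacker, keeping results and failures in separate dicts."""
    results, failures = {}, {}
    for attacker in attackers:
        try:
            true_labels, scores = attacker.attack(mem, non_mem, synth, ref)
            results[attacker.name] = attacker.eval(true_labels, scores, metrics=["roc"])
        except Exception as exc:
            failures[attacker.name] = repr(exc)
    return results, failures

rng = np.random.default_rng(0)
toy_data = [rng.normal(size=(20, 3)) for _ in range(4)]
toy_results, toy_failures = run_all(
    [ToyAttacker("ok"), ToyAttacker("broken", fail=True)], *toy_data
)
print(toy_results)
print(toy_failures)
```

The same `run_all` pattern should work with the real `attackers` list, provided each attacker exposes `name`, `attack`, and `eval` as in the cell above.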


## Compare Attack Results

Let's display the results in a single table for easy comparison. The metrics are:

- **AUC-ROC**: Area under the ROC curve. A value of 0.5 corresponds to random guessing; higher is better for the attacker.
- **TPR at FPR=0**: True positive rate when the false positive rate is 0, i.e. the fraction of members identified without a single false alarm
- **TPR at FPR=0.001**: TPR at a 0.1% false positive rate
- **TPR at FPR=0.01**: TPR at a 1% false positive rate
- **TPR at FPR=0.1**: TPR at a 10% false positive rate

Higher values indicate more successful attacks and therefore greater privacy leakage.
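To make these metrics concrete, the following sketch computes AUC-ROC and TPR at a fixed FPR from a label vector and a score vector using only NumPy. It assumes that labels are 1 for members and 0 for non-members and that higher scores mean "more likely a member". The function names are hypothetical, and Synth-MIA's `eval` may compute these values differently (for example with interpolation along the ROC curve).

```python
import numpy as np

def auc_roc(true_labels, scores):
    """AUC as the probability that a random member outscores a random non-member."""
    labels = np.asarray(true_labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    # Ties count as half a win
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def tpr_at_fpr(true_labels, scores, max_fpr):
    """Highest TPR reachable by any score threshold whose FPR does not exceed max_fpr."""
    labels = np.asarray(true_labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    best = 0.0
    for threshold in np.unique(scores):
        predicted = scores >= threshold
        fpr = np.sum(predicted & ~labels) / np.sum(~labels)
        tpr = np.sum(predicted & labels) / np.sum(labels)
        if fpr <= max_fpr:
            best = max(best, tpr)
    return best

toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]

toy_auc = auc_roc(toy_labels, toy_scores)
toy_tpr0 = tpr_at_fpr(toy_labels, toy_scores, max_fpr=0.0)
print(f"AUC-ROC: {toy_auc:.3f}")
print(f"TPR at FPR=0: {toy_tpr0:.3f}")
```

Note that with 200 non-members, as in the housing example, the smallest nonzero FPR a threshold can produce is 1/200 = 0.005, so the columns for FPR=0 and FPR=0.001 are expected to be close or identical.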

In [11]:
results_df.round(3)

| Attack | auc_roc | tpr_at_fpr_0 | tpr_at_fpr_0.001 | tpr_at_fpr_0.01 | tpr_at_fpr_0.1 |
|---|---|---|---|---|---|
| Gen-LRA | 0.648 | 0.22 | 0.22 | 0.22 | 0.31 |
| DCR | 0.527 | 0.025 | 0.025 | 0.035 | 0.145 |
| DPI | 0.531 | 0.01 | 0.017 | 0.05 | 0.124 |
| LOGAN | 0.536 | 0.005 | 0.005 | 0.015 | 0.115 |
| DCR-Diff | 0.562 | 0.0 | 0.0 | 0.015 | 0.215 |
| DOMIAS | 0.684 | 0.02 | 0.02 | 0.055 | 0.27 |
| MC | 0.511 | 0.0 | 0.006 | 0.05 | 0.117 |
| Density Estimator | 0.643 | 0.02 | 0.02 | 0.035 | 0.165 |
| Local Neighborhood | 0.5 | 0.0 | 0.001 | 0.01 | 0.1 |
| Classifier | 0.646 | 0.03 | 0.03 | 0.1 | 0.36 |
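A quick way to read a table like this is to sort by one metric and to check which attack leads on each metric. The sketch below rebuilds a small subset of the table above by hand (four rows, two columns) so that it runs on its own; `subset` is a hypothetical name, and in the notebook the same calls could be applied to `results_df` directly.

```python
import pandas as pd

# Subset of the results table above, copied by hand for illustration
subset = pd.DataFrame(
    {
        "auc_roc": [0.648, 0.527, 0.684, 0.646],
        "tpr_at_fpr_0.1": [0.31, 0.145, 0.27, 0.36],
    },
    index=["Gen-LRA", "DCR", "DOMIAS", "Classifier"],
)

# Rank attacks by AUC-ROC, strongest first
ranked = subset.sort_values("auc_roc", ascending=False)
print(ranked)

# Name of the best-performing attack for each metric
best_per_metric = subset.idxmax()
print(best_per_metric)
```

The ranking depends on the metric: in the table above DOMIAS has the highest AUC-ROC, Classifier has the highest TPR at FPR=0.1, and Gen-LRA has the highest TPR at the three lowest FPR levels.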
