# Day 36: Membership Inference Attack

In this lab, we will simulate a **Membership Inference Attack (MIA)**.
We will show that a model that overfits its training data is vulnerable: an attacker can infer whether a specific sample was in the training set simply by checking how confident the model is on that sample.

In [None]:
import sys
import os
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Add root directory to sys.path
sys.path.append(os.path.abspath('../../'))

from src.privacy.membership_inference import MIAttacker

## 1. Train Target Model

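We train a Random Forest classifier. To make it vulnerable, we intentionally let it overfit by leaving `max_depth` unset, so each tree can keep splitting until it has essentially memorized the training samples.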

In [None]:
# Generate data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, random_state=42)

# Split into Train (Members) and Test (Non-Members)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

# Train model (Overfitting)
model = RandomForestClassifier(n_estimators=100, max_depth=None, random_state=42)
model.fit(X_train, y_train)

# A large gap between train and test accuracy is a sign of overfitting
print(f"Train Accuracy: {model.score(X_train, y_train):.4f} (members)")
print(f"Test Accuracy: {model.score(X_test, y_test):.4f} (non-members)")

## 2. Launch Attack

The attacker guesses: "If confidence > 0.9, it's a Member."
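The rule above can be sketched in a few lines. The implementation of `MIAttacker.attack_threshold_based` is not shown in this notebook, so the function below is a hypothetical stand-in, not the library's code. It assumes "confidence" means the probability assigned to the sample's true class, and it takes a probability matrix (such as the output of `predict_proba`) rather than a model.

```python
import numpy as np

def threshold_attack(probs, labels, threshold=0.9):
    """Hypothetical stand-in for a threshold-based membership guess.

    probs: (n_samples, n_classes) array of predicted class probabilities.
    labels: (n_samples,) integer true labels, used as column indices.
    Returns a boolean array where True means "guessed Member".
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Assumption: confidence is the probability of the true class
    true_class_conf = probs[np.arange(len(labels)), labels]
    return true_class_conf > threshold

# Toy probabilities, made up for illustration (not model output)
toy_probs = np.array([[0.97, 0.03], [0.40, 0.60], [0.05, 0.95], [0.91, 0.09]])
toy_labels = np.array([0, 1, 1, 1])
toy_guesses = threshold_attack(toy_probs, toy_labels, threshold=0.9)
print(toy_guesses)
```

Note that the last toy sample is predicted with high confidence but for the wrong class, so under this assumption it is not flagged as a Member.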

In [None]:
attacker = MIAttacker()
threshold = 0.9

# Attack predictions
member_preds = attacker.attack_threshold_based(model, X_train, y_train, threshold)
non_member_preds = attacker.attack_threshold_based(model, X_test, y_test, threshold)

# Evaluate
metrics = attacker.evaluate_attack(member_preds, non_member_preds)
for k, v in metrics.items():
    print(f"{k}: {v:.4f}")
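The metrics returned by `evaluate_attack` depend on the `MIAttacker` implementation, which is not shown here. As a rough sketch of how such an evaluation could work, the hypothetical helper below treats every "Member" guess on training data as a true positive and every "Member" guess on test data as a false positive. The function name and the choice of metrics are assumptions made for illustration.

```python
import numpy as np

def evaluate_membership_guesses(member_preds, non_member_preds):
    """Hypothetical evaluation of membership guesses.

    member_preds: guesses for samples that really were in the training set.
    non_member_preds: guesses for samples that were not.
    Both are boolean sequences where True means "guessed Member".
    """
    member_preds = np.asarray(member_preds, dtype=bool)
    non_member_preds = np.asarray(non_member_preds, dtype=bool)

    tp = int(member_preds.sum())          # members correctly flagged
    fn = int((~member_preds).sum())       # members missed
    fp = int(non_member_preds.sum())      # non-members wrongly flagged
    tn = int((~non_member_preds).sum())   # non-members correctly rejected

    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Toy guesses, made up for illustration
toy_metrics = evaluate_membership_guesses(
    [True, True, True, False],    # 3 of 4 members flagged
    [True, False, False, False],  # 1 of 4 non-members flagged
)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.4f}")
```

With equally sized member and non-member sets, as in this lab's 50/50 split, an attack accuracy of 0.5 corresponds to random guessing.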

## 3. Visualize Vulnerability

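We plot the confidence distribution for Members vs Non-Members. The less the two histograms overlap, the easier it is for a confidence threshold to tell them apart.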

In [None]:
# Confidence the model assigns to each sample's true class
prob_member = model.predict_proba(X_train)
conf_member = prob_member[np.arange(len(y_train)), y_train]

prob_non_member = model.predict_proba(X_test)
conf_non_member = prob_non_member[np.arange(len(y_test)), y_test]

plt.figure(figsize=(10, 6))
plt.hist(conf_member, bins=20, alpha=0.5, label='Members (Train)', color='blue')
plt.hist(conf_non_member, bins=20, alpha=0.5, label='Non-Members (Test)', color='red')
plt.title("Confidence Distribution (Vulnerability to MIA)")
plt.xlabel("Model Confidence on True Class")
plt.ylabel("Count")
plt.legend()
plt.show()
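The histogram gives a visual impression of the separation. One way to put a number on it is to sweep the threshold and, for each value, compare the fraction of members flagged (true positive rate) with the fraction of non-members flagged (false positive rate). The sketch below is an illustrative addition, not part of `MIAttacker`; the helper name `sweep_thresholds` and the toy confidence values are assumptions.

```python
import numpy as np

def sweep_thresholds(conf_member, conf_non_member, thresholds):
    """Return (threshold, TPR, FPR, TPR - FPR) for each candidate threshold."""
    conf_member = np.asarray(conf_member, dtype=float)
    conf_non_member = np.asarray(conf_non_member, dtype=float)
    rows = []
    for t in thresholds:
        tpr = float(np.mean(conf_member > t))      # members flagged as Member
        fpr = float(np.mean(conf_non_member > t))  # non-members flagged as Member
        rows.append((float(t), tpr, fpr, tpr - fpr))
    return rows

# Toy confidences, made up for illustration (not model output)
toy_member = [0.99, 0.95, 0.92, 0.80]
toy_non_member = [0.93, 0.70, 0.55, 0.40]

toy_rows = sweep_thresholds(toy_member, toy_non_member, thresholds=[0.5, 0.9])
best = max(toy_rows, key=lambda row: row[3])
print(f"Best threshold: {best[0]}, TPR - FPR: {best[3]:.2f}")
```

In the notebook, the same function could be called with `conf_member`, `conf_non_member`, and a grid such as `np.linspace(0.5, 0.99, 50)`. A gap between TPR and FPR close to zero at every threshold would suggest that confidence alone leaks little membership information.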