<a href="https://colab.research.google.com/github/appliedcode/mthree-c422/blob/mthree-c422-dipti/Exercises/day-14/Threat_and_Security/Threat_and_Security_Practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


## **Problem Statement** – AI Security, Threat Simulation, and Privacy-Preserving AI in Credit Card Fraud Detection

You are working in the **security analytics division** of a global financial services company.
Your team’s goal is to build, evaluate, and audit a **credit card fraud detection model** that not only detects fraudulent transactions but also complies with **AI security and privacy governance best practices**.

Your responsibilities include:

1. **Dataset Handling & Preprocessing**
    - Load and analyze the **Credit Card Fraud dataset** containing anonymized transaction features.
    - Prepare the data for binary classification (`fraud` vs `non-fraud`).
2. **Threat Simulation**
    - Simulate a **data poisoning attack** where a small percentage of fraud labels are intentionally flipped, testing the system’s resilience to adversarial manipulation.
3. **Baseline Model Training**
    - Train a machine learning classifier to detect fraud and evaluate its performance on the poisoned dataset.
4. **Privacy Preservation**
    - Apply a **privacy-preserving transformation** (e.g., partial feature masking or removal of certain sensitive transaction indicators) and retrain the model.
    - Compare performance trade-offs between the original and privacy-preserved versions.
5. **Auditing & Transparency**
    - Use **SHAP** to explain the model’s decisions and uncover key features influencing fraud classification.
    - Create an **audit log** recording performance metrics, poisoning parameters, and privacy measures taken.
6. **Governance Report**
    - Generate a report summarizing the threat simulation, model results, privacy steps, and interpretability findings, suitable for an internal security and compliance review.
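
The label-flipping attack in step 2 can be sketched without any ML dependencies. The snippet below is a minimal illustration, not the notebook's implementation: `flip_fraud_labels` and the toy label list are hypothetical, and it assumes the attacker relabels a fixed fraction of the fraud cases (label 1) as non-fraud (label 0).

```python
import random


def flip_fraud_labels(labels, fraction=0.02, seed=0):
    """Return (poisoned_labels, flipped_positions).

    `fraction` is relative to the number of fraud cases (label 1),
    not to the size of the whole dataset.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    fraud_positions = [i for i, label in enumerate(labels) if label == 1]
    n_flip = int(fraction * len(fraud_positions))
    flipped = set(rng.sample(fraud_positions, n_flip))
    poisoned = [0 if i in flipped else label for i, label in enumerate(labels)]
    return poisoned, sorted(flipped)


# Toy data (hypothetical mix): 480 fraud cases among 10,000 transactions
labels = [1] * 480 + [0] * 9520
poisoned, flipped = flip_fraud_labels(labels, fraction=0.02)
print(len(flipped), sum(labels), sum(poisoned))  # 9 480 471
```

Returning the flipped positions alongside the poisoned labels keeps a record of exactly which rows were tampered with, which is useful when the attack parameters are written to the audit log.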

**Business Context:**
Financial fraud models are prime targets for **adversarial attacks** and can inadvertently leak sensitive transaction patterns. Frameworks such as **GDPR**, **PCI DSS**, and the emerging EU **AI Act** call for privacy safeguards, bias controls, and detailed audit trails for AI systems used in financial services.

***

### **Dataset Collection Code**

```python
import pandas as pd

# Load the Credit Card Fraud Detection dataset (public Kaggle dataset)
# If running in Colab, upload 'creditcard.csv' first or fetch it with the Kaggle API
# For example:
# from google.colab import files
# uploaded = files.upload()

df = pd.read_csv('creditcard.csv')

# Features (X) and Target (y)
X = df.drop(columns=['Class'])
y = df['Class']  # 1 = Fraud, 0 = Non-Fraud

print("Dataset shape:", X.shape)
print("Fraud distribution:\n", y.value_counts())

# Preview the first few rows
df.head()
```
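
Because the rest of the exercise audits what the model was trained on, it can help to record which file was actually loaded. The sketch below is an optional addition, not part of the original notebook: `fingerprint_file` is a hypothetical helper that hashes a CSV with SHA-256 and counts its data rows. It assumes one record per line (no newlines embedded in quoted fields), which holds for a purely numeric file like this one.

```python
import hashlib
import os
import tempfile


def fingerprint_file(path, chunk_size=1 << 20):
    """Return (sha256_hex, data_row_count) for a CSV file with a header row."""
    digest = hashlib.sha256()
    newline_count = 0
    last_byte = b""
    with open(path, "rb") as handle:
        # Read in chunks so large files are never held in memory at once
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            newline_count += chunk.count(b"\n")
            last_byte = chunk[-1:]
    # Count a final line that lacks a trailing newline, then drop the header
    line_count = newline_count + (1 if last_byte not in (b"", b"\n") else 0)
    return digest.hexdigest(), max(line_count - 1, 0)


# Demo on a throwaway file; for the exercise, pass 'creditcard.csv' instead
demo_bytes = b"Time,Amount,Class\n0,149.62,0\n1,2.69,0\n"
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(demo_bytes)
demo_hash, demo_rows = fingerprint_file(tmp.name)
os.remove(tmp.name)
print(demo_rows, demo_hash[:12])
```

Storing the hash and row count next to the model metrics makes it possible to tell later whether two runs used the same file, for example when an upload was cut short.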


***


In [None]:
# -*- coding: utf-8 -*-
"""Credit Card Fraud Detection - AI Security, Privacy, and Auditing"""

# Install necessary packages
!pip install shap scikit-learn pandas matplotlib -q

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
import shap
import datetime

# -------------------------
# 1. LOAD DATASET
# -------------------------
# NOTE: Ensure 'creditcard.csv' is uploaded to Colab or available in working directory
# Dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud

df = pd.read_csv('creditcard.csv')

# Drop rows with missing values in the 'Class' column
df.dropna(subset=['Class'], inplace=True)


X = df.drop(columns=['Class'])
y = df['Class']  # Target: 1=Fraud, 0=Non-Fraud


print("Dataset shape:", X.shape)
print("Fraud class distribution:\n", y.value_counts())

# -------------------------
# 2. SIMULATE DATA POISONING ATTACK
# -------------------------
def poison_labels(y, fraction=0.02, target_label=0, seed=42):
    """
    Flip a fraction of the labels in the opposite class to `target_label`.

    With target_label=0, `fraction` of the fraud cases (label 1) is relabelled
    as non-fraud, simulating an attacker who hides fraud from the model.
    The fraction is relative to the source class, not to the whole dataset.
    """
    rng = np.random.default_rng(seed)  # seeded so the attack is reproducible
    y_poisoned = y.copy()
    source_indices = y.index[y == 1 - target_label].to_numpy()
    n_poison = int(fraction * len(source_indices))
    indices_to_poison = rng.choice(source_indices, size=n_poison, replace=False)
    y_poisoned.loc[indices_to_poison] = target_label
    return y_poisoned


print("\nSimulating label poisoning (2% of fraud labels flipped to class 0)...")
y_poisoned = poison_labels(y, fraction=0.02, target_label=0)
print("Number of labels flipped:", int((y != y_poisoned).sum()))

# -------------------------
# 3. TRAIN-TEST SPLIT
# -------------------------
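# NOTE: the split uses the poisoned labels for both train and test, so the
# metrics below are measured against labels that include the flipped cases.
X_train, X_test, y_train, y_test = train_test_split(
    X, y_poisoned, test_size=0.3, random_state=42, stratify=y_poisoned
)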

# -------------------------
# 4. BASELINE MODEL TRAINING
# -------------------------
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

print("\nBaseline Model - Classification Report (With Poisoned Data):")
print(classification_report(y_test, y_pred, digits=4))

# -------------------------
# 5. PRIVACY PRESERVATION - FEATURE MASKING
# -------------------------
# Example: drop the "Amount" and "Time" columns as a simple data-minimization step.
# This removes two directly interpretable fields; it is not full anonymization.
X_privacy = X.drop(columns=['Amount', 'Time'])
X_train_priv, X_test_priv, y_train_priv, y_test_priv = train_test_split(
    X_privacy, y_poisoned, test_size=0.3, random_state=42, stratify=y_poisoned
)

clf_priv = RandomForestClassifier(n_estimators=100, random_state=42)
clf_priv.fit(X_train_priv, y_train_priv)
y_pred_priv = clf_priv.predict(X_test_priv)

print("\nPrivacy-Preserved Model - Classification Report:")
print(classification_report(y_test_priv, y_pred_priv, digits=4))

# -------------------------
# 6. AUDIT LOGGING
# -------------------------
audit_log = {
    # Timezone-aware UTC timestamp so log entries are unambiguous across regions
    "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    "poisoning_fraction": 0.02,
    "baseline_metrics": classification_report(y_test, y_pred, output_dict=True),
    "privacy_preserved_metrics": classification_report(y_test_priv, y_pred_priv, output_dict=True),
    "confusion_matrix_baseline": confusion_matrix(y_test, y_pred).tolist(),
    "confusion_matrix_privacy": confusion_matrix(y_test_priv, y_pred_priv).tolist()
}

print("\n--- Security Audit Log ---")
print(audit_log)

# -------------------------
# 7. MODEL EXPLAINABILITY (SHAP)
# -------------------------
print("\nGenerating SHAP explanation for baseline model...")
explainer = shap.TreeExplainer(clf)
# This can be slow on the full test set; explaining a sample is a common shortcut.
shap_values = explainer.shap_values(X_test)

# Depending on the SHAP version, a binary classifier yields either a list of
# per-class arrays or a single array of shape (samples, features, classes).
if isinstance(shap_values, list):
    shap_fraud = shap_values[1]
elif isinstance(shap_values, np.ndarray) and shap_values.ndim == 3:
    shap_fraud = shap_values[:, :, 1]
else:
    shap_fraud = shap_values
print(f"Shape of SHAP values for class 1 (fraud): {np.shape(shap_fraud)}")

# Summary plot for class 1 (fraud); requires shape (samples, features)
if np.shape(shap_fraud) == X_test.shape:
    shap.summary_plot(shap_fraud, X_test, plot_type="bar")
else:
    print("\nCould not generate SHAP summary plot due to unexpected shap_values structure or shape.")
    print(f"Expected shape for plotting: {X_test.shape}")


# -------------------------
# 8. COMPLIANCE / GOVERNANCE REPORT
# -------------------------
compliance_report = f"""
CREDIT CARD FRAUD DETECTION - AI SECURITY & PRIVACY AUDIT
--------------------------------------------------------
Timestamp: {audit_log['timestamp']}

Threat Simulation:
- Simulated {audit_log['poisoning_fraction']*100:.1f}% label poisoning to test fraud model robustness.

Model Performance:
- Baseline Accuracy: {audit_log['baseline_metrics']['accuracy']:.4f}
- Privacy-Preserved Accuracy: {audit_log['privacy_preserved_metrics']['accuracy']:.4f}

Security Governance:
- Evaluated model under data poisoning threat scenario.
- Applied simple feature masking ('Amount', 'Time') to enhance privacy.

Transparency:
- SHAP feature importance analysis used to interpret influential features for fraud classification.

Audit Readiness:
- Metrics, confusion matrices, and parameters logged for review.
- Supports compliance with GDPR, PCI DSS, and AI governance best practices.
"""

print(compliance_report)

Dataset shape: (265359, 30)
Fraud class distribution:
 Class
0.0    264879
1.0       480
Name: count, dtype: int64

Simulating label poisoning (2% of fraud labels flipped to class 0)...

Baseline Model - Classification Report (With Poisoned Data):
              precision    recall  f1-score   support

         0.0     0.9995    0.9999    0.9997     79467
         1.0     0.9286    0.7376    0.8221       141

    accuracy                         0.9994     79608
   macro avg     0.9641    0.8687    0.9109     79608
weighted avg     0.9994    0.9994    0.9994     79608
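
The 0.9994 accuracy above says little on its own, because only 141 of the 79,608 test transactions are labelled fraud. As a worked check, the counts below are reconstructed from the reported precision, recall, and support. They are not printed by the notebook, so treat them as an inference: 104 true positives, 37 false negatives, 8 false positives, and 79,459 true negatives. The sketch recomputes the fraud-class metrics and compares accuracy with a trivial model that never predicts fraud.

```python
# Counts inferred from the baseline report (assumption, not notebook output)
tp, fn = 104, 37      # fraud cases caught / missed (support = 141)
fp, tn = 8, 79459     # false alarms / correct non-fraud (support = 79467)

total = tp + fn + fp + tn
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / total

# A model that labels every transaction as non-fraud
always_negative_accuracy = (tn + fp) / total

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
print(f"accuracy={accuracy:.4f}")
print(f"always-negative accuracy={always_negative_accuracy:.4f}")
```

The two accuracy figures differ by about 0.1 percentage points (0.9994 vs 0.9982), while the fraud recall differs by roughly 74 points, so fraud-class recall and F1 are the more informative numbers to track in the audit log.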

