# 05 — Explainability (SHAP) + Governance Notes

This notebook uses SHAP for **global** interpretability. The goals are to:
- show which features (mostly latent PCA components) drive risk scores
- avoid assigning human semantics to the PCA features
- document monitoring and governance considerations


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
import shap

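df = pd.read_csv('../data/creditcard.csv')
cols = list(df.columns)
# The label column may be named 'Class' or 'class' depending on the file version
label_col = 'Class' if 'Class' in cols else 'class'
# Note: unparseable labels are coerced to NaN and then treated as non-fraud (0)
df[label_col] = pd.to_numeric(df[label_col], errors='coerce').fillna(0).astype(int)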

X = df.drop(columns=[label_col])
y = df[label_col].astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

neg = (y_train == 0).sum()
pos = (y_train == 1).sum()
scale_pos_weight = neg / max(pos, 1)

model = XGBClassifier(
    n_estimators=400,
    learning_rate=0.05,
    max_depth=4,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    eval_metric='aucpr',
    scale_pos_weight=scale_pos_weight,
    tree_method='hist',
    random_state=42
)
model.fit(X_train, y_train)

# Sample for SHAP (include all fraud cases + a manageable non-fraud sample)
fraud_idx = y_test[y_test==1].index
nonfraud_idx = y_test[y_test==0].sample(n=min(5000, int((y_test==0).sum())), random_state=42).index
sample_idx = fraud_idx.union(nonfraud_idx)

X_sample = X_test.loc[sample_idx]
print('SHAP sample shape:', X_sample.shape)

In [None]:
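# TreeExplainer is SHAP's explainer for tree ensembles. For this XGBoost classifier
# the values are in log-odds (raw margin) units, not probabilities.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_sample)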

shap.summary_plot(shap_values, X_sample, show=False)
plt.title('SHAP Summary (Global Feature Impact)')
plt.show()

In [None]:
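# Rank features by mean absolute SHAP value (a global importance measure).
# X_sample contains every test-set fraud case plus a non-fraud subsample, so fraud
# is over-represented relative to production traffic and weighs more in these means.
mean_abs = np.abs(shap_values).mean(axis=0)
top_idx = np.argsort(mean_abs)[::-1][:10]
top = pd.DataFrame({'feature': X_sample.columns[top_idx], 'mean_abs_shap': mean_abs[top_idx]})
top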

## How to interpret this correctly

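- Most features are anonymized PCA components (`v1`–`v28`) → **do not** assign human semantics to them. Any non-PCA columns in the file, such as the transaction amount, keep their literal meaning.
- Mean |SHAP| ranks features by how strongly they move the model's risk score (in log-odds). It describes the model's behavior, not the causal drivers of fraud.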

Portfolio-safe phrasing:
> The model concentrates risk on a small subset of latent behavioral components, suggesting that its fraud scores reflect anomalous combinations of transaction behaviors rather than amount alone.
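
Before using the phrasing above, it may be worth checking how concentrated the attributions actually are. The following is a minimal sketch, not part of the original notebook: `shap_concentration` is a hypothetical helper, and the demo array is synthetic data standing in for `shap_values`. It reports each feature's share of total mean |SHAP| and the cumulative share held by the top `top_k` features.

```python
import numpy as np

def shap_concentration(shap_values, feature_names, top_k=5):
    """Rank features by share of total mean |SHAP| and sum the top_k shares.

    shap_values: 2D array-like of shape (n_samples, n_features).
    feature_names: sequence of n_features names.
    """
    shap_values = np.asarray(shap_values, dtype=float)
    mean_abs = np.abs(shap_values).mean(axis=0)
    order = np.argsort(mean_abs)[::-1]          # most important feature first
    total = mean_abs.sum()
    if total > 0:
        share = mean_abs[order] / total         # shares sum to 1
    else:
        share = np.zeros_like(mean_abs)         # degenerate case: all-zero attributions
    ranked = [(feature_names[i], float(s)) for i, s in zip(order, share)]
    top_share = float(share[:top_k].sum())
    return ranked, top_share

# Synthetic stand-in for shap_values (assumption: two dominant features out of six)
rng = np.random.default_rng(0)
scales = np.array([3.0, 2.0, 0.1, 0.1, 0.1, 0.1])
demo_values = rng.normal(size=(1000, 6)) * scales
demo_names = [f"f{i}" for i in range(6)]

ranked, top2_share = shap_concentration(demo_values, demo_names, top_k=2)
print("Top features:", ranked[:2])
print("Share of total mean |SHAP| in top 2:", round(top2_share, 3))
```

In the notebook, the same function could be called as `shap_concentration(shap_values, list(X_sample.columns))`. A high top-k share supports the "small subset" wording; a flat distribution would not.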


## Governance & Monitoring

**Monitor**
- fraud rate and label drift
- score distribution shifts
- precision at fixed review capacity
- false positive rate / customer friction
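
Two of the metrics above can be sketched as follows. This is an illustrative example under stated assumptions, not the notebook's own code: `review_metrics` is a hypothetical helper, `capacity` is the number of transactions the review team can inspect per period, and the labels and scores are made-up demo values. The top `capacity` scores are flagged, and precision, recall, and false positive rate are computed for that flagged set.

```python
import numpy as np

def review_metrics(y_true, scores, capacity):
    """Precision, recall, and false positive rate when only the `capacity`
    highest-scoring transactions are sent to review."""
    y_true = np.asarray(y_true).astype(int)
    scores = np.asarray(scores, dtype=float)
    capacity = min(int(capacity), len(scores))
    if capacity <= 0:
        raise ValueError("capacity must be positive and scores must be non-empty")

    # Indices of the `capacity` highest scores (stable sort keeps ties deterministic)
    flagged = np.argsort(-scores, kind="stable")[:capacity]
    is_flagged = np.zeros(len(scores), dtype=bool)
    is_flagged[flagged] = True

    tp = int((is_flagged & (y_true == 1)).sum())   # fraud cases that were flagged
    fp = capacity - tp                              # legitimate transactions flagged
    n_pos = int((y_true == 1).sum())
    n_neg = len(y_true) - n_pos
    return {
        "precision_at_capacity": tp / capacity,
        "recall_at_capacity": tp / n_pos if n_pos else float("nan"),
        "false_positive_rate": fp / n_neg if n_neg else float("nan"),
    }

# Made-up demo data: 3 fraud cases among 10 transactions, review capacity of 3
y_demo = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
s_demo = np.array([0.1, 0.4, 0.9, 0.2, 0.8, 0.7, 0.05, 0.3, 0.15, 0.6])
metrics = review_metrics(y_demo, s_demo, capacity=3)
print(metrics)
```

Fixing the capacity rather than the score threshold keeps the metric tied to what the operations team can actually handle. In the notebook, the scores would come from `model.predict_proba(X_test)[:, 1]`.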

**Retrain**
- rolling window retraining (weekly/monthly)
- trigger retraining when the score distribution drifts beyond an agreed threshold
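
One common way to quantify score drift is the population stability index (PSI), which compares binned score distributions from a reference window and a current window. The sketch below is one possible approach, not something the notebook prescribes: the function name, the 10 quantile bins, and the 0.2 trigger threshold (a widely used rule of thumb) are all assumptions, and the score samples are synthetic.

```python
import numpy as np

def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two score samples, using quantile bins from the reference."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)

    # Bin edges from reference quantiles; duplicates are dropped for discrete scores
    edges = np.unique(np.quantile(reference, np.linspace(0, 1, n_bins + 1)))
    # Use interior edges only so values outside the reference range land in the end bins
    interior = edges[1:-1]
    n_cells = len(interior) + 1

    ref_counts = np.bincount(np.searchsorted(interior, reference, side="right"), minlength=n_cells)
    cur_counts = np.bincount(np.searchsorted(interior, current, side="right"), minlength=n_cells)

    # Clip fractions away from zero so the logarithm stays finite
    ref_frac = np.clip(ref_counts / ref_counts.sum(), eps, None)
    cur_frac = np.clip(cur_counts / cur_counts.sum(), eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

PSI_RETRAIN_THRESHOLD = 0.2  # assumption: rule-of-thumb value, tune per policy

# Synthetic score samples: one stable window and one shifted window
rng = np.random.default_rng(0)
reference_scores = rng.beta(1, 20, size=5000)
stable_scores = rng.beta(1, 20, size=5000)
shifted_scores = rng.beta(2, 10, size=5000)

psi_stable = population_stability_index(reference_scores, stable_scores)
psi_shifted = population_stability_index(reference_scores, shifted_scores)
print("stable PSI:", round(psi_stable, 4), "retrain:", psi_stable > PSI_RETRAIN_THRESHOLD)
print("shifted PSI:", round(psi_shifted, 4), "retrain:", psi_shifted > PSI_RETRAIN_THRESHOLD)
```

A PSI trigger reacts to changes in the score distribution only. It does not confirm that model quality has degraded, so it is best paired with the label-based metrics once delayed fraud labels arrive.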

**Human-in-the-loop**
- flagged transactions feed the investigation workflow
- the flagging policy depends on compliance requirements and operations capacity
