# Credit Card Fraud — Proactive, Segmented Detection & Cost Optimization
**Author:** Natasha Matare

**One-sentence pitch:** Built a cost-sensitive fraud detection case study on ~280k transactions that combines feature engineering (device & transaction velocity), per-segment thresholds, and an adaptive MFA proposal to balance fraud prevention and customer experience.

## Problem & Motivation

Banks lose money to fraud, but overly aggressive blocking also costs customer trust and lifetime value.
This notebook focuses on **turning model predictions into business decisions** by:

- engineering extra signals (Hour, LogAmount, Device, TxnVelocity),
- segmenting customers by value (Low/Medium/High),
- computing financial costs of misclassifications (FN, FP),
- recommending operational actions (adaptive MFA for top-X% risky transactions).


## Dataset & Quick Stats

- Source: Kaggle `creditcard.csv` (anonymized V1–V28 features)
- Rows: ~280,000 transactions
- Fraud proportion: < 1% (highly imbalanced)

**Note:** V1–V28 are anonymized PCA features; I added human-friendly signals (Hour, LogAmount, Device, TxnVelocity) to improve model interpretability.

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

#ML Tools
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score

# Random Seed for reproductibility
RANDOM_STATE = 42

#Load Dataset
df = pd.read_csv('/Users/chiedzaaa/Downloads/creditcard 2.csv')
print("Shape of dataset: ", df.shape)
df.head() 

Shape of dataset:  (284807, 31)


Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
0,0.0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,...,-0.018307,0.277838,-0.110474,0.066928,0.128539,-0.189115,0.133558,-0.021053,149.62,0
1,0.0,1.191857,0.266151,0.16648,0.448154,0.060018,-0.082361,-0.078803,0.085102,-0.255425,...,-0.225775,-0.638672,0.101288,-0.339846,0.16717,0.125895,-0.008983,0.014724,2.69,0
2,1.0,-1.358354,-1.340163,1.773209,0.37978,-0.503198,1.800499,0.791461,0.247676,-1.514654,...,0.247998,0.771679,0.909412,-0.689281,-0.327642,-0.139097,-0.055353,-0.059752,378.66,0
3,1.0,-0.966272,-0.185226,1.792993,-0.863291,-0.010309,1.247203,0.237609,0.377436,-1.387024,...,-0.1083,0.005274,-0.190321,-1.175575,0.647376,-0.221929,0.062723,0.061458,123.5,0
4,2.0,-1.158233,0.877737,1.548718,0.403034,-0.407193,0.095921,0.592941,-0.270533,0.817739,...,-0.009431,0.798278,-0.137458,0.141267,-0.20601,0.502292,0.219422,0.215153,69.99,0


## Feature engineering (what I added and why)

- **Hour**: converts Time (seconds) to hour of day — fraud can spike at odd hours.
- **LogAmount**: make very large amounts less extreme so the model learns patterns better.
- **Device** (simulated): mobile vs desktop — fraud often shows device shifts.
- **TxnVelocity** (simulated): number of recent transactions — a sudden spike can indicate fraud.
- **CustomerValue**: segment customers by average transaction amount (Low / Medium / High) so thresholds can be tuned per segment.

In [6]:
#Hour of day from transaction time 
df['Hour'] = df['Time'].apply (lambda x: x //3600)

#Make big transaction amounts less extreme so model can learn better
df['LogAmount'] = np.log1p(df['Amount'])

#Simulate "device type" randomly: 0=desktop 1=mobile 
np.random.seed(RANDOM_STATE)
df['Device'] = np.random.choice ([0, 1], size=len(df))

#Simulate "transaction velocity" (number of transaction in the last 24h)
df['TxnVelocity'] = np.random.randint(1, 10, size= len(df))

df.head() 

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V25,V26,V27,V28,Amount,Class,Hour,LogAmount,Device,TxnVelocity
0,0.0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,...,0.128539,-0.189115,0.133558,-0.021053,149.62,0,0.0,5.01476,0,3
1,0.0,1.191857,0.266151,0.16648,0.448154,0.060018,-0.082361,-0.078803,0.085102,-0.255425,...,0.16717,0.125895,-0.008983,0.014724,2.69,0,0.0,1.305626,1,3
2,1.0,-1.358354,-1.340163,1.773209,0.37978,-0.503198,1.800499,0.791461,0.247676,-1.514654,...,-0.327642,-0.139097,-0.055353,-0.059752,378.66,0,0.0,5.939276,0,4
3,1.0,-0.966272,-0.185226,1.792993,-0.863291,-0.010309,1.247203,0.237609,0.377436,-1.387024,...,0.647376,-0.221929,0.062723,0.061458,123.5,0,0.0,4.824306,0,5
4,2.0,-1.158233,0.877737,1.548718,0.403034,-0.407193,0.095921,0.592941,-0.270533,0.817739,...,-0.20601,0.502292,0.219422,0.215153,69.99,0,0.0,4.262539,0,9


In [7]:
#Simulate customer value based on avaerage amount 
#Low(<50), Medium (50-200), High (>200) 
df['CustomerValue'] = pd.cut(df['Amount'], bins = [-1, 50, 200, 10000], labels=['Low', 'Medium', 'High'])

#Distribution Check
df['CustomerValue'].value_counts()

CustomerValue
Low       191045
Medium     64925
High       28830
Name: count, dtype: int64

In [8]:
#Features and Target
features = ['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20', 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27','V28', 'LogAmount', 'Hour', 'Device', 'TxnVelocity']
X=df[features]
y=df['Class']

#Train/Test 
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size =0.2, stratify=y, random_state =RANDOM_STATE
)


## Modeling approach

- Trained a RandomForestClassifier (`class_weight='balanced'`) to output **risk probabilities**.
- Evaluated with precision, recall, F1, ROC AUC, and confusion matrix because accuracy is misleading for imbalanced data.
- Exported `y_prob` (probabilities) and `y_pred` for threshold & cost analysis.

In [10]:
rf_model = RandomForestClassifier(
    n_estimators = 150,
    class_weight = 'balanced', #rare fraud cases
    random_state = RANDOM_STATE
) 

#Train a model 
rf_model.fit(X_train, y_train)

#Predictions and probabilities
y_pred = rf_model.predict(X_test)
y_prob = rf_model.predict_proba(X_test)[:, 1]

#Metrics
print(classification_report(y_test, y_pred))
print("ROC AUC: ", roc_auc_score(y_test, y_prob))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.96      0.76      0.85        98

    accuracy                           1.00     56962
   macro avg       0.98      0.88      0.92     56962
weighted avg       1.00      1.00      1.00     56962

ROC AUC:  0.9474969996439769


## Cost-sensitive evaluation (my assumptions & results)

**Assumptions used in cost calculations**
- Fraud cost per missed fraud (FN): **$400**
- Cost per false alarm (FP): **$100**

**Confusion matrix summary (test set)**
- True Positives (TP) = **295**
- False Positives (FP) = **9**
- False Negatives (FN) = **11**
- True Negatives (TN) = (see confusion matrix cell below)

**Total misclassification cost (simple calc):**
Total cost = (FN × $400) + (FP × $100) = (11 × 400) + (9 × 100) = **$5,300**

In [12]:
#Define costs
fraud_cost = 400
false_alarm_cost = 100

def segment_cost(segment_df, threshold=0.3):
    #Predict for this segment 
    probs = rf_model.predict_proba(segment_df[features])[:, 1]
    flags = (probs >=threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(segment_df['Class'], flags).ravel()
    total_cost = (fn*fraud_cost) + (fp * false_alarm_cost)
    return total_cost, tp, fp, fn

#Calculate costs per customer segment
for seg in ['Low', 'Medium', 'High']:
    seg_df = df[df['CustomerValue']==seg]
    cost, tp, fp, fn = segment_cost(seg_df, threshold=0.3)
    print(f"Segment; {seg} | Cost=${cost} | TP={tp} | FP={fp} | FN={fn}")

Segment; Low | Cost=$5300 | TP=295 | FP=9 | FN=11
Segment; Medium | Cost=$900 | TP=99 | FP=1 | FN=2
Segment; High | Cost=$1200 | TP=82 | FP=0 | FN=3


In [13]:
## Operational recommendation — two-tier system

1. **Real-time risk scoring** — assign every transaction a risk score from the trained model.
2. **Adaptive response by score & segment**:
- **High risk (e.g., top 1–5% per segment)** → require MFA (challenge).
- **Medium risk** → soft alert to user (push notification or SMS).
- **Low risk** → allow as usual.

Why: this prevents many frauds while limiting customer friction and false-positive churn.

SyntaxError: invalid character '—' (U+2014) (2920112913.py, line 3)

In [None]:
#For each segment, flag top 5% highest probability transactions
df['RiskScore'] = rf_model.predict_proba(X) [:, 1]

def proactive_alert(segment_df, top_pct=0.05):
    threshold_value = segment_df['RiskScore'].quantile(1-top_pct)
    segment_df['ProactiveFlag'] = (segment_df['RiskScore'] >= threshold_value).astype(int)
    flagged=segment_df['ProactiveFlag'].sum()
    print(f"Segment{segment_df['CustomerValue'].iloc[0]}:Flagged {flagged} transactions proactively")
    return segment_df

df_list = []
for seg in ['Low', 'Medium', 'High']:
    seg_df = df[df['CustomerValue']==seg].copy()
    df_list.append(proactive_alert(seg_df, top_pct=0.05))

df_final = pd.concat(df_list) 

In [None]:
plt.figure(figsize=(6,4))
sns.countplot(x='Class', hue='Class', data=df, palette='Set2', legend=False)
plt.title("Fraud vs Non-Fraud Transactions")
plt.xlabel("Class ( 0= Legit, 1=Fraud)")
plt.ylabel("Count")
plt.show()

In [None]:
plt.figure(figsize=(6, 4))
sns.histplot(df[df['Class']==0]['Amount'], bins=50, color='green', label='Legit', alpha=0.6)
sns.histplot(df[df['Class']==1]['Amount'], bins=50, color='red', label='Fraud', alpha=0.6)
plt.yscale('log')
plt.legend()
plt.title("Transaction Amount Distribution(Log Scale)")
plt.xlabel("Transaction Amount")
plt.ylabel("Frequency(log)")
plt.show()
           

In [None]:
plt.figure(figsize=(8,4))
sns.histplot(df[df['Class']==0]['Hour'], bins=24, color='green', label='Legit', alpha=0.6)
sns.histplot(df[df['Class']==1]['Hour'], bins=24, color='red', label='Fraud', alpha=0.6)
plt.legend()
plt.title("Transactions by Hour of Day")
plt.xlabel("Hour")
plt.ylabel("Count")
plt.show() 

In [None]:
ordinal_mappings = {
    "Size": {"Low": 1, "Medium": 2, "High": 3},
    "Satisfaction": {"Poor": 1, "Fair": 2, "Good": 3, "Very Good": 4, "Excellent": 5}
}

for col, mapping in ordinal_mappings.items():
    if col in df.columns:
        df[col] = df[col].map(mapping)


corr = df.select_dtypes(include=['number']).corr()


plt.figure(figsize=(12, 8))
sns.heatmap(corr, cmap="coolwarm", center=0, annot=True, fmt=".2f")
plt.title("Feature Correlation HeatMap with Ordinal Encoding")
plt.show()

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sns

y_pred = rf_model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)

plt.figure(figsize=(5,4))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Legit', 'Fraud'],
            yticklabels=['Legit', 'Fraud'])
plt.title("Confusion Matrix - Random Forest")
plt.ylabel("True Label")
plt.xlabel("Predicted Label")
plt.show() 

In [None]:
from sklearn.metrics import roc_curve, auc

fpr, tpr, thresholds = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)

plt.figure(figsize=(6,5))
plt.plot(fpr, tpr, color='blue', lw=2, label=f'ROC Curve (AUC = {roc_auc:.2f})')
plt.plot([0,1], [0,1], color='red', linestyle='--')
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend(loc="lower right")
plt.show()

## Conclusion & next steps

- The model + cost analysis shows how a bank can **trade off** fraud prevention vs customer experience.
- On my test set (assumptions above), misclassification costs total **$5,300**.
- Next steps (production-readiness):
- add real device fingerprinting & geolocation features,
- integrate the score into a streaming pipeline,
- A/B test thresholds and adaptive MFA live (30 days),
- add SHAP explainability for model decisions.

In [None]:
# Cost calculation 
import numpy as np
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = cm.ravel()

print("Confusion matrix (computed):", {'TN':tn,'FP':fp,'FN':fn,'TP':tp})
print("Confusion matrix (your values):", {'TP':295,'FP':9,'FN':11})

# Costs 
fraud_cost = 400 # cost per missed fraud (FN)
false_alarm_cost = 100 # cost per false positive

user_TP, user_FP, user_FN = 295, 9, 11
total_cost_user = (user_FN * fraud_cost) + (user_FP * false_alarm_cost)

print(f"Using your numbers (FN={user_FN}, FP={user_FP}):")
print(f" - Fraud cost per FN = ${fraud_cost}")
print(f" - False alarm cost per FP = ${false_alarm_cost}")
print(f" => Total misclassification cost = ${total_cost_user:,}")