<a href="https://colab.research.google.com/github/ispromadhka/Credit-Card-Fraud-Detection/blob/main/Credit_Card_Fraud_Detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 💳 Credit Card Fraud Detection with Advanced ML

---

### 📚 Table of Contents

1. [Introduction & Business Context](#introduction)

2. [Data Loading & Initial Exploration](#data-loading)

3. [Exploratory Data Analysis](#eda)

4. [Feature Engineering](#feature-engineering)

5. [Model Development](#model-development)

6. [Model Evaluation & Interpretability](#evaluation)

7. [Business Impact Analysis](#business-impact)


---


<a id='introduction'></a>
## 1. Introduction & Business Context 📊

###  Problem Statement

Credit card fraud is a significant challenge in the financial industry, causing billions in losses annually. This project develops a sophisticated fraud detection system using:

- **CatBoost** (Yandex's gradient boosting library)

- **SHAP** for model interpretability

- **Optuna** for hyperparameter optimization



###  Success Metrics

- **Primary**: PR-AUC (critical for imbalanced data)

- **Secondary**: ROC-AUC, F1-Score, Business ROI
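
To illustrate why PR-AUC is the primary metric, the sketch below scores a synthetic dataset with purely random predictions. The 0.2% positive rate, the sample size, and the seed are illustrative assumptions, not values taken from this project. An uninformative model lands near 0.5 ROC-AUC regardless of imbalance, while its average precision (the PR-AUC estimate used in this notebook) drops to roughly the positive rate, so improvements are measured against a much lower floor.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)  # fixed seed, illustrative only
n_samples, positive_rate = 100_000, 0.002

# Synthetic labels (~0.2% positives) and scores that carry no signal.
y_true = (rng.random(n_samples) < positive_rate).astype(int)
random_scores = rng.random(n_samples)

random_roc_auc = roc_auc_score(y_true, random_scores)
random_pr_auc = average_precision_score(y_true, random_scores)

print(f"Positive rate:          {y_true.mean():.4f}")
print(f"ROC-AUC (random model): {random_roc_auc:.3f}")
print(f"PR-AUC  (random model): {random_pr_auc:.4f}")
```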


In [None]:
!pip install catboost optuna imbalanced-learn shap kagglehub

In [None]:
import sys
import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import patches
import matplotlib.patches as mpatches
import seaborn as sns
from datetime import datetime

from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split, StratifiedKFold
from sklearn.preprocessing import RobustScaler
from sklearn.metrics import (
    roc_auc_score, average_precision_score, f1_score,
    confusion_matrix, classification_report, roc_curve,
    precision_recall_curve, precision_score, recall_score
)

import optuna
from imblearn.over_sampling import SMOTE
import shap

import kagglehub
warnings.filterwarnings('ignore')

plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.labelsize'] = 12
plt.rcParams['axes.titlesize'] = 14
plt.rcParams['xtick.labelsize'] = 10
plt.rcParams['ytick.labelsize'] = 10
plt.rcParams['legend.fontsize'] = 10
plt.rcParams['figure.titlesize'] = 16


<a id='data-loading'></a>
## 2. Data Loading & Initial Exploration 📁

We'll use the famous Credit Card Fraud Detection dataset from Kaggle, containing:

- **284,807** transactions

- **492** fraudulent transactions (0.17% - extreme imbalance!)

- **30** features (V1-V28 are PCA-transformed for privacy, plus Time and Amount)


In [None]:
path = kagglehub.dataset_download("mlg-ulb/creditcardfraud")
df = pd.read_csv(f"{path}/creditcard.csv")

print(f"\n Dataset Shape: {df.shape[0]:,} transactions × {df.shape[1]} columns")

In [None]:
print("\n Dataset Info:")
display(df.head())
df.info()  # info() prints its report and returns None, so it is not wrapped in display()
print("\n📈 Statistical Summary:")
display(df.describe().round(2))

<a id='eda'></a>
## 3. Exploratory Data Analysis 🔍

### Key Questions to Answer:

1. How severe is the class imbalance?

2. What are the distributions of Amount and Time?

3. Are there any obvious patterns in fraudulent transactions?

4. Do we have missing values or outliers to handle?


Missing Values Check

In [None]:
missing = df.isnull().sum()
if missing.any():
    display(missing[missing > 0])
else:
    print("✅ No missing values found - Data quality is excellent!")

Class Distribution Analysis

In [None]:
fraud_count = df['Class'].sum()
normal_count = len(df) - fraud_count
fraud_ratio = df['Class'].mean()

print(f"🔵 Normal Transactions: {normal_count:,} ({(1-fraud_ratio)*100:.2f}%)")
print(f"🔴 Fraud Transactions: {fraud_count:,} ({fraud_ratio*100:.2f}%)")
print(f"⚖️ Imbalance Ratio: 1:{int(normal_count/fraud_count)}")

fig, axes = plt.subplots(1, 3, figsize=(18, 6))

ax = axes[0]
class_data = pd.DataFrame({
    'Class': ['Normal', 'Fraud'],
    'Count': [normal_count, fraud_count],
    'Percentage': [(1-fraud_ratio)*100, fraud_ratio*100]
})

bars = ax.bar(class_data['Class'], class_data['Count'],
               color=['#2E86C1', '#E74C3C'], alpha=0.8, edgecolor='black')

for bar, count, pct in zip(bars, class_data['Count'], class_data['Percentage']):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height,
            f'{count:,}\n({pct:.2f}%)',
            ha='center', va='bottom', fontweight='bold')

ax.set_ylabel('Number of Transactions', fontsize=12)
ax.set_title('Class Distribution', fontsize=14, fontweight='bold')
ax.set_ylim(0, normal_count * 1.1)
ax.grid(True, alpha=0.3)

ax = axes[1]
colors = ['#2E86C1', '#E74C3C']
explode = (0, 0.1)

wedges, texts, autotexts = ax.pie(
    [normal_count, fraud_count],
    labels=['Normal', 'Fraud'],
    colors=colors,
    explode=explode,
    # round() avoids off-by-one counts caused by float truncation
    autopct=lambda pct: f'{pct:.2f}%\n({round(pct / 100 * len(df)):,})',
    startangle=90,
    shadow=True
)

for autotext in autotexts:
    autotext.set_color('white')
    autotext.set_fontweight('bold')

ax.set_title('Transaction Type Proportion', fontsize=14, fontweight='bold')

ax = axes[2]
ax.bar(['Normal', 'Fraud'], [normal_count, fraud_count],
       color=['#2E86C1', '#E74C3C'], alpha=0.8, edgecolor='black')
ax.set_yscale('log')
ax.set_ylabel('Number of Transactions (log scale)', fontsize=12)
ax.set_title('Class Distribution (Log Scale)', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, which='both')

for i, (label, value) in enumerate(zip(['Normal', 'Fraud'], [normal_count, fraud_count])):
    ax.text(i, value, f'{value:,}', ha='center', va='bottom', fontweight='bold')

plt.suptitle('🎯 Class Imbalance Analysis', fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

In [None]:
fraud_count = df['Class'].sum()
normal_count = len(df) - fraud_count
fraud_ratio = df['Class'].mean()

fig, axes = plt.subplots(1, 2, figsize=(14, 6))

ax = axes[0]
class_data = pd.DataFrame({
    'Class': ['Normal', 'Fraud'],
    'Count': [normal_count, fraud_count],
    'Percentage': [(1-fraud_ratio)*100, fraud_ratio*100]
})

bars = ax.bar(class_data['Class'], class_data['Count'],
              color=['#2E86C1', '#E74C3C'], alpha=0.8, edgecolor='black')

for bar, count, pct in zip(bars, class_data['Count'], class_data['Percentage']):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height,
            f'{count:,}\n({pct:.2f}%)',
            ha='center', va='bottom', fontweight='bold')

ax.set_ylabel('Number of Transactions', fontsize=12)
ax.set_title('Class Distribution', fontsize=14, fontweight='bold')
ax.set_ylim(0, normal_count * 1.1)
ax.grid(True, alpha=0.3)


ax = axes[1]
ax.bar(['Normal', 'Fraud'], [normal_count, fraud_count],
       color=['#2E86C1', '#E74C3C'], alpha=0.8, edgecolor='black')
ax.set_yscale('log')
ax.set_ylabel('Number of Transactions (log scale)', fontsize=12)
ax.set_title('Class Distribution (Log Scale)', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, which='both')

for i, (label, value) in enumerate(zip(['Normal', 'Fraud'], [normal_count, fraud_count])):
    ax.text(i, value, f'{value:,}', ha='center', va='bottom', fontweight='bold')

plt.suptitle(f'Class Imbalance Analysis\n Imbalance Ratio: 1:{int(normal_count/fraud_count)}', fontsize=16,fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

Transaction Amount Analysis

In [None]:
stats_data = []

for class_label in [0, 1]:
    class_name = 'Normal' if class_label == 0 else 'Fraud'
    class_data = df[df['Class'] == class_label]['Amount']

    stats_data.append({
        'Type': class_name,
        'Mean ($)': f"{class_data.mean():.2f}",
        'Median ($)': f"{class_data.median():.2f}",
        'Std Dev ($)': f"{class_data.std():.2f}",
        'Min ($)': f"{class_data.min():.2f}",
        'Max ($)': f"{class_data.max():.2f}",
        'Count': f"{len(class_data):,}"
    })

stats_df = pd.DataFrame(stats_data)
display(stats_df)

fig, axes = plt.subplots(1, 2, figsize=(16, 7))

ax = axes[0]
data_to_plot = [df[df['Class'] == 0]['Amount'].values,
                df[df['Class'] == 1]['Amount'].values]
bp = ax.boxplot(data_to_plot, labels=['Normal', 'Fraud'],
                patch_artist=True, showmeans=True)
colors = ['#2E86C1', '#E74C3C']
for patch, color in zip(bp['boxes'], colors):
    patch.set_facecolor(color)
    patch.set_alpha(0.7)
ax.set_ylabel('Transaction Amount ($)')
ax.set_title('Amount Distribution Comparison', fontweight='bold')
ax.set_yscale('log')
ax.grid(True, alpha=0.3)

ax = axes[1]
sample_indices = np.random.choice(df[df['Class'] == 0].index, 5000,
                                  replace=False)
ax.scatter(df.loc[sample_indices, 'Time'] / 3600,
           df.loc[sample_indices, 'Amount'],
           alpha=0.3, s=10, label='Normal', color='#2E86C1')
ax.scatter(df[df['Class'] == 1]['Time'] / 3600,
           df[df['Class'] == 1]['Amount'],
           alpha=0.8, s=20, label='Fraud', color='#E74C3C', marker='x')
ax.set_xlabel('Time (hours)')
ax.set_ylabel('Amount ($)')
ax.set_title('Transaction Pattern: Amount vs Time', fontweight='bold')
ax.set_yscale('log')
ax.legend()
ax.grid(True, alpha=0.3)

plt.suptitle('💰 Transaction Patterns Analysis', fontsize=16,
             fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

PCA Features Analysis (V1-V28):

In [None]:
v_cols = [col for col in df.columns if col.startswith('V')]

correlations = []
for col in v_cols:
    corr = df[col].corr(df['Class'])
    correlations.append((col, corr))

correlations.sort(key=lambda x: abs(x[1]), reverse=True)

print("\n🎯 Top 10 Features Most Correlated with Fraud:")
for i, (col, corr) in enumerate(correlations[:10], 1):
    direction = "up" if corr > 0 else "down"
    print(f"  {i}. {col}: {corr:+.4f} {direction}")

fig, ax = plt.subplots(figsize=(14, 8))

top_features = [col for col, _ in correlations[:15]] + ['Amount', 'Class']
corr_matrix = df[top_features].corr()

mask = np.triu(np.ones_like(corr_matrix, dtype=bool))
sns.heatmap(corr_matrix, mask=mask, annot=True, fmt='.2f',
            cmap='coolwarm', center=0, square=True, linewidths=1,
            cbar_kws={"shrink": 0.8})

plt.title('Correlation Matrix: Top Features vs Fraud',
          fontsize=12, fontweight='bold', pad=20)
plt.tight_layout()
plt.show()


<a id='feature-engineering'></a>
## 4. Feature Engineering 🔧

### Strategy:

1. **Time-based features**: Extract hour and day patterns

2. **Amount transformations**: Log transform, polynomial features

3. **Statistical aggregations**: Mean, std, min, max of V features

4. **Anomaly scores**: Z-scores for key features
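
The time-based step encodes the hour as a sine/cosine pair. A minimal sketch of why that helps, using hypothetical helper names (`encode_hour`, `circular_distance`) that are not part of the notebook: on the raw 0-24 scale, 23:00 and 01:00 look 22 hours apart, whereas on the unit circle they are close neighbours.

```python
import numpy as np

def encode_hour(hour):
    """Map an hour in [0, 24) to a (sin, cos) point on the unit circle."""
    angle = 2 * np.pi * np.asarray(hour, dtype=float) / 24
    return np.stack([np.sin(angle), np.cos(angle)], axis=-1)

def circular_distance(hour_a, hour_b):
    """Euclidean distance between two encoded hours."""
    return float(np.linalg.norm(encode_hour(hour_a) - encode_hour(hour_b)))

late_vs_early = circular_distance(23, 1)     # two hours apart across midnight
midnight_vs_noon = circular_distance(0, 12)  # opposite sides of the clock

print(f"23:00 vs 01:00: {late_vs_early:.3f}")
print(f"00:00 vs 12:00: {midnight_vs_noon:.3f}")
```

Tree-based models can split on the raw hour directly, so the gain from this encoding may be small for CatBoost; it matters more for distance-based or linear models.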


In [None]:
def create_advanced_features(df):
    print("🔧 Feature Engineering Pipeline")

    df_feat = df.copy()
    original_features = len(df.columns)

    print("\n⏰ Creating time-based features...")
    # 'Time' is seconds elapsed since the first transaction in the dataset,
    # so 'Hour' and 'Day' are offsets from that start, not wall-clock values.
    df_feat['Hour'] = (df_feat['Time'] / 3600) % 24
    df_feat['Day'] = df_feat['Time'] / (3600 * 24)
    df_feat['Hour_sin'] = np.sin(2 * np.pi * df_feat['Hour'] / 24)
    df_feat['Hour_cos'] = np.cos(2 * np.pi * df_feat['Hour'] / 24)
    # The data spans roughly two days, so Day % 7 never reaches 5 and
    # 'Is_Weekend' is constant (all zeros) on this dataset.
    df_feat['Is_Weekend'] = (df_feat['Day'] % 7 >= 5).astype(int)
    df_feat['Is_Night'] = ((df_feat['Hour'] >= 22) | (df_feat['Hour'] <= 6)).astype(int)

    print("💰 Creating amount-based features...")
    df_feat['Amount_log'] = np.log1p(df_feat['Amount'])
    df_feat['Amount_squared'] = df_feat['Amount'] ** 2
    df_feat['Amount_sqrt'] = np.sqrt(df_feat['Amount'])

    df_feat['Amount_bin'] = pd.qcut(df_feat['Amount'], q=10, labels=False, duplicates='drop')

    print("📊 Creating statistical aggregations...")
    v_cols = [col for col in df.columns if col.startswith('V')]

    df_feat['V_mean'] = df_feat[v_cols].mean(axis=1)
    df_feat['V_std'] = df_feat[v_cols].std(axis=1)
    df_feat['V_max'] = df_feat[v_cols].max(axis=1)
    df_feat['V_min'] = df_feat[v_cols].min(axis=1)
    df_feat['V_range'] = df_feat['V_max'] - df_feat['V_min']
    df_feat['V_skew'] = df_feat[v_cols].skew(axis=1)
    df_feat['V_kurtosis'] = df_feat[v_cols].kurtosis(axis=1)

    print("🎯 Creating anomaly scores...")
    important_features = ['V1', 'V2', 'V3', 'V4', 'V11', 'V12', 'V14', 'V17']

    for col in important_features:
        mean = df_feat[col].mean()
        std = df_feat[col].std()
        df_feat[f'{col}_zscore'] = (df_feat[col] - mean) / std
        df_feat[f'{col}_is_outlier'] = (np.abs(df_feat[f'{col}_zscore']) > 3).astype(int)

    print("🔄 Creating interaction features...")
    df_feat['Amount_x_V1'] = df_feat['Amount'] * df_feat['V1']
    df_feat['Amount_x_V2'] = df_feat['Amount'] * df_feat['V2']
    df_feat['Hour_x_Amount'] = df_feat['Hour'] * df_feat['Amount']

    outlier_cols = [col for col in df_feat.columns if col.endswith('_is_outlier')]
    df_feat['Total_outliers'] = df_feat[outlier_cols].sum(axis=1)

    new_features = len(df_feat.columns) - original_features

    print(f"\n✅ Feature Engineering Complete!")
    print(f"   - Original features: {original_features}")
    print(f"   - New features created: {new_features}")
    print(f"   - Total features: {len(df_feat.columns)}")

    return df_feat

df_engineered = create_advanced_features(df)

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

ax = axes[0, 0]
for class_label, class_name, color in [(0, 'Normal', '#2E86C1'),
                                        (1, 'Fraud', '#E74C3C')]:
    hour_dist = df_engineered[df_engineered['Class']==class_label]['Hour']
    ax.hist(hour_dist, bins=24, alpha=0.7, label=class_name,
            color=color, density=True)

ax.set_xlabel('Hour of Day')
ax.set_ylabel('Density')
ax.set_title('Transaction Distribution by Hour', fontweight='bold')
ax.legend()
ax.grid(True, alpha=0.3)

ax = axes[0, 1]
for class_label, class_name, color in [(0, 'Normal', '#2E86C1'),
                                        (1, 'Fraud', '#E74C3C')]:
    amount_log = df_engineered[df_engineered['Class']==class_label]['Amount_log']
    ax.hist(amount_log, bins=50, alpha=0.7, label=class_name,
            color=color, density=True)

ax.set_xlabel('Log(Amount + 1)')
ax.set_ylabel('Density')
ax.set_title('Log-Transformed Amount Distribution', fontweight='bold')
ax.legend()
ax.grid(True, alpha=0.3)

ax = axes[1, 0]
stats_features = ['V_mean', 'V_std', 'V_range']
positions = np.arange(len(stats_features))
width = 0.35

normal_means = [df_engineered[df_engineered['Class']==0][feat].mean()
                for feat in stats_features]
fraud_means = [df_engineered[df_engineered['Class']==1][feat].mean()
               for feat in stats_features]

ax.bar(positions - width/2, normal_means, width, label='Normal',
       color='#2E86C1', alpha=0.8)
ax.bar(positions + width/2, fraud_means, width, label='Fraud',
       color='#E74C3C', alpha=0.8)

ax.set_xlabel('Statistical Features')
ax.set_ylabel('Mean Value')
ax.set_title('V-Features Statistics by Class', fontweight='bold')
ax.set_xticks(positions)
ax.set_xticklabels(stats_features)
ax.legend()
ax.grid(True, alpha=0.3)

ax = axes[1, 1]
for class_label, class_name, color in [(0, 'Normal', '#2E86C1'),
                                        (1, 'Fraud', '#E74C3C')]:
    outlier_counts = df_engineered[df_engineered['Class']==class_label]['Total_outliers']
    ax.hist(outlier_counts, bins=range(0, 10), alpha=0.7, label=class_name,
            color=color, density=True)

ax.set_xlabel('Number of Outlier Features')
ax.set_ylabel('Density')
ax.set_title('Outlier Count Distribution by Class', fontweight='bold')
ax.legend()
ax.grid(True, alpha=0.3)

plt.suptitle('Engineered Features Analysis', fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()


<a id='model-development'></a>
## 5. Model Development 🤖

### Approach:

1. **Train-Validation-Test Split** (60-20-20)

2. **Imbalance Handling**: Class weighting in CatBoost (SMOTE is imported, but no resampling is applied in this notebook)

3. **Model**: CatBoost with class weights

4. **Optimization**: Optuna for hyperparameter tuning
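
As a quick check of the numbers behind this approach, the sketch below works out the fractions produced by two chained splits and the normal-to-fraud ratio implied by the class counts quoted earlier (284,807 transactions, 492 frauds). The helper name `split_fractions` is hypothetical and exists only for this illustration.

```python
def split_fractions(test_size, val_size_of_remainder):
    """Return (train, val, test) fractions for two chained splits."""
    remainder = 1.0 - test_size
    val = remainder * val_size_of_remainder
    train = remainder - val
    return train, val, test_size

# test_size=0.2 first, then 25% of the remaining 80% for validation.
train_frac, val_frac, test_frac = split_fractions(0.2, 0.25)
print(f"train={train_frac:.2f}, val={val_frac:.2f}, test={test_frac:.2f}")

# Class counts quoted in the data-loading section.
n_total, n_fraud = 284_807, 492
n_normal = n_total - n_fraud
balanced_weight = n_normal / n_fraud  # weight that equalizes total class weight
print(f"Normal-to-fraud ratio: {balanced_weight:.1f}")
```

The fraud-class weights used in this notebook (250 for the baseline, 50 to 200 in the search) sit below this fully balancing ratio, which generally trades some recall for fewer false alarms.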


Preparing Data for Modeling

In [None]:
X = df_engineered.drop(['Class', 'Time'], axis=1)
y = df_engineered['Class']

print(f"\n📊 Feature Matrix Shape: {X.shape}")
print(f"🎯 Target Distribution: {y.value_counts().to_dict()}")

X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42, stratify=y_temp
)

print(f"\n✅ Data Split Complete:")
print(f"    Training Set: {X_train.shape[0]:,} samples ({y_train.mean()*100:.2f}% fraud)")
print(f"    Validation Set: {X_val.shape[0]:,} samples ({y_val.mean()*100:.2f}% fraud)")
print(f"    Test Set: {X_test.shape[0]:,} samples ({y_test.mean()*100:.2f}% fraud)")


In [None]:
baseline_model = CatBoostClassifier(
    iterations=200,
    learning_rate=0.01,
    depth=8,
    l2_leaf_reg=3,
    loss_function='Logloss',
    eval_metric='PRAUC',
    random_seed=42,
    bootstrap_type='Bayesian',
    class_weights={0: 1, 1: 250},
    use_best_model=True
)

baseline_model.fit(
    X_train, y_train,
    eval_set=(X_val, y_val),
    early_stopping_rounds=50,
    verbose=False
)

y_val_pred = baseline_model.predict(X_val)
y_val_proba = baseline_model.predict_proba(X_val)[:, 1]

baseline_metrics = {
    'ROC-AUC': roc_auc_score(y_val, y_val_proba),
    'PR-AUC': average_precision_score(y_val, y_val_proba),
    'F1-Score': f1_score(y_val, y_val_pred),
    'Precision': precision_score(y_val, y_val_pred),
    'Recall': recall_score(y_val, y_val_pred)
}

print("\n📊 Baseline Model Performance:")
for metric, value in baseline_metrics.items():
    print(f"    {metric}: {value:.4f}")

cm = confusion_matrix(y_val, y_val_pred)
tn, fp, fn, tp = cm.ravel()
print(f"\n📋 Confusion Matrix:")
print(f"    True Positives: {tp} | False Positives: {fp}")
print(f"    False Negatives: {fn} | True Negatives: {tn}")

print(f"\n💼 Business Metrics:")
print(f"    Fraud Detection Rate: {tp/(tp+fn)*100:.1f}%")
print(f"    Alert Accuracy: {tp/(tp+fp)*100:.1f}%")
print(f"    False Positive Rate: {fp/(fp+tn)*100:.3f}%")
print(f"    Missed Fraud: {fn} transactions")

Hyperparameter Optimization with Optuna

In [None]:
def objective(trial):
    params = {
        'iterations': trial.suggest_int('iterations', 50, 500),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'depth': trial.suggest_int('depth', 4, 10),
        'l2_leaf_reg': trial.suggest_float('l2_leaf_reg', 1, 10),
        'random_strength': trial.suggest_float('random_strength', 0, 1),
        'bagging_temperature': trial.suggest_float('bagging_temperature', 0, 1),
        'border_count': trial.suggest_int('border_count', 32, 255),
        'loss_function': 'Logloss',
        'eval_metric': 'AUC',
        'random_seed': 42,
        'verbose': False
    }

    class_weight = trial.suggest_float('class_weight', 50, 200)
    params['class_weights'] = {0: 1, 1: class_weight}

    model = CatBoostClassifier(**params)
    model.fit(
        X_train, y_train,
        eval_set=(X_val, y_val),
        early_stopping_rounds=20,
        verbose=False
    )

    y_pred_proba = model.predict_proba(X_val)[:, 1]
    return average_precision_score(y_val, y_pred_proba)

study = optuna.create_study(direction='maximize', sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective, n_trials=10, show_progress_bar=True)

print(f"\n✅ Optimization Complete!")
print(f"   Best PR-AUC: {study.best_value:.4f}")
print(f"\n📋 Best Parameters:")
for param, value in study.best_params.items():
    if isinstance(value, float):
        print(f"    {param}: {value:.4f}")
    else:
        print(f"    {param}: {value}")


In [None]:
best_params = study.best_params.copy()
class_weight = best_params.pop('class_weight')
best_params['class_weights'] = {0: 1, 1: class_weight}
best_params['loss_function'] = 'Logloss'
best_params['eval_metric'] = 'AUC'
best_params['random_seed'] = 42
best_params['verbose'] = False

final_model = CatBoostClassifier(**best_params)
final_model.fit(
    X_train, y_train,
    eval_set=(X_val, y_val),
    early_stopping_rounds=30,
    verbose=100
)

<a id='evaluation'></a>
## 6. Model Evaluation & Interpretability 📊

### Comprehensive evaluation includes:

1. Performance metrics (ROC-AUC, PR-AUC, F1, Precision, Recall)

2. Confusion matrix analysis

3. ROC and PR curves

4. Feature importance

5. SHAP analysis for interpretability
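
Precision, recall, and F1 all depend on the decision threshold, while `predict` uses a fixed cut-off. The sketch below, a minimal illustration on synthetic scores rather than this model's output, sweeps the same 0.1 to 0.9 grid used in the evaluation plot and picks the threshold with the highest F1. The helper `precision_recall_f1` and the score distributions are assumptions made for the example; in practice the threshold would usually be chosen on validation data, not on the test set.

```python
import numpy as np

def precision_recall_f1(y_true, y_score, threshold):
    """Compute precision, recall and F1 for one decision threshold."""
    y_pred = y_score >= threshold
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    fn = np.sum(~y_pred & (y_true == 1))
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)
    return precision, recall, f1

# Synthetic stand-in for validation labels and predicted probabilities.
rng = np.random.default_rng(42)
y_true = (rng.random(20_000) < 0.01).astype(int)
y_score = np.clip(rng.normal(0.15 + 0.6 * y_true, 0.15), 0, 1)

thresholds = np.linspace(0.1, 0.9, 50)
f1_scores = [precision_recall_f1(y_true, y_score, t)[2] for t in thresholds]
best_threshold = thresholds[int(np.argmax(f1_scores))]

print(f"Best F1 = {max(f1_scores):.3f} at threshold {best_threshold:.2f}")
```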



🎯 Final Model Evaluation on Test Set

In [None]:
y_test_pred = final_model.predict(X_test)
y_test_proba = final_model.predict_proba(X_test)[:, 1]

test_metrics = {
    'ROC-AUC': roc_auc_score(y_test, y_test_proba),
    'PR-AUC': average_precision_score(y_test, y_test_proba),
    'F1-Score': f1_score(y_test, y_test_pred),
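    # The original expressions were swapped: the first computed recall and
    # the second precision. Use the scikit-learn helpers instead.
    'Precision': precision_score(y_test, y_test_pred),
    'Recall': recall_score(y_test, y_test_pred)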
}

print("\n📊 Test Set Performance:")
for metric, value in test_metrics.items():
    print(f"    {metric}: {value:.4f}")

cm = confusion_matrix(y_test, y_test_pred)
tn, fp, fn, tp = cm.ravel()

print(f"\n📋 Confusion Matrix Analysis:")
print(f"    True Negatives: {tn:,} ({tn/(tn+fp)*100:.2f}%)")
print(f"    False Positives: {fp:,} ({fp/(tn+fp)*100:.2f}%)")
print(f"    False Negatives: {fn:,} ({fn/(fn+tp)*100:.2f}%)")
print(f"    True Positives: {tp:,} ({tp/(fn+tp)*100:.2f}%)")


In [None]:
fig = plt.figure(figsize=(20, 12))
gs = fig.add_gridspec(3, 3, hspace=0.3, wspace=0.3)

ax1 = fig.add_subplot(gs[0, 0])
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Normal', 'Fraud'],
            yticklabels=['Normal', 'Fraud'],
            cbar_kws={'label': 'Count'})
ax1.set_ylabel('True Label')
ax1.set_xlabel('Predicted Label')
ax1.set_title('Confusion Matrix', fontweight='bold')

for i in range(2):
    for j in range(2):
        percentage = cm[i, j] / cm.sum() * 100
        ax1.text(j + 0.5, i + 0.7, f'({percentage:.2f}%)',
                ha='center', va='center', fontsize=9, color='red')

ax2 = fig.add_subplot(gs[0, 1])
fpr, tpr, roc_thresholds = roc_curve(y_test, y_test_proba)
roc_auc = roc_auc_score(y_test, y_test_proba)

ax2.plot(fpr, tpr, color='#E74C3C', lw=2,
         label=f'ROC curve (AUC = {roc_auc:.4f})')
ax2.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', alpha=0.5)
ax2.fill_between(fpr, tpr, alpha=0.3, color='#E74C3C')
ax2.set_xlim([0.0, 1.0])
ax2.set_ylim([0.0, 1.05])
ax2.set_xlabel('False Positive Rate')
ax2.set_ylabel('True Positive Rate')
ax2.set_title('ROC Curve', fontweight='bold')
ax2.legend(loc="lower right")
ax2.grid(True, alpha=0.3)

ax3 = fig.add_subplot(gs[0, 2])
precision, recall, pr_thresholds = precision_recall_curve(y_test, y_test_proba)
pr_auc = average_precision_score(y_test, y_test_proba)

ax3.plot(recall, precision, color='#27AE60', lw=2,
         label=f'PR curve (AUC = {pr_auc:.4f})')
ax3.fill_between(recall, precision, alpha=0.3, color='#27AE60')
ax3.axhline(y=y_test.mean(), color='navy', linestyle='--',
            label=f'Baseline (Random): {y_test.mean():.4f}')
ax3.set_xlabel('Recall')
ax3.set_ylabel('Precision')
ax3.set_title('Precision-Recall Curve', fontweight='bold')
ax3.legend(loc="upper right")
ax3.grid(True, alpha=0.3)

ax4 = fig.add_subplot(gs[1, :])
feature_importance = final_model.get_feature_importance()
feature_names = X_train.columns
importance_df = pd.DataFrame({
    'feature': feature_names[:len(feature_importance)],
    'importance': feature_importance
}).sort_values('importance', ascending=False).head(20)

colors = plt.cm.viridis(importance_df['importance'] / importance_df['importance'].max())
bars = ax4.barh(range(len(importance_df)), importance_df['importance'], color=colors)
ax4.set_yticks(range(len(importance_df)))
ax4.set_yticklabels(importance_df['feature'])
ax4.set_xlabel('Importance Score')
ax4.set_title('Top 20 Feature Importances', fontweight='bold')
ax4.invert_yaxis()
ax4.grid(True, alpha=0.3, axis='x')

for i, (bar, value) in enumerate(zip(bars, importance_df['importance'])):
    ax4.text(value, i, f' {value:.1f}', va='center', fontsize=9)

ax5 = fig.add_subplot(gs[2, 0])
thresholds = np.linspace(0.1, 0.9, 50)
metrics_by_threshold = []

for threshold in thresholds:
    y_pred_threshold = (y_test_proba >= threshold).astype(int)
    metrics_by_threshold.append({
        'threshold': threshold,
        'precision': precision_score(y_test, y_pred_threshold, zero_division=0),
        'recall': recall_score(y_test, y_pred_threshold),
        'f1': f1_score(y_test, y_pred_threshold, zero_division=0)
    })

metrics_df = pd.DataFrame(metrics_by_threshold)

# Plot all three metrics on the same axis so they share one 0-1 scale,
# all appear in the legend, and the ROC axis variable `ax2` is not overwritten.
ax5.plot(metrics_df['threshold'], metrics_df['precision'],
         'b-', linewidth=2, label='Precision')
ax5.plot(metrics_df['threshold'], metrics_df['recall'],
         'r-', linewidth=2, label='Recall')
ax5.plot(metrics_df['threshold'], metrics_df['f1'],
         'g-', linewidth=2, label='F1-Score')

ax5.set_xlabel('Threshold')
ax5.set_ylabel('Score')
ax5.set_title('Metrics by Decision Threshold', fontweight='bold')
ax5.legend()
ax5.grid(True, alpha=0.3)

ax6 = fig.add_subplot(gs[2, 1])
ax6.hist(y_test_proba[y_test == 0], bins=50, alpha=0.7,
         label='Normal', color='#2E86C1', density=True)
ax6.hist(y_test_proba[y_test == 1], bins=50, alpha=0.7,
         label='Fraud', color='#E74C3C', density=True)
ax6.set_xlabel('Predicted Probability')
ax6.set_ylabel('Density')
ax6.set_title('Score Distribution by Class', fontweight='bold')
ax6.legend()
ax6.grid(True, alpha=0.3)

ax7 = fig.add_subplot(gs[2, 2])
from sklearn.calibration import calibration_curve
fraction_pos, mean_pred = calibration_curve(y_test, y_test_proba, n_bins=10)
ax7.plot(mean_pred, fraction_pos, marker='o', linewidth=2,
         label='Model', color='#E74C3C')
ax7.plot([0, 1], [0, 1], linestyle='--', label='Perfect Calibration',
         color='gray', alpha=0.5)
ax7.set_xlabel('Mean Predicted Probability')
ax7.set_ylabel('Fraction of Positives')
ax7.set_title('Calibration Plot', fontweight='bold')
ax7.legend()
ax7.grid(True, alpha=0.3)

plt.suptitle('Comprehensive Model Evaluation',
             fontsize=18, fontweight='bold', y=1.02)
plt.show()

🔍 SHAP Analysis for Model Interpretability

In [None]:
sample_size = 1000
X_test_sample = X_test.sample(n=min(sample_size, len(X_test)), random_state=42)

explainer = shap.TreeExplainer(final_model)
shap_values = explainer.shap_values(X_test_sample)
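
The cell above computes `shap_values` but does not summarize them. One common global summary is the mean absolute SHAP value per feature. The sketch below shows that ranking on a synthetic array standing in for `shap_values`, assumed here to be a 2-D array of shape (rows, features); the helper name `rank_features_by_mean_abs_shap` and the numbers are made up for illustration and are not results from this model.

```python
import numpy as np
import pandas as pd

def rank_features_by_mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean |SHAP| over the explained rows."""
    mean_abs = np.abs(np.asarray(shap_matrix)).mean(axis=0)
    ranking = pd.Series(mean_abs, index=list(feature_names), name="mean_abs_shap")
    return ranking.sort_values(ascending=False)

# Synthetic stand-in for `shap_values`: 1,000 rows, three features whose
# contributions have deliberately different spreads.
rng = np.random.default_rng(0)
fake_shap = rng.normal(0.0, [2.0, 0.5, 0.1], size=(1000, 3))

ranking = rank_features_by_mean_abs_shap(fake_shap, ["V14", "V4", "Amount_log"])
print(ranking)
```

With the real objects, the same call would presumably take `shap_values` and `X_test_sample.columns` as arguments.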

<a id='business-impact'></a>
## 7. Business Impact Analysis 💰

### ROI Calculation:

- **Prevented fraud loss**: $1,000 × True Positives × 1.5

- **Investigation cost**: $50 × (True Positives + False Positives)

- **Customer friction**: $1,000 × False Positives × 0.02

- **Missed fraud**: $1,000 × False Negatives × 1.5
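
A worked example may make the cost model easier to follow. The sketch below applies it to hypothetical confusion-matrix counts (80 true positives, 10 false positives, 20 false negatives), chosen only for readable arithmetic. The constants mirror those in `calculate_business_metrics`, including a friction cost of `avg_fraud_amount × friction_rate` per false positive; the function name `estimate_net_benefit` is an assumption for this illustration.

```python
def estimate_net_benefit(tp, fp, fn,
                         avg_fraud_amount=1000, investigation_cost=50,
                         fraud_multiplier=1.5, friction_rate=0.02):
    """Apply the cost model to raw confusion-matrix counts."""
    prevented = tp * avg_fraud_amount * fraud_multiplier
    missed = fn * avg_fraud_amount * fraud_multiplier
    investigation = (tp + fp) * investigation_cost
    friction = fp * avg_fraud_amount * friction_rate
    return {
        "prevented": prevented,
        "missed": missed,
        "investigation": investigation,
        "friction": friction,
        "net_benefit": prevented - missed - investigation - friction,
    }

# Hypothetical counts, not results from this model.
example = estimate_net_benefit(tp=80, fp=10, fn=20)
for name, value in example.items():
    print(f"{name:>14}: ${value:,.2f}")
```

With these counts the terms are $120,000 prevented, $30,000 missed, $4,500 in investigations, and $200 in friction, for a net benefit of $85,300.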


In [None]:
def calculate_business_metrics(y_true, y_pred, y_proba):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    avg_fraud_amount = 1000
    investigation_cost = 50
    fraud_multiplier = 1.5
    friction_rate = 0.02

    prevented_fraud = tp * avg_fraud_amount * fraud_multiplier
    missed_fraud = fn * avg_fraud_amount * fraud_multiplier
    investigation_costs = (tp + fp) * investigation_cost
    customer_friction = fp * avg_fraud_amount * friction_rate

    net_benefit = prevented_fraud - missed_fraud - investigation_costs - customer_friction

    total_fraud = (tp + fn) * avg_fraud_amount * fraud_multiplier
    savings = prevented_fraud - investigation_costs - customer_friction

    roi = (savings / (investigation_costs + customer_friction)) * 100 if investigation_costs > 0 else 0

    return {
        'prevented_fraud': prevented_fraud,
        'missed_fraud': missed_fraud,
        'investigation_costs': investigation_costs,
        'customer_friction': customer_friction,
        'net_benefit': net_benefit,
        'total_possible_fraud': total_fraud,
        'savings_vs_no_model': savings,
        'roi': roi,
        'tp': tp, 'fp': fp, 'fn': fn, 'tn': tn
    }

business_metrics = calculate_business_metrics(y_test, y_test_pred, y_test_proba)

print("💰 Business Impact Analysis")
print(f"\n📊 Detection Performance:")
print(f"    Frauds Detected: {business_metrics['tp']}/{business_metrics['tp']+business_metrics['fn']} ({business_metrics['tp']/(business_metrics['tp']+business_metrics['fn'])*100:.1f}%)")
print(f"    False Alarms: {business_metrics['fp']}")
print(f"    Correct Rejections: {business_metrics['tn']:,}")

print(f"\n💵 Financial Impact:")
print(f"    Prevented Fraud Loss: ${business_metrics['prevented_fraud']:,.2f}")
print(f"    Missed Fraud Loss: ${business_metrics['missed_fraud']:,.2f}")
print(f"    Investigation Costs: ${business_metrics['investigation_costs']:,.2f}")
print(f"    Customer Friction: ${business_metrics['customer_friction']:,.2f}")
print(f"\n   💰 NET BENEFIT: ${business_metrics['net_benefit']:,.2f}")
print(f"   📈 ROI: {business_metrics['roi']:.1f}%")

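# The dataset covers two days and the test set is a 20% sample of it,
# so scale test-set figures by 1/0.2 before annualizing.
test_fraction = 0.2
annual_multiplier = (365 / 2) / test_fraction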
annual_benefit = business_metrics['net_benefit'] * annual_multiplier

print(f"\n📅 Annual Projection:")
print(f"    Annual Net Benefit: ${annual_benefit:,.2f}")
print(f"    Annual Savings: ${business_metrics['savings_vs_no_model'] * annual_multiplier:,.2f}")


In [None]:
final_model.save_model('fraud_detection_final.cbm')

importance_df.to_csv('feature_importance.csv', index=False)

metrics_summary = pd.DataFrame([test_metrics])
metrics_summary.to_csv('model_metrics.csv', index=False)

business_df = pd.DataFrame([business_metrics])
business_df.to_csv('business_impact.csv', index=False)