# Financial Transaction Fraud Detection System
## Real-time Suspicious Activity Detection with ML Models

This notebook implements a comprehensive fraud detection system with:
- **3 ML Models**: Isolation Forest, Random Forest, XGBoost
- **Real-time Scoring**: Simulated stream-based transaction scoring
- **Visualization**: Transaction patterns, fraud distribution, model comparison
- **Explainability**: SHAP values for fraud reasons
- **Monitoring**: Drift detection and performance tracking

In [1]:
# 1. ENVIRONMENT SETUP & IMPORTS
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier, IsolationForest
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier
from sklearn.metrics import (confusion_matrix, classification_report, 
                             roc_auc_score, auc, roc_curve, precision_recall_curve)
import shap
import joblib
from datetime import datetime, timedelta
import json

# Set random seeds
np.random.seed(42)

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

print("✓ All imports successful")

ModuleNotFoundError: No module named 'shap'
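
The `ModuleNotFoundError` above means the import cell stopped at `import shap`, so the seed and display settings below it never ran. A minimal sketch of a pre-flight check using only the standard library; the helper name `missing_modules` is hypothetical and not part of the notebook:

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names used by the first cell; anything reported here needs installing first.
required = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn", "xgboost", "shap", "joblib"]
print("Missing packages:", missing_modules(required))
```

If `shap` is reported, installing it (for example with `pip install shap`) and re-running the first cell should clear the error.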

In [None]:
# 2. LOAD DATASET & QUICK SANITY CHECKS
file_path = r"c:\Users\ringa\OneDrive\Desktop\project\new\Copy of Sample_DATA.csv"
df = pd.read_csv(file_path)

print(f"Dataset Shape: {df.shape}")
print(f"\nFirst few rows:")
print(df.head())
print(f"\nColumn Data Types:")
print(df.dtypes)
print(f"\nMissing Values:")
print(df.isnull().sum())
print(f"\nFraud Class Distribution:")
print(df['fraud'].value_counts())
print(f"Fraud Rate: {df['fraud'].mean()*100:.2f}%")

In [None]:
# 3. DATA PREPROCESSING & CLEANING
df_clean = df.copy()

# Parse DateTime
df_clean['DateTime'] = pd.to_datetime(df_clean['Date'] + ' ' + df_clean['Time'], 
                                       format='%d/%m/%y %I:%M:%S %p', errors='coerce')

# Fill missing datetime with current time (assign back; chained inplace fillna may not update df_clean)
df_clean['DateTime'] = df_clean['DateTime'].fillna(pd.Timestamp.now())

# Drop unnecessary columns
drop_cols = ['Transaction_ID', 'Date', 'Time', 'Merchant_ID', 'Customer_ID', 'Device_ID', 'IP_Address']
df_clean = df_clean.drop(columns=drop_cols)

# Numeric columns - handle any non-numeric values
numeric_cols = ['Transaction_Amount_Deviation', 'Days_Since_Last_Transaction', 'amount', 'Transaction_Frequency']
for col in numeric_cols:
    df_clean[col] = pd.to_numeric(df_clean[col], errors='coerce')
    df_clean[col] = df_clean[col].fillna(df_clean[col].median())

# Categorical columns
categorical_cols = ['Transaction_Type', 'Payment_Gateway', 'Device_OS', 
                    'Merchant_Category', 'Transaction_Channel', 'Transaction_Status']

for col in categorical_cols:
    df_clean[col] = df_clean[col].fillna('Unknown')

print("✓ Data Cleaning Complete")
print(f"Dataset shape after cleaning: {df_clean.shape}")
print(f"\nData Info:")
print(df_clean.info())
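
The cleaning cell parses `Date` and `Time` with `errors='coerce'`, which turns any string that does not match the format into `NaT`. A small sketch of that behavior on made-up rows (the values are illustrative, not taken from the dataset), showing how to count the failures before they are filled:

```python
import pandas as pd

# Toy rows (made up for illustration): one valid date/time pair and one malformed date.
raw = pd.DataFrame({
    "Date": ["05/03/24", "not a date"],
    "Time": ["02:15:30 PM", "01:00:00 AM"],
})

parsed = pd.to_datetime(
    raw["Date"] + " " + raw["Time"],
    format="%d/%m/%y %I:%M:%S %p",  # day/month/2-digit year, 12-hour clock with AM/PM
    errors="coerce",                # unparseable strings become NaT instead of raising
)
print(parsed)
print("Unparseable rows:", int(parsed.isna().sum()))
```

Counting `NaT` values first makes it visible how many rows would receive the placeholder timestamp, which matters because that placeholder feeds the hour, day, and month features.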

In [None]:
# 4. FEATURE ENGINEERING
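# Sort chronologically so the positional split in step 5 is a true time-based split
df_features = df_clean.sort_values('DateTime').reset_index(drop=True)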

# Time-based features
df_features['hour'] = df_features['DateTime'].dt.hour
df_features['day_of_week'] = df_features['DateTime'].dt.dayofweek
df_features['day_of_month'] = df_features['DateTime'].dt.day
df_features['month'] = df_features['DateTime'].dt.month

# Risk indicators from existing features
df_features['is_high_amount'] = (df_features['amount'] > df_features['amount'].quantile(0.75)).astype(int)
df_features['is_low_frequency'] = (df_features['Transaction_Frequency'] <= 2).astype(int)
df_features['amount_deviation_high'] = (abs(df_features['Transaction_Amount_Deviation']) > 50).astype(int)
df_features['failed_status'] = (df_features['Transaction_Status'] == 'Failed').astype(int)
df_features['pending_status'] = (df_features['Transaction_Status'] == 'Pending').astype(int)

# Encode categorical variables
label_encoders = {}
for col in categorical_cols:
    le = LabelEncoder()
    df_features[col + '_encoded'] = le.fit_transform(df_features[col])
    label_encoders[col] = le

print("✓ Feature Engineering Complete")
print(f"Total Features: {df_features.shape[1]}")
print(f"\nNew engineered features:")
engineered_features = ['hour', 'day_of_week', 'day_of_month', 'month', 
                       'is_high_amount', 'is_low_frequency', 'amount_deviation_high',
                       'failed_status', 'pending_status']
print(engineered_features)
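
`LabelEncoder` assigns integer codes to the sorted unique values seen during `fit`, and its `transform` raises an error on a value it has not seen, which can happen when new transactions arrive. A minimal pure-Python sketch of the same idea with an explicit fallback for unseen categories; `fit_category_codes`, `encode_categories`, and the `-1` fallback code are illustrative assumptions, not part of the notebook or of scikit-learn:

```python
def fit_category_codes(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}

def encode_categories(values, codes, unknown_code=-1):
    """Encode values with a fitted mapping; unseen categories get unknown_code."""
    return [codes.get(value, unknown_code) for value in values]

train_gateways = ["UPI", "Card", "UPI", "NetBanking"]   # toy training values
codes = fit_category_codes(train_gateways)
print(codes)
print(encode_categories(["Card", "Wallet"], codes))     # "Wallet" was never seen
```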

In [None]:
# 5. PREPARE TRAINING DATA & SPLIT
# Select features for modeling
feature_cols = (numeric_cols + engineered_features + 
                [col + '_encoded' for col in categorical_cols])

X = df_features[feature_cols].fillna(0)
y = df_features['fraud']

# Time-based split by row position (avoids leakage only if rows are in chronological order)
split_point = int(len(X) * 0.7)
split_point_val = int(len(X) * 0.85)

X_train, y_train = X.iloc[:split_point], y.iloc[:split_point]
X_val, y_val = X.iloc[split_point:split_point_val], y.iloc[split_point:split_point_val]
X_test, y_test = X.iloc[split_point_val:], y.iloc[split_point_val:]

# Standardize features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)

print("✓ Data Split Complete")
print(f"Training set: {X_train.shape} | Fraud rate: {y_train.mean()*100:.2f}%")
print(f"Validation set: {X_val.shape} | Fraud rate: {y_val.mean()*100:.2f}%")
print(f"Test set: {X_test.shape} | Fraud rate: {y_test.mean()*100:.2f}%")
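
The split above slices rows by position, so it only separates past from future when the rows are ordered by time. A NumPy sketch of a split that orders by timestamp first; `chronological_split` is a hypothetical helper and the timestamps are toy values:

```python
import numpy as np

def chronological_split(timestamps, train_frac=0.70, val_frac=0.15):
    """Return (train, val, test) index arrays ordered by time, oldest first."""
    order = np.argsort(timestamps, kind="stable")   # positions sorted oldest to newest
    train_end = int(len(order) * train_frac)
    val_end = int(len(order) * (train_frac + val_frac))
    return order[:train_end], order[train_end:val_end], order[val_end:]

# Toy timestamps (hours since an arbitrary start), deliberately out of order.
timestamps = np.array([5, 1, 9, 3, 7, 2, 8, 4, 6, 0, 11, 10])
train_idx, val_idx, test_idx = chronological_split(timestamps)
print(len(train_idx), len(val_idx), len(test_idx))
```

Every training timestamp precedes every validation timestamp, and every validation timestamp precedes every test timestamp, which is the property the leakage argument relies on.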

In [None]:
# 6. EXPLORATORY DATA ANALYSIS & VISUALIZATIONS
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Transaction Amount Distribution
axes[0, 0].hist(df_features[df_features['fraud']==0]['amount'], bins=50, alpha=0.7, label='Legitimate', color='green')
axes[0, 0].hist(df_features[df_features['fraud']==1]['amount'], bins=50, alpha=0.7, label='Fraudulent', color='red')
axes[0, 0].set_xlabel('Transaction Amount')
axes[0, 0].set_ylabel('Frequency')
axes[0, 0].set_title('Transaction Amount Distribution by Fraud Status')
axes[0, 0].legend()

# Fraud Rate by Transaction Type
fraud_by_type = df_features.groupby('Transaction_Type')['fraud'].agg(['sum', 'count'])
fraud_by_type['rate'] = fraud_by_type['sum'] / fraud_by_type['count']
axes[0, 1].barh(fraud_by_type.index, fraud_by_type['rate'], color='coral')
axes[0, 1].set_xlabel('Fraud Rate')
axes[0, 1].set_title('Fraud Rate by Transaction Type')

# Transaction Status vs Fraud
status_fraud = pd.crosstab(df_features['Transaction_Status'], df_features['fraud'])
status_fraud.plot(kind='bar', ax=axes[1, 0], color=['green', 'red'])
axes[1, 0].set_title('Transaction Status vs Fraud')
axes[1, 0].set_xlabel('Status')
axes[1, 0].set_ylabel('Count')
axes[1, 0].legend(['Legitimate', 'Fraud'])

# Fraud by Hour of Day
fraud_by_hour = df_features.groupby('hour')['fraud'].mean()
axes[1, 1].plot(fraud_by_hour.index, fraud_by_hour.values, marker='o', color='purple')
axes[1, 1].set_xlabel('Hour of Day')
axes[1, 1].set_ylabel('Fraud Rate')
axes[1, 1].set_title('Fraud Rate by Hour of Day')
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig(r'c:\Users\ringa\OneDrive\Desktop\project\new\eda_visualizations.png', dpi=100, bbox_inches='tight')
plt.show()

print("✓ EDA Visualizations Complete")

In [None]:
# 7. CORRELATION HEATMAP
fig, ax = plt.subplots(figsize=(12, 8))
numeric_features = numeric_cols + engineered_features
correlation = df_features[numeric_features + ['fraud']].corr()
sns.heatmap(correlation, annot=True, fmt='.2f', cmap='coolwarm', center=0, ax=ax, cbar_kws={'label': 'Correlation'})
ax.set_title('Feature Correlation Matrix')
plt.tight_layout()
plt.savefig(r'c:\Users\ringa\OneDrive\Desktop\project\new\correlation_heatmap.png', dpi=100, bbox_inches='tight')
plt.show()

print("✓ Correlation Analysis Complete")

In [None]:
# 8. MODEL 1: ISOLATION FOREST (UNSUPERVISED ANOMALY DETECTION)
print("="*60)
print("MODEL 1: ISOLATION FOREST")
print("="*60)

iso_forest = IsolationForest(
    contamination=y_train.mean(),  # Set contamination to match fraud rate
    random_state=42,
    n_jobs=-1
)

# Train on full training data
iso_forest.fit(X_train_scaled)

# Predictions (1 for normal, -1 for anomaly)
y_train_iso = iso_forest.predict(X_train_scaled)
y_val_iso = iso_forest.predict(X_val_scaled)
y_test_iso = iso_forest.predict(X_test_scaled)

# Convert to binary (0 for normal, 1 for anomaly)
y_train_iso_pred = (y_train_iso == -1).astype(int)
y_val_iso_pred = (y_val_iso == -1).astype(int)
y_test_iso_pred = (y_test_iso == -1).astype(int)

# Get anomaly scores
train_scores = iso_forest.score_samples(X_train_scaled)
val_scores = iso_forest.score_samples(X_val_scaled)
test_scores = iso_forest.score_samples(X_test_scaled)

print(f"Isolation Forest - Test Set Metrics:")
print(f"Accuracy: {(y_test_iso_pred == y_test).mean():.4f}")
print(f"ROC-AUC: {roc_auc_score(y_test, -test_scores):.4f}")
print(f"Confusion Matrix:\n{confusion_matrix(y_test, y_test_iso_pred)}")
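
The `contamination` argument controls what share of training points the model labels as anomalies. A NumPy-only sketch of that thresholding step, under the assumption (an approximation of scikit-learn's behavior, not a quote of its source) that the cut-off is the `contamination` percentile of the training `score_samples` values and that scores below it are predicted as `-1`. The scores are made up:

```python
import numpy as np

# Made-up score_samples-style values: lower (more negative) means more anomalous.
train_scores = np.array([-0.70, -0.50, -0.45, -0.44, -0.43,
                         -0.42, -0.41, -0.40, -0.39, -0.38])
contamination = 0.10   # assumed share of anomalies, e.g. the training fraud rate

# Cut-off below which a point is treated as an anomaly.
offset = np.percentile(train_scores, 100.0 * contamination)
is_anomaly = (train_scores < offset).astype(int)

print(f"offset = {offset:.3f}")
print("flagged:", is_anomaly.tolist())
```

Because the cut-off is derived from the training scores, it generally differs from any fixed value chosen by hand.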

In [None]:
# 9. MODEL 2: RANDOM FOREST CLASSIFIER
print("\n" + "="*60)
print("MODEL 2: RANDOM FOREST CLASSIFIER")
print("="*60)

rf_model = RandomForestClassifier(
    n_estimators=100,
    max_depth=15,
    min_samples_split=10,
    class_weight='balanced',  # Handle class imbalance
    random_state=42,
    n_jobs=-1
)

# Train model
rf_model.fit(X_train_scaled, y_train)

# Predictions
y_train_rf_pred = rf_model.predict(X_train_scaled)
y_val_rf_pred = rf_model.predict(X_val_scaled)
y_test_rf_pred = rf_model.predict(X_test_scaled)

# Probabilities for scoring
y_train_rf_proba = rf_model.predict_proba(X_train_scaled)[:, 1]
y_val_rf_proba = rf_model.predict_proba(X_val_scaled)[:, 1]
y_test_rf_proba = rf_model.predict_proba(X_test_scaled)[:, 1]

print(f"Random Forest - Test Set Metrics:")
print(f"Accuracy: {(y_test_rf_pred == y_test).mean():.4f}")
print(f"ROC-AUC: {roc_auc_score(y_test, y_test_rf_proba):.4f}")
print(f"Confusion Matrix:\n{confusion_matrix(y_test, y_test_rf_pred)}")

# Feature Importance
feature_importance_rf = pd.DataFrame({
    'feature': feature_cols,
    'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)

print(f"\nTop 10 Most Important Features (Random Forest):")
print(feature_importance_rf.head(10))
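
`class_weight='balanced'` weights each class inversely to its frequency; the scikit-learn documentation gives the formula as `n_samples / (n_classes * np.bincount(y))`. A worked example on toy labels (the 10% fraud rate is made up):

```python
import numpy as np

y_toy = np.array([0] * 90 + [1] * 10)      # toy labels: 10% fraud

counts = np.bincount(y_toy)                 # samples per class: [90, 10]
weights = len(y_toy) / (len(counts) * counts)

for label, weight in enumerate(weights):
    print(f"class {label}: weight {weight:.3f}")
```

The rarer fraud class receives a weight nine times larger than the legitimate class, so errors on fraud cost more during training.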

In [None]:
# 10. MODEL 3: XGBOOST CLASSIFIER
print("\n" + "="*60)
print("MODEL 3: XGBOOST CLASSIFIER")
print("="*60)

xgb_model = XGBClassifier(
    n_estimators=100,
    max_depth=7,
    learning_rate=0.1,
    scale_pos_weight=len(y_train[y_train==0])/len(y_train[y_train==1]),  # Handle imbalance
    random_state=42,
    eval_metric='logloss',
    early_stopping_rounds=20,  # Constructor argument; XGBoost >= 2.0 no longer accepts it in fit()
    verbosity=0
)

# Train model (early stopping monitors log loss on the validation set)
xgb_model.fit(
    X_train_scaled, y_train,
    eval_set=[(X_val_scaled, y_val)],
    verbose=False
)

# Predictions
y_train_xgb_pred = xgb_model.predict(X_train_scaled)
y_val_xgb_pred = xgb_model.predict(X_val_scaled)
y_test_xgb_pred = xgb_model.predict(X_test_scaled)

# Probabilities for scoring
y_train_xgb_proba = xgb_model.predict_proba(X_train_scaled)[:, 1]
y_val_xgb_proba = xgb_model.predict_proba(X_val_scaled)[:, 1]
y_test_xgb_proba = xgb_model.predict_proba(X_test_scaled)[:, 1]

print(f"XGBoost - Test Set Metrics:")
print(f"Accuracy: {(y_test_xgb_pred == y_test).mean():.4f}")
print(f"ROC-AUC: {roc_auc_score(y_test, y_test_xgb_proba):.4f}")
print(f"Confusion Matrix:\n{confusion_matrix(y_test, y_test_xgb_pred)}")

# Feature Importance
feature_importance_xgb = pd.DataFrame({
    'feature': feature_cols,
    'importance': xgb_model.feature_importances_
}).sort_values('importance', ascending=False)

print(f"\nTop 10 Most Important Features (XGBoost):")
print(feature_importance_xgb.head(10))
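
The `scale_pos_weight` value passed to XGBoost above is the ratio of negative to positive training samples. A short sketch of that arithmetic on toy labels, with a guard for the case where the training slice contains no fraud at all (the guard is an addition for illustration, not something the notebook does):

```python
import numpy as np

y_toy = np.array([0] * 95 + [1] * 5)        # toy labels: 5% fraud

n_negative = int((y_toy == 0).sum())
n_positive = int((y_toy == 1).sum())
if n_positive == 0:
    raise ValueError("No positive (fraud) samples; scale_pos_weight is undefined")

scale_pos_weight = n_negative / n_positive
print(scale_pos_weight)
```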

In [None]:
# 11. MODEL COMPARISON & EVALUATION
print("\n" + "="*60)
print("MODEL PERFORMANCE COMPARISON")
print("="*60)

models_comparison = pd.DataFrame({
    'Model': ['Isolation Forest', 'Random Forest', 'XGBoost'],
    'Accuracy': [
        (y_test_iso_pred == y_test).mean(),
        (y_test_rf_pred == y_test).mean(),
        (y_test_xgb_pred == y_test).mean()
    ],
    'ROC-AUC': [
        roc_auc_score(y_test, -test_scores),
        roc_auc_score(y_test, y_test_rf_proba),
        roc_auc_score(y_test, y_test_xgb_proba)
    ]
})

print("\nModel Performance on Test Set:")
print(models_comparison.to_string(index=False))

# Plot ROC curves for all models
fpr_iso, tpr_iso, _ = roc_curve(y_test, -test_scores)
fpr_rf, tpr_rf, _ = roc_curve(y_test, y_test_rf_proba)
fpr_xgb, tpr_xgb, _ = roc_curve(y_test, y_test_xgb_proba)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# ROC Curves
ax1.plot(fpr_iso, tpr_iso, label=f"Isolation Forest (AUC={roc_auc_score(y_test, -test_scores):.3f})", linewidth=2)
ax1.plot(fpr_rf, tpr_rf, label=f"Random Forest (AUC={roc_auc_score(y_test, y_test_rf_proba):.3f})", linewidth=2)
ax1.plot(fpr_xgb, tpr_xgb, label=f"XGBoost (AUC={roc_auc_score(y_test, y_test_xgb_proba):.3f})", linewidth=2)
ax1.plot([0, 1], [0, 1], 'k--', label='Random Classifier', alpha=0.3)
ax1.set_xlabel('False Positive Rate')
ax1.set_ylabel('True Positive Rate')
ax1.set_title('ROC Curves - Model Comparison')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Feature Importance Comparison
top_features_rf = feature_importance_rf.head(8)
top_features_xgb = feature_importance_xgb.head(8)

x_pos = np.arange(len(top_features_rf))
width = 0.35

ax2.bar(x_pos - width/2, top_features_rf['importance'], width, label='Random Forest', alpha=0.8)
ax2.bar(x_pos + width/2, top_features_xgb.iloc[:len(top_features_rf)]['importance'], width, label='XGBoost', alpha=0.8)
ax2.set_xlabel('Features')
ax2.set_ylabel('Importance')
ax2.set_title('Top 8 Feature Importance Comparison')
ax2.set_xticks(x_pos)
ax2.set_xticklabels(top_features_rf['feature'], rotation=45, ha='right')
ax2.legend()

plt.tight_layout()
plt.savefig(r'c:\Users\ringa\OneDrive\Desktop\project\new\model_comparison.png', dpi=100, bbox_inches='tight')
plt.show()

print("✓ Model Comparison Complete")

In [None]:
# 12. ENSEMBLE VOTING & CONSENSUS RULES
print("\n" + "="*60)
print("ENSEMBLE VOTING STRATEGY")
print("="*60)

# Simple voting ensemble (majority voting)
ensemble_votes = (y_test_iso_pred + y_test_rf_pred + y_test_xgb_pred)
ensemble_pred = (ensemble_votes >= 2).astype(int)  # Flag if at least 2 models vote fraud

print(f"Ensemble (Voting) - Test Set Metrics:")
print(f"Accuracy: {(ensemble_pred == y_test).mean():.4f}")
print(f"ROC-AUC: {roc_auc_score(y_test, ensemble_votes):.4f}")
print(f"Confusion Matrix:\n{confusion_matrix(y_test, ensemble_pred)}")
print(f"Classification Report:\n{classification_report(y_test, ensemble_pred)}")

# Weighted average of probabilities (for probability-based models)
ensemble_proba_weighted = (0.3 * y_test_rf_proba + 0.7 * y_test_xgb_proba)  # Favor XGBoost
ensemble_pred_proba = (ensemble_proba_weighted >= 0.5).astype(int)

print(f"\nEnsemble (Weighted Probability) - Test Set Metrics:")
print(f"Accuracy: {(ensemble_pred_proba == y_test).mean():.4f}")
print(f"ROC-AUC: {roc_auc_score(y_test, ensemble_proba_weighted):.4f}")
print(f"Confusion Matrix:\n{confusion_matrix(y_test, ensemble_pred_proba)}")

In [None]:
# 13. REAL-TIME SCORING ENGINE & FLAGGING LOGIC
class RealTimeTransactionScorer:
    """Real-time fraud detection scoring engine with multiple models and thresholding"""

    def __init__(self, iso_forest, rf_model, xgb_model, scaler, threshold=0.5):
        self.iso_forest = iso_forest
        self.rf_model = rf_model
        self.xgb_model = xgb_model
        self.scaler = scaler
        self.threshold = threshold
        self.flagged_transactions = []
        
    def score_transaction(self, transaction_features):
        """Score a single transaction using ensemble approach"""
        # Scale features
        features_scaled = self.scaler.transform(transaction_features.reshape(1, -1))
        
        # Model 1: Isolation Forest (negated score, so higher means more anomalous)
        iso_score = -self.iso_forest.score_samples(features_scaled)[0]
        # Use the fitted contamination threshold so this matches the batch predictions in step 8
        iso_pred = int(self.iso_forest.predict(features_scaled)[0] == -1)
        
        # Model 2: Random Forest (probability)
        rf_proba = self.rf_model.predict_proba(features_scaled)[0, 1]
        rf_pred = (rf_proba >= self.threshold).astype(int)
        
        # Model 3: XGBoost (probability)
        xgb_proba = self.xgb_model.predict_proba(features_scaled)[0, 1]
        xgb_pred = (xgb_proba >= self.threshold).astype(int)
        
        # Ensemble: consensus voting
        ensemble_score = (iso_pred + rf_pred + xgb_pred) / 3  # Average of predictions
        ensemble_pred = int(ensemble_score >= 0.66)  # Flag if consensus is strong
        
        return {
            'iso_score': iso_score,
            'rf_proba': rf_proba,
            'xgb_proba': xgb_proba,
            'ensemble_score': ensemble_score,
            'is_flagged': ensemble_pred,
            'confidence': max(iso_score, rf_proba, xgb_proba)
        }
    
    def flag_suspicious_transaction(self, transaction_id, features, scores):
        """Log flagged transaction for review"""
        flag_record = {
            'timestamp': datetime.now(),
            'transaction_id': transaction_id,
            'scores': scores,
            'action': 'FLAG_FOR_REVIEW'
        }
        self.flagged_transactions.append(flag_record)
        return flag_record

# Initialize scorer with trained models
scorer = RealTimeTransactionScorer(
    iso_forest=iso_forest,
    rf_model=rf_model,
    xgb_model=xgb_model,
    scaler=scaler,
    threshold=0.5
)

print("✓ Real-Time Scoring Engine Initialized")

In [None]:
# 14. REAL-TIME STREAMING SIMULATION
print("\n" + "="*60)
print("REAL-TIME TRANSACTION STREAMING SIMULATION")
print("="*60)

# Simulate real-time scoring on test set
real_time_results = []

for idx, (i, row) in enumerate(X_test.iterrows()):
    features = row.values
    scores = scorer.score_transaction(features)
    
    result = {
        'transaction_idx': i,
        'actual_fraud': y_test.iloc[idx],
        'flagged': scores['is_flagged'],
        'iso_score': scores['iso_score'],
        'rf_proba': scores['rf_proba'],
        'xgb_proba': scores['xgb_proba'],
        'ensemble_score': scores['ensemble_score'],
        'confidence': scores['confidence']
    }
    real_time_results.append(result)
    
    if scores['is_flagged']:
        scorer.flag_suspicious_transaction(f"TXN_{i}", features, scores)

# Convert to DataFrame
real_time_df = pd.DataFrame(real_time_results)

print(f"✓ Processed {len(real_time_df)} transactions in real-time simulation")
print(f"Flagged Transactions: {real_time_df['flagged'].sum()}")
print(f"True Fraud Cases: {real_time_df['actual_fraud'].sum()}")
print(f"\nFlag Rate (share of transactions flagged): {real_time_df['flagged'].mean()*100:.2f}%")
print(f"Actual Fraud Rate: {real_time_df['actual_fraud'].mean()*100:.2f}%")

# Performance metrics
print(f"\nReal-time Performance on Test Set:")
print(f"Accuracy: {(real_time_df['flagged'] == real_time_df['actual_fraud']).mean():.4f}")
print(f"ROC-AUC: {roc_auc_score(real_time_df['actual_fraud'], real_time_df['ensemble_score']):.4f}")

In [None]:
# 15. FLAGGED TRANSACTIONS ANALYSIS
print("\n" + "="*60)
print("FLAGGED TRANSACTIONS ANALYSIS")
print("="*60)

# Filter flagged transactions
flagged_mask = real_time_df['flagged'] == 1

print(f"\nFlagged Transaction Summary:")
print(f"Total Flagged: {flagged_mask.sum()}")
print(f"True Positives (Correct Flags): {((real_time_df['flagged']==1) & (real_time_df['actual_fraud']==1)).sum()}")
print(f"False Positives (Incorrect Flags): {((real_time_df['flagged']==1) & (real_time_df['actual_fraud']==0)).sum()}")
print(f"False Negatives (Missed Fraud): {((real_time_df['flagged']==0) & (real_time_df['actual_fraud']==1)).sum()}")

# Show high-confidence flagged transactions
high_confidence_flagged = real_time_df[
    (real_time_df['flagged']==1) & (real_time_df['confidence'] > 0.7)
].sort_values('confidence', ascending=False)

print(f"\nTop 10 High-Confidence Flagged Transactions:")
print(high_confidence_flagged[['transaction_idx', 'actual_fraud', 'rf_proba', 'xgb_proba', 'confidence']].head(10))

# Visualize real-time detection
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Distribution of ensemble scores
axes[0, 0].hist(real_time_df[real_time_df['actual_fraud']==0]['ensemble_score'], bins=30, alpha=0.7, label='Legitimate', color='green')
axes[0, 0].hist(real_time_df[real_time_df['actual_fraud']==1]['ensemble_score'], bins=30, alpha=0.7, label='Fraudulent', color='red')
axes[0, 0].axvline(0.66, color='black', linestyle='--', label='Threshold')
axes[0, 0].set_xlabel('Ensemble Score')
axes[0, 0].set_ylabel('Frequency')
axes[0, 0].set_title('Distribution of Ensemble Fraud Scores')
axes[0, 0].legend()

# Model probability comparison
axes[0, 1].scatter(real_time_df[real_time_df['actual_fraud']==0]['rf_proba'], 
                   real_time_df[real_time_df['actual_fraud']==0]['xgb_proba'],
                   alpha=0.5, s=20, label='Legitimate', color='green')
axes[0, 1].scatter(real_time_df[real_time_df['actual_fraud']==1]['rf_proba'], 
                   real_time_df[real_time_df['actual_fraud']==1]['xgb_proba'],
                   alpha=0.5, s=20, label='Fraudulent', color='red')
axes[0, 1].set_xlabel('Random Forest Probability')
axes[0, 1].set_ylabel('XGBoost Probability')
axes[0, 1].set_title('Model Agreement - RF vs XGBoost')
axes[0, 1].legend()

# Confusion matrix heatmap
cm = confusion_matrix(real_time_df['actual_fraud'], real_time_df['flagged'])
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', ax=axes[1, 0], cbar=False)
axes[1, 0].set_xlabel('Predicted')
axes[1, 0].set_ylabel('Actual')
axes[1, 0].set_title('Confusion Matrix - Real-time Detection')

# Flagged vs Not Flagged
status = pd.DataFrame({
    'Status': ['Flagged', 'Not Flagged'],
    'Fraud Cases': [
        ((real_time_df['flagged']==1) & (real_time_df['actual_fraud']==1)).sum(),
        ((real_time_df['flagged']==0) & (real_time_df['actual_fraud']==1)).sum()
    ],
    'Legitimate Cases': [
        ((real_time_df['flagged']==1) & (real_time_df['actual_fraud']==0)).sum(),
        ((real_time_df['flagged']==0) & (real_time_df['actual_fraud']==0)).sum()
    ]
})
status.set_index('Status')[['Fraud Cases', 'Legitimate Cases']].plot(kind='bar', ax=axes[1, 1], color=['red', 'green'])
axes[1, 1].set_title('Flagging Results Breakdown')
axes[1, 1].set_ylabel('Count')
axes[1, 1].legend(title='Type')

plt.tight_layout()
plt.savefig(r'c:\Users\ringa\OneDrive\Desktop\project\new\realtime_flagging_analysis.png', dpi=100, bbox_inches='tight')
plt.show()

print("✓ Real-time Flagging Analysis Complete")
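
Accuracy alone can look strong on imbalanced data even when most fraud is missed, so the true positive, false positive, and false negative counts printed above are usually summarized as precision and recall. A pure-Python sketch with a hypothetical helper `flag_quality` and toy counts:

```python
def flag_quality(true_positives, false_positives, false_negatives):
    """Precision, recall and F1 for the flagged (fraud) class from raw counts."""
    flagged = true_positives + false_positives
    actual_fraud = true_positives + false_negatives
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actual_fraud if actual_fraud else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

# Toy counts: 8 frauds caught, 2 legitimate transactions flagged, 4 frauds missed.
precision, recall, f1 = flag_quality(8, 2, 4)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```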

In [None]:
# 16. EXPLAINABILITY - SHAP VALUES FOR XGBoost
print("\n" + "="*60)
print("MODEL EXPLAINABILITY - SHAP VALUES")
print("="*60)

# Create SHAP explainer for XGBoost model
explainer = shap.TreeExplainer(xgb_model)

# Calculate SHAP values for test set (sample for performance)
sample_size = min(100, len(X_test_scaled))
shap_values = explainer.shap_values(X_test_scaled[:sample_size])

# Summary plot
plt.figure(figsize=(10, 6))
shap.summary_plot(shap_values, X_test_scaled[:sample_size], feature_names=feature_cols, plot_type="bar", show=False)
plt.title("SHAP Feature Importance - XGBoost Model")
plt.tight_layout()
plt.savefig(r'c:\Users\ringa\OneDrive\Desktop\project\new\shap_feature_importance.png', dpi=100, bbox_inches='tight')
plt.show()

print("✓ SHAP Feature Importance Visualization Complete")

# Feature importance summary
feature_names_arr = np.array(feature_cols)
mean_abs_shap = np.abs(shap_values).mean(axis=0)
top_features = feature_names_arr[np.argsort(mean_abs_shap)[-10:]][::-1]

print("\nTop 10 Most Important Features (SHAP):")
for i, feat in enumerate(top_features, 1):
    print(f"{i}. {feat}")
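
The ranking above averages the absolute SHAP value of each feature across samples and sorts from largest to smallest. A NumPy sketch of that step on a made-up attribution matrix (the numbers are illustrative, not real SHAP output):

```python
import numpy as np

feature_names = np.array(["amount", "hour", "Transaction_Frequency"])
# Made-up per-sample attribution values: rows are transactions, columns are features.
toy_shap = np.array([
    [ 0.10, -0.50,  0.00],
    [-0.30,  0.40,  0.05],
])

mean_abs = np.abs(toy_shap).mean(axis=0)   # average magnitude per feature
ranking = np.argsort(mean_abs)[::-1]        # indices from most to least influential

for rank, idx in enumerate(ranking, 1):
    print(f"{rank}. {feature_names[idx]} (mean |SHAP| = {mean_abs[idx]:.3f})")
```

Taking the absolute value first matters: the `hour` column has contributions of opposite sign that would partly cancel in a plain mean.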

In [None]:
# 17. MODEL PERSISTENCE & SAVING
print("\n" + "="*60)
print("SAVING TRAINED MODELS")
print("="*60)

model_dir = r'c:\Users\ringa\OneDrive\Desktop\project\new\models'
import os
os.makedirs(model_dir, exist_ok=True)

# Save models
joblib.dump(iso_forest, os.path.join(model_dir, 'isolation_forest_model.pkl'))
joblib.dump(rf_model, os.path.join(model_dir, 'random_forest_model.pkl'))
joblib.dump(xgb_model, os.path.join(model_dir, 'xgboost_model.pkl'))
joblib.dump(scaler, os.path.join(model_dir, 'feature_scaler.pkl'))

# Save label encoders
joblib.dump(label_encoders, os.path.join(model_dir, 'label_encoders.pkl'))

# Save model metadata
metadata = {
    'models': ['isolation_forest', 'random_forest', 'xgboost'],
    'feature_columns': feature_cols,
    'feature_count': len(feature_cols),
    'training_date': datetime.now().isoformat(),
    'test_performance': {
        'isolation_forest_auc': float(roc_auc_score(y_test, -test_scores)),
        'random_forest_auc': float(roc_auc_score(y_test, y_test_rf_proba)),
        'xgboost_auc': float(roc_auc_score(y_test, y_test_xgb_proba))
    }
}

with open(os.path.join(model_dir, 'model_metadata.json'), 'w') as f:
    json.dump(metadata, f, indent=2)

print(f"✓ Models saved to {model_dir}")
print(f"Files saved:")
print(f"  - isolation_forest_model.pkl")
print(f"  - random_forest_model.pkl")
print(f"  - xgboost_model.pkl")
print(f"  - feature_scaler.pkl")
print(f"  - label_encoders.pkl")
print(f"  - model_metadata.json")
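
The saved metadata records `feature_columns`, which allows an inference service to confirm that incoming features arrive in the order used at training time. A standard-library sketch of such a check; `check_feature_order` is a hypothetical helper, and the temporary file stands in for `model_metadata.json`:

```python
import json
import os
import tempfile

def check_feature_order(metadata_path, incoming_columns):
    """Raise if incoming columns differ from the order recorded at training time."""
    with open(metadata_path, encoding="utf-8") as f:
        expected = json.load(f)["feature_columns"]
    if list(incoming_columns) != expected:
        raise ValueError(f"Feature mismatch: expected {expected}, got {list(incoming_columns)}")
    return True

# Toy round trip in a temporary directory (the real file would be model_metadata.json).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_metadata.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"feature_columns": ["amount", "hour"]}, f)
    ok = check_feature_order(path, ["amount", "hour"])
    try:
        check_feature_order(path, ["hour", "amount"])
        order_error = None
    except ValueError as exc:
        order_error = str(exc)

print(ok, order_error)
```

The check is worthwhile because the scorer passes a bare NumPy array to the scaler and models, so a reordered column would be scored silently rather than rejected.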

In [None]:
# 18. PRODUCTION INFERENCE EXAMPLE
print("\n" + "="*60)
print("PRODUCTION INFERENCE EXAMPLE")
print("="*60)

# Load saved models
iso_forest_prod = joblib.load(os.path.join(model_dir, 'isolation_forest_model.pkl'))
rf_model_prod = joblib.load(os.path.join(model_dir, 'random_forest_model.pkl'))
xgb_model_prod = joblib.load(os.path.join(model_dir, 'xgboost_model.pkl'))
scaler_prod = joblib.load(os.path.join(model_dir, 'feature_scaler.pkl'))

# Reinitialize scorer with loaded models
scorer_prod = RealTimeTransactionScorer(
    iso_forest=iso_forest_prod,
    rf_model=rf_model_prod,
    xgb_model=xgb_model_prod,
    scaler=scaler_prod,
    threshold=0.5
)

# Test inference on new transaction
test_transaction = X_test.iloc[0].values
scores = scorer_prod.score_transaction(test_transaction)

print(f"Test Transaction Scoring:")
print(f"  Isolation Forest Score: {scores['iso_score']:.4f}")
print(f"  Random Forest Probability: {scores['rf_proba']:.4f}")
print(f"  XGBoost Probability: {scores['xgb_proba']:.4f}")
print(f"  Ensemble Score: {scores['ensemble_score']:.4f}")
print(f"  Is Flagged: {'YES - SUSPICIOUS' if scores['is_flagged'] else 'NO - LEGITIMATE'}")
print(f"  Confidence: {scores['confidence']:.4f}")

print("\n✓ Production Inference Example Complete")

In [None]:
# 19. MONITORING & DRIFT DETECTION
print("\n" + "="*60)
print("MONITORING & DRIFT DETECTION")
print("="*60)

# Compare training vs test distributions
drift_metrics = {}

for col in numeric_cols:
    train_mean = X_train[col].mean()
    test_mean = X_test[col].mean()
    train_std = X_train[col].std()
    test_std = X_test[col].std()
    
    # Calculate drift using % change
    mean_drift = abs(test_mean - train_mean) / (abs(train_mean) + 0.001) * 100
    std_drift = abs(test_std - train_std) / (abs(train_std) + 0.001) * 100
    
    drift_metrics[col] = {
        'mean_drift_pct': mean_drift,
        'std_drift_pct': std_drift,
        'max_drift': max(mean_drift, std_drift)
    }

# Display high-drift features
drift_df = pd.DataFrame(drift_metrics).T
drift_df = drift_df.sort_values('max_drift', ascending=False)

print("\nFeature Drift Analysis (Train vs Test):")
print(drift_df[['mean_drift_pct', 'std_drift_pct', 'max_drift']])

# Alert on high drift
high_drift_features = drift_df[drift_df['max_drift'] > 20].index.tolist()
if high_drift_features:
    print(f"\n⚠️  HIGH DRIFT DETECTED in features: {high_drift_features}")
    print("   Recommendation: Retrain models with recent data")
else:
    print("\n✓ No significant data drift detected")

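# Model performance over time (test set in row order; chronological only if the data was time-sorted)
print("\n\nModel Performance Over Time (Test Set):")
window_size = len(real_time_df) // 5
for i in range(5):
    start_idx = i * window_size
    end_idx = (i+1) * window_size if i < 4 else len(real_time_df)
    
    window_data = real_time_df.iloc[start_idx:end_idx]
    accuracy = (window_data['flagged'] == window_data['actual_fraud']).mean()
    
    # ROC-AUC is undefined when a window contains only one class
    if window_data['actual_fraud'].nunique() < 2:
        print(f"  Window {i+1} (transactions {start_idx}-{end_idx}): AUC=n/a (single class), Accuracy={accuracy:.4f}")
        continue
    
    # Named window_auc to avoid shadowing sklearn.metrics.auc imported in step 1
    window_auc = roc_auc_score(window_data['actual_fraud'], window_data['ensemble_score'])
    print(f"  Window {i+1} (transactions {start_idx}-{end_idx}): AUC={window_auc:.4f}, Accuracy={accuracy:.4f}")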
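
The drift check above compares the mean and standard deviation of each numeric feature between two samples as a percent change, with a small constant in the denominator. A NumPy sketch of the same formula on toy data; `pct_drift` is a hypothetical helper, and `ddof=1` is used to match the sample standard deviation that pandas computes by default:

```python
import numpy as np

def pct_drift(reference, current, eps=0.001):
    """Percent change of mean and std between a reference and a current sample."""
    ref_mean, cur_mean = np.mean(reference), np.mean(current)
    ref_std, cur_std = np.std(reference, ddof=1), np.std(current, ddof=1)
    mean_drift = abs(cur_mean - ref_mean) / (abs(ref_mean) + eps) * 100
    std_drift = abs(cur_std - ref_std) / (abs(ref_std) + eps) * 100
    return mean_drift, std_drift

# Toy amounts: the current window is the reference shifted up by 25.
reference = np.array([80.0, 90.0, 100.0, 110.0, 120.0])
current = reference + 25.0
mean_drift, std_drift = pct_drift(reference, current)
print(f"mean drift = {mean_drift:.2f}%, std drift = {std_drift:.2f}%")
```

With the 20% alert level used in the cell, this shift would raise an alert on the mean while the spread is unchanged. One caveat of the percent-change form: for a feature whose reference mean is close to zero, small absolute shifts produce very large percentages.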

In [None]:
# 20. SUMMARY REPORT & RECOMMENDATIONS
print("\n" + "="*60)
print("FRAUD DETECTION SYSTEM - SUMMARY REPORT")
print("="*60)

summary_report = f"""
📊 DATASET OVERVIEW
{'='*50}
Total Transactions: {len(df_clean)}
Training Set: {len(X_train)} transactions
Validation Set: {len(X_val)} transactions
Test Set: {len(X_test)} transactions
Overall Fraud Rate: {y.mean()*100:.2f}%

🤖 MODEL PERFORMANCE (Test Set)
{'='*50}
Model 1: Isolation Forest
  - Accuracy: {(y_test_iso_pred == y_test).mean():.4f}
  - ROC-AUC: {roc_auc_score(y_test, -test_scores):.4f}
  - Type: Unsupervised Anomaly Detection

Model 2: Random Forest
  - Accuracy: {(y_test_rf_pred == y_test).mean():.4f}
  - ROC-AUC: {roc_auc_score(y_test, y_test_rf_proba):.4f}
  - Type: Supervised Tree-based

Model 3: XGBoost
  - Accuracy: {(y_test_xgb_pred == y_test).mean():.4f}
  - ROC-AUC: {roc_auc_score(y_test, y_test_xgb_proba):.4f}
  - Type: Supervised Gradient Boosting

🔀 ENSEMBLE MODEL PERFORMANCE
{'='*50}
Voting Ensemble:
  - Accuracy: {(ensemble_pred == y_test).mean():.4f}
  - ROC-AUC: {roc_auc_score(y_test, ensemble_votes):.4f}

Weighted Probability Ensemble:
  - Accuracy: {(ensemble_pred_proba == y_test).mean():.4f}
  - ROC-AUC: {roc_auc_score(y_test, ensemble_proba_weighted):.4f}

🚨 REAL-TIME PERFORMANCE
{'='*50}
Transactions Flagged: {real_time_df['flagged'].sum()} / {len(real_time_df)}
Flag Rate: {real_time_df['flagged'].mean()*100:.2f}%
True Positives: {((real_time_df['flagged']==1) & (real_time_df['actual_fraud']==1)).sum()}
False Positives: {((real_time_df['flagged']==1) & (real_time_df['actual_fraud']==0)).sum()}
False Negatives: {((real_time_df['flagged']==0) & (real_time_df['actual_fraud']==1)).sum()}

⭐ KEY FEATURES (Top 5)
{'='*50}
Random Forest: {', '.join(feature_importance_rf.head(5)['feature'].values)}
XGBoost: {', '.join(feature_importance_xgb.head(5)['feature'].values)}

💡 RECOMMENDATIONS
{'='*50}
1. Prefer the ensemble with the highest test ROC-AUC above for production (confirm on fresh data first)
2. Tune the decision threshold on the validation set; 0.5-0.6 is a starting range, not a verified optimum
3. Monitor for data drift monthly using reference distributions
4. Retrain models quarterly with new labeled data
5. Implement real-time alert system for flagged transactions
6. Review false positives weekly to improve whitelist rules
7. Consider transaction amount and merchant risk in manual review
8. Use SHAP values to explain fraud flags to stakeholders

📁 OUTPUT ARTIFACTS
{'='*50}
✓ Trained Models (saved in ./models/)
✓ EDA Visualizations
✓ Model Comparison Charts
✓ Real-time Flagging Analysis
✓ SHAP Feature Importance
✓ Model Metadata & Versioning
"""

print(summary_report)

# Save report
report_path = r'c:\Users\ringa\OneDrive\Desktop\project\new\fraud_detection_summary_report.txt'
with open(report_path, 'w', encoding='utf-8') as f:  # UTF-8 so the emoji headers write correctly on Windows
    f.write(summary_report)

print(f"\n✓ Report saved to {report_path}")