In [None]:
## PAYSIM FRAUD DATASET ######

In [None]:
"""
Overview:

This project focuses on fraud detection using three machine learning models: Random Forest, XGBoost,
and Multi-layer Perceptron (MLP). To ensure a fair comparison, a consistent data processing and evaluation
pipeline is implemented. The dataset's severe class imbalance is addressed using two sampling techniques,
SMOTE and Random Under-Sampling, both of which rebalance the training data toward the minority class.

Feature Engineering: (will be detailed in final report)
- The identifier columns nameOrig and nameDest, the categorical type column, and the step (time) column are dropped.
- A netAmount feature is added, computed as amount - oldbalanceOrg.


***Dataset Downsampling: To address computational and runtime limitations, a stratified sampling approach was
applied before training in some cases. This preserved the original class proportions while significantly reducing
the dataset size, enabling faster processing. The original dataset's large size caused system overheating
on my local machine, making it impractical to train models without downsampling. Attempts to use Colab led
to challenges such as timeouts and resource issues.

This approach allowed for efficient experimentation and testing, but I fully acknowledge that it is far from ideal.
Any findings or conclusions must be interpreted with this major limitation in mind.
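
The stratified downsampling described above can be sketched in plain Python. This is only an illustration under assumptions: the function name `stratified_downsample` and the toy labels are hypothetical, and the notebook itself does this step with pandas.

```python
import random
from collections import defaultdict

def stratified_downsample(labels, frac, seed=42):
    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    kept = []
    for label, indices in by_class.items():
        # Draw the same fraction from every class so proportions are preserved.
        n_keep = round(len(indices) * frac)
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)

# Hypothetical toy labels: 1,000 legitimate rows and 10 fraud rows.
labels = [0] * 1000 + [1] * 10
kept = stratified_downsample(labels, frac=0.40)
n_fraud = sum(labels[i] for i in kept)
print(len(kept), n_fraud)
```

Because each class is sampled separately, the fraud rate in the kept rows stays close to the original rate even though far fewer rows are processed.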


Avoiding Data Leakage:
- The test data is split off before any transformations or sampling to ensure unbiased evaluation.
- Pipeline Construction: All transformations (e.g., scaling, sampling) are fit only on training data during
cross-validation and model training, avoiding test set contamination.

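Model Pipeline (in the order the steps run in the code):
1. Sampling: Two techniques are tested:
   - SMOTE: Synthesizes new instances for the minority class.
   - Random Under-Sampling: Reduces the majority class size to improve balance.
2. Scaling: StandardScaler is applied to the resampled data.
3. Classification: Random Forest, XGBoost, or MLP.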
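
To make the two sampling ideas concrete, here is a minimal pure-Python sketch. It is not the imbalanced-learn implementation: `smote_like` and `random_undersample` are hypothetical helper names, the points are toy data, and the neighbour search is a simple sort rather than an optimized k-NN.

```python
import random

def smote_like(minority, n_new, k=2, seed=42):
    # Create n_new synthetic points by interpolating between a minority point
    # and one of its k nearest minority neighbours (the core idea of SMOTE).
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Sort the other points by squared Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

def random_undersample(majority, n_keep, seed=42):
    # Keep a random subset of the majority class and discard the rest.
    return random.Random(seed).sample(majority, n_keep)

# Hypothetical toy data: 3 minority points and 100 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
majority = [(float(i), float(-i)) for i in range(100)]

new_points = smote_like(minority, n_new=5)
kept_majority = random_undersample(majority, n_keep=len(minority))
print(len(new_points), len(kept_majority))
```

The contrast is the point: SMOTE adds interpolated minority rows and keeps every majority row, while under-sampling throws majority rows away.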

Parameter Tuning:
- GridSearchCV optimizes hyperparameters by testing every combination in the grid and selecting the best configuration based on mean cross-validated ROC AUC.
- The final model is refit on the full training set using the best hyperparameters for test evaluation.
- Metrics stored in `cv_results_` allow for detailed analysis.
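
As a rough illustration of what an exhaustive grid search costs, the sketch below enumerates the XGBoost grid used later in this notebook and counts the fits, assuming 5 folds and one final refit. It only counts candidates; it does not train anything.

```python
from itertools import product

# Same grid as the XGBoost cell in this notebook.
param_grid = {
    'classifier__n_estimators': [50, 100],
    'classifier__max_depth': [3, 5],
    'classifier__learning_rate': [0.1, 0.01],
}

# Every combination of values is one candidate configuration.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_folds = 5
cv_fits = len(candidates) * n_folds   # fits during cross-validation
total_fits = cv_fits + 1              # plus one refit on the full training set
print(len(candidates), cv_fits, total_fits)
```

Adding one more value to any hyperparameter list multiplies the number of candidates, which is why the grids here are kept small.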

Cross-Validation:
- Stratified K-Fold CV (5 folds for Random Forest and XGBoost, 3 folds for MLP) ensures consistent class distribution across folds.
- Primary metric: ROC AUC, which evaluates the model's ability to distinguish between classes.

Evaluation Metrics:
- Precision, recall, F1-score, and ROC AUC are used to assess performance. 
- Training and inference times are recorded to analyze computational efficiency.
- Stability is measured using the standard deviation (SD) of ROC AUC across folds.
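
The stability measure can be illustrated with a few lines of standard-library Python. The fold scores below are hypothetical, not taken from a real run, and the use of the population standard deviation is my understanding of what `std_test_score` reports rather than something verified here.

```python
from statistics import mean, pstdev

# Hypothetical per-fold ROC AUC scores for one candidate (not from a real run).
fold_scores = [0.9970, 0.9985, 0.9990, 0.9975, 0.9980]

cv_mean = mean(fold_scores)
# Population standard deviation (ddof=0), matching NumPy's default np.std.
cv_std = pstdev(fold_scores)
print(f"mean={cv_mean:.4f}, sd={cv_std:.4f}")
```

A small SD relative to the mean suggests the model's ranking quality does not depend much on which folds it was trained on.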

***************************************************************************************************************
****NOTE on Thresholding: XGBOOST AND MLP ONLY ******

After selecting the best model from GridSearchCV, threshold tuning is performed on the validation set.
The default threshold of 0.5 gave poor results (very low precision), which is typical with imbalanced datasets.
Predicted probabilities from the validation set are used to evaluate multiple thresholds, and the one
that gives the best balance of precision and recall (the largest precision + recall) is chosen.
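
A minimal sketch of threshold tuning, under assumptions: the labels and probabilities are made up, the helper names are hypothetical, and F1 is used as the balance criterion here, whereas the cells in this notebook maximise precision + recall.

```python
def precision_recall_at(y_true, scores, threshold):
    # Classify as fraud when the score reaches the threshold.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def pick_threshold(y_true, scores):
    # Try every distinct score as a candidate threshold and keep the one
    # with the highest F1 (harmonic mean of precision and recall).
    best_threshold, best_f1 = None, -1.0
    for threshold in sorted(set(scores)):
        p, r = precision_recall_at(y_true, scores, threshold)
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1

# Hypothetical validation labels and predicted fraud probabilities.
y_val = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
proba_val = [0.05, 0.10, 0.20, 0.30, 0.55, 0.60, 0.70, 0.80, 0.90, 0.40]

threshold, f1 = pick_threshold(y_val, proba_val)
print(threshold, round(f1, 3))
```

In this toy case the default cut-off of 0.5 flags two legitimate rows (precision 0.6), while the tuned threshold separates the classes cleanly. The threshold must be chosen on validation data only, so the test set stays untouched.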

Data Splitting:
- Dataset is divided into training (60%), validation (20%), and test (20%) sets:
   - Training set: Model building and hyperparameter tuning.
   - Validation set: Threshold tuning.
   - Test set: Unbiased performance evaluation.


**************************************************************************************************************
******** NOTE on RandomUndersampler: ********

In most runs, the use of RandomUnderSampler resulted in poor or unusable performance (for example, precision
of about 0.35 for Random Forest and recall of about 0.18 for MLP in the outputs below); XGBoost with threshold
tuning was the least affected. This is because of the significant loss of information from the majority class
during undersampling, which reduced dataset size and limited the model's ability to learn. The initial dataset
size reduction in some cases also contributed to this. This was a big lesson learned.
"""



In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.metrics import classification_report, roc_auc_score, make_scorer, accuracy_score, precision_score, recall_score, f1_score

# Load data
file_path = r'C:\Users\ssain\Downloads\archive (9)\PS_20174392719_1491204439457_log.csv'
df = pd.read_csv(file_path)
df

Unnamed: 0,step,type,amount,nameOrig,oldbalanceOrg,newbalanceOrig,nameDest,oldbalanceDest,newbalanceDest,isFraud,isFlaggedFraud
0,1,PAYMENT,9839.64,C1231006815,170136.00,160296.36,M1979787155,0.00,0.00,0,0
1,1,PAYMENT,1864.28,C1666544295,21249.00,19384.72,M2044282225,0.00,0.00,0,0
2,1,TRANSFER,181.00,C1305486145,181.00,0.00,C553264065,0.00,0.00,1,0
3,1,CASH_OUT,181.00,C840083671,181.00,0.00,C38997010,21182.00,0.00,1,0
4,1,PAYMENT,11668.14,C2048537720,41554.00,29885.86,M1230701703,0.00,0.00,0,0
...,...,...,...,...,...,...,...,...,...,...,...
6362615,743,CASH_OUT,339682.13,C786484425,339682.13,0.00,C776919290,0.00,339682.13,1,0
6362616,743,TRANSFER,6311409.28,C1529008245,6311409.28,0.00,C1881841831,0.00,0.00,1,0
6362617,743,CASH_OUT,6311409.28,C1162922333,6311409.28,0.00,C1365125890,68488.84,6379898.11,1,0
6362618,743,TRANSFER,850002.52,C1685995037,850002.52,0.00,C2080388513,0.00,0.00,1,0


In [None]:
########## XG BOOST ##################################################################

In [3]:
#FINAL  *************

import xgboost as xgb
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score, precision_recall_curve
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix


# Drop unnecessary columns that are not useful for the model
df = df.drop(columns=['nameOrig', 'nameDest', 'type', 'step'])

# Create a new feature 'netAmount' to represent the net transaction amount
df['netAmount'] = df['amount'] - df['oldbalanceOrg']

# Retain only numerical columns (int and float types) for modeling
df = df.select_dtypes(include=[int, float]).copy()

# Remove rows with missing values
df = df.dropna()

# Downsample the dataset to a more manageable size, preserving the class proportions (stratified by isFraud)
df_sampled = df.groupby('isFraud', group_keys=False).apply(lambda x: x.sample(frac=0.40, random_state=42))

# Separate features (X) and target variable (y)
X = df_sampled.drop(['isFraud'], axis=1)
y = df_sampled['isFraud']


# Split data into training (60%), validation (20%), and test (20%) sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, stratify=y_temp, random_state=42)

# Define samplers to compare
samplers = {
    'SMOTE': SMOTE(random_state=42),
    'RandomUnderSampler': RandomUnderSampler(random_state=42)
}

# Define hyperparameter grid for XGBoost
param_grid = {
    'classifier__n_estimators': [50, 100],  #number of trees
    'classifier__max_depth': [3, 5],   #max depth
    'classifier__learning_rate': [0.1, 0.01]
}

# dictionary to store results for each sampler
results = {}

# Iterate over each sampler (SMOTE and RandomUnderSampler)
for sampler_name, sampler in samplers.items():
    print(f"\nGetting results for: {sampler_name}...")

    # Pipeline with sampler, scaler, and classifier
    pipeline = Pipeline([
        ('sampler', sampler),                # Apply the current sampler (SMOTE or RandomUnderSampler)
        ('scaler', StandardScaler()),        # Standard scaling
        ('classifier', xgb.XGBClassifier(scale_pos_weight=len(y_train) / sum(y_train), random_state=42))
    ])

    # Perform Hyperparameter tuning using GridSearchCV with 5 fold CV
    grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='roc_auc', return_train_score=True)
    grid_search.fit(X_train, y_train)

    # Get probabilities for validation set
    proba_predictions_val = grid_search.predict_proba(X_val)[:, 1]
    
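    # Calculate precision-recall thresholds to find the optimal decision threshold.
    # precision and recall have one more entry than thresholds (the final point has
    # no threshold), so drop it before taking the argmax to avoid an IndexError.
    precision, recall, thresholds = precision_recall_curve(y_val, proba_predictions_val)
    optimal_idx = (precision[:-1] + recall[:-1]).argmax()
    optimal_threshold = thresholds[optimal_idx]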

    # Evaluate model on test set using the new threshold
    proba_predictions_test = grid_search.predict_proba(X_test)[:, 1]
    predictions_adjusted = (proba_predictions_test >= optimal_threshold).astype(int)

    # Record evaluation metrics
    evaluation_metrics_adjusted = {
        'Accuracy': accuracy_score(y_test, predictions_adjusted),
        'Precision': precision_score(y_test, predictions_adjusted, zero_division=1),
        'Recall': recall_score(y_test, predictions_adjusted, zero_division=1),
        'F1 Score': f1_score(y_test, predictions_adjusted, zero_division=1),
        'ROC AUC': roc_auc_score(y_test, proba_predictions_test)
    }

    # Cross-validation results for best model
    best_index = grid_search.best_index_
    cv_results = grid_search.cv_results_
    best_model_score = cv_results['mean_test_score'][best_index]
    best_model_score_std = cv_results['std_test_score'][best_index]
    best_model_fit_time = cv_results['mean_fit_time'][best_index]
    best_model_score_time = cv_results['mean_score_time'][best_index]

    # Store results, including stability and efficiency metrics
    results[sampler_name] = {
        'Evaluation Metrics': evaluation_metrics_adjusted,
        'Best Hyperparameters': grid_search.best_params_,
        'Best CV ROC AUC Score': best_model_score,
        'CV Score Std Dev': best_model_score_std,
        'Mean Training Time (s)': best_model_fit_time,
        'Mean Inference Time (s)': best_model_score_time
    }

    # Print confusion matrix
    cm = confusion_matrix(y_test, predictions_adjusted)
    print(f"Confusion Matrix for {sampler_name}:")
    print(cm)
    
# Print results
for sampler_name, metrics in results.items():
    print(f"\nResults for {sampler_name}:")
    print("Evaluation Metrics:", metrics['Evaluation Metrics'])
    print("Best Hyperparameters:", metrics['Best Hyperparameters'])
    print(f"Best CV ROC AUC Score: {metrics['Best CV ROC AUC Score']:.4f}")
    print(f"CV Score Standard Deviation: {metrics['CV Score Std Dev']:.4f}")
    print(f"Mean Training Time (s): {metrics['Mean Training Time (s)']:.4f}")
    print(f"Mean Inference Time (s): {metrics['Mean Inference Time (s)']:.4f}")



Getting results for: SMOTE...
Confusion Matrix for SMOTE:
[[508322     31]
 [   100    557]]

Getting results for: RandomUnderSampler...
Confusion Matrix for RandomUnderSampler:
[[508306     47]
 [    92    565]]

Results for SMOTE:
Evaluation Metrics: {'Accuracy': 0.999742637669201, 'Precision': 0.9472789115646258, 'Recall': 0.84779299847793, 'F1 Score': 0.8947791164658635, 'ROC AUC': 0.998314775880772}
Best Hyperparameters: {'classifier__learning_rate': 0.1, 'classifier__max_depth': 3, 'classifier__n_estimators': 100}
Best CV ROC AUC Score: 0.9983
CV Score Standard Deviation: 0.0012
Mean Training Time (s): 4.7013
Mean Inference Time (s): 0.1526

Results for RandomUnderSampler:
Evaluation Metrics: {'Accuracy': 0.9997269208856407, 'Precision': 0.923202614379085, 'Recall': 0.8599695585996956, 'F1 Score': 0.8904649330181245, 'ROC AUC': 0.9988582251751553}
Best Hyperparameters: {'classifier__learning_rate': 0.1, 'classifier__max_depth': 5, 'classifier__n_estimators': 100}
Best CV ROC AUC

In [None]:
#########################################################################################

In [None]:
### Random Forest ###################

In [4]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.metrics import confusion_matrix


# Load data
file_path = r'C:\Users\ssain\Downloads\archive (9)\PS_20174392719_1491204439457_log.csv'
df = pd.read_csv(file_path)



# Remove irrelevant or non-predictive features from the dataset
df = df.drop(columns=['nameOrig', 'nameDest', 'type', 'step'],errors='ignore')

# Create a new feature 'netAmount' to capture the net transaction amount
# Calculated as the transaction amount minus the original balance of the sender
df['netAmount'] = df['amount'] - df['oldbalanceOrg']

# Ensure that all columns are numerical
df = df.select_dtypes(include=[int, float]).copy()

# Handle missing values by removing rows with NaN
df = df.dropna()

# Sample 15% of the dataset for faster computation while maintaining the class distribution
df_sampled = df.groupby('isFraud', group_keys=False).apply(lambda x: x.sample(frac=0.15, random_state=42))

# Split data into features and target
X = df_sampled.drop(['isFraud'], axis=1)
y = df_sampled['isFraud']

# Train-test split with stratification to maintain class balance
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

# Define parameter grid for RandomForest
param_grid = {
    'classifier__n_estimators': [10, 50],  # Number of trees in the forest
    'classifier__max_depth': [5, 10]       # Maximum depth of each tree
}

# Define resampling methods and dictionary to store results
resampling_methods = {'SMOTE': SMOTE(random_state=42), 'RandomUnderSampler': RandomUnderSampler(random_state=42)}
results = {}

# Loop through resampling methods
for method_name, sampler in resampling_methods.items():
    # Build pipeline
    pipeline = Pipeline([
        ('resampler', sampler),                  # Resampling method (SMOTE or RandomUnderSampler)
        ('scaler', StandardScaler()),            # Scaling
        ('classifier', RandomForestClassifier(class_weight='balanced', random_state=42))  # RandomForest
    ])
    
    # Use 5-fold cross-validation and optimize for ROC AUC score
    grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='roc_auc', refit='roc_auc')
    grid_search.fit(X_train, y_train)

    # Best model cross-validation ROC AUC scores
    best_cv_results = grid_search.cv_results_
    best_index = grid_search.best_index_
    best_cv_roc_auc = best_cv_results['mean_test_score'][best_index]
    std_cv_roc_auc = best_cv_results['std_test_score'][best_index]
    
    # Calculate mean training and inference time from CV results
    mean_train_time = best_cv_results['mean_fit_time'][best_index]
    mean_inference_time = best_cv_results['mean_score_time'][best_index]
    
    # Predict and calculate probabilities for ROC AUC calculation
    predictions = grid_search.predict(X_test)
    proba_predictions = grid_search.predict_proba(X_test)[:, 1]
    
    # Compute confusion matrix
    cm = confusion_matrix(y_test, predictions)
    
    # Evaluation metrics
    evaluation_metrics = {
        'Accuracy': accuracy_score(y_test, predictions),
        'Precision': precision_score(y_test, predictions, zero_division=1),
        'Recall': recall_score(y_test, predictions, zero_division=1),
        'F1 Score': f1_score(y_test, predictions, zero_division=1),
        'ROC AUC': roc_auc_score(y_test, proba_predictions)
    }
    
    # Store results
    results[method_name] = {
        'Best Parameters': grid_search.best_params_,
        'Best CV ROC AUC (Mean)': best_cv_roc_auc,
        'Best CV ROC AUC (SD)': std_cv_roc_auc,
        'Mean Train Time (CV)': mean_train_time,
        'Mean Inference Time (CV)': mean_inference_time,
        'Evaluation Metrics': evaluation_metrics,
        'Confusion Matrix':cm
    }

# Print results
for method, result in results.items():
    print(f"Results for {method}:")
    print("Best Parameters:", result['Best Parameters'])
    print("Best CV ROC AUC (SD):", result['Best CV ROC AUC (SD)'])
    print("Mean Train Time (CV):", result['Mean Train Time (CV)'])
    print("Mean Inference Time (CV):", result['Mean Inference Time (CV)'])
    print("Evaluation Metrics on Test Data:", result['Evaluation Metrics'])
    print("Confusion Matrix:\n", result['Confusion Matrix'])
    print("\n")


Results for SMOTE:
Best Parameters: {'classifier__max_depth': 10, 'classifier__n_estimators': 10}
Best CV ROC AUC (SD): 0.0010177553604649844
Mean Train Time (CV): 8.382117509841919
Mean Inference Time (CV): 0.06881694793701172
Evaluation Metrics on Test Data: {'Accuracy': 0.9998899826591715, 'Precision': 0.9310344827586207, 'Recall': 0.9878048780487805, 'F1 Score': 0.9585798816568047, 'ROC AUC': 0.9930865756229599}
Confusion Matrix:
 [[190615     18]
 [     3    243]]


Results for RandomUnderSampler:
Best Parameters: {'classifier__max_depth': 5, 'classifier__n_estimators': 50}
Best CV ROC AUC (SD): 0.0009538827175289722
Mean Train Time (CV): 0.15845742225646972
Mean Inference Time (CV): 0.20468721389770508
Evaluation Metrics on Test Data: {'Accuracy': 0.997652963395659, 'Precision': 0.35319767441860467, 'Recall': 0.9878048780487805, 'F1 Score': 0.5203426124197003, 'ROC AUC': 0.9909105667174133}
Confusion Matrix:
 [[190188    445]
 [     3    243]]
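
The test metrics above follow directly from the two confusion matrices. As a quick check, the sketch below recomputes them from the printed counts; the helper name `metrics_from_confusion` is hypothetical.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    # Standard definitions for the positive (fraud) class.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return {'precision': precision, 'recall': recall, 'f1': f1, 'accuracy': accuracy}

# Counts copied from the two confusion matrices printed above
# (rows are true classes, columns are predicted classes).
smote_metrics = metrics_from_confusion(tn=190615, fp=18, fn=3, tp=243)
rus_metrics = metrics_from_confusion(tn=190188, fp=445, fn=3, tp=243)

for name, m in (('SMOTE', smote_metrics), ('RandomUnderSampler', rus_metrics)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

Both runs catch the same 243 of 246 fraud cases; the difference is the 445 false positives under RandomUnderSampler versus 18 under SMOTE, which is what drives precision down to about 0.35 while accuracy barely moves.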




In [None]:
##########################################################################

In [None]:
# MLP ####################################################################

# This Multi-Layer Perceptron (MLP) is a feedforward neural network designed for binary classification.
# It consists of an input layer, two hidden layers with ReLU activation, and an output layer with a
# sigmoid activation function. The hidden layers use a moderate number of neurons to balance model
# capacity and computational efficiency, and dropout layers are added to reduce overfitting. This design
# was chosen to capture relationships in the data while remaining computationally practical given
# resource constraints.


# Note: create_model is a factory function used by KerasClassifier to build new model instances.
# Although create_model is defined only once, KerasClassifier calls it multiple times during training,
# cross-validation, and hyperparameter search.
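
To give a feel for how small this network is, the sketch below counts trainable parameters for the layer sizes used by create_model in the next cell (neurons, then neurons // 2, then 1). The value of 7 input features is an assumption about what remains after preprocessing, and the helper names are hypothetical.

```python
def dense_params(n_inputs, n_units):
    # One weight per input-unit pair plus one bias per unit.
    return n_inputs * n_units + n_units

def mlp_param_count(n_features, neurons):
    # Layer widths: input -> neurons -> neurons // 2 -> 1 output unit.
    # Dropout layers have no trainable parameters.
    layer_sizes = [n_features, neurons, neurons // 2, 1]
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

# Assumption: 7 input features remain after preprocessing.
for neurons in (64, 128):
    print(neurons, mlp_param_count(n_features=7, neurons=neurons))
```

Doubling the first hidden layer from 64 to 128 units roughly triples or quadruples the parameter count, because the second hidden layer grows with it.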

In [5]:
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             roc_auc_score, precision_recall_curve, confusion_matrix)
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from scikeras.wrappers import KerasClassifier

# Load PaySim dataset
file_path = r'C:\Users\ssain\Downloads\archive (9)\PS_20174392719_1491204439457_log.csv'
df = pd.read_csv(file_path)

# Data Preparation
df = df.drop(columns=['nameOrig', 'nameDest', 'type', 'step'])
df['netAmount'] = df['amount'] - df['oldbalanceOrg']
df = df.select_dtypes(include=[int, float]).copy()
df = df.dropna()

# Sample the dataset to reduce computational cost
df_sampled = df.groupby('isFraud', group_keys=False).apply(lambda x: x.sample(frac=0.40, random_state=42))

# Split into features and target
X = df_sampled.drop(['isFraud'], axis=1)
y = df_sampled['isFraud']

# Train-validation-test split
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, stratify=y_temp, random_state=42)

# Define a function to create the MLP model
def create_model(neurons=32, dropout_rate=0.3, optimizer='adam'):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(X_train.shape[1],)),  # Input layer
        tf.keras.layers.Dense(neurons, activation='relu'),  # First hidden layer
        tf.keras.layers.Dropout(dropout_rate),             # Dropout for regularization
        tf.keras.layers.Dense(neurons // 2, activation='relu'),  # Second hidden layer
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(1, activation='sigmoid')     # Output layer for binary classification
    ])
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    return model

# Wrap the MLP model using KerasClassifier
model = KerasClassifier(model=create_model, verbose=0, random_state=42)

# Define hyperparameter grid for MLP
param_grid = {
    'mlp__model__neurons': [64, 128], #neurons in first hidden
    'mlp__model__dropout_rate': [0.3],
    'mlp__epochs': [5], # No of epochs
    'mlp__batch_size': [128],
    'mlp__model__optimizer': ['adam'] #optimizer
}

# Define stratified K-fold CrossVal
# Ensures class dist. preserved in every fold
stratified_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

# Define sampling methods for comparison
samplers = {
    'SMOTE': SMOTE(random_state=42),
    'RandomUnderSampler': RandomUnderSampler(random_state=42)
}

# Results dictionary
results = {}

# Loop through each sampling method
for sampler_name, sampler in samplers.items():
    print(f"\nGetting results for {sampler_name}...")

    # Define pipeline
    pipeline = Pipeline([
        ('sampler', sampler),
        ('scaler', StandardScaler()),  # Standard scaling
        ('mlp', model)                 # MLP as the model
    ])

    # GridSearchCV for hyperparameter tuning
    grid_search = GridSearchCV(
        estimator=pipeline,
        param_grid=param_grid,
        cv=stratified_cv,  # Use 3-fold cross-validation
        scoring='roc_auc',
        return_train_score=True,
        verbose=1
    )
    grid_search.fit(X_train, y_train)

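    # Predict probabilities on validation set for threshold tuning.
    # Drop the final precision/recall point, which has no matching threshold,
    # so the argmax always indexes a valid entry of thresholds.
    proba_val = grid_search.predict_proba(X_val)[:, 1]
    precision, recall, thresholds = precision_recall_curve(y_val, proba_val)
    optimal_idx = (precision[:-1] + recall[:-1]).argmax()
    optimal_threshold = thresholds[optimal_idx]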

    # Predict on test set with adjusted threshold
    proba_test = grid_search.predict_proba(X_test)[:, 1]
    adjusted_predictions = (proba_test >= optimal_threshold).astype(int)
    
    # Compute confusion matrix
    cm = confusion_matrix(y_test, adjusted_predictions)
    

    # Evaluation metrics
    evaluation_metrics = {
        'Accuracy': accuracy_score(y_test, adjusted_predictions),
        'Precision': precision_score(y_test, adjusted_predictions, zero_division=1),
        'Recall': recall_score(y_test, adjusted_predictions, zero_division=1),
        'F1 Score': f1_score(y_test, adjusted_predictions, zero_division=1),
        'ROC AUC': roc_auc_score(y_test, proba_test)
    }

    # Cross-validation metrics for the best model
    best_index = grid_search.best_index_
    cv_results = grid_search.cv_results_
    best_model_score = cv_results['mean_test_score'][best_index]
    best_model_score_std = cv_results['std_test_score'][best_index]
    best_model_fit_time = cv_results['mean_fit_time'][best_index]
    best_model_score_time = cv_results['mean_score_time'][best_index]

    # Store results
    results[sampler_name] = {
        'Evaluation Metrics': evaluation_metrics,
        'Confusion Matrix':cm,
        'Best Hyperparameters': grid_search.best_params_,
        'Best CV ROC AUC Score': best_model_score,
        'CV Score Std Dev': best_model_score_std,
        'Mean Training Time (s)': best_model_fit_time,
        'Mean Inference Time (s)': best_model_score_time
    }

# Display results
for sampler_name, metrics in results.items():
    print(f"\nResults for {sampler_name}:")
    print("Evaluation Metrics:", metrics['Evaluation Metrics'])
    print("Confusion Matrix:\n", metrics['Confusion Matrix'])
    print("Best Hyperparameters:", metrics['Best Hyperparameters'])
    print(f"Best CV ROC AUC Score: {metrics['Best CV ROC AUC Score']:.4f}")
    print(f"CV Score Standard Deviation ROC_AUC: {metrics['CV Score Std Dev']:.4f}")
    print(f"Mean Training Time (s): {metrics['Mean Training Time (s)']:.4f}")
    print(f"Mean Inference Time (s): {metrics['Mean Inference Time (s)']:.4f}")




Getting results for SMOTE...
Fitting 3 folds for each of 2 candidates, totalling 6 fits

Getting results for RandomUnderSampler...
Fitting 3 folds for each of 2 candidates, totalling 6 fits

Results for SMOTE:
Evaluation Metrics: {'Accuracy': 0.9994518771733365, 'Precision': 0.85, 'Recall': 0.6986301369863014, 'F1 Score': 0.7669172932330827, 'ROC AUC': 0.9970500714605185}
Confusion Matrix:
 [[508272     81]
 [   198    459]]
Best Hyperparameters: {'mlp__batch_size': 128, 'mlp__epochs': 5, 'mlp__model__dropout_rate': 0.3, 'mlp__model__neurons': 64, 'mlp__model__optimizer': 'adam'}
Best CV ROC AUC Score: 0.9946
CV Score Standard Deviation ROC_AUC: 0.0018
Mean Training Time (s): 69.5803
Mean Inference Time (s): 3.6113

Results for RandomUnderSampler:
Evaluation Metrics: {'Accuracy': 0.9989292941199583, 'Precision': 0.9745762711864406, 'Recall': 0.1750380517503805, 'F1 Score': 0.2967741935483871, 'ROC AUC': 0.9118502087984194}
Confusion Matrix:
 [[508350      3]
 [   542    115]]
Best Hyp