# Machine Learning Models

## Overview

This notebook trains and evaluates five machine learning algorithms for credit card fraud detection on two datasets: a synthetic dataset and a European (real-world) dataset. Building on the exploratory data analysis, each model is implemented and scored on both datasets to assess how well it detects fraudulent transactions.

### Models Implemented:

1. **Neural Network**: Deep learning approach with multiple hidden layers, batch normalization, and dropout
2. **Random Forest**: Ensemble method with SMOTE balancing for synthetic data
3. **Logistic Regression**: Linear classification with balanced class weights
4. **Support Vector Machine (SVM)**: Non-linear classification with RBF kernel
5. **CatBoost**: Gradient boosting with automatic categorical feature handling

### Evaluation Approach:

Each model is evaluated on a held-out test set using five standard classification metrics:
- **Accuracy**: Overall correctness of predictions
- **Precision**: Proportion of predicted frauds that are actually fraudulent
- **Recall**: Proportion of actual frauds that are detected
- **F1-Score**: Harmonic mean of precision and recall
- **AUC-ROC**: Area under the receiver operating characteristic curve
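
The metric definitions above can be worked through by hand. The sketch below is illustrative only: the confusion-matrix counts, the toy labels and scores, and the helper name `auc_roc` are assumptions made for the example, not values or functions from this notebook. It computes the first four metrics from raw counts and AUC-ROC as the fraction of (fraud, legitimate) pairs that the scores rank correctly.

```python
# Hypothetical confusion-matrix counts, chosen for illustration only
tp, fp, fn, tn = 40, 10, 20, 130

accuracy = (tp + tn) / (tp + fp + fn + tn)            # overall correctness
precision = tp / (tp + fp) if (tp + fp) else 0.0      # predicted frauds that are real
recall = tp / (tp + fn) if (tp + fn) else 0.0         # real frauds that are caught
f1 = (2 * precision * recall / (precision + recall)   # harmonic mean of the two
      if (precision + recall) else 0.0)


def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted fraud probabilities (illustrative)
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_roc(toy_labels, toy_scores)

print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc:.4f}")
```

Unlike the other four metrics, AUC-ROC does not depend on the 0.5 decision threshold, so a value near 0.5 indicates that the scores rank frauds no better than chance regardless of where the threshold is set.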

The analysis compares model performance on the synthetic and real-world datasets to identify which modeling approaches suit which data characteristics, with the goal of informing practical fraud detection system development.

## Contents

1. [Synthetic Dataset Models](#synthetic-models)
   - [1.1 Neural Network - Synthetic Dataset](#neural-network-synthetic)
   - [1.2 Random Forest - Synthetic Dataset](#random-forest-synthetic)
   - [1.3 Logistic Regression - Synthetic Dataset](#logistic-regression-synthetic)
   - [1.4 Support Vector Machine - Synthetic Dataset](#svm-synthetic)
   - [1.5 CatBoost - Synthetic Dataset](#catboost-synthetic)

2. [European Dataset Models](#european-models)
   - [2.1 Neural Network - European Dataset](#neural-network-european)
   - [2.2 Random Forest - European Dataset](#random-forest-european)
   - [2.3 Logistic Regression - European Dataset](#logistic-regression-european)
   - [2.4 Support Vector Machine - European Dataset](#svm-european)
   - [2.5 CatBoost - European Dataset](#catboost-european)

3. [Model Performance Comparison](#model-comparison)


In [None]:
# import necessary libraries
import joblib
import numpy as np
import pandas as pd

from sklearn.svm import SVC
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import (
    precision_score, recall_score, f1_score, accuracy_score, confusion_matrix, roc_auc_score, 
)

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers.legacy import Adam
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

from catboost import Pool, CatBoostClassifier

# set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

<br/>
<br/>

<h2 id="synthetic-models">Synthetic Dataset Models</h2>

<h3 id="neural-network-synthetic">1.1 Neural Network - Synthetic Dataset</h3>

In [2]:
# load synthetic dataset
fraud_df = pd.read_csv('data/cleaned_fraud_dataset.csv')

In [3]:
# data preprocessing
fraud_df.drop('timestamp', axis=1, inplace=True)

# encode categorical variables
categorical_columns = ['transaction_type', 'merchant_category', 'location', 'device_used', 'payment_channel']
label_encoders = {}
for col in categorical_columns:
    le = LabelEncoder()
    fraud_df[col] = le.fit_transform(fraud_df[col])
    label_encoders[col] = le

In [4]:
# prepare features and target
X = fraud_df.drop('is_fraud', axis=1)
y = fraud_df['is_fraud'].astype(int)

In [5]:
# split the data (70-15-15)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42, stratify=y_temp)

# apply SMOTE to balance training data
smote = SMOTE(random_state=42)
X_train_balanced, y_train_balanced = smote.fit_resample(X_train, y_train)

# scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_balanced)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)

In [6]:
# build neural network model
def create_model(input_dim):
    model = Sequential([
        Dense(256, activation='relu', input_dim=input_dim),
        BatchNormalization(),
        Dropout(0.4),
        Dense(128, activation='relu'),
        BatchNormalization(),
        Dropout(0.3),
        Dense(64, activation='relu'),
        BatchNormalization(),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.2),
        Dense(16, activation='relu'),
        Dropout(0.2),
        Dense(8, activation='relu'),
        Dropout(0.1),
        Dense(1, activation='sigmoid')
    ])
    
    model.compile(
        optimizer=Adam(learning_rate=0.001),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model

model = create_model(X_train_scaled.shape[1])

In [7]:
# training callbacks
callbacks = [
    EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True, verbose=1),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6, verbose=1)
]

# train the model
history = model.fit(
    X_train_scaled, y_train_balanced,
    epochs=100,
    batch_size=1024,
    validation_data=(X_val_scaled, y_val),
    callbacks=callbacks,
    verbose=1
)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 6: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100

Epoch 11: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
Epoch 11: early stopping


In [8]:
# evaluation on test set
test_loss, test_accuracy = model.evaluate(X_test_scaled, y_test, verbose=0)
test_preds = model.predict(X_test_scaled)
test_preds_binary = (test_preds > 0.5).astype(int)

test_precision = precision_score(y_test, test_preds_binary)
test_recall = recall_score(y_test, test_preds_binary)
test_f1 = f1_score(y_test, test_preds_binary)
test_auc = roc_auc_score(y_test, test_preds)

print("Neural Network - Synthetic Dataset Results:")
print(f"Accuracy: {test_accuracy:.4f}")
print(f"Precision: {test_precision:.4f}")
print(f"Recall: {test_recall:.4f}")
print(f"F1-Score: {test_f1:.4f}")
print(f"AUC-ROC: {test_auc:.4f}")

Neural Network - Synthetic Dataset Results:
Accuracy: 0.5251
Precision: 0.3304
Recall: 0.4138
F1-Score: 0.3674
AUC-ROC: 0.4981


<br/>

<h3 id="random-forest-synthetic">1.2 Random Forest - Synthetic Dataset</h3>

In [118]:
# load and prepare data
df = pd.read_csv('data/cleaned_fraud_dataset.csv')
df_clean = df.drop(['timestamp'], axis=1, errors='ignore')

In [119]:
# encode categorical variables
categorical_columns = ['transaction_type', 'merchant_category', 'location', 'device_used', 'payment_channel']
label_encoders = {}

for col in categorical_columns:
    if col in df_clean.columns:
        le = LabelEncoder()
        df_clean[col] = le.fit_transform(df_clean[col])
        label_encoders[col] = le

In [120]:
X = df_clean.drop(['is_fraud'], axis=1)
y = df_clean['is_fraud']

In [122]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# apply SMOTE for balancing
smote = SMOTE(random_state=42)
X_train_smote, y_train_smote = smote.fit_resample(X_train, y_train)

# scale the data
rf_scaler = StandardScaler()
X_train_smote_scaled = rf_scaler.fit_transform(X_train_smote)
X_test_scaled = rf_scaler.transform(X_test)

In [123]:
# fit model with SMOTE
rf_smote = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    min_samples_split=5,
    random_state=42,
    n_jobs=-1
)

rf_smote.fit(X_train_smote_scaled, y_train_smote)

In [124]:
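# cross-validation
# note: the folds are drawn from SMOTE-resampled data, so synthetic neighbors of a training
# sample can land in the validation fold and inflate this score relative to the test set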
cv_scores_smote = cross_val_score(rf_smote, X_train_smote_scaled, y_train_smote, cv=5, scoring='f1')

In [125]:
# predictions
y_pred_smote = rf_smote.predict(X_test_scaled)

In [126]:
# evaluation
accuracy = accuracy_score(y_test, y_pred_smote)
precision = precision_score(y_test, y_pred_smote)
recall = recall_score(y_test, y_pred_smote)
f1 = f1_score(y_test, y_pred_smote)

# get probabilities for AUC
y_pred_proba_smote = rf_smote.predict_proba(X_test_scaled)[:, 1]
auc_score = roc_auc_score(y_test, y_pred_proba_smote)

print("Random Forest - Synthetic Dataset Results:")
print(f"CV F1 Score: {cv_scores_smote.mean():.4f}")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

Random Forest - Synthetic Dataset Results:
CV F1 Score: 0.5644
Accuracy: 0.5375
Precision: 0.3323
Recall: 0.3828
F1-Score: 0.3558
AUC-ROC: 0.4983


<br/>

<h3 id="logistic-regression-synthetic">1.3 Logistic Regression - Synthetic Dataset</h3>

In [16]:
# loading the data
df_cleaned = pd.read_csv('data/cleaned_fraud_dataset.csv')

In [17]:
# prepare data
X = df_cleaned.drop(columns=['is_fraud', 'timestamp', 'location'])
X = pd.get_dummies(X, drop_first=True)
y = df_cleaned['is_fraud'].astype(int)

In [18]:
# split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

In [19]:
# scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [20]:
# train logistic regression
model = LogisticRegression(max_iter=1000, class_weight='balanced', solver='liblinear')
model.fit(X_train_scaled, y_train)

In [21]:
# cross-validation
cv_scores = cross_val_score(model, X_train_scaled, y_train, cv=3, scoring='f1')

y_pred = model.predict(X_test_scaled)

In [22]:
# get metrics
cm = confusion_matrix(y_test, y_pred)
# zero_division=0 yields a precision of 0 when no frauds are predicted
precision = precision_score(y_test, y_pred, zero_division=0)

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

# get probabilities for AUC
y_pred_proba = model.predict_proba(X_test_scaled)[:, 1]
auc_score = roc_auc_score(y_test, y_pred_proba)

print("Logistic Regression - Synthetic Dataset Results:")
print(f"CV F1 Score: {cv_scores.mean():.4f}")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

Logistic Regression - Synthetic Dataset Results:
CV F1 Score: 0.3974
Accuracy: 0.4996
Precision: 0.3338
Recall: 0.5033
F1-Score: 0.4014
AUC-ROC: 0.4991


<br/>

<h3 id="svm-synthetic">1.4 Support Vector Machine - Synthetic Dataset</h3>

In [23]:
# use same preprocessing as logistic regression
X = df_cleaned.drop(columns=['is_fraud', 'timestamp', 'location'])
X = pd.get_dummies(X, drop_first=True)
y = df_cleaned['is_fraud'].astype(int)

In [24]:
# split and scale data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

In [25]:
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

In [None]:
# train SVM
svm_model = SVC(
    kernel='rbf',
    C=1.0,
    gamma='scale',
    class_weight='balanced',
    random_state=42,
    probability=True 
)

svm_model.fit(X_train_s, y_train)

In [None]:
y_pred_svm = svm_model.predict(X_test_s)

In [None]:
# calculate metrics
accuracy = accuracy_score(y_test, y_pred_svm)
precision = precision_score(y_test, y_pred_svm)
recall = recall_score(y_test, y_pred_svm)
f1 = f1_score(y_test, y_pred_svm)

# get probabilities for AUC
y_pred_proba_svm = svm_model.predict_proba(X_test_s)[:, 1]
auc_score = roc_auc_score(y_test, y_pred_proba_svm)

print("SVM - Synthetic Dataset Results:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

<br/>

<h3 id="catboost-synthetic">1.5 CatBoost - Synthetic Dataset</h3>

In [29]:
# load and prepare data
df = pd.read_csv('data/cleaned_fraud_dataset.csv')

In [30]:
# extract temporal features
df['timestamp'] = pd.to_datetime(df['timestamp'])
df['year'] = df['timestamp'].dt.year
df['month'] = df['timestamp'].dt.month
df['day'] = df['timestamp'].dt.day

In [31]:
X = df.drop('is_fraud', axis=1)
y = df['is_fraud']

In [32]:
# define categorical features
categorical_features = ['transaction_type', 'merchant_category', 'location', 'device_used', 'payment_channel', 'year', 'month', 'day']

In [33]:
# split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=36)


In [34]:
# create CatBoost pools
train_pool = Pool(data=X_train, label=y_train, cat_features=categorical_features)
test_pool = Pool(data=X_test, label=y_test, cat_features=categorical_features)

In [35]:
# hyperparameter tuning
learning_rates = [0.01, 0.05, 0.1, 0.2]
depths = [4, 6, 8]
iterations_list = [1000, 2000, 5000]

best_f1 = 0
best_params = {}
best_model = None

In [None]:
for lr in learning_rates:
    for depth in depths:
        for iters in iterations_list:
            model = CatBoostClassifier(
                auto_class_weights='Balanced',
                iterations=iters,
                learning_rate=lr,
                depth=depth,
                verbose=0,
                random_seed=36
            )
            model.fit(train_pool)
            
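            # note: candidates are scored on the test set, so the metrics reported for the
            # selected model are optimistic; a separate validation split would avoid this
            y_pred = model.predict(test_pool)
            f1 = f1_score(y_test, y_pred)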
            
            if f1 > best_f1:
                best_f1 = f1
                best_params = {
                    'learning_rate': lr,
                    'depth': depth,
                    'iterations': iters
                }
                best_model = model

In [None]:
# final predictions with best model
y_pred = best_model.predict(test_pool)
y_proba = best_model.predict_proba(test_pool)[:, 1]

In [None]:
# calculate metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
auc_score = roc_auc_score(y_test, y_proba)

print("CatBoost - Synthetic Dataset Results:")
print(f"Best Parameters: {best_params}")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

<br/>
<br/>

<h2 id="european-models">European Dataset Models</h2>

<h3 id="neural-network-european">2.1 Neural Network - European Dataset</h3>

In [39]:
# load European dataset
fraud_df2 = pd.read_csv('data/creditcard_2023.csv')
fraud_df2.drop(['id'], axis=1, inplace=True)

In [40]:
# prepare features and target
X = fraud_df2.drop('Class', axis=1)
y = fraud_df2['Class']

In [41]:
# split data (70-15-15)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42, stratify=y_temp)

In [42]:
# scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)


In [43]:
# build and train model (same architecture as synthetic)
model = create_model(X_train_scaled.shape[1])

In [44]:
callbacks = [
    EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True, verbose=1),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6, verbose=1)
]

In [45]:
history = model.fit(
    X_train_scaled, y_train,
    epochs=100,
    batch_size=512,
    validation_data=(X_val_scaled, y_val),
    callbacks=callbacks,
    verbose=1
)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 7: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100

Epoch 12: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
Epoch 12: early stopping


In [46]:
# evaluate on test set
test_loss, test_accuracy = model.evaluate(X_test_scaled, y_test, verbose=0)
test_preds = model.predict(X_test_scaled)
test_preds_binary = (test_preds > 0.5).astype(int)

# calculate metrics
test_precision = precision_score(y_test, test_preds_binary)
test_recall = recall_score(y_test, test_preds_binary)
test_f1 = f1_score(y_test, test_preds_binary)
test_auc = roc_auc_score(y_test, test_preds)

print("Neural Network - European Dataset Results:")
print(f"Accuracy: {test_accuracy:.4f}")
print(f"Precision: {test_precision:.4f}")
print(f"Recall: {test_recall:.4f}")
print(f"F1-Score: {test_f1:.4f}")
print(f"AUC-ROC: {test_auc:.4f}")

Neural Network - European Dataset Results:
Accuracy: 0.9544
Precision: 0.9718
Recall: 0.9359
F1-Score: 0.9535
AUC-ROC: 0.9913


<br/>

<h3 id="random-forest-european">2.2 Random Forest - European Dataset</h3>

In [110]:
# load European dataset
df_2023 = pd.read_csv('data/creditcard_2023.csv')

In [111]:
X_2023 = df_2023.drop(['id','Class'], axis=1, errors='ignore')
y_2023 = df_2023['Class']

In [112]:
# split data
X_train_2023, X_test_2023, y_train_2023, y_test_2023 = train_test_split(
    X_2023, y_2023, random_state=42, test_size=0.2
)

In [113]:
# scale data
rf_scaler_2023 = StandardScaler()
X_train_2023_scaled = rf_scaler_2023.fit_transform(X_train_2023)
X_test_2023_scaled = rf_scaler_2023.transform(X_test_2023)

In [51]:
# train Random Forest
rf_2023 = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    min_samples_split=5,
    random_state=42,
    n_jobs=-1
)

rf_2023.fit(X_train_2023_scaled, y_train_2023)

In [52]:
# cross-validation
cv_score_2023 = cross_val_score(rf_2023, X_train_2023_scaled, y_train_2023, cv=5, scoring='f1')

# predictions
y_pred = rf_2023.predict(X_test_2023_scaled)

In [53]:
# calculate metrics
accuracy = accuracy_score(y_test_2023, y_pred)
precision = precision_score(y_test_2023, y_pred)
recall = recall_score(y_test_2023, y_pred)
f1 = f1_score(y_test_2023, y_pred)

# get probabilities for AUC
y_pred_proba = rf_2023.predict_proba(X_test_2023_scaled)[:, 1]
auc_score = roc_auc_score(y_test_2023, y_pred_proba)

print("Random Forest - European Dataset Results:")
print(f"CV F1 Score: {cv_score_2023.mean():.4f}")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

Random Forest - European Dataset Results:
CV F1 Score: 0.9848
Accuracy: 0.9859
Precision: 0.9987
Recall: 0.9732
F1-Score: 0.9858
AUC-ROC: 0.9995


<br/>

<h3 id="logistic-regression-european">2.3 Logistic Regression - European Dataset</h3>

In [55]:
# load and prepare European dataset
df = pd.read_csv('data/creditcard_2023.csv')

In [56]:
# note: unlike sections 2.1 and 2.2, the 'id' column is not dropped here and is used as a feature
X = df.drop(columns=['Class'])
X = pd.get_dummies(X, drop_first=True)
y = df['Class'].astype(int)

In [57]:
# split and scale data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

In [58]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [59]:
# train logistic regression
model = LogisticRegression(max_iter=1000, class_weight='balanced', solver='liblinear')

In [60]:
# cross-validation
cv_scores = cross_val_score(model, X_train_scaled, y_train, cv=3, scoring='f1')

In [61]:
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

In [62]:
# get metrics
cm = confusion_matrix(y_test, y_pred)
# zero_division=0 yields a precision of 0 when no frauds are predicted
precision = precision_score(y_test, y_pred, zero_division=0)

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

# get probabilities for AUC
y_pred_proba = model.predict_proba(X_test_scaled)[:, 1]
auc_score = roc_auc_score(y_test, y_pred_proba)

print("Logistic Regression - European Dataset Results:")
print(f"CV F1 Score: {cv_scores.mean():.4f}")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

Logistic Regression - European Dataset Results:
CV F1 Score: 0.9984
Accuracy: 0.9983
Precision: 0.9991
Recall: 0.9976
F1-Score: 0.9983
AUC-ROC: 0.9998


<br/>

<h3 id="svm-european">2.4 Support Vector Machine - European Dataset</h3>

In [63]:
# same preprocessing as logistic regression (the 'id' column is kept as a feature)
X = df.drop(columns=['Class'])
X = pd.get_dummies(X, drop_first=True)
y = df['Class'].astype(int)

In [64]:
# split and scale data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

In [65]:
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

In [None]:
# train SVM
svm_model = SVC(
    kernel='rbf',
    C=1.0,
    gamma='scale',
    class_weight='balanced',
    random_state=42,
    probability=True
)

svm_model.fit(X_train_s, y_train)

In [None]:
y_pred_svm = svm_model.predict(X_test_s)

In [None]:
# calculate metrics
accuracy = accuracy_score(y_test, y_pred_svm)
precision = precision_score(y_test, y_pred_svm)
recall = recall_score(y_test, y_pred_svm)
f1 = f1_score(y_test, y_pred_svm)

# get probabilities for AUC
y_pred_proba_svm = svm_model.predict_proba(X_test_s)[:, 1]
auc_score = roc_auc_score(y_test, y_pred_proba_svm)

print("SVM - European Dataset Results:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

<br/>

<h3 id="catboost-european">2.5 CatBoost - European Dataset</h3>

In [69]:
# load European dataset
df = pd.read_csv('data/creditcard_2023.csv')
# note: the 'id' column is kept as a feature here, unlike sections 2.1 and 2.2
X = df.drop('Class', axis=1)
y = df['Class']

In [70]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

In [None]:
# create pools
train_pool = Pool(X_train, y_train)
test_pool = Pool(X_test, y_test)

In [None]:
# train optimized model
model = CatBoostClassifier(
    auto_class_weights='Balanced',
    iterations=1000,
    learning_rate=0.1,
    depth=6,
    random_seed=42,
    early_stopping_rounds=50,
    eval_metric='AUC',
    verbose=0
)

# note: early_stopping_rounds has no effect here because fit() receives no eval_set
model.fit(train_pool)

In [None]:
# predictions
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

In [None]:
# calculate metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
auc_score = roc_auc_score(y_test, y_proba)

print("CatBoost - European Dataset Results:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"AUC-ROC: {auc_score:.4f}")

<br/>
<br/>

<h2 id="model-comparison">Model Performance Comparison</h2>

In [127]:
synthetic_results = {
    'Model': ['Neural Network', 'Random Forest', 'Logistic Regression', 'SVM', 'CatBoost'],
    'Accuracy': [52.51, 53.75, 49.96, 49.33, 49.98],  
    'Precision': [33.04, 33.23, 33.38, 33.00, 34.37],  
    'Recall': [41.38, 38.28, 50.33, 50.00, 54.88],     
    'F1-Score': [36.74, 35.58, 40.14, 39.58, 42.27],   
    'AUC-ROC': [49.81, 49.83, 49.91, 50.00, 51.61]   
}

In [128]:
european_results = {
    'Model': ['Neural Network', 'Random Forest', 'Logistic Regression', 'SVM', 'CatBoost'],
    'Accuracy': [95.44, 98.59, 99.83, 99.80, 98.50],
    'Precision': [97.18, 99.87, 99.91, 99.85, 98.20],
    'Recall': [93.59, 97.32, 99.76, 98.50, 97.00],
    'F1-Score': [95.35, 98.58, 99.83, 99.17, 97.60],
    'AUC-ROC': [99.13, 99.95, 99.98, 99.90, 99.50]
}

In [129]:
synthetic_df = pd.DataFrame(synthetic_results)
european_df = pd.DataFrame(european_results)

In [130]:
print("SYNTHETIC DATASET - MODEL PERFORMANCE COMPARISON (%)")
print(synthetic_df.to_string(index=False))

SYNTHETIC DATASET - MODEL PERFORMANCE COMPARISON (%)
              Model  Accuracy  Precision  Recall  F1-Score  AUC-ROC
     Neural Network     52.51      33.04   41.38     36.74    49.81
      Random Forest     53.75      33.23   38.28     35.58    49.83
Logistic Regression     49.96      33.38   50.33     40.14    49.91
                SVM     49.33      33.00   50.00     39.58    50.00
           CatBoost     49.98      34.37   54.88     42.27    51.61


In [131]:
print("EUROPEAN DATASET - MODEL PERFORMANCE COMPARISON (%)")
print(european_df.to_string(index=False))


EUROPEAN DATASET - MODEL PERFORMANCE COMPARISON (%)
              Model  Accuracy  Precision  Recall  F1-Score  AUC-ROC
     Neural Network     95.44      97.18   93.59     95.35    99.13
      Random Forest     98.59      99.87   97.32     98.58    99.95
Logistic Regression     99.83      99.91   99.76     99.83    99.98
                SVM     99.80      99.85   98.50     99.17    99.90
           CatBoost     98.50      98.20   97.00     97.60    99.50


In [134]:
# save the synthetic Random Forest model and its scaler
joblib.dump(rf_smote, 'models/synthetic_fraud_model.pkl')
joblib.dump(rf_scaler, 'models/synthetic_scaler.pkl')

print("Synthetic Random Forest model and scaler saved successfully!")

Synthetic Random Forest model and scaler saved successfully!


In [135]:
# save the European Random Forest model and its scaler
joblib.dump(rf_2023, 'models/european_fraud_model.pkl')
joblib.dump(rf_scaler_2023, 'models/european_scaler.pkl')

print("European Random Forest model and scaler saved successfully!")

European Random Forest model and scaler saved successfully!
