
# 💳 Credit Card Fraud Detection with Tuned XGBoost

This notebook demonstrates an end-to-end **fraud detection pipeline** using:
- Class imbalance handling
- Feature selection
- Bayesian hyperparameter optimization
- XGBoost classifier

The goal is to **maximize the F1-score** of the fraud class, not accuracy.



## 1. Imports & Setup


In [2]:

import pandas as pd
import numpy as np

from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import confusion_matrix, classification_report

from bayes_opt import BayesianOptimization



## 2. Load Dataset

> Dataset is not included due to size.
> Download `creditcard.csv` from Kaggle and place it in the project root.


In [3]:

df = pd.read_csv("creditcard.csv")

X = df.drop("Class", axis=1)
y = df["Class"]

print("Dataset shape:", df.shape)
print(y.value_counts())


Dataset shape: (284807, 31)
Class
0    284315
1       492
Name: count, dtype: int64
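
To see why accuracy is a poor target with these class counts, consider a hypothetical classifier that labels every transaction as legitimate. The sketch below uses only the counts printed above; the variable names are illustrative.

```python
# Class counts taken from the value_counts() output above.
n_legit = 284_315
n_fraud = 492

# Hypothetical "always predict legitimate" classifier:
# every legitimate row is a true negative, every fraud row a false negative.
tp, fp = 0, 0
tn, fn = n_legit, n_fraud

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
# F1 = 2*TP / (2*TP + FP + FN); the denominator is non-zero because FN > 0.
f1 = 2 * tp / (2 * tp + fp + fn)

print(f"accuracy={accuracy:.4f} recall={recall:.2f} f1={f1:.2f}")
```

Such a classifier would reach roughly 99.83% accuracy while catching no fraud at all, which is why the fraud-class F1-score is the more informative target.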



## 3. Train-Test Split & Class Imbalance Handling


In [4]:

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scale_pos_weight = len(y_train[y_train == 0]) / len(y_train[y_train == 1])
print("Scale pos weight:", scale_pos_weight)


Scale pos weight: 577.2868020304569
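
The ratio printed above is the number of legitimate training rows divided by the number of fraud training rows. As a sanity check, the sketch below recomputes it from counts implied elsewhere in this notebook: 284,315 legitimate and 492 fraud rows overall, minus the 56,864 and 98 rows that the test-set reports list as support. The helper name `neg_pos_ratio` is illustrative and not part of any library.

```python
from collections import Counter


def neg_pos_ratio(labels):
    """Return (# of 0 labels) / (# of 1 labels) for a binary label sequence."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    return counts[0] / counts[1]


# Training-split counts implied by the notebook's outputs (an assumption
# derived by subtraction, not read from the data directly).
n_train_legit = 284_315 - 56_864  # 227,451
n_train_fraud = 492 - 98          # 394

train_labels = [0] * n_train_legit + [1] * n_train_fraud
ratio = neg_pos_ratio(train_labels)
print(ratio)
```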



## 4. Baseline XGBoost Model


In [5]:

baseline_model = XGBClassifier(
    scale_pos_weight=scale_pos_weight,
    eval_metric="logloss"  # binary log loss ("mlogloss" is the multiclass metric)
)

baseline_model.fit(x_train, y_train)
y_pred_base = baseline_model.predict(x_test)

print("Baseline Model")
print(confusion_matrix(y_test, y_pred_base))
print(classification_report(y_test, y_pred_base))


Baseline Model
[[56863     1]
 [   17    81]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.99      0.83      0.90        98

    accuracy                           1.00     56962
   macro avg       0.99      0.91      0.95     56962
weighted avg       1.00      1.00      1.00     56962
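
The fraud-class row of the report can be derived directly from the confusion matrix. The sketch below is a minimal pure-Python illustration using the baseline matrix printed above; `fraud_metrics` is a hypothetical helper name, and the `[[TN, FP], [FN, TP]]` layout is the one scikit-learn's `confusion_matrix` uses for binary labels 0 and 1.

```python
def fraud_metrics(tn, fp, fn, tp):
    """Precision, recall and F1 for the positive (fraud) class."""
    precision = tp / (tp + fp)  # share of flagged transactions that are fraud
    recall = tp / (tp + fn)     # share of fraud transactions that were flagged
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Baseline confusion matrix from the output above: [[TN, FP], [FN, TP]].
(tn, fp), (fn, tp) = [[56863, 1], [17, 81]]

precision, recall, f1 = fraud_metrics(tn, fp, fn, tp)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The 17 missed frauds (false negatives) are what pull recall down to 0.83, even though only one legitimate transaction was flagged.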




## 5. Feature Selection (ANOVA F-test)


In [6]:

selector = SelectKBest(score_func=f_classif, k=10)

x_train_fs = selector.fit_transform(x_train, y_train)
x_test_fs = selector.transform(x_test)

selected_features = X.columns[selector.get_support()]
print("Selected Features:", list(selected_features))


Selected Features: ['V3', 'V4', 'V7', 'V10', 'V11', 'V12', 'V14', 'V16', 'V17', 'V18']
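
`f_classif` scores each feature with a one-way ANOVA F statistic: the variance between class means divided by the variance within classes. `SelectKBest` then keeps the `k` highest-scoring features. The pure-Python sketch below computes that statistic for a single feature on hypothetical toy values (not taken from the dataset), to show why a feature that separates the classes scores higher; `anova_f` is an illustrative name.

```python
from statistics import fmean


def anova_f(groups):
    """One-way ANOVA F statistic for one feature, given its values per class."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: how far each class mean is from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own class mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical toy features: values for class 0 and class 1.
f_separating = anova_f([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
f_overlapping = anova_f([[1.0, 2.0, 3.0], [1.5, 2.0, 2.5]])

print(f_separating, f_overlapping)
```

The F-test only measures a difference in class means for each feature on its own, so it cannot see interactions between features that a tree model might exploit.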



## 6. Bayesian Optimization for Hyperparameter Tuning


In [7]:

def xgb_evaluate(max_depth, learning_rate, n_estimators, scale_pos_weight):
    model = XGBClassifier(
        max_depth=int(max_depth),
        learning_rate=learning_rate,
        n_estimators=int(n_estimators),
        scale_pos_weight=scale_pos_weight,
        eval_metric="logloss"  # binary log loss ("mlogloss" is the multiclass metric)
    )
    
    scores = cross_val_score(
        model, x_train_fs, y_train,
        scoring="f1", cv=5, n_jobs=-1
    )
    return scores.mean()
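
Because the optimizer proposes real-valued parameters, `xgb_evaluate` casts `max_depth` and `n_estimators` with `int()`, which truncates toward zero. The sketch below illustrates the effect on the values from iteration 6 of the optimizer log, and contrasts it with `round()` as one possible alternative. It is an illustration only, not a change to the notebook's method.

```python
# Real-valued proposals copied from iteration 6 of the optimizer log.
proposed = {"max_depth": 5.8057794, "n_estimators": 103.13680}

# What the objective function actually trains with: int() drops the fraction.
truncated = {name: int(value) for name, value in proposed.items()}

# A possible alternative (assumption, not used in this notebook): round to nearest.
rounded = {name: round(value) for name, value in proposed.items()}

print("int():  ", truncated)
print("round():", rounded)
```

With bounds of `(2, 6)`, truncation means `max_depth=6` is reached only when the optimizer proposes exactly 6.0, so the deepest setting is effectively under-explored.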


In [9]:

optimizer = BayesianOptimization(
    f=xgb_evaluate,
    pbounds={
        "max_depth": (2, 6),
        "learning_rate": (0.01, 0.3),
        "n_estimators": (50, 150),
        "scale_pos_weight": (1, scale_pos_weight)
    },
    random_state=42
)

optimizer.maximize(init_points=5, n_iter=10)


|   iter    |  target   | max_depth | learni... | n_esti... | scale_... |
-------------------------------------------------------------------------
| 1         | 0.6595233 | 3.4981604 | 0.2857071 | 123.19939 | 345.99898 |
| 2         | 0.1561286 | 2.6240745 | 0.0552384 | 55.808361 | 500.16588 |
| 3         | 0.4215543 | 4.4044600 | 0.2153410 | 52.058449 | 559.94624 |
| 4         | 0.6780665 | 5.3297705 | 0.0715783 | 68.182496 | 106.69359 |
| 5         | 0.5236202 | 3.2169689 | 0.1621793 | 93.194501 | 168.83150 |
| 6         | 0.7538050 | 5.8057794 | 0.0781583 | 103.13680 | 63.116935 |
| 7         | 0.8166102 | ... (output truncated)


## 7. Train Final Tuned Model


In [10]:

best_params = optimizer.max["params"]

final_model = XGBClassifier(
    max_depth=int(best_params["max_depth"]),
    learning_rate=best_params["learning_rate"],
    n_estimators=int(best_params["n_estimators"]),
    scale_pos_weight=best_params["scale_pos_weight"],
    eval_metric="logloss"  # binary log loss ("mlogloss" is the multiclass metric)
)

final_model.fit(x_train_fs, y_train)

y_pred_final = final_model.predict(x_test_fs)

print("Tuned Model")
print(confusion_matrix(y_test, y_pred_final))
print(classification_report(y_test, y_pred_final))


Tuned Model
[[56844    20]
 [   13    85]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.81      0.87      0.84        98

    accuracy                           1.00     56962
   macro avg       0.90      0.93      0.92     56962
weighted avg       1.00      1.00      1.00     56962
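
Both reports round accuracy to 1.00, so the two models are easier to compare from their confusion matrices. The sketch below recomputes accuracy and fraud-class F1 for the baseline and tuned matrices printed in this notebook; `accuracy_and_f1` is a hypothetical helper name.

```python
def accuracy_and_f1(tn, fp, fn, tp):
    """Overall accuracy and positive-class F1 from confusion-matrix counts."""
    accuracy = (tn + tp) / (tn + fp + fn + tp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, f1


# (TN, FP, FN, TP) read from the two confusion matrices in this notebook.
matrices = {
    "baseline": (56863, 1, 17, 81),
    "tuned": (56844, 20, 13, 85),
}

results = {name: accuracy_and_f1(*cm) for name, cm in matrices.items()}
for name, (acc, f1) in results.items():
    print(f"{name:8s} accuracy={acc:.4f} fraud_f1={f1:.2f}")
```

Accuracy differs only in the fourth decimal place, while fraud F1 drops from 0.90 to 0.84: on this split the tuned model catches four more frauds but raises 19 more false alarms.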




## 8. Key Takeaways

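- Accuracy is misleading for fraud detection: both models report 1.00 accuracy, yet their fraud-class F1-scores differ (0.90 for the baseline, 0.84 for the tuned model).
- Class imbalance must be handled explicitly; here that is done with `scale_pos_weight` (about 577 legitimate transactions per fraud in the training split).
- Feature selection cut the input from 30 columns to 10. On this split the tuned 10-feature model raised fraud recall (0.83 to 0.87) but lowered precision (0.99 to 0.81), so it did not beat the baseline's F1.
- Bayesian optimization searched the four-parameter space in 15 evaluations (5 initial points plus 10 iterations).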
