# Credit Card Fraud Detection – Logistic Regression Baseline

This notebook explores fraud detection using a logistic regression baseline. We:

- Analyze class imbalance
- Build a scikit-learn pipeline that standardizes features and fits a class-weighted Logistic Regression
- Tune the decision threshold to trade precision against recall on the fraud class
- Prepare for more advanced modeling and deployment

---


In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    roc_auc_score,
    precision_recall_curve
)


In [5]:
# Load dataset
df = pd.read_csv("../data/creditcard.csv")
df['Class'].value_counts(normalize=True)


Class
0    0.998273
1    0.001727
Name: proportion, dtype: float64
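
With fraud at roughly 0.17% of rows, a model that always predicts class 0 would already be about 99.8% accurate, which is why accuracy alone says little here. As a rough illustration, the sketch below derives the imbalance ratio and the per-class weights that scikit-learn's `class_weight='balanced'` heuristic, `n_samples / (n_classes * count_c)`, would assign at these proportions. The variable names are illustrative, and the weights actually applied depend on the training split rather than the full dataset.

```python
# Class proportions copied from the value_counts output above.
p_legit = 0.998273
p_fraud = 0.001727

# Accuracy of a trivial model that always predicts the majority class.
majority_accuracy = p_legit

# Number of legitimate transactions per fraudulent one.
imbalance_ratio = p_legit / p_fraud

# 'balanced' weights: n_samples / (n_classes * count_c), which reduces to
# 1 / (n_classes * p_c) when expressed with class proportions.
n_classes = 2
w_legit = 1 / (n_classes * p_legit)
w_fraud = 1 / (n_classes * p_fraud)

print(f"majority-class accuracy: {majority_accuracy:.4f}")
print(f"legitimate rows per fraud row: {imbalance_ratio:.0f}")
print(f"balanced weights: legit={w_legit:.3f}, fraud={w_fraud:.1f}")
```

Under this heuristic each fraud row carries several hundred times the weight of a legitimate row, so the two classes contribute about equally to the loss.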

In [6]:
X = df.drop("Class", axis=1)
y = df["Class"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42
)

# Define and train pipeline
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(class_weight='balanced', max_iter=5000))
])

pipe.fit(X_train, y_train)

# Predictions
y_pred = pipe.predict(X_test)
y_proba = pipe.predict_proba(X_test)[:, 1]


In [7]:
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
print("ROC AUC Score:", roc_auc_score(y_test, y_proba))


Confusion Matrix:
[[55478  1386]
 [    8    90]]

Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.98      0.99     56864
           1       0.06      0.92      0.11        98

    accuracy                           0.98     56962
   macro avg       0.53      0.95      0.55     56962
weighted avg       1.00      0.98      0.99     56962

ROC AUC Score: 0.9720834996210077
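
The class-1 row of the report follows directly from the confusion matrix. As a quick check, this sketch recomputes precision, recall, and F1 from the four counts printed above; the variable names are illustrative, and the false positive rate is an extra figure that the report does not print.

```python
# Counts copied from the confusion matrix above
# (rows = actual class, columns = predicted class).
tn, fp = 55478, 1386
fn, tp = 8, 90

precision = tp / (tp + fp)            # share of flagged transactions that are fraud
recall = tp / (tp + fn)               # share of fraud that gets flagged
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = fp / (fp + tn)  # share of legitimate transactions flagged

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"false positive rate={false_positive_rate:.4f}")
```

The low precision comes from the 1386 false positives outnumbering the 90 true positives, even though only a small share of legitimate transactions is flagged.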


In [8]:
thresholds_to_test = [0.1 * i for i in range(1, 10)]

for t in thresholds_to_test:
    print(f"\nThreshold: {t}")
    y_pred_thresh = (y_proba >= t).astype(int)
    print("Classification Report:")
    print(classification_report(y_test, y_pred_thresh))
    print("Confusion Matrix:")
    print(confusion_matrix(y_test, y_pred_thresh))
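    # ROC AUC is computed from y_proba, not from the thresholded labels,
    # so this value is identical on every iteration.
    print("ROC AUC Score:", roc_auc_score(y_test, y_proba))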



Threshold: 0.1
Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.80      0.89     56864
           1       0.01      0.95      0.02        98

    accuracy                           0.80     56962
   macro avg       0.50      0.87      0.45     56962
weighted avg       1.00      0.80      0.89     56962

Confusion Matrix:
[[45549 11315]
 [    5    93]]
ROC AUC Score: 0.9720834996210077

Threshold: 0.2
Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.90      0.95     56864
           1       0.02      0.94      0.03        98

    accuracy                           0.90     56962
   macro avg       0.51      0.92      0.49     56962
weighted avg       1.00      0.90      0.95     56962

Confusion Matrix:
[[51349  5515]
 [    6    92]]
ROC AUC Score: 0.9720834996210077

Threshold: 0.30000000000000004
Classification Report:
              precision    recall  f1-score   support
...

## ✅ Threshold Decision: 0.9

After evaluating thresholds from 0.1 to 0.9 in steps of 0.1, we selected **0.9** as the operating point:

- **Recall:** 0.89, down only slightly from 0.92 at the default 0.5 threshold
- **Precision:** 0.25, up from 0.06 at the default threshold
- **F1 score:** 0.39, up from 0.11 at the default threshold
- **False positives:** 265, down from 1386 at the default threshold
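
The confusion matrix for this threshold is not shown, but the figures above pin it down. The sketch below is a consistency check under one assumption: the true-positive count is inferred from the rounded recall and the 98 fraud cases in the test set, not read from notebook output.

```python
n_fraud = 98            # class-1 support in the test set
reported_recall = 0.89  # rounded value from the list above
false_positives = 265

# Every integer true-positive count whose recall rounds to the reported value.
candidates = [tp for tp in range(n_fraud + 1)
              if round(tp / n_fraud, 2) == reported_recall]

for tp in candidates:
    fn = n_fraud - tp
    precision = tp / (tp + false_positives)
    f1 = 2 * tp / (2 * tp + false_positives + fn)
    print(f"tp={tp} fn={fn} precision={precision:.2f} f1={f1:.2f}")
```

Only one true-positive count is compatible with the rounded recall, and the precision and F1 it implies match the reported values.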

At this threshold the model gives up about three points of recall in exchange for roughly an 80% cut in false alarms, a practical trade-off between fraud capture and alert volume. Next, we test more advanced models such as XGBoost before deployment.
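
A finer-grained alternative to a fixed grid of thresholds is `precision_recall_curve`, which the notebook imports but does not call. The sketch below is one possible approach, not the method used above: it runs on synthetic data from `make_classification` as a stand-in for `creditcard.csv`, tunes on a separate validation split, and uses a hypothetical recall floor `MIN_RECALL`. All names introduced here are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data standing in for the real dataset (assumption).
X_demo, y_demo = make_classification(
    n_samples=20000, n_features=10, weights=[0.98, 0.02], random_state=42
)
X_fit, X_val, y_fit, y_val = train_test_split(
    X_demo, y_demo, stratify=y_demo, test_size=0.25, random_state=42
)

demo_pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(class_weight="balanced", max_iter=5000)),
])
demo_pipe.fit(X_fit, y_fit)
val_proba = demo_pipe.predict_proba(X_val)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_val, val_proba)
# precision and recall have one more entry than thresholds; drop the final
# point (precision=1, recall=0), which has no threshold attached.
precision, recall = precision[:-1], recall[:-1]

MIN_RECALL = 0.80  # hypothetical business requirement, not from the notebook

# Among thresholds that keep recall at or above the floor,
# pick the one with the highest precision.
eligible = recall >= MIN_RECALL
best_idx = int(np.argmax(np.where(eligible, precision, -1.0)))
chosen_threshold = thresholds[best_idx]

print(f"threshold={chosen_threshold:.3f} "
      f"precision={precision[best_idx]:.3f} recall={recall[best_idx]:.3f}")

# Apply the chosen threshold the same way the notebook does.
val_pred = (val_proba >= chosen_threshold).astype(int)
```

Choosing the threshold on a validation split and reporting final metrics on an untouched test set keeps the reported numbers from being tuned to the data they are measured on.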
