# Credit Card Fraud Detection

This notebook demonstrates a complete workflow for credit card fraud detection, including:
- Data preparation and exploration
- Visualization of imbalanced classes
- Balancing the dataset using SMOTE
- Training and evaluating a fraud detection model

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import our custom modules (add the parent directory to the path so `src` resolves)
import sys
sys.path.append('..')
from src.data_processor import load_credit_data, prepare_features_labels, scale_features
from src.visualization import plot_class_distribution, scatter_2d_data, compare_distributions
from src.balancing import apply_pca, apply_smote
from src.model import split_data, train_evaluate_model

## 1. Load and Explore the Data

First, we'll load the credit card fraud dataset. You can download it from Kaggle: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

In [None]:
# Load data (adjust the path as needed)
df = load_credit_data('path_to_your_data/creditcard.csv')

# Display the first few rows
df.head()

In [None]:
# Prepare features and labels
X, y = prepare_features_labels(df)

# Visualize class distribution
plot_class_distribution(y)
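
Before reading the plot, it can help to put numbers on the imbalance. The sketch below uses a small made-up label array (not the real dataset) and assumes the labels follow the Kaggle file's `Class` convention of 0 for legitimate and 1 for fraud. The names `y_demo`, `fraud_rate`, and `imbalance_ratio` are illustrative only.

```python
import numpy as np

# Toy labels standing in for the dataset's labels (assumed: 0 = legitimate, 1 = fraud)
y_demo = np.array([0] * 995 + [1] * 5)

# Count how many samples fall into each class
counts = np.bincount(y_demo)

# Share of fraudulent transactions and the legitimate-to-fraud ratio
fraud_rate = counts[1] / counts.sum()
imbalance_ratio = counts[0] / counts[1]

print(f"Legitimate: {counts[0]}, fraud: {counts[1]}")
print(f"Fraud rate: {fraud_rate:.2%}")
print(f"Legitimate-to-fraud ratio: {imbalance_ratio:.0f}:1")
```

Running the same three lines on the real `y` would give the figures behind the bar chart.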

## 2. Preprocess and Visualize the Data

In [None]:
# Scale features
X_scaled = scale_features(X)

# Apply PCA to visualize in 2D
X_pca = apply_pca(X_scaled, n_components=2)

# Plot the 2D projection
scatter_2d_data(X_pca, y, title="PCA Projection of Credit Card Transactions")
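
The `scale_features` and `apply_pca` helpers are not shown in this notebook, so the following is only a sketch of the two operations they are named after, written in plain NumPy on random data. It standardizes each column to zero mean and unit variance, then projects onto the first two principal components obtained from an SVD. All variable names here are hypothetical.

```python
import numpy as np

# Random stand-in data with columns on very different scales
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5)) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])

# Standardize: zero mean and unit variance per column
mean = X_demo.mean(axis=0)
std = X_demo.std(axis=0)
X_std = (X_demo - mean) / std

# PCA via SVD of the centered data; rows of Vt are the principal directions
U, S, Vt = np.linalg.svd(X_std, full_matrices=False)
components = Vt[:2]
X_2d = X_std @ components.T

# Fraction of total variance captured by each of the two components
explained_ratio = (S[:2] ** 2) / np.sum(S ** 2)

print(X_2d.shape)
print(explained_ratio)
```

A low explained-variance ratio is a reminder that the 2D scatter plot shows only part of the structure in the data.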

## 3. Apply SMOTE to Handle Class Imbalance

In [None]:
# Apply SMOTE to the PCA-transformed data for visualization
X_pca_resampled, y_resampled = apply_smote(X_pca, y)

# Compare before and after distributions
compare_distributions(X_pca, y, X_pca_resampled, y_resampled, method_name="SMOTE")
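
To make the resampling step less of a black box, here is a simplified sketch of the core SMOTE idea: each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbors. This is not the `apply_smote` helper (which is not shown here) and not a library implementation. The function `smote_sketch` and the toy data are hypothetical.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbors than other minority samples

    # Pairwise Euclidean distances among minority samples
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample, one of its k neighbors, and an interpolation weight
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Toy 2D data: 100 majority points and 10 minority points
rng = np.random.default_rng(1)
X_majority = rng.normal(0.0, 1.0, size=(100, 2))
X_minority = rng.normal(3.0, 0.5, size=(10, 2))

# Generate enough synthetic minority points to match the majority count
X_synth = smote_sketch(X_minority, n_new=len(X_majority) - len(X_minority))
X_minority_balanced = np.vstack([X_minority, X_synth])

print(X_majority.shape, X_minority_balanced.shape)
```

Because every synthetic point lies between two existing minority points, the new samples fill in the minority region rather than extend it, which is the pattern to look for in the before-and-after plot.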

## 4. Train and Evaluate the Model

In [None]:
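# Split the scaled data into training and test sets
# Note: X_scaled was produced from the full dataset above; for a strictly leakage-free
# evaluation, fit the scaler on the training split only.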
X_train, X_test, y_train, y_test = split_data(X_scaled, y)

# Train and evaluate without SMOTE
print("\n=== Model without SMOTE ===")
model_without_smote = train_evaluate_model(X_train, y_train, X_test, y_test, use_smote=False)

# Train and evaluate with SMOTE
print("\n=== Model with SMOTE ===")
model_with_smote = train_evaluate_model(X_train, y_train, X_test, y_test, use_smote=True)
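
Whatever metrics `train_evaluate_model` reports (the helper is not shown here), accuracy alone can be misleading on data this imbalanced. The toy example below, with made-up labels, scores a degenerate "model" that marks every transaction as legitimate. It reaches high accuracy while catching no fraud at all, which is why precision and recall on the fraud class are the more informative numbers when comparing the two runs.

```python
import numpy as np

# Toy test set: 1,000 transactions, 5 of them fraudulent (label 1)
y_true = np.array([0] * 995 + [1] * 5)

# A degenerate "model" that labels every transaction as legitimate
y_pred = np.zeros_like(y_true)

# Confusion-matrix counts for the fraud class
tp = int(np.sum((y_pred == 1) & (y_true == 1)))
fp = int(np.sum((y_pred == 1) & (y_true == 0)))
fn = int(np.sum((y_pred == 0) & (y_true == 1)))

accuracy = float(np.mean(y_pred == y_true))
# Guard against division by zero when a class is never predicted or present
recall = tp / (tp + fn) if (tp + fn) else 0.0
precision = tp / (tp + fp) if (tp + fp) else 0.0

print(f"accuracy={accuracy:.3f} recall={recall:.3f} precision={precision:.3f}")
```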

## 5. Conclusion

In this notebook, we've demonstrated:
1. How to explore and visualize imbalanced credit card transaction data
2. How to apply SMOTE to create a balanced training dataset
3. How to train fraud detection models and evaluate their performance

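Comparing the two runs above shows whether SMOTE improves the model's ability to detect fraudulent transactions by giving it more examples of the minority class. Judge the comparison on the fraud class's precision and recall rather than on overall accuracy, since oversampling often raises recall at the cost of more false positives.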