# Feature Engineering for Fraud Detection

This notebook builds features intended to improve our fraud detection models. We derive two groups of features from the existing columns: time-based features and transaction-frequency metrics. We then resample the data to address the class imbalance between fraudulent and legitimate transactions.

In [1]:
# Import the libraries and project helpers used in this notebook
import pandas as pd
from src.features.build_features import create_time_based_features, create_transaction_frequency_features
from src.data.sampling import handle_class_imbalance

# Load the dataset
data = pd.read_csv('../data/processed/fraud_data.csv')
data.head()

In [2]:
# Create time-based features
data = create_time_based_features(data)

# Create transaction frequency features
data = create_transaction_frequency_features(data)

# Check the new features
data.head()
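
The implementations of `create_time_based_features` and `create_transaction_frequency_features` live in `src/features/build_features.py` and are not shown here. As a rough illustration of what features of this kind can look like, the sketch below derives calendar features from a timestamp and per-customer frequency features over a trailing 24-hour window. The column names `customer_id` and `transaction_time`, the function names, and the toy data are all assumptions made for this example and may not match the project code.

```python
import pandas as pd


def sketch_time_features(df, time_col="transaction_time"):
    """Add simple calendar features derived from a timestamp column (hypothetical column name)."""
    out = df.copy()
    ts = pd.to_datetime(out[time_col])
    out["hour"] = ts.dt.hour
    out["day_of_week"] = ts.dt.dayofweek  # Monday=0 ... Sunday=6
    out["is_weekend"] = (ts.dt.dayofweek >= 5).astype(int)
    return out


def sketch_frequency_features(df, id_col="customer_id", time_col="transaction_time",
                              window=pd.Timedelta(hours=24)):
    """Add per-customer frequency features (hypothetical column names)."""
    out = df.copy()
    out[time_col] = pd.to_datetime(out[time_col])
    # Sort so that rows are ordered by time within each customer
    out = out.sort_values([id_col, time_col]).reset_index(drop=True)

    # Seconds elapsed since the same customer's previous transaction (NaN for the first one)
    out["seconds_since_prev"] = out.groupby(id_col)[time_col].diff().dt.total_seconds()

    # Number of transactions by the same customer in the trailing window,
    # counting the current transaction
    rolled = (
        out.assign(_one=1)
           .set_index(time_col)
           .groupby(id_col)["_one"]
           .rolling(window)
           .sum()
    )
    # The rolled result is ordered by customer, then time, matching the sorted frame
    out["txn_count_24h"] = rolled.to_numpy().astype(int)
    return out


# Toy data, invented for illustration only
toy = pd.DataFrame({
    "customer_id": [1, 1, 1, 2],
    "transaction_time": [
        "2024-01-01 10:00", "2024-01-01 12:00", "2024-01-03 09:00", "2024-01-06 11:00",
    ],
    "amount": [20.0, 35.0, 12.5, 900.0],
})

features = sketch_frequency_features(sketch_time_features(toy))
print(features[["customer_id", "hour", "is_weekend", "seconds_since_prev", "txn_count_24h"]])
```

Both frequency features only look backward in time, so each row uses information that would have been available when the transaction occurred.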

In [3]:
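# Separate the features from the target
X, y = data.drop('is_fraud', axis=1), data['is_fraud']

# Handle class imbalance by resampling.
# Note: if a held-out test set is split from this data later, resample only the
# training portion so that evaluation reflects the original class distribution.
X_resampled, y_resampled = handle_class_imbalance(X, y)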

# Check the distribution of the target variable after resampling
y_resampled.value_counts()
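
The strategy used by `handle_class_imbalance` is defined in `src/data/sampling.py` and is not shown in this notebook. One common approach is random oversampling, which duplicates minority-class rows until the classes are balanced. The sketch below is a minimal pandas/NumPy version of that idea; the function name `random_oversample` and the toy data are assumptions for illustration, and the project helper may use a different technique (for example SMOTE from imbalanced-learn, or undersampling).

```python
import numpy as np
import pandas as pd


def random_oversample(X, y, random_state=0):
    """Duplicate rows of the smaller classes at random until every class
    has as many rows as the largest class. Illustrative sketch only."""
    rng = np.random.default_rng(random_state)
    counts = y.value_counts()
    target = counts.max()

    # Start with every original row, then append sampled minority positions
    parts = [np.arange(len(y))]
    for label, n in counts.items():
        if n < target:
            positions = np.flatnonzero((y == label).to_numpy())
            parts.append(rng.choice(positions, size=target - n, replace=True))

    idx = np.concatenate(parts)
    return X.iloc[idx].reset_index(drop=True), y.iloc[idx].reset_index(drop=True)


# Toy data, invented for illustration: 8 legitimate rows and 2 fraudulent rows
X_toy = pd.DataFrame({"amount": [10, 20, 15, 30, 25, 12, 18, 22, 900, 750]})
y_toy = pd.Series([0] * 8 + [1] * 2, name="is_fraud")

X_bal, y_bal = random_oversample(X_toy, y_toy)
print(y_bal.value_counts())
```

Because oversampling repeats existing rows, applying it before a train/test split can place copies of the same transaction on both sides of the split and inflate evaluation scores. Resampling is therefore usually applied to the training portion only.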

## Summary

In this notebook, we engineered time-based and transaction-frequency features and resampled the data to address class imbalance. Balancing matters because fraudulent transactions are typically a small minority of the data, and a model trained on the raw distribution can score well while rarely flagging fraud. The next step is to train models on these features and evaluate whether they improve detection performance.