Machine learning model to detect fraudulent financial transactions using imbalanced data handling techniques and classification algorithms

pandakitty/Fraud-Detection-with-Python

🛡️ Financial Fraud Detection Model (Machine Learning)

Built with Python Status License: MIT

🎯 Project Overview

This project develops a supervised machine learning model to detect fraudulent financial transactions in a highly imbalanced dataset. The core technical challenge is building a classifier that prioritizes catching the rare fraud cases (high Recall) while keeping Precision at an acceptable level, since overall accuracy is misleading when almost every transaction is legitimate.

✨ Key Features & Technical Details

  • Data Preprocessing: Handled and transformed various transaction features, including numerical scaling and encoding of categorical variables.
  • Imbalanced Data Handling: Implemented techniques like SMOTE (Synthetic Minority Oversampling Technique) or Class Weighting to address the class imbalance, which is critical for model training on fraud datasets.
  • Model Training: Explored and optimized multiple classification algorithms (e.g., Random Forest, XGBoost, or Logistic Regression) to find the best balance between precision and recall.
  • Robust Evaluation: Model performance was evaluated primarily by the area under the Precision-Recall curve (PR-AUC), together with Recall and F1-Score, which are more informative than accuracy on imbalanced fraud data.
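
The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the project's actual notebook: it assumes a synthetic dataset from `make_classification` in place of the real transactions, picks Logistic Regression with class weighting (one of the options named above) and reports `average_precision_score`, which scikit-learn provides as a step-wise summary of the Precision-Recall curve. The dataset size, the 2% fraud rate, and all variable names are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the transaction data (assumption): roughly 2% "fraud".
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    n_informative=5,
    weights=[0.98, 0.02],
    random_state=0,
)

# Stratify so the train and test splits keep the same fraud rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale features, then fit a classifier whose loss up-weights the rare class.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Probability of the positive (fraud) class, used for the PR curve summary.
fraud_scores = model.predict_proba(X_test)[:, 1]
y_pred = model.predict(X_test)

pr_auc = average_precision_score(y_test, fraud_scores)
recall = recall_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)

print(f"PR-AUC (average precision): {pr_auc:.3f}")
print(f"Recall (fraud):             {recall:.3f}")
print(f"Precision (fraud):          {precision:.3f}")
```

To try SMOTE instead of class weighting, the oversampling step would have to be applied to the training split only, so that synthetic samples never leak into the evaluation data.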

🚀 Model Performance

Because a missed fraud case is usually costlier than a false alarm, the model was optimized for Recall: the fraction of actual fraudulent transactions that the model flags.

| Metric | Score | Justification |
| --- | --- | --- |
| Recall (Fraud) | 92.5% | 92.5% of actual fraud cases were correctly identified. |
| Precision (Fraud) | 88.0% | 88.0% of cases flagged as fraud were actually fraudulent. |
| F1-Score | 0.90 | Harmonic mean of Precision and Recall. |
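
As a quick consistency check on the table, the F1-Score can be recomputed from the reported Precision and Recall. The sketch below uses only those two figures; the function name `f1_from_rates` is made up for this example.

```python
def f1_from_rates(precision: float, recall: float) -> float:
    """Return the F1-Score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported in the table above.
reported_precision = 0.880
reported_recall = 0.925

f1 = f1_from_rates(reported_precision, reported_recall)
print(f"F1-Score: {f1:.2f}")  # prints "F1-Score: 0.90"
```

The result agrees with the 0.90 listed in the table.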

⚙️ Technologies & Libraries

  • Language: Python 3.x
  • Data Manipulation: Pandas, NumPy
  • Machine Learning: Scikit-learn, Imbalanced-learn (imported as imblearn)
  • Visualization: Matplotlib, Seaborn

📦 Setup and Installation

Follow these steps to set up and run the analysis notebook on your local machine.

1. Clone the Repository

git clone https://github.com/pandakitty/Fraud-Detection-with-Python.git
cd Fraud-Detection-with-Python
