This repository is dedicated to a project on Basic Credit Card Fraud Detection.

pierogio/Credit_Card_Fraud_Detection

Basic Credit Card Fraud Detection:

The goal of this project is to detect fraudulent transactions in credit card data. This is achieved by comparing the performance of three different models - Random Forest, Logistic Regression, and Decision Tree - using both undersampling and oversampling techniques. The final model is a basic Random Forest model trained on oversampled data.

What is Credit Card Fraud Detection?

Credit Card Fraud Detection is a technique used to identify fraudulent transactions made through credit cards. It’s a critical aspect of cybersecurity and financial technology (FinTech) that uses machine learning and data analysis to detect unusual patterns and prevent unauthorized transactions.

Consideration:

The dataset used in this project contains both legitimate and fraudulent transactions. We retain the entire dataset across the full time period rather than excluding any data. This may lead to poorer predictive results, but it gives a more comprehensive view of the dataset’s dynamics.

Models:

Random Forest Model

Random Forest is an ensemble learning method that builds multiple decision trees during training and combines their outputs. For classification, the prediction is the class chosen by most trees (the mode); for regression, it is the mean of the individual trees' predictions.
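The aggregation step described above can be sketched in a few lines. This is only an illustration, not the project's code: the function names `forest_classify` and `forest_regress` are hypothetical, and the per-tree outputs are made-up values standing in for real trees.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_votes):
    """Return the class predicted by the largest number of trees (the mode)."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Return the mean of the individual trees' numeric predictions."""
    return mean(tree_predictions)


# Hypothetical outputs of five trees for one transaction (1 = fraud, 0 = legitimate).
votes = [1, 0, 1, 1, 0]
print(forest_classify(votes))

# Hypothetical outputs of three regression trees.
print(forest_regress([2.0, 4.0, 6.0]))
```

Library implementations may aggregate slightly differently. For example, scikit-learn's `RandomForestClassifier` averages the trees' class probabilities instead of counting hard votes.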

Logistic Regression Model

Logistic Regression is a statistical model used in binary classification problems. It uses the logistic function to model the probability of a certain class or event.
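As a minimal sketch of how the logistic function turns a linear score into a probability: the weights, bias, and feature values below are made up for illustration, and `predict_proba` is a hypothetical helper, not a function from the project.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample under a linear model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Illustrative values only: two features, two weights, zero bias.
p = predict_proba([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(f"P(fraud) = {p:.2f}")
```

A sample is then assigned to the positive class when this probability exceeds a chosen threshold, commonly 0.5.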

Decision Tree Model

A Decision Tree is a flowchart-like structure in which each internal node represents a test on a feature, each branch represents an outcome of that test, and each leaf node represents a predicted class.

Spot-Check

A spot-check is a preliminary analysis used to assess the performance of different models on the dataset. In this project, we perform a spot-check to compare the performance of the Random Forest, Logistic Regression, and Decision Tree models using both undersampling and oversampling techniques.
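A spot-check of this kind might look like the sketch below, assuming scikit-learn is available. The data is a synthetic imbalanced set from `make_classification`, not the project's credit card data, and the choice of recall as the metric, the hyperparameters, and the `spot_check` helper are all assumptions made for illustration. The resampling step is left out here to keep the example short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption): roughly 5% positive class.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}


def spot_check(models, X_train, y_train, X_test, y_test):
    """Fit each model with its given settings and record recall on the test split."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = recall_score(y_test, model.predict(X_test))
    return scores


scores = spot_check(models, X_train, y_train, X_test, y_test)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: recall = {score:.3f}")
```

Recall on the fraud class is used here because accuracy can look high on imbalanced data even when most fraud cases are missed.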

Oversampling and Undersampling

Oversampling and undersampling are techniques used to handle imbalanced datasets. Oversampling increases the number of minority class samples, while undersampling reduces the number of majority class samples. In this project, we create a basic Random Forest model with oversampled data to improve the detection of fraudulent transactions.
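To make the two techniques concrete, here is a minimal sketch of plain random resampling using only the standard library. The project may use a different method (for example a library routine or synthetic oversampling); the function names and the toy data are hypothetical, and the sketch assumes the minority class is non-empty and smaller than the majority class.

```python
import random
from collections import Counter


def split_by_class(samples, minority_label):
    """Split (features, label) pairs into minority and majority lists."""
    minority = [s for s in samples if s[1] == minority_label]
    majority = [s for s in samples if s[1] != minority_label]
    return minority, majority


def random_oversample(samples, minority_label=1, seed=0):
    """Duplicate random minority samples until both classes have equal size."""
    rng = random.Random(seed)
    minority, majority = split_by_class(samples, minority_label)
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return majority + minority + extra


def random_undersample(samples, minority_label=1, seed=0):
    """Keep a random subset of the majority class equal in size to the minority."""
    rng = random.Random(seed)
    minority, majority = split_by_class(samples, minority_label)
    return rng.sample(majority, k=len(minority)) + minority


# Toy data: 95 legitimate (label 0) and 5 fraudulent (label 1) transactions.
data = [((i,), 0) for i in range(95)] + [((i,), 1) for i in range(95, 100)]

over = random_oversample(data)
under = random_undersample(data)
print(Counter(label for _, label in over))
print(Counter(label for _, label in under))
```

Resampling should be applied to the training split only, so that the test set keeps the original class balance and duplicated samples cannot leak into evaluation.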

In summary, this project aims to detect credit card fraud by comparing the performance of different models and sampling techniques, ultimately creating a Random Forest model trained on oversampled data.
