Credit card fraud detection: a comparison of logistic regression, random forest and DNN models combined with under- and over-sampling methods, to determine which approach returns the best evaluation metrics.

niall-anthony-mcnulty/Credit-card-fraud-detection-sampling-methods

A Comparison Between Models and Sampling Methods for Imbalanced Fraudulent Credit Card Data

Credit card fraud in the UK rose 55.7% between 2016 and 2020, so the need for better fraud detection solutions is pressing. This paper aims to address that need. Credit card transaction data is imbalanced and skewed by nature, because fraudulent transactions are far rarer than legitimate ones. To counter the resulting bias, the paper uses resampling methods to rebalance the data in an attempt to build robust solutions. First, Logistic Regression, Random Forest and Sequential DNN models are compared to determine which generalizes best on imbalanced data. Synthetic Minority Oversampling Technique (SMOTE), Random Under-Sampling (RUS), class weighting and a hybrid approach are then applied to the best-performing model in an effort to maximise MCC, recall and F1 score, with an emphasis on recall.
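To make the evaluation metrics and one of the resampling methods concrete, here is a minimal standard-library sketch. It is not the repository's code: the function names (`confusion_counts`, `recall`, `f1_score`, `mcc`, `random_under_sample`) and the toy labels are assumptions made purely for illustration. It shows recall, F1 and MCC computed from confusion-matrix counts, plus a simple form of Random Under-Sampling that keeps every fraud row and an equally sized random subset of non-fraud rows.

```python
import math
import random


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = fraud, 0 = non-fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def recall(tp, fn):
    """Share of actual fraud cases that were caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def random_under_sample(X, y, seed=0):
    """Keep all minority (label 1) rows and an equal-sized random subset of majority rows."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    kept = rng.sample(majority, min(len(majority), len(minority)))
    idx = sorted(minority + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


# Toy data (hypothetical): 10 fraud cases among 1000 transactions.
y_true = [1] * 10 + [0] * 990

# A model that always predicts "non-fraud" is 99% accurate but catches no fraud.
always_negative = [0] * 1000
tp, tn, fp, fn = confusion_counts(y_true, always_negative)
accuracy = (tp + tn) / len(y_true)
print(accuracy, recall(tp, fn), f1_score(tp, fp, fn), mcc(tp, tn, fp, fn))

# Under-sampling the majority class yields a balanced training set.
X = list(range(1000))
X_res, y_res = random_under_sample(X, y_true)
print(len(y_res), sum(y_res))
```

The always-negative case illustrates why accuracy alone is a poor guide on imbalanced data and why metrics such as recall, F1 and MCC are preferred here. In practice, resampling would be applied to the training split only, so that the test set keeps its original class distribution.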
