Tacacs-1101/Fraud-Detection

Fraud-Detection

A machine learning model to detect fraudulent transactions.

Overview

This repository contains a detailed analysis of a dataset of credit card transactions. The target variable has two classes (Normal and Fraud). The dataset is challenging because it is highly imbalanced: more than 99% of the data points belong to the Normal class.
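To make the imbalance concrete, here is a minimal sketch in plain Python. The labels are made up for illustration (995 Normal, 5 Fraud) and are not taken from the dataset; the helper names `accuracy` and `recall` are hypothetical. It shows that a baseline which always predicts Normal scores a very high accuracy while catching no fraud at all.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives (Fraud) that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    hits = sum(1 for p in preds_for_positives if p == positive)
    return hits / len(preds_for_positives)


# Hypothetical labels (assumption): 0 = Normal, 1 = Fraud.
y_true = [0] * 995 + [1] * 5

# Baseline "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)  # high, despite detecting nothing
baseline_recall = recall(y_true, y_pred)      # no fraud detected

print(f"accuracy={baseline_accuracy:.3f} recall={baseline_recall:.3f}")
```

A model has to beat this trivial baseline on fraud-specific metrics, not on accuracy.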

Download the data

Download the dataset (csv format) from here.

Installation

Anaconda is highly recommended for running data science projects. It comes with many pre-installed packages for data analysis and machine learning. Two packages need to be installed manually in addition to Anaconda:

  • Seaborn (pip install seaborn or conda install seaborn)
  • Imbalanced-learn (pip install -U imbalanced-learn)

Summary

This notebook can be divided into the following sections:

  • Data exploration
  • Feature engineering
  • Evaluation metrics
  • Modeling
  • Parameter tuning

After initial exploration, the dataset turns out to be highly imbalanced. Standard machine learning algorithms tend to be biased towards the majority class, so a resampling technique is used to handle this problem. New features are generated based on the distribution of the variables within each class. Accuracy is not a useful metric for imbalanced classes, so F1 (the harmonic mean of precision and recall) and AUC (the area under the ROC curve) are used to evaluate model performance. The usual threshold (probability = 0.5) is not used for classification; instead, the threshold is tuned using a cross-validation strategy.
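The threshold tuning described above can be sketched roughly as follows. This is an illustration in plain Python, not the notebook's code: the labels and predicted probabilities are made up, and the helpers `f1_score` and `best_threshold` are hypothetical names. It sweeps candidate thresholds on one validation set and keeps the one with the highest F1. In a cross-validation setting, one plausible approach (an assumption here) is to repeat the sweep on each validation fold and aggregate the results.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = Fraud): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores, candidates):
    """Return (threshold, f1) for the candidate with the highest F1; first wins on ties."""
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical validation labels and predicted fraud probabilities (assumption).
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.02, 0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40, 0.80]

# Candidate thresholds 0.05, 0.10, ..., 0.95.
candidates = [i / 20 for i in range(1, 20)]

threshold, f1 = best_threshold(y_true, scores, candidates)
f1_at_half = f1_score(y_true, [1 if s >= 0.5 else 0 for s in scores])

print(f"tuned threshold={threshold:.2f} F1={f1:.3f}, F1 at 0.5={f1_at_half:.3f}")
```

On this toy data the tuned threshold falls well below 0.5, which is typical when the positive class is rare and the model's predicted probabilities for fraud are low. For real use, scikit-learn's `sklearn.metrics.f1_score` and `sklearn.metrics.roc_auc_score` provide the same metrics.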
