Project 2

Overview

Welcome to Are You a Fraud! This project uses machine learning models to predict whether a credit card transaction is fraudulent or authentic. Below, you'll find an overview of the analysis, the dataset, the models used, and the key results. We also examined feature importance for our K-Nearest Neighbors and Gradient Boosting models to better understand which variables carry the most weight when identifying fraud.

Analysis Overview

Purpose of the Analysis

The primary objective of this analysis is to read in credit card transactions and train machine learning models to predict credit card fraud as accurately as possible. The models use the financial information in the provided dataset to flag when a charge is fraudulent.

Financial Information and Prediction Target

Financial Data

The dataset contains eight variables: distance from home, distance from last transaction, ratio to median purchase price, repeat retailer, used chip, used pin number, online order, and fraud. The first seven are the features and fraud is the prediction target. These variables can provide valuable insight into the validity of the credit card transactions in the provided dataset.
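To make the feature/target split concrete, here is a minimal sketch that separates the seven features from the fraud label. The snake_case column names are assumed from the variable list above and the two rows are made up for illustration; the real file may use different names.

```python
import csv
import io

# Column names are assumptions derived from the variable list; adjust to the real file.
FEATURES = [
    "distance_from_home",
    "distance_from_last_transaction",
    "ratio_to_median_purchase_price",
    "repeat_retailer",
    "used_chip",
    "used_pin_number",
    "online_order",
]
TARGET = "fraud"

# Two made-up rows standing in for the real dataset.
SAMPLE_CSV = (
    ",".join(FEATURES + [TARGET]) + "\n"
    "12.5,0.3,1.1,1,1,0,0,0\n"
    "250.0,40.2,6.4,0,0,0,1,1\n"
)


def split_features_target(csv_text):
    """Parse CSV text into a feature matrix X (list of rows) and labels y."""
    X, y = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        X.append([float(row[name]) for name in FEATURES])
        y.append(int(float(row[TARGET])))
    return X, y


X, y = split_features_target(SAMPLE_CSV)
```

Keeping the target out of the feature matrix from the start avoids accidentally training on the label.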

Stages of Machine Learning Process

The machine learning process encompassed data preprocessing, feature engineering, model selection, and evaluation. Each stage was carried out with model performance in mind. We also trained an XGBoost model as a check that the Gradient Boosting model was not overfitting. We visualized the feature importances of the K-Nearest Neighbors and Gradient Boosting models to see which variables had the greatest impact on identifying fraudulent charges.
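One common way to look for overfitting, shown here as a sketch and not as the project's exact procedure, is to compare accuracy on the training split against accuracy on a held-out split. The data below is synthetic (generated with scikit-learn's make_classification) and the hyperparameters are library defaults, both assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the transaction data: 7 features, imbalanced binary target.
X, y = make_classification(
    n_samples=600,
    n_features=7,
    n_informative=4,
    n_redundant=0,
    weights=[0.9, 0.1],
    random_state=0,
)
# Stratify so both splits keep roughly the same share of the rare class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
gap = train_acc - test_acc  # a large positive gap suggests overfitting
print(f"train={train_acc:.3f} test={test_acc:.3f} gap={gap:.3f}")
```

A small gap between the two scores is reassuring; a training score far above the test score is the usual sign that the model has memorized the training data.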

Methods Used

We employed various machine learning models, including:

K-Nearest Neighbors
Gradient Boosting
XGBoost
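As a rough sketch of how these three models could be set up side by side, the snippet below fits each on the same synthetic split and records test accuracy. The scaling step in front of K-Nearest Neighbors, the hyperparameter values, and the synthetic data are all assumptions for illustration; XGBoost is a separate package, so it is only included when it can be imported.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

try:  # xgboost is installed separately from scikit-learn
    from xgboost import XGBClassifier
except ImportError:
    XGBClassifier = None

# Synthetic stand-in for the transaction data (7 features, binary target).
X, y = make_classification(
    n_samples=400, n_features=7, n_informative=4, n_redundant=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=1
)

models = {
    # KNN is distance based, so features are standardized first (an assumption here).
    "K-Nearest Neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "Gradient Boosting": GradientBoostingClassifier(random_state=1),
}
if XGBClassifier is not None:
    models["XGBoost"] = XGBClassifier(n_estimators=100, max_depth=3, random_state=1)

# Fit every model on the same split and record its test accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
```

Using one shared split keeps the comparison between models fair, since each is scored on exactly the same held-out transactions.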

Results

Model Performances

We achieved the following classification report results for each model:

K-Nearest Neighbors

Gradient Boosting

XGBoost
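For readers less familiar with classification reports, the sketch below shows how the headline numbers for the fraud class (precision, recall, and F1) are computed from true and predicted labels. The labels are made up for illustration and are not results from this project; binary_report is a hypothetical helper, not part of any library.

```python
def binary_report(y_true, y_pred, positive=1):
    """Return precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = fraud, 0 = authentic.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0]
report = binary_report(y_true, y_pred)
```

Because fraud is usually the rare class, precision and recall on the fraud label tend to say more about a model than overall accuracy does.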

Data Visualization

We created a correlation matrix heatmap to visually explore the relationships between selected features and the target variable.
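Each cell of a correlation heatmap is a correlation coefficient between two columns. As a sketch, assuming the Pearson coefficient (the usual default for such heatmaps), one cell could be computed like this. The values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Invented values: a purchase-price ratio column and the fraud label.
ratio = [0.5, 1.0, 1.5, 4.0, 6.0, 8.0]
fraud = [0, 0, 0, 1, 1, 1]
r = pearson(ratio, fraud)
```

A value near 1 or -1 indicates a strong linear relationship with the target, while a value near 0 indicates little linear relationship.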

From our Gradient Boosting model, we were able to visualize which variables were most important when deciding whether a transaction was fraudulent.
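The sketch below shows one way such importances could be obtained with scikit-learn. Gradient Boosting exposes feature_importances_ directly; K-Nearest Neighbors has no built-in importances, so permutation importance is used here as an assumed substitute, which may not be the method used in this project. The data is synthetic, and the pairing of column names with synthetic features is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.neighbors import KNeighborsClassifier

# Assumed column names; their pairing with the synthetic features is arbitrary.
feature_names = [
    "distance_from_home",
    "distance_from_last_transaction",
    "ratio_to_median_purchase_price",
    "repeat_retailer",
    "used_chip",
    "used_pin_number",
    "online_order",
]
X, y = make_classification(
    n_samples=400, n_features=7, n_informative=4, n_redundant=0, random_state=2
)

# Tree ensembles expose impurity-based importances after fitting.
gb = GradientBoostingClassifier(random_state=2).fit(X, y)
gb_ranked = sorted(
    zip(feature_names, gb.feature_importances_), key=lambda p: p[1], reverse=True
)

# KNN has no native importances: shuffle each column and measure the score drop.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
perm = permutation_importance(knn, X, y, n_repeats=5, random_state=2)
knn_ranked = sorted(
    zip(feature_names, perm.importances_mean), key=lambda p: p[1], reverse=True
)
```

In practice, permutation importance is usually computed on held-out data; the training data is reused here only to keep the sketch short.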

Conclusion

In conclusion, the analysis provided valuable insights into predicting credit card fraud with machine learning models. Although our results may not have been exactly what we hoped for, we learned about valuable resources and practices that guided us along the way, including feature importance. Overall, Gradient Boosting was the most accurate at predicting fraudulent and non-fraudulent charges. We still used XGBoost to make sure the Gradient Boosting model did not overfit. As we saw in our heatmap, ratio to median purchase price had the highest correlation with whether a charge was fraudulent.
