Py-Fi-nance/KNN-for-credit-card-fraud-detection

Credit Card Fraud Detection: K-Nearest Neighbors (KNN)

Overview

This project demonstrates a simple yet effective approach to detecting fraudulent transactions using the K-Nearest Neighbors algorithm. The objective is to build a model that can accurately classify transactions in a given dataset as either fraudulent or non-fraudulent.

License: MIT

Table of Contents

  1. Dataset
  2. Implementation Steps
  3. Results
  4. Conclusion
  5. Contributing
  6. Contact Information

Dataset

This project uses the creditcard.csv dataset. It contains a set of anonymized features along with Time, Amount, and Class, where Class indicates whether a transaction is fraudulent (1) or not (0).

Implementation Steps

Data Loading and Exploration

The dataset is loaded, and an initial exploration is performed to understand its structure and content. Data source: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud/data
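
A minimal sketch of the loading and exploration step, assuming pandas is used. The helper name `load_transactions` and the three sample rows are hypothetical stand-ins so the snippet runs without downloading the file; they are not values from the real dataset.

```python
import io

import pandas as pd


def load_transactions(source):
    """Read the transactions CSV from a file path or file-like object."""
    return pd.read_csv(source)


# Tiny in-memory stand-in for creditcard.csv (hypothetical values).
sample_csv = io.StringIO(
    "Time,V1,V2,Amount,Class\n"
    "0,-1.36,-0.07,149.62,0\n"
    "1,1.19,0.27,2.69,0\n"
    "2,-2.31,1.95,0.00,1\n"
)
df = load_transactions(sample_csv)

print(df.shape)                    # number of rows and columns
print(df.dtypes)                   # column types
print(df.isna().sum())             # missing values per column
print(df["Class"].value_counts())  # class balance
```

With the real file in the working directory, the same helper would be called as `load_transactions("creditcard.csv")`.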

Data Pre-processing

Rows with NaN in the Class column are dropped, since a row without a label cannot be used for training or evaluation.
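
One way to express this step with pandas is sketched below. The four-row frame is made-up illustration data, and the cast of Class back to an integer type is an optional extra, not something the project is stated to do.

```python
import numpy as np
import pandas as pd

# Made-up rows; the second one has no label.
df = pd.DataFrame({
    "Amount": [10.0, 25.5, 3.2, 99.0],
    "Class": [0, np.nan, 1, 0],
})

# Drop only the rows whose label is missing; other columns are not inspected.
clean = df.dropna(subset=["Class"]).copy()

# Optional: a column that contained NaN is stored as float, so cast it back.
clean["Class"] = clean["Class"].astype(int)

print(len(df), "->", len(clean))
```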

Data Splitting

The dataset is split into features (X) and the target variable (y), and further divided into training and test sets.
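
The split could look like the sketch below, using scikit-learn's `train_test_split`. The 80/20 ratio, the fixed `random_state`, and the `stratify=y` argument are assumptions chosen for illustration (the README does not state them), and the data frame is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 transactions, 10 of them labelled as fraud.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Time": np.arange(100),
    "V1": rng.normal(size=100),
    "Amount": rng.uniform(1, 500, size=100),
    "Class": [1] * 10 + [0] * 90,
})

X = df.drop(columns="Class")  # features
y = df["Class"]               # target

# stratify=y keeps the fraud ratio the same in both partitions (assumed setting).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

Stratifying is a common choice when one class is rare, because a purely random split can leave the test set with very few positive examples.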

Data Scaling

Feature scaling is performed using StandardScaler to standardize the dataset.
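
Scaling matters for KNN because the algorithm compares distances, so a feature with a large numeric range would otherwise dominate. The sketch below uses small made-up arrays and fits the scaler on the training data only, then reuses those statistics for the test data; whether the project fits the scaler this way is an assumption.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up features: the second column has a much larger range than the first.
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.0, 400.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from training data
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics

print(X_train_scaled.mean(axis=0))  # close to 0 for each column
print(X_train_scaled.std(axis=0))   # close to 1 for each column
```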

Modeling

The K-Nearest Neighbors model is initialized, trained on the training data, and subsequently used to make predictions on the test data.
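
A minimal sketch of this step with scikit-learn's `KNeighborsClassifier`. The value `n_neighbors=5` is the library default and is an assumption here, since the README does not state which k was used; the two well-separated clusters are synthetic.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic training data: two well-separated clusters.
rng = np.random.default_rng(0)
X_legit = rng.normal(loc=0.0, scale=0.5, size=(40, 2))
X_fraud = rng.normal(loc=5.0, scale=0.5, size=(10, 2))
X_train = np.vstack([X_legit, X_fraud])
y_train = np.array([0] * 40 + [1] * 10)

# Each prediction is a majority vote among the 5 closest training points.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

X_new = np.array([[0.1, -0.2], [5.1, 4.8]])
y_pred = knn.predict(X_new)
print(y_pred)
```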

Evaluation

Model performance is evaluated using a classification report and accuracy score, assessing its ability to classify transactions correctly.
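
The evaluation calls can be sketched as follows with hand-written labels (not project results). Recall for the fraud class is printed as well, as an illustrative addition, to show how it can differ from overall accuracy.

```python
from sklearn.metrics import accuracy_score, classification_report, recall_score

# Hand-written labels for illustration: 8 legitimate, 2 fraudulent transactions.
y_test = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # one fraud case is missed

accuracy = accuracy_score(y_test, y_pred)
fraud_recall = recall_score(y_test, y_pred, pos_label=1)
report = classification_report(
    y_test, y_pred, target_names=["non-fraud", "fraud"], digits=4
)

print(f"accuracy:     {accuracy:.2f}")
print(f"fraud recall: {fraud_recall:.2f}")
print(report)
```

In this toy case nine of ten predictions are correct, yet only one of the two fraud cases is caught, which is why the per-class rows of the report are worth reading.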

Visualization

A confusion matrix is plotted as a heatmap, offering a visual representation of the model's performance, illustrating True Positives, True Negatives, False Positives, and False Negatives.
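
A sketch of the plot using `confusion_matrix` and plain matplotlib. The plotting library the project actually uses is not stated, so matplotlib's `imshow` is an assumption, and the labels are hand-written illustration data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line in a notebook
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# Hand-written labels for illustration.
y_test = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

fig, ax = plt.subplots(figsize=(4, 4))
im = ax.imshow(cm, cmap="Blues")
class_names = ["Non-fraud", "Fraud"]
ax.set_xticks([0, 1])
ax.set_xticklabels(class_names)
ax.set_yticks([0, 1])
ax.set_yticklabels(class_names)
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")

# Write the count inside each cell of the heatmap.
for i in range(2):
    for j in range(2):
        ax.text(j, i, str(cm[i, j]), ha="center", va="center")

fig.colorbar(im, ax=ax)
fig.tight_layout()
```

Calling `plt.show()` or `fig.savefig(...)` afterwards displays or stores the figure.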

Results

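The implemented model achieves an accuracy score of approximately 0.9996 on the test set. Because fraudulent transactions are rare in this dataset, accuracy is best read alongside the per-class precision and recall in the classification report and the counts in the confusion matrix.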

Conclusion

This K-Nearest Neighbors implementation serves as an efficient solution for credit card fraud detection, balancing simplicity with high accuracy. However, there is always room for further optimizations and enhancements to improve model adaptability and robustness, such as exploring different algorithms, tuning hyperparameters, and performing feature engineering.
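
As one possible starting point for the hyperparameter tuning mentioned above, the sketch below searches over the number of neighbors with `GridSearchCV`. It is not part of the project: the synthetic data, the candidate values of k, the F1 scoring, and the 5-fold setting are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (roughly 10% positive class).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

param_grid = {"knn__n_neighbors": [3, 5, 7, 9]}
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=5)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```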

Contributing

We welcome contributions to this project. To contribute:

  1. Fork the project.
  2. Create your feature branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

Contact Information

For any questions or inquiries, please contact support@pyfi.com with the subject line "Github Repo Q, KNN".

Full article walkthrough: https://pyfi.com/blogs/articles/k-nearest-neighbors-algorithm-for-credit-card-fraud-detection

To learn more about PyFi's award-winning Python for Finance courses, which have been trusted by top financial institutions in the United States and Canada multiple years running, visit https://www.pyfi.com