This repository contains a project focused on detecting credit card fraud using regression techniques. The goal is to create a model that can accurately predict fraudulent transactions, thereby helping financial institutions minimize losses and protect consumers.
The growing use of credit cards has brought a significant rise in credit card fraud, a critical challenge for both financial institutions and cardholders. Addressing it requires a robust machine-learning model that distinguishes fraudulent transactions from genuine ones and flags fraud in real time, so that financial losses are minimized and consumers' interests are protected. Given the increasing scale and complexity of fraud, a dependable automated solution is essential.
The main goal is a model that reliably separates legitimate from fraudulent transactions. Its success depends on how well its predictions hold up on unseen data and in real-time scenarios, so the model must be built with a thorough understanding of the underlying data and of the patterns that fraudulent transactions tend to follow.
- Data preprocessing and exploration
- Feature engineering
- Model training and evaluation
- Hyperparameter tuning
- Model deployment
- Performance metrics and visualization
To get started, clone the repository and install the required dependencies.
    git clone git@github.com:VaishnaviThakre/Credit-Card-Fraud-Detection-Regression.git
    cd Credit-Card-Fraud-Detection-Regression
    pip install -r requirements.txt
- Data Preparation: Download the dataset and place it in the data/ directory.
- Preprocessing: Run the preprocessing script to clean and prepare the data.
- Training: Train the model using the prepared dataset.
- Evaluation: Evaluate the model's performance on the test dataset.
- Prediction: Use the trained model to make predictions on new data.
The dataset used in this project is the Credit Card Fraud Detection dataset available from Kaggle. It contains credit card transactions made by European cardholders over two days in September 2013. Of the 284,807 transactions, 492 (about 0.17%) are fraudulent, so the classes are highly imbalanced.
- Time: Number of seconds elapsed between this transaction and the first transaction in the dataset.
- V1 to V28: Result of a PCA transformation to anonymize features.
- Amount: Transaction amount.
- Class: 1 for fraudulent transactions, 0 otherwise.
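As a quick sanity check on the Class column, the fraud rate can be counted directly from the CSV. The following is a minimal sketch using only the standard library; the helper name `class_balance` is hypothetical, and the in-memory sample rows are made up for illustration rather than taken from the real file.

```python
import csv
import io


def class_balance(csv_file):
    """Count legitimate (0) and fraudulent (1) rows using the Class column."""
    counts = {0: 0, 1: 0}
    for row in csv.DictReader(csv_file):
        counts[int(row["Class"])] += 1
    total = counts[0] + counts[1]
    fraud_rate = counts[1] / total if total else 0.0
    return counts, fraud_rate


# Tiny in-memory stand-in for the real dataset (values are invented).
sample = io.StringIO(
    "Time,V1,Amount,Class\n"
    "0,-1.36,149.62,0\n"
    "1,1.19,2.69,0\n"
    "2,-1.35,378.66,0\n"
    "3,-0.97,123.50,1\n"
)
counts, fraud_rate = class_balance(sample)
print(counts, f"{fraud_rate:.2%}")
```

To run it on the downloaded dataset, pass an open file handle instead of the sample, for example one opened from the data/ directory with `newline=""`. The exact file name depends on how the download was saved.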
The project employs regression techniques to predict the likelihood of a transaction being fraudulent. The models explored include:
- Linear Regression
- Ridge Regression
- Lasso Regression
- Polynomial Regression
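To illustrate how a regression model can be used for a yes/no fraud decision, the sketch below fits ridge regression to a 0/1 label and thresholds the continuous score. It is not the repository's implementation: it uses the closed-form ridge solution in NumPy, synthetic two-feature data, and a 0.5 threshold, all of which are assumptions made for the example. The function names are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    X_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - X_mean  # center features so the intercept is not penalized
    yc = y - y_mean
    n_features = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weights.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    b = y_mean - X_mean @ w
    return w, b


def predict_score(X, w, b):
    """Continuous regression output; higher means more fraud-like."""
    return X @ w + b


def predict_label(X, w, b, threshold=0.5):
    """Turn scores into 0/1 labels (the threshold is an assumption)."""
    return (predict_score(X, w, b) >= threshold).astype(int)


# Synthetic, imbalanced data: 200 legitimate rows and 20 shifted fraud rows.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 2)),
    rng.normal(4.0, 1.0, size=(20, 2)),
])
y = np.concatenate([np.zeros(200), np.ones(20)])

w, b = fit_ridge(X, y, alpha=1.0)
scores = predict_score(X, w, b)
labels = predict_label(X, w, b)
print("weights:", w, "intercept:", b)
print("flagged as fraud:", int(labels.sum()), "of", len(labels))
```

Regression scores are not bounded to the range 0 to 1, so they are best read as a ranking signal rather than a calibrated probability. The threshold would normally be tuned on validation data rather than fixed at 0.5.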
The models are evaluated using metrics such as:
- Accuracy
- Precision
- Recall
- F1 Score
- ROC-AUC
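The metrics above can be computed from first principles, which helps show why accuracy alone is misleading on imbalanced data. This is a minimal pure-Python sketch with hypothetical function names and toy inputs. It is not the project's evaluation code, which may well rely on a library instead.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC-AUC as the share of (fraud, legitimate) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# A model that never flags fraud still looks accurate on imbalanced data.
y_true = [0] * 98 + [1] * 2
y_pred = [0] * 100
metrics = classification_metrics(y_true, y_pred)
print(metrics)

# ROC-AUC is computed from continuous scores, not thresholded labels.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```

In the first example accuracy is 0.98 while recall is 0, because both fraudulent transactions are missed. The pairwise ROC-AUC loop is quadratic in the number of rows, so it is meant for illustration on small samples only.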
Detailed results, performance metrics, and visualizations for the trained models are stored in the notebooks directory.
Contributions are welcome! Please open an issue to discuss what you would like to change or contribute, and submit a pull request.
- Fork the repository
- Create a new branch (git checkout -b feature-branch)
- Commit your changes (git commit -m 'Add some feature')
- Push to the branch (git push origin feature-branch)
- Open a pull request
This project is licensed under the MIT License - see the LICENSE file for details.