# Credit Card Fraud Detector

We will be exploring a dataset of credit card transactions. The goal of this project is to build a model that can detect fraudulent credit card transactions. The dataset used for this project was found [here](https://www.kaggle.com/mlg-ulb/creditcardfraud).

In [None]:
# import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [None]:
# read the data
transactions = pd.read_csv('creditcard.csv')

### Exploring the Data

In [None]:
transactions.head()

The columns of this dataset seem to have strange names. The only columns with known meanings are "Time," "Amount," and "Class." Looking at the Kaggle page for this dataset reveals the reasoning behind the odd naming convention. All of the "V" columns are principal components obtained by applying Principal Component Analysis (PCA) to the original features. This saves us the trouble of having to perform PCA ourselves. Unfortunately, we are unable to see what the original features were due to confidentiality issues. The "Time" feature contains the seconds elapsed between each transaction and the first transaction in the dataset. The "Amount" feature contains the actual transaction amount. "Class" is the label, where 0 indicates no fraud and 1 indicates fraud.

In [None]:
transactions.info()

In [None]:
transactions.describe()

Since the "Time" feature is relative to the first transaction in the dataset, the summary statistics of that column aren't very informative. We should be looking at "Amount." Let's start visualizing the data to form a general opinion about it. Due to the hidden nature of the "V" columns, let's only consider the named columns.

In [None]:
# histplot replaces distplot, which has been deprecated since seaborn 0.11
sns.histplot(transactions['Amount'], bins=50)

The distribution of the transaction amounts makes sense. Most day-to-day transactions involve fairly low amounts. Although it is difficult to tell from the plot, there are transaction amounts up to roughly 26,000. Since there are very few of these transactions, their bars are drowned out by the very large number of low-value transactions.

In [None]:
# histplot replaces distplot, which has been deprecated since seaborn 0.11
sns.histplot(transactions.loc[transactions['Class'] == 1, 'Amount'], bins=50)

Looking at just the fraudulent transactions, we can see that most of these transactions are fairly low amounts.

In [None]:
sns.countplot(x='Class', data=transactions)

In this dataset, there is an overwhelmingly large number of non-fraudulent transactions in comparison to fraudulent transactions. While this is expected if this dataset is representative of all transactions, we will end up training our model on an imbalanced dataset. We should keep this in mind when choosing a model.
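To put a number on the imbalance rather than reading it off the bar chart, a small helper can report the count and share of each class. This is a minimal sketch: `class_balance` is a hypothetical helper, and the labels below are made up for illustration, not taken from the dataset.

```python
from collections import Counter


def class_balance(labels):
    """Return {label: (count, fraction of all rows)} for an iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Made-up labels: 998 legitimate rows (0) and 2 fraudulent rows (1).
toy_labels = [0] * 998 + [1] * 2
balance = class_balance(toy_labels)

for label, (count, fraction) in sorted(balance.items()):
    print(f"class {label}: {count} rows ({fraction:.2%})")
```

In the notebook, the same helper could be called as `class_balance(transactions['Class'])`; pandas also offers `transactions['Class'].value_counts(normalize=True)` for the fractions alone.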

### Building the Model

Now let's actually pick and train a model. Since our output is categorical, we'll choose a classification algorithm. Let's use logistic regression because it is a relatively simple model that is less prone to overfitting than more flexible models, and it works well with a binary output (such as this case). Before we actually build the model, let's standardize the "Time" and "Amount" columns so the model can converge faster.
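For reference, standardizing a column means subtracting its mean and dividing by its standard deviation, so the result has mean 0 and standard deviation 1. The sketch below does this by hand on made-up amounts. `standardize` is a hypothetical helper, not part of scikit-learn; it uses the population standard deviation, which is also what `StandardScaler` uses.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1.

    Assumes the values are not all identical, otherwise the
    standard deviation is zero and the division fails.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Made-up transaction amounts with one large outlier.
amounts = [1.0, 2.0, 3.0, 4.0, 10.0]
scaled = standardize(amounts)

print([round(v, 3) for v in scaled])
```

Putting "Time" and "Amount" on a scale comparable to the "V" columns is what helps the solver converge in fewer iterations.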

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.compose import ColumnTransformer

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

In [None]:
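from sklearn.pipeline import make_pipeline

ct = ColumnTransformer([
    ('ss', StandardScaler(), ['Time', 'Amount'])
], remainder='passthrough')

X = transactions.drop(['Class'], axis=1)

In [None]:
y = transactions['Class']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

In [None]:
# chain the scaler and the classifier so the scaling is actually applied
# (the result of ct.fit_transform(X) was previously discarded) and so the
# scaler is fit on the training rows only
model = make_pipeline(ct, LogisticRegression(max_iter=300))
model.fit(X_train, y_train)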

In [None]:
predictions = model.predict(X_test)

Let's see how well the model performed.

In [None]:
from sklearn.metrics import classification_report

In [None]:
print(classification_report(y_test, predictions))

As expected with the imbalanced dataset, the model predicted the non-fraudulent cases perfectly while the fraudulent predictions leave room for improvement. To combat the imbalanced dataset, let's use downsampling to balance out the training set.
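One thing to keep in mind when resampling: if the whole dataset is balanced before it is split, the test set ends up balanced too, so its scores no longer reflect the real class mix. A possible variant, sketched below under that assumption, splits first and downsamples only the training rows. `split_then_downsample` is a hypothetical helper, and the toy frame is made up to stand in for `transactions`.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.utils import resample


def split_then_downsample(df, label_col="Class", test_size=0.33, seed=0):
    """Split first, then downsample the majority class in the training rows only."""
    # Stratify so both splits keep the original fraud rate.
    train, test = train_test_split(
        df, test_size=test_size, stratify=df[label_col], random_state=seed
    )
    majority = train[train[label_col] == 0]
    minority = train[train[label_col] == 1]
    # Draw as many majority rows as there are minority rows, without replacement.
    majority_down = resample(
        majority, replace=False, n_samples=len(minority), random_state=seed
    )
    train_balanced = pd.concat([majority_down, minority])
    return train_balanced, test


# Toy stand-in for `transactions` (made-up values, 10% "fraud").
toy = pd.DataFrame({"Amount": range(100), "Class": [1] * 10 + [0] * 90})
train_balanced, test = split_then_downsample(toy)

print(train_balanced["Class"].value_counts())
print(test["Class"].value_counts())
```

With this arrangement the test rows are never seen during training and keep the original imbalance, so the classification report on them is closer to what the model would face in practice.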

In [None]:
from sklearn.utils import resample

In [None]:
# separate the classes
trans_maj = transactions[transactions['Class'] == 0]
trans_min = transactions[transactions['Class'] == 1]

# downsample the majority class
trans_maj_down = resample(trans_maj, replace=False, n_samples=len(trans_min))

# combine the downsampled majority class with the original minority class
trans_balanced = pd.concat([trans_maj_down, trans_min])

Alright, let's standardize the values then build the model again.

In [None]:
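X = trans_balanced.drop(['Class'], axis=1)
y = trans_balanced['Class']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

In [None]:
from sklearn.pipeline import make_pipeline

# chain the scaler and the classifier so the same scaling is applied
# at fit time and at prediction time
model = make_pipeline(ct, LogisticRegression(max_iter=200))
model.fit(X_train, y_train)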

In [None]:
predictions = model.predict(X_test)

Hopefully, the model did better this time.

In [None]:
print(classification_report(y_test, predictions))

The predictions for the non-fraudulent cases have become worse, but you could argue that they became more "realistic." What we should be looking at is the fraudulent cases. It seems that our model has improved a lot in that regard. Let's see how the model performs over the entire dataset now.

In [None]:
predictions = model.predict(transactions.drop(['Class'], axis=1))

In [None]:
print(classification_report(transactions['Class'], predictions))

...This is not an ideal result. Every metric besides the precision and f1-score for the fraudulent cases has improved. Unfortunately, the precision has dropped significantly. A low precision means that a large share of the transactions our model flags as fraudulent are actually legitimate. This could be due to the fact that we downsampled the valid transactions: the model was trained on a balanced set, but in the full dataset legitimate transactions vastly outnumber fraudulent ones, so even a small false positive rate produces far more false alarms than true detections. Our model may also have correlated certain features with being fraudulent when it shouldn't have. Instead of downsampling, let's try upsampling.
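A quick back-of-the-envelope calculation shows how precision can look fine on a balanced test set and still collapse on the full dataset. The class counts below are the ones reported on the dataset's Kaggle page; the recall and false positive rate are hypothetical values chosen only for illustration, not results from this notebook.

```python
# Class counts reported on the dataset's Kaggle page.
n_fraud = 492
n_total = 284_807
n_legit = n_total - n_fraud

# Hypothetical operating point, chosen only for illustration.
recall = 0.90               # fraction of fraud that gets flagged
false_positive_rate = 0.03  # fraction of legitimate transactions flagged


def precision_at(n_pos, n_neg, recall, fpr):
    """Precision implied by a recall and false positive rate at given class counts."""
    true_pos = recall * n_pos
    false_pos = fpr * n_neg
    return true_pos / (true_pos + false_pos)


# Balanced test set (equal class counts) versus the full dataset.
balanced = precision_at(n_fraud, n_fraud, recall, false_positive_rate)
full = precision_at(n_fraud, n_legit, recall, false_positive_rate)
print(f"balanced: {balanced:.3f}, full: {full:.3f}")
```

Under these assumed rates, the same classifier goes from a precision of roughly 0.97 on balanced data to roughly 0.05 on the full data, purely because there are so many more legitimate transactions to misflag.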

In [None]:
# separate the classes
trans_maj = transactions[transactions['Class'] == 0]
trans_min = transactions[transactions['Class'] == 1]

# upsample the minority class
trans_min_up = resample(trans_min, replace=True, n_samples=len(trans_maj))

# combine the upsampled minority class with the original majority class
trans_balanced = pd.concat([trans_min_up, trans_maj])

In [None]:
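X = trans_balanced.drop(['Class'], axis=1)
y = trans_balanced['Class']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

In [None]:
from sklearn.pipeline import make_pipeline

# chain the scaler and the classifier so the same scaling is applied
# at fit time and at prediction time
model = make_pipeline(ct, LogisticRegression(max_iter=200))
model.fit(X_train, y_train)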

In [None]:
predictions = model.predict(X_test)

In [None]:
print(classification_report(y_test, predictions))

The model seems to perform similarly on the upsampled dataset and the downsampled dataset. Next comes the real test: the original dataset.

In [None]:
predictions = model.predict(transactions.drop(['Class'], axis=1))

In [None]:
print(classification_report(transactions['Class'], predictions))

This model performed about the same as the downsampled model. Before we think about trying more class balancing methods, let's think about our actual metrics. Our precision is abysmal, but the recall isn't too bad. Remember that recall measures what fraction of the actual fraud cases the model identified. In the case of credit card fraud, wouldn't that metric be more important than precision? Remember that precision measures what fraction of the transactions flagged as fraud were actually fraudulent. It is better to investigate false positives than to miss real fraud (false negatives).
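To make the two definitions concrete, here is a minimal sketch that computes precision and recall for the fraud class from raw predictions. The labels are made up, and `confusion_counts` and `precision_recall` are hypothetical helpers written for illustration; in the notebook, `classification_report` reports the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, treating 1 (fraud) as positive."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up example: 4 fraud cases, 6 legitimate ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here three of the four fraud cases are caught (recall 0.75), while two of the five flagged transactions are false alarms (precision 0.60).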

### Conclusion

In this project, we tried to build a model that detects credit card fraud. We learned that the data was extremely imbalanced, so we trained our model on the original dataset, a downsampled dataset, and an upsampled dataset. We chose a logistic regression algorithm for our model. Although our model performed horribly with respect to precision, the recall was fairly decent. For credit card fraud, recall is a better metric than precision, so we can say that our model was fairly successful in detecting credit card fraud.

# Acknowledgements
*This acknowledgements section is taken directly from the Kaggle page for the dataset*

The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection.
More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project.


Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015

Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications,41,10,4915-4928,2014, Pergamon

Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems,29,8,3784-3797,2018,IEEE

Dal Pozzolo, Andrea Adaptive Machine learning for credit card fraud detection ULB MLG PhD thesis (supervised by G. Bontempi)

Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion,41, 182-194,2018,Elsevier

Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5,4,285-300,2018,Springer International Publishing

Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection, INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019

Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection Information Sciences, 2019