# Credit Card Fraud Detection Using Daimensions

In this notebook, we use a dataset from Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles). The dataset has 30 attribute columns that describe a credit card transaction and one target column that indicates whether the transaction is fraudulent. The dataset can be found on Kaggle:
https://www.kaggle.com/mlg-ulb/creditcardfraud

Below is a sample of the data. All of the features that start with "V" are the result of a PCA transformation on the sensitive data relevant to the transaction. We are trying to predict the "Class" column, which is labeled "1" for fraudulent transactions and "0" for regular ones. The dataset is also highly unbalanced: only 0.17% of the transactions are fraudulent.

In [1]:
! head creditcard.csv

"Time","V1","V2","V3","V4","V5","V6","V7","V8","V9","V10","V11","V12","V13","V14","V15","V16","V17","V18","V19","V20","V21","V22","V23","V24","V25","V26","V27","V28","Amount","Class"
0,-1.3598071336738,-0.0727811733098497,2.53634673796914,1.37815522427443,-0.338320769942518,0.462387777762292,0.239598554061257,0.0986979012610507,0.363786969611213,0.0907941719789316,-0.551599533260813,-0.617800855762348,-0.991389847235408,-0.311169353699879,1.46817697209427,-0.470400525259478,0.207971241929242,0.0257905801985591,0.403992960255733,0.251412098239705,-0.018306777944153,0.277837575558899,-0.110473910188767,0.0669280749146731,0.128539358273528,-0.189114843888824,0.133558376740387,-0.0210530534538215,149.62,"0"
0,1.19185711131486,0.26615071205963,0.16648011335321,0.448154078460911,0.0600176492822243,-0.0823608088155687,-0.0788029833323113,0.0851016549148104,-0.255425128109186,-0.166974414004614,1.61272666105479,1.06523531137287,0.48909501589608,-0.143772296441519,0.635558093258208,0.4639170410

For this dataset, our objective is to understand which attributes are most important and then build a model that detects credit card fraud. Daimensions has an option to enable attribute ranking, which is extremely helpful in finding the features that are most correlated with the target class.
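
This notebook does not describe how Daimensions ranks attributes internally. As a rough stand-in for the idea of "features most correlated with the target class", the sketch below ranks columns by the absolute Pearson correlation with the target. The function names and the toy values are assumptions made for illustration; they are not part of Daimensions and are not taken from creditcard.csv.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_by_correlation(columns, target):
    """Return column names sorted from strongest to weakest absolute correlation."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy values for illustration only (NOT from creditcard.csv)
toy_columns = {
    "signal": [0.1, 0.2, 0.1, 2.9, 3.1, 0.0],    # tracks the target closely
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],  # unrelated to the target
    "constant": [5.0] * 6,                       # carries no information
}
toy_target = [0, 0, 0, 1, 1, 0]

ranking = rank_by_correlation(toy_columns, toy_target)
print(ranking[0])  # the column that follows the target most closely
```

A simple correlation score like this only captures linear, one-column-at-a-time relationships, so it should be read as an intuition aid rather than a replacement for the tool's own ranking.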

## 1. Get Measurements
Before we build a predictor for the dataset, it is wise to measure it. Measuring lets us pick the most suitable model type without having to build one first. For more information about how to use Daimensions and why we want to measure our data beforehand, check out the Titanic notebook.

In [9]:
! ./btc creditcard.csv -measureonly

Brainome Daimensions(tm) 0.99 Copyright (c) 2019, 2020 by Brainome, Inc. All Rights Reserved.
Licensed to: Alexander Makhratchev
Expiration date: 2021-04-30 (60 days left)
Number of threads: 1
Maximum file size: 30720MB
Running locally.

Data:
Number of instances: 284807
Number of attributes: 30
Number of classes: 2
Class balance: 99.83% 0.17%

Learnability:
Best guess accuracy: 99.83%
Capacity progression: [8, 9, 10, 10, 11, 11]
Decision Tree: 938 parameters
Estimated Memory Equivalent Capacity for Neural Networks: 321 parameters

Risk that model needs to overfit for 100% accuracy...
using Decision Tree: 95.49%
using Neural Networks: 100.00%

Expected Generalization...
using Decision Tree: 5.57 bits/bit
using a Neural Network: 16.28 bits/bit

Recommendations:
Note: Maybe enough data to generalize. [yellow]
Time estimate for a Neural Network:
Estimated time to architect: 0d 2h 43m 53s

Estimated time to prime (subject to change after model architecting): 0d 0h 2m 13s

Time estimate for
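
The "Best guess accuracy" of 99.83% in the output matches the share of the majority class, so it appears to be the accuracy of always predicting "0". The sketch below reproduces that number under this assumption. The helper name is hypothetical, and the labels are synthetic, chosen only to mimic the reported 99.83% / 0.17% class balance.

```python
from collections import Counter

def best_guess_accuracy(labels):
    """Return (majority_label, fraction of instances that carry it)."""
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    return majority_label, majority_count / len(labels)

# Synthetic labels mimicking the reported class balance (NOT the real file)
labels = ["0"] * 9983 + ["1"] * 17

majority_label, accuracy = best_guess_accuracy(labels)
print(majority_label, f"{accuracy:.2%}")  # 0 99.83%
```

A predictor that scores 99.83% this way never flags a single fraudulent transaction, which is why accuracy alone is a poor yardstick for this dataset.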

## 2. Neural Network with -O 
From the Daimensions measurements, we can see that the best model for this dataset would be a neural network: it has the highest expected generalization (16.28 bits/bit versus 5.57 for the decision tree) and the lowest memory equivalent capacity (321 parameters versus 938). However, the neural network also has a higher risk of overfitting (100.00% versus 95.49%). Because the dataset is so unbalanced, we will use the -O command line option to optimize the true positive rate (TPR). After the -O, we specify the label to focus on, which in our case is the fraudulent class, "1".

In [5]:
! ./btc creditcard.csv -f NN -O 1 --yes


Brainome Daimensions(tm) 0.99 Copyright (c) 2019 - 2021 by Brainome, Inc. All Rights Reserved.
Licensed to:              Alexander Makhratchev  (Evaluation)
Expiration Date:          2021-04-30   58 days left
Number of Threads:        1
Maximum File Size:        30 GB
Maximum Instances:        unlimited
Maximum Attributes:       unlimited
Maximum Classes:          unlimited
Connected to:             daimensions.brainome.ai  (local execution)



Command:
    btc creditcard.csv -f NN -O 1 --yes

Start Time:                 03/03/2021, 04:25


Data:
    Input:                      creditcard.csv
    Target Column:              Class
    Number of instances:        284807
    Number of attributes:       30
    Number of classes:          2
    Class Balance:              0: 99.83%, 1: 0.17%

Learnability:
    Best guess accuracy:          99.83%
    Data Sufficiency:            Maybe enough data to generalize. [yellow]

Capacity Progression:            at [ 5%, 10%, 20%, 40%, 80%, 100% ]


The neural network had very poor overall accuracy on the validation set. However, its true positive rate is 100%, meaning that every fraudulent transaction was identified.
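
To see how a model can combine low accuracy with a perfect true positive rate, here is a minimal sketch that derives both numbers from confusion-matrix counts. The counts are hypothetical and are not the results of the run above; the function name is also an assumption for illustration.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute accuracy, true positive rate, and false positive rate from counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,  # share of frauds caught
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,  # share of regular charges flagged
    }

# Hypothetical counts: every fraud is caught, but many regular charges are flagged too
metrics = binary_metrics(tp=17, fn=0, fp=5000, tn=4983)
print(metrics["accuracy"], metrics["tpr"])  # 0.5 1.0
```

Optimizing for the TPR of the rare class trades false alarms for missed frauds, so the overall accuracy can drop well below best guess even when no fraud slips through.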

## 3. Decision Tree with -O
We can also try a decision tree on the dataset by replacing NN with DT in the -f option. Here we also add -rank to enable attribute ranking.

In [4]:
! ./btc creditcard.csv -rank -f DT -O 1 --yes

Brainome Daimensions(tm) 0.99 Copyright (c) 2019, 2020 by Brainome, Inc. All Rights Reserved.
Licensed to: Alexander Makhratchev
Expiration date: 2021-04-30 (61 days left)
Number of threads: 1
Maximum file size: 30720MB
Running locally.

Running btc will overwrite existing a.py. OK? [y/N] yes

Attribute Ranking:
Using only the important columns: V17 V14 V10 V9 V25 
Risk of coincidental column correlation: <0.001%

Data:
Number of instances: 284807
Number of attributes: 5
Number of classes: 2
Class balance: 99.83% 0.17%

Learnability:
Best guess accuracy: 99.83%
Capacity progression: [8, 9, 10, 12, 12, 13]
Decision Tree: 289 parameters
Estimated Memory Equivalent Capacity for Neural Networks: 64 parameters

Risk that model needs to overfit for 100% accuracy...
using Decision Tree: 29.42%
using Neural Networks: 90.14%

Expected Generalization...
using Decision Tree: 18.08 bits/bit
using a Neural Network: 81.63 bits/bit

Recommendations:
Time estimate for Decision Tree:
Estimated time to 

The decision tree was able to identify all of the fraudulent charges, a true positive rate of 100%. Attribute ranking reduced the input from 30 attributes to 5 (V17, V14, V10, V9, V25), which significantly reduces the noise in the dataset and improves accuracy.
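
If we wanted to reuse the ranked columns outside of btc, one option is to cut the CSV down to those columns plus the target. The sketch below does this with Python's standard csv module. It is not how btc handles the columns internally; the helper name is hypothetical and the sample rows are made-up values, not rows from creditcard.csv.

```python
import csv
import io

# Columns reported by the attribute ranking in the output above
IMPORTANT = ["V17", "V14", "V10", "V9", "V25"]

def keep_columns(csv_text, keep, target="Class"):
    """Return CSV text containing only the `keep` columns followed by the target."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=keep + [target],
        extrasaction="ignore",  # silently drop every column not listed
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()

# Made-up rows for illustration only
sample = (
    "Time,V9,V10,V14,V17,V25,Amount,Class\n"
    "0,0.1,0.2,0.3,0.4,0.5,10.0,0\n"
    "1,1.1,1.2,1.3,1.4,1.5,20.0,1\n"
)

reduced = keep_columns(sample, IMPORTANT)
print(reduced)
```

For the real file, the same function could be fed the contents of creditcard.csv read from disk.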

## 4. Neural Network with -balance
Now we will try the -balance option, which optimizes the true positive rate for every class instead of for one specific class.

In [7]:
! ./btc creditcard.csv -f NN -balance --yes

Brainome Daimensions(tm) 0.99 Copyright (c) 2019, 2020 by Brainome, Inc. All Rights Reserved.
Licensed to: Alexander Makhratchev
Expiration date: 2021-04-30 (60 days left)
Number of threads: 1
Maximum file size: 30720MB
Running locally.

Running btc will overwrite existing a.py. OK? [y/N] yes
Data:
Number of instances: 284807
Number of attributes: 30
Number of classes: 2
Class balance: 99.83% 0.17%

Learnability:
Best guess accuracy: 99.83%
Capacity progression: [8, 9, 10, 10, 11, 11]
Decision Tree: 938 parameters
Estimated Memory Equivalent Capacity for Neural Networks: 321 parameters

Risk that model needs to overfit for 100% accuracy...
using Decision Tree: 95.49%
using Neural Networks: 100.00%

Expected Generalization...
using Decision Tree: 5.57 bits/bit
using a Neural Network: 16.28 bits/bit

Recommendations:
Note: Maybe enough data to generalize. [yellow]
Time estimate for a Neural Network:
Estimated time to architect: 0d 0h 12m 48s

Estimated time to prime (subject to change af

Unfortunately, our model performs slightly worse than best guess, which illustrates one of the challenges of working with highly unbalanced datasets.
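
One way to judge a model when every class matters is to look at the true positive rate of each class and average them, often called balanced accuracy. Whether -balance optimizes exactly this quantity is an assumption on our part; the sketch below only illustrates the metric, using synthetic labels that mimic the 99.83% / 0.17% class balance.

```python
from collections import defaultdict

def per_class_tpr(y_true, y_pred):
    """Return {label: fraction of that label's instances predicted correctly}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}

def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class true positive rates."""
    rates = per_class_tpr(y_true, y_pred)
    return sum(rates.values()) / len(rates)

# Synthetic labels (NOT the real file) and a predictor that always says "0"
y_true = ["0"] * 9983 + ["1"] * 17
always_zero = ["0"] * len(y_true)

print(per_class_tpr(y_true, always_zero))      # {'0': 1.0, '1': 0.0}
print(balanced_accuracy(y_true, always_zero))  # 0.5
```

Under this metric the best-guess predictor scores only 0.5 despite its 99.83% accuracy, so a model that falls slightly below best guess in accuracy can still be the more useful fraud detector.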