# EvalML Fraud Detection Demo:
This demo shows how to use EvalML to optimize models against a custom objective that measures realized business value. The model takes in credit card transaction data and decides whether each transaction is fraudulent.

Data: https://www.kaggle.com/c/ieee-fraud-detection/

In [1]:
import os

import evalml
import numpy as np
import pandas as pd

In [2]:
%%time 
train_identity = pd.read_csv('https://featuretools-static.s3.amazonaws.com/evalml/IEEE-CIS+Fraud+Detection/train_identity.csv')
train_transaction = pd.read_csv('https://featuretools-static.s3.amazonaws.com/evalml/IEEE-CIS+Fraud+Detection/train_transaction.csv')

CPU times: user 27.1 s, sys: 12.3 s, total: 39.4 s
Wall time: 1min 38s


In [3]:
display(train_identity.head())
display(train_transaction.head())

Unnamed: 0,TransactionID,id_01,id_02,id_03,id_04,id_05,id_06,id_07,id_08,id_09,...,id_31,id_32,id_33,id_34,id_35,id_36,id_37,id_38,DeviceType,DeviceInfo
0,2987004,0.0,70787.0,,,,,,,,...,samsung browser 6.2,32.0,2220x1080,match_status:2,T,F,T,T,mobile,SAMSUNG SM-G892A Build/NRD90M
1,2987008,-5.0,98945.0,,,0.0,-5.0,,,,...,mobile safari 11.0,32.0,1334x750,match_status:1,T,F,F,T,mobile,iOS Device
2,2987010,-5.0,191631.0,0.0,0.0,0.0,0.0,,,0.0,...,chrome 62.0,,,,F,F,T,T,desktop,Windows
3,2987011,-5.0,221832.0,,,0.0,-6.0,,,,...,chrome 62.0,,,,F,F,T,T,desktop,
4,2987016,0.0,7460.0,0.0,0.0,1.0,0.0,,,0.0,...,chrome 62.0,24.0,1280x800,match_status:2,T,F,T,T,desktop,MacOS


Unnamed: 0,TransactionID,isFraud,TransactionDT,TransactionAmt,ProductCD,card1,card2,card3,card4,card5,...,V330,V331,V332,V333,V334,V335,V336,V337,V338,V339
0,2987000,0,86400,68.5,W,13926,,150.0,discover,142.0,...,,,,,,,,,,
1,2987001,0,86401,29.0,W,2755,404.0,150.0,mastercard,102.0,...,,,,,,,,,,
2,2987002,0,86469,59.0,W,4663,490.0,150.0,visa,166.0,...,,,,,,,,,,
3,2987003,0,86499,50.0,W,18132,567.0,150.0,mastercard,117.0,...,,,,,,,,,,
4,2987004,0,86506,50.0,H,4497,514.0,150.0,mastercard,102.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Merge dataframes:

The identity and transaction data have a one-to-one relationship, so we can merge the two dataframes on the `TransactionID` column. The default inner join keeps only the transactions that have a matching identity row.

In [4]:
train_df = train_transaction.merge(train_identity)

# select sample size here! `frac=1.0` may take up to a couple hours to finish!
train_sample = train_df.sample(frac=0.65, random_state=1)
X_train = train_sample.drop('isFraud', axis=1)
y_train = train_sample['isFraud']
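
The one-to-one assumption can be checked at merge time. The following is a minimal sketch on toy frames (the values are made up for illustration) that uses the `validate` argument of pandas' `merge`, which raises a `MergeError` if a key is duplicated on either side:

```python
import pandas as pd

# Toy stand-ins for the two tables; the values are illustrative only.
transactions = pd.DataFrame({
    "TransactionID": [1, 2, 3],
    "TransactionAmt": [68.5, 29.0, 59.0],
})
identity = pd.DataFrame({
    "TransactionID": [1, 3],
    "DeviceType": ["mobile", "desktop"],
})

# validate="one_to_one" raises pandas.errors.MergeError when a key repeats
# on either side; the default inner join keeps only IDs present in both frames.
merged = transactions.merge(identity, on="TransactionID", validate="one_to_one")
print(merged)
```

In this toy example the inner join drops transaction 2, which has no identity row.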

## Encode Categorical Variables:
Some machine learning models cannot consume categorical variables directly, so here we encode them as numerical dummy variables.

In [5]:
cat_cols = X_train.select_dtypes(include=['object']).columns

In [6]:
# encode categorical features
X_train = pd.get_dummies(X_train, columns=cat_cols)
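
To see what this encoding produces, here is a small sketch on a toy frame (the column values are illustrative, and the text column is created with an explicit `object` dtype so that `select_dtypes` picks it up). Each categorical column is replaced by one indicator column per distinct value:

```python
import pandas as pd

# Toy frame with one numeric and one categorical (object-dtype) column.
toy = pd.DataFrame({
    "TransactionAmt": [68.5, 29.0, 59.0],
    "card4": pd.Series(["discover", "mastercard", "visa"], dtype=object),
})

# Same two steps as in the cells above: find object columns, then dummy-encode them.
cat_cols = toy.select_dtypes(include=["object"]).columns
encoded = pd.get_dummies(toy, columns=cat_cols)
print(list(encoded.columns))
```

Columns with many distinct values (such as device strings) produce many indicator columns, which is one reason the search can take a long time on the full data.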

In [7]:
# hold out 80% of the sampled rows for final scoring; the search runs on the remaining 20%
X_train, X_holdout, y_train, y_holdout = evalml.preprocessing.split_data(X_train, y_train, test_size=.8, random_state=0)

## Model Training With AUC
Here we use a traditional classification objective, AUC, to automatically search for the best model. Further down, we repeat the search with a custom fraud objective and compare the results.

In [8]:
clf = evalml.AutoClassifier(objective="AUC",
                            model_types=['linear_model'],
                            max_pipelines=10)
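
For intuition about what the `AUC` objective rewards: AUC equals the probability that a randomly chosen fraudulent transaction receives a higher score than a randomly chosen legitimate one. A minimal pure-Python sketch follows; the helper `pairwise_auc` and the toy scores are illustrative and are not part of EvalML:

```python
def pairwise_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels (1 = fraud) and model scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = pairwise_auc(y_true, scores)
print(auc)
```

Note that this measure depends only on how transactions are ranked, not on how much money each one involves.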

### After fitting our models, we can display the rankings of all the models and also score the holdout data with the best model

In [9]:
%%time
# fit using AutoClassifier
clf.fit(X_train, y_train)

*****************************
* Beginning pipeline search *
*****************************

Optimizing for AUC. Greater score is better.

Searching up to 10 pipelines. No time limit is set. Set one using max_time parameter.

Possible model types: linear_model

Testing LogisticRegression w/ imputation + scaling: 100%|██████████| 10/10 [26:40<00:00, 160.07s/it]

✔ Optimization finished
CPU times: user 2min 22s, sys: 53.8 s, total: 3min 16s
Wall time: 26min 41s


In [10]:
clf.rankings

Unnamed: 0,id,pipeline_name,score,high_variance_cv,parameters
0,2,LogisticRegressionPipeline,0.884003,False,"{'penalty': 'l2', 'C': 0.5765626434012575, 'im..."
1,9,LogisticRegressionPipeline,0.880749,False,"{'penalty': 'l2', 'C': 1.0680169958060437, 'im..."
2,5,LogisticRegressionPipeline,0.872452,False,"{'penalty': 'l2', 'C': 3.6887329830070748, 'im..."
3,7,LogisticRegressionPipeline,0.870912,False,"{'penalty': 'l2', 'C': 5.209570020716537, 'imp..."
4,8,LogisticRegressionPipeline,0.870481,False,"{'penalty': 'l2', 'C': 5.824377722830321, 'imp..."
5,1,LogisticRegressionPipeline,0.870036,False,"{'penalty': 'l2', 'C': 6.239401330891865, 'imp..."
6,0,LogisticRegressionPipeline,0.8695,False,"{'penalty': 'l2', 'C': 8.444214828324364, 'imp..."
7,3,LogisticRegressionPipeline,0.869403,False,"{'penalty': 'l2', 'C': 8.123565600467177, 'imp..."
8,6,LogisticRegressionPipeline,0.869102,False,"{'penalty': 'l2', 'C': 8.702171711000782, 'imp..."
9,4,LogisticRegressionPipeline,0.868824,False,"{'penalty': 'l2', 'C': 8.362426847738403, 'imp..."


In [11]:
pipeline = clf.best_pipeline
print("Model Score: {}".format(pipeline.score(X_holdout, y_holdout)))

Model Score: 0.8814915802575387


## Custom Objective:

Here we use a custom objective function built into EvalML for fraud detection. It lets us define how the model is trained so that it provides the most realized business value. Below we specify that `50%` of our customers will retry a declined transaction, that we earn a `2%` fee on each transaction, and that we will be unable to collect `100%` of the amount of fraudulent transactions. The model chosen will therefore best fit our business needs.

In [12]:
fraud_objective = evalml.objectives.FraudCost(
    retry_percentage=.5,
    interchange_fee=.02,
    fraud_payout_percentage=1.0,
    amount_col='TransactionAmt'  # column in data that contains the amount of the transaction
)

clf_fraud = evalml.AutoClassifier(objective=fraud_objective,
                                  model_types=['linear_model'],
                                  max_pipelines=10)
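
To make the three parameters concrete, here is a rough sketch of how such a cost could be computed. The function `fraud_cost_fraction` is hypothetical and written only for illustration. It assumes that a missed fraud costs the uncollectable share of the amount, that a wrongly declined transaction costs the fee whenever the customer does not retry, and that the total is reported as a fraction of the processed amount. EvalML's actual implementation may differ.

```python
def fraud_cost_fraction(y_true, y_pred, amounts,
                        retry_percentage=0.5,
                        interchange_fee=0.02,
                        fraud_payout_percentage=1.0):
    """Hypothetical cost: money lost divided by total amount processed."""
    loss = 0.0
    for actual, predicted, amount in zip(y_true, y_pred, amounts):
        if actual == 1 and predicted == 0:
            # Missed fraud: we absorb the uncollectable share of the amount.
            loss += amount * fraud_payout_percentage
        elif actual == 0 and predicted == 1:
            # Wrongly declined: the fee is lost unless the customer retries.
            loss += amount * (1 - retry_percentage) * interchange_fee
    return loss / sum(amounts)


# Toy data: one missed fraud (200.0) and one wrongly declined purchase (50.0).
cost = fraud_cost_fraction(
    y_true=[0, 1, 0, 1],
    y_pred=[0, 0, 1, 1],
    amounts=[100.0, 200.0, 50.0, 400.0],
)
print(round(cost, 4))
```

Under these assumptions a missed fraud is far more expensive than a wrongly declined purchase of the same size, which is why a cost-aware objective can prefer a different model than AUC does.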

In [13]:
%%time
# fit using AutoClassifier
clf_fraud.fit(X_train, y_train)

*****************************
* Beginning pipeline search *
*****************************

Optimizing for Fraud Cost. Lower score is better.

Searching up to 10 pipelines. No time limit is set. Set one using max_time parameter.

Possible model types: linear_model

Testing LogisticRegression w/ imputation + scaling: 100%|██████████| 10/10 [18:42<00:00, 112.20s/it]

✔ Optimization finished
CPU times: user 2min 8s, sys: 41.8 s, total: 2min 49s
Wall time: 18min 43s


### Again we can rank our models and see the performance on our holdout set. This time, however, the score is the money lost to fraud, reported as a fraction of the total transaction amount, so a lower score is better.

In [14]:
clf_fraud.rankings

Unnamed: 0,id,pipeline_name,score,high_variance_cv,parameters
0,2,LogisticRegressionPipeline,0.009309,False,"{'penalty': 'l2', 'C': 0.5765626434012575, 'im..."
1,7,LogisticRegressionPipeline,0.009349,False,"{'penalty': 'l2', 'C': 5.209570020716537, 'imp..."
2,8,LogisticRegressionPipeline,0.009468,False,"{'penalty': 'l2', 'C': 5.824377722830321, 'imp..."
3,9,LogisticRegressionPipeline,0.009713,False,"{'penalty': 'l2', 'C': 1.0680169958060437, 'im..."
4,5,LogisticRegressionPipeline,0.009808,False,"{'penalty': 'l2', 'C': 3.6887329830070748, 'im..."
5,4,LogisticRegressionPipeline,0.00984,False,"{'penalty': 'l2', 'C': 8.362426847738403, 'imp..."
6,6,LogisticRegressionPipeline,0.009897,False,"{'penalty': 'l2', 'C': 8.702171711000782, 'imp..."
7,3,LogisticRegressionPipeline,0.009924,True,"{'penalty': 'l2', 'C': 8.123565600467177, 'imp..."
8,0,LogisticRegressionPipeline,0.010119,False,"{'penalty': 'l2', 'C': 8.444214828324364, 'imp..."
9,1,LogisticRegressionPipeline,0.010483,True,"{'penalty': 'l2', 'C': 6.239401330891865, 'imp..."


In [15]:
pipeline = clf_fraud.best_pipeline
print("Best Model % of Money Lost: {}".format(pipeline.score(X_holdout, y_holdout)))

Best Model % of Money Lost: 0.009539135855959386


### In comparison, the model that optimized for AUC performed much worse on this objective, losing roughly four times as much (about 0.040 versus 0.0095). This shows how EvalML can get the results you want by optimizing for the right objective!

In [16]:
pipeline = clf.best_pipeline
print("AUC Model % of Money Lost: {}".format(pipeline.score(X_holdout, y_holdout, other_objectives=[fraud_objective])[1]['Fraud Cost']))

AUC Model % of Money Lost: 0.039593652184801854
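
To put the two holdout scores side by side, the short calculation below uses the numbers printed above. It assumes the score is the fraction of the processed amount that is lost, and the one-million-dollar volume is a hypothetical figure chosen only for illustration:

```python
# Holdout scores copied from the outputs above.
fraud_model_loss = 0.009539135855959386  # model optimized for Fraud Cost
auc_model_loss = 0.039593652184801854    # model optimized for AUC

ratio = auc_model_loss / fraud_model_loss
volume = 1_000_000  # hypothetical processed volume in dollars

print(f"AUC model loses {ratio:.2f}x as much")
print(f"Loss on ${volume:,}: ${fraud_model_loss * volume:,.0f} vs ${auc_model_loss * volume:,.0f}")
```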
