## Task 1: Credit Card Routing for Online Purchase via Predictive Modelling

### Problem statement
* Over the past year, the online payment department at a large retail company has encountered a high failure rate for online credit card payments made via so-called payment service providers, referred to as PSPs by the business stakeholders.
* The company loses a lot of money due to failed transactions, and customers have become increasingly dissatisfied with the online shop.
* The current routing logic is manual and rule-based. Business decision makers hope that predictive modelling will enable a smarter way of routing each transaction to a PSP.

### Data Science Task
* Help the business to automate the credit card routing via a predictive model
* Such a model should increase the payment success rate by finding the best possible PSP for each transaction while keeping the transaction fees low.

# PART 5b: Final Model
### CRISP-DM (5) - Evaluation (Comparing model metrics)
* Tune hyperparameters of the final model
* Run the final tuned model
* Determine feature importance and discuss model interpretability

### Import Key Libraries

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

In [2]:
# import visualization libraries
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import seaborn as sns
from bokeh.plotting import figure, show, output_notebook
from bokeh.palettes import Spectral
from bokeh.models import ColumnDataSource

### 5bi. Preparation of data

### bi1. Read Dataset and update index

In [3]:
dataset = pd.read_excel("PSP_Jan_Feb_2019.xlsx")

In [4]:
dataset.head()

Unnamed: 0.1,Unnamed: 0,tmsp,country,amount,success,PSP,3D_secured,card
0,0,2019-01-01 00:01:11,Germany,89,0,UK_Card,0,Visa
1,1,2019-01-01 00:01:17,Germany,89,1,UK_Card,0,Visa
2,2,2019-01-01 00:02:49,Germany,238,0,UK_Card,1,Diners
3,3,2019-01-01 00:03:13,Germany,238,1,UK_Card,1,Diners
4,4,2019-01-01 00:04:33,Austria,124,0,Simplecard,0,Diners


In [5]:
dataset = dataset.drop('Unnamed: 0', axis=1)

#### bi2. Remove Duplicates
* Comment out to include all transactions

In [6]:
# Sort chronologically so that consecutive rows can be compared
dataset.sort_values(["tmsp", "country", "amount"], axis = 0, ascending = True, inplace = True, na_position = "first")
dataset.reset_index(inplace=True, drop=True)
# Seconds elapsed since the previous row (0 for the first row)
dataset["time_delta"] = (dataset["tmsp"]-dataset["tmsp"].shift(1)).dt.total_seconds()
dataset["time_delta"] = dataset["time_delta"].fillna(0)
# A gap of more than 60 seconds starts a new purchase attempt (tx_number)
same_tx = (dataset["time_delta"]>60).cumsum()
dataset['tx_number'] = dataset.groupby(same_tx).ngroup()
## Comment out to include duplicates
# Keep only the first row per purchase attempt and PSP
dataset.drop_duplicates(subset=['tx_number', 'PSP'], keep='first', inplace=True)
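
The grouping rule is easier to follow on a handful of rows. The sketch below applies the same idea to made-up timestamps (not taken from the dataset) and uses the cumulative sum directly as the purchase number, which should match `ngroup()` when the rows are sorted and the first gap is 0.

```python
import pandas as pd

# Toy data (made up for illustration): three attempts close together,
# then a fourth attempt several minutes later.
toy = pd.DataFrame({
    "tmsp": pd.to_datetime([
        "2019-01-01 00:01:11",
        "2019-01-01 00:01:17",
        "2019-01-01 00:01:50",
        "2019-01-01 00:06:41",
    ]),
    "PSP": ["UK_Card", "UK_Card", "Simplecard", "UK_Card"],
})

# Seconds since the previous row; the first row has no predecessor.
toy["time_delta"] = toy["tmsp"].diff().dt.total_seconds().fillna(0)

# A gap of more than 60 seconds starts a new purchase attempt.
toy["tx_number"] = (toy["time_delta"] > 60).cumsum()

# Keep the first row per (purchase attempt, PSP) pair.
deduped = toy.drop_duplicates(subset=["tx_number", "PSP"], keep="first")
print(deduped[["tmsp", "PSP", "tx_number"]])
```

Here the second `UK_Card` row is dropped because it falls within the same purchase attempt as the first, while the last `UK_Card` row is kept because it belongs to a new attempt.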

#### bi3. Create date/time features

In [7]:
### Year and month are left out of the feature list: there is only one year, and the success rate is equally distributed between the two months
## Create day of the month feature
dataset['day_of_month'] = dataset['tmsp'].dt.day
## Create day of the week feature
dataset['day_of_week'] = dataset['tmsp'].dt.day_name()
## Create hour of the day feature
dataset['hour'] = dataset['tmsp'].dt.hour

In [8]:
# make timestamp the index for easier analysis
dataset = dataset.set_index(dataset.columns[0])

In [9]:
dataset.head()

tmsp,country,amount,success,PSP,3D_secured,card,time_delta,tx_number,day_of_month,day_of_week,hour
2019-01-01 00:01:11,Germany,89,0,UK_Card,0,Visa,0.0,0,1,Tuesday,0
2019-01-01 00:02:49,Germany,238,0,UK_Card,1,Diners,92.0,1,1,Tuesday,0
2019-01-01 00:04:33,Austria,124,0,Simplecard,0,Diners,80.0,2,1,Tuesday,0
2019-01-01 00:06:41,Switzerland,282,0,UK_Card,0,Master,128.0,3,1,Tuesday,0
2019-01-01 00:07:19,Switzerland,282,0,Simplecard,0,Master,38.0,3,1,Tuesday,0


In [10]:
# add a feature field to hold the order of the dates - for the base model
dataset['date_order'] = np.arange(len(dataset.index))

#### bi4. Recreate dataset_time

In [11]:
dataset.groupby('country')['country'].count()

country
Austria         7434
Germany        22683
Switzerland     7815
Name: country, dtype: int64

In [12]:
# Print the number of missing entries in each column
print(dataset.isna().sum())

country         0
amount          0
success         0
PSP             0
3D_secured      0
card            0
time_delta      0
tx_number       0
day_of_month    0
day_of_week     0
hour            0
date_order      0
dtype: int64


#### bi5. Encoding of categorical feature variables and label and defining feature variable and dependent variable vector matrices for the base model

In [13]:
# Encoding day of the week (Monday=0 ... Sunday=6)
DAY_OF_WEEK_CODES = {
    "Monday": 0, "Tuesday": 1, "Wednesday": 2, "Thursday": 3,
    "Friday": 4, "Saturday": 5, "Sunday": 6,
}

def encode_DayOfWeek(day_of_week):
    # Returns None for an unrecognised day name, as the if-chain did
    return DAY_OF_WEEK_CODES.get(day_of_week)

In [14]:
dataset['day_of_week_num'] = dataset['day_of_week'].apply(encode_DayOfWeek)

In [15]:
dataset.head(1)

tmsp,country,amount,success,PSP,3D_secured,card,time_delta,tx_number,day_of_month,day_of_week,hour,date_order,day_of_week_num
2019-01-01 00:01:11,Germany,89,0,UK_Card,0,Visa,0.0,0,1,Tuesday,0,0,1


In [16]:
#define categorical features
cat_features = ['country', 'card', 'PSP']

In [17]:
#encoding the categorical feature variables using OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(),cat_features)], remainder='passthrough')
### with no dups and no date_order (scenario AT, DTIDRF, DTEDRF)
X = np.array(ct.fit_transform(dataset.drop(['success','day_of_week','date_order'], axis=1)))

In [18]:
print(X[2])

[  1.   0.   0.   1.   0.   0.   0.   0.   1.   0. 124.   0.  80.   2.
   1.   0.   1.]


In [19]:
#encoding the label using LabelEncoder
le = LabelEncoder()
y = le.fit_transform(dataset['success'])

In [20]:
print(y[2])

0


#### bi6. Split the data into training set and the test set

In [21]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,random_state=30)

In [22]:
print(X_train[2])

[1.0000e+00 0.0000e+00 0.0000e+00 0.0000e+00 0.0000e+00 1.0000e+00
 0.0000e+00 0.0000e+00 0.0000e+00 1.0000e+00 3.1600e+02 0.0000e+00
 1.1600e+02 2.2953e+04 1.9000e+01 9.0000e+00 1.0000e+00]


#### bi7. Feature scaling

In [23]:
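# scaling columns 6 onward on both train and test set
# note: the one-hot block spans columns 0-9 (country, card, PSP), so this slice also
# standardises the four PSP indicator columns (6-9); the passthrough features start at column 10
sc = StandardScaler()
X_train[:,6:] = sc.fit_transform(X_train[:,6:]) #fitting is done only with the train set
X_test[:,6:] = sc.transform(X_test[:,6:]) #scale test data using the fitted scaler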

In [24]:
print(X_train[1])

[ 0.          1.          0.          0.          1.          0.
 -0.30022504  2.02908863 -0.62597507 -0.88706933  0.44714685 -0.56074325
 -0.63608601  0.26660762 -1.33962713  1.36456111  1.65172211]


In [25]:
print(X_test[1])

[ 0.          1.          0.          0.          1.          0.
 -0.30022504 -0.49283209 -0.62597507  1.1273076  -0.07250978  1.78334738
  0.46761663  1.64174621  1.41941929 -0.2199559  -0.41945735]
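
Slicing by position makes it easy to include or miss a column by accident. One alternative, sketched here on a made-up toy frame with only a subset of the notebook's columns, is to let a single `ColumnTransformer` select the columns to encode and the columns to scale by name. The `num_features` list and the toy values are assumptions for illustration; this is not the preprocessing used for the results in this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame with a subset of the notebook's columns; values are made up.
toy = pd.DataFrame({
    "country": ["Germany", "Austria", "Switzerland", "Germany", "Austria", "Germany"],
    "PSP": ["UK_Card", "Simplecard", "Moneycard", "Goldcard", "UK_Card", "Simplecard"],
    "amount": [89, 124, 282, 238, 56, 310],
    "hour": [0, 3, 9, 14, 18, 23],
})
cat_features = ["country", "PSP"]
num_features = ["amount", "hour"]  # hypothetical list of columns to scale

train, test = train_test_split(toy, test_size=2, random_state=30)

# One-hot encode the categorical columns and standardise the numeric ones.
# sparse_threshold=0 forces a dense array as output.
preprocess = ColumnTransformer(
    transformers=[
        ("encoder", OneHotEncoder(handle_unknown="ignore"), cat_features),
        ("scaler", StandardScaler(), num_features),
    ],
    sparse_threshold=0,
)

X_train_demo = preprocess.fit_transform(train)  # fit on the training rows only
X_test_demo = preprocess.transform(test)        # reuse the fitted encoder and scaler

print(np.round(X_train_demo, 2))
```

With this layout the indicator columns stay at 0/1 and only the named numeric columns are standardised, regardless of how many categories each encoded feature produces.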


### 5biii. Final Model

##### biii1. Model Training

In [26]:
import xgboost as xgb
from xgboost.sklearn import XGBClassifier

classifier = XGBClassifier(learning_rate=0.01, n_estimators=100, max_depth=9, min_child_weight=9, gamma=1.6, subsample=0.8,
                           colsample_bytree=0.9, objective='binary:logistic', nthread=4, scale_pos_weight=1, seed=27, random_state=30)
classifier.fit(X_train, y_train)
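
The cell above shows only the final parameter values, not the search that produced them. As a rough illustration of how such a search and a feature-importance readout could look, the sketch below runs a small grid search with scikit-learn. It uses `GradientBoostingClassifier` as a stand-in and synthetic data instead of the PSP dataset, and the grid values are hypothetical. Because `XGBClassifier` follows the scikit-learn estimator interface, it could presumably be passed as the estimator in the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for X_train / y_train (not the PSP data).
X_demo, y_demo = make_classification(
    n_samples=200, n_features=5, n_informative=3, random_state=30
)

# Hypothetical, deliberately tiny grid; a real search would cover more values.
param_grid = {"max_depth": [2, 3], "learning_rate": [0.05, 0.1]}

search = GridSearchCV(
    estimator=GradientBoostingClassifier(n_estimators=20, random_state=30),
    param_grid=param_grid,
    scoring="roc_auc",  # rank parameter combinations by cross-validated ROC AUC
    cv=3,
)
search.fit(X_demo, y_demo)

print("Best parameters:", search.best_params_)

# Impurity-based importances of the refitted best model, one value per feature,
# printed from most to least important.
importances = search.best_estimator_.feature_importances_
for idx in np.argsort(importances)[::-1]:
    print(f"feature {idx}: {importances[idx]:.3f}")
```

In the notebook's setting the feature indices would map to the one-hot and passthrough columns of `X_train`, so a list of column names would be needed to make the ranking readable.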

In [27]:
# Predict y given X_test
y_pred = classifier.predict(X_test)
y_pred_proba = classifier.predict_proba(X_test)
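
The predictions can then be scored with the `metrics` module imported at the top. The sketch below uses small made-up arrays so that it runs on its own; with the notebook's variables one would presumably pass `y_test`, `y_pred`, and the success-class column `y_pred_proba[:, 1]` instead. The 0.5 threshold is an assumption for the toy example.

```python
import numpy as np
from sklearn import metrics

# Made-up labels and predicted success probabilities, for illustration only.
y_true_demo = np.array([0, 0, 1, 0, 1, 0, 0, 1])
y_proba_demo = np.array([0.10, 0.40, 0.80, 0.20, 0.35, 0.60, 0.05, 0.90])

# Turn probabilities into hard 0/1 predictions with a 0.5 threshold.
y_hat_demo = (y_proba_demo >= 0.5).astype(int)

accuracy = metrics.accuracy_score(y_true_demo, y_hat_demo)
precision = metrics.precision_score(y_true_demo, y_hat_demo)
recall = metrics.recall_score(y_true_demo, y_hat_demo)
roc_auc = metrics.roc_auc_score(y_true_demo, y_proba_demo)  # uses probabilities
cm = metrics.confusion_matrix(y_true_demo, y_hat_demo)      # rows: true, cols: predicted

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} roc_auc={roc_auc:.3f}")
print(cm)
```

Since failed payments are likely more common than successful ones in this dataset, precision, recall, and ROC AUC are probably more informative than accuracy alone.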

### CRISP-DM (6) - Deployment

#### 6i. Additional steps for model deployment – factoring in PSP fees
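The PSP routing function handles three possible prediction outcomes and uses the PSP fees to choose between candidates:
* All PSPs are predicted to fail
* Exactly one PSP is predicted to succeed
* More than one PSP is predicted to succeed

When several PSPs are candidates (the first and third cases), the one with the lowest applicable fee is chosen; in the second case, the single PSP predicted to succeed is selected.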
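
The three cases above work on hard 0/1 predictions. As a hedged variation, the same idea can be expressed on predicted success probabilities by weighing both fees: the expected fee of a PSP is `p * fee_success + (1 - p) * fee_failed`. The sketch below is one possible rule under that assumption, not the routing function used in this notebook. The fee values are copied from the fee table, while the `threshold` parameter, the function names, and the example probabilities are hypothetical.

```python
# Fees in Euro, copied from the fee table: (successful, failed).
PSP_FEES = {
    "Moneycard": (5.0, 2.0),
    "Goldcard": (10.0, 5.0),
    "UK_Card": (3.0, 1.0),
    "Simplecard": (1.0, 0.5),
}

def expected_fee(psp, p_success, fees=PSP_FEES):
    """Probability-weighted fee for routing one transaction to `psp`."""
    fee_success, fee_failed = fees[psp]
    return p_success * fee_success + (1.0 - p_success) * fee_failed

def route_by_probability(psp_proba, threshold=0.5, fees=PSP_FEES):
    """Pick a PSP from a dict mapping PSP name to predicted success probability."""
    candidates = [psp for psp, p in psp_proba.items() if p >= threshold]
    if candidates:
        # Among PSPs likely to succeed, take the lowest expected fee.
        selected = min(candidates, key=lambda psp: expected_fee(psp, psp_proba[psp], fees))
    else:
        # No PSP clears the threshold: fall back to the cheapest failure fee.
        selected = min(psp_proba, key=lambda psp: fees[psp][1])
    return selected, expected_fee(selected, psp_proba[selected], fees)

# Hypothetical probabilities for a single transaction.
example_proba = {"UK_Card": 0.30, "Simplecard": 0.20, "Moneycard": 0.70, "Goldcard": 0.80}
selected_demo, fee_demo = route_by_probability(example_proba)
print(selected_demo, round(fee_demo, 2))
```

With a threshold of 0.5 this reduces to the 0/1 scenarios when the classifier's default cut-off is used, but it also lets the business trade a slightly lower success probability against a noticeably lower fee.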

##### a. Import PSP Fees

In [28]:
psp_fees = pd.read_excel("Transaction Fees.xlsx")

In [29]:
psp_fees

Unnamed: 0,name,Fee on successful transactions,Fee on failed transactions,Currency
0,Moneycard,5,2.0,Euro
1,Goldcard,10,5.0,Euro
2,UK_Card,3,1.0,Euro
3,Simplecard,1,0.5,Euro


##### b. Function to prepare data in production for prediction
We will use the prepared test dataset to complete this section and assume that the data is ready for the model's consumption:
* We assume that the data arrives as a stream of one transaction at a time
* We also assume that feature engineering has been done (every transaction has to go through the feature engineering steps from the modelling stage)
* We also assume that feature scaling has been done
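
Under these assumptions, the missing link between the model and the routing function is a step that scores one incoming transaction once per PSP. A minimal sketch of that step is shown below. The helper names `build_candidates` and `predict_per_psp` are hypothetical, and `dummy_predict` is a stand-in for the fitted preprocessing and classifier, which are not wired in here.

```python
import pandas as pd

PSPS = ["UK_Card", "Simplecard", "Moneycard", "Goldcard"]

def build_candidates(transaction, psps=PSPS):
    """Return one row per PSP for a single incoming transaction (a dict)."""
    rows = [{**transaction, "PSP": psp} for psp in psps]
    return pd.DataFrame(rows)

def predict_per_psp(candidates, predict_fn):
    """Map each PSP name to the 0/1 prediction for its candidate row."""
    preds = predict_fn(candidates)
    return {psp: int(pred) for psp, pred in zip(candidates["PSP"], preds)}

def dummy_predict(candidates):
    # Hypothetical stand-in for encoding, scaling, and classifier.predict:
    # pretends that only UK_Card is predicted to succeed.
    return [1 if psp == "UK_Card" else 0 for psp in candidates["PSP"]]

# Made-up transaction with a subset of the notebook's feature columns.
transaction = {"country": "Germany", "amount": 89, "3D_secured": 0, "card": "Visa", "hour": 0}

candidates = build_candidates(transaction)
psp_pred_dict_demo = predict_per_psp(candidates, dummy_predict)
print(psp_pred_dict_demo)
```

The resulting dictionary has the same shape as the scenario dictionaries defined in the next step, so it could be handed to the routing function directly.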

##### c. Routing function
* The routing and choice will be facilitated through the user interface

###### ci. Define the 3 scenarios of prediction results

In [30]:
psp_pred_dict1 = {'UK_Card':0, 'Simplecard':0, 'Moneycard':0, 'Goldcard':0}
psp_pred_dict2 = {'UK_Card':1, 'Simplecard':0, 'Moneycard':0, 'Goldcard':0}
psp_pred_dict3 = {'UK_Card':0, 'Simplecard':0, 'Moneycard':1, 'Goldcard':1}

###### cii. Define the routing function

In [31]:
def routing_function(psp_pred_dict, psp_fees):
    successful_psp_list = []
    successful_psp_dict = {}

    ## Get all predictions into a list
    pred_list = list(psp_pred_dict.values())

    ## Scenario 1: Predict failure for all PSPs (sum of predictions is 0)
    ## Choose the PSP with the lowest fee on failed transactions
    if sum(pred_list) == 0:
        min_fee_line = psp_fees[psp_fees['Fee on failed transactions'] == psp_fees['Fee on failed transactions'].min()].reset_index()
        selected_psp = min_fee_line.at[0, 'name']
        min_fee = psp_fees[psp_fees['name']==selected_psp].reset_index().at[0,'Fee on failed transactions']

    ## Scenario 2: Predict only one PSP successful (sum of predictions is 1)
    ## The PSP is expected to succeed, so its fee on successful transactions applies
    elif sum(pred_list) == 1:
        selected_psp = list(psp_pred_dict.keys())[list(psp_pred_dict.values()).index(1)]
        min_fee = psp_fees[psp_fees['name']==selected_psp].reset_index().at[0,'Fee on successful transactions']

    ## Scenario 3: Predict more than one PSP successful (sum of predictions is greater than 1)
    ## Choose the PSP with the lowest fee on successful transactions
    elif sum(pred_list) > 1:
        for psp, pred in psp_pred_dict.items():
            if pred==1:
                successful_psp_list.append(psp)
        for index, row in psp_fees.iterrows():
            if row['name'] in successful_psp_list:
                successful_psp_dict[row['name']] = row['Fee on successful transactions']
        selected_psp = min(successful_psp_dict, key=successful_psp_dict.get)
        min_fee = psp_fees[psp_fees['name']==selected_psp].reset_index().at[0,'Fee on successful transactions']
    ### Raise exception for any other entries
    else:
        raise Exception("System error. Scenario does not exist. Contact Admin")
    return selected_psp, min_fee

In [32]:
selected_psp, min_fee = routing_function(psp_pred_dict2, psp_fees)

In [33]:
print('The selected PSP is {} with a fee of {} Euro(s)'.format(selected_psp, min_fee))

The selected PSP is UK_Card with a fee of 3 Euro(s)


#### 6ii. How the model can be used by the business (Discussion Only)
* Discuss how the model solves the business problem - Refer to introduction section in main case study report
* Discuss how the model would be deployed - beyond the scope of the case study
* Should we create a pickle? - beyond the scope of the case study
* Document the model dependencies (add to the README on GitHub) - beyond the scope of the case study
* In the last step of the project, give a proposal of how your model could be used by the business in everyday work, for instance, via a graphical user interface (GUI) - Refer to section 2.8 of the main case study report