# Beta Bank Customer Retention

## Data Preparation

In [1]:
import pandas as pd
df = pd.read_csv('/datasets/Churn.csv')
df.head()

| | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2.0 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1.0 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8.0 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1.0 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2.0 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |


In [2]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
RowNumber          10000 non-null int64
CustomerId         10000 non-null int64
Surname            10000 non-null object
CreditScore        10000 non-null int64
Geography          10000 non-null object
Gender             10000 non-null object
Age                10000 non-null int64
Tenure             9091 non-null float64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(3), int64(8), object(3)
memory usage: 1.1+ MB


In [3]:
df_descript = df.describe()
tenure_median = df_descript.at['50%', 'Tenure']

df['Tenure'] = df['Tenure'].fillna(tenure_median)
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
RowNumber          10000 non-null int64
CustomerId         10000 non-null int64
Surname            10000 non-null object
CreditScore        10000 non-null int64
Geography          10000 non-null object
Gender             10000 non-null object
Age                10000 non-null int64
Tenure             10000 non-null float64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(3), int64(8), object(3)
memory usage: 1.1+ MB


In [4]:
data_ohe = pd.get_dummies(df, drop_first=True)
data_ohe.head()

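| | RowNumber | CustomerId | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | ... | Surname_Zotova | Surname_Zox | Surname_Zubarev | Surname_Zubareva | Surname_Zuev | Surname_Zuyev | Surname_Zuyeva | Geography_Germany | Geography_Spain | Gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | 619 | 42 | 2.0 | 0.0 | 1 | 1 | 1 | 101348.88 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 2 | 15647311 | 608 | 41 | 1.0 | 83807.86 | 1 | 0 | 1 | 112542.58 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 2 | 3 | 15619304 | 502 | 42 | 8.0 | 159660.8 | 3 | 1 | 0 | 113931.57 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 4 | 15701354 | 699 | 39 | 1.0 | 0.0 | 2 | 0 | 0 | 93826.63 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 5 | 15737888 | 850 | 43 | 2.0 | 125510.82 | 1 | 1 | 1 | 79084.1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |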


I loaded the data into the df variable to preprocess it and prepare the features for model training. The Tenure column had null values in about 9 percent of the rows (909 of 10,000). I looked at the description of the data and found that the mean and median Tenure were very close, so I filled the null values with the median to make the data complete. I used one-hot encoding to convert the categorical features into numerical values, and I dropped the first dummy column of each feature to avoid the dummy variable trap.
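
The encoded frame above contains a dummy column for every surname (Surname_Zotova, Surname_Zox, and so on), because get_dummies was applied to the whole frame. A minimal sketch of one alternative is to drop the identifier-like columns before encoding. It assumes that RowNumber, CustomerId, and Surname carry no predictive signal, which this notebook does not test. The toy frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frame that reuses the column names of the Churn data
toy = pd.DataFrame({
    'RowNumber': [1, 2, 3],
    'CustomerId': [101, 102, 103],
    'Surname': ['Adams', 'Baker', 'Clark'],
    'CreditScore': [619, 608, 502],
    'Geography': ['France', 'Spain', 'Germany'],
    'Gender': ['Female', 'Female', 'Male'],
    'Exited': [1, 0, 1],
})

# Assumption: these columns only identify a customer and are not useful features
id_columns = ['RowNumber', 'CustomerId', 'Surname']
toy_features = toy.drop(columns=id_columns)

# Encode only the categorical columns, dropping the first level of each
toy_ohe = pd.get_dummies(toy_features, columns=['Geography', 'Gender'], drop_first=True)
print(list(toy_ohe.columns))
```

With the identifiers removed, the feature matrix stays at a handful of columns instead of one column per distinct surname.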

## Class Imbalance

In [5]:
data_ohe['Exited'].value_counts()

0    7963
1    2037
Name: Exited, dtype: int64

I checked the class balance of the Exited column. The 0 class dominates: approximately 80 percent of the customers (7,963 of 10,000) have not left the bank. This means a model can reach about 80% accuracy by predicting 0 every time, so accuracy alone is a misleading metric here.
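
The baseline can be worked out directly from the class counts. The sketch below uses only the two counts printed above. The constant "always predict the majority class" model is a thought experiment, not a model trained in this notebook.

```python
# Class counts taken from the value_counts() output above
counts = {0: 7963, 1: 2037}
total = sum(counts.values())

# A constant model that always predicts the majority class
majority_class = max(counts, key=counts.get)
baseline_accuracy = counts[majority_class] / total

# The same model never flags a leaving customer, so recall on class 1 is zero
baseline_recall_class_1 = 0 / counts[1]

print("Majority class:", majority_class)
print("Baseline accuracy: {:.4f}".format(baseline_accuracy))
print("Recall on class 1: {:.1f}".format(baseline_recall_class_1))
```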

In [6]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score


# Hold out 40% of the rows, then split that part in half:
# 60% training, 20% validation, 20% test
data_train, data_valid = train_test_split(
    data_ohe, test_size=0.4, random_state=12345)

data_valid, data_test = train_test_split(
    data_valid, test_size=0.5, random_state=12345)

target_train = data_train['Exited']
features_train = data_train.drop('Exited', axis=1)
target_valid = data_valid['Exited']
features_valid = data_valid.drop('Exited', axis=1)
target_test = data_test['Exited']
features_test = data_test.drop('Exited', axis=1)

model = LogisticRegression(random_state=12345, solver='liblinear')
model.fit(features_train, target_train)
predicted_valid = model.predict(features_valid)

print("F1:", f1_score(target_valid, predicted_valid))

F1: 0.0


  'precision', 'predicted', average, warn_for)


I trained a logistic regression model without correcting the class imbalance. I got a warning message and an F1 score of 0. The warning means that class 1 was never predicted, so precision is undefined and the F1 score is set to 0.
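
To see why an all-zero prediction gives F1 = 0, it helps to compute the score by hand. The helper below is a hypothetical pure-Python sketch for the positive class only, not scikit-learn's implementation. The labels are made up.

```python
def f1_from_labels(y_true, y_pred):
    """F1 for the positive class (1); returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # Without true positives, precision and/or recall is 0 or undefined,
        # so F1 is reported as 0 by convention
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels: two churners among ten customers
y_true = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
all_zero = [0] * len(y_true)               # a model that never predicts class 1
one_hit = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]   # a model that finds one of the two churners

print(f1_from_labels(y_true, all_zero))
print(f1_from_labels(y_true, one_hit))
```

Finding even one churner lifts the score well above zero, which is why F1 is a more informative metric than accuracy for this target.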

## Fixing Class Imbalances

### Logistic Regression model

In [48]:
model = LogisticRegression(random_state=12345, class_weight='balanced', solver='liblinear')
model.fit(features_train, target_train)
predicted_valid = model.predict(features_valid)
print("F1: {:.2f}".format(f1_score(target_valid, predicted_valid)))

F1: 0.50


In [8]:
print(target_train.value_counts())

0    4804
1    1196
Name: Exited, dtype: int64


In [9]:
from sklearn.utils import shuffle
from sklearn.utils import resample

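# Split the training rows by class
train_minority = data_train[data_train.Exited==1]
train_majority = data_train[data_train.Exited==0]

# Sample the minority class with replacement until it matches the majority class (4804 rows)
train_minority_upsampled = resample(train_minority, replace=True, n_samples=len(train_majority), random_state=12345)
data_train_upsampled = pd.concat([train_majority, train_minority_upsampled])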

target_train_upsampled = data_train_upsampled['Exited']
features_train_upsampled = data_train_upsampled.drop('Exited', axis=1)

print(data_train_upsampled['Exited'].value_counts())


1    4804
0    4804
Name: Exited, dtype: int64


In [49]:
model = LogisticRegression(random_state=12345, class_weight='balanced', solver='liblinear')
model.fit(features_train_upsampled, target_train_upsampled)
predicted_valid = model.predict(features_valid)
print("F1: {:.2f}".format(f1_score(target_valid, predicted_valid)))

F1: 0.49


In [11]:
import numpy as np

probabilities_valid = model.predict_proba(features_valid)
probabilities_one_valid = probabilities_valid[:, 1]

for threshold in np.arange(0, 1, 0.05):
    predicted_valid = probabilities_one_valid > threshold
    F1 = f1_score(target_valid, predicted_valid)

    print("Threshold = {:.2f} | F1 = {:.2f}".format(
        threshold, F1))

Threshold = 0.00 | F1 = 0.35
Threshold = 0.05 | F1 = 0.35
Threshold = 0.10 | F1 = 0.35
Threshold = 0.15 | F1 = 0.35
Threshold = 0.20 | F1 = 0.36
Threshold = 0.25 | F1 = 0.38
Threshold = 0.30 | F1 = 0.40
Threshold = 0.35 | F1 = 0.43
Threshold = 0.40 | F1 = 0.45
Threshold = 0.45 | F1 = 0.47
Threshold = 0.50 | F1 = 0.49
Threshold = 0.55 | F1 = 0.50
Threshold = 0.60 | F1 = 0.47
Threshold = 0.65 | F1 = 0.44
Threshold = 0.70 | F1 = 0.35
Threshold = 0.75 | F1 = 0.26
Threshold = 0.80 | F1 = 0.17
Threshold = 0.85 | F1 = 0.08
Threshold = 0.90 | F1 = 0.02
Threshold = 0.95 | F1 = 0.00


I first set the class_weight hyperparameter to balanced. This improved the F1 score to 0.50, which was still below the acceptable score. I then used the resample function to upsample the minority class until it matched the majority class (4,804 rows each). I retrained the model on the upsampled data and the F1 score dropped slightly to 0.49. I then checked whether changing the classification threshold would help. It did not: the best F1 across the thresholds I tried was 0.50, at a threshold of 0.55.
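
scikit-learn documents the balanced setting as weighting each class by n_samples / (n_classes * class_count). The sketch below applies that formula to the class counts printed earlier. The helper name is hypothetical and the code does not call scikit-learn.

```python
def balanced_class_weights(class_counts):
    """Per-class weights computed as n_samples / (n_classes * class_count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: n_samples / (n_classes * count)
            for label, count in class_counts.items()}

# Training set before upsampling (from target_train.value_counts())
original = balanced_class_weights({0: 4804, 1: 1196})
# Training set after upsampling the minority class to 4804 rows
upsampled = balanced_class_weights({0: 4804, 1: 4804})

print("Original weights: ", {k: round(v, 3) for k, v in original.items()})
print("Upsampled weights:", upsampled)
```

On the upsampled data both weights come out as 1.0, so class_weight='balanced' adds nothing on top of the resampling. On the original data class 1 is weighted about four times as heavily as class 0, which is roughly the factor by which upsampling multiplies the minority rows. This may be why the two approaches produced nearly the same F1.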

### Decision Tree model

In [28]:
from sklearn.tree import DecisionTreeClassifier

tree_model = DecisionTreeClassifier(random_state=12345, class_weight='balanced')
tree_model.fit(features_train_upsampled, target_train_upsampled)
tree_predicted_valid = tree_model.predict(features_valid)
print("F1: {:.2f}".format(f1_score(target_valid, tree_predicted_valid)))

F1: 0.54


In [39]:
for depth in range(1, 10):
    model = DecisionTreeClassifier(random_state=12345, class_weight='balanced', max_depth=depth)

    model.fit(features_train_upsampled, target_train_upsampled)

    predictions_valid = model.predict(features_valid)

    print("max_depth =", depth, ": ", end='')
    print("F1: {:.2f}".format(f1_score(target_valid, predictions_valid)))

max_depth = 1 : F1: 0.51
max_depth = 2 : F1: 0.54
max_depth = 3 : F1: 0.54
max_depth = 4 : F1: 0.59
max_depth = 5 : F1: 0.59
max_depth = 6 : F1: 0.56
max_depth = 7 : F1: 0.58
max_depth = 8 : F1: 0.55
max_depth = 9 : F1: 0.56


In [47]:
tree_model = DecisionTreeClassifier(random_state=12345, class_weight='balanced', max_depth=5)
tree_model.fit(features_train_upsampled, target_train_upsampled)
tree_predicted_test = tree_model.predict(features_test)
print("F1: {:.2f}".format(f1_score(target_test, tree_predicted_test)))

F1: 0.58


In [46]:
tree_model = DecisionTreeClassifier(random_state=12345, class_weight='balanced', max_depth=7)
tree_model.fit(features_train_upsampled, target_train_upsampled)
tree_predicted_test = tree_model.predict(features_test)
print("F1: {:.2f}".format(f1_score(target_test, tree_predicted_test)))

F1: 0.60


I trained a decision tree model to see whether I could further improve the F1 score. I trained it on the upsampled data and got an F1 score of 0.54 on the validation set. I then iterated over max_depth values from 1 to 9 to find one that gives an acceptable F1. Depths 4 and 5 gave the best validation F1 (0.59). On the test set, the depth-5 tree scored 0.58, slightly under the acceptable F1, so I tried the depth with the next-highest validation score, 7, which reached an acceptable F1 of 0.60 on the test set.
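
One way to make the depth choice reproducible is to select it from the validation scores alone and keep the test set for a single final check. The sketch below reuses the validation F1 values printed by the loop above. The tie-break rule (prefer the shallower tree) is an assumption of this sketch, not something the notebook applies.

```python
# Validation F1 per max_depth, copied from the loop output above
valid_f1 = {1: 0.51, 2: 0.54, 3: 0.54, 4: 0.59, 5: 0.59,
            6: 0.56, 7: 0.58, 8: 0.55, 9: 0.56}

# Assumed rule: highest validation F1 first, then the shallowest tree on ties
best_depth = max(valid_f1, key=lambda depth: (valid_f1[depth], -depth))

print("Selected max_depth:", best_depth)
print("Validation F1: {:.2f}".format(valid_f1[best_depth]))
```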

In [14]:
from sklearn.metrics import roc_auc_score
tree_model = DecisionTreeClassifier(random_state=12345, class_weight='balanced', max_depth=4)
tree_model.fit(features_train_upsampled, target_train_upsampled)
probabilities_test = tree_model.predict_proba(features_test)
probabilities_one_test = probabilities_test[:, 1]
auc_roc = roc_auc_score(target_test, probabilities_one_test)
print(auc_roc)

0.7164994430877673


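I calculated the AUC-ROC score on the test set for the tree with max_depth=4 and found it to be about 0.72. This is clearly better than the 0.5 expected from random guessing, but it is well short of a perfect score of 1.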
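
AUC-ROC can be read as the probability that a randomly chosen customer who left receives a higher predicted score than a randomly chosen customer who stayed. The helper below is a hypothetical brute-force sketch of that pairwise definition on made-up scores. It is not how roc_auc_score is implemented.

```python
def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up labels and predicted probabilities for class 1
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = pairwise_auc(y_true, scores)
print(auc)
```

A shallow decision tree outputs one probability per leaf, so many customers share the same score and ties are common, which is why the half-credit rule matters for tree models.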