## Bank Turnover prediction (Handling imbalanced data)

- Build a deep learning model to predict customer churn at a bank
- Once the model is built, print the classification report and analyze precision, recall, and f1-score
- Improve the f1-score for the minority class using techniques such as undersampling, oversampling, and ensembles

Kaggle:  https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling

In [1]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow import keras

In [2]:
df = pd.read_csv('Churn_Modelling.csv')
df_raw = df.copy()
print(df.shape)
df.head()

(10000, 14)


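|   | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |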


In [3]:
cols_to_drop = ['RowNumber', 'CustomerId', 'Surname']
df = df.drop(cols_to_drop, axis=1)

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   CreditScore      10000 non-null  int64  
 1   Geography        10000 non-null  object 
 2   Gender           10000 non-null  object 
 3   Age              10000 non-null  int64  
 4   Tenure           10000 non-null  int64  
 5   Balance          10000 non-null  float64
 6   NumOfProducts    10000 non-null  int64  
 7   HasCrCard        10000 non-null  int64  
 8   IsActiveMember   10000 non-null  int64  
 9   EstimatedSalary  10000 non-null  float64
 10  Exited           10000 non-null  int64  
dtypes: float64(2), int64(7), object(2)
memory usage: 859.5+ KB


In [5]:
def get_unique_values(col):
    unique_values = df[col].unique()
    n = len(unique_values)
    if n > 5: 
        unique_values = ['n>5']
    print(f'{col}: ({n}) {unique_values}')    

In [6]:
for col in df.columns:
    get_unique_values(col)

CreditScore: (460) ['n>5']
Geography: (3) ['France' 'Spain' 'Germany']
Gender: (2) ['Female' 'Male']
Age: (70) ['n>5']
Tenure: (11) ['n>5']
Balance: (6382) ['n>5']
NumOfProducts: (4) [1 3 2 4]
HasCrCard: (2) [1 0]
IsActiveMember: (2) [1 0]
EstimatedSalary: (9999) ['n>5']
Exited: (2) [1 0]


### Encoding

In [7]:
df['Gender'] = df['Gender'].map({'Female': 0, 'Male': 1})

**Categorical columns**

In [8]:
categorical_columns = ['Geography']

df_encoded = pd.get_dummies(df[categorical_columns].copy(), columns=categorical_columns, drop_first=True).astype(int)


df = df.drop(categorical_columns, axis=1)
df = pd.concat([df, df_encoded], axis=1)

df.filter(regex='^Geography').head()

|   | Geography_Germany | Geography_Spain |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 0 | 0 |
| 3 | 0 | 0 |
| 4 | 0 | 1 |
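To make the effect of `drop_first=True` concrete, here is a minimal pure-Python sketch. The helper `one_hot_drop_first` is hypothetical (it is not a pandas function); it assumes categories are sorted alphabetically and the first one is dropped, which is consistent with the output above, where France becomes the all-zero baseline.

```python
def one_hot_drop_first(values, prefix):
    """Encode categories as 0/1 columns, dropping the first (sorted) category."""
    categories = sorted(set(values))
    kept = categories[1:]  # the dropped category becomes the all-zero baseline
    columns = [f"{prefix}_{c}" for c in kept]
    rows = [[int(v == c) for c in kept] for v in values]
    return columns, rows


# Illustrative values only; the last entry is added so all three countries appear.
columns, rows = one_hot_drop_first(
    ["France", "Spain", "France", "France", "Spain", "Germany"], "Geography"
)
print(columns)
for row in rows:
    print(row)
```

Dropping one level avoids a redundant column: a row with zeros in both remaining columns can only be France.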


**Scaling**

In [9]:
cols_to_scale = ['CreditScore', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'EstimatedSalary']

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
df[cols_to_scale] = scaler.fit_transform(df[cols_to_scale])
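`MinMaxScaler` rescales each column linearly to the range [0, 1] using `(x - min) / (max - min)`. The following is a rough pure-Python sketch of that formula, not the scikit-learn implementation. The helper name `min_max_scale` is hypothetical, and the column minimum of 350 is an assumption (it is not printed in this notebook, though it is consistent with 619 mapping to 0.538).

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid dividing by zero and map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Assumed CreditScore minimum (350), one observed value (619), observed maximum (850).
credit_scores = [350, 619, 850]
scaled = min_max_scale(credit_scores)
print([round(s, 3) for s in scaled])
```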

#### Finally

In [10]:
df.head()

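|   | CreditScore | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | Geography_Germany | Geography_Spain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.538 | 0 | 0.324324 | 0.2 | 0.0 | 0.0 | 1 | 1 | 0.506735 | 1 | 0 | 0 |
| 1 | 0.516 | 0 | 0.310811 | 0.1 | 0.334031 | 0.0 | 0 | 1 | 0.562709 | 0 | 0 | 1 |
| 2 | 0.304 | 0 | 0.324324 | 0.8 | 0.636357 | 0.666667 | 1 | 0 | 0.569654 | 1 | 0 | 0 |
| 3 | 0.698 | 0 | 0.283784 | 0.1 | 0.0 | 0.333333 | 0 | 0 | 0.46912 | 0 | 0 | 0 |
| 4 | 1.0 | 0 | 0.337838 | 0.2 | 0.500246 | 0.0 | 1 | 1 | 0.3954 | 0 | 0 | 1 |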


### X, y

In [11]:
from sklearn.model_selection import train_test_split

In [12]:
X = df.drop('Exited', axis=1)
y = df['Exited']

In [13]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

X_train.shape, X_test.shape

((8000, 11), (2000, 11))

In [14]:
def get_model():
    model = keras.Sequential([
    keras.layers.Dense(11, input_shape=(11,), activation='relu'),
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid'),
    ])

    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    
    return model
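For a sense of how small this network is, the trainable parameters of each `Dense` layer can be counted by hand: one weight per input-unit pair plus one bias per unit. The sketch below does that arithmetic for the layer sizes used in `get_model()`; the helper `dense_params` is a hypothetical name introduced only for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# 11 input features, then the units of each Dense layer in get_model().
layer_sizes = [11, 11, 8, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [132, 96, 9] 237
```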

In [15]:
model = get_model()

In [16]:
model.fit(X_train, y_train, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1ee0a563f10>

In [17]:
model.evaluate(X_test, y_test)



[0.380561888217926, 0.8399999737739563]

In [18]:
from sklearn.metrics import confusion_matrix, classification_report

def display_classification_report(threshold=0.5):
    y_pred = model.predict(X_test)
    y_pred = np.where(y_pred > threshold, 1, 0)

    print(classification_report(y_test, y_pred))

In [19]:
display_classification_report()

              precision    recall  f1-score   support

           0       0.87      0.95      0.90      1607
           1       0.65      0.40      0.50       393

    accuracy                           0.84      2000
   macro avg       0.76      0.67      0.70      2000
weighted avg       0.82      0.84      0.82      2000



In [20]:
display_classification_report(threshold=0.3)

              precision    recall  f1-score   support

           0       0.91      0.81      0.86      1607
           1       0.46      0.68      0.55       393

    accuracy                           0.78      2000
   macro avg       0.69      0.74      0.70      2000
weighted avg       0.82      0.78      0.80      2000



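**Recall and f1-score for class 1 (churn) are low. At the default 0.5 threshold the model finds only 40% of churners; lowering the threshold to 0.3 raises recall to 0.68 but cuts precision to 0.46. Either way, the model is weak at predicting which customers will churn.**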
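The f1-score is the harmonic mean of precision and recall, so it stays low whenever either one is low. As a quick check, this sketch recomputes the class 1 f1-score from the precision and recall printed in the two reports above. The inputs are already rounded to two decimals, so the results are approximate; `f1_from` is a hypothetical helper name.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_at_05 = f1_from(0.65, 0.40)  # class 1 at threshold 0.5
f1_at_03 = f1_from(0.46, 0.68)  # class 1 at threshold 0.3
print(round(f1_at_05, 2), round(f1_at_03, 2))  # 0.5 0.55
```

Lowering the threshold trades precision for recall, and here the net effect on f1 is only a small gain.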

In [21]:
y.value_counts()

Exited
0    7963
1    2037
Name: count, dtype: int64

**The dataset is imbalanced: about 80% of customers stayed (class 0) and about 20% churned (class 1).**
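The class counts also show why accuracy alone is misleading here. A classifier that always predicts class 0 would already be right for every non-churner, as this small sketch of the arithmetic shows (the counts are copied from the `value_counts()` output above).

```python
counts = {0: 7963, 1: 2037}  # from y.value_counts()
total = sum(counts.values())

majority_class = max(counts, key=counts.get)
baseline_accuracy = counts[majority_class] / total  # always predict the majority class
imbalance_ratio = counts[0] / counts[1]

print(majority_class, round(baseline_accuracy, 4), round(imbalance_ratio, 2))
# 0 0.7963 3.91
```

Against that baseline of roughly 0.80, the 0.84 accuracy reported earlier is a modest improvement, which is why the per-class recall and f1-score matter more.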

### Handling imbalanced dataset

#### 1. Undersampling

In [22]:
class_count_0, class_count_1 = df['Exited'].value_counts()

print('Before sampling:', class_count_0, class_count_1)

df_class_0 = df[df['Exited'] == 0]
df_class_1 = df[df['Exited'] == 1]

df_class_0_resampled = df_class_0.sample(class_count_1, random_state=42)

df1 = pd.concat([df_class_0_resampled, df_class_1], axis=0)

df1['Exited'].value_counts()

Before sampling: 7963 2037


Exited
0    2037
1    2037
Name: count, dtype: int64
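The same idea can be written without pandas. This is a minimal sketch of random undersampling on plain lists, assuming only the standard library; `random_undersample` is a hypothetical helper, and the toy data (8 rows of class 0, 2 rows of class 1) is made up for illustration.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=42):
    """Keep every row of the smallest class and an equal-sized random subset of the others."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement; the smallest class is kept whole.
        chosen = group if len(group) == target else rng.sample(group, target)
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


rows = list(range(10))
labels = [0] * 8 + [1] * 2
new_rows, new_labels = random_undersample(rows, labels)
print(Counter(new_labels))
```

The cost of this approach is visible in the counts: most majority-class rows are simply thrown away.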

In [25]:
def ANN(X, y, epochs=100):   
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
    
    print('Train - Test shape:', X_train.shape, X_test.shape)
    
    model = get_model()
    
    # Early Stopping Callback
    early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
    
    # Model Training
    print('Model fit')
    model.fit(X_train, y_train, epochs=epochs, verbose=0, validation_split=0.2, callbacks=[early_stopping])
    
    print('Model evaluate')
    model.evaluate(X_test, y_test)
    
    y_pred_proba = model.predict(X_test)
    y_pred = np.round(y_pred_proba)

    print('Classification report')
    print(classification_report(y_test, y_pred))
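The `patience` argument of the early stopping callback can be understood with a simplified sketch: training stops once the validation loss has failed to improve for `patience` consecutive epochs, and the best epoch is remembered so its weights can be restored. The function below only mimics that idea on a list of losses; it is not the Keras implementation, and `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return (stop_epoch, best_epoch), both 0-based, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training ends at the last epoch.
    return len(val_losses) - 1, best_epoch


losses = [0.50, 0.45, 0.43, 0.44, 0.46, 0.47]
print(early_stopping_epoch(losses, patience=3))  # (5, 2)
```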

In [26]:
X, y = df1.drop('Exited', axis=1), df1['Exited']

ANN(X, y)

Train - Test shape: (3259, 11) (815, 11)
Model fit
Model evaluate
Classification report
              precision    recall  f1-score   support

           0       0.78      0.77      0.77       408
           1       0.77      0.78      0.78       407

    accuracy                           0.77       815
   macro avg       0.77      0.77      0.77       815
weighted avg       0.77      0.77      0.77       815



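The f1-score for the minority class (1) improved, while the score for class 0 degraded; that trade-off is acceptable. The classifier now treats both classes more evenly, with similar scores for each. Note that the test split here is itself balanced, so these numbers are not directly comparable to the first report.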

#### Using imblearn

In [27]:
from imblearn.under_sampling import RandomUnderSampler

# Randomly drop majority-class rows until both classes have the same count
under_sampler = RandomUnderSampler(random_state=42)

X, y = df.drop('Exited', axis=1), df['Exited']
X_resampled, y_resampled = under_sampler.fit_resample(X, y)

In [28]:
y_resampled.value_counts()

Exited
0    2037
1    2037
Name: count, dtype: int64

In [29]:
ANN(X_resampled, y_resampled)

Train - Test shape: (3259, 11) (815, 11)
Model fit
Model evaluate
Classification report
              precision    recall  f1-score   support

           0       0.76      0.80      0.78       408
           1       0.79      0.74      0.76       407

    accuracy                           0.77       815
   macro avg       0.77      0.77      0.77       815
weighted avg       0.77      0.77      0.77       815



#### 2. Oversampling

In [30]:
from imblearn.over_sampling import RandomOverSampler

over_sampler = RandomOverSampler(random_state=42)

X, y = df.drop('Exited', axis=1), df['Exited']
X_resampled, y_resampled = over_sampler.fit_resample(X, y)

In [31]:
y_resampled.value_counts()

Exited
1    7963
0    7963
Name: count, dtype: int64

In [32]:
y.value_counts()

Exited
0    7963
1    2037
Name: count, dtype: int64
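Random oversampling works in the opposite direction: minority rows are duplicated, drawn with replacement, until every class matches the largest one. Below is a minimal standard-library sketch of that idea; `random_oversample` is a hypothetical helper and the toy data is made up.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=42):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing rows with replacement from the existing ones.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


rows = list(range(10))
labels = [0] * 8 + [1] * 2
new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))
```

Because the added rows are exact copies, resampling before the train/test split can place the same row in both sets, which may make test scores look better than they would on unseen data.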

In [33]:
ANN(X_resampled, y_resampled)

Train - Test shape: (12740, 11) (3186, 11)
Model fit
Model evaluate
Classification report
              precision    recall  f1-score   support

           0       0.75      0.83      0.79      1593
           1       0.81      0.72      0.76      1593

    accuracy                           0.78      3186
   macro avg       0.78      0.78      0.77      3186
weighted avg       0.78      0.78      0.77      3186



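The f1-score for the minority class (1) improved, while the score for class 0 degraded; that trade-off is acceptable. The classifier now treats both classes more evenly, with similar scores for each. As with undersampling, the test split here is itself balanced, so these numbers are not directly comparable to the first report.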

#### 3. SMOTE (Synthetic Minority Over-sampling Technique)

In [34]:
from imblearn.over_sampling import SMOTE

smote_sampler = SMOTE(sampling_strategy='minority', random_state=42)

X, y = df.drop('Exited', axis=1), df['Exited']
X_resampled, y_resampled = smote_sampler.fit_resample(X, y)

In [35]:
y_resampled.value_counts()

Exited
1    7963
0    7963
Name: count, dtype: int64
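Unlike random oversampling, SMOTE does not copy rows. It creates new minority points by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that single interpolation step in pure Python; it is not the imblearn implementation, `smote_sample` is a hypothetical helper, and the toy points and `k=2` are assumptions chosen to keep the example small.

```python
import math
import random


def smote_sample(minority, k=2, seed=42):
    """Create one synthetic point between a minority point and one of its k nearest neighbours."""
    rng = random.Random(seed)
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    # k nearest minority neighbours of the chosen point (Euclidean distance).
    neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbour = rng.choice(neighbours)
    lam = rng.random()  # interpolation weight in [0, 1)
    synthetic = [b + lam * (n - b) for b, n in zip(base, neighbour)]
    return base, neighbour, synthetic


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
base, neighbour, synthetic = smote_sample(minority)
print(base, neighbour, synthetic)
```

The synthetic point always lies on the line segment between the two originals, so it stays inside the region already occupied by the minority class.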

In [36]:
ANN(X_resampled, y_resampled)

Train - Test shape: (12740, 11) (3186, 11)


Model fit
Model evaluate
Classification report
              precision    recall  f1-score   support

           0       0.79      0.81      0.80      1593
           1       0.81      0.78      0.79      1593

    accuracy                           0.80      3186
   macro avg       0.80      0.80      0.80      3186
weighted avg       0.80      0.80      0.80      3186

