<a href="https://colab.research.google.com/github/jpkrajewski/ANN-banking-customer-churn/blob/main/Artificial_Neural_Network.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Problem Definition

The dataset provided contains information about bank customers, including various attributes such as customer ID, credit score, geography, gender, age, tenure, balance, number of products, credit card status, active membership, estimated salary, and whether the customer has exited the bank.

The task is to predict whether a customer will churn, that is, leave the bank. The "Exited" column records the outcome (1 if the customer left, 0 otherwise). The goal is a predictive model that accurately classifies customers as churned or not churned from the available features.

**Statement:** Develop a machine learning model that can predict whether a bank customer is likely to churn or exit based on their credit score, geography, gender, age, tenure, balance, number of products, credit card status, active membership, and estimated salary.

By solving this problem, the bank could proactively identify customers who are at risk of leaving and take appropriate actions to retain them, such as offering personalized incentives or improved services.

### Importing the libraries

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import KFold

In [None]:
tf.__version__

'2.12.0'

## Part 1 - Data Preprocessing

### Importing the dataset

In [None]:
dataset = pd.read_csv('Churn_Modelling.csv')
dataset.head()

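| | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |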


In [None]:
# 'RowNumber', 'CustomerId' and 'Surname' are identifiers that carry no
# predictive information, so the features start at column index 3 ('CreditScore').

X = dataset.iloc[:, 3:-1].values
y = dataset.iloc[:, -1].values

### Encoding categorical data

Label Encoding the "Gender" column

In [None]:
# By performing this operation, the categorical gender column in X is
# transformed into numerical labels using label encoding.

# The modified X now contains the encoded labels in place of the original
# gender values in the third column.

le_gender = LabelEncoder()
X[:, 2] = le_gender.fit_transform(X[:, 2])
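
To make the label-encoding step concrete, here is a minimal NumPy sketch of the mapping that `LabelEncoder` produces: the sorted unique classes are numbered from 0. The toy `gender` array is illustrative and is not taken from the dataset.

```python
import numpy as np

# Toy gender column (illustrative values only).
gender = np.array(["Female", "Male", "Male", "Female"])

# np.unique returns the sorted classes and, with return_inverse=True,
# the position of each value within those sorted classes.
classes, codes = np.unique(gender, return_inverse=True)

print(classes)  # sorted classes: "Female" comes before "Male"
print(codes)    # one integer label per row
```

Because the classes are sorted alphabetically, "Female" maps to 0 and "Male" maps to 1 in this sketch.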

One Hot Encoding the "Geography" column

In [None]:
# The second column of X (Geography, index 1) is replaced with its one-hot
# encoded representation. ColumnTransformer places the encoded columns first;
# remainder='passthrough' appends the remaining columns unchanged.

ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
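
The effect of the transformer on the column layout can be sketched with plain NumPy. This is an illustration under assumptions, not the scikit-learn implementation: `X_toy` and its three rows are made-up values, and the three country names are assumed for the example.

```python
import numpy as np

# Toy feature matrix: [CreditScore, Geography, Gender (already label-encoded)].
X_toy = np.array([
    [619, "France", 0],
    [608, "Spain", 0],
    [502, "Germany", 1],
], dtype=object)

geo = X_toy[:, 1].astype(str)
categories = np.unique(geo)  # sorted categories, one output column each

# One row per sample, one column per category, 1.0 where the category matches.
one_hot = (geo[:, None] == categories[None, :]).astype(float)

# Mimic remainder='passthrough': encoded columns first, then the other columns.
rest = np.delete(X_toy, 1, axis=1)
X_encoded = np.hstack([one_hot, rest])

print(categories)
print(X_encoded)
```

After encoding, the single Geography column has become three indicator columns at the front of the matrix, which is why later column indices shift.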

### Feature Scaling

In [None]:
# Standardization is a common preprocessing step in machine learning
# that helps to ensure that features are on a similar scale,
# preventing any particular feature from dominating the learning algorithm.

# The dependent variable (y) is not scaled because it is already binary (0 or 1).

sc = StandardScaler()
X_std = sc.fit_transform(X)
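
Standardization itself is a short formula: subtract each column's mean and divide by its standard deviation. The sketch below reproduces it in NumPy on made-up numbers (`X_toy` and `new_row` are illustrative), using the population standard deviation (`ddof=0`) as `StandardScaler` does.

```python
import numpy as np

# Toy data: two numeric features on very different scales.
X_toy = np.array([[600.0, 40.0],
                  [700.0, 50.0],
                  [800.0, 30.0]])

mean = X_toy.mean(axis=0)
std = X_toy.std(axis=0)  # population standard deviation (ddof=0)

# Each column now has mean 0 and standard deviation 1.
X_scaled = (X_toy - mean) / std

# New data must be scaled with the statistics learned above,
# not with its own mean and standard deviation.
new_row = np.array([[650.0, 45.0]])
new_scaled = (new_row - mean) / std

print(X_scaled)
print(new_scaled)
```

Reusing the stored `mean` and `std` for new rows is the reason the fitted scaler has to be kept alongside the trained model.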

## Part 2 - Building the ANN

### Initializing the ANN

In [None]:
def create_model(layers: int, units: int) -> tf.keras.Sequential:
  """Create an ANN with a fixed 6-unit first layer, followed by
  `layers` hidden layers of `units` units each and a sigmoid output."""
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Dense(units=6, activation='relu'))

  for _ in range(layers):
    model.add(tf.keras.layers.Dense(units=units, activation='relu'))

  model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

  # The sigmoid function is commonly used in binary classification tasks
  # because it maps any real-valued number to a range between 0 and 1,
  # which can be interpreted as a probability.

  # By setting a threshold (e.g., 0.5),
  # the sigmoid output can be interpreted as a binary decision.
  # If the probability exceeds the threshold, the instance is classified as
  # the positive class; otherwise, it is classified as the negative class.
  # This allows for clear decision boundaries and straightforward classification rules.

  model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
  )

  # Binary cross-entropy, also known as log loss or logistic loss,
  # is a commonly used loss function in binary classification tasks.
  # It is particularly suitable when the output of the model is a probability
  # distribution over the classes.

  return model
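
The comments in `create_model` describe the sigmoid output, the 0.5 threshold and the binary cross-entropy loss. A minimal NumPy sketch of those three pieces follows; the helper names and the toy `logits` are assumptions made for illustration and are not part of the Keras API.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, p, eps=1e-7):
    """Mean log loss; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

# Toy pre-activation outputs of the last layer (illustrative values).
logits = np.array([-2.0, 0.0, 3.0])
probs = sigmoid(logits)

# Threshold at 0.5: sigmoid(0) is exactly 0.5, which does not exceed it.
labels = (probs > 0.5).astype(int)

y_true = np.array([0, 0, 1])
loss = binary_cross_entropy(y_true, probs)

print(probs, labels, loss)
```

The loss shrinks as the predicted probabilities move toward the true labels, which is what the optimizer exploits during training.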

In [None]:
# Each tuple is passed to create_model as (layers, units).
models_params = [(6,3), (6,4), (6,5), (6,6), (7,3), (7,4), (7,5), (7,6)]
models = [create_model(layers, units) for layers, units in models_params]

## Part 3 - Training the ANN

### Training the ANN models using K-fold Cross Validation

Training a supervised machine learning model means adjusting the model weights on a training set. Once training has finished, the model is evaluated on data it has not seen (the testing set) to estimate how well it will perform in practice. K-fold cross-validation repeats this process K times: the data is split into K folds, and each fold serves once as the testing set while the remaining folds are used for training.

When you are satisfied with the model's performance, you train it again on the entire dataset to finalize it for production use (Bogdanovist, n.d.).
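
The fold mechanics can be sketched in a few lines of NumPy. This is a simplified illustration of the idea, not scikit-learn's `KFold` implementation; the function name `kfold_indices` and the `seed` parameter are assumptions introduced for the example.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)       # shuffle before splitting
    folds = np.array_split(indices, n_splits)  # near-equal sized folds
    for i in range(n_splits):
        test_idx = folds[i]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_splits) if j != i]
        )
        yield train_idx, test_idx

splits = list(kfold_indices(10, n_splits=5))
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train_idx)} test={len(test_idx)}")
```

With 10 samples and 5 folds, every fold trains on 8 samples and tests on the remaining 2.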

In [None]:
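def evaluate_model(model, inputs, targets):
  """Run 5-fold cross-validation and return the mean test accuracy (in %).

  Note: the same model instance is fitted in every fold, so its weights
  carry over from one fold to the next instead of being re-initialized.
  """

  # Define the K-fold Cross Validator
  kfold = KFold(n_splits=5, shuffle=True)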

  # Define per-fold score containers
  acc_per_fold = []
  loss_per_fold = []

  # K-fold Cross Validation model evaluation
  fold_no = 1
  for train, test in kfold.split(inputs, targets):

    # Report which fold is being trained
    print('------------------------------------------------------------------------')
    print(f'Training for fold {fold_no} ...')

    # Fit data to model
    history = model.fit(inputs[train], targets[train],
                batch_size=32,
                epochs=100,
                verbose=1)

    # Generate generalization metrics
    scores = model.evaluate(inputs[test], targets[test], verbose=0)
    print(f'Score for fold {fold_no}: {model.metrics_names[0]} of {scores[0]}; {model.metrics_names[1]} of {scores[1]*100}%')
    acc_per_fold.append(scores[1] * 100)
    loss_per_fold.append(scores[0])

    # Increase fold number
    fold_no = fold_no + 1

  # == Provide average scores ==
  print('------------------------------------------------------------------------')
  print('Score per fold')
  for i in range(0, len(acc_per_fold)):
    print('------------------------------------------------------------------------')
    print(f'> Fold {i+1} - Loss: {loss_per_fold[i]} - Accuracy: {acc_per_fold[i]}%')
  print('------------------------------------------------------------------------')
  print('Average scores for all folds:')
  print(f'> Accuracy: {np.mean(acc_per_fold)} (+- {np.std(acc_per_fold)})')
  print(f'> Loss: {np.mean(loss_per_fold)}')
  print('------------------------------------------------------------------------')
  return np.mean(acc_per_fold)
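
Because `evaluate_model` fits one model instance across all folds, samples tested in a later fold may already have been used for training in an earlier one. A common alternative is to build a fresh model for every fold and to compute the scaling statistics on the training fold only. The sketch below shows that pattern under stated assumptions: `cross_validate`, `build_fn` and `MajorityClassModel` are hypothetical names, and the tiny stand-in model replaces the Keras network so the example runs with NumPy alone.

```python
import numpy as np

class MajorityClassModel:
    """Hypothetical stand-in for a network: predicts the most common label."""
    def fit(self, X, y):
        self.label_ = int(np.mean(y) >= 0.5)
        return self

    def accuracy(self, X, y):
        return float(np.mean(y == self.label_))

def cross_validate(build_fn, X, y, n_splits=5, seed=0):
    """Return (mean accuracy, per-fold accuracies) using a fresh model per fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_splits)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Scaling statistics come from the training fold only.
        mean = X[train_idx].mean(axis=0)
        std = X[train_idx].std(axis=0)
        std[std == 0] = 1.0  # guard against constant columns
        model = build_fn()   # new, untrained model for this fold
        model.fit((X[train_idx] - mean) / std, y[train_idx])
        scores.append(model.accuracy((X[test_idx] - mean) / std, y[test_idx]))
    return float(np.mean(scores)), scores

# Toy data: 20 samples, 3 features, 5 positives (illustrative only).
X_demo = np.random.default_rng(1).normal(size=(20, 3))
y_demo = np.array([0] * 15 + [1] * 5)
mean_acc, fold_acc = cross_validate(MajorityClassModel, X_demo, y_demo)
print(mean_acc, fold_acc)
```

With the notebook's network, `build_fn` could be something like `lambda: create_model(6, 6)`, with `fit` and `evaluate` calls in place of the stand-in's methods. The scores would then differ from the ones reported in this notebook, which were produced by the original procedure.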

In [None]:
evaluation = []
for i, model in enumerate(models):
  evaluation.append((i, evaluate_model(model, X_std, y)))

## Part 4 - Evaluating the model

### Selecting the best model

In [None]:
# Print each model's index and mean accuracy
print(evaluation)

[(0, 86.16999983787537), (1, 85.9500002861023), (2, 85.83000063896179), (3, 86.32999897003174), (4, 79.62999939918518), (5, 86.16999983787537), (6, 79.63000059127808), (7, 86.20999932289124)]


In [None]:
# Select the best-performing model.
# The best model is the one with index 3.
max(evaluation, key=lambda x: x[1])

(3, 86.32999897003174)

In [None]:
best_model = models[3]

In [None]:
# Best model architecture
best_model.summary()

Model: "sequential_21"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_146 (Dense)           (None, 6)                 78        
                                                                 
 dense_147 (Dense)           (None, 6)                 42        
                                                                 
 dense_148 (Dense)           (None, 6)                 42        
                                                                 
 dense_149 (Dense)           (None, 6)                 42        
                                                                 
 dense_150 (Dense)           (None, 6)                 42        
                                                                 
 dense_151 (Dense)           (None, 6)                 42        
                                                                 
 dense_152 (Dense)           (None, 6)               
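
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name for this sketch. The value of 12 inputs is inferred from the first layer's 78 parameters, which would be consistent with three one-hot Geography columns plus nine other features.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus biases (n_units) of a Dense layer."""
    return n_inputs * n_units + n_units

print(dense_params(12, 6))  # 78, the first layer in the summary
print(dense_params(6, 6))   # 42, each following 6-unit layer
```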

### Model finalization

In [None]:
best_model.save('model')
!zip -r ./model.zip ./model/

In [None]:
import joblib
joblib.dump(sc, 'scaler.sav')