# Artificial Neural Network for Customer Churn Prediction

This notebook demonstrates how to build a neural network to predict customer churn in a bank. We'll go through data preprocessing, model building, training, and evaluation.

# Part 1 - Data Preprocessing

In this section, we'll prepare our data for training the neural network.

### Step 1: Import Required Libraries

We need the following libraries:
- numpy: For numerical operations
- matplotlib: For creating visualizations
- pandas: For data manipulation
- seaborn: For enhanced visualizations


In [None]:
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

### Step 2: Load and Inspect the Dataset

Load the Churn_Modelling.csv file and separate features:
- X: Independent variables (columns 3-12, since the iloc slice 3:13 excludes its end index)
- y: Dependent variable (column 13, whether customer churned)

We'll print the shapes to understand our data dimensions.

In [None]:
# Importing the dataset
dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13]
print("Independent features:\n", X)

In [None]:
print("Shape of Independent features data:\n", X.shape)

In [None]:
y = dataset.iloc[:, 13]
print("Dependent features:\n", y)

In [None]:
print("Shape of Dependent features data:\n", y.shape)

### Step 3: Handle Categorical Variables

Convert categorical variables (Geography and Gender) into numeric format using one-hot encoding.
We use drop_first=True to avoid the dummy variable trap.

In [None]:
#Create dummy variables
geography=pd.get_dummies(X["Geography"],drop_first=True)
print("Shape of Geography data:\n", geography.shape)
print("geography data:\n", geography)
gender=pd.get_dummies(X['Gender'],drop_first=True)
print("Shape of gender data:\n", gender.shape)
print("gender data:\n", gender)
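
The dummy variable trap mentioned above can be illustrated with a small sketch. It uses a hypothetical toy column (not the real dataset) and plain NumPy instead of pd.get_dummies. The full one-hot columns always sum to 1, so together with an intercept column they are linearly dependent; dropping the first category removes that redundancy.

```python
import numpy as np

# Hypothetical toy column with three categories (not the real dataset).
geo = np.array(["France", "Spain", "Germany", "Spain", "France"])
categories = np.unique(geo)  # sorted: France, Germany, Spain

# Full one-hot encoding: one column per category.
full = (geo[:, None] == categories[None, :]).astype(float)

# Equivalent of drop_first=True: discard the first category's column.
reduced = full[:, 1:]

# Add an intercept column and compare the rank with the column count.
intercept = np.ones((len(geo), 1))
full_design = np.hstack([intercept, full])        # 4 columns
reduced_design = np.hstack([intercept, reduced])  # 3 columns

rank_full = np.linalg.matrix_rank(full_design)
rank_reduced = np.linalg.matrix_rank(reduced_design)

print("full:", full_design.shape[1], "columns, rank", rank_full)
print("reduced:", reduced_design.shape[1], "columns, rank", rank_reduced)
```

The full design has more columns than its rank, while the reduced design is full rank. No information is lost: a row of all zeros in the reduced encoding stands for the dropped category.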

### Step 4: Combine Features

Concatenate the one-hot encoded variables with our original features.
This step creates our final feature matrix.

In [None]:
## Concatenate the Data Frames
X=pd.concat([X,geography,gender],axis=1)
print("Shape of X:\n", X.shape)
print("Concatenated data:\n", X.values)

### Step 5: Clean Up Features

Remove the original categorical columns since we now have their one-hot encoded versions.
This prevents duplicate information in our dataset.

In [None]:
## Drop Unnecessary columns
X=X.drop(['Geography','Gender'],axis=1)
print("Shape of X:\n", X.shape)
print("Concatenated data after dropping unnecessary columns:\n", X.values)

### Step 6: Split the Dataset

Divide our data into training (80%) and testing (20%) sets.
We use random_state=0 for reproducible results.

In [None]:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
print("Shape of X_train:\n", X_train.shape)
print("Shape of X_test:\n", X_test.shape)
print("Shape of y_train:\n", y_train.shape)
print("Shape of y_test:\n", y_test.shape)

### Step 7: Feature Scaling

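Standardize the features with StandardScaler. The scaler is fitted on the training set only and then applied unchanged to the test set, so no test-set statistics leak into training.
Scaling matters for neural networks because it:
- Puts all features on the same scale
- Helps gradient-based training converge faster
- Prevents features with larger values from dominating the learning process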

In [None]:
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
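
As a rough illustration of what fit_transform and transform do in the cell above, the following sketch standardizes a single hypothetical feature by hand. The values are made up; the point is that the mean and standard deviation come from the training data only and are then reused for the test data.

```python
import numpy as np

# Hypothetical values standing in for one feature (e.g. an account balance).
train = np.array([[0.0], [50_000.0], [100_000.0], [150_000.0]])
test = np.array([[75_000.0], [200_000.0]])

# "fit": learn the statistics from the training data only.
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), which StandardScaler also uses

# "transform": apply the same training statistics to both sets.
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print("train mean/std after scaling:", train_scaled.mean(), train_scaled.std())
print("test after scaling:", test_scaled.ravel())
```

The scaled training column has mean 0 and standard deviation 1. The test column generally does not, because it is expressed in units of the training distribution.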

# Part 2 - Building the Artificial Neural Network

Now we'll create and train our neural network model.

### Step 8: Import Neural Network Libraries

Import Keras components needed for building our neural network:
- Sequential: For creating the neural network
- Dense: For adding fully connected layers
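- Dropout: For preventing overfitting

The cell below also imports the LeakyReLU, PReLU, and ELU activation layers. Neither these nor Dropout are used in the model built in this notebook.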

In [None]:
# Importing the Keras libraries and packages
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LeakyReLU,PReLU,ELU
from keras.layers import Dropout

### Step 9: Initialize the Neural Network

Create a Sequential model, which is a linear stack of layers.

In [None]:
# Initialising the ANN
classifier = Sequential()

### Step 10: Add First Hidden Layer

Add the first hidden layer with:
- 6 neurons (units)
- ReLU activation function
- He uniform initialization for weights
- 11 input features (input_dim)

In [None]:
# Adding the input layer and the first hidden layer
# classifier.add(Dense(output_dim = 6, init = 'he_uniform',activation='relu',input_dim = 11))     ## older version
classifier.add(Dense(units = 6, kernel_initializer = 'he_uniform',activation='relu',input_dim = 11))
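
To make the He uniform initialization concrete, here is a minimal NumPy sketch of the sampling rule: weights are drawn uniformly from [-limit, limit] with limit = sqrt(6 / fan_in), where fan_in is the number of inputs to the layer. The helper name he_uniform_limit and the random seed are illustrative choices, not part of Keras.

```python
import numpy as np

def he_uniform_limit(fan_in):
    """Bound of the uniform distribution used by He uniform initialization."""
    return np.sqrt(6.0 / fan_in)

fan_in, units = 11, 6           # sizes of the first hidden layer above
limit = he_uniform_limit(fan_in)

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
weights = rng.uniform(-limit, limit, size=(fan_in, units))

print("limit:", round(float(limit), 4))
print("weight matrix shape:", weights.shape)
```

A smaller fan_in gives a wider range, which keeps the variance of the layer's output roughly stable when ReLU zeroes out about half of the activations.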

### Step 11: Add Second Hidden Layer

Add another hidden layer with:
- 6 neurons
- ReLU activation
- He uniform initialization

In [None]:
# Adding the second hidden layer
classifier.add(Dense(units = 6, kernel_initializer = 'he_uniform',activation='relu'))

### Step 12: Add Output Layer

Add the output layer with:
- 1 neuron (binary classification)
- Sigmoid activation (outputs probability between 0 and 1)
- Glorot uniform initialization

In [None]:
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'glorot_uniform', activation = 'sigmoid'))
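
With the output layer added, the architecture is 11 inputs, two hidden layers of 6 units, and 1 output. A quick way to sanity-check a network like this is to count its trainable parameters by hand; the sketch below does so in plain Python (dense_params is a hypothetical helper, not a Keras function).

```python
# Layer sizes taken from the steps above: 11 inputs -> 6 -> 6 -> 1.
layer_sizes = [11, 6, 6, 1]

def dense_params(n_in, n_out):
    """A dense layer has n_in * n_out weights plus one bias per output unit."""
    return n_in * n_out + n_out

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)  # [72, 42, 7] 121
```

These numbers should match the "Param #" column and the total reported by classifier.summary().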

### Step 13: Compile the Model

Configure the learning process with:
- Adamax optimizer: An adaptive learning rate optimization algorithm
- Binary cross-entropy loss: Suitable for binary classification
- Accuracy metric: To monitor model performance

In [None]:
# Compiling the ANN
classifier.compile(optimizer = 'Adamax', loss = 'binary_crossentropy', metrics = ['accuracy'])
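
For intuition about the loss chosen above, the following sketch computes binary cross-entropy by hand in NumPy on made-up labels and probabilities. The clipping constant eps is an assumption added here to avoid log(0); it is not taken from the notebook.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

# Hypothetical labels (1 = churned) and two sets of predicted probabilities.
y_true = [1, 0, 1, 0]
confident = binary_crossentropy(y_true, [0.9, 0.1, 0.8, 0.2])
uncertain = binary_crossentropy(y_true, [0.5, 0.5, 0.5, 0.5])

print("confident and correct:", round(confident, 4))
print("always 0.5:", round(uncertain, 4))
```

Predictions that are confident and correct give a low loss, while predicting 0.5 for everything gives log(2), about 0.693. A training loss that stays near that value suggests the model has not learned anything yet.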

### Step 14: Train the Model

Train the neural network with:
- 33% validation split
- Batch size of 10
- 100 epochs
- Verbose output to monitor progress

In [None]:
# Fitting the ANN to the Training set
model_history=classifier.fit(X_train, y_train,validation_split=0.33, batch_size = 10, epochs=100, verbose=1)      ## earlier version: nb_epoch=100
print("model history:\n", model_history)  # a History object; the per-epoch metrics are in model_history.history

### Step 15: Visualize Training History

Plot the training metrics:
1. Accuracy plot: Shows how model accuracy improves over epochs
2. Loss plot: Shows how the loss decreases over epochs

Both plots compare training and validation metrics to detect overfitting.

In [None]:
# list all data in history
print(model_history.history.keys())

In [None]:
# summarize history for accuracy
plt.plot(model_history.history['accuracy'])             ## old version: plt.plot(model_history.history['acc'])
plt.plot(model_history.history['val_accuracy'])         ## old version: plt.plot(model_history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')   # second curve is the validation split, not the test set
plt.show()

In [None]:
# summarize history for loss
plt.plot(model_history.history['loss'])
plt.plot(model_history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')   # second curve is the validation split, not the test set
plt.show()
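
Besides reading the plots, overfitting can be checked numerically. The sketch below uses a hypothetical dictionary shaped like model_history.history (the loss values are invented) to find the epoch with the lowest validation loss and to track the gap between validation and training loss.

```python
# Hypothetical per-epoch losses, shaped like model_history.history.
history = {
    "loss":     [0.60, 0.48, 0.41, 0.37, 0.34, 0.32],
    "val_loss": [0.61, 0.50, 0.44, 0.43, 0.45, 0.48],
}

val_loss = history["val_loss"]

# Epoch (0-based) at which the validation loss is lowest.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Gap between validation and training loss at every epoch.
gaps = [v - t for t, v in zip(history["loss"], val_loss)]

print("best epoch:", best_epoch)
print("gap at first epoch:", round(gaps[0], 2), "gap at last epoch:", round(gaps[-1], 2))
```

A training loss that keeps falling while the validation loss rises after its minimum, as in this made-up example, is the typical signature of overfitting.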

# Part 3 - Model Evaluation

Now we'll evaluate our model's performance on the test set.

### Step 16: Make Predictions

Use the trained model to make predictions on test data:
1. Get probability predictions
2. Convert probabilities to binary predictions (threshold = 0.5)

In [None]:
# Predicting the Test set results
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
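
The thresholding step can be seen in isolation on a few hypothetical probabilities. The array below mimics the (n_samples, 1) shape that a single-unit sigmoid output layer typically produces; the values are made up.

```python
import numpy as np

# Hypothetical sigmoid outputs, one row per customer.
probs = np.array([[0.12], [0.50], [0.73], [0.49]])

# Strictly greater than 0.5 counts as "churn"; exactly 0.5 does not.
labels = (probs > 0.5).astype(int).ravel()

print(labels)  # [0 0 1 0]
```

The 0.5 cut-off is a convention rather than a requirement. Lowering it flags more customers as likely churners at the cost of more false positives.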

### Step 17: Create Confusion Matrix

Generate and display the confusion matrix to show:
- True Positives
- True Negatives
- False Positives
- False Negatives

In [None]:
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", cm)
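
To show how the four cells relate to the printed matrix, this sketch counts them by hand on hypothetical labels and arranges them in the layout scikit-learn uses for binary labels 0 and 1: rows are actual classes, columns are predicted classes, giving [[TN, FP], [FN, TP]]. The variable y_hat is an illustrative name for the predictions.

```python
import numpy as np

# Hypothetical labels: 1 = churned, 0 = stayed.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0])
y_hat = np.array([0, 0, 1, 0, 1, 0, 1, 0])

tn = int(np.sum((y_true == 0) & (y_hat == 0)))
fp = int(np.sum((y_true == 0) & (y_hat == 1)))
fn = int(np.sum((y_true == 1) & (y_hat == 0)))
tp = int(np.sum((y_true == 1) & (y_hat == 1)))

# Rows: actual class, columns: predicted class.
cm_manual = np.array([[tn, fp], [fn, tp]])

accuracy = (tp + tn) / cm_manual.sum()
precision = tp / (tp + fp)  # of predicted churners, how many really churned
recall = tp / (tp + fn)     # of actual churners, how many were caught

print(cm_manual)
print(accuracy, precision, recall)
```

For churn data, where churners are usually the minority class, precision and recall can be more informative than accuracy alone.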

### Step 18: Calculate and Visualize Final Results

1. Calculate the overall accuracy score
2. Create a heatmap visualization of the confusion matrix
3. Display final model performance metrics

In [None]:
# Calculate the Accuracy
from sklearn.metrics import accuracy_score
score=accuracy_score(y_test,y_pred)   # scikit-learn's argument order is (y_true, y_pred)
print("Accuracy:", score)

In [None]:
plt.figure(figsize=(6,4))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title(f'Confusion Matrix (Accuracy = {score:.2f})')
plt.show()