# Assignment 2: Design a Three-Layered ANN Classifier

Instructions
You are provided with a bank customer dataset (Churn_Modelling.csv) containing records for about 10,000 customers, which can be used to predict whether a customer is likely to churn. There are multiple features. Identify which features are significant in determining whether the customer will churn. The last column, “Exited,” tells whether the customer stayed with the bank (Exited = 0) or left the bank (Exited = 1). Write Python code to design a three-layered ANN classifier that predicts whether a customer will churn on the test set, which is 20% of the total dataset. Print the confusion matrix and accuracy, and then submit the Python code.
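One possible way to approach the "identify which features are significant" step is to rank numeric columns by their correlation with Exited and to compare churn rates across the categories of a categorical column. The sketch below assumes this approach and uses a small invented table in place of Churn_Modelling.csv. Apart from Geography and Exited, the column names (Age, Balance, IsActiveMember) are assumptions about the file, and the helper functions `rank_numeric_features` and `churn_rate_by_category` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Toy stand-in for Churn_Modelling.csv; every value here is invented for illustration.
toy = pd.DataFrame({
    "Age":            [25, 48, 33, 60, 41, 29, 55, 38],
    "Balance":        [0.0, 120000.0, 50000.0, 98000.0, 0.0, 76000.0, 130000.0, 15000.0],
    "IsActiveMember": [1, 0, 1, 0, 1, 1, 0, 1],
    "Geography":      ["France", "Germany", "Spain", "Germany",
                       "France", "Spain", "Germany", "France"],
    "Exited":         [0, 1, 0, 1, 0, 0, 1, 0],
})


def rank_numeric_features(df, target="Exited"):
    """Rank numeric columns by absolute Pearson correlation with the target."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False)


def churn_rate_by_category(df, column, target="Exited"):
    """Mean of the 0/1 target per category, i.e. the churn rate in each group."""
    return df.groupby(column)[target].mean().sort_values(ascending=False)


ranking = rank_numeric_features(toy)
rates = churn_rate_by_category(toy, "Geography")
print(ranking)
print(rates)
```

Correlation only captures linear, one-feature-at-a-time relationships, so a ranking like this is a first look rather than a final judgment on which features matter to the network.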

Be sure to encode the categorical data and perform feature scaling. Use ‘relu’ activation for the first and second layers and ‘sigmoid’ for the last dense layer. For compiling, use the ‘adam’ optimizer; the loss should be ‘binary_crossentropy’, as this is a binary classification problem.
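To make the pairing of a sigmoid output with binary cross-entropy concrete, here is a minimal NumPy sketch of both. It is written from the textbook formulas and is not Keras's own implementation. The clipping constant `eps` and the sample logits are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Map raw scores (logits) to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    per_sample = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(per_sample))


logits = np.array([2.0, -1.0, 0.0, -3.0])  # hypothetical pre-sigmoid outputs
labels = np.array([1, 0, 1, 0])            # 1 = Exited, 0 = stayed

probs = sigmoid(logits)
loss = binary_crossentropy(labels, probs)
preds = (probs > 0.5).astype("int32")      # threshold probabilities at 0.5
print(probs, loss, preds)
```

The loss penalizes confident wrong probabilities much more heavily than hesitant ones, which is why it suits a single sigmoid output that is later thresholded into a 0/1 churn prediction.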

Churn Modelling.csv


In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.metrics import confusion_matrix, accuracy_score
from keras.models import Sequential
from keras.layers import Dense

# Load the dataset
df = pd.read_csv("Churn_Modelling.csv")

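# Separate features from the target, dropping identifier columns
# (RowNumber, CustomerId, Surname) that carry no predictive signal.
# errors='ignore' skips any of these columns that are not in the file.
X = df.drop(columns=['RowNumber', 'CustomerId', 'Surname', 'Exited'], errors='ignore')
y = df['Exited']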

# Identify categorical and numerical columns
categorical_cols = ['Geography', 'Gender']
numerical_cols = X.select_dtypes(include=['int64', 'float64']).columns.tolist()

# Create a column transformer for preprocessing
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numerical_cols),
        ('cat', OneHotEncoder(), categorical_cols)
    ])

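# Split the dataset into training and testing sets before fitting the
# preprocessor, so test-set statistics do not leak into the scaling
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the preprocessor on the training data only, then apply it to both sets
X_train = preprocessor.fit_transform(X_train)
X_test = preprocessor.transform(X_test)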

# Build the ANN model
model = Sequential()
model.add(Dense(units=32, activation='relu', input_shape=(X_train.shape[1],)))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=1, activation='sigmoid'))

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=100, batch_size=32)

# Evaluate the model
y_pred = (model.predict(X_test) > 0.5).astype("int32")
cm = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)

# Print the confusion matrix and accuracy
print("Confusion Matrix:\n", cm)
print("Accuracy:", accuracy)



Epoch 1/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.7257 - loss: 0.5431
Epoch 2/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8283 - loss: 0.4127
Epoch 3/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8411 - loss: 0.3758
Epoch 4/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8480 - loss: 0.3573
Epoch 5/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8600 - loss: 0.3425
Epoch 6/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8626 - loss: 0.3387
Epoch 7/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8613 - loss: 0.3378
Epoch 8/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8636 - loss: 0.3299
Epoch 9/100
[1m250/250[0m [32