#### Problem Statement
The goal of this assignment is to predict customer churn with an Artificial Neural Network (ANN). The dataset contains 10,000 customer records with features such as credit score, geography, gender, age, tenure, and balance. The target variable is `Exited`, which indicates whether a customer has left the service (1) or not (0). Preprocessing consists of dropping identifier columns, encoding the categorical variables, scaling the features, and splitting the data into training and test sets. An ANN is then built, trained, and evaluated as a binary classifier for churn.

In [10]:
import pandas as pd

# Load the dataset
df = pd.read_csv('Churn_Modelling.csv')



In [11]:
# Display the first few rows
df.head()


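|   | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |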


In [12]:
# Check for missing values
df.isnull().sum()


RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

In [13]:
# Check dataset information
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  object 
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB


In [14]:
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split

# Drop identifier columns that carry no predictive signal
df = df.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)



In [15]:
df.sample()

|   | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3014 | 628 | Spain | Male | 43 | 3 | 184926.61 | 1 | 1 | 0 | 122937.57 | 0 |


In [16]:
# Encode 'Gender' as 0/1 (LabelEncoder sorts the labels, so Female -> 0, Male -> 1)
label_encoder = LabelEncoder()
df['Gender'] = label_encoder.fit_transform(df['Gender'])


In [17]:
# One-hot encode 'Geography'; drop_first=True drops the first category (France) as the reference level
df = pd.get_dummies(df, columns=['Geography'], drop_first=True)

In [18]:
df.sample()

|   | CreditScore | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | Geography_Germany | Geography_Spain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3191 | 590 | 1 | 32 | 5 | 0.0 | 2 | 1 | 0 | 59249.83 | 0 | False | False |


In [19]:
# Split features and target
X = df.drop('Exited', axis=1)  # Features
y = df['Exited']  # Target


In [22]:
X.shape


(10000, 11)

In [23]:
y.shape

(10000,)

In [24]:
# Standardize all features (note: the scaler is fit on the full dataset, before the split)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)
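
Because the scaler in the cell above sees all 10,000 rows, the test rows influence the mean and standard deviation used for scaling. A minimal sketch of a leakage-free ordering is shown below: split first, then fit the scaler on the training rows only. The synthetic `X_demo` and `y_demo` frames and the `stratify` argument are illustrative assumptions and are not part of the original notebook, which would reuse its own `X` and `y`.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for X and y (assumption: hypothetical data for illustration)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    "CreditScore": rng.integers(350, 851, size=500),
    "Age": rng.integers(18, 93, size=500),
    "Balance": rng.uniform(0, 250_000, size=500),
})
y_demo = pd.Series(rng.binomial(1, 0.2, size=500), name="Exited")

# Split first; stratify keeps the churn ratio similar in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Fit the scaler on the training rows only, then apply it to both splits
scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)
X_te_scaled = scaler_demo.transform(X_te)

print(X_tr_scaled.mean(axis=0))  # close to zero on the training split
print(X_te_scaled.mean(axis=0))  # near zero, but not exactly, on the test split
```

With this ordering the test set stays unseen until evaluation, so the reported metrics are a slightly more honest estimate of performance on new customers.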

## Build ANN Model

In [25]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Initialize the ANN
model = Sequential()

# Declare the input shape explicitly (avoids the Keras warning about input_dim),
# then add the first hidden layer
model.add(tf.keras.Input(shape=(X_train.shape[1],)))
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(0.2))  # Dropout for regularization

# Add second hidden layer
model.add(Dense(units=64, activation='relu'))
model.add(Dropout(0.2))

# Add output layer
model.add(Dense(units=1, activation='sigmoid'))

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Display model summary
model.summary()
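
The summary table is not reproduced in this export, so the size of the network can be checked by hand instead. The sketch below counts weights and biases for each `Dense` layer, assuming the 11 input features reported by `X.shape`; the helper `dense_params` is a hypothetical name introduced only for this illustration.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

n_features = 11             # X.shape[1] in this notebook
layer_units = [128, 64, 1]  # the three Dense layers; Dropout adds no parameters

total = 0
n_in = n_features
for units in layer_units:
    params = dense_params(n_in, units)
    print(f"Dense({units}): {params} parameters")
    total += params
    n_in = units  # the next layer takes this layer's outputs as inputs

print(f"Total trainable parameters: {total}")  # 9857
```

The layers contribute 1,536, 8,256, and 65 parameters respectively, so the model is small relative to the 8,000 training rows.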



In [26]:
from tensorflow.keras.callbacks import EarlyStopping

# Define early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Train the model
history = model.fit(
    X_train, y_train,
    validation_split=0.2,
    epochs=100,
    batch_size=32,
    callbacks=[early_stopping],
    verbose=1
)

Epoch 1/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.8013 - loss: 0.4794 - val_accuracy: 0.8306 - val_loss: 0.4029
Epoch 2/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8221 - loss: 0.4321 - val_accuracy: 0.8475 - val_loss: 0.3732
Epoch 3/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8548 - loss: 0.3708 - val_accuracy: 0.8537 - val_loss: 0.3550
Epoch 4/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8508 - loss: 0.3660 - val_accuracy: 0.8569 - val_loss: 0.3509
Epoch 5/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8521 - loss: 0.3646 - val_accuracy: 0.8537 - val_loss: 0.3482
Epoch 6/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8599 - loss: 0.3457 - val_accuracy: 0.8544 - val_loss: 0.3458
Epoch 7/100
...

In [27]:
from sklearn.metrics import classification_report, confusion_matrix

# Predict on test data
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5)  # Convert probabilities to binary predictions

# Evaluate the model
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))

print("\nClassification Report:")
print(classification_report(y_test, y_pred))

63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Confusion Matrix:
[[1520   87]
 [ 191  202]]

Classification Report:
              precision    recall  f1-score   support

           0       0.89      0.95      0.92      1607
           1       0.70      0.51      0.59       393

    accuracy                           0.86      2000
   macro avg       0.79      0.73      0.75      2000
weighted avg       0.85      0.86      0.85      2000
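
The report above depends on the fixed 0.5 cut-off applied to the predicted probabilities. As a hedged sketch of a threshold-independent check, the block below computes ROC-AUC and shows how precision and recall on the churn class move as the threshold is lowered. The arrays `y_true_demo` and `y_prob_demo` are synthetic stand-ins (an assumption); in the notebook they would be `y_test` and the raw output of `model.predict(X_test)` before thresholding.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, roc_auc_score

# Synthetic stand-ins (assumption): labels with roughly 20% churners and
# scores that are higher on average for churners, with overlap between classes
rng = np.random.default_rng(42)
y_true_demo = rng.binomial(1, 0.2, size=2000)
y_prob_demo = np.clip(rng.normal(0.25 + 0.35 * y_true_demo, 0.2), 0.0, 1.0)

# ROC-AUC is computed from the scores themselves, so no threshold is involved
print("ROC-AUC:", round(roc_auc_score(y_true_demo, y_prob_demo), 3))

# Lowering the threshold never reduces recall and typically costs precision
for threshold in (0.5, 0.4, 0.3):
    y_hat = (y_prob_demo > threshold).astype(int)
    p = precision_score(y_true_demo, y_hat, zero_division=0)
    r = recall_score(y_true_demo, y_hat)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

If missing a churner costs more than a false alarm, a threshold below 0.5 may be the better operating point; the right value depends on those costs.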



#### Conclusion
The ANN reaches an overall accuracy of **86%** on the 2,000-customer test set. For context, always predicting class 0 would already score about 80%, because only 393 of the 2,000 test customers churned. The model predicts customers who stay (class 0) well, with a precision of **0.89** and a recall of **0.95**. It is weaker on customers who leave (class 1), with a precision of **0.70** and a recall of **0.51**: it misses 191 of the 393 churners. Low minority-class recall of this kind is common on imbalanced datasets.
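
The figures quoted above follow directly from the confusion matrix printed in the evaluation cell. The short sketch below recomputes them by hand and adds the accuracy of a trivial majority-class predictor for comparison; the variable names are chosen for this illustration only.

```python
import numpy as np

# Confusion matrix from the evaluation cell: rows are true classes, columns are predictions
cm = np.array([[1520, 87],
               [191, 202]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision_churn = tp / (tp + fp)          # of predicted churners, how many really churned
recall_churn = tp / (tp + fn)             # of actual churners, how many were caught
majority_baseline = (tn + fp) / cm.sum()  # accuracy of always predicting "stays"

print(f"accuracy          = {accuracy:.4f}")           # 0.8610
print(f"precision (churn) = {precision_churn:.4f}")    # 0.6990
print(f"recall (churn)    = {recall_churn:.4f}")       # 0.5140
print(f"majority baseline = {majority_baseline:.4f}")  # 0.8035
```

The gap between 0.861 and the 0.8035 baseline is the part of the accuracy that the model actually earns.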

To improve the model:
1. Address class imbalance using techniques like oversampling (SMOTE) or class weighting.
2. Experiment with different architectures, hyperparameters, or regularization techniques.
3. Use additional evaluation metrics like ROC-AUC to better assess model performance.
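
As one possible starting point for the class-weighting suggestion, the sketch below derives balanced class weights with scikit-learn. The label array `y_demo` is a stand-in (an assumption) that reuses the class counts reported for the test set; in the notebook the weights would be computed from `y_train` instead.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Stand-in labels (assumption): 1607 customers who stayed and 393 who churned
y_demo = np.array([0] * 1607 + [1] * 393)

classes = np.unique(y_demo)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_demo)

# Keras expects a plain dict that maps each class index to its weight
class_weight = {int(c): float(w) for c, w in zip(classes, weights)}
print(class_weight)

# In the notebook, this dictionary could then be passed to training, for example:
# model.fit(X_train, y_train, class_weight=class_weight, ...)
```

The "balanced" setting weights each class by `n_samples / (n_classes * class_count)`, so errors on churners count roughly four times as much as errors on customers who stay.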

Overall, the model provides a good baseline for customer churn prediction but requires further optimization for better performance on minority class predictions.

------------