# Customer Churn Prediction Model - Experiments

This notebook contains the complete workflow for building a deep learning model to predict customer churn using an Artificial Neural Network (ANN). 

## Project Overview
- **Objective**: Predict whether a bank customer will churn (leave the bank)
- **Model**: Deep Neural Network with TensorFlow/Keras
- **Features**: Customer demographics, account information, and banking behavior
- **Target**: Binary classification (Churn: 1, Stay: 0)

## Workflow
1. Data Loading and Exploration
2. Data Preprocessing and Feature Engineering
3. Model Architecture Design
4. Model Training with Callbacks
5. Model Evaluation and Saving

In [16]:
# =============================================================================
# 1. IMPORT NECESSARY LIBRARIES
# =============================================================================

# Data manipulation and analysis
import pandas as pd
import numpy as np

# Machine learning utilities
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder

# Model persistence
import pickle

print("✅ All libraries imported successfully!")

✅ All libraries imported successfully!


In [17]:
# =============================================================================
# 2. DATA LOADING AND INITIAL EXPLORATION
# =============================================================================

# Load the customer churn dataset
data = pd.read_csv('Churn_Modelling.csv')

print(f"Dataset shape: {data.shape}")
print("\n📊 First 5 rows of the dataset:")
data.head()

Dataset shape: (10000, 14)

📊 First 5 rows of the dataset:


Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [18]:
# Display all column names to understand the dataset structure
print("📋 Dataset Columns:")
print(data.columns.tolist())

📋 Dataset Columns:
['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited']


In [19]:
# Get comprehensive information about the dataset
print("📊 Dataset Information:")
print("=" * 50)
data.info()

print("\n" + "=" * 50)
print("📈 Basic Statistics:")
print(data.describe())

print("\n" + "=" * 50)
print("🎯 Target Variable Distribution:")
print(data['Exited'].value_counts())
print(f"Churn Rate: {data['Exited'].mean():.2%}")

📊 Dataset Information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  object 
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB

📈 Basic Statistics:
         RowNumber    CustomerId   CreditScore    

In [20]:
# =============================================================================
# 3. DATA PREPROCESSING
# =============================================================================

# Remove unnecessary columns that don't contribute to prediction
# RowNumber: Just an index, not a feature
# CustomerId: Unique identifier, not predictive
# Surname: Customer name, not relevant for churn prediction
columns_to_drop = ['RowNumber', 'CustomerId', 'Surname']
data.drop(columns_to_drop, axis=1, inplace=True)

print(f"✅ Dropped columns: {columns_to_drop}")
print(f"Remaining columns: {data.columns.tolist()}")

✅ Dropped columns: ['RowNumber', 'CustomerId', 'Surname']
Remaining columns: ['CreditScore', 'Geography', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited']


In [21]:
# Display the cleaned dataset structure
print("🧹 Cleaned Dataset (first 5 rows):")
data.head()

🧹 Cleaned Dataset (first 5 rows):


Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [22]:
# =============================================================================
# 4. CATEGORICAL VARIABLE ENCODING
# =============================================================================

# 4.1 Label Encoding for Gender (Binary categorical variable)
# Gender has only 2 categories (Male/Female), so LabelEncoder is appropriate
label_encoder_gender = LabelEncoder()
data['Gender'] = label_encoder_gender.fit_transform(data['Gender'])

print("✅ Gender encoding completed:")
print(f"Original gender classes: {label_encoder_gender.classes_}")
print(f"Encoded values: Male=1, Female=0")
print(f"Unique values in Gender column: {sorted(data['Gender'].unique())}")

✅ Gender encoding completed:
Original gender classes: ['Female' 'Male']
Encoded values: Male=1, Female=0
Unique values in Gender column: [0, 1]
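
The mapping printed above is typed by hand. `LabelEncoder` assigns codes in sorted label order, so the same mapping can be read from the fitted encoder instead. A minimal sketch on a toy list (the values are illustrative, not the notebook's data):

```python
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for data['Gender']; illustrative values only.
genders = ["Male", "Female", "Female", "Male"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(genders)

# classes_ is sorted, so each label's position is its integer code.
mapping = {label: code for code, label in enumerate(encoder.classes_)}
print(f"Encoded values: {mapping}")

# inverse_transform recovers the original labels from the codes.
decoded = encoder.inverse_transform(encoded)
print(list(decoded))
```

Printing the derived mapping keeps the message correct if the category labels ever change.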


In [23]:
# Display sample of data with encoded Gender column
print("📋 Sample data after Gender encoding:")
data.head(10)

📋 Sample data after Gender encoding:


Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,France,0,42,2,0.0,1,1,1,101348.88,1
1,608,Spain,0,41,1,83807.86,1,0,1,112542.58,0
2,502,France,0,42,8,159660.8,3,1,0,113931.57,1
3,699,France,0,39,1,0.0,2,0,0,93826.63,0
4,850,Spain,0,43,2,125510.82,1,1,1,79084.1,0
5,645,Spain,1,44,8,113755.78,2,1,0,149756.71,1
6,822,France,1,50,7,0.0,2,1,1,10062.8,0
7,376,Germany,0,29,4,115046.74,4,1,0,119346.88,1
8,501,France,1,44,4,142051.07,2,0,1,74940.5,0
9,684,France,1,27,2,134603.88,1,1,1,71725.73,0


In [24]:
# Check unique values in Geography column before encoding
print("🌍 Unique Geography values:")
geography_values = data['Geography'].unique()
print(geography_values)
print(f"Number of unique geography values: {len(geography_values)}")

🌍 Unique Geography values:
['France' 'Spain' 'Germany']
Number of unique geography values: 3


In [25]:
# 4.2 One-Hot Encoding for Geography (Multi-categorical variable)
# Geography has 3 categories, so One-Hot Encoding avoids implying an order between them
# This creates one binary column per geography category
one_hot_encoder_geography = OneHotEncoder()  # keeps all 3 categories (no drop='first')
geo_encoded = one_hot_encoder_geography.fit_transform(data[['Geography']])

print("✅ Geography One-Hot Encoding completed:")
print(f"Original geography categories: {data['Geography'].unique()}")
print(f"Encoded array shape: {geo_encoded.shape}")
print(f"Feature names: {one_hot_encoder_geography.get_feature_names_out()}")

✅ Geography One-Hot Encoding completed:
Original geography categories: ['France' 'Spain' 'Germany']
Encoded array shape: (10000, 3)
Feature names: ['Geography_France' 'Geography_Germany' 'Geography_Spain']


In [26]:
# Check current data structure before applying geography encoding
print("📋 Data structure before geography encoding:")
data.head()

📋 Data structure before geography encoding:


Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,France,0,42,2,0.0,1,1,1,101348.88,1
1,608,Spain,0,41,1,83807.86,1,0,1,112542.58,0
2,502,France,0,42,8,159660.8,3,1,0,113931.57,1
3,699,France,0,39,1,0.0,2,0,0,93826.63,0
4,850,Spain,0,43,2,125510.82,1,1,1,79084.1,0


In [27]:
# Display the feature names created by One-Hot Encoder
print("🏷️ Generated feature names from One-Hot Encoding:")
feature_names = one_hot_encoder_geography.get_feature_names_out()
for i, name in enumerate(feature_names):
    print(f"{i+1}. {name}")

🏷️ Generated feature names from One-Hot Encoding:
1. Geography_France
2. Geography_Germany
3. Geography_Spain


In [28]:
# Convert the encoded array to a DataFrame for easier handling
geo_encoded_df = pd.DataFrame(
    geo_encoded.toarray(), 
    columns=one_hot_encoder_geography.get_feature_names_out(),
    index=data.index  # Maintain the same index as original data
)

print("✅ Geography encoding DataFrame created")
print(f"Shape: {geo_encoded_df.shape}")

✅ Geography encoding DataFrame created
Shape: (10000, 3)


In [29]:
# Preview the encoded geography DataFrame
print("🌍 Encoded Geography DataFrame (first 10 rows):")
geo_encoded_df.head(10)

🌍 Encoded Geography DataFrame (first 10 rows):


Unnamed: 0,Geography_France,Geography_Germany,Geography_Spain
0,1.0,0.0,0.0
1,0.0,0.0,1.0
2,1.0,0.0,0.0
3,1.0,0.0,0.0
4,0.0,0.0,1.0
5,0.0,0.0,1.0
6,1.0,0.0,0.0
7,0.0,1.0,0.0
8,1.0,0.0,0.0
9,1.0,0.0,0.0


In [30]:
# =============================================================================
# 5. FEATURE ENGINEERING - COMBINE ENCODED FEATURES
# =============================================================================

# Remove original Geography column and add the encoded columns
print("🔄 Combining original data with encoded geography features...")

# Remove the original categorical Geography column
data.drop('Geography', axis=1, inplace=True)

# Concatenate the encoded geography features with the main dataset
data = pd.concat([data, geo_encoded_df], axis=1)

print("✅ Feature engineering completed!")
print(f"Final dataset shape: {data.shape}")
print(f"Final columns: {data.columns.tolist()}")

🔄 Combining original data with encoded geography features...
✅ Feature engineering completed!
Final dataset shape: (10000, 13)
Final columns: ['CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited', 'Geography_France', 'Geography_Germany', 'Geography_Spain']


In [31]:
# Display the final preprocessed dataset
print("🎯 Final Preprocessed Dataset:")
print("=" * 50)
data.head()

🎯 Final Preprocessed Dataset:


Unnamed: 0,CreditScore,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited,Geography_France,Geography_Germany,Geography_Spain
0,619,0,42,2,0.0,1,1,1,101348.88,1,1.0,0.0,0.0
1,608,0,41,1,83807.86,1,0,1,112542.58,0,0.0,0.0,1.0
2,502,0,42,8,159660.8,3,1,0,113931.57,1,1.0,0.0,0.0
3,699,0,39,1,0.0,2,0,0,93826.63,0,1.0,0.0,0.0
4,850,0,43,2,125510.82,1,1,1,79084.1,0,0.0,0.0,1.0


In [32]:
# =============================================================================
# 6. SAVE PREPROCESSORS FOR FUTURE USE
# =============================================================================

# Save the fitted encoders for use in prediction
# (the scaler is saved in a later cell, once it has been fitted)
# These will be needed to preprocess new data for predictions

print("💾 Saving preprocessors...")

# Save Label Encoder for Gender
with open('label_encoder_gender.pkl', 'wb') as file:
    pickle.dump(label_encoder_gender, file)
print("✅ Gender Label Encoder saved")

# Save One-Hot Encoder for Geography
with open('one_hot_encoder_geography.pkl', 'wb') as file:
    pickle.dump(one_hot_encoder_geography, file)
print("✅ Geography One-Hot Encoder saved")

print("💾 All preprocessors saved successfully!")

💾 Saving preprocessors...
✅ Gender Label Encoder saved
✅ Geography One-Hot Encoder saved
💾 All preprocessors saved successfully!


In [33]:
# =============================================================================
# 7. PREPARE FEATURES AND TARGET VARIABLES
# =============================================================================

# Separate features (X) and target variable (y)
# Features: All columns except 'Exited' (the target we want to predict)
# Target: 'Exited' column (1 = customer churned, 0 = customer stayed)

X = data.drop('Exited', axis=1)  # Features
y = data['Exited']               # Target

print("🎯 Feature-Target Separation:")
print(f"Features (X) shape: {X.shape}")
print(f"Target (y) shape: {y.shape}")
print(f"Feature columns: {X.columns.tolist()}")
print(f"Target distribution: {y.value_counts().to_dict()}")

🎯 Feature-Target Separation:
Features (X) shape: (10000, 12)
Target (y) shape: (10000,)
Feature columns: ['CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Geography_France', 'Geography_Germany', 'Geography_Spain']
Target distribution: {0: 7963, 1: 2037}
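
Because the classes are imbalanced, it helps to know what a trivial model would score before reading any accuracy figure. A small sketch using only the counts printed above:

```python
# Class counts taken from the target distribution printed above.
class_counts = {0: 7963, 1: 2037}

total = sum(class_counts.values())
majority_class = max(class_counts, key=class_counts.get)

# Accuracy of a trivial model that always predicts the majority class.
baseline_accuracy = class_counts[majority_class] / total
churn_rate = class_counts[1] / total

print(f"Majority class: {majority_class}")
print(f"Baseline accuracy: {baseline_accuracy:.2%}")
print(f"Churn rate: {churn_rate:.2%}")
```

Any accuracy reported later is best read against this baseline rather than against 50%.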


In [34]:
# Quick preview of features and target
print("📋 Features (X) - First 5 rows:")
print(X.head())
print("\n🎯 Target (y) - First 10 values:")
print(y.head(10).tolist())

📋 Features (X) - First 5 rows:
   CreditScore  Gender  Age  Tenure    Balance  NumOfProducts  HasCrCard  \
0          619       0   42       2       0.00              1          1   
1          608       0   41       1   83807.86              1          0   
2          502       0   42       8  159660.80              3          1   
3          699       0   39       1       0.00              2          0   
4          850       0   43       2  125510.82              1          1   

   IsActiveMember  EstimatedSalary  Geography_France  Geography_Germany  \
0               1        101348.88               1.0                0.0   
1               1        112542.58               0.0                0.0   
2               0        113931.57               1.0                0.0   
3               0         93826.63               1.0                0.0   
4               1         79084.10               0.0                0.0   

   Geography_Spain  
0              0.0  
1              1.0 

In [35]:
# =============================================================================
# 8. TRAIN-TEST SPLIT AND FEATURE SCALING
# =============================================================================

# Split the dataset into training and testing sets
# 80% for training, 20% for testing
# random_state=42 ensures reproducible results
X_train, X_test, y_train, y_test = train_test_split(
    X, y, 
    test_size=0.2, 
    random_state=42,
    stratify=y  # Maintain the same proportion of target classes in both sets
)

print("📊 Train-Test Split:")
print(f"Training set: X_train {X_train.shape}, y_train {y_train.shape}")
print(f"Testing set: X_test {X_test.shape}, y_test {y_test.shape}")
print(f"Train churn rate: {y_train.mean():.2%}")
print(f"Test churn rate: {y_test.mean():.2%}")

# Feature Scaling using StandardScaler
# Neural networks perform better with normalized/standardized features
print("\n⚖️ Feature Scaling:")
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # Fit and transform training data
X_test = scaler.transform(X_test)        # Only transform test data (no fitting)

print("✅ Features scaled using StandardScaler")

📊 Train-Test Split:
Training set: X_train (8000, 12), y_train (8000,)
Testing set: X_test (2000, 12), y_test (2000,)
Train churn rate: 20.38%
Test churn rate: 20.35%

⚖️ Feature Scaling:
✅ Features scaled using StandardScaler
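
As a quick consistency check on the stratified split, the printed rates can be reproduced from counts. The churner counts below (407 in test, 1,630 in train) are inferred from the printed percentages rather than read from the data, so treat them as an assumption:

```python
# Counts from the notebook output: 10,000 rows, 2,037 churners, 80/20 split.
n_total, n_churn = 10_000, 2_037
n_test = n_total // 5
n_train = n_total - n_test

# Assumed churner count in the test set: 407 / 2,000 = 20.35%,
# which matches the printed test churn rate.
test_churn = 407
train_churn = n_churn - test_churn

overall_rate = n_churn / n_total
train_rate = train_churn / n_train
test_rate = test_churn / n_test

# With stratification, both subsets stay very close to the overall rate.
max_gap = max(abs(train_rate - overall_rate), abs(test_rate - overall_rate))
print(f"overall={overall_rate:.4f} train={train_rate:.4f} test={test_rate:.4f}")
print(f"largest gap from overall rate: {max_gap:.5f}")
```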


In [36]:
# Verify the final shapes after preprocessing
print("✅ Final Data Shapes After Preprocessing:")
print("=" * 40)
print(f"X_train shape: {X_train.shape}")
print(f"X_test shape:  {X_test.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"y_test shape:  {y_test.shape}")
print(f"Number of features: {X_train.shape[1]}")

✅ Final Data Shapes After Preprocessing:
X_train shape: (8000, 12)
X_test shape:  (2000, 12)
y_train shape: (8000,)
y_test shape:  (2000,)
Number of features: 12


In [37]:
# Preview the scaled training data
print("📊 Scaled Training Data (X_train) - First 5 samples:")
print("Features are now standardized (mean≈0, std≈1)")
print(X_train[:5])

📊 Scaled Training Data (X_train) - First 5 samples:
Features are now standardized (mean≈0, std≈1)
[[ 1.058568    0.90750738  1.71508648  0.68472287 -1.22605881 -0.91025649
   0.64104192 -1.030206    1.04208392  1.00175153 -0.57831252 -0.57773517]
 [ 0.91362605  0.90750738 -0.65993547 -0.6962018   0.41328769 -0.91025649
   0.64104192 -1.030206   -0.62355635 -0.99825153  1.72916886 -0.57773517]
 [ 1.07927399 -1.10191942 -0.18493108 -1.73189531  0.60168748  0.80883036
   0.64104192  0.97067965  0.30812779 -0.99825153  1.72916886 -0.57773517]
 [-0.92920731  0.90750738 -0.18493108 -0.00573947 -1.22605881  0.80883036
   0.64104192 -1.030206   -0.29019914  1.00175153 -0.57831252 -0.57773517]
 [ 0.42703522  0.90750738  0.95507945  0.3394917   0.54831832  0.80883036
  -1.55996038  0.97067965  0.13504224 -0.99825153  1.72916886 -0.57773517]]
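
What `fit_transform` on the training set and `transform` on the test set amount to can be sketched in plain NumPy. The data here is synthetic (two made-up columns), and the sketch assumes the population standard deviation (`ddof=0`), which is what `StandardScaler` uses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two raw features; the numbers are made up.
X_train_raw = rng.normal(loc=[650.0, 76000.0], scale=[95.0, 62000.0], size=(800, 2))
X_test_raw = rng.normal(loc=[650.0, 76000.0], scale=[95.0, 62000.0], size=(200, 2))

# "fit": learn per-column mean and standard deviation from training rows only.
mean_ = X_train_raw.mean(axis=0)
scale_ = X_train_raw.std(axis=0)  # ddof=0

# "transform": apply the training statistics to both subsets.
X_train_scaled = (X_train_raw - mean_) / scale_
X_test_scaled = (X_test_raw - mean_) / scale_

print("train mean/std:", X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
print("test  mean/std:", X_test_scaled.mean(axis=0), X_test_scaled.std(axis=0))
```

The training columns come out with mean 0 and standard deviation 1. The test columns are only approximately standardized, which is expected: their statistics never influence the scaler.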


In [38]:
# Save the fitted scaler for future use in predictions
# This is crucial for maintaining consistency in feature scaling
with open('scaler.pkl', 'wb') as file:
    pickle.dump(scaler, file)

print("💾 StandardScaler saved successfully!")
print("⚠️  Important: Use the same scaler for new predictions to maintain consistency")

💾 StandardScaler saved successfully!
⚠️  Important: Use the same scaler for new predictions to maintain consistency


In [39]:
# =============================================================================
# 9. ARTIFICIAL NEURAL NETWORK (ANN) IMPLEMENTATION
# =============================================================================

print("🧠 Starting Deep Learning Model Development...")
print("Model Type: Artificial Neural Network (ANN)")
print("Architecture: Multi-layer Perceptron for Binary Classification")

🧠 Starting Deep Learning Model Development...
Model Type: Artificial Neural Network (ANN)
Architecture: Multi-layer Perceptron for Binary Classification


In [40]:
# Import TensorFlow and Keras components for deep learning
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam
import datetime

# Set random seed for reproducibility
tf.random.set_seed(42)

print("✅ TensorFlow and Keras imported successfully!")
print(f"TensorFlow version: {tf.__version__}")
print(f"GPU Available: {tf.config.list_physical_devices('GPU')}")


✅ TensorFlow and Keras imported successfully!
TensorFlow version: 2.15.0
GPU Available: []


In [41]:
# Check the number of input features for the model architecture
input_features = X_train.shape[1]
print(f"🔢 Number of input features: {input_features}")
print("This will be the input layer size for our neural network")

🔢 Number of input features: 12
This will be the input layer size for our neural network


In [42]:
# =============================================================================
# 10. MODEL ARCHITECTURE DESIGN
# =============================================================================

# Build an optimized ANN model for churn prediction
# Architecture: Input → Hidden Layer 1 → Hidden Layer 2 → Output

model = Sequential([
    # Input Layer + First Hidden Layer
    # 64 neurons with ReLU activation for non-linearity
    Dense(64, activation='relu', input_shape=(input_features,), name='hidden_layer_1'),
    BatchNormalization(),  # Normalize this layer's activations before they feed the next layer
    Dropout(0.3),          # Reduce overfitting by zeroing a random 30% of activations during training
    
    # Second Hidden Layer
    # 32 neurons (decreasing size for hierarchical feature learning)
    Dense(32, activation='relu', name='hidden_layer_2'),
    BatchNormalization(),
    Dropout(0.2),          # Lower dropout rate for deeper layer
    
    # Output Layer
    # 1 neuron with sigmoid activation for binary classification (0-1 probability)
    Dense(1, activation='sigmoid', name='output_layer')
])

print("🏗️ Model Architecture Created!")
print("Layers: Input → Dense(64) → BatchNorm → Dropout(0.3) → Dense(32) → BatchNorm → Dropout(0.2) → Dense(1)")
print("Activation Functions: ReLU (hidden layers), Sigmoid (output layer)")


🏗️ Model Architecture Created!
Layers: Input → Dense(64) → BatchNorm → Dropout(0.3) → Dense(32) → BatchNorm → Dropout(0.2) → Dense(1)
Activation Functions: ReLU (hidden layers), Sigmoid (output layer)
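
To make the data flow through this architecture concrete, here is a NumPy sketch of an inference-time forward pass with the same layer widths (12 → 64 → 32 → 1). The weights are random and untrained, BatchNormalization is left out for brevity, and Dropout is inactive at inference, so this shows shapes and activations only, not the notebook's model:

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Random weights with the same shapes as the three Dense layers
# (illustrative values, not learned weights).
W1, b1 = rng.normal(scale=0.1, size=(12, 64)), np.zeros(64)
W2, b2 = rng.normal(scale=0.1, size=(64, 32)), np.zeros(32)
W3, b3 = rng.normal(scale=0.1, size=(32, 1)), np.zeros(1)

def forward(X):
    """Dense(64, relu) -> Dense(32, relu) -> Dense(1, sigmoid)."""
    h1 = relu(X @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    return sigmoid(h2 @ W3 + b3)

X_batch = rng.normal(size=(5, 12))   # five synthetic, already-scaled customers
probs = forward(X_batch)             # one churn probability per customer
labels = (probs >= 0.5).astype(int)  # 0.5 threshold is an assumed default

print(probs.ravel())
print(labels.ravel())
```

The sigmoid output is a probability between 0 and 1. Turning it into a churn/stay label requires a threshold, and 0.5 is only one possible choice.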


In [43]:
# Display detailed model architecture
print("📋 Detailed Model Summary:")
print("=" * 60)
model.summary()

# Calculate and display total parameters
total_params = model.count_params()
print(f"\n📊 Total Parameters: {total_params:,}")
print("💡 More parameters = higher capacity but risk of overfitting")

📋 Detailed Model Summary:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 hidden_layer_1 (Dense)      (None, 64)                832       
                                                                 
 batch_normalization (Batch  (None, 64)                256       
 Normalization)                                                  
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 hidden_layer_2 (Dense)      (None, 32)                2080      
                                                                 
 batch_normalization_1 (Bat  (None, 32)                128       
 chNormalization)                                                
                                                                 
 dropout_1 (Dropout)         (
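
The parameter counts in the summary can be checked by hand. The sketch below assumes the usual Keras counting: a Dense layer has inputs × units weights plus one bias per unit, and BatchNormalization keeps four values per feature (gamma and beta are trainable; the moving mean and variance are not).

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """Return (trainable, non_trainable) parameter counts."""
    return 2 * n_features, 2 * n_features

bn1_train, bn1_frozen = batchnorm_params(64)
bn2_train, bn2_frozen = batchnorm_params(32)

# (layer name, trainable, non-trainable); Dropout layers have no parameters.
layers = [
    ("hidden_layer_1", dense_params(12, 64), 0),
    ("batch_normalization", bn1_train, bn1_frozen),
    ("hidden_layer_2", dense_params(64, 32), 0),
    ("batch_normalization_1", bn2_train, bn2_frozen),
    ("output_layer", dense_params(32, 1), 0),
]

for name, trainable, frozen in layers:
    print(f"{name:<24}{trainable + frozen:>6}")

trainable_total = sum(t for _, t, _ in layers)
frozen_total = sum(f for _, _, f in layers)
print(f"total={trainable_total + frozen_total} "
      f"trainable={trainable_total} non-trainable={frozen_total}")
```

The per-layer figures for the first four parameterized layers agree with the rows visible in the summary above (832, 256, 2080, 128).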

In [None]:
# =============================================================================
# 11. MODEL COMPILATION
# =============================================================================

# Configure the model for training with Keras defaults (Adam optimizer, binary cross-entropy loss)
# Use string names for better compatibility across TensorFlow versions
model.compile(
    optimizer='adam',                    # Use string name instead of object
    loss='binary_crossentropy',         # Use string name instead of object
    metrics=['accuracy']                 # Track accuracy metric
)

print("✅ Model Compiled Successfully!")
print("Optimizer: Adam (default learning_rate=0.001)")
print("Loss Function: Binary Crossentropy") 
print("Metrics: Accuracy")
print("💡 Using string names for better TensorFlow version compatibility")

✅ Model Compiled Successfully!
Optimizer: Adam (default learning_rate=0.001)
Loss Function: Binary Crossentropy
Metrics: Accuracy
💡 Using string names for better TensorFlow version compatibility
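
For intuition about the loss chosen here, binary cross-entropy can be computed by hand. The sketch below is a plain-Python version. The clipping constant `eps` guards against `log(0)`; Keras clips in a similar way, but the exact value used here is an assumption.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 or 1
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 0]
mostly_right = [0.9, 0.1, 0.8, 0.2]  # confident and correct
mostly_wrong = [0.1, 0.9, 0.2, 0.8]  # confident and incorrect

loss_good = binary_crossentropy(y_true, mostly_right)
loss_bad = binary_crossentropy(y_true, mostly_wrong)
loss_coin_flip = binary_crossentropy([1], [0.5])

print(f"good={loss_good:.4f} bad={loss_bad:.4f} coin flip={loss_coin_flip:.4f}")
```

Confident wrong predictions are penalized far more heavily than confident correct ones are rewarded, and a constant prediction of 0.5 yields a loss of ln 2.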


In [45]:
# =============================================================================
# 12. SETUP TRAINING CALLBACKS
# =============================================================================

# Create a timestamped directory so each run writes its own TensorBoard logs
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorflow_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

print(f"📊 TensorBoard log directory: {log_dir}")
print("Use this for monitoring training progress and visualizing metrics")

📊 TensorBoard log directory: logs/fit/20250801-024220
Use this for monitoring training progress and visualizing metrics


In [46]:
# Set up Early Stopping: stop when val_loss has not improved for 10 epochs
# and restore the weights from the best epoch
early_stopping = EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True
)

# Add Learning Rate Reduction for better convergence
lr_reducer = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,                   # Reduce LR by half
    patience=5,                   # Wait 5 epochs before reducing
    min_lr=0.0001,               # Minimum learning rate
    verbose=1
)

print("✅ Training callbacks configured:")
print("1. Early Stopping: Prevents overfitting (patience=10)")
print("2. Learning Rate Reduction: Improves convergence (patience=5)")
print("3. TensorBoard: Logs training metrics and visualizations")

✅ Training callbacks configured:
1. Early Stopping: Prevents overfitting (patience=10)
2. Learning Rate Reduction: Improves convergence (patience=5)
3. TensorBoard: Logs training metrics and visualizations
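
The patience rule behind early stopping can be sketched in a few lines of plain Python. This simplified version ignores options such as `min_delta` and works on a made-up list of validation losses, so it illustrates the idea rather than reimplementing the Keras callback:

```python
def early_stopping_epoch(val_losses, patience):
    """Return (last_epoch_run, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs fail to improve
    on the best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# Made-up curve: improves for 13 epochs, then plateaus at a worse value.
val_losses = [1.0 - 0.05 * i for i in range(13)] + [0.45] * 20

stopped_at, best_epoch = early_stopping_epoch(val_losses, patience=10)
print(f"stopped after epoch {stopped_at}, best epoch was {best_epoch}")
```

With `restore_best_weights=True`, the model keeps the weights from the best epoch rather than from the last epoch run.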


In [47]:
# =============================================================================
# 13. MODEL TRAINING
# =============================================================================

print("🚀 Starting Model Training...")
print("This may take a few minutes depending on your hardware.")
print("=" * 60)

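# Train for up to 100 epochs; early stopping ends the run once val_loss stops improving.
# Note: lr_reducer is defined above but is not in this callbacks list, so it has no
# effect on this run. Add it to the list to enable learning-rate reduction.
# Note: the test set is also used as validation data here, so test metrics reported
# later may be optimistic; a separate validation split would avoid this.
history = model.fit(
    X_train, y_train,
    validation_data=(X_test, y_test),
    epochs=100,
    callbacks=[tensorflow_callback, early_stopping]
)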

print("\n✅ Model Training Completed!")
print(f"Total epochs trained: {len(history.history['loss'])}")
print("Check TensorBoard for detailed training metrics visualization")

🚀 Starting Model Training...
This may take a few minutes depending on your hardware.
Epoch 1/100


Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100

✅ Model Training Completed!
Total epochs trained: 23
Check TensorBoard for detailed training metrics visualization


In [None]:
# =============================================================================
# 14. MODEL SAVING AND EVALUATION  
# =============================================================================

# Save the trained model in both formats for maximum compatibility
print("💾 Saving Model in Multiple Formats...")

# Save in newer Keras format (recommended)
model_keras_filename = 'model.keras'
model.save(model_keras_filename)
print(f"✅ Model saved in Keras format: {model_keras_filename}")

# Save in H5 format for backward compatibility
model_h5_filename = 'model.h5'
model.save(model_h5_filename, save_format='h5')
print(f"✅ Model saved in H5 format: {model_h5_filename}")

print("\n💡 Both formats available:")
print("  • model.keras - Recommended for new deployments")
print("  • model.h5 - For backward compatibility")

# Quick evaluation on test set
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)

print("\n📊 Final Model Performance on Test Set:")
print("=" * 45)
print(f"Test Loss:      {test_loss:.4f}")
print(f"Test Accuracy:  {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")

💾 Model Saved Successfully!
Filename: model.h5
The model can now be loaded for predictions in other scripts

📊 Final Model Performance on Test Set:
Test Loss:      0.3332
Test Accuracy:  0.8665 (86.65%)
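
Accuracy alone can hide weak performance on the minority (churn) class, so precision, recall, and F1 are worth computing alongside it. A plain-Python sketch on made-up labels (not the notebook's predictions) shows how they follow from the confusion counts:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up example: 3 churners out of 10, only one of them caught.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy case accuracy is 70% while recall is only one in three, which is the kind of gap to look for when most customers stay.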


In [49]:
# =============================================================================
# 15. TENSORBOARD VISUALIZATION
# =============================================================================

# Load TensorBoard extension for Jupyter notebooks
%load_ext tensorboard

print("📊 TensorBoard extension loaded!")
print("You can now visualize training metrics, model architecture, and more")

📊 TensorBoard extension loaded!
You can now visualize training metrics, model architecture, and more


In [50]:
# Launch TensorBoard to visualize training metrics
print("🚀 Launching TensorBoard...")
print("📈 You can view:")
print("  • Training/Validation Loss and Accuracy curves")
print("  • Model architecture graph")
print("  • Weight and bias histograms")
print("  • Learning rate changes")
print("\n" + "="*50)

%tensorboard --logdir logs/fit

🚀 Launching TensorBoard...
📈 You can view:
  • Training/Validation Loss and Accuracy curves
  • Model architecture graph
  • Weight and bias histograms
  • Learning rate changes



Reusing TensorBoard on port 6006 (pid 13216), started 2:04:06 ago. (Use '!kill 13216' to kill it.)

In [51]:
# =============================================================================
# 16. EXPERIMENT CONCLUSION
# =============================================================================

print("🎉 CUSTOMER CHURN PREDICTION MODEL - EXPERIMENT COMPLETED!")
print("=" * 60)
print("✅ Data preprocessing completed")
print("✅ Neural network model trained and saved")
print("✅ Encoders and scaler saved for future predictions")
print("✅ TensorBoard logs generated for analysis")

print("\n📁 Generated Files:")
print("  • model.keras - Trained neural network (Keras format)")
print("  • model.h5 - Trained neural network (legacy HDF5 format)")
print("  • label_encoder_gender.pkl - Gender encoder")
print("  • one_hot_encoder_geography.pkl - Geography encoder")
print("  • scaler.pkl - Feature scaler")
print("  • logs/fit/ - TensorBoard logs")

print("\n🔄 Next Steps:")
print("  1. Use Prediction.ipynb for making individual predictions")
print("  2. Use app.py to run the Streamlit web application")
print("  3. Analyze TensorBoard visualizations for model insights")
print("  4. Consider hyperparameter tuning for improved performance")

🎉 CUSTOMER CHURN PREDICTION MODEL - EXPERIMENT COMPLETED!
✅ Data preprocessing completed
✅ Neural network model trained and saved
✅ Encoders and scaler saved for future predictions
✅ TensorBoard logs generated for analysis

📁 Generated Files:
  • model.keras - Trained neural network (Keras format)
  • model.h5 - Trained neural network (legacy HDF5 format)
  • label_encoder_gender.pkl - Gender encoder
  • one_hot_encoder_geography.pkl - Geography encoder
  • scaler.pkl - Feature scaler
  • logs/fit/ - TensorBoard logs

🔄 Next Steps:
  1. Use Prediction.ipynb for making individual predictions
  2. Use app.py to run the Streamlit web application
  3. Analyze TensorBoard visualizations for model insights
  4. Consider hyperparameter tuning for improved performance


## 🔧 Troubleshooting Model Compatibility

If you encounter TensorFlow version compatibility issues:

### Option 1: Quick Fix (Re-run this notebook)
1. Run all cells in this notebook from the beginning
2. This will retrain and save the model with your current TensorFlow version

### Option 2: Update Requirements Only
If the model files exist but have compatibility issues:
```bash
# Quote the requirement so the shell does not treat > and < as redirections
pip install "tensorflow>=2.16.0,<2.18.0"
```

### Option 3: Manual Model Conversion
```python
# Load old model and save in new format
import tensorflow as tf

# Load with compile=False
old_model = tf.keras.models.load_model('model.h5', compile=False)

# Recompile and save in new format
old_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
old_model.save('model.keras')  # Save in Keras format
```

### File Compatibility Guide:
- **model.keras** - Native Keras format (recommended for new deployments and current TensorFlow/Keras releases)
- **model.h5** - Legacy HDF5 format (compatible with older versions)
- **Both formats** are generated for maximum compatibility