# Customer Churn Prediction Model - Experiments

This notebook contains the complete workflow for building a deep learning model to predict customer churn using an Artificial Neural Network (ANN). 

## Project Overview
- **Objective**: Predict whether a bank customer will churn (leave the bank)
- **Model**: Deep Neural Network with TensorFlow/Keras
- **Features**: Customer demographics, account information, and banking behavior
- **Target**: Binary classification (Churn: 1, Stay: 0)

## Workflow
1. Data Loading and Exploration
2. Data Preprocessing and Feature Engineering
3. Model Architecture Design
4. Model Training with Callbacks
5. Model Evaluation and Saving

## File Structure
- **Input**: `../Data/Churn_Modelling.csv`
- **Output**: Model and preprocessors saved to `../PickelFiles/`

In [1]:
# =============================================================================
# 1. IMPORT NECESSARY LIBRARIES
# =============================================================================

# Data manipulation and analysis
import pandas as pd
import numpy as np
import os

# Machine learning utilities
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder

# Model persistence
import pickle

# Set random seed for reproducibility
np.random.seed(42)

print("✅ All libraries imported successfully!")
print(f"📊 Pandas version: {pd.__version__}")
print(f"🔢 NumPy version: {np.__version__}")
print("🚀 Ready for data analysis and model training!")

✅ All libraries imported successfully!
📊 Pandas version: 2.3.1
🔢 NumPy version: 1.26.4
🚀 Ready for data analysis and model training!


In [2]:
# =============================================================================
# 2. DATA LOADING AND INITIAL EXPLORATION
# =============================================================================

# Load the customer churn dataset from the correct path
try:
    data = pd.read_csv("../Data/Churn_Modelling.csv")
    print(f"✅ Dataset loaded successfully!")
    print(f"📊 Dataset shape: {data.shape}")
    print(f"📋 Columns: {list(data.columns)}")
except FileNotFoundError:
    print("❌ Error: Dataset not found at '../Data/Churn_Modelling.csv'")
    print("💡 Please ensure the file exists in the Data directory")
    raise

# Display first few rows
print("\n📄 First 5 rows of the dataset:")
data.head()

✅ Dataset loaded successfully!
📊 Dataset shape: (10000, 14)
📋 Columns: ['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited']

📄 First 5 rows of the dataset:


,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [3]:
# =============================================================================
# 3. DATA PREPROCESSING AND CLEANING
# =============================================================================

# Display basic information about the dataset
print("📊 Dataset Information:")
print("=" * 50)
data.info()

print("\n📈 Basic Statistics:")
print(data.describe())

print("\n🎯 Target Variable Distribution:")
print(data['Exited'].value_counts())
print(f"Churn Rate: {data['Exited'].mean():.2%}")

# Drop irrelevant columns that don't contribute to prediction
# - RowNumber: Just an index, not a feature
# - CustomerId: Unique identifier, not predictive
# - Surname: Customer name, not relevant for churn prediction
columns_to_drop = ['RowNumber', 'CustomerId', 'Surname']
data = data.drop(columns_to_drop, axis=1)

print(f"\n✅ Dropped columns: {columns_to_drop}")
print(f"📋 Remaining columns: {list(data.columns)}")
print(f"📊 New dataset shape: {data.shape}")

# Display cleaned dataset
data.head()

📊 Dataset Information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  object 
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB

📈 Basic Statistics:
         RowNumber    CustomerId   CreditScore    

,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [4]:
# =============================================================================
# 4. CATEGORICAL VARIABLE ENCODING - GENDER
# =============================================================================

# 4.1 Label Encoding for Gender (Binary categorical variable)
# Gender has only 2 categories (Male/Female), so LabelEncoder is appropriate
print("🔤 Encoding Gender column...")
print(f"Original Gender values: {data['Gender'].unique()}")

label_encoder_gender = LabelEncoder()
data['Gender'] = label_encoder_gender.fit_transform(data['Gender'])

print("✅ Gender encoding completed:")
print(f"📋 Original gender classes: {label_encoder_gender.classes_}")
print(f"🔢 Encoded values: {dict(zip(label_encoder_gender.classes_, range(len(label_encoder_gender.classes_))))}")
print(f"📊 Unique values in Gender column: {sorted(data['Gender'].unique())}")

# Display sample of data with encoded Gender column
print("\n📄 Sample data after Gender encoding:")
data.head(10)

🔤 Encoding Gender column...
Original Gender values: ['Female' 'Male']
✅ Gender encoding completed:
📋 Original gender classes: ['Female' 'Male']
🔢 Encoded values: {'Female': 0, 'Male': 1}
📊 Unique values in Gender column: [0, 1]

📄 Sample data after Gender encoding:


,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,France,0,42,2,0.0,1,1,1,101348.88,1
1,608,Spain,0,41,1,83807.86,1,0,1,112542.58,0
2,502,France,0,42,8,159660.8,3,1,0,113931.57,1
3,699,France,0,39,1,0.0,2,0,0,93826.63,0
4,850,Spain,0,43,2,125510.82,1,1,1,79084.1,0
5,645,Spain,1,44,8,113755.78,2,1,0,149756.71,1
6,822,France,1,50,7,0.0,2,1,1,10062.8,0
7,376,Germany,0,29,4,115046.74,4,1,0,119346.88,1
8,501,France,1,44,4,142051.07,2,0,1,74940.5,0
9,684,France,1,27,2,134603.88,1,1,1,71725.73,0
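
A minimal sketch of how `LabelEncoder` assigns codes, using a toy list rather than the notebook's data. Classes are sorted alphabetically, which is why `Female` maps to 0 and `Male` to 1, and `inverse_transform` recovers the original labels. The variable names are illustrative only.

```python
from sklearn.preprocessing import LabelEncoder

# Toy labels (not the notebook's data)
labels = ["Male", "Female", "Female", "Male"]

encoder = LabelEncoder()
codes = encoder.fit_transform(labels)

# classes_ is sorted, so index 0 is 'Female' and index 1 is 'Male'
mapping = {label: int(code) for code, label in enumerate(encoder.classes_)}

# inverse_transform maps the integer codes back to the original strings
decoded = encoder.inverse_transform(codes)

print(mapping)
print(list(decoded))
```

Keeping the fitted encoder around (as the notebook does by pickling it) is what makes the same mapping available at prediction time.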


In [5]:
# =============================================================================
# 5. CATEGORICAL VARIABLE ENCODING - GEOGRAPHY
# =============================================================================

# 5.1 Check unique values in Geography column before encoding
print("🌍 Geography Analysis:")
geography_values = data['Geography'].unique()
print(f"Unique Geography values: {geography_values}")
print(f"Number of unique geography values: {len(geography_values)}")
print(f"Geography distribution:\n{data['Geography'].value_counts()}")

# 5.2 One-Hot Encoding for Geography (Multi-categorical variable)
# Geography has three unordered categories, so One-Hot Encoding avoids implying an order among them
# This creates binary columns for each geography category
print("\n🔄 Applying One-Hot Encoding to Geography...")

onehot_encoder_geo = OneHotEncoder()
geo_encoded = onehot_encoder_geo.fit_transform(data[['Geography']])

print("✅ Geography One-Hot Encoding completed:")
print(f"📊 Original geography categories: {data['Geography'].unique()}")
print(f"📏 Encoded array shape: {geo_encoded.shape}")
print(f"🏷️ Feature names: {list(onehot_encoder_geo.get_feature_names_out(['Geography']))}")

# Convert sparse matrix to dense array for easier handling
geo_encoded_array = geo_encoded.toarray()
print(f"📊 First 5 rows of encoded geography:")
print(geo_encoded_array[:5])

🌍 Geography Analysis:
Unique Geography values: ['France' 'Spain' 'Germany']
Number of unique geography values: 3
Geography distribution:
Geography
France     5014
Germany    2509
Spain      2477
Name: count, dtype: int64

🔄 Applying One-Hot Encoding to Geography...
✅ Geography One-Hot Encoding completed:
📊 Original geography categories: ['France' 'Spain' 'Germany']
📏 Encoded array shape: (10000, 3)
🏷️ Feature names: ['Geography_France', 'Geography_Germany', 'Geography_Spain']
📊 First 5 rows of encoded geography:
[[1. 0. 0.]
 [0. 0. 1.]
 [1. 0. 0.]
 [1. 0. 0.]
 [0. 0. 1.]]


In [6]:
# Display the feature names created by One-Hot Encoder
print("🏷️ Generated feature names from One-Hot Encoding:")
feature_names = onehot_encoder_geo.get_feature_names_out(['Geography'])
for i, name in enumerate(feature_names):
    print(f"{i+1}. {name}")

print(f"\n📊 Total new features created: {len(feature_names)}")
print("💡 Each geography location gets its own binary column")

🏷️ Generated feature names from One-Hot Encoding:
1. Geography_France
2. Geography_Germany
3. Geography_Spain

📊 Total new features created: 3
💡 Each geography location gets its own binary column


In [7]:
# Convert the encoded array to a DataFrame for easier handling
print("🔄 Converting encoded geography to DataFrame...")

geo_encoded_df = pd.DataFrame(
    geo_encoded_array, 
    columns=onehot_encoder_geo.get_feature_names_out(['Geography']),
    index=data.index  # Maintain the same index as original data
)

print("✅ Geography encoding DataFrame created")
print(f"📊 Shape: {geo_encoded_df.shape}")
print(f"📋 Columns: {list(geo_encoded_df.columns)}")

print("\n📄 Encoded Geography DataFrame (first 10 rows):")
geo_encoded_df.head(10)

🔄 Converting encoded geography to DataFrame...
✅ Geography encoding DataFrame created
📊 Shape: (10000, 3)
📋 Columns: ['Geography_France', 'Geography_Germany', 'Geography_Spain']

📄 Encoded Geography DataFrame (first 10 rows):


,Geography_France,Geography_Germany,Geography_Spain
0,1.0,0.0,0.0
1,0.0,0.0,1.0
2,1.0,0.0,0.0
3,1.0,0.0,0.0
4,0.0,0.0,1.0
5,0.0,0.0,1.0
6,1.0,0.0,0.0
7,0.0,1.0,0.0
8,1.0,0.0,0.0
9,1.0,0.0,0.0


In [8]:
# =============================================================================
# 6. FEATURE ENGINEERING - COMBINE ENCODED FEATURES
# =============================================================================

print("🔄 Combining original data with encoded geography features...")

# Store original shape for comparison
original_shape = data.shape
print(f"📊 Original data shape: {original_shape}")

# Remove the original categorical Geography column
data = data.drop('Geography', axis=1)
print(f"📊 After dropping Geography: {data.shape}")

# Concatenate the encoded geography features with the main dataset
data = pd.concat([data, geo_encoded_df], axis=1)

print("✅ Feature engineering completed!")
print(f"📊 Final dataset shape: {data.shape}")
print(f"📋 Final columns: {list(data.columns)}")
print(f"➕ Added {data.shape[1] - (original_shape[1] - 1)} new encoded columns")

# Verify no missing values
print(f"\n🔍 Missing values check: {data.isnull().sum().sum()} missing values")

# Display the final preprocessed dataset
print("\n📄 Final Preprocessed Dataset (first 5 rows):")
data.head()

🔄 Combining original data with encoded geography features...
📊 Original data shape: (10000, 11)
📊 After dropping Geography: (10000, 10)
✅ Feature engineering completed!
📊 Final dataset shape: (10000, 13)
📋 Final columns: ['CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited', 'Geography_France', 'Geography_Germany', 'Geography_Spain']
➕ Added 3 new encoded columns

🔍 Missing values check: 0 missing values

📄 Final Preprocessed Dataset (first 5 rows):


,CreditScore,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited,Geography_France,Geography_Germany,Geography_Spain
0,619,0,42,2,0.0,1,1,1,101348.88,1,1.0,0.0,0.0
1,608,0,41,1,83807.86,1,0,1,112542.58,0,0.0,0.0,1.0
2,502,0,42,8,159660.8,3,1,0,113931.57,1,1.0,0.0,0.0
3,699,0,39,1,0.0,2,0,0,93826.63,0,1.0,0.0,0.0
4,850,0,43,2,125510.82,1,1,1,79084.1,0,0.0,0.0,1.0


In [9]:
# =============================================================================
# 7. SAVE PREPROCESSORS FOR FUTURE USE
# =============================================================================

# Create PickelFiles directory if it doesn't exist
os.makedirs('../PickelFiles', exist_ok=True)
print("📁 Created/verified PickelFiles directory")

# Save the fitted encoders for use in prediction (the scaler is saved later, once it has been fitted)
# These will be needed to preprocess new data for predictions
print("💾 Saving preprocessors...")

try:
    # Save Label Encoder for Gender
    with open('../PickelFiles/label_encoder_gender.pkl', 'wb') as file:
        pickle.dump(label_encoder_gender, file)
    print("✅ Gender Label Encoder saved to '../PickelFiles/label_encoder_gender.pkl'")

    # Save One-Hot Encoder for Geography
    with open('../PickelFiles/onehot_encoder_geo.pkl', 'wb') as file:
        pickle.dump(onehot_encoder_geo, file)
    print("✅ Geography One-Hot Encoder saved to '../PickelFiles/onehot_encoder_geo.pkl'")

    print("\n💾 All preprocessors saved successfully!")
    print("🔗 These files will be used by the Streamlit app for consistent preprocessing")
    
except Exception as e:
    print(f"❌ Error saving preprocessors: {e}")
    raise

📁 Created/verified PickelFiles directory
💾 Saving preprocessors...
✅ Gender Label Encoder saved to '../PickelFiles/label_encoder_gender.pkl'
✅ Geography One-Hot Encoder saved to '../PickelFiles/onehot_encoder_geo.pkl'

💾 All preprocessors saved successfully!
🔗 These files will be used by the Streamlit app for consistent preprocessing


In [10]:
# Verify the final processed data structure
print("🔍 Final Data Verification:")
print(f"📊 Shape: {data.shape}")
print(f"📋 Columns: {list(data.columns)}")
print(f"🎯 Target column present: {'Exited' in data.columns}")

# Check data types
print(f"\n📈 Data Types:")
print(data.dtypes)

# Display final dataset
data.head()

🔍 Final Data Verification:
📊 Shape: (10000, 13)
📋 Columns: ['CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited', 'Geography_France', 'Geography_Germany', 'Geography_Spain']
🎯 Target column present: True

📈 Data Types:
CreditScore            int64
Gender                 int32
Age                    int64
Tenure                 int64
Balance              float64
NumOfProducts          int64
HasCrCard              int64
IsActiveMember         int64
EstimatedSalary      float64
Exited                 int64
Geography_France     float64
Geography_Germany    float64
Geography_Spain      float64
dtype: object


,CreditScore,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited,Geography_France,Geography_Germany,Geography_Spain
0,619,0,42,2,0.0,1,1,1,101348.88,1,1.0,0.0,0.0
1,608,0,41,1,83807.86,1,0,1,112542.58,0,0.0,0.0,1.0
2,502,0,42,8,159660.8,3,1,0,113931.57,1,1.0,0.0,0.0
3,699,0,39,1,0.0,2,0,0,93826.63,0,1.0,0.0,0.0
4,850,0,43,2,125510.82,1,1,1,79084.1,0,0.0,0.0,1.0


In [11]:
# =============================================================================
# 8. PREPARE FEATURES AND TARGET VARIABLES
# =============================================================================

# Separate features (X) and target variable (y)
# Features: All columns except 'Exited' (the target we want to predict)
# Target: 'Exited' column (1 = customer churned, 0 = customer stayed)

X = data.drop('Exited', axis=1)  # Features
y = data['Exited']               # Target

print("🎯 Feature-Target Separation:")
print(f"📊 Features (X) shape: {X.shape}")
print(f"🎯 Target (y) shape: {y.shape}")
print(f"📋 Feature columns: {list(X.columns)}")
print(f"🎯 Target distribution: {y.value_counts().to_dict()}")

# =============================================================================
# 9. TRAIN-TEST SPLIT AND FEATURE SCALING
# =============================================================================

# Split the dataset into training and testing sets
# 80% for training, 20% for testing
# random_state=42 ensures reproducible results
# stratify=y maintains the same proportion of target classes in both sets

X_train, X_test, y_train, y_test = train_test_split(
    X, y, 
    test_size=0.2, 
    random_state=42,
    stratify=y
)

print("\n📊 Train-Test Split:")
print(f"📈 Training set: X_train {X_train.shape}, y_train {y_train.shape}")
print(f"📉 Testing set: X_test {X_test.shape}, y_test {y_test.shape}")
print(f"🎯 Train churn rate: {y_train.mean():.2%}")
print(f"🎯 Test churn rate: {y_test.mean():.2%}")

# Feature Scaling using StandardScaler
# Neural networks perform better with normalized/standardized features
print("\n⚖️ Feature Scaling:")
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # Fit and transform training data
X_test = scaler.transform(X_test)        # Only transform test data (no fitting)

print("✅ Features scaled using StandardScaler")
print(f"📊 Scaled training data shape: {X_train.shape}")
print(f"📊 Scaled testing data shape: {X_test.shape}")

🎯 Feature-Target Separation:
📊 Features (X) shape: (10000, 12)
🎯 Target (y) shape: (10000,)
📋 Feature columns: ['CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Geography_France', 'Geography_Germany', 'Geography_Spain']
🎯 Target distribution: {0: 7963, 1: 2037}

📊 Train-Test Split:
📈 Training set: X_train (8000, 12), y_train (8000,)
📉 Testing set: X_test (2000, 12), y_test (2000,)
🎯 Train churn rate: 20.38%
🎯 Test churn rate: 20.35%

⚖️ Feature Scaling:
✅ Features scaled using StandardScaler
📊 Scaled training data shape: (8000, 12)
📊 Scaled testing data shape: (2000, 12)
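
To illustrate what `stratify=y` does in the split above, here is a small sketch on synthetic labels (1,000 samples with a 20% positive rate, chosen only to resemble the churn rate). It checks that both splits keep roughly the same class proportion.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic target: 200 positives, 800 negatives (20% positive rate)
y_demo = np.array([1] * 200 + [0] * 800)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo,
    test_size=0.2,
    random_state=42,
    stratify=y_demo,  # keep the class ratio the same in both splits
)

print(f"Train positive rate: {y_tr.mean():.2%}")
print(f"Test positive rate:  {y_te.mean():.2%}")
```

Without `stratify`, the two rates can drift apart by chance, which matters more as the minority class gets smaller.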


In [12]:
# Preview the scaled training data
print("📊 Scaled Training Data (X_train) - First 5 samples:")
print("Features are now standardized (mean≈0, std≈1)")
print(X_train[:5])

print(f"\n📈 Scaling Verification:")
print(f"Mean of scaled features: {X_train.mean():.6f}")
print(f"Standard deviation of scaled features: {X_train.std():.6f}")
print("💡 Values close to 0 and 1 respectively indicate proper scaling")

📊 Scaled Training Data (X_train) - First 5 samples:
Features are now standardized (mean≈0, std≈1)
[[ 1.058568    0.90750738  1.71508648  0.68472287 -1.22605881 -0.91025649
   0.64104192 -1.030206    1.04208392  1.00175153 -0.57831252 -0.57773517]
 [ 0.91362605  0.90750738 -0.65993547 -0.6962018   0.41328769 -0.91025649
   0.64104192 -1.030206   -0.62355635 -0.99825153  1.72916886 -0.57773517]
 [ 1.07927399 -1.10191942 -0.18493108 -1.73189531  0.60168748  0.80883036
   0.64104192  0.97067965  0.30812779 -0.99825153  1.72916886 -0.57773517]
 [-0.92920731  0.90750738 -0.18493108 -0.00573947 -1.22605881  0.80883036
   0.64104192 -1.030206   -0.29019914  1.00175153 -0.57831252 -0.57773517]
 [ 0.42703522  0.90750738  0.95507945  0.3394917   0.54831832  0.80883036
  -1.55996038  0.97067965  0.13504224 -0.99825153  1.72916886 -0.57773517]]

📈 Scaling Verification:
Mean of scaled features: -0.000000
Standard deviation of scaled features: 1.000000
💡 Values close to 0 and 1 respectively indicate proper scaling
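
The check above averages over all features at once. A per-feature check is stricter, because a global mean can be near zero even when individual columns are not centered. The sketch below uses synthetic data (two made-up columns with means of +5 and -5) to show the difference; it is an illustration, not part of the original workflow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic, unscaled columns whose means cancel out
col_a = rng.normal(loc=5.0, scale=1.0, size=1000)
col_b = rng.normal(loc=-5.0, scale=1.0, size=1000)
X_raw = np.column_stack([col_a, col_b])

global_mean = X_raw.mean()             # close to 0, which is misleading
per_feature_mean = X_raw.mean(axis=0)  # close to [5, -5]

# Standardize each column separately (the same formula StandardScaler applies)
X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

print(f"Global mean of raw data: {global_mean:.3f}")
print(f"Per-feature means of raw data: {per_feature_mean}")
print(f"Per-feature means after scaling: {X_std.mean(axis=0)}")
print(f"Per-feature stds after scaling:  {X_std.std(axis=0)}")
```

For the notebook's data, the equivalent calls would be `X_train.mean(axis=0)` and `X_train.std(axis=0)`.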

In [13]:
# Save the fitted scaler for future use in predictions
# This is crucial for maintaining consistency in feature scaling
try:
    with open('../PickelFiles/scaler.pkl', 'wb') as file:
        pickle.dump(scaler, file)
    
    print("💾 StandardScaler saved successfully to '../PickelFiles/scaler.pkl'!")
    print("⚠️  Important: Use the same scaler for new predictions to maintain consistency")
    
except Exception as e:
    print(f"❌ Error saving scaler: {e}")
    raise

💾 StandardScaler saved successfully to '../PickelFiles/scaler.pkl'!
⚠️  Important: Use the same scaler for new predictions to maintain consistency



In [14]:
# Final verification before model training
print("✅ Data Preparation Summary:")
print("=" * 50)
print(f"✅ Original dataset loaded from '../Data/Churn_Modelling.csv'")
print(f"✅ Categorical variables encoded (Gender: Label, Geography: One-Hot)")
print(f"✅ Features scaled using StandardScaler")
print(f"✅ Data split into train/test sets (80/20)")
print(f"✅ All preprocessors saved to '../PickelFiles/'")

print(f"\n📊 Final Data Shapes:")
print(f"🔢 Number of input features: {X_train.shape[1]}")
print(f"📈 Training samples: {X_train.shape[0]}")
print(f"📉 Testing samples: {X_test.shape[0]}")

print("\n🚀 Ready for Neural Network Model Development!")

✅ Data Preparation Summary:
✅ Original dataset loaded from '../Data/Churn_Modelling.csv'
✅ Categorical variables encoded (Gender: Label, Geography: One-Hot)
✅ Features scaled using StandardScaler
✅ Data split into train/test sets (80/20)
✅ All preprocessors saved to '../PickelFiles/'

📊 Final Data Shapes:
🔢 Number of input features: 12
📈 Training samples: 8000
📉 Testing samples: 2000

🚀 Ready for Neural Network Model Development!


# 🧠 Artificial Neural Network (ANN) Implementation

## Model Architecture Design
We'll build a deep neural network for binary classification with the following characteristics:

### Architecture Overview
- **Input Layer**: Accepts all preprocessed features
- **Hidden Layer 1**: 64 neurons with ReLU activation, followed by Batch Normalization and Dropout (0.3)
- **Hidden Layer 2**: 32 neurons with ReLU activation, followed by Batch Normalization and Dropout (0.2)
- **Output Layer**: 1 neuron with Sigmoid activation (probability output)

### Key Design Decisions
- **ReLU Activation**: Mitigates the vanishing gradient problem in the hidden layers
- **Sigmoid Output**: Outputs a churn probability between 0 and 1
- **Decreasing Layer Size**: Narrows from 64 to 32 units so the second layer learns a more compact representation
- **Binary Classification**: Matches the churn target (1 = churn, 0 = stay)

### Training Strategy
- **Optimizer**: Adam (adaptive learning rate)
- **Loss Function**: Binary Crossentropy
- **Metrics**: Accuracy tracking
- **Callbacks**: Early Stopping, ReduceLROnPlateau, TensorBoard logging
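
To make the sigmoid output and the binary crossentropy loss concrete, here is a small NumPy sketch. It is a simplified illustration of the formulas, not the Keras implementation; the function names, the clipping constant `eps`, and the toy inputs are assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with clipping to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))


# Toy raw scores from the output neuron and their true labels
scores = np.array([2.0, -1.0, 0.0])
y_true = np.array([1.0, 0.0, 1.0])

probs = sigmoid(scores)
loss = binary_crossentropy(y_true, probs)

# A confident correct prediction costs far less than a confident wrong one
loss_confident_right = binary_crossentropy(np.array([1.0]), np.array([0.9]))
loss_confident_wrong = binary_crossentropy(np.array([1.0]), np.array([0.1]))

print(f"Probabilities: {probs}")
print(f"Mean loss: {loss:.4f}")
```

The asymmetry in the last two values is what pushes the network to avoid confident mistakes.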

In [15]:
# =============================================================================
# 10. DEEP LEARNING MODEL IMPLEMENTATION
# =============================================================================

# Import TensorFlow and Keras components for deep learning
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam
import datetime

# Set random seed for reproducibility
tf.random.set_seed(42)

print("🧠 Starting Deep Learning Model Development...")
print(f"🔧 TensorFlow version: {tf.__version__}")
print(f"🖥️ GPU Available: {len(tf.config.list_physical_devices('GPU')) > 0}")
print(f"💾 Physical devices: {[device.name for device in tf.config.list_physical_devices()]}")
print("✅ All TensorFlow components imported successfully!")


🧠 Starting Deep Learning Model Development...
🔧 TensorFlow version: 2.15.0
🖥️ GPU Available: False
💾 Physical devices: ['/physical_device:CPU:0']
✅ All TensorFlow components imported successfully!


In [16]:
# Check the number of input features for the model architecture
input_features = X_train.shape[1]
print(f"🔢 Number of input features: {input_features}")
print(f"📐 Input shape for neural network: {(input_features,)}")
print("💡 This will be the input layer size for our neural network")

print(f"\n📊 Training Data Summary:")
print(f"📈 Training samples: {X_train.shape[0]:,}")
print(f"📉 Testing samples: {X_test.shape[0]:,}")
print(f"🎯 Features per sample: {input_features}")
print(f"📊 First-layer weights: {input_features * 64:,} (plus 64 biases)")

🔢 Number of input features: 12
📐 Input shape for neural network: (12,)
💡 This will be the input layer size for our neural network

📊 Training Data Summary:
📈 Training samples: 8,000
📉 Testing samples: 2,000
🎯 Features per sample: 12
📊 First-layer weights: 768 (plus 64 biases)


In [17]:
# =============================================================================
# 11. MODEL ARCHITECTURE DESIGN
# =============================================================================

# Build an optimized ANN model for churn prediction
# Architecture: Input → Hidden Layer 1 → Hidden Layer 2 → Output

model = Sequential([
    # Input Layer + First Hidden Layer
    # 64 neurons with ReLU activation for non-linearity
    Dense(64, activation='relu', input_shape=(input_features,), name='hidden_layer_1'),
    BatchNormalization(),  # Normalize the previous layer's activations
    Dropout(0.3),          # During training, randomly zero 30% of activations to reduce overfitting
    
    # Second Hidden Layer
    # 32 neurons (decreasing size for hierarchical feature learning)
    Dense(32, activation='relu', name='hidden_layer_2'),
    BatchNormalization(),
    Dropout(0.2),          # Lower dropout rate for deeper layer
    
    # Output Layer
    # 1 neuron with sigmoid activation for binary classification (0-1 probability)
    Dense(1, activation='sigmoid', name='output_layer')
])

print("🏗️ Model Architecture Created!")
print("📋 Architecture: Input → Dense(64) → BatchNorm → Dropout(0.3) → Dense(32) → BatchNorm → Dropout(0.2) → Dense(1)")
print("🔧 Activation Functions: ReLU (hidden layers), Sigmoid (output layer)")
print("⚡ Regularization: Batch Normalization + Dropout")


🏗️ Model Architecture Created!
📋 Architecture: Input → Dense(64) → BatchNorm → Dropout(0.3) → Dense(32) → BatchNorm → Dropout(0.2) → Dense(1)
🔧 Activation Functions: ReLU (hidden layers), Sigmoid (output layer)
⚡ Regularization: Batch Normalization + Dropout


In [18]:
# Display detailed model architecture
print("📋 Detailed Model Summary:")
print("=" * 60)
model.summary()

# Calculate and display total parameters
total_params = model.count_params()
trainable_params = sum([tf.keras.backend.count_params(w) for w in model.trainable_weights])
non_trainable_params = sum([tf.keras.backend.count_params(w) for w in model.non_trainable_weights])

print(f"\n📊 Parameter Analysis:")
print(f"🔢 Total Parameters: {total_params:,}")
print(f"🎯 Trainable Parameters: {trainable_params:,}")
print(f"🔒 Non-trainable Parameters: {non_trainable_params:,}")
print("💡 More parameters = higher capacity but risk of overfitting")

# Estimate model complexity
print(f"\n🧠 Model Complexity:")
print(f"📏 Model depth: {len(model.layers)} layers")
print(f"🔗 Connections: {trainable_params:,} weights and biases to learn")
print(f"💾 Memory footprint: ~{(total_params * 4) / 1024:.1f} KB (32-bit floats)")

📋 Detailed Model Summary:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 hidden_layer_1 (Dense)      (None, 64)                832       
                                                                 
 batch_normalization (Batch  (None, 64)                256       
 Normalization)                                                  
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 hidden_layer_2 (Dense)      (None, 32)                2080      
                                                                 
 batch_normalization_1 (Bat  (None, 32)                128       
 chNormalization)                                                
                                                                 
______________________________
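
The per-layer counts in the summary can be reproduced by hand. The sketch below works through the arithmetic for this architecture (12 inputs, Dense 64, Dense 32, Dense 1, with Batch Normalization after each hidden layer). The helper names are illustrative; the totals are derived from the layer sizes, not copied from the truncated output above.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    """Return (trainable, non_trainable) counts for a batch normalization layer.

    gamma and beta are trainable; the moving mean and variance are not.
    """
    return 2 * n_features, 2 * n_features


hidden_1 = dense_params(12, 64)              # 832
bn_1_train, bn_1_fixed = batchnorm_params(64)  # 128 + 128 = 256
hidden_2 = dense_params(64, 32)              # 2080
bn_2_train, bn_2_fixed = batchnorm_params(32)  # 64 + 64 = 128
output = dense_params(32, 1)                 # 33

trainable = hidden_1 + bn_1_train + hidden_2 + bn_2_train + output
non_trainable = bn_1_fixed + bn_2_fixed
total = trainable + non_trainable

print(f"Trainable: {trainable:,}")
print(f"Non-trainable: {non_trainable:,}")
print(f"Total: {total:,}")
```

Dropout layers contribute no parameters, which is why they show a count of 0 in the summary.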

In [19]:
# =============================================================================
# 12. MODEL COMPILATION CONFIGURATION
# =============================================================================

# Configure optimizer and loss function for training
# Using string names for better TensorFlow version compatibility

print("⚙️ Configuring Model Training Parameters...")

# Learning rate (informational only: the string 'adam' below uses Keras' default of 0.001;
# pass Adam(learning_rate=learning_rate) to compile() to apply a different value)
learning_rate = 0.001
print(f"📈 Learning rate: {learning_rate}")

# Create optimizer and loss function
# Using string names instead of objects for better compatibility
optimizer_name = 'adam'
loss_function = 'binary_crossentropy'

print(f"🔧 Optimizer: {optimizer_name}")
print(f"📉 Loss function: {loss_function}")
print(f"📊 Metrics: accuracy")
print("💡 Using string names for better TensorFlow version compatibility")

⚙️ Configuring Model Training Parameters...
📈 Learning rate: 0.001
🔧 Optimizer: adam
📉 Loss function: binary_crossentropy
📊 Metrics: accuracy
💡 Using string names for better TensorFlow version compatibility


In [20]:
# Compile the model with the optimizer, loss, and metric configured above
model.compile(
    optimizer=optimizer_name,        # Use string name for compatibility
    loss=loss_function,              # Binary crossentropy for binary classification
    metrics=['accuracy']             # Track accuracy during training
)

print("✅ Model Compiled Successfully!")
print(f"🔧 Optimizer: {optimizer_name}")
print(f"📉 Loss Function: {loss_function}")
print(f"📊 Metrics: accuracy")
print("🚀 Model is ready for training!")


✅ Model Compiled Successfully!
🔧 Optimizer: adam
📉 Loss Function: binary_crossentropy
📊 Metrics: accuracy
🚀 Model is ready for training!


In [21]:
# =============================================================================
# 13. SETUP TRAINING CALLBACKS
# =============================================================================

# Create timestamp for unique TensorBoard log directory
log_dir = "../logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")

# Ensure logs directory exists
os.makedirs("../logs/fit", exist_ok=True)

# Setup TensorBoard callback for training visualization
tensorflow_callback = TensorBoard(
    log_dir=log_dir,
    histogram_freq=1,           # Log weight histograms every epoch
    write_graph=True,           # Log the model graph
    write_images=True,          # Log model weights as images
    profile_batch=0             # Disable profiling for performance
)

print("📊 TensorBoard Configuration:")
print(f"📁 Log directory: {log_dir}")
print("📈 Logging: Loss, Accuracy, Weight Histograms, Model Graph")
print("💡 Use 'tensorboard --logdir ../logs/fit' to view training progress")
print("✅ TensorBoard callback configured!")

📊 TensorBoard Configuration:
📁 Log directory: ../logs/fit/20250801-162303
📈 Logging: Loss, Accuracy, Weight Histograms, Model Graph
💡 Use 'tensorboard --logdir ../logs/fit' to view training progress
✅ TensorBoard callback configured!


In [22]:
# Setup Early Stopping to prevent overfitting
early_stopping_callback = EarlyStopping(
    monitor='val_loss',              # Monitor validation loss
    patience=10,                     # Stop after 10 epochs without improvement
    restore_best_weights=True,       # Restore the best-epoch weights when stopping
    verbose=1                        # Print message when stopping
)

# Add Learning Rate Reduction for better convergence
lr_reducer = ReduceLROnPlateau(
    monitor='val_loss',              # Monitor validation loss
    factor=0.5,                      # Reduce LR by half
    patience=5,                      # Reduce after 5 epochs without improvement
    min_lr=0.0001,                   # Minimum learning rate
    verbose=1                        # Print message when reducing
)

print("✅ Training callbacks configured:")
print("🛑 Early Stopping: Prevents overfitting (patience=10)")
print("📉 Learning Rate Reduction: Improves convergence (patience=5)")
print("📊 TensorBoard: Logs training metrics and visualizations")
print("🎯 All callbacks ready for training!")

✅ Training callbacks configured:
🛑 Early Stopping: Prevents overfitting (patience=10)
📉 Learning Rate Reduction: Improves convergence (patience=5)
📊 TensorBoard: Logs training metrics and visualizations
🎯 All callbacks ready for training!
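
To see how the two patience values interact, the sketch below replays a made-up list of validation losses through simplified versions of both rules. It is plain Python rather than Keras code, and it ignores details such as `min_delta` and cooldown, so it only approximates what the real callbacks do. The function name `simulate_callbacks`, the starting learning rate of 0.001, and the loss values are all assumptions for illustration.

```python
def simulate_callbacks(val_losses, lr=0.001, stop_patience=10,
                       lr_patience=5, factor=0.5, min_lr=0.0001):
    """Replay per-epoch validation losses through simplified early-stopping
    and reduce-on-plateau rules (hypothetical helper, not Keras itself).

    Returns (stopped_epoch, best_epoch, lr_history); epochs are 1-based and
    stopped_epoch is None if early stopping never triggers.
    """
    best, best_epoch, stop_wait = float("inf"), 0, 0
    lr_best, lr_wait = float("inf"), 0
    lr_history = []
    for epoch, loss in enumerate(val_losses, start=1):
        lr_history.append(lr)  # learning rate in effect during this epoch

        # Early-stopping bookkeeping: count epochs without a new best loss.
        if loss < best:
            best, best_epoch, stop_wait = loss, epoch, 0
        else:
            stop_wait += 1

        # Reduce-on-plateau bookkeeping: halve the LR after lr_patience
        # stagnant epochs, but never go below min_lr.
        if loss < lr_best:
            lr_best, lr_wait = loss, 0
        else:
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr = max(lr * factor, min_lr)
                lr_wait = 0

        if stop_wait >= stop_patience:
            return epoch, best_epoch, lr_history
    return None, best_epoch, lr_history


# Made-up losses: three improving epochs, then a plateau.
toy_losses = [0.50, 0.45, 0.40] + [0.41] * 12
stopped_epoch, best_epoch, lr_history = simulate_callbacks(toy_losses)
print(stopped_epoch, best_epoch, lr_history[-1])
```

Because the learning-rate patience (5) is half the stopping patience (10), the learning rate is reduced on the plateau before early stopping ends the run, which gives the optimizer a chance to recover first.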


In [23]:
# =============================================================================
# 14. MODEL TRAINING
# =============================================================================

print("🚀 Starting Model Training...")
print("⏱️ This may take a few minutes depending on your hardware.")
print("=" * 60)

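# Train the model. The test split doubles as validation data here, so early
# stopping sees it and the later test metrics are not from fully held-out data.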
history = model.fit(
    X_train, y_train,                           # Training data
    validation_data=(X_test, y_test),           # Validation data
    epochs=100,                                 # Maximum epochs
    batch_size=32,                             # Batch size for training
    callbacks=[                                # Training callbacks
        tensorflow_callback,                   # TensorBoard logging
        early_stopping_callback,               # Early stopping
        lr_reducer                             # Learning rate reduction
    ],
    verbose=1                                  # Show training progress
)

print("\n✅ Model Training Completed!")
print(f"📊 Total epochs trained: {len(history.history['loss'])}")
print("📈 Check TensorBoard for detailed training metrics visualization")
print(f"💾 Training history saved in 'history' variable")

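# Display metrics from the last epoch run (with restore_best_weights=True,
# the weights kept by the model may come from an earlier, better epoch)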
final_train_loss = history.history['loss'][-1]
final_val_loss = history.history['val_loss'][-1]
final_train_acc = history.history['accuracy'][-1]
final_val_acc = history.history['val_accuracy'][-1]

print(f"\n🎯 Final Training Metrics:")
print(f"📉 Training Loss: {final_train_loss:.4f}")
print(f"📈 Validation Loss: {final_val_loss:.4f}")
print(f"🎯 Training Accuracy: {final_train_acc:.4f} ({final_train_acc*100:.2f}%)")
print(f"🎯 Validation Accuracy: {final_val_acc:.4f} ({final_val_acc*100:.2f}%)")

🚀 Starting Model Training...
⏱️ This may take a few minutes depending on your hardware.
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 22: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/10
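
The training cell reports metrics from the last epoch, but when early stopping restores the best weights, the epoch with the lowest `val_loss` is usually the more relevant one. The sketch below shows one way to look that epoch up in a history dictionary. The helper `best_epoch_summary` and the `toy_history` values are made up for illustration; in the notebook, `history.history` would be passed instead.

```python
def best_epoch_summary(history_dict, monitor="val_loss"):
    """Return the metrics recorded at the epoch where `monitor` is lowest.

    Hypothetical helper; expects a dict of equal-length metric lists, which
    is the shape Keras stores in `history.history`.
    """
    values = history_dict[monitor]
    best_idx = min(range(len(values)), key=values.__getitem__)
    summary = {"epoch": best_idx + 1}  # report 1-based epoch numbers
    for name, series in history_dict.items():
        summary[name] = series[best_idx]
    return summary


# Made-up numbers, not results from this notebook.
toy_history = {
    "loss": [0.48, 0.41, 0.37, 0.35, 0.34],
    "val_loss": [0.45, 0.39, 0.36, 0.37, 0.38],
    "accuracy": [0.79, 0.83, 0.85, 0.86, 0.86],
    "val_accuracy": [0.80, 0.84, 0.86, 0.85, 0.85],
}

best = best_epoch_summary(toy_history)
print(best)
```

In this toy run the training loss keeps falling after epoch 3 while the validation loss rises, which is the pattern early stopping is meant to catch.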

In [24]:
# =============================================================================
# 15. MODEL SAVING AND EVALUATION
# =============================================================================

print("💾 Saving Model in Multiple Formats...")

try:
    # Make sure the output directory exists before saving
    os.makedirs('../PickelFiles', exist_ok=True)

    # Save in newer Keras format (recommended)
    model_keras_path = '../PickelFiles/model.keras'
    model.save(model_keras_path)
    print(f"✅ Model saved in Keras format: {model_keras_path}")

    # Save in H5 format for backward compatibility
    # (the format is inferred from the .h5 file extension)
    model_h5_path = '../PickelFiles/model.h5'
    model.save(model_h5_path)
    print(f"✅ Model saved in H5 format: {model_h5_path}")

    print("\n💡 Both formats available:")
    print("  • model.keras - Recommended for new deployments")
    print("  • model.h5 - For backward compatibility")

except Exception as e:
    print(f"❌ Error saving model: {e}")
    raise

# Quick evaluation on test set
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)

print("\n📊 Final Model Performance on Test Set:")
print("=" * 45)
print(f"🔥 Test Loss:      {test_loss:.4f}")
print(f"🎯 Test Accuracy:  {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")

# Per-class precision, recall, and F1 (predicted probabilities thresholded at 0.5)
from sklearn.metrics import classification_report
y_pred = (model.predict(X_test) > 0.5).astype(int)

print(f"\n📈 Detailed Performance Report:")
print(classification_report(y_test, y_pred))

print(f"\n🎉 Model Training and Saving Complete!")
print(f"📁 All files saved to '../PickelFiles/' directory")

💾 Saving Model in Multiple Formats...
✅ Model saved in Keras format: ../PickelFiles/model.keras
✅ Model saved in H5 format: ../PickelFiles/model.h5

💡 Both formats available:
  • model.keras - Recommended for new deployments
  • model.h5 - For backward compatibility


📊 Final Model Performance on Test Set:
🔥 Test Loss:      0.3325
🎯 Test Accuracy:  0.8675 (86.75%)

📈 Detailed Performance Report:
              precision    recall  f1-score   support

           0       0.88      0.96      0.92      1593
           1       0.77      0.50      0.60       407

    accuracy                           0.87      2000
   macro avg       0.83      0.73      0.76      2000
weighted avg       0.86      0.87      0.86      2000


🎉 Model Training and Saving Complete!
📁 All files saved to '../PickelFiles/' directory
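
The report shows a recall of 0.50 for the churn class at the 0.5 decision threshold, meaning about half of the churners in the test set are missed. One common lever is the threshold itself: lowering it usually raises recall at the cost of precision. The sketch below illustrates the trade-off in plain Python. The helper `precision_recall_at`, the labels, and the probabilities are made up for illustration and are not outputs of this model.

```python
def precision_recall_at(y_true, y_prob, threshold):
    """Compute precision and recall for the positive class at a threshold.

    Hypothetical helper; a sample is predicted positive when its
    probability is strictly greater than `threshold`.
    """
    tp = fp = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob > threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels and predicted churn probabilities (4 churners, 6 stayers).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_prob = [0.9, 0.6, 0.4, 0.3, 0.7, 0.35, 0.28, 0.2, 0.1, 0.05]

p50, r50 = precision_recall_at(toy_true, toy_prob, 0.5)
p25, r25 = precision_recall_at(toy_true, toy_prob, 0.25)
print(f"threshold 0.50: precision={p50:.2f}, recall={r50:.2f}")
print(f"threshold 0.25: precision={p25:.2f}, recall={r25:.2f}")
```

With the notebook's model, the probabilities would come from `model.predict(X_test).ravel()`. Any threshold chosen this way should ideally be tuned on a validation split rather than on the test set.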


In [25]:
# =============================================================================
# 16. TENSORBOARD VISUALIZATION
# =============================================================================

# Load TensorBoard extension for Jupyter notebooks
%load_ext tensorboard

print("📊 TensorBoard extension loaded!")
print("🚀 You can now visualize training metrics, model architecture, and more")
print("\n📈 Available visualizations:")
print("  • Training/Validation Loss and Accuracy curves")
print("  • Model architecture graph")
print("  • Weight and bias histograms")
print("  • Learning rate changes")
print("  • Gradient distributions")

📊 TensorBoard extension loaded!
🚀 You can now visualize training metrics, model architecture, and more

📈 Available visualizations:
  • Training/Validation Loss and Accuracy curves
  • Model architecture graph
  • Weight and bias histograms
  • Learning rate changes
  • Gradient distributions


In [26]:
# Launch TensorBoard to visualize training metrics
print("🚀 Launching TensorBoard...")
print("📈 Interactive visualization of model training")
print("\n" + "="*50)

%tensorboard --logdir ../logs/fit

print("\n💡 TensorBoard Tips:")
print("• Scalars: View loss and accuracy curves")
print("• Graphs: Explore model architecture")
print("• Histograms: Analyze weight distributions")
print("• Images: Visualize weight matrices")
print("• Use the timeline slider to see training progress")

🚀 Launching TensorBoard...
📈 Interactive visualization of model training




💡 TensorBoard Tips:
• Scalars: View loss and accuracy curves
• Graphs: Explore model architecture
• Histograms: Analyze weight distributions
• Images: Visualize weight matrices
• Use the timeline slider to see training progress


In [27]:
# =============================================================================
# 17. EXPERIMENT CONCLUSION
# =============================================================================

print("🎉 CUSTOMER CHURN PREDICTION MODEL - EXPERIMENT COMPLETED!")
print("=" * 60)
print("✅ Data preprocessing completed")
print("✅ Neural network model trained and saved")
print("✅ Encoders and scaler saved for future predictions")
print("✅ TensorBoard logs generated for analysis")

print("\n📁 Generated Files in '../PickelFiles/':")
print("  • model.keras - Trained neural network (recommended)")
print("  • model.h5 - Trained neural network (compatibility)")
print("  • label_encoder_gender.pkl - Gender encoder")
print("  • onehot_encoder_geo.pkl - Geography encoder") 
print("  • scaler.pkl - Feature scaler")

print("\n📊 Generated Logs in '../logs/fit/':")
print("  • TensorBoard training logs")
print("  • Model architecture graphs")
print("  • Training metrics history")

print("\n🔄 Next Steps:")
print("  1. Use '../Notebook/prediction.ipynb' for individual predictions")
print("  2. Use '../app.py' to run the Streamlit web application")
print("  3. Analyze TensorBoard visualizations for model insights")
print("  4. Consider hyperparameter tuning for improved performance")
print("  5. Deploy to Streamlit Cloud for public access")

print("\n🚀 Ready for Production Deployment!")
print("💡 All preprocessors and models are saved for consistent predictions")

🎉 CUSTOMER CHURN PREDICTION MODEL - EXPERIMENT COMPLETED!
✅ Data preprocessing completed
✅ Neural network model trained and saved
✅ Encoders and scaler saved for future predictions
✅ TensorBoard logs generated for analysis

📁 Generated Files in '../PickelFiles/':
  • model.keras - Trained neural network (recommended)
  • model.h5 - Trained neural network (compatibility)
  • label_encoder_gender.pkl - Gender encoder
  • onehot_encoder_geo.pkl - Geography encoder
  • scaler.pkl - Feature scaler

📊 Generated Logs in '../logs/fit/':
  • TensorBoard training logs
  • Model architecture graphs
  • Training metrics history

🔄 Next Steps:
  1. Use '../Notebook/prediction.ipynb' for individual predictions
  2. Use '../app.py' to run the Streamlit web application
  3. Analyze TensorBoard visualizations for model insights
  4. Consider hyperparameter tuning for improved performance
  5. Deploy to Streamlit Cloud for public access

🚀 Ready for Production Deployment!
💡 All preprocessors and models 