## Determining the optimal number of hidden layers and neurons for an ANN

Choosing the number of hidden layers and neurons is difficult and usually requires experimentation. The following guidelines and methods can help you make an informed decision:

1. Start simple: Begin with a small architecture and add layers or neurons only if validation performance calls for it.

2. Grid Search/Random Search: Use grid search or random search to try different architectures systematically.

3. Cross-validation: Use cross-validation to compare architectures on data the model was not trained on.

4. Heuristics and Rules of Thumb: Some heuristics and empirical rules provide starting points rather than guarantees, such as:

    1. The number of neurons in a hidden layer often falls between the size of the input layer and the size of the output layer.
    2. A common practice is to start with 1-2 hidden layers.
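As a rough illustration of the first rule of thumb, the sketch below lists a few hidden-layer widths between the output size and the input size. The helper name `hidden_size_candidates` and the choice of a midpoint are assumptions made for this example, not an established formula; the values are only starting points for a search.

```python
def hidden_size_candidates(n_inputs, n_outputs):
    """Return a few hidden-layer widths between the output and input sizes.

    Hypothetical helper: it only illustrates the rule of thumb above.
    """
    low, high = sorted((n_inputs, n_outputs))
    midpoint = (low + high) // 2
    # A set removes duplicates when the sizes coincide; sorted() gives ascending order
    return sorted({low, midpoint, high})


# 12 input features and 1 output unit match the churn data used in this notebook
print(hidden_size_candidates(12, 1))
```

The grid search later in this notebook tries wider layers (16 to 128 neurons), which shows that the heuristic is a place to begin, not a limit.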

In [19]:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder
from sklearn.pipeline import Pipeline
from scikeras.wrappers import KerasClassifier
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.callbacks import EarlyStopping

In [20]:
data = pd.read_csv('Churn_Modelling.csv')
data.head()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [21]:
## Drop the columns that are not required
data = data.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)
data.head()

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [22]:
## Label Encoding for Gender
gender_label_encoder = LabelEncoder()
data['Gender'] = gender_label_encoder.fit_transform(data['Gender'])

## One Hot Encoding for Geography
geography_onehot_encoder = OneHotEncoder()
geo_encoded = geography_onehot_encoder.fit_transform(data[['Geography']])

geo_encoded_df = pd.DataFrame(geo_encoded.toarray(), columns=geography_onehot_encoder.get_feature_names_out())
geo_encoded_df.head()

data_df = pd.concat([geo_encoded_df, data], axis=1)

data_df = data_df.drop(['Geography'], axis=1)

data_df.head()

Unnamed: 0,Geography_France,Geography_Germany,Geography_Spain,CreditScore,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1.0,0.0,0.0,619,0,42,2,0.0,1,1,1,101348.88,1
1,0.0,0.0,1.0,608,0,41,1,83807.86,1,0,1,112542.58,0
2,1.0,0.0,0.0,502,0,42,8,159660.8,3,1,0,113931.57,1
3,1.0,0.0,0.0,699,0,39,1,0.0,2,0,0,93826.63,0
4,0.0,0.0,1.0,850,0,43,2,125510.82,1,1,1,79084.1,0


In [23]:
data_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 13 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Geography_France   10000 non-null  float64
 1   Geography_Germany  10000 non-null  float64
 2   Geography_Spain    10000 non-null  float64
 3   CreditScore        10000 non-null  int64  
 4   Gender             10000 non-null  int64  
 5   Age                10000 non-null  int64  
 6   Tenure             10000 non-null  int64  
 7   Balance            10000 non-null  float64
 8   NumOfProducts      10000 non-null  int64  
 9   HasCrCard          10000 non-null  int64  
 10  IsActiveMember     10000 non-null  int64  
 11  EstimatedSalary    10000 non-null  float64
 12  Exited             10000 non-null  int64  
dtypes: float64(5), int64(8)
memory usage: 1015.8 KB


In [24]:
## Splitting the data into dependent and independent features
X = data_df.drop(['Exited'], axis=1)
y = data_df['Exited']

In [25]:
## Splitting training and testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)

In [26]:
## Scaling the data
scaler = StandardScaler()

X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
pd.DataFrame(X_train, columns=X.columns).head()

Unnamed: 0,Geography_France,Geography_Germany,Geography_Spain,CreditScore,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary
0,1.001501,-0.579467,-0.576388,0.3565,0.913248,-0.655786,0.34568,-1.218471,0.808436,0.649203,0.974817,1.36767
1,-0.998501,1.725723,-0.576388,-0.203898,0.913248,0.294938,-0.348369,0.696838,0.808436,0.649203,0.974817,1.661254
2,-0.998501,-0.579467,1.734942,-0.961472,0.913248,-1.416365,-0.695393,0.618629,-0.916688,0.649203,-1.025834,-0.252807
3,1.001501,-0.579467,-0.576388,-0.940717,-1.094993,-1.131148,1.386753,0.953212,-0.916688,0.649203,-1.025834,0.915393
4,1.001501,-0.579467,-0.576388,-1.397337,0.913248,1.625953,1.386753,1.057449,-0.916688,-1.540351,-1.025834,-1.0596


In [27]:
## Define a function to create the model and try different hyperparameters (KerasClassifier)

def create_model(neurons=32, layers=1):
    model = Sequential()
    model.add(Input(shape=(X_train.shape[1],)))
    model.add(Dense(neurons, activation='relu'))

    for _ in range(layers - 1):
        model.add(Dense(neurons, activation='relu'))

    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    return model
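To get a feel for how `neurons` and `layers` change model capacity, the sketch below counts the weights and biases of the fully connected network that `create_model` builds. The helper `count_dense_params` is hypothetical and written for this example; it assumes 12 input features, as in the scaled training data above.

```python
def count_dense_params(n_inputs, neurons=32, layers=1):
    """Count trainable parameters of a dense network shaped like create_model's.

    Hypothetical helper: each Dense layer has (inputs * units) weights plus
    one bias per unit.
    """
    total = n_inputs * neurons + neurons                     # first hidden layer
    total += (layers - 1) * (neurons * neurons + neurons)    # additional hidden layers
    total += neurons * 1 + 1                                 # single sigmoid output unit
    return total


# Compare the architectures covered by the grid search (12 input features assumed)
for n_layers in (1, 2):
    for n_neurons in (16, 32, 64, 128):
        print(n_layers, n_neurons, count_dense_params(12, n_neurons, n_layers))
```

With roughly 8,000 training rows, the larger configurations have more parameters than the smaller ones by well over an order of magnitude, which is one reason to compare them with cross-validation rather than training accuracy.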

In [31]:
## Create a Keras classifier (SciKeras deprecates `build_fn` in favor of `model`)
model = KerasClassifier(model=create_model, layers=1, neurons=32, epochs=50, batch_size=10, verbose=0)

In [29]:
# Define the grid search parameters
param_grid = {
    'neurons': [16, 32, 64, 128],
    'layers': [1, 2],
    'epochs': [50, 100]
}
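Before launching the search, it helps to estimate its cost. The sketch below enumerates the grid defined above with `itertools.product` and multiplies by the number of folds; `cv_folds = 3` is taken from the `cv=3` argument used in the grid search cell.

```python
from itertools import product

param_grid = {
    'neurons': [16, 32, 64, 128],
    'layers': [1, 2],
    'epochs': [50, 100],
}
cv_folds = 3  # matches cv=3 in the grid search cell

# Every combination of the listed values is one candidate architecture
combinations = list(product(*param_grid.values()))
n_fits = len(combinations) * cv_folds
print(f"{len(combinations)} combinations x {cv_folds} folds = {n_fits} model fits")
```

With the default `refit=True`, `GridSearchCV` also trains the best configuration once more on the full training set, so expect one additional fit on top of this count.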

In [None]:
## Perform grid search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X_train, y_train)

## Print the best parameters
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
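Random search, listed earlier as an alternative to grid search, evaluates only a random subset of the configurations. The sketch below illustrates just the sampling step in plain Python and trains nothing; the helper `sample_configs` and the seed are assumptions for this example. In practice, scikit-learn's `RandomizedSearchCV` performs the sampling and the cross-validation together.

```python
import random
from itertools import product

param_grid = {
    'neurons': [16, 32, 64, 128],
    'layers': [1, 2],
    'epochs': [50, 100],
}


def sample_configs(grid, n_iter, seed=42):
    """Draw n_iter distinct configurations from the grid without replacement.

    Hypothetical helper that mimics the sampling idea behind random search.
    """
    keys = list(grid)
    all_configs = [dict(zip(keys, values)) for values in product(*grid.values())]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(all_configs, k=min(n_iter, len(all_configs)))


sampled = sample_configs(param_grid, n_iter=5)
for config in sampled:
    print(config)
```

Sampling 5 of the 16 configurations cuts the number of cross-validated fits to under a third of the full grid, at the risk of missing the best combination.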