## Determining the optimal number of hidden layers and neurons for an Artificial Neural Network (ANN)

Choosing the number of hidden layers and neurons can be challenging and often requires experimentation. However, some guidelines and methods can help you make an informed decision:

1. Start Simple: Begin with a simple architecture and gradually increase complexity if needed.
2. Grid Search/Random Search: Use grid search or random search to try different architectures.
3. Cross-Validation: Use cross-validation to evaluate the performance of different architectures.
4. Heuristics and Rules of Thumb: Some heuristics and empirical rules can provide starting points, such as:
    1. The number of neurons in a hidden layer should be between the size of the input layer and the size of the output layer.
    2. It is common practice to start with 1-2 hidden layers.
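
The first rule of thumb above can be turned into a list of starting widths. The following is only a sketch: the helper name `candidate_hidden_sizes`, the even spacing, and the example of 12 input features with 1 output are assumptions made for illustration, not part of the original notebook.

```python
def candidate_hidden_sizes(n_inputs, n_outputs, n_candidates=4):
    """Return evenly spaced hidden-layer widths strictly between the
    output-layer size and the input-layer size (hypothetical helper)."""
    low, high = sorted((n_inputs, n_outputs))
    # With no room between the two sizes, fall back to a single midpoint.
    if n_candidates < 2 or high - low < 2:
        return [max(1, (low + high) // 2)]
    step = (high - low) / (n_candidates + 1)
    # A set removes duplicates that rounding can produce for narrow ranges.
    sizes = {round(low + step * i) for i in range(1, n_candidates + 1)}
    return sorted(sizes)


# Assumed example: 12 input features, 1 output neuron.
print(candidate_hidden_sizes(12, 1))  # [3, 5, 8, 10]
```

These values are starting points only; the search over architectures still decides which width works best.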

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split,GridSearchCV
from sklearn.preprocessing import StandardScaler,LabelEncoder,OneHotEncoder
from sklearn.pipeline import Pipeline
from scikeras.wrappers import KerasClassifier
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping
import pickle

In [2]:
data = pd.read_csv("Churn_Modelling.csv")
data = data.drop(['RowNumber','CustomerId','Surname'],axis=1)

label_encoder_gender= LabelEncoder()
data['Gender'] = label_encoder_gender.fit_transform(data['Gender'])


onehot_encoder_geo = OneHotEncoder()
geo_encoder = onehot_encoder_geo.fit_transform(data[['Geography']])
geo_encoded_df = pd.DataFrame(geo_encoder.toarray(),columns=onehot_encoder_geo.get_feature_names_out(['Geography']))

data = pd.concat([data.drop('Geography',axis=1),geo_encoded_df],axis=1)

x = data.drop('Exited',axis=1)
y = data['Exited']

x_train, x_test, y_train,y_test = train_test_split(x,y,test_size=0.2,random_state=42)

## Scale these features
scaler = StandardScaler()
x_train= scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

## Save the encoders and scaler
with open('label_encoder_gender.pkl','wb') as file:
    pickle.dump(label_encoder_gender,file)
with open('onehot_encoder_geo.pkl','wb') as file:
    pickle.dump(onehot_encoder_geo,file)
with open('scaler.pkl','wb') as file:
    pickle.dump(scaler,file)


In [3]:
## Define a function to create the model and try different parameters(KerasClassifier)
def create_model(neurons=32,layers=1):
    model= Sequential()
    model.add(Dense(neurons,activation= 'relu',input_shape=(x_train.shape[1],)))

    for _ in range(layers-1):
        model.add(Dense(neurons,activation='relu'))

    model.add(Dense(1,activation='sigmoid'))
    model.compile(optimizer='adam',loss="binary_crossentropy",metrics=['accuracy'])

    return model
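
To get a feel for how `neurons` and `layers` change model size, the trainable parameters of a fully connected network shaped like the one in `create_model` can be counted by hand. This is a rough sketch: the helper `count_dense_parameters` is hypothetical, and the 12 input features used below are an assumed example rather than a value taken from the notebook.

```python
def count_dense_parameters(n_inputs, neurons=32, layers=1, n_outputs=1):
    """Count weights and biases of a stack of Dense layers:
    `layers` hidden layers of `neurons` units, then one output layer."""
    total = 0
    previous = n_inputs
    for _ in range(layers):
        # Each Dense layer has previous * neurons weights plus neurons biases.
        total += previous * neurons + neurons
        previous = neurons
    # Output layer.
    total += previous * n_outputs + n_outputs
    return total


# Assumed example: 12 input features.
for layers in (1, 2):
    for neurons in (16, 32, 64, 128):
        n_params = count_dense_parameters(12, neurons, layers)
        print(f"layers={layers} neurons={neurons} parameters={n_params}")
```

Parameter count grows roughly quadratically with `neurons` once a second hidden layer is added, which is one reason to start with a small architecture.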


In [4]:
## create a keras classifier
## `model=` replaces the deprecated `build_fn=` argument in scikeras
model = KerasClassifier(model=create_model,neurons=32,layers=1,verbose=1)

In [5]:
## Define the grid search parameters
param_grid ={
    'neurons':[16,32,64,128],
    'layers':[1,2],
    'epochs':[50,100]
}
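
Before launching the search it helps to know how many models will be trained. The sketch below expands the same grid with `itertools.product`; the variable names `candidates`, `cv_folds` and `total_fits` are introduced here for illustration, and `cv_folds = 3` is assumed to match the `cv=3` setting used for the grid search.

```python
from itertools import product

param_grid = {
    'neurons': [16, 32, 64, 128],
    'layers': [1, 2],
    'epochs': [50, 100],
}
cv_folds = 3  # assumed to match cv=3 in the grid search

# Every combination of one value per parameter is one candidate model.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

# Each candidate is trained once per cross-validation fold.
total_fits = len(candidates) * cv_folds
print(len(candidates), total_fits)  # 16 48
```

Adding one more value to any list multiplies the cost, so the grid is best kept small until a promising region is found.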

In [6]:
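# Perform grid search
# Note: n_jobs=-1 runs the fits in parallel worker processes, which can be
# unreliable with TensorFlow; try n_jobs=1 if the search hangs.
grid = GridSearchCV(estimator=model,param_grid=param_grid,n_jobs=-1,cv=3,verbose=1)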
grid_result = grid.fit(x_train,y_train)

# Print the best parameters
print(f"Best: {grid_result.best_score_:f} using {grid_result.best_params_}")


Fitting 3 folds for each of 16 candidates, totalling 48 fits


KeyboardInterrupt: 
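
When the exhaustive grid is too expensive to run to completion, random search (item 2 in the list at the top) evaluates only a sample of the combinations. scikit-learn offers `RandomizedSearchCV` for this; the pure-Python sketch below only illustrates the sampling idea. The helper `sample_configurations` and its `n_iter` and `seed` parameters are hypothetical names chosen for this example.

```python
import random


def sample_configurations(param_grid, n_iter, seed=0):
    """Draw up to n_iter distinct parameter combinations at random."""
    rng = random.Random(seed)
    names = sorted(param_grid)
    # The grid size caps how many distinct combinations exist.
    max_unique = 1
    for name in names:
        max_unique *= len(param_grid[name])
    seen = set()
    samples = []
    while len(samples) < min(n_iter, max_unique):
        choice = tuple(rng.choice(param_grid[name]) for name in names)
        if choice not in seen:  # skip combinations already drawn
            seen.add(choice)
            samples.append(dict(zip(names, choice)))
    return samples


param_grid = {
    'neurons': [16, 32, 64, 128],
    'layers': [1, 2],
    'epochs': [50, 100],
}

# Evaluate 5 of the 16 combinations instead of all of them.
samples = sample_configurations(param_grid, n_iter=5, seed=1)
for config in samples:
    print(config)
```

With 3-fold cross-validation, 5 sampled candidates mean 15 fits instead of 48, at the cost of possibly missing the best combination.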