# Determining the optimal number of hidden layers and neurons for an Artificial Neural Network (ANN)

Choosing the number of hidden layers and neurons can be challenging and usually requires experimentation. However, a few guidelines and methods can help you make an informed decision.

- Start Simple: Begin with a simple architecture and gradually increase complexity if needed.
- Grid Search/Random Search: Use grid search or random search to try different architectures.
- Cross-Validation: Use cross-validation to evaluate the performance of different architectures, as in classical machine learning.
- Heuristics and Rules of Thumb: Some heuristic and empirical rules can provide starting points, such as:
    - The number of neurons in the hidden layer should be between the size of the input layer and the size of the output layer.
    - A common practice is to start with 1-2 hidden layers.

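- KerasClassifier (from the SciKeras package) is an important tool here: it wraps a Keras model as a scikit-learn estimator, so it can be used with GridSearchCV.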
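
The input/output-size rule of thumb above can be turned into a short list of starting points. The sketch below is only illustrative: the helper name `candidate_hidden_sizes`, the choice of powers of two, and the sizes 12 (inputs) and 1 (output) are assumptions, not something the notebook defines.

```python
def candidate_hidden_sizes(n_inputs, n_outputs, max_candidates=4):
    """Return powers of two lying between the output and input layer sizes."""
    low, high = sorted((n_inputs, n_outputs))
    sizes = []
    size = 1
    while size <= high:
        if size >= low:
            sizes.append(size)
        size *= 2
    # Keep the largest few; fall back to the upper bound if no power of two fits
    return sizes[-max_candidates:] or [high]


# Assumed sizes for illustration: 12 input features, 1 output neuron
print(candidate_hidden_sizes(12, 1))
```

These values are only starting points. A search over wider layers, like the one performed later in this notebook, can still find a better configuration outside this range.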

In [3]:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder
from sklearn.pipeline import Pipeline
# from keras.wrappers.scikit_learn import KerasClassifier
from scikeras.wrappers import KerasClassifier
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping
import pickle

In [11]:
## Load the dataset
data=pd.read_csv("Churn_Modelling.csv")

# Preprocess the data
# Drop Irrelevant Columns
data.drop(['RowNumber','CustomerId','Surname'], axis=1,inplace=True)

## Feature Engineering
# Encode categorical variable
label_encoder_gender=LabelEncoder()
data['Gender']=label_encoder_gender.fit_transform(data['Gender'])

## One-hot encode the 'Geography' column. The encoder returns a sparse matrix, which is converted to a dense array with toarray() below
onehot_encoder_geo = OneHotEncoder()
geo_encoder=onehot_encoder_geo.fit_transform(data[['Geography']])

geo_encoder_df=pd.DataFrame(geo_encoder.toarray(), columns=onehot_encoder_geo.get_feature_names_out(['Geography']))

# Combine one hot encoder columns with the original data
data=pd.concat([data.drop('Geography',axis=1),geo_encoder_df],axis=1)

## Divide the data into dependent and independent features
X = data.drop('Exited', axis=1)
y = data['Exited']

## Split the data into training and testing sets
X_train, X_test, y_train,  y_test = train_test_split(X, y, test_size=0.2, random_state=42)

## Scale these features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

## Save the encoders and Standard Scaler
with open('label_encoder_gender.pkl','wb') as file:
    pickle.dump(label_encoder_gender,file)

with open('onehot_encoder_geo.pkl','wb') as file:
    pickle.dump(onehot_encoder_geo, file)

# Save the scaler to a pickle file
with open('scaler.pkl','wb') as file:
    pickle.dump(scaler,file)

In [12]:
## Define a function that builds the model; KerasClassifier calls it with different values of neurons and layers

def create_model(neurons=32, layers=1):
    model=Sequential()
    model.add(Dense(neurons, activation='relu', input_shape=(X_train.shape[1],)))

    for _ in range(layers-1):
        model.add(Dense(neurons, activation='relu'))
    
    model.add(Dense(1,activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    return model
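
To get a feel for how quickly model size grows with `neurons` and `layers`, the trainable parameters of the network built by `create_model` can be counted by hand. This is a minimal sketch in plain Python; the helper name `dense_param_count` and the value of 12 input features are assumptions (use `X_train.shape[1]` for the real value).

```python
def dense_param_count(n_inputs, neurons=32, layers=1):
    """Count weights and biases in a stack of fully connected layers
    followed by a single output neuron."""
    total = 0
    fan_in = n_inputs
    for _ in range(layers):
        total += fan_in * neurons + neurons  # weights + biases of one hidden layer
        fan_in = neurons
    total += fan_in * 1 + 1  # single sigmoid output neuron
    return total


# 12 input features is an assumed value for illustration
for layers in (1, 2):
    for neurons in (16, 32, 64, 128):
        print(f"layers={layers} neurons={neurons} params={dense_param_count(12, neurons, layers)}")
```

For comparison, Keras reports the same kind of total when you call `summary()` on a built model.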

In [13]:
# Create a KerasClassifier that wraps create_model as a scikit-learn estimator
# (SciKeras deprecates build_fn in favour of the model argument)
model=KerasClassifier(model=create_model,layers=1,neurons=32,epochs=50,batch_size=10,verbose=0)


In [16]:
# Define the grid search parameters
param_grid= {
    'neurons': [16,32,64,128],
    'layers': [1,2],
    'epochs': [50,100]
}
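
Before launching the search, it helps to know how many models will be trained. The sketch below expands the same grid with `itertools.product`; the variable names `candidates` and `cv_folds` are illustrative, and 3 folds is taken from the `cv=3` setting used in the next cell.

```python
from itertools import product

param_grid = {
    "neurons": [16, 32, 64, 128],
    "layers": [1, 2],
    "epochs": [50, 100],
}
cv_folds = 3  # matches cv=3 in the grid search cell

# Expand the grid into one dict per parameter combination
keys = sorted(param_grid)
candidates = [
    dict(zip(keys, values))
    for values in product(*(param_grid[k] for k in keys))
]

total_fits = len(candidates) * cv_folds
print(f"{len(candidates)} candidates x {cv_folds} folds = {total_fits} fits")
```

Each additional value in any list multiplies the number of fits, so the grid is best kept small when every fit trains a network for 50 to 100 epochs.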

In [17]:
# Perform the grid search, i.e. train every candidate with 3-fold cross-validation
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3, verbose=1)
grid_result = grid.fit(X_train, y_train)

# Print the best parameters
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

Fitting 3 folds for each of 16 candidates, totalling 48 fits


Best: 0.858375 using {'epochs': 50, 'layers': 1, 'neurons': 64}
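
The grid search above is exhaustive. Random search, mentioned in the guidelines at the top, instead samples a fixed number of configurations from the same space. The sketch below shows the idea in plain Python; `sample_configs`, `search_space`, and the budget of 6 are hypothetical names and values. In practice scikit-learn's `RandomizedSearchCV` plays this role and accepts the same estimator.

```python
import random


def sample_configs(space, n_iter, seed=0):
    """Draw up to n_iter distinct parameter combinations at random."""
    rng = random.Random(seed)
    keys = sorted(space)
    total = 1
    for key in keys:
        total *= len(space[key])
    n_iter = min(n_iter, total)  # cannot draw more distinct configs than exist

    seen = set()
    configs = []
    while len(configs) < n_iter:
        choice = tuple(rng.choice(space[key]) for key in keys)
        if choice not in seen:  # skip combinations already drawn
            seen.add(choice)
            configs.append(dict(zip(keys, choice)))
    return configs


search_space = {
    "neurons": [16, 32, 64, 128],
    "layers": [1, 2],
    "epochs": [50, 100],
}

# Assumed budget: 6 of the 16 possible combinations
configs = sample_configs(search_space, n_iter=6, seed=0)
for config in configs:
    print(config)
```

With a budget of 6 instead of 16 candidates, the number of fits drops proportionally, at the cost of possibly missing the best combination.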

