### Determining the optimal number of hidden layers and neurons for an Artificial Neural Network (ANN)
Choosing these values can be challenging and often requires experimentation. However, some guidelines and methods can help you make an informed decision:

- Start Simple: Begin with a simple architecture and increase complexity only if performance requires it.
- Grid Search/Random Search: Use grid search or random search to try different architectures.
- Cross-Validation: Use cross-validation to evaluate the performance of different architectures.
- Heuristics and Rules of Thumb: Some heuristics and empirical rules can provide starting points, such as:
  - A common rule of thumb places the number of neurons in a hidden layer between the size of the input layer and the size of the output layer.
  - A common practice is to start with 1-2 hidden layers.
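
As a minimal sketch of the "start simple" guideline and the neuron-count rule of thumb, the snippet below lists candidate architectures from smallest to largest. The helper name `candidate_architectures` and the example sizes (12 inputs, 1 output) are illustrative assumptions and not part of the notebook's own code.

```python
from itertools import product


def candidate_architectures(n_inputs, n_outputs, max_layers=2):
    """List (layers, neurons) pairs, simplest first.

    Hypothetical helper: neuron counts are restricted to the range between
    the output size and the input size, following the rule of thumb above.
    """
    low, high = sorted((n_inputs, n_outputs))
    # Three neuron counts spread over [low, high], duplicates removed
    sizes = sorted({low, (low + high) // 2, high})
    candidates = list(product(range(1, max_layers + 1), sizes))
    # Order by a rough size measure so the simplest architectures come first
    return sorted(candidates, key=lambda c: c[0] * c[1])


# Assumed example: 12 input features and 1 output unit
for layers, neurons in candidate_architectures(12, 1):
    print(f"{layers} hidden layer(s) x {neurons} neurons")
```

Such a list is only a starting point; each candidate would still need to be evaluated with cross-validation.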

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder
from sklearn.pipeline import Pipeline
from scikeras.wrappers import KerasClassifier
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping
import pickle

In [3]:
# Load the dataset and drop identifier columns that carry no predictive signal
data = pd.read_csv('Churn_Modelling.csv')
data = data.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)

label_encoder_gender = LabelEncoder()
data['Gender'] = label_encoder_gender.fit_transform(data['Gender'])

one_hot_encoder_geo = OneHotEncoder(handle_unknown='ignore')
geo_encoded = one_hot_encoder_geo.fit_transform(data[['Geography']]).toarray()
geo_encoded_df = pd.DataFrame(geo_encoded, columns=one_hot_encoder_geo.get_feature_names_out(['Geography']))

data = pd.concat([data.drop('Geography', axis=1), geo_encoded_df], axis=1)

X = data.drop('Exited', axis=1)
y = data['Exited']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Save encoders and scaler for later use
with open('label_encoder_gender.pkl', 'wb') as file:
    pickle.dump(label_encoder_gender, file)

with open('one_hot_encoder_geo.pkl', 'wb') as file:
    pickle.dump(one_hot_encoder_geo, file)

with open('scaler.pkl', 'wb') as file:
    pickle.dump(scaler, file)

In [4]:
# Build the model; `neurons` and `layers` are the hyperparameters tuned via KerasClassifier
def create_model(neurons=32, layers=1):
    model = Sequential()
    # First hidden layer, which also fixes the input dimension
    model.add(Dense(neurons, activation='relu', input_shape=(X_train.shape[1],)))

    # Remaining hidden layers, so that `layers` is the total number of hidden layers
    for _ in range(layers - 1):
        model.add(Dense(neurons, activation='relu'))

    # Single sigmoid unit for binary classification (Exited: 0 or 1)
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    return model
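
To get a feel for how quickly model size grows with these two hyperparameters, the following sketch counts the trainable parameters of such a fully connected network. It treats `layers` as the total number of hidden layers, and the input width of 12 is an assumption for illustration (the real value is `X_train.shape[1]`); `count_dense_params` is a hypothetical helper, not a Keras function.

```python
def count_dense_params(n_inputs, neurons, layers):
    """Trainable parameters of a fully connected binary classifier.

    Assumes `layers` hidden layers of `neurons` units each, followed by a
    single output unit; every Dense layer has one bias per unit.
    """
    total = (n_inputs + 1) * neurons                  # first hidden layer
    total += (layers - 1) * (neurons + 1) * neurons   # remaining hidden layers
    total += neurons + 1                              # output unit
    return total


# Assumed input width of 12 features
for layers in (1, 2, 3):
    for neurons in (32, 64, 128):
        print(layers, neurons, count_dense_params(12, neurons, layers))
```

In Keras itself, `model.summary()` reports the same kind of count for a built model.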

In [8]:
# Create a KerasClassifier; scikeras takes the build function as `model` (`build_fn` is deprecated)
model = KerasClassifier(model=create_model, neurons=32, layers=1, verbose=1)

In [6]:
# Define the grid search parameters
param_grid = {
    'neurons': [32, 64, 128],
    'layers': [1, 2, 3],
    'epochs': [50, 100],
}
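
Before launching the search, it can help to estimate how much training this grid implies. The sketch below enumerates the grid with the standard library only and multiplies by the 3 folds used in the `GridSearchCV` call; the variable names `combinations`, `n_fits`, and `total_epochs` are introduced here for illustration.

```python
from itertools import product

# Same grid as in the notebook
param_grid = {
    'neurons': [32, 64, 128],
    'layers': [1, 2, 3],
    'epochs': [50, 100],
}
cv_folds = 3  # matches cv=3 in the GridSearchCV call

# Every combination of values, as a list of dicts
combinations = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
n_fits = len(combinations) * cv_folds
# Epochs trained across all fits (ignores the final refit on the full training set)
total_epochs = sum(c['epochs'] for c in combinations) * cv_folds

print(f"{len(combinations)} combinations, {n_fits} fits, {total_epochs} epochs in total")
```

This prints `18 combinations, 54 fits, 4050 epochs in total`, which explains why the search takes a long time and produces a large amount of output.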

In [9]:
# Perform grid search with 3-fold cross-validation
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_search_result = grid_search.fit(X_train, y_train)

# Print the best parameters and score
print(f"Best Parameters: {grid_search_result.best_params_}")
print(f"Best Score: {grid_search_result.best_score_}")

  X, y = self._initialize(X, y)
2025-08-06 02:04:33.792680: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Pro
2025-08-06 02:04:35.482967: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.

[Interleaved and truncated progress output from the parallel workers, covering Epoch 1/50 through Epoch 50/50 and Epoch 1/100 through Epoch 100/100, condensed here. The three lines above appear repeatedly in the original output. Sample progress lines:]

 11/167 [>.............................] - ETA: 1s - loss: 0.4712 - accuracy: 0.7983
  1/167 [..............................] - ETA: 1s - loss: 5381.5352 - accuracy: 0.4688
  9/167 [>.............................] - ETA: 2s - loss: 53619220.0000 - accuracy: 0.4861
 37/167 [=====>........................] - ETA: 1s - loss: 0.5041 - accuracy: 0.7998
 35/167 [=====>........................] - ETA: 1s - loss: 4.6830 - accuracy: 0.7652

KeyboardInterrupt: 
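
The run above ended with a `KeyboardInterrupt` before the search finished. One way to reduce the cost, in line with the random search option listed earlier, is to evaluate only a random subset of the grid. The sketch below uses only the standard library; `sample_configurations` is a hypothetical helper, and the subset size of 6 is an arbitrary choice.

```python
import random
from itertools import product


def sample_configurations(param_grid, n_samples, seed=0):
    """Pick `n_samples` distinct configurations from the grid at random."""
    keys = list(param_grid)
    all_configs = [dict(zip(keys, values)) for values in product(*param_grid.values())]
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(all_configs, min(n_samples, len(all_configs)))


param_grid = {
    'neurons': [32, 64, 128],
    'layers': [1, 2, 3],
    'epochs': [50, 100],
}

for config in sample_configurations(param_grid, n_samples=6):
    print(config)
```

In practice, scikit-learn's `RandomizedSearchCV` (with its `param_distributions` and `n_iter` arguments) performs this kind of sampling and could likely be used in place of `GridSearchCV` here.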