## Determining the optimal number of hidden layers and neurons for an Artificial Neural Network (ANN)

Choosing the number of hidden layers and neurons can be challenging and usually requires experimentation. However, a few guidelines and methods can help you make an informed decision.

- Start Simple: Begin with a simple architecture and gradually increase complexity if needed.
- Grid Search/Random Search: Use grid search or random search to try different architectures.
- Cross-Validation: Use cross-validation to evaluate the performance of each candidate architecture.
- Heuristics and Rules of Thumb: Some heuristics and empirical rules can provide starting points, such as:
    - The number of neurons in each hidden layer is often chosen between the size of the input layer and the size of the output layer.
    - A common practice is to start with 1-2 hidden layers.
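
As a rough illustration of the first rule of thumb, the sketch below lists a few evenly spaced hidden-layer widths between the output size and the input size. The helper name `candidate_hidden_sizes` and the example sizes (12 inputs, 1 output) are assumptions made for illustration and are not part of the notebook. The values it returns are only starting points for a search, not recommended settings.

```python
def candidate_hidden_sizes(n_inputs, n_outputs, n_candidates=3):
    """Return up to n_candidates evenly spaced widths between the two layer sizes.

    Hypothetical helper: it only encodes the "between input and output size"
    rule of thumb and says nothing about which width performs best.
    """
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")
    low, high = sorted((n_inputs, n_outputs))
    span = high - low
    # Integer arithmetic keeps the results deterministic (no float rounding).
    sizes = {low + (span * i) // (n_candidates + 1) for i in range(1, n_candidates + 1)}
    return sorted(sizes)


# Assumed example: 12 input features and a single sigmoid output unit.
print(candidate_hidden_sizes(12, 1))  # [3, 6, 9]
```

In practice, widths larger than the input size (such as the 16, 32 and 64 searched in this notebook) are also common, so a heuristic like this is best treated as one source of candidates among several.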

In [29]:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder
from sklearn.pipeline import Pipeline
# from keras.wrappers.scikit_learn import KerasClassifier  # legacy wrapper, replaced by SciKeras
from scikeras.wrappers import KerasClassifier
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping
import pickle

In [30]:
## Load the data
data = pd.read_csv('Churn_Modelling.csv')
## Preprocess the data: drop identifier columns that should not be used as features
data = data.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)

## Encoding categorical variables
label_encoder_gender = LabelEncoder()
data['Gender'] = label_encoder_gender.fit_transform(data['Gender'])

onehot_encoder_geography = OneHotEncoder(handle_unknown='ignore')
geography_encoded = onehot_encoder_geography.fit_transform(data[['Geography']]).toarray()
geography_encoded_df = pd.DataFrame(geography_encoded, columns=onehot_encoder_geography.get_feature_names_out(['Geography']))


data = pd.concat([data.drop('Geography', axis=1), geography_encoded_df], axis=1)

X = data.drop('Exited', axis=1)
y = data['Exited']

## Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

## Scale the features (fit on the training set only, then apply to the test set)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

## Save the encoders and scaler for future use
with open('label_encoder_gender.pkl', 'wb') as f:
    pickle.dump(label_encoder_gender, f)

with open('onehot_encoder_geography.pkl', 'wb') as f:
    pickle.dump(onehot_encoder_geography, f)
    
with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

In [31]:
## Define a function that builds the model, so KerasClassifier can try different parameters
def create_model(neurons=32, layers=1):
    model = Sequential()
    model.add(Dense(neurons, activation='relu', input_shape=(X_train.shape[1],)))
    
    for _ in range(layers - 1):
        model.add(Dense(neurons, activation='relu'))
        
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    
    return model
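
To get a feel for how `neurons` and `layers` change model size, the sketch below counts the trainable weights and biases of the fully connected architecture that `create_model` builds. The helper name `count_dense_params` is hypothetical, and the value of 12 input features is an assumption about the preprocessed data; in the notebook the actual value is `X_train.shape[1]`.

```python
def count_dense_params(n_features, neurons, layers):
    """Count weights and biases for the stack of Dense layers in create_model."""
    if layers < 1:
        raise ValueError("layers must be at least 1")
    # First hidden layer: one weight per (input, neuron) pair plus one bias per neuron.
    total = n_features * neurons + neurons
    # Each additional hidden layer connects `neurons` units to `neurons` units.
    total += (layers - 1) * (neurons * neurons + neurons)
    # Output layer: a single sigmoid unit with `neurons` weights and one bias.
    total += neurons + 1
    return total


n_features = 12  # assumption for illustration; use X_train.shape[1] in the notebook
for layers in (1, 2, 3):
    for neurons in (16, 32, 64):
        print(f"layers={layers}, neurons={neurons}: "
              f"{count_dense_params(n_features, neurons, layers)} parameters")
```

Because the hidden-to-hidden term grows with the square of `neurons`, widening the layers increases the parameter count much faster than adding another layer of the same width.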

In [32]:
## Create a Keras classifier (SciKeras takes `model=`; `model__`-prefixed arguments are routed to create_model)
model = KerasClassifier(model=create_model, model__layers=1, model__neurons=32, verbose=1)

In [33]:
param_grid = {
    'model__neurons': [16, 32, 64],  # must match 'neurons' in create_model
    'model__layers': [1, 2, 3],      # must match 'layers' in create_model
    'epochs': [50, 100]              # KerasClassifier parameter
}
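
The cost of a grid search grows multiplicatively with every parameter added to the grid. The sketch below enumerates the combinations in the grid above using only the standard library, and multiplies by the number of cross-validation folds to estimate how many models are trained. The variable names `cv_folds`, `candidates` and `total_fits` are illustrative.

```python
from itertools import product

# Same grid as in the notebook.
param_grid = {
    'model__neurons': [16, 32, 64],
    'model__layers': [1, 2, 3],
    'epochs': [50, 100],
}
cv_folds = 3  # matches cv=3 in the grid search

# Build every combination of parameter values (the Cartesian product).
names = sorted(param_grid)
candidates = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]
total_fits = len(candidates) * cv_folds

print(f"{len(candidates)} candidates x {cv_folds} folds = {total_fits} fits")
# 18 candidates x 3 folds = 54 fits
```

Adding a fourth parameter with only two values would double this to 108 fits, which is why it helps to keep the grid small or switch to random search when the search space grows.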


In [35]:

model = KerasClassifier(
    model=create_model, 
    verbose=0
)


In [36]:
## Perform grid search with 3-fold cross-validation
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3, verbose=1)
grid_result = grid.fit(X_train, y_train)


## Print the best parameters and the best mean cross-validated score
print(f"Best Parameters: {grid_result.best_params_}")
print(f"Best Score: {grid_result.best_score_}")

Fitting 3 folds for each of 18 candidates, totalling 54 fits




Best Parameters: {'epochs': 50, 'model__layers': 2, 'model__neurons': 16}
Best Score: 0.856624448575586
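
The guidelines at the top also mention random search. Instead of training every combination, it evaluates a fixed number of randomly sampled configurations. The sketch below shows only the sampling step, using the standard library. The function name `sample_configurations` and the choice of 6 samples are assumptions for illustration. scikit-learn's `RandomizedSearchCV` provides the full search with cross-validation.

```python
import math
import random


def sample_configurations(param_space, n_iter, seed=42):
    """Draw up to n_iter distinct configurations uniformly from a discrete space."""
    rng = random.Random(seed)  # local generator so results are reproducible
    names = sorted(param_space)
    n_total = math.prod(len(param_space[name]) for name in names)
    n_iter = min(n_iter, n_total)  # cannot draw more distinct configs than exist

    seen = set()
    configs = []
    while len(configs) < n_iter:
        values = tuple(rng.choice(param_space[name]) for name in names)
        if values not in seen:  # skip duplicates so each config is trained once
            seen.add(values)
            configs.append(dict(zip(names, values)))
    return configs


param_space = {
    'model__neurons': [16, 32, 64],
    'model__layers': [1, 2, 3],
    'epochs': [50, 100],
}

# Evaluate 6 of the 18 possible configurations instead of all of them.
for config in sample_configurations(param_space, n_iter=6):
    print(config)
```

With 3-fold cross-validation, 6 sampled configurations need 18 fits rather than 54. The trade-off is that the best combination in the grid may not be among the sampled ones.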
