# Fine-Tuning Neural Network Hyperparameters

The flexibility of neural networks is also one of their main drawbacks: there are many hyperparameters to tweak.

---

Not only can you use any imaginable network architecture, but even in a simple MLP you can change
1. number of layers,
2. number of neurons per layer,
3. type of activation function to use in each layer,
4. weight initialization logic and much more

---

One option is to simply ***try many combinations of hyperparameters*** and see which one works best on the validation set (or **using K-fold cross-validation**)

---

For this, one approach is simply use

*   GridSearchCV
*   RandomizedSearchCV

to explore the hyper parameter space.

For this, we need to wrap our Keras models in objects that mimic regular Scikit-Learn regressors.

---

The first step is to create a function that will build and compile a Keras model, given a set of hyperparameters:

In [1]:
import tensorflow as tf
from tensorflow import keras

In [2]:
import sklearn

In [3]:
def build_model(n_hidden=1, n_neurons=30, learning_rate=3e-3, input_shape=[8]):
    model = keras.models.Sequential()
    options = {"input_shape": input_shape}

    for layer in range(n_hidden):
        model.add(keras.layers.Dense(n_neurons, activation="relu", **options))
        options = {}

    # output layer (ONLY ONCE)
    model.add(keras.layers.Dense(1))

    optimizer = keras.optimizers.SGD(learning_rate=learning_rate)
    model.compile(loss="mse", optimizer=optimizer)

    return model


In [4]:
# def build_model(n_hidden=1, n_neurons=30, learning_rate=3e-3, input_shape=[8]):
#   model = keras.models.Sequential()
#   options = {"input_shape": input_shape}
#   for layer in range(n_hidden):
#     model.add(keras.layers.Dense(n_neurons, activation="relu", **options))
#     options = {}
#   model.add(keras.layers.Dense(1, **options))
#   optimizer = keras.optimizers.SGD(learning_rate)
#   model.compile(loss="mse", optimizer=optimizer)
#   return model

In [5]:
# pip install -U scikeras

**Build Keras Regressor**

In [6]:
from scikeras.wrappers import KerasRegressor

keras_reg = KerasRegressor(
    model=build_model,
    n_hidden=1,
    n_neurons=30,
    learning_rate=3e-3,
    verbose=1 # Show epoches progress
)


Data Splitting

In [7]:
import pandas as pd
from sklearn.model_selection import train_test_split

# Load from Excel (offline)
housing = pd.read_excel("/content/fetch_california_housing.xlsx")

y = housing['target']
X = housing.drop(columns='target', axis=1)

X_train_full, X_test, y_train_full, y_test = train_test_split(X, y)

X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_valid_scaled = scaler.transform(X_valid)
X_test_scaled = scaler.transform(X_test)

y_train_scaled = scaler.fit_transform(y_train.to_numpy().reshape(-1, 1))
y_valid_scaled = scaler.fit_transform(y_valid.to_numpy().reshape(-1, 1))
y_test_scaled = scaler.fit_transform(y_test.to_numpy().reshape(-1, 1))

X_new = X_train_scaled[:3]

In [8]:
keras_reg.fit(X_train_scaled, y_train_scaled, epochs=30,
 validation_data=(X_valid_scaled, y_valid_scaled),
 callbacks=[keras.callbacks.EarlyStopping(patience=10)])

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


In [9]:
# !pip uninstall -y scikit-learn

In [10]:
# !pip install scikit-learn==1.5.2

The attribute error is due to scikit learn version.

---

Source - https://stackoverflow.com/a/79300623

Posted by Chan Jun Hao
Retrieved 2026-01-01, License - CC BY-SA 4.0

---

!pip uninstall -y scikit-learn

---

!pip install scikit-learn==1.5.2


In [11]:
mse_test = keras_reg.score(X_test_scaled, y_test_scaled)

y_pred = keras_reg.predict(X_new)

print(f'MSE = {mse_test}\n')
print(f'y_pred = {y_pred}')

MSE = 0.6689161378430605

y_pred = [[ 0.8382323 ]
 [-0.18006986]
 [-1.3051577 ]]


However, we do not actually want to train and evaluate a single model like this, we want to train hundreds of variants and see which one performs best on the validation set.

# Let’s try to explore the number of hidden layers, the number of neurons and the learning rate:


1.   Number of Hidden Layers

2.   Number of Neurons per Hidden Layer

3.   Learning Rate, Batch Size and Other Hyperparameters





In [12]:
import numpy as np
from scipy.stats import reciprocal
from sklearn.model_selection import RandomizedSearchCV

# Set range of hyper parameters to tune
param_distribs = {
 "n_hidden": [0, 1, 2, 3],
 "n_neurons": np.arange(1, 100),
 "learning_rate": reciprocal(3e-4, 3e-2),
}

# Randomized Search CV
rnd_search_cv = RandomizedSearchCV(keras_reg, param_distribs, n_iter=10, cv=3)

rnd_search_cv.fit(X_train_scaled, y_train_scaled, epochs=30,
 validation_data=(X_valid_scaled, y_valid_scaled),
 callbacks=[keras.callbacks.EarlyStopping(patience=10)])

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/sklearn/model_selection/_validation.py", line 971, in _score
    scores = scorer(estimator, X_test, y_test, **score_params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/sklearn/metrics/_scorer.py", line 455, in __call__
    return estimator.score(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-

In [15]:
rnd_search_cv.best_params_

{'learning_rate': np.float64(0.016167964685424998),
 'n_hidden': 2,
 'n_neurons': np.int64(40)}

In [17]:
rnd_search_cv.best_score_

np.float64(0.7433532901310796)

In [18]:
model = rnd_search_cv.best_estimator_.model
model

You can now save this model, evaluate it on the test set, and if you are satisfied with its performance, deploy it to production.

---

Hyperparameter tuning is still an active area of research. Evolutionary algorithms are making a comeback lately. For example, check out DeepMind’s excellent 2017 paper16 https://arxiv.org/abs/1711.09846


where they jointly optimize a population of models and their hyperparameters.