# Hyperparameter tuning

## Part 1: Manual Hyperparameter Tuning
### Objective
Manually tune hyperparameters of a neural network and observe the impact on model performance.

### Setup
Start with the necessary imports and dataset preparation. We'll use the MNIST dataset for this exercise, as it's complex enough to demonstrate the effects of hyperparameter tuning.

In [1]:
import tensorflow as tf
from tensorflow import keras
#from tensorflow.keras.models import Sequential
#from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Load and preprocess the MNIST dataset
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
X_train, X_test = X_train.reshape(-1, 784) / 255.0, X_test.reshape(-1, 784) / 255.0
y_train, y_test = keras.utils.to_categorical(y_train, 10), keras.utils.to_categorical(y_test, 10)


### Task: Manual Tuning of Hyperparameters
1. Build a Base Model: Create a simple neural network as a starting point.
2. Manual Tuning: Experiment by manually changing hyperparameters like learning rate, number of layers/neurons, and activation functions.
3. Training and Evaluation: Train the model with different hyperparameter settings and evaluate its performance.
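
When several settings are trained one after another, it helps to keep one record per run so the results can be compared afterwards. The sketch below is only an illustration: `rank_trials` and `format_trial` are hypothetical helpers, and the accuracy values are placeholders rather than measured results.

```python
def rank_trials(trials, metric="val_accuracy"):
    """Return the trial records sorted from best to worst (higher is better)."""
    return sorted(trials, key=lambda trial: trial[metric], reverse=True)


def format_trial(trial, metric="val_accuracy"):
    """Render one record as '<metric>  name=value, name=value, ...'."""
    settings = ", ".join(f"{k}={v}" for k, v in trial.items() if k != metric)
    return f"{trial[metric]:.3f}  {settings}"


# Placeholder values that only show the record format; not measured results.
trials = [
    {"learning_rate": 0.1, "units": 32, "activation": "relu", "val_accuracy": 0.10},
    {"learning_rate": 0.001, "units": 64, "activation": "relu", "val_accuracy": 0.97},
    {"learning_rate": 0.01, "units": 64, "activation": "tanh", "val_accuracy": 0.95},
]

for trial in rank_trials(trials):
    print(format_trial(trial))
```

Each record mixes the settings with the score, so adding another hyperparameter later only means adding another key.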

In [14]:
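def build_model(hyperparams):
    """Build and compile a dense classifier from a hyperparameter dict.

    hyperparams["layers"] is read as (n_extra_hidden_layers, units): the model
    has one first hidden layer plus n_extra_hidden_layers more, all with
    `units` neurons. A setting of (32, 32) therefore stacks 33 hidden layers.
    """
    n_extra_hidden_layers, units = hyperparams["layers"]
    activation = hyperparams["activation"]

    model = keras.models.Sequential()
    # First hidden layer; the input size is inferred on the first call to fit()
    model.add(keras.layers.Dense(units, activation=activation))

    # Remaining hidden layers
    for _ in range(n_extra_hidden_layers):
        model.add(keras.layers.Dense(units, activation=activation))

    # Output layer for multi-class classification (10 classes) with softmax activation
    model.add(keras.layers.Dense(10, activation='softmax'))

    # Adam optimizer with the learning rate under test
    optimizer = keras.optimizers.legacy.Adam(learning_rate=hyperparams["learning_rate"])
    model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
    return model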

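# Example hyperparameters to tune
learning_rates = [0.001, 0.01, 0.1]
# build_model reads each tuple as (number of additional hidden layers, units per layer)
layer_configs = [(32, 32), (64, 64), (128, 128)]
# For a quick smoke test, shrink the search to a single setting:
#learning_rates = [0.01]
#layer_configs = [(32, 32)]
models = []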

# Loop through different hyperparameters and train models
for lr in learning_rates:
    for layers in layer_configs:
        # Build and train your model
        hyperparams = {"layers": layers, "activation": "relu", "learning_rate": lr}
        model = build_model(hyperparams)
        model.fit(X_train, y_train, epochs=32, verbose=1, validation_split=0.2)
        models.append((model, hyperparams))


In [15]:
def evaluate_models(models, X_test, y_test):
    
    # Evaluate each model
    for model, hyperparams in models:
        # Evaluate and print accuracy
        loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
        print(f"Model accuracy with {hyperparams}:\t{accuracy}")
        
evaluate_models(models, X_test, y_test)

Model accuracy with {'layers': (32, 32), 'activation': 'relu', 'learning_rate': 0.001}:	0.7922000288963318
Model accuracy with {'layers': (64, 64), 'activation': 'relu', 'learning_rate': 0.001}:	0.11349999904632568
Model accuracy with {'layers': (128, 128), 'activation': 'relu', 'learning_rate': 0.001}:	0.11349999904632568
Model accuracy with {'layers': (32, 32), 'activation': 'relu', 'learning_rate': 0.01}:	0.11349999904632568
Model accuracy with {'layers': (64, 64), 'activation': 'relu', 'learning_rate': 0.01}:	0.11349999904632568
Model accuracy with {'layers': (128, 128), 'activation': 'relu', 'learning_rate': 0.01}:	0.10090000182390213
Model accuracy with {'layers': (32, 32), 'activation': 'relu', 'learning_rate': 0.1}:	0.11349999904632568
Model accuracy with {'layers': (64, 64), 'activation': 'relu', 'learning_rate': 0.1}:	0.11349999904632568
Model accuracy with {'layers': (128, 128), 'activation': 'relu', 'learning_rate': 0.1}:	0.09799999743700027
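
One way to sanity-check the configurations behind these numbers is to count how many layers and trainable parameters each one produces. The sketch below assumes the same reading of `layers` as `build_model` (one first hidden layer, then `layers[0]` further hidden layers, each with `layers[1]` units, then a 10-unit output) and a 784-feature input. `dense_stack_summary` is a hypothetical helper, not part of Keras.

```python
def dense_stack_summary(layers, n_inputs=784, n_outputs=10):
    """Return (hidden_layer_count, parameter_count) for a stack of Dense layers.

    Mirrors build_model: 1 + layers[0] hidden layers of layers[1] units,
    followed by an output layer. A Dense layer with `i` inputs and `u` units
    has i * u weights plus u biases.
    """
    n_extra, units = layers
    hidden = 1 + n_extra
    params = n_inputs * units + units             # first hidden layer
    params += n_extra * (units * units + units)   # remaining hidden layers
    params += units * n_outputs + n_outputs       # output layer
    return hidden, params


for config in [(32, 32), (64, 64), (128, 128)]:
    hidden, params = dense_stack_summary(config)
    print(f"{config}: {hidden} hidden layers, {params:,} parameters")
```

Under that reading, even the smallest setting stacks 33 hidden layers, so these are very deep, narrow networks. That depth may be worth keeping in mind when interpreting the near-chance accuracies in the output.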


### Visualization
Plot the accuracy and loss for different hyperparameter settings.
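
A possible starting point for these plots is sketched below. It assumes each run's metrics are available as a dict of per-epoch lists with the keys `val_accuracy` and `val_loss`, which is the shape Keras stores in `History.history` when `metrics=["accuracy"]` and a validation split are used. The helper names `best_validation_epoch` and `plot_histories` are hypothetical, and the example numbers are made up.

```python
def best_validation_epoch(history):
    """Return (epoch, val_accuracy) for the highest validation accuracy.

    Epochs are numbered from 1; on ties the earliest epoch is returned.
    """
    val_acc = history["val_accuracy"]
    best_index = max(range(len(val_acc)), key=val_acc.__getitem__)
    return best_index + 1, val_acc[best_index]


def plot_histories(runs):
    """Plot validation accuracy and loss curves for several runs.

    `runs` is a list of (label, history_dict) pairs.
    """
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(12, 4))
    for label, history in runs:
        epochs = range(1, len(history["val_accuracy"]) + 1)
        ax_acc.plot(epochs, history["val_accuracy"], label=label)
        ax_loss.plot(epochs, history["val_loss"], label=label)
    ax_acc.set(xlabel="Epoch", ylabel="Validation accuracy")
    ax_loss.set(xlabel="Epoch", ylabel="Validation loss")
    ax_acc.legend(fontsize="small")
    fig.tight_layout()
    return fig


# Made-up numbers, purely to show the expected input shape.
example = {"val_accuracy": [0.90, 0.94, 0.93], "val_loss": [0.35, 0.22, 0.25]}
print(best_validation_epoch(example))  # (2, 0.94)
```

To use this with the training loop, one option is to keep the object returned by `model.fit(...)` for each run and pass its `.history` dict together with a label such as `str(hyperparams)`.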

### Analysis and Questions
* How did different learning rates affect the training process and model accuracy?
* What impact did varying the number of layers and neurons have on the model's performance?
* Were there any combinations of hyperparameters that resulted in particularly good or poor performance?


------------------------------
## Part 2: Automated Hyperparameter Tuning
### Objective
Use automated methods like Grid Search and Random Search for hyperparameter tuning.

### Setup
Reuse the MNIST dataset setup from Part 1.

### Task: Automated Hyperparameter Tuning
1. Grid Search and Random Search: Introduce and apply Grid Search and Random Search using scikit-learn's GridSearchCV or RandomizedSearchCV.
2. Integration with Keras: Show how to use these methods with Keras models.
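
The difference between the two strategies can be seen without any training. The sketch below uses only the standard library and a hypothetical search space: the grid variant enumerates every combination, while the random variant draws a fixed number of distinct combinations from the same grid. It illustrates the idea and is not how scikit-learn implements either search.

```python
import random
from itertools import product

# Hypothetical search space, for illustration only.
space = {
    "hidden_units": [16, 32, 64],
    "optimizer": ["adam", "sgd"],
}


def grid_candidates(space):
    """Every combination of the listed values (what a grid search tries)."""
    names = list(space)
    return [dict(zip(names, combo)) for combo in product(*space.values())]


def random_candidates(space, n_iter, seed=0):
    """`n_iter` distinct combinations drawn at random from the full grid."""
    rng = random.Random(seed)
    return rng.sample(grid_candidates(space), n_iter)


print(len(grid_candidates(space)))       # 6
print(len(random_candidates(space, 3)))  # 3
```

A grid search pays for every combination, while a random search lets the budget (`n_iter` here) be chosen independently of how many values are listed.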

In [3]:
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from scikeras.wrappers import KerasClassifier
from sklearn.metrics import classification_report

# Define a function to create a model (for KerasClassifier)
def create_model_to_search(optimizer="adam", hidden_units=32):
    # Create a Keras model with hyperparameters.
    # The explicit Input layer matters: without a known input shape the model
    # is not built when SciKeras inspects `model.outputs`, and every fit fails
    # with "TypeError: object of type 'NoneType' has no len()".
    model = keras.models.Sequential()
    model.add(keras.layers.Input(shape=(784,)))
    model.add(keras.layers.Dense(hidden_units, activation='relu'))
    # 10 output classes with softmax (MNIST digits), not a single sigmoid unit
    model.add(keras.layers.Dense(10, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# The wrapper is given integer class labels here, so undo the one-hot encoding from Part 1
y_train_labels = y_train.argmax(axis=1)
y_test_labels = y_test.argmax(axis=1)

# Set up GridSearchCV or RandomizedSearchCV
# `model=` replaces the deprecated `build_fn=`; parameters prefixed with
# `model__` are routed to the arguments of create_model_to_search
model_to_search = KerasClassifier(model=create_model_to_search)
param_grid = {
    # Define a grid of hyperparameters to search
    "model__hidden_units": [16, 32, 64],
    "model__optimizer": ['adam', 'sgd']
}
grid = GridSearchCV(estimator=model_to_search, param_grid=param_grid, verbose=1, cv=3)

# Run grid search
grid_result = grid.fit(X_train, y_train_labels)
print(classification_report(y_test_labels, grid_result.predict(X_test)))


### Visualization
Visualize the performance of the best model found by the search methods.

### Analysis and Questions
* Compare the results of manual tuning with automated tuning. Which method gave better results?
* What are the advantages and limitations of using automated methods like Grid Search and Random Search?
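
One concrete limitation of an exhaustive grid is cost, because the number of fits grows multiplicatively with every parameter added. The sketch below counts fits for two grids with the same shapes as the ones used in this part. `count_fits` is a hypothetical helper, and the key names act only as labels.

```python
from math import prod


def count_fits(param_grid, cv):
    """Number of model fits an exhaustive grid search performs.

    Every combination of values is trained once per cross-validation fold.
    A final refit on the full training set is not counted.
    """
    n_candidates = prod(len(values) for values in param_grid.values())
    return n_candidates * cv


small_grid = {"hidden_units": [16, 32, 64], "optimizer": ["adam", "sgd"]}
large_grid = {
    "batch_size": [64, 128],
    "epochs": [10, 15],
    "optimizer": ["adam", "sgd"],
    "dropout_rate": [0.2, 0.3],
    "activation": ["relu", "tanh"],
}

print(count_fits(small_grid, cv=3))  # 18
print(count_fits(large_grid, cv=3))  # 96
```

Adding one more two-valued parameter to the larger grid would double the count again, which is where a random search with a fixed budget becomes attractive.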



In [4]:
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import MinMaxScaler

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess the data
x_train = x_train.reshape(x_train.shape[0], -1).astype('float32') / 255.0
x_test = x_test.reshape(x_test.shape[0], -1).astype('float32') / 255.0

# One-hot encode the target labels
num_classes = 10
y_train = np.eye(num_classes)[y_train]
y_test = np.eye(num_classes)[y_test]

# Define the Keras model function
def create_model(optimizer='adam', dropout_rate=0.2, activation='relu'):
    model = Sequential()
    model.add(Dense(512, input_shape=(784,), activation=activation))
    model.add(Dropout(dropout_rate))
    model.add(Dense(256, activation=activation))
    model.add(Dropout(dropout_rate))
    model.add(Dense(num_classes, activation='softmax'))

    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# Create a KerasClassifier based on the model function
# (`model=` is the current name of the deprecated `build_fn=` argument)
model = KerasClassifier(model=create_model, verbose=0)

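# Define the hyperparameters grid for GridSearchCV
# Keys prefixed with `model__` are routed to the arguments of create_model.
# create_model compiles the model itself, so the optimizer is passed the same way.
param_grid = {
    'batch_size': [64, 128],
    'epochs': [10, 15],
    'model__optimizer': ['adam', 'sgd'],
    'model__dropout_rate': [0.2, 0.3],
    'model__activation': ['relu', 'tanh']
}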

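# Perform GridSearchCV
# n_jobs=-1 starts one worker process per core, and each worker initializes
# TensorFlow; on a machine with a single GPU, n_jobs=1 may be more reliable.
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, verbose=1, n_jobs=-1)
grid_result = grid.fit(x_train, y_train)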

# Print the best results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

# Evaluate the best model on the test set
# The fitted Keras model is stored in `model_`; `model` holds the build function
best_model = grid_result.best_estimator_
test_loss, test_acc = best_model.model_.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_acc*100:.2f}%")


Fitting 3 folds for each of 32 candidates, totalling 96 fits



KeyboardInterrupt: 