<p style="text-align:center">
    <a href="https://www.linkedin.com/company/mt-learners/" target="_blank">
    <img src="https://github.com/Mr-MeerMoazzam/Mr-MeerMoazzam/blob/main/Untitled-2.jpg?raw=true" width="200" alt="MT Learners Logo"  />
    </a>
</p>


# **Master Hyperparameter Optimization with Scikit-Learn**


We have already used `RandomizedSearchCV` and `GridSearchCV` to tune the hyperparameters of machine learning models such as linear regression and decision trees. The same approach applies to neural networks. Keras provides a scikit-learn wrapper that lets us run randomized and grid search on its models through the familiar interface, such as `fit()` and `.best_score_`. In this tutorial, we will apply these techniques to a Sequential model.

## **Table of Contents**

<ol>
    <li><a href="#Objectives">Objectives</a></li>
    <li>
        <a href="#Setup">Setup</a>
        <ol>
            <li><a href="#Installing-Required-Libraries">Installing Required Libraries</a></li>
            <li><a href="#Importing-Required-Libraries">Importing Required Libraries</a></li>
            <li><a href="#Defining-Helper-Functions">Defining Helper Functions</a></li>
        </ol>
    </li>
    <li>
        <a href="#Create-the-Model">Create the Model</a>
        <ol>
            <li><a href="#Load-the-Data">Load the Data</a></li>
            <li><a href="#Data-Wrangling">Data Wrangling</a></li>
            <li><a href="#Build-the-Base-Model">Build the Base Model</a></li>
        </ol>
    </li>
    <li>
        <a href="#Randomized-Search">Randomized Search</a>
        <ol>
            <li><a href="#Parameters">Parameters</a></li>
            <li><a href="#Define-and-Fit-RandomizedSearchCV">Define and Fit RandomizedSearchCV</a></li>
            <li><a href="#Performance-Evaluation">Performance Evaluation</a></li>
        </ol>
    </li>
</ol>


## Objectives

By the end of this tutorial, you will have the following skills:


* Utilize Keras' scikit-learn wrapper to apply sklearn functions on Keras models

* Employ randomized search on Keras models to identify the optimal hyperparameters





## Setup


In this tutorial, we will be using the following libraries:

*   [`numpy`](https://numpy.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for mathematical operations.
*   [`sklearn`](https://scikit-learn.org/stable/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for machine learning and machine-learning-pipeline related functions.
*   [`matplotlib`](https://matplotlib.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for additional plotting tools.
*   [`tensorflow`](https://www.tensorflow.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for machine learning and neural network related functions.


### Installing Required Libraries


In [1]:
!pip install numpy matplotlib scikit-learn tqdm




### Importing Required Libraries


In [3]:
import os
from tqdm import tqdm
import numpy as np
%matplotlib inline

import tensorflow as tf
import keras
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
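# Note: keras.wrappers.scikit_learn was removed in TensorFlow/Keras 2.12;
# on newer versions, the SciKeras package provides a replacement KerasClassifier.
from keras.wrappers.scikit_learn import KerasClassifier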

### Defining Helper Functions


In [4]:
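# Multi-hot encode integer sequences: row i gets a 1 in every column
# whose index appears in sequence i
def vectorize_sequence(sequence, dimensions):
    results = np.zeros((len(sequence), dimensions))
    for index, value in enumerate(sequence):
        # A sequence containing any index >= dimensions is skipped entirely,
        # which leaves its row all zeros
        if max(value) < dimensions:
            results[index, value] = 1
    return results

# Convert integer labels into one-hot format
def one_hot_label(labels, dimensions):
    results = np.zeros((len(labels), dimensions))
    for index, value in enumerate(labels):
        # A label >= dimensions is left as an all-zero row
        if value < dimensions:
            results[index, value] = 1
    return results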

## Create the Model


### Load the Data


In this tutorial, we will be working with the [Reuters newswire classification dataset](https://keras.io/api/datasets/reuters/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkML311Coursera35714171-2022-01-01) from Keras. The training features for this dataset are lists of word indices (integers), where each index corresponds to the word's frequency rank in the dataset. The response labels take on one of 46 classes, representing the newswire's topic.


In [5]:
X = np.load("/content/x.npy", allow_pickle=True)
y = np.load("/content/y.npy", allow_pickle=True)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

Keras also provides the dictionary that maps each word to its index. Inverting this dictionary lets us look up the word behind a specific index. We can load it with the following Keras function.


In [6]:
word_to_ind = tf.keras.datasets.reuters.get_word_index(path="reuters_word_index.json")
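
As a minimal sketch of how such a dictionary can be inverted to decode a sequence, the example below uses a tiny made-up dictionary in place of the real one. The names `toy_word_to_ind`, `decode`, and `offset` are hypothetical and only serve the illustration. The `offset` argument is there because `reuters.load_data()` shifts indices by 3 by default (`index_from=3`); whether the `.npy` files used in this tutorial were saved with that shift is not stated, so check before relying on it.

```python
# Toy stand-in for the dictionary returned by get_word_index();
# the real one maps many thousands of words to integer indices.
toy_word_to_ind = {"the": 1, "of": 2, "to": 3, "oil": 4, "prices": 5}

# Invert the mapping so that an index can be used to recover its word.
ind_to_word = {index: word for word, index in toy_word_to_ind.items()}


def decode(sequence, offset=0):
    """Turn a list of word indices back into text.

    `offset` is a hypothetical parameter: pass offset=3 if the indices
    were produced with the default index_from=3 of reuters.load_data().
    Unknown indices are rendered as "?".
    """
    return " ".join(ind_to_word.get(i - offset, "?") for i in sequence)


print(decode([4, 5, 2, 1, 9]))     # indices stored without a shift
print(decode([7, 8], offset=3))    # indices stored with a shift of 3
```

With the real dictionary, the same inversion would be applied to `word_to_ind` instead of the toy dictionary.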

### Data Wrangling


Since each observation is a list of the word indices that appear in a newswire, its length varies. We therefore vectorize the dataset with `vectorize_sequence()` so that all inputs to our model have the same dimension. The labels are one-hot encoded with `one_hot_label()` because the classes (news topics) are not ordinal.


In [7]:
dim_x = max([max(sequence) for sequence in X_train])+1
dim_y = max(y_train)+1

X_train_vec = vectorize_sequence(X_train, dim_x)
X_test_vec = vectorize_sequence(X_test, dim_x)
y_train_hot = one_hot_label(y_train, dim_y)
y_test_hot = one_hot_label(y_test, dim_y)
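
To make the effect of the two helpers concrete, here is a small sketch that runs them on made-up inputs. The helper bodies are repeated from above so that the block runs on its own; `toy_sequences` and `toy_labels` are hypothetical values chosen for illustration, not part of the Reuters data.

```python
import numpy as np


def vectorize_sequence(sequence, dimensions):
    # Same logic as the helper defined earlier in this tutorial.
    results = np.zeros((len(sequence), dimensions))
    for index, value in enumerate(sequence):
        if max(value) < dimensions:
            results[index, value] = 1
    return results


def one_hot_label(labels, dimensions):
    # Same logic as the helper defined earlier in this tutorial.
    results = np.zeros((len(labels), dimensions))
    for index, value in enumerate(labels):
        if value < dimensions:
            results[index, value] = 1
    return results


# Three sequences of different lengths; the last one contains an index (7)
# that does not fit into 5 columns.
toy_sequences = [[0, 2], [1, 1, 3], [4, 7]]
toy_labels = [0, 2, 1]

X_toy = vectorize_sequence(toy_sequences, 5)
y_toy = one_hot_label(toy_labels, 3)

print(X_toy)  # every row now has length 5, regardless of sequence length
print(y_toy)  # one row per label with a single 1 in the label's column
```

Note that the third row of `X_toy` is all zeros: the helper skips a whole sequence when any of its indices is out of range. This can happen for test sequences that contain words not seen in the training set, because `dim_x` is computed from `X_train` only.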

### Build the Base Model


To apply `RandomizedSearchCV` to Keras models, we will use `KerasClassifier` from the `keras.wrappers.scikit_learn` module, which lets us apply scikit-learn functions to the model.


We define `create_model()` below to specify which layers we want to include in the model. Recall that the final Dense layer has 46 units, one for each class. This is also why we use categorical cross-entropy as the loss function. Here, `neurons` is included as a parameter with a default value because we want to tune it later.


In [8]:
# Create Keras Sequential Model as base model
def create_model(neurons = 10):
    model = Sequential()
    model.add(Dense(neurons, activation='linear'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(46, activation='softmax'))
    model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
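
The pairing of a softmax output layer with categorical cross-entropy can be illustrated with a few lines of NumPy. This is a simplified sketch for a single sample with 3 classes instead of 46; the functions `softmax` and `categorical_crossentropy` below are hand-written for illustration and are not the Keras implementations.

```python
import numpy as np


def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def categorical_crossentropy(y_true_one_hot, y_pred):
    # Only the term of the true class survives the multiplication,
    # so the loss is -log(probability assigned to the true class).
    return -np.sum(y_true_one_hot * np.log(y_pred))


logits = np.array([2.0, 1.0, 0.1])   # raw scores from the last layer
probs = softmax(logits)              # probabilities that sum to 1
y_true = np.array([1.0, 0.0, 0.0])   # one-hot label: class 0 is correct

loss = categorical_crossentropy(y_true, probs)
print(probs.round(3), round(float(loss), 3))
```

The loss shrinks toward 0 as the probability of the correct class approaches 1, which is what training tries to achieve across all samples.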

For the base model, we won't change any parameters, so that we can compare its results with those obtained after hyperparameter tuning. We also specify default values for hyperparameters that don't appear in `create_model()` (for example, `batch_size` and `epochs`) so that they are defined when we apply randomized search.


In [9]:
np.random.seed(0)
base_model = KerasClassifier(build_fn=create_model, verbose=0, batch_size=10, epochs=1)



Fitting the model on the train set, we obtain the test score for our base model.


In [10]:
# Get pre-tuned results
base_model.fit(X_train_vec, y_train_hot)
base_score = base_model.score(X_test_vec, y_test_hot)
print("The baseline accuracy is: %.3f" % base_score)


The baseline accuracy is: 0.734
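
For a classifier, this score is a classification accuracy. As a rough sketch of what that means for one-hot labels, the example below compares the most probable predicted class with the true class on four made-up samples. The arrays `y_true_hot` and `y_prob` are hypothetical values, not outputs of the model above.

```python
import numpy as np

# Made-up one-hot labels and predicted class probabilities for 4 samples.
y_true_hot = np.array([[1, 0, 0],
                       [0, 1, 0],
                       [0, 0, 1],
                       [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.6, 0.3],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.5, 0.3]])

# The column with the largest value is the true / predicted class.
true_classes = y_true_hot.argmax(axis=1)
pred_classes = y_prob.argmax(axis=1)

# Accuracy is the fraction of samples where the two agree.
accuracy = float(np.mean(true_classes == pred_classes))
print(accuracy)  # 3 of the 4 predictions match, so this prints 0.75
```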


## Randomized Search


### Parameters



As with randomized search on other machine learning models, we need a dictionary that specifies the hyperparameter values to try. To begin, let's define the values we want to explore. Note that any additional parameter you wish to test must also be defined in the base model, either as an argument of `create_model()` or of `KerasClassifier`.

In [1]:
batch_size = [10, 30, 50, 70]
epochs = [1, 5, 7]
neurons = [1, 10, 40, 50]

params = dict(batch_size=batch_size, epochs=epochs, neurons=neurons)
params

{'batch_size': [10, 30, 50, 70],
 'epochs': [1, 5, 7],
 'neurons': [1, 10, 40, 50]}

### Define and Fit RandomizedSearchCV


In [12]:
search = RandomizedSearchCV(estimator=base_model, param_distributions=params, cv=3)
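
To get a feel for how much work this search involves, the sketch below counts the combinations in the parameter dictionary and the resulting number of model fits. It assumes the scikit-learn defaults that apply to the call above because they are not overridden: `n_iter=10` sampled combinations and `refit=True`, which adds one final fit on the full training set.

```python
from itertools import product

batch_size = [10, 30, 50, 70]
epochs = [1, 5, 7]
neurons = [1, 10, 40, 50]

# Every possible combination of the three lists (what a grid search covers).
all_combinations = list(product(batch_size, epochs, neurons))
n_grid = len(all_combinations)

n_iter = 10  # default number of combinations RandomizedSearchCV samples
cv = 3       # number of cross-validation folds used above

# Each evaluated combination is trained once per fold,
# plus one refit of the best combination on the full training set.
random_search_fits = n_iter * cv + 1
grid_search_fits = n_grid * cv + 1

print(n_grid, random_search_fits, grid_search_fits)  # 48 31 145
```

Randomized search therefore evaluates only 10 of the 48 combinations here, which is the reason it is faster than a full grid search and also the reason its result is the best of the sampled combinations rather than of all of them.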


Now, fit the randomized search on `X_train_vec` and `y_train_hot` as you would for any other model. **Note that this may take a while to run (10+ minutes)**, especially if there are many parameter combinations or if the number of epochs is large. If you have the resources, you could also swap `RandomizedSearchCV` for `GridSearchCV` to search over every combination of hyperparameters, which takes even longer to run.


In [None]:
search_result = search.fit(X_train_vec, y_train_hot)
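
If you want to try the workflow in a few seconds before committing to the long Keras run, the same interface can be exercised with a lightweight stand-in. The sketch below is an assumption-laden substitute, not part of the tutorial's pipeline: it swaps the Keras model for scikit-learn's `LogisticRegression` on synthetic data, and the names `X_demo`, `y_demo`, `demo_params`, and `demo_search` are hypothetical. It also sets `random_state` so that the sampled combinations are reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic classification problem used only for this demonstration.
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

# 4 x 2 = 8 possible combinations; the search samples 5 of them.
demo_params = {"C": [0.01, 0.1, 1.0, 10.0], "fit_intercept": [True, False]}

demo_search = RandomizedSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_distributions=demo_params,
    n_iter=5,
    cv=3,
    random_state=0,  # makes the sampled combinations reproducible
)
demo_result = demo_search.fit(X_demo, y_demo)

# The fitted search exposes the same attributes used later in this tutorial.
print(demo_result.best_params_, round(demo_result.best_score_, 3))
print(len(demo_result.cv_results_["params"]))  # one entry per sampled combination
```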


### Performance Evaluation


Let's take a look at the results from this search! In particular, we will examine the mean and standard deviation of the cross-validation score under different hyperparameter combinations.


In [None]:
means = search_result.cv_results_['mean_test_score']
stds = search_result.cv_results_['std_test_score']
params = search_result.cv_results_['params']

`RandomizedSearchCV` also has attributes for us to access the best score and parameters directly.


In [None]:
print("Best mean cross-validated score: {} using {}".format(round(search_result.best_score_,3), search_result.best_params_))


We can also print out all the other scores:


In [None]:
for mean, stdev, param in zip(means, stds, params):
    print("Mean cross-validated score: {} ({}) using: {}".format(round(mean,3), round(stdev,3), param))

From this, we can see how the other models' scores compare with the best model's performance. Some are close to the best score, whereas other combinations yield much lower scores. Thank goodness we didn't pick those! With randomized search on neural networks, we can find the best of the sampled hyperparameter values in an automated way.
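
The scores are listed in the order in which the combinations were sampled. To read them more easily, they can be sorted from best to worst. The sketch below shows one way to do this with `np.argsort`; the arrays are made-up placeholders shaped like the `cv_results_` entries, not results from the search above.

```python
import numpy as np

# Made-up placeholder values with the same layout as
# cv_results_['mean_test_score'], ['std_test_score'] and ['params'].
toy_means = np.array([0.61, 0.78, 0.35, 0.77])
toy_stds = np.array([0.02, 0.01, 0.05, 0.03])
toy_params = [
    {"neurons": 10, "epochs": 1, "batch_size": 10},
    {"neurons": 40, "epochs": 5, "batch_size": 30},
    {"neurons": 1, "epochs": 1, "batch_size": 70},
    {"neurons": 50, "epochs": 7, "batch_size": 50},
]

# argsort sorts ascending, so reverse it to put the highest mean first.
order = np.argsort(toy_means)[::-1]

for rank, i in enumerate(order, start=1):
    print(f"{rank}. {toy_means[i]:.3f} (+/- {toy_stds[i]:.3f}) using {toy_params[i]}")

best_index = int(order[0])
```

With a real search, `means`, `stds`, and the list of parameter dictionaries from `cv_results_` would take the place of the toy arrays.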


Using the best estimator, let's get the test score:


In [None]:
print("Best test score: %.3f" % search_result.best_estimator_.score(X_test_vec, y_test_hot))

Compare this value with the baseline accuracy of 0.734 obtained earlier: with tuned hyperparameters, the test score should improve on the base model.


## Author


[Moazzam Ali](https://www.linkedin.com/in/moazzam-ali-6a9675237/) is a Machine Learning Engineer at TokToAI.